杰瑞科技汇

ZoomEye API with Python

This guide shows how to use the ZoomEye API with Python, with code examples, explanations, and best practices.

What is ZoomEye?

ZoomEye is a powerful search engine for cyberspace. It allows you to search for devices, services, and applications on the internet using specific queries.

  • Host Search: Search for specific hosts (e.g., ip:"8.8.8.8").
  • Web Search: Search for websites and web technologies (e.g., web:"WordPress").
  • API Access: ZoomEye provides a RESTful API that allows you to automate your searches and integrate its data into your own applications and tools.

Prerequisites

  1. A ZoomEye Account: You need to register for a free account on the ZoomEye website.
  2. An API Key: After logging in, go to your Profile page. You will find your API Key there. This key is essential for authenticating your requests.
  3. Python and requests library: You'll need Python installed. You'll also need the requests library to make HTTP requests. If you don't have it, install it:
    pip install requests

Authentication

Every request to the ZoomEye API must include your API key in the API-KEY HTTP header.

import requests
# --- IMPORTANT ---
# Replace with your actual ZoomEye API Key
ZOOMeye_API_KEY = "YOUR_API_KEY_HERE"
# Set up the request headers
headers = {
    "API-KEY": ZOOMeye_API_KEY
}
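Hard-coding the key is convenient for a first test, but it is easy to leak when a script is shared. A minimal sketch of an alternative, assuming the key is stored in an environment variable; the variable name ZOOMEYE_API_KEY and the helper build_headers are illustrative choices, not part of the ZoomEye API:

```python
import os


def build_headers(env_var="ZOOMEYE_API_KEY"):
    """Return request headers with the API key read from the environment.

    The environment variable name is a hypothetical convention chosen for
    this sketch; use whatever name fits your setup.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    # Same header name as in the authentication snippet above
    return {"API-KEY": api_key}
```

The returned dictionary can be passed as headers= to requests.get in place of the hard-coded version.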

The Core Workflow: Search, Wait, Retrieve

This guide treats large searches as asynchronous, which avoids timeouts and helps manage server load. The example scripts handle both outcomes: matches returned directly in the first response, and a pending search that has to be polled.

  1. Search (/search): Submit your search query. The response either contains the matches directly or carries a search ID (read from the id field in the scripts below) for a search that is still pending.
  2. Poll (Wait): If the search is pending, wait a few seconds, then query the API with the search ID to check whether the results are ready.
  3. Fetch Results (/result): Once the search is done, fetch the results using the search ID.
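The wait step can be sketched independently of any HTTP call. The helper below is a hypothetical illustration: poll_until_ready, fetch, and is_ready are names invented for this sketch, and the canned responses only mimic the 'searching' and 'matches' fields that the scripts in this guide check for.

```python
import time


def poll_until_ready(fetch, is_ready, interval=5.0, max_attempts=12, sleep=time.sleep):
    """Call fetch() until is_ready(result) is true or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if is_ready(result):
            return result
        if attempt < max_attempts:
            sleep(interval)  # Wait before the next poll
    raise TimeoutError(f"Not ready after {max_attempts} attempts")


# Offline demonstration: canned responses stand in for real API calls.
responses = iter([
    {"error": "searching"},
    {"error": "searching"},
    {"matches": [{"ip": "192.0.2.1"}]},
])
result = poll_until_ready(
    fetch=lambda: next(responses),
    is_ready=lambda r: r.get("matches") is not None,
    interval=0,
)
print(result["matches"])
```

Capping the number of attempts means a search that never completes ends in a clear error instead of an endless loop.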

Complete Example: Searching for Hosts

This script demonstrates the entire workflow for a host search.

import requests
import time
import json
# --- Configuration ---
# Replace with your actual ZoomEye API Key
ZOOMeye_API_KEY = "YOUR_API_KEY_HERE" 
# The search query. Examples:
# - Search for specific IPs: 'ip:"8.8.8.8"'
# - Search for a specific port: 'port:"22"'
# - Search for a specific service: 'service:"Apache httpd"'
# - Search for a specific OS: 'os:"Linux"'
# - Search for specific software in web search: 'web:"ThinkPHP"'
QUERY = 'port:"22"'
# ZoomEye API endpoints
SEARCH_URL = "https://api.zoomeye.org/host/search"
RESULT_URL = "https://api.zoomeye.org/host/result"

def search_zoomeye(query, page=1, facets=None):
    """
    Performs a search on ZoomEye.
    Returns the parsed JSON response, or None if the request fails.
    """
    headers = {
        "API-KEY": ZOOMeye_API_KEY
    }
    params = {
        'query': query,
        'page': page
    }
    if facets:
        params['facets'] = facets
    print(f"[*] Searching ZoomEye for query: '{query}' (Page {page})")
    try:
        # A timeout keeps the script from hanging on a stalled connection
        response = requests.get(SEARCH_URL, headers=headers, params=params, timeout=30)
        response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
        return response.json()
    except (requests.exceptions.RequestException, ValueError) as e:
        # ValueError covers a response body that is not valid JSON
        print(f"[!] Error during search: {e}")
        return None

def get_search_result(search_id, poll_interval=5, max_attempts=60):
    """
    Retrieves the results for a given search_id.
    Polls the API until the results are ready, an error occurs,
    or max_attempts polls have been made. Returns None on failure.
    """
    headers = {
        "API-KEY": ZOOMeye_API_KEY
    }
    params = {
        'id': search_id
    }
    print(f"[*] Waiting for results (Search ID: {search_id})...")
    # Bounded loop: an unbounded "while True" would spin forever
    # if the search never finishes
    for _ in range(max_attempts):
        try:
            response = requests.get(RESULT_URL, headers=headers, params=params, timeout=30)
            response.raise_for_status()
            result = response.json()
        except (requests.exceptions.RequestException, ValueError) as e:
            print(f"[!] Error fetching results: {e}")
            return None
        if result.get('matches') is not None:
            print("[+] Search complete! Fetching results...")
            return result
        if result.get('error') == 'searching':
            print(f"[-] Search is still running... polling again in {poll_interval} seconds.")
            time.sleep(poll_interval)
            continue
        print(f"[!] An unexpected error occurred: {result}")
        return None
    print(f"[!] Gave up after {max_attempts} polls.")
    return None

# --- Main Execution ---
if __name__ == "__main__":
    # Step 1: Perform the initial search
    search_data = search_zoomeye(QUERY)
    if not search_data:
        print("[!] Exiting due to search error.")
    elif search_data.get('total') == 0:
        print(f"[-] No results found for query: '{QUERY}'")
    else:
        # The API returns a 'matches' list with the actual data
        # and a 'total' count of all available results.
        total_results = search_data.get('total')
        print(f"[+] Found {total_results} total results.")
        # The search might be completed immediately for small result sets
        if search_data.get('matches'):
            print("[+] Search completed immediately. Displaying first 5 results.")
            for i, match in enumerate(search_data['matches'][:5]):
                print(f"--- Result {i+1} ---")
                print(json.dumps(match, indent=2))
            print("\n(Only the first page is shown; request further pages with the page parameter.)")
        else:
            # Step 2: If the search is pending, get the search_id and poll for results
            search_id = search_data.get('id')
            if search_id:
                all_results_data = get_search_result(search_id)
                # Step 3: Process the final results
                if all_results_data and all_results_data.get('matches'):
                    print(f"\n--- Displaying first 5 of {total_results} results ---")
                    for i, match in enumerate(all_results_data['matches'][:5]):
                        print(f"--- Result {i+1} ---")
                        # Pretty print the JSON
                        print(json.dumps(match, indent=2))
                else:
                    print("[!] No matches found in the final result.")
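Printing matches is fine for a quick look, but results are usually worth keeping so the same query does not have to be repeated. A minimal sketch that stores each match as one JSON object per line (JSON Lines); the helper names save_matches_jsonl and load_matches_jsonl are hypothetical, and the code makes no assumption about which fields a match contains:

```python
import json


def save_matches_jsonl(matches, path):
    """Write one JSON object per line and return the number of records written."""
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        for match in matches:
            # ensure_ascii=False keeps non-ASCII banners and titles readable
            fh.write(json.dumps(match, ensure_ascii=False) + "\n")
            count += 1
    return count


def load_matches_jsonl(path):
    """Read the records back into a list of dictionaries."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

For example, save_matches_jsonl(search_data['matches'], 'results.jsonl') would store the first page returned by the script above.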

Handling Pagination

ZoomEye returns results in pages (this guide assumes 10 results per page; the actual page size may differ). If a query matches more results than fit on one page, you need to paginate through them.

The search endpoint accepts a page parameter. You can loop through pages until you've fetched all the data.

Here's how you can modify the script to fetch all pages:

def fetch_all_pages(query, max_pages=10):
    """
    Fetches all pages of results for a given query.
    Be careful with large result sets, as this can be slow and use your API quota quickly.
    """
    all_matches = []
    page = 1
    while page <= max_pages:
        search_data = search_zoomeye(query, page=page)
        if not search_data or search_data.get('matches') is None:
            # No more results or an error occurred
            break
        matches_on_page = search_data.get('matches', [])
        all_matches.extend(matches_on_page)
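        # Stop on an empty page or once everything reported in 'total' is collected
        # (counting collected matches avoids hard-coding the page size)
        if len(matches_on_page) == 0 or len(all_matches) >= search_data.get('total', 0):
            break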
        page += 1
        # Add a small delay to be respectful to the API
        time.sleep(1) 
    return all_matches
# --- Example Usage of Pagination ---
if __name__ == "__main__":
    # (Make sure ZOOMeye_API_KEY and QUERY are set)
    all_results = fetch_all_pages(QUERY, max_pages=5) # Fetch first 5 pages
    print(f"\n[*] Fetched a total of {len(all_results)} results across multiple pages.")
    for i, match in enumerate(all_results[:3]): # Print first 3 of the fetched results
        print(f"--- Result {i+1} ---")
        print(json.dumps(match, indent=2))
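Before starting a long pagination run, it helps to estimate how many requests it will cost. A small sketch of the arithmetic, assuming the total field from the search response and a known page size; pages_needed is a hypothetical helper, and the default of 10 per page follows this guide's assumption:

```python
import math


def pages_needed(total, page_size=10, max_pages=None):
    """Return how many page requests are needed to cover `total` results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if total <= 0:
        return 0
    pages = math.ceil(total / page_size)  # A partial last page still costs a request
    return pages if max_pages is None else min(pages, max_pages)


print(pages_needed(95))                # 10
print(pages_needed(95, max_pages=5))   # 5
print(pages_needed(95, page_size=20))  # 5
```

Multiplying the result by the delay between requests gives a rough lower bound on how long the run will take.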

Web Search

The process for web search is almost identical. You just need to change the endpoint and the query type.

  • Endpoint: https://api.zoomeye.org/web/search
  • Endpoint for Results: https://api.zoomeye.org/web/result
  • Query: Use web-specific queries like web:"ThinkPHP" or app:"WordPress".

Here is a concise example for web search:

WEB_SEARCH_URL = "https://api.zoomeye.org/web/search"
WEB_RESULT_URL = "https://api.zoomeye.org/web/result"
def search_zoomeye_web(query):
    headers = {"API-KEY": ZOOMeye_API_KEY}
    params = {'query': query}
    print(f"[*] Searching ZoomEye Web for: '{query}'")
    try:
        response = requests.get(WEB_SEARCH_URL, headers=headers, params=params)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"[!] Error: {e}")
        return None
# --- Main Execution for Web Search ---
if __name__ == "__main__":
    web_query = 'web:"ThinkPHP"'
    search_data = search_zoomeye_web(web_query)
    # Default to 0 so a missing 'total' field does not raise a TypeError
    if search_data and search_data.get('total', 0) > 0:
        search_id = search_data.get('id')
        if search_data.get('matches'):
            print("[+] Web search completed immediately.")
            for match in search_data['matches'][:2]:
                print(f"--- Site: {match.get('site')} ---")
        else:
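            # NOTE: get_search_result polls RESULT_URL (the host endpoint), not
            # WEB_RESULT_URL; adapt the helper to accept a URL before relying on this path.
            result_data = get_search_result(search_id)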
            if result_data and result_data.get('matches'):
                print("\n--- Displaying first 2 web results ---")
                for match in result_data['matches'][:2]:
                    print(f"--- Site: {match.get('site')} ---")
                    print(json.dumps(match, indent=2))
    else:
        print("[-] No web results found.")
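Web results often list the same site more than once. A minimal sketch that collects distinct site values, assuming each match carries the site field read by the script above; unique_sites is a hypothetical helper name:

```python
def unique_sites(matches):
    """Return distinct 'site' values in first-seen order, lowercased."""
    seen = set()
    sites = []
    for match in matches:
        site = match.get("site")
        if not isinstance(site, str):
            continue  # Skip matches without a usable site value
        key = site.strip().lower()
        if key and key not in seen:
            seen.add(key)
            sites.append(key)
    return sites


sample_matches = [
    {"site": "Example.com"},
    {"site": "example.com "},
    {"site": None},
    {},
    {"site": "test.example"},
]
print(unique_sites(sample_matches))
```

Lowercasing and trimming are a simple normalization; stricter comparison, such as stripping a leading "www.", is left to the caller.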

Best Practices and Important Notes

  1. Rate Limiting: The free API is rate limited (the exact limit depends on your plan; 10 searches per minute is only an illustrative figure). Add a small delay such as time.sleep(1) between requests, especially when paginating, to avoid being blocked.
  2. API Quota: Your account has a search quota. Large or complex searches can consume it quickly.
  3. Error Handling: Always check the status of your API responses (response.ok or response.raise_for_status()) and handle potential errors gracefully.
  4. Asynchronous Nature: Large searches may not return immediately. Your code should poll for the results, with a cap on the number of attempts so it cannot loop forever.
  5. Data Usage: The amount of data you can download per search is also limited. For very large datasets, you might need to export them directly from the ZoomEye website or contact them for enterprise solutions.
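The rate-limiting advice can be sketched as a small helper that enforces a minimum gap between requests. Throttle is a hypothetical class written for this guide, and the one-second default is an illustrative value, not a documented ZoomEye limit:

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # Injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Sleep only as long as needed; return the delay that was applied."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

Calling throttle.wait() immediately before each requests.get sleeps only for the time still owed, so slow responses do not add an unnecessary extra delay.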