This guide shows how to use the ZoomEye API with Python, with code examples, explanations, and best practices.
What is ZoomEye?
ZoomEye is a powerful search engine for cyberspace. It allows you to search for devices, services, and applications on the internet using specific queries.
- Host Search: Search for specific hosts (e.g., ip:"8.8.8.8").
- Web Search: Search for websites and web technologies (e.g., web:"WordPress").
- API Access: ZoomEye provides a RESTful API that lets you automate your searches and integrate its data into your own applications and tools.
Prerequisites
- A ZoomEye Account: You need to register for a free account on the ZoomEye website.
- An API Key: After logging in, go to your Profile page. You will find your API Key there. This key is essential for authenticating your requests.
- Python and the requests library: You'll need Python installed. You'll also need the requests library to make HTTP requests. If you don't have it, install it: pip install requests
Authentication
Every request to the ZoomEye API must include your API key in the API-KEY HTTP header.
import requests

# --- IMPORTANT ---
# Replace with your actual ZoomEye API Key
ZOOMeye_API_KEY = "YOUR_API_KEY_HERE"

# Set up the request headers
headers = {
    "API-KEY": ZOOMeye_API_KEY
}
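Hard-coding the key is fine for a quick test, but it is easy to leak that way. The following is a minimal sketch of building the same header from an environment variable instead. The variable name ZOOMEYE_API_KEY and the helper build_headers are assumptions made for this example, not names that ZoomEye defines.

```python
import os


def build_headers(env_var="ZOOMEYE_API_KEY", environ=None):
    """Build the API-KEY header from an environment variable.

    env_var is a name chosen for this sketch (an assumption);
    environ can be overridden with a plain dict for testing.
    """
    environ = os.environ if environ is None else environ
    api_key = environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"API-KEY": api_key}
```

The returned dictionary can be passed as the headers argument of requests.get in place of the hard-coded one.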
The Core Workflow: Search, Wait, Retrieve
The ZoomEye API follows an asynchronous pattern for large searches to avoid timeouts and manage server load.
- Search (/search): You submit your search query. The API immediately returns a search_id and tells you whether the search is done or still pending.
- Poll (Wait): If the search is pending, wait briefly and then poll the API using the search_id to check whether the results are ready.
- Fetch Results (/result): Once the search is done, you can fetch the actual search results using the search_id.
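The wait step in the list above is an ordinary polling loop, and it helps to bound it so a stuck search cannot hang your script. Here is a small, generic sketch of that idea. The helper poll_until_done and its parameters are hypothetical names for illustration; fetch stands in for whatever function asks the API whether the results are ready, returning None while the search is still pending.

```python
import time


def poll_until_done(fetch, interval=5.0, timeout=120.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it returns a non-None value or the timeout expires.

    fetch, sleep and clock are injectable so the loop can be tested
    without a network connection or real waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        # Stop if another wait would take us past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"no result within {timeout} seconds")
        sleep(interval)
```

Using time.monotonic rather than time.time keeps the deadline correct even if the system clock is adjusted while the loop runs.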
Complete Example: Searching for Hosts
This script demonstrates the entire workflow for a host search.
import requests
import time
import json
# --- Configuration ---
# Replace with your actual ZoomEye API Key
ZOOMeye_API_KEY = "YOUR_API_KEY_HERE"
# The search query. Examples:
# - Search for specific IPs: 'ip:"8.8.8.8"'
# - Search for a specific port: 'port:"22"'
# - Search for a specific service: 'service:"Apache httpd"'
# - Search for a specific OS: 'os:"Linux"'
# - Search for specific software in web search: 'web:"ThinkPHP"'
QUERY = 'port:"22"'
# ZoomEye API endpoints
SEARCH_URL = "https://api.zoomeye.org/host/search"
RESULT_URL = "https://api.zoomeye.org/host/result"
def search_zoomeye(query, page=1, facets=None):
    """
    Performs a search on ZoomEye.
    Returns the JSON response from the API, or None on error.
    """
    headers = {
        "API-KEY": ZOOMeye_API_KEY
    }
    params = {
        'query': query,
        'page': page
    }
    if facets:
        params['facets'] = facets
    print(f"[*] Searching ZoomEye for query: '{query}' (Page {page})")
    try:
        # A timeout keeps the script from hanging on an unresponsive server
        response = requests.get(SEARCH_URL, headers=headers, params=params, timeout=30)
        response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"[!] Error during search: {e}")
        return None
def get_search_result(search_id, poll_interval=5, max_polls=60):
    """
    Retrieves the results for a given search_id.
    Polls the API until the results are ready or max_polls attempts have been made.
    """
    headers = {
        "API-KEY": ZOOMeye_API_KEY
    }
    params = {
        'id': search_id
    }
    print(f"[*] Waiting for results (Search ID: {search_id})...")
    # A bounded loop avoids polling forever if the search never finishes
    for _ in range(max_polls):
        try:
            response = requests.get(RESULT_URL, headers=headers, params=params, timeout=30)
            response.raise_for_status()
            result = response.json()
        except requests.exceptions.RequestException as e:
            print(f"[!] Error fetching results: {e}")
            return None
        if result.get('matches') is not None:
            print("[+] Search complete! Fetching results...")
            return result
        elif result.get('error') == 'searching':
            print(f"[-] Search is still running... polling again in {poll_interval} seconds.")
            time.sleep(poll_interval)
        else:
            print(f"[!] An unexpected error occurred: {result}")
            return None
    print(f"[!] Gave up after {max_polls} polls.")
    return None
# --- Main Execution ---
if __name__ == "__main__":
    # Step 1: Perform the initial search
    search_data = search_zoomeye(QUERY)
    if not search_data:
        print("[!] Exiting due to search error.")
    elif search_data.get('total') == 0:
        print(f"[-] No results found for query: '{QUERY}'")
    else:
        # The API returns a 'matches' list with the actual data
        # and a 'total' count of all available results.
        total_results = search_data.get('total')
        print(f"[+] Found {total_results} total results.")
        # The search might be completed immediately for small result sets
        if search_data.get('matches'):
            print("[+] Search completed immediately. Displaying first 5 results.")
            for i, match in enumerate(search_data['matches'][:5]):
                print(f"--- Result {i+1} ---")
                print(json.dumps(match, indent=2))
            print("\n(To see all results, use the get_search_result function with the search_id)")
        else:
            # Step 2: If the search is pending, get the search_id and poll for results
            search_id = search_data.get('id')
            if search_id:
                all_results_data = get_search_result(search_id)
                # Step 3: Process the final results
                if all_results_data and all_results_data.get('matches'):
                    print(f"\n--- Displaying first 5 of {total_results} results ---")
                    for i, match in enumerate(all_results_data['matches'][:5]):
                        print(f"--- Result {i+1} ---")
                        # Pretty print the JSON
                        print(json.dumps(match, indent=2))
                else:
                    print("[!] No matches found in the final result.")
Handling Pagination
ZoomEye returns results in pages (default is 10 results per page). If you have more than 10 results, you need to paginate through them.
The search endpoint accepts a page parameter. You can loop through pages until you've fetched all the data.
Here's how you can modify the script to fetch all pages:
def fetch_all_pages(query, max_pages=10):
    """
    Fetches up to max_pages pages of results for a given query.
    Be careful with large result sets, as this can be slow and use your API quota quickly.
    """
    all_matches = []
    page = 1
    while page <= max_pages:
        search_data = search_zoomeye(query, page=page)
        if not search_data or search_data.get('matches') is None:
            # No more results or an error occurred
            break
        matches_on_page = search_data.get('matches', [])
        all_matches.extend(matches_on_page)
        # Check if this was the last page (assumes 10 results per page)
        if len(matches_on_page) == 0 or page * 10 >= search_data.get('total', 0):
            break
        page += 1
        # Add a small delay to be respectful to the API
        time.sleep(1)
    return all_matches
# --- Example Usage of Pagination ---
if __name__ == "__main__":
    # (Make sure to set ZOOMeye_API_KEY and QUERY)
    all_results = fetch_all_pages(QUERY, max_pages=5)  # Fetch first 5 pages
    print(f"\n[*] Fetched a total of {len(all_results)} results across multiple pages.")
    for i, match in enumerate(all_results[:3]):  # Print first 3 of the fetched results
        print(f"--- Result {i+1} ---")
        print(json.dumps(match, indent=2))
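If you would rather process results as they arrive instead of collecting them all in a list first, the same pagination logic can be written as a generator. This is only a sketch under a few assumptions: iter_matches and fetch_page are hypothetical names, fetch_page(page) is expected to return a dictionary with 'matches' and 'total' keys (or None on error), and page_size=10 simply mirrors the page size stated above.

```python
import math


def iter_matches(fetch_page, page_size=10, max_pages=10):
    """Yield matches one at a time, requesting pages only as needed.

    fetch_page(page) must return a dict with 'matches' and 'total',
    or None if the request failed.
    """
    page = 1
    while page <= max_pages:
        data = fetch_page(page)
        if not data:
            return  # request failed or returned nothing
        matches = data.get("matches") or []
        if not matches:
            return  # empty page: nothing left to read
        yield from matches
        # Work out the final page number from the reported total.
        last_page = math.ceil(data.get("total", 0) / page_size)
        if page >= last_page:
            return
        page += 1
```

With the script above, you could call it as iter_matches(lambda p: search_zoomeye(QUERY, page=p)). The sketch leaves out the delay between pages, so add one inside fetch_page if you need to stay under a rate limit.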
Web Search
The process for web search is almost identical. You just need to change the endpoint and the query type.
- Endpoint: https://api.zoomeye.org/web/search
- Endpoint for Results: https://api.zoomeye.org/web/result
- Query: Use web-specific queries like web:"ThinkPHP" or app:"WordPress".
Here is a concise example for web search:
WEB_SEARCH_URL = "https://api.zoomeye.org/web/search"
WEB_RESULT_URL = "https://api.zoomeye.org/web/result"

def search_zoomeye_web(query):
    headers = {"API-KEY": ZOOMeye_API_KEY}
    params = {'query': query}
    print(f"[*] Searching ZoomEye Web for: '{query}'")
    try:
        # A timeout keeps the script from hanging on an unresponsive server
        response = requests.get(WEB_SEARCH_URL, headers=headers, params=params, timeout=30)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"[!] Error: {e}")
        return None
# --- Main Execution for Web Search ---
if __name__ == "__main__":
    web_query = 'web:"ThinkPHP"'
    search_data = search_zoomeye_web(web_query)
    # Treat a missing 'total' as 0 so the comparison cannot raise a TypeError
    if search_data and (search_data.get('total') or 0) > 0:
        search_id = search_data.get('id')
        if search_data.get('matches'):
            print("[+] Web search completed immediately.")
            for match in search_data['matches'][:2]:
                print(f"--- Site: {match.get('site')} ---")
        else:
            # Re-uses the host result fetcher. Note that it polls RESULT_URL
            # (the host endpoint), not WEB_RESULT_URL.
            result_data = get_search_result(search_id)
            if result_data and result_data.get('matches'):
                print("\n--- Displaying first 2 web results ---")
                for match in result_data['matches'][:2]:
                    print(f"--- Site: {match.get('site')} ---")
                    print(json.dumps(match, indent=2))
    else:
        print("[-] No web results found.")
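Since the host and web workflows differ only in the URL path, one way to avoid mixing up the endpoints is to build them from a single helper. The sketch below uses only the four URLs listed in this guide. The names BASE_URL and endpoint are hypothetical, and the paths are worth checking against the current ZoomEye API documentation before you rely on them.

```python
BASE_URL = "https://api.zoomeye.org"

# Values taken from the endpoints listed in this guide.
RESOURCES = {"host", "web"}
ACTIONS = {"search", "result"}


def endpoint(resource, action):
    """Return the URL for a resource ('host' or 'web') and an action."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return f"{BASE_URL}/{resource}/{action}"
```

A result fetcher that accepted such a URL as a parameter could then serve both host and web searches.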
Best Practices and Important Notes
- Rate Limiting: The free API has a rate limit (e.g., 10 searches per minute). Always add small delays (time.sleep(1)) between requests, especially when paginating, to avoid being blocked.
- API Quota: Your account has a daily search quota. Be mindful of large or complex searches that can quickly consume it.
- Error Handling: Always check the status of your API responses (response.ok or response.raise_for_status()) and handle potential errors gracefully.
- Asynchronous Nature: Remember that large searches are not immediate. Your code must implement a polling mechanism to wait for the results.
- Data Usage: The amount of data you can download per search is also limited. For very large datasets, you might need to export them directly from the ZoomEye website or contact them for enterprise solutions.
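To make the rate-limiting advice concrete, here is a minimal sketch of a throttle that enforces a minimum gap between requests. The class name Throttle is made up for this example, and the 6-second interval in the usage note just corresponds to the "10 searches per minute" example figure above, which may not match your account's actual limit.

```python
import time


class Throttle:
    """Enforce a minimum interval (in seconds) between successive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Create one instance, for example Throttle(6.0), and call its wait() method right before each request. Unlike a fixed time.sleep, it only pauses for whatever part of the interval has not already elapsed.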
