How to Integrate Python with Pushgateway

Using Python with a Pushgateway is a common pattern for exporting metrics from short-lived, batch, or on-demand jobs (such as cron jobs, serverless functions, or CI/CD pipelines) to Prometheus.

Here's a comprehensive guide covering the "what," "why," and "how" with Python code examples.


What is a Pushgateway and Why Use It?

Prometheus is designed to scrape metrics from long-running services (like web servers or APIs) that expose an HTTP endpoint (/metrics). However, this model doesn't work well for short-lived jobs.

A Pushgateway is an intermediary component that solves this problem. Here's the workflow:

  1. Your Python Job Runs: Your script executes (e.g., a data processing script, a backup job).
  2. Push Metrics: Instead of exposing a /metrics endpoint, your Python job pushes its metrics to the Pushgateway.
  3. Store Metrics: The Pushgateway stores these metrics under a grouping key made of the job name plus any extra labels you supply, typically instance (e.g., job="my_backup_script", instance="run-2025-10-27-10-00").
  4. Prometheus Scrapes: Prometheus scrapes the Pushgateway, not your job. It retrieves the last pushed value of every group the Pushgateway still holds.
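To make steps 2 and 3 concrete, the sketch below builds the URL that a push is sent to. The Pushgateway's HTTP API addresses each group as /metrics/job/<job>/<label>/<value>. The helper name build_push_url is hypothetical, and the sketch assumes label values that contain no "/" (the API uses a base64 form for those, which is omitted here).

```python
from urllib.parse import quote


def build_push_url(gateway, job, grouping_key=None):
    """Return the Pushgateway URL for one metric group (hypothetical helper)."""
    # Default to plain HTTP when no scheme is given.
    if not gateway.startswith(("http://", "https://")):
        gateway = "http://" + gateway
    parts = ["metrics", "job", quote(job, safe="")]
    # Each extra grouping label becomes a /<name>/<value> pair in the path.
    for name, value in sorted((grouping_key or {}).items()):
        parts += [quote(name, safe=""), quote(str(value), safe="")]
    return gateway.rstrip("/") + "/" + "/".join(parts)


url = build_push_url(
    "localhost:9091",
    job="my_backup_script",
    grouping_key={"instance": "run-2025-10-27-10-00"},
)
print(url)
```

An HTTP PUT of the metrics text to this URL replaces the whole group. The prometheus_client function push_to_gateway, used in the examples later on, does this for you.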

Key Benefits:

  • Decoupling: Your job doesn't need to run a web server just to expose metrics.
  • Reliability: If your job finishes before Prometheus scrapes it, the metrics are still safe in the Pushgateway.
  • Single scrape target: The Pushgateway holds the most recent push from each job/instance group, and Prometheus scrapes them all from one target. It does not sum or otherwise aggregate values across pushes.

Important Caveat: The Pushgateway never expires metrics on its own. A pushed group stays exposed, and Prometheus keeps scraping its last value, until you delete it explicitly. Deletion is an HTTP DELETE request against the group's URL, which prometheus_client wraps as delete_from_gateway; call it with the same job and grouping key that were used for the push.


Prerequisites

Before you start, you need:

  1. A running Pushgateway instance. The easiest way is with Docker:

    docker run -d --name pushgateway -p 9091:9091 prom/pushgateway

    This makes the Pushgateway available at http://localhost:9091.

  2. A running Prometheus instance configured to scrape the Pushgateway.

    • Add this to your prometheus.yml:
      scrape_configs:
        - job_name: 'pushgateway'
          honor_labels: true # Important! See explanation below.
          static_configs:
            - targets: ['localhost:9091']
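    • honor_labels: true is crucial. It tells Prometheus to keep the job and instance labels that were pushed with the metrics. Without it, Prometheus overwrites them with the scrape target's own labels (job="pushgateway", instance="localhost:9091") and renames the pushed ones to exported_job and exported_instance.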
  3. Python libraries. We'll use the official Prometheus client for Python.

    pip install prometheus_client
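The honor_labels setting from step 2 can be illustrated with a simplified model of how Prometheus resolves conflicts between the labels on a scraped sample and the labels of the scrape target. The function resolve_labels is a hypothetical illustration written for this guide, not Prometheus code.

```python
def resolve_labels(sample_labels, target_labels, honor_labels):
    """Simplified model of Prometheus label-conflict handling at scrape time."""
    result = dict(sample_labels)
    for name, target_value in target_labels.items():
        if name in sample_labels and honor_labels:
            # Keep the label that came with the pushed metric.
            continue
        if name in sample_labels:
            # The target's label wins; the pushed one is renamed exported_<name>.
            result["exported_" + name] = sample_labels[name]
        result[name] = target_value
    return result


pushed = {"job": "my_backup_script", "instance": "run-1"}
target = {"job": "pushgateway", "instance": "localhost:9091"}

print(resolve_labels(pushed, target, honor_labels=True))
print(resolve_labels(pushed, target, honor_labels=False))
```

With honor_labels=True the pushed job and instance survive unchanged. With False, every series would be labeled job="pushgateway", and the original values would only be reachable through exported_job and exported_instance.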

Python Code Examples

Let's look at the most common scenarios.

Example 1: Basic Counter and Gauge

This script pushes a simple counter and a gauge to the Pushgateway.

# push_example.py
import time

from prometheus_client import CollectorRegistry, Counter, Gauge, push_to_gateway

# Use a dedicated registry so that only this job's metrics are pushed.
registry = CollectorRegistry()

# A counter only ever goes up.
REQUEST_COUNT = Counter(
    'my_job_requests_total',
    'Total number of requests processed by my job.',
    registry=registry
)

# A gauge can go up or down.
ITEMS_IN_QUEUE = Gauge(
    'my_job_queue_size',
    'Current number of items in the processing queue.',
    registry=registry
)

# --- Main logic of your job ---
print("Starting job...")

# Simulate some work
for i in range(5):
    remaining = 10 - i
    REQUEST_COUNT.inc()            # Increment the counter
    ITEMS_IN_QUEUE.set(remaining)  # Record the current queue size
    print(f"Processed request. Total: {i + 1}. Queue: {remaining}")
    time.sleep(1)

print("Job finished. Pushing metrics to Pushgateway...")

# --- Pushing to the Pushgateway ---
# 'job' is mandatory. Extra labels such as 'instance' go into the grouping key.
JOB_NAME = 'my_python_batch_job'
INSTANCE_NAME = 'run-12345'  # Could be a timestamp, a unique ID, etc.

# push_to_gateway sends an HTTP PUT that replaces all metrics in this job/instance group.
push_to_gateway(
    'localhost:9091',
    job=JOB_NAME,
    registry=registry,
    grouping_key={'instance': INSTANCE_NAME}
)
print("Metrics pushed successfully!")

# Do not delete the group here: Prometheus may not have scraped it yet.
# Remove it later with delete_from_gateway, once the values are no longer needed.

To run this:

python push_example.py

To view the metrics in Prometheus:

  1. Open your Prometheus UI (usually http://localhost:9090).
  2. Go to the "Graph" tab.
  3. In the query box, try:
    • my_job_requests_total (should be 5)
    • my_job_queue_size (should be 6, the last value set)

Example 2: Histogram

Histograms are useful for measuring the distribution of values (e.g., request durations).

# push_histogram_example.py
import time
import random
from prometheus_client import Histogram, CollectorRegistry, push_to_gateway
registry = CollectorRegistry()
# A histogram for measuring request durations in seconds
REQUEST_DURATION = Histogram(
    'my_job_request_duration_seconds',
    'Duration of requests in my job.',
    registry=registry
)
print("Starting job with histogram...")
# Simulate work with varying durations
for _ in range(10):
    # Simulate a request taking some time
    start_time = time.time()
    time.sleep(random.uniform(0.1, 0.5))
    REQUEST_DURATION.observe(time.time() - start_time)
print("Job finished. Pushing histogram metrics to Pushgateway...")
# push_to_gateway has no 'instance' argument; extra labels go in grouping_key
push_to_gateway(
    'localhost:9091',
    job='my_python_histogram_job',
    registry=registry,
    grouping_key={'instance': 'run-histogram-67890'}
)
print("Histogram metrics pushed successfully!")

To query the histogram in Prometheus, use the raw series. The pushed values stay constant between pushes, so rate() over them evaluates to 0 and a quantile built on rate() has no data to work with:

  • my_job_request_duration_seconds_sum (sum of observed values)
  • my_job_request_duration_seconds_count (total number of observations)
  • histogram_quantile(0.95, my_job_request_duration_seconds_bucket) (estimated 95th percentile)
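To show what a quantile estimate over the _bucket series involves, here is a sketch of linear interpolation over cumulative bucket counts. It is a simplified re-implementation for illustration, not Prometheus's code, and the bucket boundaries and counts are made-up values resembling ten observations between 0.1 and 0.5 seconds.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Simplified sketch: assumes 0 < q <= 1 and buckets sorted by upper bound,
    with at least one finite bucket followed by a final +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            # Interpolate linearly inside the bucket that contains the rank.
            in_bucket = count - prev_count
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return math.nan


# Cumulative counts per upper bound ("le" label), illustrative values only.
buckets = [(0.1, 0), (0.25, 4), (0.5, 10), (math.inf, 10)]
p95 = histogram_quantile(0.95, buckets)
print(round(p95, 3))
```

The estimate can only be as precise as the bucket boundaries allow, so choose buckets that bracket the durations you expect from the job.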

Best Practices and Important Considerations

Always Use a Custom Registry

Avoid the default global registry (prometheus_client.REGISTRY) when pushing. It contains the default collectors (process, platform, and garbage-collection metrics), which you probably don't want to push to the Pushgateway for your specific job.

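from prometheus_client import CollectorRegistry, Counter

# Correct: the metric lives in a dedicated registry, and only that registry is pushed
# (the metric name and help text are example values)
registry = CollectorRegistry()
my_counter = Counter('my_job_items_total', 'Items processed by my job.', registry=registry)

# Incorrect: the metric lands in the global REGISTRY next to the default collectors
my_counter = Counter('my_job_items_total', 'Items processed by my job.')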
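The difference becomes visible when each registry is rendered with generate_latest, which produces the same text format that a push sends. This is a sketch assuming prometheus_client is installed; the metric name my_job_items is a hypothetical example.

```python
from prometheus_client import REGISTRY, CollectorRegistry, Counter, generate_latest

# Dedicated registry holding only this job's metric.
registry = CollectorRegistry()
items = Counter("my_job_items", "Items processed by my job.", registry=registry)
items.inc(3)

custom_output = generate_latest(registry).decode()
default_output = generate_latest(REGISTRY).decode()

# The custom registry renders only the job's own counter.
print(custom_output)

# The default registry also carries interpreter metrics such as python_info.
print("python_info" in default_output, "python_info" in custom_output)
```

Printing the output of generate_latest before a push is also a quick way to check exactly what the Pushgateway will receive.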

Labeling is Key

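The job and instance labels are your primary tools for organizing metrics in Prometheus. Together they form the grouping key: each distinct combination is a separate group on the Pushgateway, and each push_to_gateway call replaces the whole group.

  • job: Should describe the type of job (e.g., my_daily_backup, ci_pipeline_runner). Pass it as the job argument of push_to_gateway.
  • instance: Should identify a specific run of the job. A good practice is to use a timestamp or a unique ID from your job scheduler (e.g., run-<timestamp>, pipeline-<id>). Pass it through grouping_key={'instance': ...}, because push_to_gateway has no instance argument. A value that is unique per run creates a new group on every run, so pair it with cleanup.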

Cleaning Up Old Metrics (Crucial!)

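The Pushgateway keeps your metrics until you explicitly delete them. Left alone, stale groups accumulate and Prometheus keeps scraping their last values.

To clean up, call delete_from_gateway with the same job and grouping key that were used for the push. Do this only after Prometheus has had a chance to scrape the values, for example at the start of the next run or from a separate cleanup job. Deleting right after the push, in the same script, usually removes the metrics before they are ever scraped. Pushing an empty registry is not an equivalent substitute: it replaces the group's contents but leaves the group and its push_time_seconds metric in place.

from prometheus_client import delete_from_gateway

# Run this once the metrics are no longer needed, not right after the push.
print("Cleaning up old metrics from Pushgateway...")
# Sends an HTTP DELETE for this specific job/instance group
delete_from_gateway(
    'localhost:9091',
    job=JOB_NAME,
    grouping_key={'instance': INSTANCE_NAME}
)
print("Cleanup complete.")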
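At the HTTP level, removing a group is a DELETE request to the same URL the group was pushed to. The sketch below only builds the request with the standard library, so it can be inspected without a running gateway. The helper name and the run-12345 value are illustrative, and label values are assumed to contain no "/".

```python
from urllib.parse import quote
from urllib.request import Request


def build_delete_request(gateway, job, grouping_key=None):
    """Build (but do not send) the DELETE request for one metric group."""
    path = "/metrics/job/" + quote(job, safe="")
    for name, value in sorted((grouping_key or {}).items()):
        path += "/" + quote(name, safe="") + "/" + quote(str(value), safe="")
    return Request("http://" + gateway + path, method="DELETE")


request = build_delete_request(
    "localhost:9091",
    job="my_python_batch_job",
    grouping_key={"instance": "run-12345"},
)
print(request.get_method(), request.full_url)
# To send it: from urllib.request import urlopen; urlopen(request, timeout=10)
```

The job and grouping key must match the push exactly; a DELETE for job alone does not remove groups that were pushed with an additional instance label.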

Error Handling

Your job should not fail if it can't push metrics, but it should log the error.

try:
    push_to_gateway('localhost:9091', job=JOB_NAME, registry=registry)
except Exception as e:
    print(f"Warning: Failed to push metrics to Pushgateway: {e}")
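If losing the metrics after a single failed attempt is too lossy for your job, one option is a small retry wrapper around the push call. This is a sketch, not part of prometheus_client: push_with_retry, its parameters, and the backoff values are all assumptions, and push_fn stands for any zero-argument callable, such as a lambda wrapping push_to_gateway.

```python
import logging
import time

logger = logging.getLogger("metrics_push")


def push_with_retry(push_fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call push_fn until it succeeds or attempts run out; never raise."""
    for attempt in range(1, attempts + 1):
        try:
            push_fn()
            return True
        except Exception as exc:  # a metrics failure should not fail the job
            logger.warning("Push attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))
    return False


# Demonstration with a stand-in push function that fails twice, then succeeds.
calls = []


def flaky_push():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("gateway unreachable")


ok = push_with_retry(flaky_push, sleep=lambda seconds: None)
print(ok, len(calls))
```

In a real job the call might look like push_with_retry(lambda: push_to_gateway('localhost:9091', job=JOB_NAME, registry=registry)). The injectable sleep parameter exists only so the demonstration runs without waiting.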

Pushing Multiple Groups Under One Job

If your job produces metrics for different logical components, you can push them in separate calls, each with its own registry and a different instance label in the grouping key.

registry1 = CollectorRegistry()
task1_counter = Counter('task1_items', 'Items processed by task 1.', registry=registry1)
# ... do work for task 1
push_to_gateway('localhost:9091', job='my_complex_job', registry=registry1,
                grouping_key={'instance': 'task1-run-1'})

registry2 = CollectorRegistry()
task2_counter = Counter('task2_items', 'Items processed by task 2.', registry=registry2)
# ... do work for task 2
push_to_gateway('localhost:9091', job='my_complex_job', registry=registry2,
                grouping_key={'instance': 'task2-run-1'})

Authentication

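If your Pushgateway is protected with HTTP basic authentication, you must provide a handler. push_to_gateway calls the handler with (url, method, timeout, headers, data), so wrap basic_auth_handler in a function that supplies the credentials.

from prometheus_client.exposition import basic_auth_handler

def my_auth_handler(url, method, timeout, headers, data):
    # basic_auth_handler adds an Authorization header to the request
    return basic_auth_handler(url, method, timeout, headers, data, 'my_user', 'my_secret')

# Prefix the address with https:// if the gateway is served over TLS
push_to_gateway(
    'my-secure-gateway.com:9091',
    job='my_job',
    registry=registry,
    handler=my_auth_handler
)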
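For reference, the Authorization header that basic authentication adds is the base64 encoding of username:password. A standard-library sketch (the helper name basic_auth_header is hypothetical):

```python
import base64


def basic_auth_header(username, password):
    """Return the value of the HTTP Authorization header for basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


header = basic_auth_header("my_user", "my_secret")
print(header)
```

Base64 is an encoding, not encryption, so anyone who can read the traffic can recover the password. Use an https:// gateway address when sending credentials.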