杰瑞科技汇

How do you use a "box" in Python OpenCV?

Of course! The term "OpenCV box" can refer to a few different, but related, concepts in computer vision. I'll cover the most common ones, from the basic geometric representation to the powerful object detection model.

Here's a breakdown:

  1. The Geometric RotatedRect: A "box" as a shape with a center, size, and rotation angle.
  2. The Bounding Box: A rectangle that outlines an object in an image.
  3. The OpenCV DNN Module: Using a pre-trained object detection model (like YOLO) which outputs "boxes".

Let's dive into each with code examples.


The Geometric RotatedRect

This is the fundamental "box" object in OpenCV. It's not an axis-aligned rectangle; it's a rectangle that can be rotated to any angle. In C++ it's the cv::RotatedRect class; in Python it's conventionally represented as a plain tuple (the same format cv2.minAreaRect() returns) containing:

  • (center_x, center_y): The center of the rectangle.
  • (width, height): The size of the rectangle.
  • angle: The rotation angle in degrees.

Key Fields and Helper Functions (Python):

  • rect[0]: The center point (cx, cy).
  • rect[1]: The size (width, height).
  • rect[2]: The rotation angle in degrees.
  • cv2.boxPoints(rect): Returns the four corner points of the rectangle. This is very useful for drawing.
  • cv2.boundingRect(points): Returns a normal, axis-aligned (x, y, w, h) rectangle that tightly encloses those corner points.

(Recent OpenCV builds also expose a cv2.RotatedRect class in Python, with .points() and .boundingRect() methods, but the tuple form works in every version.)

Example: Creating and Drawing a RotatedRect

import cv2
import numpy as np
# Create a blank image
image = np.zeros((500, 500, 3), dtype=np.uint8)
# Define the properties of the rotated rectangle
center = (250, 250)
size = (200, 100)  # (width, height)
angle = 30  # degrees
# Create the rotated rectangle as a (center, size, angle) tuple --
# the same format that cv2.minAreaRect() returns
box = (center, size, angle)
# --- Drawing the rotated rectangle ---
points = cv2.boxPoints(box)          # the four corner points
points = points.astype(np.int32)     # np.int0 was removed in NumPy 2.x; OpenCV expects int32
# Draw the rotated rectangle on the image
cv2.drawContours(image, [points], 0, (0, 255, 0), 3)
# Draw the axis-aligned bounding box that encloses the same corner points
x, y, w, h = cv2.boundingRect(points)
cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)
# Display the image
cv2.imshow("RotatedRect and its Bounding Box", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Output: You'll see a green rotated rectangle with a red, non-rotated bounding box around it.


The Bounding Box (Axis-Aligned)

This is the most common type of "box" in object detection. It's a simple rectangle defined by its top-left corner (x, y) and its width and height (w, h). It's not rotated and is used to quickly outline an object of interest.

You'll often get these from functions like cv2.boundingRect() after finding contours.

Example: Finding and Drawing Bounding Boxes

Let's find a colored object and draw a box around it.

import cv2
import numpy as np
# Load an image
image = cv2.imread('shapes.png') # Replace with your image path
if image is None:
    print("Error: Could not load image.")
    exit()
# Convert to HSV for better color segmentation
hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
# Define a range for the color blue in HSV
lower_blue = np.array([100, 50, 50])
upper_blue = np.array([130, 255, 255])
# Create a mask for the blue color
mask = cv2.inRange(hsv_image, lower_blue, upper_blue)
# Find contours in the mask
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Loop through the contours and draw a bounding box for each
for contour in contours:
    # Get the bounding rectangle (x, y, w, h)
    x, y, w, h = cv2.boundingRect(contour)
    # Draw the bounding rectangle on the original image
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
# Display the result
cv2.imshow("Bounding Boxes", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Output: You'll see the original image with green rectangles drawn around all the blue objects.
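Because a bounding box is axis-aligned, it is also the natural tool for cropping: the region of interest is just a NumPy slice. A minimal sketch (the image and box values here are synthetic stand-ins; in practice they would come from cv2.imread and cv2.boundingRect):

```python
import numpy as np

# Synthetic image and box for illustration
image = np.zeros((400, 600, 3), dtype=np.uint8)
x, y, w, h = 50, 80, 120, 90  # top-left corner, width, height

# Rows are selected by y..y+h, columns by x..x+w
roi = image[y:y + h, x:x + w]
print(roi.shape)  # (90, 120, 3)
```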


The "Box" from an Object Detection Model (e.g., YOLO)

This is what most people mean when they talk about "OpenCV boxes" in the context of modern AI. You use OpenCV's cv2.dnn module to load a pre-trained model (like YOLO, SSD, etc.) which processes an image and outputs a list of detected objects. For each object, it provides:

  • Class ID: An integer representing the object's class (e.g., 0 for 'person', 2 for 'car').
  • Confidence: A float from 0.0 to 1.0 indicating how confident the model is.
  • Bounding Box Coordinates: For YOLO, the box center (center_x, center_y) plus width and height, normalized to the range 0 to 1 relative to the input image, so they must be scaled back to pixel coordinates.
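That coordinate conversion is worth seeing in isolation before the full example. A sketch with made-up values: YOLO emits (center_x, center_y, width, height) normalized to [0, 1], and we want pixel-space (x, y, w, h) where (x, y) is the top-left corner:

```python
import numpy as np

img_width, img_height = 640, 480
detection = np.array([0.5, 0.5, 0.25, 0.5])  # fake normalized detection

# Scale back to pixels, then shift from the center to the top-left corner
center_x, center_y, box_w, box_h = detection * np.array(
    [img_width, img_height, img_width, img_height])
x = int(center_x - box_w / 2)
y = int(center_y - box_h / 2)
print(x, y, int(box_w), int(box_h))  # 240 120 160 240
```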

Example: Using YOLO with OpenCV dnn

This is a more advanced but very practical example.

Step 1: Get the files

You need three files:

  1. YOLOv3 weights: yolov3.weights
  2. YOLOv3 config: yolov3.cfg
  3. COCO class names: coco.names

You can download them from the official Darknet repository or find many tutorials online that provide direct links.

Step 2: The Python Code

import cv2
import numpy as np
# --- 1. Load the YOLO model ---
# Load class names
with open("coco.names", "r") as f:
    CLASS_NAMES = [line.strip() for line in f.readlines()]
# Load the network
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
# Set the preferred backend and target
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU) # Or DNN_TARGET_CUDA if you have a GPU
# --- 2. Load and prepare the image ---
image = cv2.imread("sample.jpg") # Replace with your image path
if image is None:
    print("Error: Could not load image.")
    exit()
# Get image dimensions
height, width = image.shape[:2]
# Create a "blob" from the image to feed into the network:
# scale pixel values by 1/255, resize to 416x416, and swap BGR to RGB
blob = cv2.dnn.blobFromImage(image, 1/255.0, (416, 416), swapRB=True, crop=False)
# --- 3. Run detection ---
net.setInput(blob)
# YOLOv3 has three output layers; request all of them by name.
# A bare net.forward() would return only the last layer's output.
layer_outputs = net.forward(net.getUnconnectedOutLayersNames())
# --- 4. Process the detections ---
# Lists to store detected boxes, confidences, and class IDs
boxes = []
confidences = []
class_ids = []
# Loop over each of the layer outputs
for output in layer_outputs:
    # Loop over each detection
    for detection in output:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        # Filter out weak detections
        if confidence > 0.5: # Confidence threshold
            # Scale the bounding box coordinates back to the original image size
            box = detection[0:4] * np.array([width, height, width, height])
            (centerX, centerY, box_width, box_height) = box.astype("int")
            # Use the center (x, y)-coordinates to derive the top and left corner
            x = int(centerX - (box_width / 2))
            y = int(centerY - (box_height / 2))
            # Update the list of bounding box coordinates, confidences, and class IDs
            boxes.append([x, y, int(box_width), int(box_height)])
            confidences.append(float(confidence))
            class_ids.append(class_id)
# --- 5. Apply Non-Maximum Suppression (NMS) ---
# This suppresses weak, overlapping boxes
indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
# --- 6. Draw the final boxes ---
if len(indices) > 0:
    for i in indices.flatten():
        (x, y, w, h) = boxes[i]
        label = CLASS_NAMES[class_ids[i]]
        confidence = confidences[i]
        # Draw a bounding box rectangle
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Draw label background
        label_text = f"{label}: {confidence:.2f}"
        (label_w, label_h), _ = cv2.getTextSize(label_text, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
        cv2.rectangle(image, (x, y - label_h - 10), (x + label_w, y), (0, 255, 0), -1)
        # Draw label text
        cv2.putText(image, label_text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1)
# Display the final image
cv2.imshow("YOLO Object Detection", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Output: You'll see your image with green bounding boxes drawn around detected objects like "person", "car", "dog", etc., along with the class name and confidence score.

Summary

| Type of "Box" | Representation | How to Get It | Common Use Case |
|---|---|---|---|
| RotatedRect | (center, size, angle) | Manually created, or from cv2.minAreaRect() on contours | Objects with a clear orientation, like text or a credit card |
| Bounding Box | (x, y, w, h) | cv2.boundingRect() on contours, or drawn manually | Outlining objects for simple analysis, cropping, or as an intermediate step |
| Detection Box | (x, y, w, h) + class_id + confidence | Output of an object detection model via cv2.dnn | The final result of an AI-powered object detection system |