Of course! Creating a photomosaic in Python is a fantastic project that combines image processing, data structures, and algorithmic thinking. I'll guide you through the entire process, from the concept to a complete, runnable Python script.
What is a Photomosaic?
A photomosaic is a larger image made up of many smaller images. When you look at it from a distance, you see the main image. But when you get closer, you see that it's composed of hundreds or thousands of different photos.
The Core Concept
The algorithm works by following these steps:
- Target Image: This is the main image you want to recreate (e.g., a portrait of a person).
- Tile Images (or "Database"): This is a large collection of smaller images that will be used as the "pixels" of the final mosaic.
- Divide and Conquer:
  - The target image is divided into a grid of small, rectangular sections (tiles).
  - For each tile in the target image, we need to find the best matching tile from our database of tile images.
- The "Best Match" Problem: How do we define "best match"? The most common method is to compare the average color of the target tile with the average color of each tile in the database.
  - Calculate the average color (e.g., Red, Green, Blue values) for the current target tile.
  - Calculate the average color for every image in your tile database.
  - Find the tile from the database whose average color is "closest" to the target tile's average color. "Closest" is usually measured using the Euclidean distance in the RGB color space.
- Assemble the Mosaic:
  - Once the best match is found for a target tile, resize that database image to fit the target tile's dimensions.
  - Place this resized image in the corresponding position in the final output image.
- Repeat: Do this for every single tile in the target image's grid.
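As a small worked example of the matching step, the sketch below picks the nearest color by Euclidean distance in RGB. The function name nearest_color_index and the three tile colors are made up for illustration; they are not part of the script that follows.

```python
import numpy as np

def nearest_color_index(target_rgb, candidate_rgbs):
    """Return the index of the candidate closest to target_rgb (Euclidean, RGB)."""
    target = np.asarray(target_rgb, dtype=float)
    candidates = np.asarray(candidate_rgbs, dtype=float)
    # One distance per candidate row: sqrt(dr^2 + dg^2 + db^2)
    distances = np.sqrt(((candidates - target) ** 2).sum(axis=1))
    return int(np.argmin(distances))

# Hypothetical average colors of three database tiles
tile_colors = [(200, 30, 30), (30, 200, 30), (30, 30, 200)]

# A reddish target tile should land on the red database tile (index 0)
best = nearest_color_index((180, 60, 50), tile_colors)
print(best)
```

Because only the ordering of distances matters, the square root could be dropped without changing which tile wins.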
Step-by-Step Implementation with Code
We'll use the Pillow (PIL) library for image manipulation and NumPy for efficient numerical operations, especially for calculating average colors.
Step 1: Install Necessary Libraries
First, make sure you have the required libraries installed. If not, open your terminal or command prompt and run:
pip install pillow numpy
Step 2: Prepare Your Images
- Target Image: Choose a high-resolution image. Let's call it target.jpg.
- Tile Images: Create a folder named tiles and fill it with hundreds or thousands of smaller images. The more images you have, and the more varied their colors, the better your final mosaic will look. These can be photos, icons, patterns: anything! For best results, they should be roughly square (e.g., 100x100 pixels). The script resizes every tile to TILE_SIZE, so other shapes still work but will be stretched.
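If your tile images are not square, one option is to center-crop them before use instead of letting a plain resize stretch them. The sketch below uses Pillow's ImageOps.fit; the helper name make_square_tile and the synthetic test image are assumptions for illustration only.

```python
from PIL import Image, ImageOps

def make_square_tile(image, size=(100, 100)):
    """Center-crop to the aspect ratio of `size`, then resize to `size`."""
    return ImageOps.fit(image.convert("RGB"), size, centering=(0.5, 0.5))

# A synthetic 300x200 image stands in for a real photo here
photo = Image.new("RGB", (300, 200), (120, 80, 40))
tile = make_square_tile(photo)
print(tile.size)
```

Cropping discards the left and right (or top and bottom) margins, which is usually less noticeable than a stretched photo.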
Your project structure should look like this:
photomosaic_project/
├── main.py          # Our Python script
├── target.jpg       # The image to recreate
└── tiles/           # Folder with all the small images
    ├── tile_001.jpg
    ├── tile_002.png
    ├── tile_003.gif
    └── ... (many more)
Step 3: The Python Script (main.py)
Here is the complete, commented script. You can copy and paste this into your main.py file.
import os
import random
from PIL import Image
import numpy as np
# --- Configuration ---
TARGET_IMAGE_PATH = 'target.jpg'
TILES_DIRECTORY = 'tiles'
OUTPUT_IMAGE_PATH = 'mosaic.jpg'
TILE_SIZE = (50, 50) # The size of each tile in the final mosaic (width, height)
TILE_DATABASE_SIZE = 200 # Number of tiles to use from the directory for speed
# --- Helper Functions ---
def get_average_color(image):
    """
    Calculates the average color of a PIL Image object.
    Returns a tuple of (R, G, B) values.
    """
    # Convert to RGB first so grayscale, palette (GIF/PNG), and RGBA images
    # all produce a (height, width, 3) array
    img_array = np.array(image.convert('RGB'))
    # Calculate the mean across the height and width, leaving us with the mean R, G, B
    avg_color = np.mean(img_array, axis=(0, 1))
    return tuple(avg_color.astype(int))
def create_tile_database(tile_paths, size):
    """
    Creates a database of pre-processed tiles.
    Each entry is a dictionary with the path, image, and average color.
    """
    print("Building tile database...")
    tile_db = []
    # Use a random sample of tiles for performance if the directory is huge
    sampled_paths = random.sample(tile_paths, min(TILE_DATABASE_SIZE, len(tile_paths)))
    for path in sampled_paths:
        try:
            # Convert to RGB so every tile matches the mode of the mosaic canvas
            with Image.open(path) as src:
                img = src.convert('RGB')
            # Ensure all tiles are the same size for consistency
            img = img.resize(size)
            avg_color = get_average_color(img)
            tile_db.append({
                'path': path,
                'image': img,
                'avg_color': avg_color
            })
        except Exception as e:
            print(f"Could not process image {path}: {e}")
    print(f"Database created with {len(tile_db)} tiles.")
    return tile_db
def find_best_match(target_tile_avg, tile_db):
    """
    Finds the tile in the database with the closest average color.
    """
    min_distance = float('inf')
    best_tile = None
    for tile in tile_db:
        # Calculate Euclidean distance between colors
        # d = sqrt((r1-r2)^2 + (g1-g2)^2 + (b1-b2)^2)
        distance = np.sqrt(sum((c1 - c2)**2 for c1, c2 in zip(target_tile_avg, tile['avg_color'])))
        if distance < min_distance:
            min_distance = distance
            best_tile = tile
    return best_tile['image']
# --- Main Photomosaic Function ---
def create_photomosaic(target_path, tile_dir, output_path, tile_size):
    """
    The main function to generate the photomosaic.
    """
    # 1. Open the target image (as RGB, so averages always have three channels)
    target_image = Image.open(target_path).convert('RGB')
    target_width, target_height = target_image.size
    # 2. Calculate grid dimensions
    grid_width = target_width // tile_size[0]
    grid_height = target_height // tile_size[1]
    if grid_width == 0 or grid_height == 0:
        print("Error: The target image is smaller than a single tile.")
        return
    # 3. Create a blank canvas for the final mosaic, sized to whole tiles so
    #    no unfilled black strip is left along the right and bottom edges
    mosaic_image = Image.new('RGB', (grid_width * tile_size[0], grid_height * tile_size[1]))
    # 4. Build the tile database
    tile_paths = [os.path.join(tile_dir, f) for f in os.listdir(tile_dir) if f.lower().endswith(('.png', '.jpg', '.jpeg', '.gif'))]
    if not tile_paths:
        print(f"Error: No tile images found in '{tile_dir}'.")
        return
    tile_db = create_tile_database(tile_paths, tile_size)
    if not tile_db:
        print("Error: None of the tile images could be loaded.")
        return
    # 5. Process each tile in the grid
    print("Generating mosaic...")
    for y in range(grid_height):
        for x in range(grid_width):
            # Define the bounding box for the current tile in the target image
            left = x * tile_size[0]
            top = y * tile_size[1]
            right = left + tile_size[0]
            bottom = top + tile_size[1]
            # Crop the target tile
            target_tile = target_image.crop((left, top, right, bottom))
            # Get the average color of this target tile
            target_avg_color = get_average_color(target_tile)
            # Find the best matching tile from our database
            best_match_tile = find_best_match(target_avg_color, tile_db)
            # Paste the best match onto our mosaic canvas
            mosaic_image.paste(best_match_tile, (left, top))
            # Print progress (count the tile just finished so the last one reports 100%)
            progress = ((y * grid_width) + x + 1) / (grid_width * grid_height) * 100
            print(f"\rProgress: {progress:.2f}%", end='', flush=True)
    print("\nMosaic generation complete!")
    # 6. Save the final image
    mosaic_image.save(output_path)
    print(f"Photomosaic saved to {output_path}")
# --- Run the script ---
if __name__ == "__main__":
    create_photomosaic(
        target_path=TARGET_IMAGE_PATH,
        tile_dir=TILES_DIRECTORY,
        output_path=OUTPUT_IMAGE_PATH,
        tile_size=TILE_SIZE
    )
Step 4: Run the Script
Navigate to your photomosaic_project directory in the terminal and run the script:
python main.py
You will see progress as it builds the database and generates the mosaic. Once it's done, you'll find a new file named mosaic.jpg in your project folder. Open it to see your creation!
How to Improve the Results (Next Steps)
The basic script works, but you can make it much more sophisticated and visually appealing.
Use a Better Color Matching Algorithm
Instead of just the average color, you can compare the entire color histogram of the tiles. A histogram represents the distribution of colors in an image. Images with similar color distributions will look more similar, even if their average colors are slightly different.
- How to implement: NumPy's np.histogram can compute the histograms, and a library like scikit-image is another option. The comparison would be done using metrics like the Chi-Squared distance or Histogram Intersection.
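A minimal sketch of histogram matching using only NumPy is shown here. The function names, the choice of 4 bins per channel, and the synthetic solid-color tiles are assumptions made for illustration; real tiles would come from np.array(image).

```python
import numpy as np

def color_histogram(pixels, bins=4):
    """Normalized per-channel histogram of an (H, W, 3) uint8 array."""
    pixels = np.asarray(pixels)
    hists = [
        np.histogram(pixels[..., ch], bins=bins, range=(0, 256))[0]
        for ch in range(3)
    ]
    hist = np.concatenate(hists).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """1.0 for identical normalized histograms, lower for less overlap."""
    return float(np.minimum(h1, h2).sum())

def chi_squared_distance(h1, h2, eps=1e-10):
    """0.0 for identical histograms, larger for more different ones."""
    return float(0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def solid_tile(rgb, size=8):
    """Synthetic stand-in for a real tile: a solid-color square."""
    return np.full((size, size, 3), rgb, dtype=np.uint8)

h_red = color_histogram(solid_tile((220, 0, 0)))
h_dark_red = color_histogram(solid_tile((200, 0, 0)))
h_blue = color_histogram(solid_tile((0, 0, 220)))

# Two reds fall into the same bins; red and blue mostly do not
print(histogram_intersection(h_red, h_dark_red))
print(histogram_intersection(h_red, h_blue))
```

With intersection, the best match is the tile with the highest score; with the Chi-Squared distance, it is the tile with the lowest. More bins make the comparison stricter but also slower.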
Use a Faster Nearest-Neighbor Search
The current find_best_match function is slow because it loops through the entire database for every single tile in the target image, so the work grows with (number of target tiles) x (number of database tiles).
- Solution: Use a k-d tree, optionally combined with a hash map that caches results.
- k-d Tree: A data structure that allows fast nearest-neighbor searches. You would build a k-d tree once from the average colors of your tile database. Then, for each target tile's average color, you query the tree for its nearest neighbor, which typically takes time logarithmic in the database size instead of a full scan.
- Hash map: A plain dictionary cannot find the nearest color by itself, but it can cache the best match for colors that have already been looked up, so repeated target colors skip the search.
- How to implement: The scipy library provides scipy.spatial.KDTree, which is well suited to this.
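The k-d tree lookup could be sketched as follows, assuming scipy is installed. The tile colors and the helper name find_best_match_index are hypothetical; in the script above, the rows would be the avg_color values from tile_db, and the returned index would select the matching entry.

```python
import numpy as np
from scipy.spatial import KDTree

# Hypothetical average colors, one row per database tile
tile_colors = np.array([
    [200, 30, 30],
    [30, 200, 30],
    [30, 30, 200],
    [240, 240, 240],
], dtype=float)

# Built once, then reused for every query
tree = KDTree(tile_colors)

def find_best_match_index(target_avg_color):
    """Return the index of the tile whose average color is nearest."""
    _, index = tree.query(target_avg_color, k=1)
    return int(index)

# All target colors can also be looked up in a single batched call
targets = np.array([[180, 60, 50], [250, 250, 250]], dtype=float)
_, indices = tree.query(targets, k=1)
print(find_best_match_index((180, 60, 50)), indices)
```

The batched form is usually the larger win in practice, since it replaces the per-tile Python loop with one call.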
Allow for Rotation and Flipping
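To make the mosaic less repetitive, you can allow the chosen tile to be rotated (90, 180, 270 degrees) or flipped (horizontally or vertically) before being placed. Rotating or flipping only rearranges pixels, so every variant has the same average color as the original tile. With average-color matching there is nothing extra to add to the database, and you can simply pick a variant at random when pasting. The variants only need their own database entries if you switch to a matching method that is sensitive to layout.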
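Note that these transforms only rearrange pixels, so a tile's average color is identical for every variant. The sketch below therefore picks a variant at paste time rather than at database-build time. The helper names and the synthetic two-tone tile are assumptions for illustration, and the tile is assumed to be square so that rotation keeps its size.

```python
import random
import numpy as np
from PIL import Image, ImageOps

def random_variant(tile, rng=random):
    """Return the tile rotated by a random multiple of 90 degrees, maybe mirrored."""
    angle = rng.choice([0, 90, 180, 270])
    variant = tile.rotate(angle, expand=True) if angle else tile.copy()
    if rng.random() < 0.5:
        variant = ImageOps.mirror(variant)  # left-right flip
    return variant

def average_color(image):
    """Mean (R, G, B) of an image as a float array."""
    return np.asarray(image.convert("RGB"), dtype=float).mean(axis=(0, 1))

# Synthetic two-tone square tile: left half blue, right half red
tile = Image.new("RGB", (50, 50), (200, 50, 50))
tile.paste((20, 20, 220), (0, 0, 25, 50))

variant = random_variant(tile)
print(average_color(tile), average_color(variant))
```

In the main script, this would be applied to the image returned by find_best_match just before it is pasted.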
Edge Blending
Sharp edges between tiles can make the mosaic look artificial. You can apply a slight blur or gradient at the edges of each tile to create a smoother transition between them.
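One possible way to soften the seams is to blur the finished mosaic and then show the blurred version only in narrow bands along the tile borders. The sketch below does this with a feathered mask. The function name soften_tile_edges and the seam_width and blur_radius defaults are assumptions, not tuned values.

```python
from PIL import Image, ImageDraw, ImageFilter

def soften_tile_edges(mosaic, tile_size, seam_width=4, blur_radius=2):
    """Blur only narrow bands around tile borders, leaving tile centers sharp."""
    tile_w, tile_h = tile_size
    blurred = mosaic.filter(ImageFilter.GaussianBlur(blur_radius))

    # Mask: white (255) where the blurred image should show through
    mask = Image.new("L", mosaic.size, 0)
    draw = ImageDraw.Draw(mask)
    for x in range(tile_w, mosaic.width, tile_w):
        draw.line([(x, 0), (x, mosaic.height)], fill=255, width=seam_width)
    for y in range(tile_h, mosaic.height, tile_h):
        draw.line([(0, y), (mosaic.width, y)], fill=255, width=seam_width)
    # Feather the mask so the blur fades out gradually away from the seam
    mask = mask.filter(ImageFilter.GaussianBlur(seam_width / 2))

    # Image.composite takes `blurred` where the mask is white, `mosaic` elsewhere
    return Image.composite(blurred, mosaic, mask)

# Synthetic two-tile mosaic: a black tile next to a white tile
mosaic = Image.new("RGB", (100, 50), (0, 0, 0))
mosaic.paste((255, 255, 255), (50, 0, 100, 50))
softened = soften_tile_edges(mosaic, (50, 50))
print(softened.getpixel((10, 25)), softened.getpixel((49, 25)))
```

Pixels far from a seam keep their original values, while pixels at the seam take an intermediate shade.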
Recursion (Recursive Mosaics)
A more advanced technique is to vary the tile size recursively. If a single large tile already matches a region of the target image well, the algorithm uses it as-is; otherwise it splits the region into smaller sections and repeats the matching on each one. This creates an effect where some areas are sharp and detailed while others are made of larger, more "pixelated" tiles.
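One way to read this idea is as adaptive subdivision: keep a large square where the target is uniform enough, and split it into quadrants where it is not. The sketch below covers only the splitting decision, using color spread as a stand-in for "a large tile matches well". The function name, the spread measure, and the thresholds are all assumptions, and the region size is assumed to be a power of two.

```python
import numpy as np

def subdivide(pixels, left, top, size, min_size, max_std):
    """Split a square region into quadrants until it is uniform enough.

    Returns a list of (left, top, size) squares covering the region.
    """
    region = pixels[top:top + size, left:left + size].reshape(-1, 3)
    # Color spread: largest per-channel standard deviation in the region
    spread = region.std(axis=0).max()
    if size <= min_size or spread <= max_std:
        return [(left, top, size)]  # one large tile is good enough here
    half = size // 2
    squares = []
    for dy in (0, half):
        for dx in (0, half):
            squares += subdivide(pixels, left + dx, top + dy, half, min_size, max_std)
    return squares

# Synthetic 64x64 target: flat gray, with noise in the top-left quadrant
rng = np.random.default_rng(0)
target = np.full((64, 64, 3), 128, dtype=np.uint8)
target[:32, :32] = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

squares = subdivide(target, 0, 0, 64, min_size=8, max_std=10.0)
print(len(squares))
```

Each returned square would then be matched against the database and filled with a tile resized to that square's size, so flat areas get a few large tiles and busy areas get many small ones.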
