Of course! Merging PDF files in Python is a common task, and there are several libraries to choose from. PyPDF2 is the best known, although active development has moved to its successor, pypdf.

Here’s a complete guide, starting with the simplest method and moving to more advanced and robust options.
Method 1: Using PyPDF2 (Recommended for Simple Merging)
PyPDF2 is a pure-Python library for working with PDF documents. It's easy to install and use for basic PDF merging tasks.
Installation
First, you need to install the library using pip:
pip install PyPDF2
Basic Script to Merge PDFs
This script will take a list of PDF file paths and merge them into a single output file.

import os
import sys

import PyPDF2


def merge_pdfs(pdf_paths, output_path):
    """
    Merges multiple PDF files into a single PDF.

    :param pdf_paths: A list of paths to the PDF files to be merged.
    :param output_path: The path where the merged PDF will be saved.
    """
    # Create a PDF file writer object
    pdf_writer = PyPDF2.PdfWriter()

    for pdf_path in pdf_paths:
        # Give PdfReader the path so it opens and reads the file itself;
        # no input file handle then has to stay open until the final write
        pdf_reader = PyPDF2.PdfReader(pdf_path)

        # Add each page from the reader to the writer
        for page in pdf_reader.pages:
            pdf_writer.add_page(page)

    # Write the merged content to a new PDF file
    with open(output_path, 'wb') as output_file:
        pdf_writer.write(output_file)

    print(f"Successfully merged {len(pdf_paths)} files into {output_path}")


# --- Example Usage ---
if __name__ == "__main__":
    # List of PDF files you want to merge
    # Make sure these files exist in the same directory or provide full paths
    pdf_files_to_merge = [
        'document1.pdf',
        'document2.pdf',
        'document3.pdf'
    ]

    # The name of the output merged file
    output_filename = 'merged_document.pdf'

    # Check if all input files exist; stop with a non-zero exit code if not
    for file in pdf_files_to_merge:
        if not os.path.exists(file):
            print(f"Error: File not found - {file}")
            sys.exit(1)

    merge_pdfs(pdf_files_to_merge, output_filename)
How to Run the Script
- Save the code above as a Python file (e.g., merger.py).
- Place the PDF files you want to merge (document1.pdf, document2.pdf, etc.) in the same directory.
- Run the script from your terminal:
python merger.py
- A new file named merged_document.pdf will be created.
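The existence check in the script can be extended into a quick sanity check that each input looks like a PDF before any merging starts. The sketch below is an assumption-laden illustration, not part of any PDF library: the helper name find_invalid_pdfs is hypothetical, and it only checks that the file begins with the %PDF- header, which is not full validation (some readers tolerate a few bytes before the header).

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # header that a PDF file is expected to start with


def find_invalid_pdfs(paths):
    """Return (path, reason) pairs for inputs that are missing or lack a PDF header.

    Hypothetical helper for illustration; it does not parse the PDF.
    """
    problems = []
    for raw in paths:
        path = Path(raw)
        if not path.is_file():
            problems.append((str(path), "file not found"))
            continue
        # Read only as many bytes as the header needs
        with path.open("rb") as handle:
            header = handle.read(len(PDF_MAGIC))
        if header != PDF_MAGIC:
            problems.append((str(path), "missing %PDF- header"))
    return problems


if __name__ == "__main__":
    for path, reason in find_invalid_pdfs(["document1.pdf", "document2.pdf"]):
        print(f"Skipping {path}: {reason}")
```

Calling this before merge_pdfs lets the script report every bad input at once instead of stopping at the first one.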
Method 2: Using pypdf (The Modern Successor to PyPDF2)
PyPDF2 is deprecated: its development has moved to its successor, pypdf, which is actively maintained and has a slightly cleaner API. pypdf is the recommended choice for new projects.
Installation
pip install pypdf
Script to Merge PDFs
The logic is very similar, but pypdf's PdfWriter can append a whole document in a single call. Older tutorials use PdfMerger for this; that class was deprecated and then removed in pypdf 5.0, so PdfWriter is used here.
import os
import sys

from pypdf import PdfWriter


def merge_pdfs_pypdf(pdf_paths, output_path):
    """
    Merges multiple PDF files into a single PDF using pypdf.

    :param pdf_paths: A list of paths to the PDF files to be merged.
    :param output_path: The path where the merged PDF will be saved.
    """
    merger = PdfWriter()

    for pdf_path in pdf_paths:
        # Append every page of each PDF to the writer
        merger.append(pdf_path)

    # Write the merged PDF to a file
    merger.write(output_path)

    # Close the writer to free resources
    merger.close()

    print(f"Successfully merged {len(pdf_paths)} files into {output_path}")


# --- Example Usage ---
if __name__ == "__main__":
    pdf_files_to_merge = [
        'document1.pdf',
        'document2.pdf',
        'document3.pdf'
    ]
    output_filename = 'merged_document_pypdf.pdf'

    # Stop with a non-zero exit code if any input file is missing
    for file in pdf_files_to_merge:
        if not os.path.exists(file):
            print(f"Error: File not found - {file}")
            sys.exit(1)

    merge_pdfs_pypdf(pdf_files_to_merge, output_filename)
As you can see, merger.append() is a very convenient and readable method.
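Because each document is appended in list order, the order of pdf_paths decides the page order of the result. As a rough sketch, assuming the inputs sit in one folder and are numbered in their file names, the list could be built like this. The helpers natural_key and collect_pdfs are hypothetical names for this example and use only the standard library.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that compares digit runs as numbers, so 'part2' sorts before 'part10'."""
    parts = re.split(r"(\d+)", Path(path).name)
    # re.split with a capturing group puts the digit runs at odd indices
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def collect_pdfs(directory):
    """Return paths of the .pdf files directly inside `directory`, in natural order."""
    pdfs = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    ]
    return [str(p) for p in sorted(pdfs, key=natural_key)]


if __name__ == "__main__":
    # Lists the PDFs in the current directory in the order they would be merged
    for pdf in collect_pdfs("."):
        print(pdf)
```

The returned list can be passed straight to a merge function as its pdf_paths argument. A plain sorted() call would put part10.pdf before part2.pdf, which is why the key splits out the numbers.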
Method 3: Using pdfrw (Good for Modifying PDFs)
pdfrw is another option, especially if you need to do more than just merge, such as modifying pages or forms. Note that it has not had a PyPI release since 2017, so check that it works with your Python version before relying on it.

Installation
pip install pdfrw
Script to Merge PDFs
import os
import sys

from pdfrw import PdfReader, PdfWriter


def merge_pdfs_pdfrw(pdf_paths, output_path):
    """
    Merges multiple PDF files into a single PDF using pdfrw.
    """
    # Create a PDF writer object
    pdf_writer = PdfWriter()

    for pdf_path in pdf_paths:
        # Read the pages from each PDF
        pdf_reader = PdfReader(pdf_path)
        for page in pdf_reader.pages:
            # Add the page to the writer
            pdf_writer.addpage(page)

    # Write the output PDF
    pdf_writer.write(output_path)

    print(f"Successfully merged {len(pdf_paths)} files into {output_path}")


# --- Example Usage ---
if __name__ == "__main__":
    pdf_files_to_merge = [
        'document1.pdf',
        'document2.pdf',
        'document3.pdf'
    ]
    output_filename = 'merged_document_pdfrw.pdf'

    # Stop with a non-zero exit code if any input file is missing
    for file in pdf_files_to_merge:
        if not os.path.exists(file):
            print(f"Error: File not found - {file}")
            sys.exit(1)

    merge_pdfs_pdfrw(pdf_files_to_merge, output_filename)
Comparison and Recommendation
| Feature | PyPDF2 | pypdf | pdfrw |
|---|---|---|---|
| Ease of Use | Good | Excellent | Good |
| API | Standard | Clean, modern | Standard |
| Maintenance | None (deprecated in favor of pypdf) | High (active) | Low (no PyPI release since 2017) |
| Key Strength | Simplicity, pure Python | Modern, actively developed | Good for low-level modification |
| Best For | Existing scripts that already depend on it. | New projects. General purpose PDF manipulation. | Advanced tasks like modifying forms, page splitting. |
Recommendation:
- For new projects, use pypdf. It is actively maintained and is where PyPDF2's development continues.
- PyPDF2 still works for quick scripts that already use it, but it no longer receives updates, so prefer pypdf when you can.
- If you need to do more than merge (e.g., rotate, extract, or modify PDFs), pdfrw is also worth considering, though pypdf covers these tasks too.
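If a script has to run on machines where you do not control which of these libraries is installed, the preference order above can be checked at startup. This is only a sketch under that assumption: pick_pdf_library is a hypothetical helper, and it reports which module name is importable without importing it or merging anything.

```python
from importlib.util import find_spec


def pick_pdf_library(candidates=("pypdf", "PyPDF2", "pdfrw")):
    """Return the first importable module name from `candidates`, or None.

    The default order mirrors the recommendation: pypdf first, then the
    older alternatives.
    """
    for name in candidates:
        # find_spec returns None when a top-level module cannot be found
        if find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    library = pick_pdf_library()
    if library is None:
        print("No PDF library found. Try: pip install pypdf")
    else:
        print(f"Using {library}")
```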
Advanced Tip: Creating a Command-Line Tool
You can make your script much more useful by allowing it to accept file paths as command-line arguments. The argparse module is perfect for this.
Here's an enhanced version of the pypdf script that does this:
# merge_cli.py
import argparse
import os
import sys

from pypdf import PdfWriter  # PdfMerger was removed in pypdf 5.0


def main():
    # Set up the argument parser
    parser = argparse.ArgumentParser(
        description="Merge multiple PDF files into a single PDF."
    )

    # Add arguments for input files and output file
    parser.add_argument(
        'output',
        help='The name of the output merged PDF file.'
    )
    parser.add_argument(
        'pdf_files',
        nargs='+',  # This means one or more arguments
        help='A list of PDF files to merge.'
    )

    # Parse the arguments
    args = parser.parse_args()

    # Check if all input files exist
    for file in args.pdf_files:
        if not os.path.exists(file):
            print(f"Error: File not found - {file}")
            return 1

    # Perform the merge
    try:
        merger = PdfWriter()
        for pdf_file in args.pdf_files:
            print(f"Adding {pdf_file}...")
            merger.append(pdf_file)
        merger.write(args.output)
        merger.close()
        print(f"\nSuccess! Files merged into '{args.output}'")
    except Exception as e:
        print(f"An error occurred: {e}")
        return 1

    return 0


if __name__ == "__main__":
    # Use main()'s return value as the process exit code
    sys.exit(main())
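To see how argparse splits the arguments without touching any PDF, the same two positional arguments can be parsed from an explicit list. This is a small illustration only: build_parser is a hypothetical helper that repeats the parser setup from the script, and the file names are placeholders.

```python
import argparse


def build_parser():
    """Build a parser with the same two positional arguments as merge_cli.py."""
    parser = argparse.ArgumentParser(
        description="Merge multiple PDF files into a single PDF."
    )
    parser.add_argument("output", help="The name of the output merged PDF file.")
    parser.add_argument("pdf_files", nargs="+", help="A list of PDF files to merge.")
    return parser


# Passing a list to parse_args() stands in for what would come from sys.argv[1:]
args = build_parser().parse_args(["merged.pdf", "a.pdf", "b.pdf"])
print(args.output)     # merged.pdf
print(args.pdf_files)  # ['a.pdf', 'b.pdf']
```

The first value goes to output and everything after it goes to pdf_files. Because of nargs='+', passing only the output name makes argparse print a usage error and exit.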
How to Use the Command-Line Tool
- Save the code as merge_cli.py.
- Make sure pypdf is installed (pip install pypdf).
- Run it from your terminal, passing the output file name
