How to Merge PDF Files in Python - 杰瑞科技汇

Merging PDF files in Python is a common task, and several libraries handle it well. The best known is PyPDF2, whose development now continues under the name pypdf.

Here’s a complete guide, starting with the simplest method and moving to more advanced and robust options.

Method 1: Using `PyPDF2` (Recommended for Simple Merging)

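PyPDF2 is a pure-Python library for working with PDF documents. It's easy to install and use for basic PDF merging tasks. The script below uses the PdfReader and PdfWriter class names, so it needs a recent PyPDF2 release; older versions called these classes PdfFileReader and PdfFileWriter.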

Installation

First, you need to install the library using pip:

pip install PyPDF2

Basic Script to Merge PDFs

This script will take a list of PDF file paths and merge them into a single output file.

import PyPDF2
import os
def merge_pdfs(pdf_paths, output_path):
    """
    Merges multiple PDF files into a single PDF.
    :param pdf_paths: A list of paths to the PDF files to be merged.
    :param output_path: The path where the merged PDF will be saved.
    """
    # Create a PDF file writer object
    pdf_writer = PyPDF2.PdfWriter()
    for pdf_path in pdf_paths:
        # Pass the path directly so no file handle has to stay open
        # until the merged document is written
        pdf_reader = PyPDF2.PdfReader(pdf_path)
        # Add each page from the reader to the writer
        for page in pdf_reader.pages:
            pdf_writer.add_page(page)
    # Write the merged content to a new PDF file
    with open(output_path, 'wb') as output_file:
        pdf_writer.write(output_file)
    print(f"Successfully merged {len(pdf_paths)} files into {output_path}")
# --- Example Usage ---
if __name__ == "__main__":
    # List of PDF files you want to merge
    # Make sure these files exist in the same directory or provide full paths
    pdf_files_to_merge = [
        'document1.pdf',
        'document2.pdf',
        'document3.pdf'
    ]
    # The name of the output merged file
    output_filename = 'merged_document.pdf'
    # Check that all input files exist before merging
    for file in pdf_files_to_merge:
        if not os.path.exists(file):
            # Exit with a non-zero status so callers can detect the failure
            raise SystemExit(f"Error: File not found - {file}")
    merge_pdfs(pdf_files_to_merge, output_filename)

How to Run the Script

Save the code above as a Python file (e.g., merger.py).
Place the PDF files you want to merge (document1.pdf, document2.pdf, etc.) in the same directory.
Run the script from your terminal:
```
python merger.py
```
A new file named merged_document.pdf will be created.
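
The script above hard-codes its input list. When the PDFs all sit in one folder, the list can be built instead. The following is a minimal sketch using only the standard library; the helper names `natural_key` and `collect_pdfs` are illustrative and not part of PyPDF2. It sorts names so that document2.pdf comes before document10.pdf, which plain alphabetical sorting would get wrong.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that compares digit runs in the file name as integers."""
    # re.split with a capturing group alternates text and digit chunks:
    # "document10.pdf" -> ["document", "10", ".pdf"]
    chunks = re.split(r"([0-9]+)", path.name)
    return [int(chunk) if index % 2 else chunk.lower()
            for index, chunk in enumerate(chunks)]


def collect_pdfs(folder):
    """Return the .pdf files directly inside `folder` in natural name order."""
    pdfs = [p for p in Path(folder).iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf"]
    return sorted(pdfs, key=natural_key)
```

The result can be handed to merge_pdfs as strings, for example `merge_pdfs([str(p) for p in collect_pdfs("reports")], "merged_document.pdf")`, where "reports" is a hypothetical folder name.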

Method 2: Using `pypdf` (The Modern Successor to `PyPDF2`)

PyPDF2 is no longer developed: its maintainers moved the project back to the pypdf name, and PyPDF2 3.0.x is the final release line. pypdf receives the new features and fixes and has a slightly cleaner API, so it's recommended for new projects.

Installation

pip install pypdf

Script to Merge PDFs

The logic is very similar, but pypdf can append a whole document in one call, so the explicit page loop disappears.

from pypdf import PdfWriter
import os
def merge_pdfs_pypdf(pdf_paths, output_path):
    """
    Merges multiple PDF files into a single PDF using pypdf.
    :param pdf_paths: A list of paths to the PDF files to be merged.
    :param output_path: The path where the merged PDF will be saved.
    """
    # PdfMerger has been removed from recent pypdf releases;
    # PdfWriter offers the same append(), write() and close() methods
    merger = PdfWriter()
    for pdf_path in pdf_paths:
        # Append every page of each PDF to the writer
        merger.append(pdf_path)
    # Write the merged PDF to a file
    merger.write(output_path)
    # Close the writer to free resources
    merger.close()
    print(f"Successfully merged {len(pdf_paths)} files into {output_path}")
# --- Example Usage ---
if __name__ == "__main__":
    pdf_files_to_merge = [
        'document1.pdf',
        'document2.pdf',
        'document3.pdf'
    ]
    output_filename = 'merged_document_pypdf.pdf'
    for file in pdf_files_to_merge:
        if not os.path.exists(file):
            # Exit with a non-zero status so callers can detect the failure
            raise SystemExit(f"Error: File not found - {file}")
    merge_pdfs_pypdf(pdf_files_to_merge, output_filename)

As you can see, merger.append() is a very convenient and readable method.
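
append() can also copy only part of a document. The sketch below assumes a pypdf version whose PdfWriter provides append() with a `pages` argument; the function name `merge_page_ranges` and the `parts` structure are made up for illustration. Each entry pairs a path with either None (all pages) or a (start, stop) tuple of zero-based page indices, with stop excluded.

```python
def merge_page_ranges(parts, output_path):
    """Merge whole files or page ranges into one PDF.

    `parts` is a list of (path, pages) pairs. `pages` is None for the
    whole file, or a (start, stop) tuple of zero-based indices where
    `stop` is not included.
    """
    # Imported here so this module still loads where pypdf is not installed
    from pypdf import PdfWriter

    writer = PdfWriter()
    for path, pages in parts:
        # pages=None appends every page of the file
        writer.append(path, pages=pages)
    with open(output_path, "wb") as output_file:
        writer.write(output_file)
```

For example, `merge_page_ranges([("document1.pdf", None), ("document2.pdf", (0, 2))], "merged.pdf")` would keep all of document1.pdf and only the first two pages of document2.pdf.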

Method 3: Using `pdfrw` (Good for Modifying PDFs)

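pdfrw is another pure-Python option. It reads and writes PDFs at the object level, which helps if you need to do more than just merge, like modifying pages or forms. Its releases are infrequent, so check that it handles your files before relying on it.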

Installation

pip install pdfrw

Script to Merge PDFs

from pdfrw import PdfReader, PdfWriter
import os
def merge_pdfs_pdfrw(pdf_paths, output_path):
    """
    Merges multiple PDF files into a single PDF using pdfrw.
    """
    # Create a PDF writer object
    pdf_writer = PdfWriter()
    for pdf_path in pdf_paths:
        # Read the pages from each PDF
        pdf_reader = PdfReader(pdf_path)
        for page in pdf_reader.pages:
            # Add the page to the writer
            pdf_writer.addpage(page)
    # Write the output PDF
    pdf_writer.write(output_path)
    print(f"Successfully merged {len(pdf_paths)} files into {output_path}")
# --- Example Usage ---
if __name__ == "__main__":
    pdf_files_to_merge = [
        'document1.pdf',
        'document2.pdf',
        'document3.pdf'
    ]
    output_filename = 'merged_document_pdfrw.pdf'
    for file in pdf_files_to_merge:
        if not os.path.exists(file):
            # Exit with a non-zero status so callers can detect the failure
            raise SystemExit(f"Error: File not found - {file}")
    merge_pdfs_pdfrw(pdf_files_to_merge, output_filename)
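
All three example scripts stop at the first missing input. When several files are missing, it can be friendlier to report them all in one pass. Here is a small standard-library sketch of that idea; `find_missing` and `require_files` are hypothetical helper names, not functions from any of the PDF libraries.

```python
import os
import sys


def find_missing(paths):
    """Return the entries of `paths` that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


def require_files(paths):
    """Print every missing input to stderr and exit with status 1.

    Returns normally when all files exist.
    """
    missing = find_missing(paths)
    if missing:
        for path in missing:
            print(f"Error: File not found - {path}", file=sys.stderr)
        sys.exit(1)
```

Calling `require_files(pdf_files_to_merge)` before the merge would replace the per-file loop in each script. Using os.path.isfile rather than os.path.exists also rejects directories, which is a deliberate change from the original check.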

Comparison and Recommendation

| Feature | `PyPDF2` | `pypdf` | `pdfrw` |
| --- | --- | --- | --- |
| Ease of Use | Good | Excellent | Good |
| API | Standard | Clean, modern | Standard |
| Maintenance | Deprecated (superseded by pypdf) | High (Active) | Low (infrequent releases) |
| Key Strength | Simplicity, pure Python | Modern, actively developed | Low-level access, good for modification |
| Best For | Existing scripts that already depend on it. | New projects. General purpose PDF manipulation. | Advanced tasks like modifying forms, page splitting. |

Recommendation:

For new projects, use pypdf. It's the actively maintained successor to PyPDF2.
PyPDF2 still works for quick and simple tasks, but it no longer receives updates, so prefer pypdf when you can.
If you need low-level changes to PDF objects (for example, editing form fields), pdfrw is worth a look. Page-level operations such as rotating or extracting pages are also covered by pypdf.

Advanced Tip: Creating a Command-Line Tool

You can make your script much more useful by allowing it to accept file paths as command-line arguments. The argparse module is perfect for this.

Here's an enhanced version of the pypdf script that does this:

# merge_cli.py
import argparse
import os
import sys
from pypdf import PdfWriter
def main():
    # Set up the argument parser
    parser = argparse.ArgumentParser(
        description="Merge multiple PDF files into a single PDF."
    )
    # Add arguments for input files and output file
    parser.add_argument(
        'output',
        help='The name of the output merged PDF file.'
    )
    parser.add_argument(
        'pdf_files',
        nargs='+',  # This means one or more arguments
        help='A list of PDF files to merge.'
    )
    # Parse the arguments
    args = parser.parse_args()
    # Check if all input files exist
    for file in args.pdf_files:
        if not os.path.exists(file):
            print(f"Error: File not found - {file}")
            return 1
    # Perform the merge (PdfWriter replaces the PdfMerger class
    # that has been removed from recent pypdf releases)
    merger = PdfWriter()
    try:
        for pdf_file in args.pdf_files:
            print(f"Adding {pdf_file}...")
            merger.append(pdf_file)
        merger.write(args.output)
        print(f"\nSuccess! Files merged into '{args.output}'")
    except Exception as e:
        print(f"An error occurred: {e}")
        return 1
    finally:
        merger.close()
    return 0
if __name__ == "__main__":
    # Pass main()'s return value to the shell as the exit status
    sys.exit(main())

How to Use the Command-Line Tool

Save the code as merge_cli.py.
Make sure pypdf is installed (pip install pypdf).
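Run it from your terminal, passing the output file name first, followed by the PDF files to merge, e.g. python merge_cli.py merged_document.pdf document1.pdf document2.pdf.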
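
How argparse splits the positional arguments can be checked without any PDFs by handing parse_args() a list instead of the real command line. The snippet below is a small sketch that rebuilds the same two arguments as merge_cli.py; `build_parser` is a name introduced here for illustration, and the file names are placeholders.

```python
import argparse


def build_parser():
    """Build a parser with the same two positionals as merge_cli.py."""
    parser = argparse.ArgumentParser(
        description="Merge multiple PDF files into a single PDF."
    )
    parser.add_argument("output", help="The name of the output merged PDF file.")
    parser.add_argument("pdf_files", nargs="+", help="A list of PDF files to merge.")
    return parser


# The first value goes to `output`; everything after it goes to `pdf_files`
args = build_parser().parse_args(["merged.pdf", "document1.pdf", "document2.pdf"])
print(args.output)     # merged.pdf
print(args.pdf_files)  # ['document1.pdf', 'document2.pdf']
```

Because `pdf_files` uses nargs='+', passing only an output name makes argparse print a usage message and exit with an error instead of running the merge.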
