杰瑞科技汇

How can Python call Ghostscript to convert PDFs?

Driving Ghostscript from Python is a common and powerful technique, primarily for converting PostScript (.ps) and PDF files. There are two main ways to do it:

  1. Using a dedicated Python library (Recommended): This is the easiest, safest, and most reliable method. It abstracts away the command-line calls and handles paths and arguments correctly.
  2. Using the subprocess module (Manual): This gives you full control by directly calling the gs (or gswin64c.exe on Windows) command-line executable. It's more work but offers maximum flexibility.

Method 1: Using the pygs Library (Recommended)

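The pygs library is described here as a simple, lightweight wrapper around Ghostscript, and it's the best place to start. The examples below assume it exposes pygs.main() and pygs.GhostscriptError, so confirm both names against the installed package's documentation.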

Step 1: Install Ghostscript

First, you must have Ghostscript installed on your system. Python is just a tool to control it; it doesn't contain Ghostscript itself.

  • Windows: Download the installer from the Ghostscript official website. After installing, add its bin directory to your system's PATH, or note the installation path so you can pass it explicitly.
  • macOS (using Homebrew): brew install ghostscript
  • Linux (Debian/Ubuntu): sudo apt-get update && sudo apt-get install ghostscript
  • Linux (Fedora/CentOS): sudo dnf install ghostscript
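
Once Ghostscript is installed, it can help to confirm that Python is able to launch it. The following is a minimal sketch using only the standard library. The helper name ghostscript_version is hypothetical, and the default executable name gs is an assumption (on Windows you would typically pass gswin64c).

```python
import subprocess


def ghostscript_version(executable="gs"):
    """Return the text printed by `<executable> --version`, or None.

    `executable` is an assumption: 'gs' on Linux/macOS, usually
    'gswin64c' on Windows.
    """
    try:
        result = subprocess.run(
            [executable, "--version"],
            check=True,
            capture_output=True,
            text=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Not installed, not on PATH, not executable, or the command failed.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = ghostscript_version()
    if version:
        print(f"Ghostscript version: {version}")
    else:
        print("Ghostscript not found")
```

If this prints "Ghostscript not found", fix the installation or PATH before trying any of the conversion examples.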

Step 2: Install the pygs Library

You can install it using pip:

pip install pygs

Step 3: Basic Usage with pygs

Here are some common examples.

Example 1: Convert a PDF to Images (PNG)

This will take each page of a PDF and save it as a separate PNG file.

import pygs
# --- Input and Output ---
input_pdf = 'input.pdf'
# Ghostscript expands a printf-style %d in the output name to the page number,
# so %04d yields output_page_0001.png, output_page_0002.png, ...
output_png_template = 'output_page_%04d.png'
# --- Ghostscript Arguments ---
# -sDEVICE=pngalpha:  Use the 32-bit RGBA PNG device (supports transparency).
# -r300:               Set the resolution to 300 DPI.
# -dNOPAUSE:           Don't pause between pages.
# -dBATCH:             Exit after processing the last file.
# -dSAFER:             Restrict file access by the document being processed
#                      (the default since Ghostscript 9.50).
# -sOutputFile=...:    Specifies the output file name template.
args = [
    '-sDEVICE=pngalpha',
    '-r300',
    '-dNOPAUSE',
    '-dBATCH',
    '-dSAFER',
    f'-sOutputFile={output_png_template}',
    input_pdf
]
try:
    print(f"Converting '{input_pdf}' to PNG images...")
    pygs.main(args)
    print("Conversion successful!")
except pygs.GhostscriptError as e:
    print(f"An error occurred: {e}")

Example 2: Convert a PostScript File to a PDF

This is a very common use case.

import pygs
input_ps = 'document.ps'
output_pdf = 'document.pdf'
args = [
    '-sDEVICE=pdfwrite',  # Use the PDF writing device
    '-dNOPAUSE',
    '-dBATCH',
    '-dSAFER',
    f'-sOutputFile={output_pdf}',
    input_ps
]
try:
    print(f"Converting '{input_ps}' to '{output_pdf}'...")
    pygs.main(args)
    print("Conversion successful!")
except pygs.GhostscriptError as e:
    print(f"An error occurred: {e}")

Example 3: Compress an Existing PDF

You can use Ghostscript to optimize and reduce the file size of a PDF.

import pygs
input_pdf = 'large_document.pdf'
output_pdf = 'large_document_compressed.pdf'
# -dPDFSETTINGS=/screen:   Lowest quality, smallest file (images downsampled to about 72 dpi).
# -dPDFSETTINGS=/ebook:    Medium quality for eBooks (about 150 dpi).
# -dPDFSETTINGS=/printer:  Intended for high-quality printing (about 300 dpi).
# -dPDFSETTINGS=/prepress: High quality, color preserving (about 300 dpi).
# -dPDFSETTINGS=/default:  General-purpose output; may produce a larger file.
args = [
    '-sDEVICE=pdfwrite',
    '-dNOPAUSE',
    '-dBATCH',
    '-dSAFER',
    '-dPDFSETTINGS=/ebook',  # Choose a setting
    f'-sOutputFile={output_pdf}',
    input_pdf
]
try:
    print(f"Compressing '{input_pdf}'...")
    pygs.main(args)
    print("Compression successful!")
except pygs.GhostscriptError as e:
    print(f"An error occurred: {e}")
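
Because the preset name is easy to mistype, one option is to build the argument list through a small function that validates it first. This is only a sketch: build_compress_args and percent_saved are hypothetical helper names, and the returned list has no executable name in front, matching the argument lists in the examples above (prepend 'gs' if you pass it to subprocess).

```python
VALID_PRESETS = ("screen", "ebook", "printer", "prepress", "default")


def build_compress_args(input_pdf, output_pdf, preset="ebook"):
    """Return Ghostscript arguments that rewrite a PDF with a quality preset."""
    if preset not in VALID_PRESETS:
        raise ValueError(
            f"Unknown preset {preset!r}; expected one of {VALID_PRESETS}"
        )
    return [
        "-sDEVICE=pdfwrite",
        "-dNOPAUSE",
        "-dBATCH",
        "-dSAFER",
        f"-dPDFSETTINGS=/{preset}",
        f"-sOutputFile={output_pdf}",
        input_pdf,
    ]


def percent_saved(original_bytes, compressed_bytes):
    """Return how much smaller the compressed file is, as a percentage."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 100.0 * (original_bytes - compressed_bytes) / original_bytes


args = build_compress_args("large_document.pdf", "large_document_compressed.pdf")
print(args)
print(f"{percent_saved(1000, 250):.1f}% smaller")
```

After a real run, os.path.getsize() on the input and output files gives the two byte counts to feed into percent_saved.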

Method 2: Using the subprocess Module

This method is useful if you can't or don't want to install a third-party library. It gives you direct control over the gs command.

Step 1: Ensure Ghostscript is Installed and in your PATH

The subprocess module will try to find the gs executable. If it's not in your system's PATH, you'll need to provide the full path to the executable (e.g., C:/Program Files/gs/gs10.02.1/bin/gswin64c.exe).
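
One way to handle both cases (on PATH, or a known full path) is sketched below with the standard library's shutil.which. The function name find_ghostscript and the candidate list are assumptions for illustration. The which parameter exists only so the lookup can be swapped out in tests.

```python
import shutil
from pathlib import Path

# Executable names to try, in order. gswin64c/gswin32c are the usual
# Windows console builds; this list is an assumption, extend it as needed.
CANDIDATE_NAMES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript(explicit_path=None, which=shutil.which):
    """Return the path of a Ghostscript executable or raise FileNotFoundError."""
    if explicit_path is not None:
        # The caller knows where Ghostscript lives; just verify the file exists.
        if Path(explicit_path).is_file():
            return str(explicit_path)
        raise FileNotFoundError(f"No Ghostscript executable at {explicit_path!r}")
    for name in CANDIDATE_NAMES:
        found = which(name)
        if found:
            return found
    raise FileNotFoundError(
        "Ghostscript not found on PATH; pass explicit_path instead"
    )


if __name__ == "__main__":
    try:
        print(find_ghostscript())
    except FileNotFoundError as exc:
        print(exc)
```

The returned path can be used as the first element of the argument list passed to subprocess.run().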

Step 2: Basic Usage with subprocess

The key is to use subprocess.run() and pass the arguments as a list of strings.
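
Passing a list means each element reaches Ghostscript as exactly one argument, with no shell in between, so file names containing spaces need no quoting. The sketch below illustrates this. build_gs_command is a hypothetical helper, and the file names are made up for the example.

```python
import shlex


def build_gs_command(executable, device, output_file, input_files, extra_options=()):
    """Assemble a Ghostscript command as a list of separate arguments."""
    command = [executable, f"-sDEVICE={device}", "-dNOPAUSE", "-dBATCH", "-dSAFER"]
    command.extend(extra_options)
    command.append(f"-sOutputFile={output_file}")
    command.extend(input_files)
    return command


command = build_gs_command("gs", "pdfwrite", "my report.pdf", ["my report.ps"])

# Each list element is one argument, spaces included.
for argument in command:
    print(repr(argument))

# For comparison: the quoting a POSIX shell would need for the same command.
print(shlex.join(command))
```

The list form is what you hand to subprocess.run(). The shlex.join() output is only useful for logging or for pasting into a terminal.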

Example 1: Convert a PDF to Images (PNG)

import subprocess
input_pdf = 'input.pdf'
# Ghostscript expands a printf-style %d in the output name to the page number.
# %04d zero-pads it to four digits: output_page_0001.png, output_page_0002.png, ...
output_png_template = 'output_page_%04d.png'
# The arguments are passed as a list of strings.
args = [
    'gs',  # The Ghostscript executable name
    '-sDEVICE=pngalpha',
    '-r300',
    '-dNOPAUSE',
    '-dBATCH',
    '-dSAFER',
    f'-sOutputFile={output_png_template}',
    input_pdf
]
print(f"Converting '{input_pdf}' using subprocess...")
try:
    # The 'check=True' argument will raise a CalledProcessError
    # if the command returns a non-zero exit code (i.e., an error).
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    print("Conversion successful!")
    print("Ghostscript output:")
    print(result.stdout)
except FileNotFoundError:
    print("Error: 'gs' command not found. Is Ghostscript installed and in your PATH?")
except subprocess.CalledProcessError as e:
    print("An error occurred during conversion.")
    print(f"Return Code: {e.returncode}")
    print(f"Error Output:\n{e.stderr}")
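
Ghostscript numbers the output files by expanding a printf-style placeholder such as %04d in -sOutputFile, starting at page 1. Python's % operator follows the same convention for %d, so it can predict the file names a run should produce. A small sketch, with expected_output_names as a hypothetical helper:

```python
def expected_output_names(template, page_count):
    """Return the file names a %d-style template yields for pages 1..page_count."""
    return [template % page for page in range(1, page_count + 1)]


names = expected_output_names("output_page_%04d.png", 3)
print(names)
# ['output_page_0001.png', 'output_page_0002.png', 'output_page_0003.png']
```

Comparing this list against the files actually present after a conversion is a quick way to detect pages that failed to render.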

Example 2: Convert a PostScript File to a PDF (Choosing the Executable per OS)

On Windows, the console executable is usually gswin64c.exe (gswin32c.exe for 32-bit builds) instead of gs.

import subprocess
import platform
input_ps = 'document.ps'
output_pdf = 'document.pdf'
# Determine the correct executable name based on the OS
if platform.system() == 'Windows':
    gs_executable = 'gswin64c.exe'
else:
    gs_executable = 'gs'
args = [
    gs_executable,
    '-sDEVICE=pdfwrite',
    '-dNOPAUSE',
    '-dBATCH',
    '-dSAFER',
    f'-sOutputFile={output_pdf}',
    input_ps
]
print(f"Converting '{input_ps}' to '{output_pdf}' using subprocess...")
try:
    subprocess.run(args, check=True, capture_output=True, text=True)
    print("Conversion successful!")
except FileNotFoundError:
    print(f"Error: '{gs_executable}' command not found. Is Ghostscript installed and in your PATH?")
except subprocess.CalledProcessError as e:
    print("An error occurred during conversion.")
    print(f"Return Code: {e.returncode}")
    print(f"Error Output:\n{e.stderr}")

Comparison and Recommendation

Feature | pygs (Recommended) | subprocess
Ease of Use | Excellent. Simple function call. | Good. Requires careful argument formatting.
Reliability | High. Handles paths and OS differences well. | Medium. You are responsible for correct paths and executable names.
Dependencies | Requires pip install pygs. | None, but requires Ghostscript to be installed correctly.
Flexibility | Good. Covers most common use cases. | Excellent. You can use any Ghostscript command-line option.
Error Handling | Built-in GhostscriptError. | Requires manual checking of returncode and stderr.
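
The error-handling gap on the subprocess side can be narrowed with a thin wrapper that turns a non-zero exit code into a single exception. The sketch below is one possible shape, not an established API: GhostscriptFailed and run_command are hypothetical names. It keeps both output streams on the exception, on the assumption that diagnostics may appear on either one.

```python
import subprocess


class GhostscriptFailed(RuntimeError):
    """Raised when the wrapped command exits with a non-zero status."""

    def __init__(self, returncode, stdout, stderr):
        super().__init__(f"Command exited with code {returncode}")
        self.returncode = returncode
        self.stdout = stdout
        self.stderr = stderr


def run_command(command):
    """Run a command given as a list of strings and return its stdout.

    Raises GhostscriptFailed on a non-zero exit code. A missing executable
    still raises FileNotFoundError, as with subprocess.run() itself.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        raise GhostscriptFailed(
            completed.returncode, completed.stdout, completed.stderr
        )
    return completed.stdout
```

A caller would then write try: run_command(args) / except GhostscriptFailed as e: and read e.returncode, e.stdout, and e.stderr in one place.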

Recommendation:

  • For most use cases, use the pygs library. It's simpler, safer, and less error-prone.
  • Use the subprocess method only if you need to use a very specific or advanced Ghostscript feature that isn't easily accessible through pygs, or if you have strict dependency rules that prevent installing pygs.