How to Convert Word to PDF in Java? - 杰瑞科技汇

Converting a Word document (.docx) to PDF in Java is a common requirement. Several libraries can do it, each with its own strengths and trade-offs.

Here’s a comprehensive guide covering the most popular and effective methods.

Summary of Options

Library	License	Ease of Use	Performance	Key Feature
Apache POI	Apache 2.0 (Permissive)	Medium	Good	The standard for Office file manipulation in Java.
iText 7	AGPL v3 or Commercial	Medium	Excellent	Powerful and feature-rich, but the AGPL can be restrictive for closed-source projects.
Docx4j	Apache 2.0 (Permissive)	Medium	Good	Excellent choice if you also work with other OOXML formats.
Aspose.Words	Commercial (Free trial)	Easiest	Excellent	The most robust and feature-rich, but requires a paid license.

Method 1: Apache POI with `pdfbox` (Recommended Free & Open Source)

This is a popular combination. Apache POI is the de facto standard for reading and writing Office files in Java, but it has no PDF writer of its own. You read the document with POI and then draw its content into a PDF with Apache PDFBox yourself; the two libraries do not share a rendering pipeline.

How it works: POI parses the .docx file, extracts the text, paragraphs, and basic formatting, and then PDFBox is used to lay out this content on a PDF page.
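POI's XWPFDocument reads only the OOXML .docx format, which is a ZIP container; a legacy binary .doc file passed in by mistake fails at load time. A cheap pre-check, sketched here without any library, is to look at the file extension and the first four bytes. The class DocxSniffer and its method names are hypothetical, and a matching ZIP signature only shows that the file is a ZIP archive, not that it is a valid Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical pre-check run before handing a file to a converter. */
final class DocxSniffer {
    // Signature that starts a non-empty ZIP archive: "PK" followed by 0x03 0x04
    private static final byte[] ZIP_MAGIC = {'P', 'K', 3, 4};

    /** True if the name ends in .docx and the leading bytes match the ZIP signature. */
    static boolean looksLikeDocx(String fileName, byte[] head) {
        boolean nameOk = fileName.toLowerCase(Locale.ROOT).endsWith(".docx");
        boolean magicOk = head.length >= ZIP_MAGIC.length
                && Arrays.equals(Arrays.copyOf(head, ZIP_MAGIC.length), ZIP_MAGIC);
        return nameOk && magicOk;
    }

    /** Reads the first four bytes of the file and applies the check above. */
    static boolean looksLikeDocx(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[ZIP_MAGIC.length];
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                read += n;
            }
            return looksLikeDocx(file.getFileName().toString(), Arrays.copyOf(head, read));
        }
    }
}
```

Calling DocxSniffer.looksLikeDocx(Paths.get(inputWordPath)) before opening the file lets a program report an unsupported input clearly instead of surfacing a parser exception.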

Add Dependencies

You need both poi and pdfbox in your pom.xml:

<dependencies>
    <!-- Apache POI for .docx file handling -->
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>5.2.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>5.2.5</version>
    </dependency>
    <!-- Apache PDFBox for PDF creation -->
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>3.0.2</version>
    </dependency>
</dependencies>

Java Code

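This example demonstrates a basic conversion. Important: it copies paragraph text only. Tables, images, headers, footers, and character formatting are dropped, text is neither wrapped nor paginated, and the built-in Helvetica font cannot encode characters outside WinAnsi (CJK text, for example).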

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import java.io.FileInputStream;
import java.io.IOException;
public class WordToPdfConverter {
    public static void main(String[] args) {
        // Input and output file paths
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";
        try (FileInputStream fis = new FileInputStream(inputWordPath);
             XWPFDocument document = new XWPFDocument(fis);
             PDDocument pdfDocument = new PDDocument()) {
            // Create a new page for the PDF
            PDPage page = new PDPage();
            pdfDocument.addPage(page);
            try (PDPageContentStream contentStream = new PDPageContentStream(pdfDocument, page)) {
                // Set font and font size (PDFBox 3.x replaced the PDType1Font.HELVETICA constant with this constructor)
                contentStream.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA), 12);
                contentStream.beginText();
                contentStream.newLineAtOffset(50, 750); // x, y coordinates
                // Iterate through paragraphs of the Word document
                for (XWPFParagraph paragraph : document.getParagraphs()) {
                    // showText rejects control characters such as tabs and line breaks, so replace them with spaces
                    String text = paragraph.getText().replaceAll("\\p{Cntrl}", " ");
                    contentStream.showText(text);
                    // Move to the next line
                    contentStream.newLineAtOffset(0, -15); // Negative value moves down
                }
                contentStream.endText();
            }
            // Save the PDF
            pdfDocument.save(outputPdfPath);
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (IOException e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
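The example above writes each paragraph as a single line on a single page, so long paragraphs run off the right edge and long documents run off the bottom. One way to handle this is to wrap and paginate the text before drawing it. The sketch below is a library-free illustration of that bookkeeping. The class TextLayout, its method names, the fixed character budget per line, and the bottom margin of 50 are assumptions made for illustration; a real implementation would measure string width with the font's metrics instead of counting characters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: wraps text by character count and groups lines into pages. */
final class TextLayout {

    /** Greedy word wrap: each line holds at most maxChars characters (longer words are split). */
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Split words that cannot fit on a line by themselves
            while (word.length() > maxChars) {
                if (line.length() > 0) {
                    lines.add(line.toString());
                    line.setLength(0);
                }
                lines.add(word.substring(0, maxChars));
                word = word.substring(maxChars);
            }
            if (word.isEmpty()) {
                continue; // only happens for empty input
            }
            if (line.length() == 0) {
                line.append(word);
            } else if (line.length() + 1 + word.length() <= maxChars) {
                line.append(' ').append(word);
            } else {
                lines.add(line.toString());
                line.setLength(0);
                line.append(word);
            }
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    /** Number of baselines that fit from startY down to bottomMargin with the given leading. */
    static int linesPerPage(float startY, float bottomMargin, float leading) {
        return (int) Math.floor((startY - bottomMargin) / leading) + 1;
    }

    /** Splits lines into consecutive pages of at most perPage lines. */
    static List<List<String>> paginate(List<String> lines, int perPage) {
        List<List<String>> pages = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += perPage) {
            pages.add(new ArrayList<>(lines.subList(i, Math.min(i + perPage, lines.size()))));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> lines = wrap("POI extracts the paragraph text and PDFBox draws it line by line.", 30);
        int perPage = linesPerPage(750f, 50f, 15f); // start y and leading taken from the example above
        List<List<String>> pages = paginate(lines, perPage);
        System.out.println(lines.size() + " lines on " + pages.size() + " page(s)");
    }
}
```

With this in place, the converter would create one PDPage and one content stream per page, start each at y = 750, and call showText once per wrapped line.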

Method 2: iText 7 (Powerful, but Check License)

iText is a powerful library for creating and manipulating PDFs. iText 7 has an add-on, pdfHTML (Maven artifact html2pdf), that converts HTML to PDF. iText cannot read .docx files itself, so the approach here is to convert the .docx to HTML first and then let iText render that HTML into a PDF.

How it works: Docx4j (or another library) converts the .docx to HTML. pdfHTML's HtmlConverter then renders the HTML to a PDF. (HTMLWorker belongs to the legacy iText 5 API and is not part of iText 7.)

Add Dependencies

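You'll need itext7-core and html2pdf for the PDF rendering, and docx4j for the .docx to HTML conversion. docx4j 11.x also expects a JAXB implementation on the classpath, for example its docx4j-JAXB-ReferenceImpl artifact.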

<dependencies>
    <!-- iText 7 Core -->
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itext7-core</artifactId>
        <version>7.2.5</version>
        <type>pom</type>
    </dependency>
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>html2pdf</artifactId>
        <version>4.0.5</version> <!-- pdfHTML 4.0.x releases pair with iText 7.2.x; keep the two in step -->
    </dependency>
    <!-- Docx4j for .docx to HTML conversion -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-core</artifactId>
        <version>11.4.4</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-export-fo</artifactId>
        <version>11.4.4</version>
    </dependency>
</dependencies>

Java Code

This example uses Docx4j to convert the Word document to an HTML string, which is then passed to iText.

import com.itextpdf.html2pdf.HtmlConverter;
import org.docx4j.Docx4J;
import org.docx4j.convert.out.HTMLSettings;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
public class ITextWordToPdfConverter {
    public static void main(String[] args) {
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";
        try {
            // 1. Load the Word document using Docx4j
            WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(inputWordPath));
            // 2. Convert the Word document to XHTML in memory
            //    (older docx4j releases name the setter setWmlPackage)
            HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
            htmlSettings.setOpcPackage(wordMLPackage);
            ByteArrayOutputStream htmlStream = new ByteArrayOutputStream();
            Docx4J.toHTML(htmlSettings, htmlStream, Docx4J.FLAG_EXPORT_PREFER_XSL);
            String html = htmlStream.toString(StandardCharsets.UTF_8.name());
            // 3. Convert the HTML string to PDF using iText's pdfHTML add-on
            try (OutputStream outputStream = Files.newOutputStream(Paths.get(outputPdfPath))) {
                HtmlConverter.convertToPdf(html, outputStream);
            }
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (Exception e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
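Because the PDF is rendered from HTML in this method, page size and margins come from CSS rather than from the Word document's section settings. One possible way to control them is to add a CSS @page rule to the intermediate HTML before it is handed to the converter. The sketch below is plain string handling with no library calls; the class PageCss and its method are hypothetical, and it assumes the HTML-to-PDF step honours @page rules and that the markup contains a closing head tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: adds a CSS @page rule to an HTML document before PDF rendering. */
final class PageCss {
    private static final Pattern HEAD_END = Pattern.compile("</head>", Pattern.CASE_INSENSITIVE);

    /** Inserts the rule just before the closing head tag; returns the input unchanged if there is none. */
    static String withPageRule(String html, String size, String margin) {
        String style = "<style>@page { size: " + size + "; margin: " + margin + "; }</style>";
        Matcher m = HEAD_END.matcher(html);
        if (!m.find()) {
            return html; // no head element to extend, so leave the markup alone
        }
        return html.substring(0, m.start()) + style + html.substring(m.start());
    }

    public static void main(String[] args) {
        String html = "<html><head><title>t</title></head><body><p>Hello</p></body></html>";
        System.out.println(withPageRule(html, "A4", "2cm"));
    }
}
```

In the converter above, this would be applied to the html string between steps 2 and 3.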

Method 3: Docx4j (Good for OOXML Workflows)

Docx4j is another excellent library specifically for working with OOXML formats (.docx, .xlsx, .pptx). It has built-in conversion capabilities to PDF via its docx4j-export-fo module, which uses Apache FOP.

How it works: Docx4j converts the .docx into an XSL-FO (Formatting Objects) document. Apache FOP then takes this XSL-FO and renders it into a PDF.

Add Dependencies

<dependencies>
    <!-- Docx4j Core -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-core</artifactId>
        <version>11.4.4</version>
    </dependency>
    <!-- Docx4j Export to FO (Formatting Objects) -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-export-fo</artifactId>
        <version>11.4.4</version>
    </dependency>
</dependencies>

Java Code

This is a very direct approach with Docx4j.

import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import java.io.File;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
public class Docx4jConverter {
    public static void main(String[] args) {
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";
        try {
            // 1. Load the Word document
            WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(inputWordPath));
            // 2. Convert the Word document to PDF via XSL-FO (Formatting Objects).
            //    Docx4J.toPDF writes to an OutputStream, not a File.
            try (OutputStream outputStream = Files.newOutputStream(Paths.get(outputPdfPath))) {
                Docx4J.toPDF(wordMLPackage, outputStream);
            }
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (Exception e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

Method 4: Aspose.Words (Easiest & Most Robust, Commercial)

Aspose.Words is a commercial library widely regarded as the most robust and feature-rich solution for Word processing and conversion. It has a free trial version that places a watermark on the output PDF, making it perfect for evaluation.

How it works: Aspose.Words has a highly optimized rendering engine that converts the Word document's internal structure directly to PDF with high fidelity, preserving almost all formatting, layouts, and even complex elements like tables, headers, footers, and images.

Add Dependencies

You need to download the JAR from the Aspose website and add it to your project, or use their Maven repository.

<repositories>
    <repository>
        <id>aspose-java-releases</id>
        <name>Aspose Java API Repository</name>
        <url>https://repository.aspose.com/repo/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words</artifactId>
        <version>23.8</version> <!-- Use the latest version -->
        <classifier>jdk17</classifier> <!-- Aspose publishes the Java artifact under this classifier -->
    </dependency>
</dependencies>

Java Code

The code is remarkably simple.

import com.aspose.words.Document;
import com.aspose.words.SaveFormat;
public class AsposeWordsConverter {
    public static void main(String[] args) {
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";
        try {
            // 1. Load the Word document
            Document doc = new Document(inputWordPath);
            // 2. Save the document as PDF
            doc.save(outputPdfPath, SaveFormat.PDF);
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (Exception e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

Which One Should You Choose?

For a simple, free, and open-source solution: Use Apache POI + PDFBox. It's great if you only need basic text conversion and want to stick with Apache-licensed libraries.
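For better formatting fidelity with open-source libraries: Use Docx4j on its own, or Docx4j with iText 7 if the AGPL terms suit your project. Both preserve formatting much better than the POI/PDFBox combination, although complex layouts can still differ from what Word produces.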
For the best possible results with minimal code: Use Aspose.Words. If your project can afford a commercial license, this is the top choice for reliability, performance, and feature support. Always test with the free trial first.
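Since the right choice often only becomes clear after testing real documents, it can help to keep the calling code independent of the backend. The sketch below shows one possible shape for that: a small functional interface that any of the four converters could sit behind, plus a batch loop over a folder. The names BatchConverter, WordToPdf, pdfPathFor, and convertAll are hypothetical, and the backend in main is a stand-in that only prints what it would convert.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class BatchConverter {

    /** Hypothetical abstraction over any of the conversion methods. */
    @FunctionalInterface
    interface WordToPdf {
        void convert(Path docx, Path pdf) throws Exception;
    }

    /** Maps e.g. in/report.docx to outDir/report.pdf. */
    static Path pdfPathFor(Path docx, Path outDir) {
        String name = docx.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return outDir.resolve(base + ".pdf");
    }

    /** Converts every .docx directly inside inDir and returns the inputs that failed. */
    static List<Path> convertAll(Path inDir, Path outDir, WordToPdf backend) throws IOException {
        List<Path> inputs;
        try (Stream<Path> files = Files.list(inDir)) {
            inputs = files
                    .filter(p -> p.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".docx"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        Files.createDirectories(outDir);
        List<Path> failed = new ArrayList<>();
        for (Path docx : inputs) {
            try {
                backend.convert(docx, pdfPathFor(docx, outDir));
            } catch (Exception e) {
                failed.add(docx); // keep going so one bad document does not stop the batch
            }
        }
        return failed;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in backend; replace the lambda body with one of the converters above
        WordToPdf printOnly = (docx, pdf) -> System.out.println(docx + " -> " + pdf);
        List<Path> failed = convertAll(Paths.get("."), Paths.get("pdf-out"), printOnly);
        System.out.println(failed.size() + " file(s) failed");
    }
}
```

Swapping libraries then means changing only the lambda passed to convertAll, which makes it practical to run the same sample documents through two backends and compare the results.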
