Of course! Converting a Word document (.docx) to a PDF in Java is a common requirement. There are several excellent libraries to achieve this, each with its own strengths.

Here’s a comprehensive guide covering the most popular and effective methods.
Summary of Options
| Library | License | Ease of Use | Performance | Key Feature |
|---|---|---|---|---|
| Apache POI | Apache 2.0 (Permissive) | Medium | Good | The standard for Office file manipulation in Java. |
| iText 7 | AGPL v3 or commercial | Medium | Excellent | Powerful and feature-rich, but the AGPL's copyleft terms can be restrictive for closed-source applications. |
| Docx4j | Apache 2.0 (Permissive) | Medium | Good | Built-in PDF export via XSL-FO; a good choice if you also work with other OOXML formats. |
| Aspose.Words | Commercial (Free trial) | Easiest | Excellent | The most robust and feature-rich, but requires a paid license. |
Method 1: Apache POI with pdfbox (Recommended Free & Open Source)
This is a popular combination. Apache POI is the de facto standard for reading and writing Office files in Java, but it has no PDF writer and no layout engine. Apache PDFBox can create PDFs, but it knows nothing about Word documents, so your own code has to bridge the two.
How it works: POI parses the .docx file and exposes its paragraphs and runs. Your code then draws that content onto PDF pages with PDFBox and is responsible for positioning, line wrapping, and page breaks.
Add Dependencies
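You need poi-ooxml (which handles the .docx format) and pdfbox in your pom.xml. poi-ooxml pulls in poi transitively, so listing poi explicitly is optional: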

<dependencies>
    <!-- Apache POI for .docx file handling -->
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>5.2.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>5.2.5</version>
    </dependency>
    <!-- Apache PDFBox for PDF creation -->
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>3.0.2</version>
    </dependency>
</dependencies>
Java Code
This example demonstrates a basic conversion. Important: it copies only the text of body paragraphs onto a single page. Tables, images, headers, footers, and styling are dropped, long lines are not wrapped, and anything that runs past the bottom of the page is cut off.
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

import java.io.FileInputStream;
import java.io.IOException;

public class WordToPdfConverter {
    public static void main(String[] args) {
        // Input and output file paths
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";

        try (FileInputStream fis = new FileInputStream(inputWordPath);
             XWPFDocument document = new XWPFDocument(fis);
             PDDocument pdfDocument = new PDDocument()) {

            // Create a new page for the PDF
            PDPage page = new PDPage();
            pdfDocument.addPage(page);

            try (PDPageContentStream contentStream = new PDPageContentStream(pdfDocument, page)) {
                // PDFBox 3.x removed the PDType1Font.HELVETICA constant;
                // build the font from Standard14Fonts instead.
                contentStream.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA), 12);
                contentStream.beginText();
                contentStream.newLineAtOffset(50, 750); // x, y of the first line

                // Iterate through the body paragraphs of the Word document
                for (XWPFParagraph paragraph : document.getParagraphs()) {
                    // showText rejects tabs and line breaks, so replace control characters.
                    // Characters outside the font's encoding still throw IllegalArgumentException.
                    String text = paragraph.getText().replaceAll("\\p{Cntrl}", " ");
                    contentStream.showText(text);
                    // Move to the next line (a negative y offset moves down)
                    contentStream.newLineAtOffset(0, -15);
                }
                contentStream.endText();
            }

            // Save the PDF
            pdfDocument.save(outputPdfPath);
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (IOException e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
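The listing above writes each paragraph as one unbroken line on a single page, so a real converter needs a layout step in between. The following is a minimal sketch of that step using only the JDK: greedy word wrapping plus a split into pages. The class `TextLayoutSketch` and its methods are hypothetical names, and the width function is a stand-in; with PDFBox you would likely measure text with `font.getStringWidth(text) / 1000f * fontSize`, which throws IOException and so needs wrapping in the lambda.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper: greedy word wrapping plus page splitting for plain text. */
final class TextLayoutSketch {

    /** Breaks a paragraph into lines no wider than maxWidth, as measured by widthOf. */
    static List<String> wrap(String paragraph, double maxWidth, ToDoubleFunction<String> widthOf) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : paragraph.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for an empty paragraph
            }
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (current.length() > 0 && widthOf.applyAsDouble(candidate) > maxWidth) {
                lines.add(current.toString());      // line is full, start a new one
                current = new StringBuilder(word);
            } else {
                current = new StringBuilder(candidate);
            }
        }
        lines.add(current.toString()); // an empty paragraph yields one blank line
        return lines;
    }

    /** Groups lines into pages holding at most linesPerPage lines each. */
    static List<List<String>> paginate(List<String> lines, int linesPerPage) {
        List<List<String>> pages = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += linesPerPage) {
            int end = Math.min(i + linesPerPage, lines.size());
            pages.add(new ArrayList<>(lines.subList(i, end)));
        }
        return pages;
    }

    public static void main(String[] args) {
        // Stand-in metric (assumption): every character is 6 points wide.
        ToDoubleFunction<String> fakeWidth = s -> s.length() * 6.0;
        List<String> lines = wrap("the quick brown fox jumps over the lazy dog", 60, fakeWidth);
        List<List<String>> pages = paginate(lines, 2);
        System.out.println(lines);
        System.out.println(pages.size() + " page(s)");
    }
}
```

In the converter, each inner list would get its own PDPage and PDPageContentStream, with one showText call per line. A single word wider than maxWidth is kept on its own line rather than split, which is a simplification.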
Method 2: iText 7 (Powerful, but Check License)
iText is a powerful library for creating and manipulating PDFs. iText 7 cannot read Word files itself, but its pdfHTML add-on (Maven artifact html2pdf) converts HTML to PDF. For Word, the usual approach is therefore to convert the .docx to HTML first and then have iText render that HTML as a PDF.
How it works: Docx4j (or another library) converts the .docx to HTML. pdfHTML's HtmlConverter then renders the HTML into a PDF. (HTMLWorker belongs to the legacy iText 5 API and is not part of iText 7.)
Add Dependencies
You'll need itext7-core, the html2pdf add-on, and Docx4j for the .docx to HTML conversion.

<dependencies>
    <!-- iText 7 Core -->
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itext7-core</artifactId>
        <version>7.2.5</version>
        <type>pom</type>
    </dependency>
    <!-- pdfHTML add-on; its release must match the iText core release (4.0.5 pairs with 7.2.5) -->
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>html2pdf</artifactId>
        <version>4.0.5</version>
    </dependency>
    <!-- Docx4j for .docx to HTML conversion -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-core</artifactId>
        <version>11.4.4</version>
    </dependency>
    <!-- Docx4j needs a JAXB implementation on the classpath; HTML export does not need docx4j-export-fo -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
        <version>11.4.4</version>
    </dependency>
</dependencies>
Java Code
This example uses Docx4j to export the Word document to HTML in memory, which is then passed to iText.
import com.itextpdf.html2pdf.HtmlConverter;
import org.docx4j.Docx4J;
import org.docx4j.convert.out.HTMLSettings;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

public class ITextWordToPdfConverter {
    public static void main(String[] args) {
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";

        try {
            // 1. Load the Word document using Docx4j
            WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(inputWordPath));

            // 2. Export the document to HTML, kept in memory.
            //    Images are not configured here; see HTMLSettings.setImageDirPath and
            //    setImageTargetUri if the document contains pictures.
            HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
            htmlSettings.setOpcPackage(wordMLPackage);
            ByteArrayOutputStream htmlBytes = new ByteArrayOutputStream();
            Docx4J.toHTML(htmlSettings, htmlBytes, Docx4J.FLAG_EXPORT_PREFER_XSL);

            // 3. Convert the HTML to PDF using iText's pdfHTML add-on
            try (OutputStream pdfOut = new FileOutputStream(outputPdfPath)) {
                HtmlConverter.convertToPdf(new ByteArrayInputStream(htmlBytes.toByteArray()), pdfOut);
            }
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (Exception e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
Method 3: Docx4j (Good for OOXML Workflows)
Docx4j is another excellent library specifically for working with OOXML formats (.docx, .xlsx, .pptx). It has built-in conversion capabilities to PDF via its docx4j-export-fo module, which uses Apache FOP.
How it works: Docx4j converts the .docx into an XSL-FO (Formatting Objects) document. Apache FOP then takes this XSL-FO and renders it into a PDF.
Add Dependencies
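<dependencies>
    <!-- Docx4j Core -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-core</artifactId>
        <version>11.4.4</version>
    </dependency>
    <!-- Docx4j needs a JAXB implementation on the classpath -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
        <version>11.4.4</version>
    </dependency>
    <!-- Docx4j Export to FO (Formatting Objects) -->
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-export-fo</artifactId>
        <version>11.4.4</version>
    </dependency>
</dependencies>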
Java Code
This is a very direct approach with Docx4j.
import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

public class Docx4jConverter {
    public static void main(String[] args) {
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";

        try {
            // 1. Load the Word document
            WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(inputWordPath));

            // 2. Convert the Word document to PDF.
            //    Docx4J.toPDF writes to an OutputStream; internally it generates
            //    XSL-FO (Formatting Objects) and hands it to Apache FOP.
            try (OutputStream os = new FileOutputStream(outputPdfPath)) {
                Docx4J.toPDF(wordMLPackage, os);
            }
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (Exception e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
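Any of the converters shown so far can be driven over a whole folder. The sketch below uses only the JDK and assumes a hypothetical `DocxToPdf` adapter interface that you would implement with whichever library call you chose (for example, a lambda that loads the file and calls Docx4J.toPDF). The class name `DocxBatchSketch` and the directory layout are illustrative, not part of any library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class DocxBatchSketch {

    /** Hypothetical adapter around a library-specific conversion call. */
    @FunctionalInterface
    interface DocxToPdf {
        void convert(Path docx, Path pdf) throws Exception;
    }

    /** Maps "report.docx" to "<outputDir>/report.pdf". */
    static Path outputPathFor(Path docx, Path outputDir) {
        String name = docx.getFileName().toString();
        String base = name.toLowerCase().endsWith(".docx")
                ? name.substring(0, name.length() - ".docx".length())
                : name;
        return outputDir.resolve(base + ".pdf");
    }

    /** Converts every .docx directly inside inputDir and returns the files that failed. */
    static List<Path> convertAll(Path inputDir, Path outputDir, DocxToPdf converter) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> failed = new ArrayList<>();
        try (Stream<Path> files = Files.list(inputDir)) {
            for (Path docx : (Iterable<Path>) files::iterator) {
                if (!docx.getFileName().toString().toLowerCase().endsWith(".docx")) {
                    continue; // skip non-Word files and subdirectories
                }
                try {
                    converter.convert(docx, outputPathFor(docx, outputDir));
                } catch (Exception e) {
                    failed.add(docx); // keep going so one bad document does not stop the batch
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary folder with a placeholder converter that only logs.
        Path in = Files.createTempDirectory("docx-in");
        Files.createFile(in.resolve("report.docx"));
        DocxToPdf logOnly = (docx, pdf) -> System.out.println(docx + " -> " + pdf);
        List<Path> failed = convertAll(in, in.resolve("pdf"), logOnly);
        System.out.println("Failed: " + failed);
    }
}
```

Collecting failures instead of aborting is a design choice that suits unattended batch jobs; for interactive use you may prefer to rethrow the first exception.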
Method 4: Aspose.Words (Easiest & Most Robust, Commercial)
Aspose.Words is a commercial library widely regarded as the most robust and feature-rich solution for Word processing and conversion. It has a free trial version that places a watermark on the output PDF, making it perfect for evaluation.
How it works: Aspose.Words has a highly optimized rendering engine that converts the Word document's internal structure directly to PDF with high fidelity, preserving almost all formatting, layouts, and even complex elements like tables, headers, footers, and images.
Add Dependencies
You need to download the JAR from the Aspose website and add it to your project, or use their Maven repository.
<repositories>
    <repository>
        <id>aspose-java-releases</id>
        <name>Aspose Java API Repository</name>
        <url>https://repository.aspose.com/repo/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words</artifactId>
        <version>23.8</version> <!-- Use the latest version -->
        <!-- Aspose publishes this artifact with a JDK classifier; check their docs for the current value -->
        <classifier>jdk17</classifier>
    </dependency>
</dependencies>
Java Code
The code is remarkably simple.
import com.aspose.words.Document;
import com.aspose.words.SaveFormat;

public class AsposeWordsConverter {
    public static void main(String[] args) {
        String inputWordPath = "path/to/your/document.docx";
        String outputPdfPath = "path/to/your/output.pdf";

        try {
            // 1. Load the Word document
            Document doc = new Document(inputWordPath);

            // 2. Save the document as PDF
            doc.save(outputPdfPath, SaveFormat.PDF);
            System.out.println("PDF created successfully at: " + outputPdfPath);
        } catch (Exception e) {
            System.err.println("Error during Word to PDF conversion: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
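Whichever method you use, a cheap smoke test after conversion is to confirm that the output really starts with the PDF header bytes `%PDF-` and is not, say, an empty file or a copy of the .docx (a ZIP archive, which starts with `PK`). This is a minimal JDK-only sketch; `PdfSanityCheck` is a hypothetical helper, and passing this check says nothing about how faithful the rendering is.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PdfSanityCheck {

    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes begin with the PDF header "%PDF-". */
    static boolean startsWithPdfHeader(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads only the first few bytes of the file and checks them for the PDF header. */
    static boolean looksLikePdf(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break; // file is shorter than the header
                }
                total += n;
            }
        }
        return total == head.length && startsWithPdfHeader(head);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java PdfSanityCheck <output.pdf>");
            return;
        }
        System.out.println(looksLikePdf(Paths.get(args[0])) ? "Looks like a PDF" : "Not a PDF");
    }
}
```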
Which One Should You Choose?
- For a simple, free, and open-source solution: Use Apache POI + PDFBox. It's great if you only need basic text conversion and want to stick with Apache-licensed libraries.
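- For better formatting fidelity while staying open source: Use Docx4j or iText 7 (with Docx4j). They preserve formatting much better than the POI/PDFBox combination, though complex layouts can still differ from what Word renders. Check that iText's AGPL terms fit your project.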
- For the best possible results with minimal code: Use Aspose.Words. If your project can afford a commercial license, this is the top choice for reliability, performance, and feature support. Always test with the free trial first.
