Cross-Platform Word to PDF Conversion

View and Convert Microsoft Word Documents Anywhere

We’re very pleased to announce the newest addition to PDFNet SDK: built-in Word conversion. Now you can go straight from .docx to .pdf, free from the shackles of Microsoft Word or any other third-party software. Conversions are accurate and fast, and they work on every platform supported by PDFNet SDK (and there are a lot of them; see the SDK download page for details).

It took a long time to get the text to flow around shapes correctly in the docx engine.

Dependency-free Word conversion enables two great use cases: you can perform reliable conversions in a server environment, or pair it with our PDF Viewer for seamless viewing of .docx files on Android, iOS, and Windows Phone/RT.

Easy to Use, but Still Flexible

We’ve tried to strike a balance between power and simplicity in the API. The best way to show this is with a quick example: this small Java snippet converts a Word document to a PDF.

// Start with a PDFDoc (the conversion destination)
PDFDoc pdfdoc = new PDFDoc();
// perform the conversion with no optional parameters
Convert.wordToPdf(pdfdoc, "input_file.docx", null);
// save the result
pdfdoc.save("output_file.pdf", SDFDoc.e_remove_unused, null);

That’s it, just 2.5 lines of code. Of course, maybe you would rather have more control over the conversion process. That’s possible too: the interface allows for cancellation, progress reporting, page-by-page conversion, and diagnostic messages (for example, information on font substitutions). Here is the same conversion, performed page-by-page and with progress reporting.

// get a DocumentConversion object, which encapsulates and controls
// the conversion process (pdfdoc is the destination PDFDoc from the
// first snippet; options holds the optional conversion parameters)
DocumentConversion conversion = Convert.wordToPdfConversion(
    pdfdoc, "input_file.docx", options);
// convert each page, one-by-one, with progress reporting
while (conversion.getConversionStatus() == DocumentConversion.e_incomplete)
{
    conversion.convertNextPage();
    System.out.println("Progress: " + (conversion.getProgress() * 100.0) + "%");
}
// save the result
pdfdoc.save("output_file.pdf", SDFDoc.e_linearized, null);
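Because the loop converts one page at a time, it is also a natural place to check for cancellation. The sketch below shows that control flow only. It runs without the SDK, so it uses a hypothetical stand-in interface (PagedConversion) and a fake implementation (FakeConversion) in place of the real DocumentConversion; those names, the run helper, and the shouldCancel callback are all our own assumptions for illustration, not part of the PDFNet API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

final class PageByPageSketch {

    /** Hypothetical stand-in for the SDK's DocumentConversion; NOT the real class. */
    interface PagedConversion {
        boolean isIncomplete();   // plays the role of getConversionStatus() == e_incomplete
        void convertNextPage();
        double getProgress();     // fraction between 0.0 and 1.0
    }

    /** Fake conversion over a fixed page count, so the sketch runs without the SDK. */
    static final class FakeConversion implements PagedConversion {
        private final int totalPages;
        private int converted = 0;

        FakeConversion(int totalPages) { this.totalPages = totalPages; }

        public boolean isIncomplete() { return converted < totalPages; }
        public void convertNextPage() { converted++; }
        public double getProgress() {
            return totalPages == 0 ? 1.0 : (double) converted / totalPages;
        }
        int convertedPages() { return converted; }
    }

    /**
     * Converts page by page, asking shouldCancel before each page.
     * Returns true if every page was converted, false if we stopped early.
     */
    static boolean run(PagedConversion conversion, BooleanSupplier shouldCancel,
                       List<String> log) {
        while (conversion.isIncomplete()) {
            if (shouldCancel.getAsBoolean()) {
                log.add("Cancelled at " + Math.round(conversion.getProgress() * 100) + "%");
                return false;
            }
            conversion.convertNextPage();
            log.add("Progress: " + Math.round(conversion.getProgress() * 100) + "%");
        }
        return true;
    }

    public static void main(String[] args) {
        FakeConversion conversion = new FakeConversion(4);
        List<String> log = new ArrayList<>();
        // Illustrative cancel rule: stop once half of the pages are done.
        boolean finished = run(conversion, () -> conversion.getProgress() >= 0.5, log);
        log.forEach(System.out::println);
        System.out.println("finished = " + finished);
    }
}
```

In a real application the shouldCancel callback would typically read a flag set by the UI or by a request timeout, and the loop body would call the SDK's own conversion object instead of the fake one.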

To see these snippets as part of a fully working application, take a look at the WordToPDFTest sample project in the PDFNet SDK trial package, available here.

No Fonts? No Problem

While fonts can be embedded within .docx documents, they usually aren’t. On a typical Windows system this isn’t a problem: the most common fonts in Word documents (Calibri, Times New Roman, Arial, Cambria, etc.) are installed by default, and they can be used while converting or viewing the document.

On other systems, such as Linux servers or Android phones, these fonts are only available in special circumstances, and without them you would normally be limited to two options: a) distributing the original fonts alongside your app, or b) settling for poor conversion results.

With the PDFNet SDK, this is no longer the case: we employ a number of strategies to ensure that conversion remains faithful to the original, with content in the right place and on the right page, even when supplied with no external fonts at all. For a more practical and in-depth look at font handling, see this knowledge base article.

Good, and Getting Better

We put a lot of work into our .docx converter. We’re very proud of this product, and we’re committed to improving it.

For the vast majority of documents created in a recent version of Word (Word 2010 or Word 2013, for example), the converter will yield excellent results, often indistinguishable from Word itself. Unfortunately, the .docx format is extensive and underspecified: the specification is more than 5000 pages, and is riddled with omissions and exceptions. There are bound to be features or behaviours that we have not quite nailed down. But that’s ok! We’re a small dev team and we move fast. If you’ve got a use case in mind, download our SDK here and try it out. If something isn’t working for you, then let us know, and chances are we’ll get it cleared up right away.

What About PowerPoint and Excel?

Work continues: PowerPoint to PDF and Excel to PDF are both on the way. We’ve accumulated a lot of great underlying technology while making the Word engine, and we plan to put that tech to use as we tackle the .pptx and .xlsx formats over the next year.

Give it a Try!

The built-in .docx conversion module is available as part of the PDFNet SDK for Windows, Linux, Mac, Android, iOS, and Windows Phone/RT. To obtain a free trial, visit our downloads page. Interfaces to the module are available in C#, Java, Objective-C, Visual Basic, Python, Ruby, PHP, C, and C++ (subject to platform availability: there is no PHP available on Android, for example).

The SDK download contains fully functional sample applications which demonstrate how to use the converter. These samples are available in C#, Java, Objective-C, C++, and Visual Basic. Look for the WordToPDFTest directory under Samples in the SDK package.

Want More Information?

If you have technical questions or would like information regarding licensing, please contact us. Your inquiry will be directed to a developer or our sales team, as appropriate.
