File Structure of PDF

For many of us, PDF has long been something of a mystery. Its advantages over other formats attract a lot of attention, so many people want to know what a PDF file is and what it consists of. This article walks through the structure of a PDF file to help you get to know the format better.


In general, a PDF file is composed of four main parts:
A header: The first line of a PDF file is the header. It specifies the version of the PDF specification the file adheres to and looks like this: '%PDF-1.2'. This file header should not be confused with the page headers that appear at the top of a document's pages, whose text and fonts an author can customize; those are part of the page content, not of the file structure.
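As a minimal sketch of how an application might read this line, the Python snippet below pulls the version number out of the first bytes of a file. The function name `read_pdf_version`, the sample bytes, and the 1024-byte search window are assumptions made for illustration, not part of any particular library.

```python
import re

# Matches a header comment such as b"%PDF-1.2" and captures the two version digits.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def read_pdf_version(data: bytes):
    """Return (major, minor) from the header line, or None if no header is found."""
    # Only the start of the file is inspected; the 1024-byte window is an
    # assumption of this sketch, chosen to tolerate a little leading junk.
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical sample: the header followed by one empty object.
sample = b"%PDF-1.2\n1 0 obj\n<< >>\nendobj\n"
print(read_pdf_version(sample))  # (1, 2)
```

Working on bytes rather than text is deliberate: most of a PDF file is binary data, so decoding the whole file as text would be unreliable.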


The body: The body of a PDF file consists of the objects that make up the contents of the document. These objects include image data, fonts, annotations, text streams and much more. The body can also hold objects that are not directly visible on a page, such as those behind interactive features and those that describe the logical structure of the document. Security features are stored here as well, so the content can be protected against unauthorized viewing, printing, editing or modifying. Numeric values in the body are of two kinds: integers and real numbers.
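To make the idea of body objects concrete, here is a rough Python sketch that lists the objects in a small hand-written sample. Each object is written as an object number, a generation number, the keyword `obj`, the content, and the keyword `endobj`. The regex scan and the name `list_objects` are assumptions of this sketch; a real reader locates objects through the cross-reference table, because binary stream data can contain bytes that look like these keywords.

```python
import re

# "N G obj ... endobj": object number, generation number, then the content.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def list_objects(data: bytes):
    """Return a list of (object_number, generation, content) tuples."""
    return [
        (int(m.group(1)), int(m.group(2)), m.group(3).strip())
        for m in OBJ_RE.finditer(data)
    ]


# Hypothetical sample body: two dictionaries, an integer and a real number.
sample = (
    b"%PDF-1.2\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
    b"3 0 obj\n42\nendobj\n"
    b"4 0 obj\n3.14\nendobj\n"
)

for number, generation, content in list_objects(sample):
    print(number, generation, content)
```

Objects 3 and 4 in the sample show the two numeric kinds, an integer and a real number.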

A cross-reference table: The cross-reference table lists the objects in the body and records the byte offset at which each one starts in the file. It works as an index for the reading application rather than as a set of links for the user: with it, an application can jump straight to any object, and therefore to any page, without reading the whole file. When a PDF file is updated incrementally, the changes are appended to the end of the file together with a new cross-reference section, so the updates can be traced through the cross-reference data.
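The sketch below shows, under simplifying assumptions, how the classic cross-reference table can be read in Python. It handles only the plain-text `xref` form (not the compressed cross-reference streams used by newer files), and the names `parse_xref_section` and `build_sample` are hypothetical helpers written for this example. The sample file is assembled in code so that the recorded offsets are guaranteed to match.

```python
import re

# One classic entry: 10-digit byte offset, 5-digit generation, "n" (in use) or "f" (free).
ENTRY_RE = re.compile(rb"(\d{10}) (\d{5}) ([nf])")


def parse_xref_section(data: bytes, xref_offset: int) -> dict:
    """Return {object_number: (byte_offset, generation, in_use)} for one xref section."""
    lines = data[xref_offset:].splitlines()
    if not lines or lines[0].strip() != b"xref":
        raise ValueError("no 'xref' keyword at the given offset")
    table = {}
    i = 1
    while i < len(lines):
        header = lines[i].split()
        # A subsection header is two integers: first object number and entry count.
        if len(header) != 2 or not all(part.isdigit() for part in header):
            break  # reached the "trailer" keyword or something unexpected
        first, count = int(header[0]), int(header[1])
        entries = lines[i + 1 : i + 1 + count]
        if len(entries) != count:
            raise ValueError("xref subsection is truncated")
        for k, line in enumerate(entries):
            m = ENTRY_RE.match(line)
            if m is None:
                raise ValueError("malformed xref entry")
            table[first + k] = (int(m.group(1)), int(m.group(2)), m.group(3) == b"n")
        i += 1 + count
    return table


def build_sample():
    """Assemble a tiny illustrative file and return (data, xref_offset)."""
    data = b"%PDF-1.2\n"
    offsets = []
    for num, body in enumerate([b"<< /Type /Catalog >>", b"42"], start=1):
        offsets.append(len(data))  # byte offset where this object starts
        data += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    xref_offset = len(data)
    data += b"xref\n0 %d\n0000000000 65535 f \n" % (len(offsets) + 1)
    for off in offsets:
        data += b"%010d 00000 n \n" % off
    data += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(offsets) + 1)
    data += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return data, xref_offset


sample, xref_offset = build_sample()
table = parse_xref_section(sample, xref_offset)
for number, (offset, generation, in_use) in sorted(table.items()):
    if in_use:
        # The recorded offset should land exactly on the "N G obj" line.
        print(number, offset, sample[offset:offset + 7])
```

Entry 0 is the conventional free entry with generation 65535; the entries that follow point at the start of objects 1 and 2.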


A trailer: The trailer tells applications or RIPs where to find the cross-reference table, and the file always ends with '%%EOF'. If this line is missing, the PDF file is incomplete and probably cannot be processed by a RIP or application. This is not the case with PostScript files: if the last few lines of a PostScript file are missing, you can often still print most of the pages. With a PDF file, you can lose everything, because an application starts reading at the end of the file to locate the cross-reference table.
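The role of the end of the file can be sketched in a few lines of Python: look near the end for the `startxref` keyword, read the byte offset that follows it, and require the '%%EOF' marker after that. The function name `find_xref_offset`, the 1024-byte tail window, and the sample bytes are assumptions made for this illustration.

```python
import re

# "startxref", the byte offset of the cross-reference table, then the %%EOF marker.
TAIL_RE = re.compile(rb"startxref\s+(\d+)\s*%%EOF")


def find_xref_offset(data: bytes, tail_size: int = 1024):
    """Return the offset given after 'startxref', or None if the file end is missing."""
    matches = TAIL_RE.findall(data[-tail_size:])
    # After incremental updates there can be several endings; the last one wins.
    return int(matches[-1]) if matches else None


# Hypothetical sample: header, one object, xref table, trailer dictionary, file end.
body = b"%PDF-1.2\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
complete = (
    body
    + b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
    + b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
    + b"startxref\n%d\n%%%%EOF\n" % len(body)
)
# Simulate a file whose last line was cut off.
truncated = complete[: -len(b"%%EOF\n")]

print(find_xref_offset(complete))   # offset of the "xref" keyword
print(find_xref_offset(truncated))  # None: no usable file end
```

For the truncated sample the function returns None, which mirrors the point above: without the final lines, a reader has no reliable way to find the cross-reference table.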

Figure: PDF Structure


A PDF file is not difficult to understand once we are familiar with its structure. Of course, knowing the structure alone is not enough: to make better use of PDF, we have to learn more.