PDF glossary: terms and formats

What every PDF term and format actually means, in plain language. The jargon you run into, explained.

Formats

PDF

PDF (Portable Document Format) is a file format created by Adobe in 1993 and standardized as ISO 32000. Its whole point is that a document looks exactly the same everywhere: same fonts, same layout, same page breaks, whether you open it on a phone, print it, or email it across the world. That reliability is why contracts, invoices, manuals, and forms still travel as PDF.

PDF/A

PDF/A is a version of PDF built for long-term archiving, defined by the ISO 19005 standard. The idea is simple: a document you save today should open and look identical decades from now, even when the original software is long gone. Archives, courts, and government bodies often require it for exactly that reason.

PDF/UA

PDF/UA (Universal Accessibility), standardized as ISO 14289, is the version of PDF designed to work for everyone, including people who rely on screen readers, braille displays, or keyboard navigation. The standard spells out what a truly accessible PDF needs.

PDF/X

PDF/X is the version of PDF made for professional printing and graphic arts, defined under ISO 15930. When a file goes to a commercial press, small surprises like a missing font or the wrong color space turn into expensive reprints. PDF/X exists to remove those surprises before the job runs.

Concepts

OCR

OCR (Optical Character Recognition) turns a picture of text into real, selectable text. A scanned page or a photo of a document is, to a computer, just a grid of pixels: you cannot search it, copy from it, or have a screen reader read it. OCR analyzes those pixels, recognizes the letters and words, and produces an actual text layer.

AcroForm

AcroForm is the native, built-in form system in PDF. When a PDF has fillable fields (text boxes, checkboxes, radio buttons, dropdowns) that you can fill in without any special software, those are almost always AcroForm fields. It is the original PDF form technology and the one supported pretty much everywhere.

XFA

XFA (XML Forms Architecture) is an alternative form technology Adobe layered on top of PDF, where the form is described in XML rather than as native PDF objects. It was built for complex, dynamic forms, the kind that grow extra rows, recalculate totals, or change layout based on what you enter.
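To make the AcroForm and XFA distinction concrete, here is a rough sketch of how a script might guess which form technology a file uses. It relies on the fact that a PDF with forms carries an /AcroForm entry, and that XFA data is referenced from it through an /XFA key. The function name `form_technology` is made up for this example, and the byte search is only a heuristic: real files often store these keys inside compressed object streams, where a plain search will miss them.

```python
def form_technology(pdf_bytes: bytes) -> str:
    """Crude guess at the form system in a PDF, based on raw key names.

    Heuristic only: keys hidden in compressed object streams are not seen.
    """
    if b"/AcroForm" not in pdf_bytes:
        return "none"
    # XFA forms are attached to the AcroForm dictionary via an /XFA entry.
    return "xfa" if b"/XFA" in pdf_bytes else "acroform"


# Hand-written fragments standing in for real files (illustrative only).
plain = b"%PDF-1.7\n1 0 obj << /Type /Catalog >> endobj"
acro = b"%PDF-1.7\n1 0 obj << /AcroForm << /Fields [] >> >> endobj"
xfa = b"%PDF-1.7\n1 0 obj << /AcroForm << /XFA 5 0 R >> >> endobj"

print(form_technology(plain), form_technology(acro), form_technology(xfa))
```

A dependable check would parse the document catalog with a PDF library rather than search raw bytes.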

Metadata

Metadata is the information about a PDF that is not part of the visible page: title, author, subject, keywords, the software that created it, and the dates it was made or last changed. It travels inside the file alongside the content, even though you never see it on the page.
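The dates in PDF metadata are stored as text in a compact form such as D:20240131120000Z, under keys like /CreationDate and /ModDate. As a minimal sketch, assuming you already have that string in hand, the following turns it into a Python datetime. The helper name `parse_pdf_date` is invented for this example, and the pattern covers only the common shapes of the date string.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS plus an optional offset (Z, +HH'mm' or -HH'mm').
# Everything after the year is optional.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a PDF date string into a datetime (hypothetical helper)."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    (year, month, day, hour, minute, second,
     zulu, sign, off_h, off_m) = match.groups()

    tz = None  # no offset given: leave the datetime naive
    if zulu:
        tz = timezone.utc
    elif sign:
        delta = timedelta(hours=int(off_h), minutes=int(off_m or 0))
        tz = timezone(delta if sign == "+" else -delta)

    # Missing month/day default to 1, missing time parts default to 0.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


print(parse_pdf_date("D:20240131120000Z").isoformat())
```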

Compression

Compression makes a PDF smaller without changing what it looks like, or at least without changing it noticeably. PDFs balloon in size mainly because of images: a report full of high-resolution scans or photos can be tens of megabytes, too big to email and slow to load.
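The "without changing what it looks like" half of that sentence is lossless compression. PDF's standard lossless filter (FlateDecode) is based on the same deflate algorithm as Python's zlib module, so a small sketch can show the idea. The page content below is made up for illustration, and the lossy side (downsampling and recompressing images) is not shown.

```python
import zlib

# A made-up, highly repetitive page content stream (illustrative only).
content = b"BT /F1 12 Tf 72 720 Td (Invoice line) Tj ET\n" * 200

compressed = zlib.compress(content, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)   # lossless: we get every byte back

ratio = len(compressed) / len(content)
print(f"{len(content)} bytes -> {len(compressed)} bytes ({ratio:.1%})")
print("identical after round trip:", restored == content)
```

Repetitive data like this shrinks dramatically. Photos do not, which is why image-heavy PDFs usually need lossy techniques to get meaningfully smaller.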

Embedded fonts

Embedded fonts are typefaces packaged inside the PDF itself, rather than relied upon from whatever computer happens to open the file. This is the mechanism behind one of PDF's core promises: the document looks the same everywhere.

Text layer

The text layer is the part of a PDF that holds real, selectable characters, as opposed to a picture of text. When you can click and drag to highlight words, search for a phrase, or copy a sentence out of a PDF, you are interacting with its text layer.
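Under the hood, real text appears in a page's content stream as strings handed to text-showing operators such as Tj, while a scanned page contains only an instruction to paint an image. The sketch below, with the invented helper `shown_strings`, pulls those strings out of a tiny uncompressed stream. It is deliberately simplified: real streams are usually compressed, often use other text operators, and rely on font encodings that this ignores.

```python
import re

# A literal string in parentheses followed by the Tj (show text) operator.
# Escaped characters such as \( and \) are allowed inside the string.
SHOW_TEXT = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def shown_strings(content_stream: bytes) -> list:
    """Return the raw strings passed to Tj in an uncompressed content stream."""
    return [m.group(1).decode("latin-1") for m in SHOW_TEXT.finditer(content_stream)]


# Hand-written example streams (not taken from a real file).
text_page = b"BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET"
scanned_page = b"q 612 0 0 792 0 0 cm /Im0 Do Q"

print(shown_strings(text_page))     # strings found: the page has a text layer
print(shown_strings(scanned_page))  # nothing found: the page is only an image
```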

Watermark

A watermark is text or an image laid over the pages of a PDF, usually faint and repeated, to mark the document's status or ownership. Think of stamps like DRAFT, CONFIDENTIAL, a company logo, or a copyright notice sitting behind or above the content.

Linearization (Fast Web View)

Linearization, also called Fast Web View, is a way of reorganizing a PDF so it can start displaying before the whole file has downloaded. A normal PDF often needs to load completely before the first page appears; a linearized one is structured so page one shows up almost immediately.
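A linearized file announces itself with a small dictionary containing a /Linearized key, which the PDF specification places within the first 1024 bytes of the file. That makes a quick check possible, sketched here with the invented helper `looks_linearized`. Treat it as a hint rather than proof: a file that was edited after linearization can keep the marker while no longer being laid out for fast display.

```python
def looks_linearized(pdf_bytes: bytes) -> bool:
    """Heuristic: is there a /Linearized marker in the first 1024 bytes?"""
    head = pdf_bytes[:1024]
    return head.startswith(b"%PDF-") and b"/Linearized" in head


# Hand-written file openings used purely for illustration.
linearized = b"%PDF-1.7\n1 0 obj\n<< /Linearized 1 /L 51234 /N 3 >>\nendobj\n"
ordinary = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(looks_linearized(linearized), looks_linearized(ordinary))
```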

Security

AES encryption

AES (Advanced Encryption Standard) is the encryption that protects a password-secured PDF in current software. When you lock a PDF, the document's content is scrambled with AES, using a 128-bit key or, in newer files, a 256-bit key, so the data is unreadable without the password. Older PDFs may use the weaker RC4 cipher instead. AES is the same proven, government-grade cipher used to protect everything from banking to messaging.

Electronic signature

An electronic signature is any way of signing a document electronically to show intent to agree, from typing your name to drawing a signature with your finger. Under the EU's eIDAS regulation, these come in tiers: a Simple Electronic Signature (SES), an Advanced Electronic Signature (AdES), and a Qualified Electronic Signature (QES), with QES carrying the strongest legal weight.

Digital signature

A digital signature is the cryptographic technology that proves a PDF is genuine and has not been altered. Using a certificate and a private key, it binds the signer's identity to the exact bytes of the document and seals them. If even a single character changes afterward, the signature breaks and the viewer flags it.
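The "single character" behavior comes from hashing: the signature covers a fingerprint (digest) of the document's bytes, and any change produces a completely different fingerprint. The sketch below shows only that integrity half, using SHA-256 from Python's standard library. It does not sign anything: a real PDF signature also encrypts the digest with the signer's private key and attaches the certificate. The document bytes and the helper `is_unchanged` are invented for the example.

```python
import hashlib

# Stand-in document bytes (illustrative only, not a valid PDF).
original = b"%PDF-1.7 ... Pay the supplier 1,000 EUR ..."
tampered = original.replace(b"1,000", b"9,000")  # one character differs

# The fingerprint recorded at signing time.
digest_at_signing = hashlib.sha256(original).hexdigest()


def is_unchanged(document: bytes, signed_digest: str) -> bool:
    """Recompute the digest and compare it with the one recorded at signing."""
    return hashlib.sha256(document).hexdigest() == signed_digest


print(is_unchanged(original, digest_at_signing))  # the untouched document
print(is_unchanged(tampered, digest_at_signing))  # the altered document
```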

Images

Vector graphic

A vector graphic describes an image as shapes, lines, curves, and fills defined by math, rather than as a grid of pixels. Because it is math, a vector can scale to any size and stay perfectly sharp: a logo drawn as vectors looks crisp on a business card and on a billboard.

Raster image

A raster image is made of pixels, a fixed grid of tiny colored dots. Photographs, scans, and screenshots are all raster: zoom in far enough and you see the individual squares. This is the opposite approach to vector graphics, which are defined by math and scale infinitely.
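Why zooming reveals squares can be shown in a few lines. A raster image has no information beyond its grid, so the simplest way to enlarge it is to repeat each pixel, which turns every dot into a visible block. This is a toy sketch of nearest-neighbor scaling on a grid of numbers. The function name `upscale_nearest` is invented here, and real viewers usually apply smoother interpolation.

```python
def upscale_nearest(pixels, factor):
    """Enlarge a grid of pixel values by repeating each pixel `factor` times."""
    enlarged = []
    for row in pixels:
        # Repeat every pixel horizontally...
        wide_row = [value for value in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


# A 2x2 checkerboard: 0 = black, 1 = white.
tiny = [[0, 1],
        [1, 0]]

for row in upscale_nearest(tiny, 2):
    print(row)
```

Each original pixel becomes a 2x2 block of identical values: the image is bigger, but no sharper.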

JPG / JPEG

JPG (also written JPEG, for Joint Photographic Experts Group) is the most common format for photographs and one of the most common image types embedded in PDFs. It uses lossy compression, throwing away detail the eye is least likely to notice, to make photo files dramatically smaller.
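The "throwing away detail" step can be illustrated with quantization: rounding values to a coarser scale so that nearby values collapse into one. Real JPEG applies this to frequency coefficients of 8x8 pixel blocks, not to raw brightness values as below, so read this as a sketch of the principle under that simplification. The function name `quantize` and the sample values are made up.

```python
def quantize(samples, step):
    """Round each sample to the nearest multiple of `step` (lossy)."""
    return [round(sample / step) * step for sample in samples]


# Four brightness values with small differences between neighbors.
samples = [12, 13, 200, 203]
coarse = quantize(samples, 10)

print(coarse)
print("distinct values before:", len(set(samples)), "after:", len(set(coarse)))
```

Fewer distinct values compress far better, but the original 12 and 13 can no longer be told apart. That lost difference is what "lossy" means.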

PNG

PNG (Portable Network Graphics) is a lossless image format, meaning it compresses without throwing any detail away. What goes in comes back out pixel-for-pixel, which makes it ideal for screenshots, logos, icons, charts, and anything with sharp edges or flat areas of color.

WebP

WebP is a modern image format from Google that aims to replace both JPG and PNG by doing what each does well. It supports lossy compression for photos and lossless compression with transparency for graphics, and at comparable quality it usually produces noticeably smaller files than the older formats.

TIFF

TIFF (Tagged Image File Format) is a high-quality raster format long favored for scanning, archiving, and professional imaging. It can store images losslessly at full fidelity and supports features that matter in document workflows, like multiple pages in a single file and color profiles for accurate print.

SVG

SVG (Scalable Vector Graphics) is an open, web-native vector format that describes images as shapes and paths in XML text. Like any vector graphic, it scales to any size without losing sharpness, so an SVG logo or icon stays crisp at every resolution, from a tiny favicon to a full-page header.
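Because an SVG is just XML text, it can be written by hand and read with any XML parser. The short sketch below builds a one-circle image as a string and reads it back with Python's standard library. The drawing itself is an invented example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# The whole image is a few lines of text: a teal circle on a 100x100 canvas.
svg_text = (
    f'<svg xmlns="{SVG_NS}" viewBox="0 0 100 100">'
    '<circle cx="50" cy="50" r="40" fill="teal"/>'
    "</svg>"
)

root = ET.fromstring(svg_text)
circle = root.find(f"{{{SVG_NS}}}circle")

# The shape is stored as numbers, not pixels, so it renders sharply at any size.
print(circle.get("cx"), circle.get("cy"), circle.get("r"), circle.get("fill"))
```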

DPI / PPI

DPI (dots per inch), sometimes called PPI for pixels per inch, measures how densely an image's detail is packed: how many dots fit into each inch when it is printed or scanned. It is the number that decides whether an image looks sharp or soft at a given size.
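The arithmetic behind DPI is simple multiplication: pixels = inches x DPI. A small sketch, with the invented helpers `pixels_needed` and `print_size_inches`, works in both directions:

```python
def pixels_needed(width_in, height_in, dpi):
    """Pixel dimensions required to cover a physical size at a given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def print_size_inches(width_px, height_px, dpi):
    """Physical size, in inches, that an image covers at a given DPI."""
    return width_px / dpi, height_px / dpi


# A US Letter page (8.5 x 11 inches) scanned at 300 DPI.
print(pixels_needed(8.5, 11, 300))          # (2550, 3300)

# The same 3000 x 2000 pixel photo printed at two different densities.
print(print_size_inches(3000, 2000, 300))
print(print_size_inches(3000, 2000, 150))
```

Halving the DPI doubles the printed size of the same image, which is why a photo that looks crisp small can look soft when enlarged.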