Regardless of your political party, you probably recognize the image to the left of this text, particularly in reference to the claims of image manipulation and tampering that erupted across the blogosphere immediately following its release. President Obama posted the PDF of his long-form birth certificate online in April 2011, and it was quickly dissected by experts and amateurs alike. (You can view the Administration’s comments pertaining to the release of President Obama’s birth certificates, and download a copy of the long-form birth certificate here.)
Although the White House stands by the validity of the document, the questions in the blogosphere don’t appear to be dissipating. Experts in digital image editing and document forensics have criticized the PDF and issued public reports such as this one, by McGraw-Hill Technical Editor Mara Zebest. Each report and interview is replete with terms such as ‘pixelation’, ‘anti-aliasing’ and ‘flattening’, but since most of us aren’t forensic document experts or Adobe gurus, much of the commentary is less than illuminating. It is also worth noting that, according to the metadata in the document itself, the birth certificate PDF was created on a computer running the Mac OS X operating system, using Quartz PDFContext rather than any Adobe product.
Decoded Science does not take a position on the legitimacy of any of these claims, but we will endeavor to dissect the terminology used in the commentary in order to clarify the situation as much as possible.
Terms Used in the Birth Certificate Debate
OCR: OCR is an acronym that stands for Optical Character Recognition. OCR software translates scanned images of text into words and characters. In its simplest form, it works by comparing each scanned character to a set of reference characters in a database and choosing the closest match. Once the characters have been translated into information that the computer understands, they can be copied and edited as text in other files. For example, you can scan a page of text from a book, run it through OCR software, and then edit the text from that page in a word processing program.
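To make the idea of comparing a scanned character against a database more concrete, here is a minimal sketch in Python. The tiny 3x3 bitmaps and the names `TEMPLATES`, `count_matching_pixels` and `recognize` are invented for illustration, and real OCR software is far more sophisticated than this. It only shows the basic principle of picking the closest match.

```python
# Hypothetical 3x3 bitmaps standing in for a real character database.
# "1" is an ink pixel, "0" is blank paper.
TEMPLATES = {
    "I": ("010",
          "010",
          "010"),
    "L": ("100",
          "100",
          "111"),
    "T": ("111",
          "010",
          "010"),
}

def count_matching_pixels(glyph, template):
    """Count the positions where the two bitmaps have the same pixel value."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )

def recognize(glyph):
    """Return the template character that agrees with the glyph on the most pixels."""
    return max(TEMPLATES, key=lambda ch: count_matching_pixels(glyph, TEMPLATES[ch]))

# A scanned "T" with one extra ink pixel, as a smudge might produce.
scanned = ("111",
           "010",
           "011")
print(recognize(scanned))  # prints "T"
```

Even with one pixel wrong, the scanned shape still agrees with the "T" template on more pixels than with any other template, so "T" is chosen.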
Anti-aliasing/aliasing: As you can see in the image to the left, an image that is reproduced at low resolution without any smoothing looks choppy and has jagged, stair-stepped edges. This image is aliased.
Anti-aliased images, however, such as the image on the right, have smoother edges and look more even, if a little blurry. Although these letters look virtually identical when reproduced in the small sizes used in normal text, the differences are noticeable when the text is magnified.
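The difference between the two can be sketched in a few lines of Python. The example below uses supersampling, which is one common way to anti-alias and not necessarily the method used by any particular scanner or program. The slanted edge in `inside_shape` and the function name `render` are assumptions made up for this illustration.

```python
def inside_shape(x, y):
    """Hypothetical shape: everything below a slanted edge counts as ink."""
    return y > 0.5 * x + 1.0

def render(width, height, samples_per_axis):
    """Render the shape; each pixel's value is the fraction of its samples that hit ink."""
    image = []
    for row in range(height):
        line = []
        for col in range(width):
            hits = 0
            for sy in range(samples_per_axis):
                for sx in range(samples_per_axis):
                    # Sample points are spread evenly inside the pixel.
                    x = col + (sx + 0.5) / samples_per_axis
                    y = row + (sy + 0.5) / samples_per_axis
                    hits += inside_shape(x, y)
            line.append(hits / samples_per_axis ** 2)
        image.append(line)
    return image

# One sample per pixel: every pixel is fully blank (0.0) or fully ink (1.0).
aliased = render(6, 4, samples_per_axis=1)
# Sixteen samples per pixel: pixels along the edge get in-between shades.
anti_aliased = render(6, 4, samples_per_axis=4)
```

In the aliased version every pixel is either fully on or fully off, which produces the stair-step look. In the anti-aliased version, pixels that the edge passes through take intermediate gray values, which is what makes the edge look smoother and slightly blurry.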
Pixelation: Pixelation occurs when an image is magnified to the point where the individual pixels are visible. At that degree of magnification (‘zoom’) it is possible to compare characters on a pixel-by-pixel basis to find similarities and differences in size, shape and color.
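As a rough illustration, the Python sketch below enlarges a tiny grayscale image by repeating each pixel, then compares two images pixel by pixel. The helper names `magnify` and `differing_pixels` and the sample pixel values are hypothetical, chosen only for this example.

```python
def magnify(image, factor):
    """Enlarge by repeating each pixel `factor` times in both directions."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]

def differing_pixels(a, b):
    """List the (row, column) positions where two same-sized images differ."""
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(a, b))
        for c, (pa, pb) in enumerate(zip(row_a, row_b))
        if pa != pb
    ]

# Two tiny grayscale "letters" (0 is black, 255 is white).
letter_one = [[0, 255, 0],
              [0, 255, 0]]
letter_two = [[0, 255, 0],
              [0, 200, 0]]

zoomed = magnify(letter_one, 3)
print(len(zoomed), len(zoomed[0]))               # prints "6 9"
print(differing_pixels(letter_one, letter_two))  # prints "[(1, 1)]"
```

Magnifying this way adds no new detail. It simply makes each original pixel large enough to see, which is why a zoomed image looks blocky.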
Noise: In a digital image, ‘noise’ is the random variation in brightness or color that shows up as spots scattered across the image. It can be caused by a number of factors, and can even be added deliberately. The effect is roughly the same as the graininess of film in an analog photograph.
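Deliberately added noise can be pictured with a short Python sketch. The function `add_noise` and its parameters are hypothetical, and the uniform random offsets used here are a simplification; noise from a real scanner or camera follows more complicated patterns.

```python
import random

def add_noise(image, amount, seed=0):
    """Return a copy with a random offset in [-amount, amount] added to each pixel.

    Values are clamped to the usual 0-255 grayscale range.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [
        [max(0, min(255, pixel + rng.randint(-amount, amount))) for pixel in row]
        for row in image
    ]

# A perfectly even patch of mid-gray, 8 pixels wide and 4 pixels tall.
flat_gray = [[128] * 8 for _ in range(4)]
noisy = add_noise(flat_gray, amount=20)
```

The original patch contains a single shade of gray, while the noisy copy contains a scatter of slightly lighter and darker pixels.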
Chromatic Aberration: Also known as color fringing, chromatic aberration is a change in color that is visible around the edges of contrasting portions of an image. The fringe of color can appear in scanned images as well as photographs, but does not appear in text that is created in a word processor. As you can see in the image to the right, green and red shading is visible around the edges of this scanned text, but the fringe is no longer visible when the image is reduced in size.
In this example, you can also see noise – the random splotches of color scattered around the letters, and pixelation – each spot is a single pixel.
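One simplified way to picture color fringing is to imagine that the red part of the image lands one pixel away from the green and blue parts. The Python sketch below does exactly that to a single row of pixels. This is an illustrative assumption, not a model of any real lens or scanner, and the names `shift_red_channel` and `fringe_positions` are made up for the example.

```python
def shift_red_channel(row, offset):
    """Return a copy of a row of (r, g, b) pixels with the red channel moved `offset` pixels right."""
    width = len(row)
    shifted = []
    for x, (_, g, b) in enumerate(row):
        source = min(max(x - offset, 0), width - 1)  # clamp at the image border
        shifted.append((row[source][0], g, b))
    return shifted

def fringe_positions(row):
    """Positions whose channels disagree, i.e. pixels that are no longer neutral."""
    return [x for x, (r, g, b) in enumerate(row) if not r == g == b]

WHITE, BLACK = (255, 255, 255), (0, 0, 0)

# A black stroke on white paper, seen as one row of pixels.
clean_row = [WHITE, WHITE, BLACK, BLACK, BLACK, WHITE, WHITE]
scanned_row = shift_red_channel(clean_row, 1)

print(fringe_positions(clean_row))    # prints "[]"
print(fringe_positions(scanned_row))  # prints "[2, 5]"
```

The colored pixels appear only at the two edges of the stroke, where dark meets light. The middle of the stroke and the plain background stay neutral.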
Curvature: Curvature refers to the rate of change in the direction of an edge or line. For example, a straight line scanned from a flat page will not show a curve. If you scan an image from the page of a book, however, there will likely be curvature, because the page bends toward the spine. The degree of curvature will be affected by the tightness of the book’s binding, and by how firmly the scanner lid is pressed down on the book.
Layers: According to Adobe, layers “are like stacked, transparent sheets of glass on which you can create images. You can see through the transparent areas of a layer to the layers below. You can work on each layer independently.” Basically, when an image is separated into layers, you can edit portions of the image in one layer without affecting the other layers. This is how artists manipulate images to, for example, place one person’s head on another person’s body without changing the background of the image.
Flattening: Flattening is a term used when the various layers of an image or document are merged into a single layer.
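Layers and flattening can be sketched together in Python. In this deliberately simple model, each layer is a grid in which a pixel is either transparent or holds a value, and flattening keeps the topmost non-transparent pixel at each position. The names `TRANSPARENT` and `flatten` and the "paper" and "ink" values are assumptions for illustration; real image editors also handle partial transparency and blending.

```python
TRANSPARENT = None

def flatten(layers):
    """Merge layers (listed bottom to top) into one image; the topmost opaque pixel wins."""
    height, width = len(layers[0]), len(layers[0][0])
    result = [[TRANSPARENT] * width for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not TRANSPARENT:
                    result[y][x] = layer[y][x]
    return result

background = [["paper", "paper", "paper"],
              ["paper", "paper", "paper"]]
text_layer = [[TRANSPARENT, "ink", TRANSPARENT],
              [TRANSPARENT, TRANSPARENT, TRANSPARENT]]

# Editing one layer leaves the other layer untouched.
text_layer[1][2] = "ink"

flattened = flatten([background, text_layer])
print(flattened)  # prints "[['paper', 'ink', 'paper'], ['paper', 'paper', 'ink']]"
```

Once the layers are flattened, the result is a single grid of pixels, and there is no longer any record of which pixel came from which layer.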
Analyzing PDF documents
Analyzing PDF documents is a detailed and highly complex process, but the terminology can make it even more complicated than it needs to be. Learn the jargon associated with image manipulation, and you will be better able to understand and evaluate claims of tampering and ‘doctoring’ as they pertain to online documentation.