Bilevel, grayscale, color — three TIFF flavors
Office scanners typically offer three or four modes: black-and-white (bilevel), grayscale, color, and sometimes "auto-detect". The choice can change TIFF file size by an order of magnitude or more. TIFF2PDF preserves whatever mode the input used, so understanding the modes is the easiest way to predict output PDF size.
Bilevel — 1 bit per pixel
Each pixel is one bit: black or white. The scanner applies a threshold (usually around 50% gray) to its analog readings, snapping every value above to white and below to black.
Best suited for:
- Printed text: laser-printed, typewriter, ink-jet, or older offset printing produces high-contrast pages where bilevel discards no information.
- Line drawings: technical schematics, architectural plans, comic strips.
- Fax-style content: anything originally meant for fax transmission already assumes bilevel.
The CCITT G4 codec (see the dedicated post) typically compresses bilevel scans to a few percent of raw pixel size. A US Letter page at 300 DPI is about 1 MB raw and ~45 KB as a G4 bilevel TIFF, so a 100-page contract is ~5 MB. During conversion, the bitonal pixels are decoded and re-embedded with a PDF-native compression filter; the resulting PDF is typically 2–3× larger than the source TIFF (see the size comparison below).
Poorly suited for:
- Photographs (lose all tonal detail).
- Handwriting in pencil (pencil strokes have varying intensity that thresholds away).
- Halftone-printed images from magazines or newspapers (the dot pattern interferes with the threshold).
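The thresholding step described above can be sketched in a few lines of Python. This is an illustration only, not the scanner's actual firmware logic: the function name, the 0–255 scale with 0 as black, the cutoff of 128, and the sample readings are all assumptions chosen for the example.

```python
def threshold_row(gray_values, cutoff=128):
    """Map 8-bit gray readings (0 = black, 255 = white) to bilevel pixels.

    Returns 1 for white and 0 for black. Treating a reading exactly at the
    cutoff as white is an arbitrary choice made for this sketch.
    """
    return [1 if value >= cutoff else 0 for value in gray_values]


# Hypothetical readings across one scan line: paper, laser-printed ink,
# paper, then a light pencil stroke (mid-to-light gray), paper again.
scan_line = [250, 245, 20, 15, 240, 170, 165, 250]
bits = threshold_row(scan_line)

print("".join("#" if bit == 0 else "." for bit in bits))
# prints: ..##....  (the ink survives, the pencil stroke is gone)
```

The pencil readings (170 and 165) sit above the cutoff, so they snap to white along with the paper. That is the "thresholds away" effect in the list above.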
Grayscale — 8 bits per pixel
256 levels of gray per pixel. Captures shading, anti-aliasing, soft edges. The same Letter page at 300 DPI grayscale is 8.4 MB raw, roughly 600 KB after LZW compression, ~400 KB after deflate.
Best for:
- Handwritten content: pencil and pen strokes have intensity variation that grayscale captures.
- Black-and-white photographs: scanned old photos, microfilm.
- Documents with light/uneven printing: faded carbon copies, mimeographed pages, where the threshold for bilevel would lose readability.
- Documents with subtle stamps or seals: gray rubber-stamp impressions.
Grayscale TIFFs land in the PDF as /DeviceGray at 8 bits per component. The source compression doesn't survive into the output as-is — pixels are decoded during conversion and re-embedded with a PDF-native filter, typically /FlateDecode.
Color — 24 bits per pixel
Full RGB color, 8 bits per channel. Same Letter page is 25 MB raw, around 3 MB lossless (LZW or deflate), or ~600 KB lossy (JPEG-in-TIFF).
Best for:
- Documents with colored highlighting or stamps: colored sticky notes, signature stamps in red ink.
- Marketing materials, brochures: where color is part of the content.
- Photographic content: scanned color photos, art reproduction.
- Anything where color carries information: highlighted passages, color-coded charts.
RGB TIFFs land in the PDF as /DeviceRGB at 8 bits per component. When the source TIFF has an ICC profile (often the case from scanner drivers), it is preserved in the resulting PDF as /ColorSpace [/ICCBased N 0 R] for color-accurate rendering.
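The raw figures quoted for the three modes (8.4 MB and 25 MB for a Letter page at 300 DPI) follow from simple arithmetic on page dimensions and bit depth. A minimal sketch, assuming each pixel row is padded to a whole byte; the function name and the MODES table are made up for this example:

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes, with each pixel row padded to a whole byte."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bytes_per_row = (width_px * bits_per_pixel + 7) // 8
    return bytes_per_row * height_px


# Bits per pixel for the three scanner modes discussed in this post.
MODES = {"bilevel": 1, "grayscale": 8, "color": 24}

for name, bpp in MODES.items():
    size = raw_scan_bytes(8.5, 11, 300, bpp)  # US Letter at 300 DPI
    print(name, size, "bytes")
```

For US Letter at 300 DPI (2550 × 3300 pixels) this gives 1,052,700 bytes for bilevel, 8,415,000 for grayscale, and 25,245,000 for color, before any compression.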
Auto-detect modes — the pitfall
Many scanners have an "auto" mode that picks bilevel, grayscale, or color per page based on content analysis. The promise is "small files when possible, full color when needed". The reality is that auto-detect is unreliable on mixed content:
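- A page with a small color stamp triggers full color for the whole page → about 25× the bilevel size (25 MB vs ~1 MB raw).
- A mostly grayscale page with a red ribbon at the top is scanned as color → again about 25× bilevel, and 3× what grayscale would have needed.
- A page with light-gray printing or pencil annotations triggers grayscale → about 13× the bilevel size (~600 KB LZW vs ~45 KB G4), which is usually fine.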
- A clean printed page stays bilevel → smallest.
Multi-page TIFFs from auto-detect mode can have wildly different sizes per page. If your input is unexpectedly large, opening it in an image viewer (Photoshop, GIMP, Preview) and stepping through the pages will quickly show which pages went to color when they didn't need to.
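Stepping through pages by hand gets tedious on long documents. As an alternative, the per-page mode and codec can be read straight from the TIFF headers. The sketch below uses only the Python standard library and walks the chain of image file directories (IFDs) in a classic TIFF. It is a simplified reader, not a full parser: it assumes every IFD is a page, ignores BigTIFF, and the function name and labels are invented for this example.

```python
import struct

# Compression tag (259) values from the TIFF 6.0 spec and common extensions.
COMPRESSION = {1: "none", 2: "CCITT RLE", 3: "CCITT G3", 4: "CCITT G4",
               5: "LZW", 6: "old-style JPEG", 7: "JPEG", 8: "deflate",
               32773: "PackBits", 32946: "deflate"}


def list_tiff_pages(data):
    """Return (page_number, mode, compression) for each IFD in a classic TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None or struct.unpack_from(order + "H", data, 2)[0] != 42:
        raise ValueError("not a classic (non-BigTIFF) TIFF file")
    (ifd,) = struct.unpack_from(order + "I", data, 4)
    pages, seen = [], set()
    while ifd and ifd not in seen:          # 'seen' guards against IFD loops
        seen.add(ifd)
        (count,) = struct.unpack_from(order + "H", data, ifd)
        tags = {}
        for i in range(count):
            pos = ifd + 2 + 12 * i          # each IFD entry is 12 bytes
            tag, typ, n = struct.unpack_from(order + "HHI", data, pos)
            if typ not in (3, 4):           # only SHORT and LONG matter here
                continue
            fmt, size = ("H", 2) if typ == 3 else ("I", 4)
            value_pos = pos + 8
            if size * n > 4:                # value field holds an offset instead
                (value_pos,) = struct.unpack_from(order + "I", data, value_pos)
            tags[tag] = struct.unpack_from(order + fmt, data, value_pos)[0]
        bits = tags.get(258, 1)             # BitsPerSample, default 1
        samples = tags.get(277, 1)          # SamplesPerPixel, default 1
        if samples >= 3 or tags.get(262) == 3:  # multi-channel or palette color
            mode = "color"
        elif bits == 1:
            mode = "bilevel"
        else:
            mode = "grayscale"
        codec = COMPRESSION.get(tags.get(259, 1), "other")
        pages.append((len(pages) + 1, mode, codec))
        (ifd,) = struct.unpack_from(order + "I", data, ifd + 2 + 12 * count)
    return pages
```

To use it, read the file as bytes (for example `open("scan.tif", "rb").read()`, where the file name is a placeholder) and pass the result to `list_tiff_pages`. Pages reported as "color" in a document that is mostly "bilevel" are the ones worth rescanning or converting.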
Pre-processing to reduce size
If you have a color or grayscale TIFF that should have been bilevel (most pages are pure printed text, no real color content), convert it to bilevel before uploading. In Photoshop: Image → Mode → Grayscale first if the scan is color, then Image → Mode → Bitmap (with the 50% Threshold method). In GIMP: Image → Mode → Indexed with 1-bit black-and-white palette, then export as TIFF with CCITT Group 4 compression. The output is dramatically smaller: text stays crisp, halftones disappear, and any colored content collapses to black or white. That trade-off is acceptable for archival of textual records.
Predicting PDF output size
The conversion decodes TIFF pixels and re-embeds them in the PDF, typically with a Flate-compressed filter regardless of the source codec; JPEG-in-TIFF input is the exception and comes out DCT-compressed. The source codec affects output size roughly like this:
- Bilevel + G4 input → 1-bit grayscale, Flate-compressed in the PDF. Output 2–3× larger than input.
- Grayscale + LZW input → 8-bit grayscale, Flate-compressed. Output similar size to input.
- Color + JPEG-in-TIFF input → DCT-compressed color in the PDF. Output similar to input.
- None / PackBits input → Flate. Output 30–60% of input.
Plus 1–3 KB framing per page for PDF dictionaries and the Pages tree. For maximum size efficiency on bitonal archives, specialist archival tools that preserve CCITT compression all the way into the PDF produce significantly smaller output; ask your IT department or look for "TIFF to PDF" software that explicitly mentions CCITT pass-through.
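The rules of thumb above can be turned into a quick estimator. This is a rough sketch built only from the figures in this section, not a description of the converter's internals. The function name and codec keys are made up for the example, "2–3× larger" is read as 2–3 times the input size, and 1 KB is taken as 1000 bytes.

```python
# (low, high) multipliers taken from the rough figures in the list above.
SIZE_FACTORS = {
    "g4": (2.0, 3.0),        # bilevel + CCITT G4
    "lzw": (1.0, 1.0),       # grayscale + LZW: "similar size"
    "jpeg": (1.0, 1.0),      # color + JPEG-in-TIFF: "similar"
    "none": (0.3, 0.6),      # uncompressed
    "packbits": (0.3, 0.6),  # PackBits
}
FRAMING_BYTES = (1_000, 3_000)  # 1-3 KB of PDF framing per page


def estimate_pdf_bytes(tiff_bytes, codec, pages):
    """Return a (low, high) estimate of output PDF size in bytes."""
    low_factor, high_factor = SIZE_FACTORS[codec]
    low = round(tiff_bytes * low_factor) + pages * FRAMING_BYTES[0]
    high = round(tiff_bytes * high_factor) + pages * FRAMING_BYTES[1]
    return low, high


# The 100-page, ~5 MB bilevel contract used as an example earlier.
low, high = estimate_pdf_bytes(5_000_000, "g4", pages=100)
print(f"{low / 1e6:.1f}-{high / 1e6:.1f} MB")
# prints: 10.1-15.3 MB
```

Treat the result as a range for planning, since real output depends on page content and on how well Flate handles it.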