002 · Guide · March 2026

OCR, plainly explained.

~5 minute read

Optical character recognition turns pictures of text into text. It sounds boring until you need it, at which point it's magic.

You have a PDF. You try to copy a sentence. Nothing highlights. That PDF is a picture — somebody scanned a page and saved the scan inside a PDF wrapper. The computer sees pixels, not letters.

OCR (optical character recognition) is the thing that reads the pixels and gives you letters back.

What OCR actually does

An OCR engine slices an image into lines, lines into words, words into glyphs, then matches each glyph to a character. The better engines also handle layout — columns, headers, tables — so the output looks vaguely like the input instead of a single mashed paragraph.
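To make the slicing concrete, here is a toy sketch in Python. It is not how any real engine (or Formatly) works internally. It assumes a made-up 3x5 bitmap font with only four letters, a clean black-and-white image, and fixed gaps between glyphs and words. Every name in it (`TEMPLATES`, `render`, `recognize`, and so on) is hypothetical and exists only for illustration.

```python
# Toy font: '#' is ink, '.' is background. Hypothetical templates, 3 wide by 5 tall.
TEMPLATES = {
    "H": ["#.#", "#.#", "###", "#.#", "#.#"],
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}


def render(text):
    """Draw text as a list of pixel rows: 1 blank column after each glyph, 2 more per space."""
    blocks = []
    for line in text.split("\n"):
        rows = [""] * 5
        for ch in line:
            cell = [".."] * 5 if ch == " " else [r + "." for r in TEMPLATES[ch]]
            rows = [r + c for r, c in zip(rows, cell)]
        blocks.append(rows)
    width = max(len(b[0]) for b in blocks)
    image = []
    for b in blocks:
        image.extend(r.ljust(width, ".") for r in b)
        image.append("." * width)  # blank row between text lines
    return image


def runs(flags):
    """Return (start, end) pairs for each consecutive run of True values."""
    out, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(flags)))
    return out


def split_lines(image):
    """Image -> text lines: cut wherever a pixel row contains no ink."""
    return [image[a:b] for a, b in runs(["#" in row for row in image])]


def glyph_spans(line):
    """Text line -> column spans of glyphs: cut wherever a pixel column contains no ink."""
    width = len(line[0])
    return runs([any(row[x] == "#" for row in line) for x in range(width)])


def match(glyph):
    """Glyph -> character: pick the template with the fewest differing pixels."""
    def distance(template):
        if len(template) != len(glyph) or len(template[0]) != len(glyph[0]):
            return float("inf")  # wrong size, cannot compare
        return sum(a != b for t, g in zip(template, glyph) for a, b in zip(t, g))

    best = min(TEMPLATES, key=lambda ch: distance(TEMPLATES[ch]))
    return best if distance(TEMPLATES[best]) != float("inf") else "?"


def recognize(image, word_gap=3):
    """Lines -> words -> glyphs -> characters, the pipeline described above."""
    text_lines = []
    for line in split_lines(image):
        chars, prev_end = [], None
        for start, end in glyph_spans(line):
            if prev_end is not None and start - prev_end >= word_gap:
                chars.append(" ")  # a wide gap means a new word
            chars.append(match([row[start:end] for row in line]))
            prev_end = end
        text_lines.append("".join(chars))
    return "\n".join(text_lines)


print(recognize(render("HI LO\nOHIO")))
```

Because `match` picks the closest template rather than demanding an exact hit, a glyph with one stray pixel still resolves to the right letter. Real engines face far messier input (skewed scans, touching letters, thousands of glyph shapes), which is why this stays a sketch.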

Modern OCR is extremely good at printed English. It's good at most other printed languages. It's passable at clean handwriting. It's terrible at messy handwriting, stylized fonts, and text on busy photographic backgrounds.

When you need it

When it won't help (much)

How to run OCR in Formatly

  1. Drop your image or PDF on the home page.
  2. In the target format dropdown, pick OCR (Extract Text).
  3. Convert, then download the resulting .txt.

Tips for better results

What you get back

Formatly outputs a plain .txt file. No formatting is preserved — just the text, in reading order. If you need structure, paste it into a Word doc or Google Doc and re-format from there.
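If you'd rather tidy the text before pasting it anywhere, a small script can do the first pass. The sketch below assumes the extracted .txt keeps the scan's hard line breaks, hyphenates words across them, and separates paragraphs with blank lines. This guide doesn't confirm any of that about Formatly's output, so treat it as a starting point. The function name `tidy_ocr_text` is made up for this example.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Rejoin hard-wrapped lines into paragraphs; blank lines separate paragraphs."""
    paragraphs = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        # Undo end-of-line hyphenation: "recog-\nnition" becomes "recognition".
        # Caveat: this also merges real hyphenated words split at a line end.
        block = re.sub(r"(\w)-\n(\w)", r"\1\2", block)
        # Collapse the remaining line breaks and whitespace runs to single spaces.
        paragraphs.append(" ".join(block.split()))
    return "\n\n".join(paragraphs)


sample = "Optical character recog-\nnition turns pictures\nof text into text.\n\nIt sounds boring.\n"
print(tidy_ocr_text(sample))
```

The result is one line per paragraph, which most word processors reflow cleanly when pasted.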
