
How to Extract Text From Images and Scanned PDFs for Free


You've got a photo of a worksheet, a screenshot of a slide, or a scanned PDF that's really just a picture of a page. You try to select the words and nothing happens, because there's no text there to select, only an image of text.

This guide shows you how to extract text from any image or scanned PDF for free, using OCR right in your browser. Once the text is out, you can copy it, edit it, search it, translate it, or have it read aloud.


What is OCR, and why would I need it?

OCR stands for optical character recognition. It looks at a picture of text, recognises the letters and words, and turns them into real, selectable text you can use anywhere.

You need it whenever the words are locked inside an image: a photo you took of a page, a screenshot, a scanned handout, or a PDF that was scanned rather than typed. Without OCR, those words can't be copied, searched, or read aloud. With it, they behave like any other text on the web.

How do I know if a PDF is scanned?

There's a quick test. Try to select a few words by clicking and dragging over them. If the words highlight cleanly, the PDF already has real text and you don't need OCR. If you can only draw a box over the page and no words highlight, the page is a scanned image and OCR is what you need.

The same is true for anything you screenshot or photograph. If it's a picture, the text inside it needs extracting first.
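
If you'd rather check a batch of files in code than by hand, the same test can be sketched in Python. This is only an illustration, not how Helperbird does it: the function name looks_scanned and the 20-character threshold are made up for this example, and it assumes you already have each page's text from a PDF library (pypdf's page.extract_text() is one option).

```python
def looks_scanned(page_texts, min_chars_per_page=20):
    """Guess whether a PDF is scanned, given the text a PDF library found per page.

    page_texts: one string (or None) per page, as returned by the PDF library.
    min_chars_per_page: hypothetical threshold; tune it for your own documents.
    """
    if not page_texts:
        # No pages with text at all: treat it as scanned.
        return True

    pages_with_text = 0
    for text in page_texts:
        # Ignore whitespace so a page of blank lines doesn't count as text.
        visible = "".join((text or "").split())
        if len(visible) >= min_chars_per_page:
            pages_with_text += 1

    # If no page has a real text layer, the file is probably just images.
    return pages_with_text == 0


print(looks_scanned(["", "  \n"]))
print(looks_scanned(["This page has plenty of real, selectable text on it."]))
```

A file with a mix of typed and scanned pages would come back as not scanned here, so for mixed documents you may prefer to check each page on its own.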

How to extract text from any image, PDF, or website

The simplest free way is to let your browser do the recognition for you. With Helperbird you can extract text from any image, PDF, or website in a few steps:

  1. Open the image or scanned PDF, or have it ready to upload.
  2. Run the extract-text (OCR) tool on it.
  3. Let it recognise the words and return them as plain, editable text.
  4. Copy the text, or send it straight to read-aloud.

That's the whole process. No retyping, no specialist software, and it works on photos, screenshots, and scanned documents alike.
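
For the curious, the four steps above can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not Helperbird's code: extract_text and fake_engine are hypothetical names, and the OCR engine is passed in as a plain function so the example runs without installing anything. In practice you could wrap a real engine there, such as Tesseract through the pytesseract package's image_to_string.

```python
from typing import Callable


def extract_text(image_bytes: bytes, recognise: Callable[[bytes], str]) -> str:
    """Run an OCR engine over an image and return tidy, editable plain text."""
    # Steps 2 and 3: hand the image to the engine and get raw text back.
    raw = recognise(image_bytes)
    # Trim stray spaces on each line but keep blank lines between paragraphs.
    lines = [line.strip() for line in raw.splitlines()]
    return "\n".join(lines).strip()


def fake_engine(image_bytes: bytes) -> str:
    # Stand-in for a real OCR engine (an assumption for this sketch).
    return "  Photosynthesis worksheet \n\n 1. Name the green pigment.  \n"


# Step 1: have the image ready. Step 4: copy or reuse the returned text.
text = extract_text(b"fake image bytes", fake_engine)
print(text)
```

Passing the engine in as a function keeps the tidy-up logic separate from whichever OCR tool you end up using.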

What can I do with the text once it's out?

This is where OCR really pays off, because extracted text works with the same reading and writing tools as any other text.

Have it read aloud. Once the words are real text, you can use text-to-speech on any website to listen to a scanned handout instead of squinting at a blurry photo of it. Our full guide to having any website or PDF read aloud covers every option.

Open it in a clean reading view. Strip away the clutter and read the extracted text in a calm, adjustable layout with immersive reader on any website.

Translate it. If the document isn't in your first language, you can translate a whole page or selected text once the words are extracted.

Simplify or summarise it. Dense text becomes easier to follow when you simplify it or summarise it to get the key points first, and both tools work on any website.

Working with scanned PDFs directly

If you're dealing with a full scanned document rather than a single image, it helps to open it in a proper reader. Helperbird's PDF support lets you open the file, extract its text, and read it aloud in one place, and our guide to reading long PDFs is worth a read if the document runs to many pages.

Making scanned material usable is one of the most practical accessibility wins there is. Our post on how to make scans and copies accessible walks through it for teachers and parents preparing materials for students.

Tips for better OCR results

A few small things make recognition more accurate:

  • Use the clearest source you have. A sharp, well-lit photo beats a dark or blurry one every time.
  • Keep the page straight. Text that's tilted or curved is harder to recognise, so flatten the page and line it up squarely.
  • Crop to the text. If you only need one paragraph, capture just that part rather than the whole page.
  • Check the result. OCR is usually accurate on clean print but can confuse look-alike characters such as l, 1, and I, so glance over the extracted text and fix the odd stray character before you rely on it.
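
If you extract a lot of text, part of that final check can be automated. The sketch below is one possible approach, with hypothetical helper names (tidy_ocr_text and suspicious_words) that don't come from any OCR tool. It rejoins words split across lines, collapses extra spaces, and flags words that mix letters and digits, which is a common sign of a misread character.

```python
import re
from typing import List


def tidy_ocr_text(text: str) -> str:
    """Clean up common OCR layout leftovers."""
    # Rejoin words split by a hyphen at a line end: "recog-\nnition" -> "recognition".
    # Limitation: a genuine hyphenated word broken at the hyphen loses its hyphen too.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


def suspicious_words(text: str) -> List[str]:
    """Return words that mix letters and digits, such as 'c1earest'."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return [
        word for word in words
        if re.search(r"[A-Za-z]", word) and re.search(r"\d", word)
    ]


sample = "Use the c1earest   source you have, and keep recog-\nnition in mind."
cleaned = tidy_ocr_text(sample)
print(cleaned)
print(suspicious_words(cleaned))
```

This is only a heuristic: it will also flag legitimate terms like A4 or 3rd, so treat the list as words to look at again, not as confirmed errors.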

Who benefits most from OCR?

Just about everyone, but a few groups especially. Students handed photocopied or scanned worksheets can finally have them read aloud. People with dyslexia or low vision can move text out of fixed images and into a format they can resize, recolour, and listen to. Anyone learning a new language can extract text and translate it on the spot.

If reading is the hard part for you, pairing OCR with read-aloud is a powerful combination. Our roundup of free ways to make reading online easier and our guide to the best free writing support tools in your browser show how the pieces fit together.

The short version

If you can't select the words, they're trapped in an image, and OCR sets them free. Extract the text once, then read it aloud, translate it, simplify it, or drop it into your own document. Try it on the next photo or scanned PDF that won't let you copy a single word.

