recognizing text from memes

26 Jul 2017

The goal of this project is to recognize the text in an image and return it as a string.

In a previous project, images of memes were partitioned into a text component and an image component:

We may want to convert the text component into a string so that we can reuse, modify or analyze it.

Follow the instructions here to install the Tesseract optical character recognition (OCR) engine.

Clone the repository: git clone https://github.com/evmarts/meme-text-ocr.git

Run the script: python meme-text-ocr.py

Suppose we have the following meme:

Assume we have some way of partitioning the meme into two components, a text component and an image component:

After placing the image of the text in the in/ directory, we can run the script:

evmarts$ python meme-text-ocr.py
Image of text to recognize: in/text.jpg

The script then outputs the following:

Text recognized as: 'When you call shotgun but end up in the back'
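
The script itself is not reproduced here. As a rough illustration of the recognition step, the following is a minimal sketch, not the repository's actual code. It assumes the pytesseract wrapper and Pillow are installed alongside Tesseract; the function names and the `ocr` parameter are hypothetical. Meme captions usually span several lines, so the sketch also collapses the line breaks in the raw OCR output into single spaces to produce a one-line string.

```python
import re


def clean_ocr_text(raw):
    """Collapse line breaks and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def recognize_text(image_path, ocr=None):
    """Return the text found in the image at image_path as a one-line string.

    `ocr` is a hypothetical hook: any callable that maps a path to raw text.
    When it is omitted, pytesseract is used (assumed to be installed).
    """
    if ocr is None:
        # Imported lazily so clean_ocr_text works without these packages.
        from PIL import Image
        import pytesseract

        def ocr(path):
            with Image.open(path) as image:
                return pytesseract.image_to_string(image)

    return clean_ocr_text(ocr(image_path))
```

Calling `recognize_text("in/text.jpg")` would then return the recognized caption as a single string. Passing a stand-in callable as `ocr` makes it possible to try the cleanup step without Tesseract installed.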