OCR software for Hindi, Marathi, Gujarati, Tamil, and Sanskrit

Our OCR programs for Indian scripts process Devanagari (Hindi, Marathi, Sanskrit), Gujarati, and Tamil texts. Use OCR programs for converting printed books, letters, or newspapers into digital text documents. OCR programs are valuable tools for a modern paperless office, because they help to transform printed content into digital data.

An OCR (optical character recognition) program can be thought of as a "computer typist": You scan a page of text, and the OCR program takes care of typing the page. After a few seconds, the OCR program has produced a digital and searchable version of the printed Devanagari, Gujarati, or Tamil text. This digital text can be edited with any office program.

Using OCR software makes digitization much more efficient: Digitizing a page of Hindi text takes just a few seconds, and you can concentrate on the content instead of typing the page manually.

OCR software is useful for ...

  • Publishing houses, data entry companies and libraries: Digitize Hindi or Tamil books and newspapers
  • Companies and administration: Create digital text documents from printed business letters, or convert printed records into digital ones
  • and, of course, everybody interested in generating digital, computer-readable text documents.

ind.senz OCR programs recognize Devanagari (Hindi, Marathi, Sanskrit), Gujarati, and Tamil documents with high speed and accuracy:
  • HindiOCR is designed for typed texts written in Hindi.
  • MarathiOCR is designed for typed Marathi texts.
  • TamilOCR is designed for printed or typed Tamil texts.
  • GujaratiOCR is our latest OCR tool.
  • SanskritOCR is suited for anyone who explores the vast Sanskrit literature, and especially the scientific community.
Download the PDF fact-sheet about ind.senz and its OCR engines.  
Download the PDF info-sheet about how ind.senz OCR programs work.  
Using Hindi OCR and Sanskrit OCR for digitizing scanned texts

How OCR works

Only three steps are necessary to digitize a Hindi document:

1. Scan a Hindi document or open a scanned document:
Scanned Hindi text
2. Let HindiOCR recognize the document.
Applying the Hindi OCR program
3. Export the digitized and editable Hindi text to an office program.
पर खंडहर अपने-आपमें खंडहर है रजनीकांत का मन उसकी
खोखली दीवारों में जाकर भले ही छिप ले, पर उनके पैर उधर नहीं उठते
हैं वह अतीत है मन में करुण-मधुर भावुकता को जगाने का अदृष्ट
उपादान उसे साकार करना गलती होगी वर्तमान का उद्दाम नाटक
नष्ट हो जाएगा चेहरों की नकाबें गिर जाएंगी बिजली की चौंधियाने-
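
Once exported, the recognized text is ordinary Unicode, so it can be searched and checked with general-purpose tools. The following is a minimal Python sketch of that idea, assuming the exported text is available as a UTF-8 string. The function names `normalize`, `count_word`, and `devanagari_ratio` are hypothetical helpers written for this illustration; they are not part of any ind.senz program.

```python
import unicodedata

# First line of the exported sample text shown above.
SAMPLE = "पर खंडहर अपने-आपमें खंडहर है रजनीकांत का मन उसकी"


def normalize(text: str) -> str:
    """Return the text in Unicode NFC form so equivalent sequences compare equal."""
    return unicodedata.normalize("NFC", text)


def count_word(text: str, word: str) -> int:
    """Count whitespace-separated tokens that equal `word` after normalization."""
    target = normalize(word)
    return sum(1 for token in normalize(text).split() if token == target)


def devanagari_ratio(text: str) -> float:
    """Fraction of non-space characters inside the Devanagari block (U+0900 to U+097F)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    in_block = sum(1 for c in chars if "\u0900" <= c <= "\u097f")
    return in_block / len(chars)


if __name__ == "__main__":
    # Search the digitized text for a word, as one would in an office program.
    print(count_word(SAMPLE, "खंडहर"))
    # Rough sanity check that the export really contains Devanagari text.
    print(round(devanagari_ratio(SAMPLE), 2))
```

A check like `devanagari_ratio` is only a rough plausibility test; it says nothing about recognition accuracy, which still has to be verified against the printed page.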


January 5th, 2016: Gujarati OCR released

August 4th, 2015: Marathi OCR released