SanskritTagger - OCR and digitization software for Hindi and Sanskrit

The program SanskritTagger

SanskritTagger generates lexical and part-of-speech analyses of digital Sanskrit texts using a stochastic language model. It was used to build the annotated text corpus from which the Digital Corpus of Sanskrit (DCS) was extracted. Please note that some parts of the program interface and the help system are still in German; they will be rewritten in English in the coming months.

SanskritTagger is described in the following publications:

Oliver Hellwig: SanskritTagger, a stochastic lexical and POS tagger for Sanskrit. In: Proceedings of the First International Sanskrit Computational Linguistics Symposium, pp. 37-46.
Oliver Hellwig: Performance of a lexical and POS tagger for Sanskrit. In: Proceedings of the Fourth International Sanskrit Computational Linguistics Symposium, pp. 162-172.

License

SanskritTagger is distributed as freeware under a permissive license. License terms are displayed during installation.

You are encouraged to share annotated data created using SanskritTagger with the scientific community. Please refer to the description of data synchronisation in the help file of SanskritTagger.

Downloading SanskritTagger

A comprehensive explanation of how to download and install SanskritTagger can be found on the download page.

