The program SanskritTagger
SanskritTagger generates lexical and part-of-speech analyses of digital Sanskrit texts
using a stochastic language model.
SanskritTagger was used to build the annotated text corpus from which the
Digital Corpus of Sanskrit (DCS) was extracted.
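To give a rough idea of what tagging with a stochastic model can look like, here is a minimal sketch of a bigram tagger decoded with the Viterbi algorithm. It is not the algorithm or code of SanskritTagger, which is documented in the publications listed on this page. The tag set, the three word forms (written without diacritics) and all probabilities are invented for illustration only.

```python
import math

# Hypothetical toy model: every tag, word form and number below is
# invented for illustration and is not taken from SanskritTagger.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.7, "VERB": 0.3}          # P(first tag)
TRANS = {                                   # P(next tag | previous tag)
    "NOUN": {"NOUN": 0.4, "VERB": 0.6},
    "VERB": {"NOUN": 0.7, "VERB": 0.3},
}
EMIT = {                                    # P(word | tag)
    "NOUN": {"ramah": 0.5, "vanam": 0.4, "gacchati": 0.1},
    "VERB": {"ramah": 0.05, "vanam": 0.05, "gacchati": 0.9},
}


def viterbi(tokens):
    """Return the most probable tag sequence under the toy bigram model.

    Only the three word forms listed in EMIT are supported; an unknown
    token raises KeyError.
    """
    if not tokens:
        return []
    # Each column maps a tag to (log-probability of the best path that
    # ends in this tag, previous tag on that path).
    first = tokens[0]
    columns = [{
        t: (math.log(START[t]) + math.log(EMIT[t][first]), None)
        for t in TAGS
    }]
    for tok in tokens[1:]:
        last = columns[-1]
        column = {}
        for t in TAGS:
            # Pick the best predecessor tag for t.
            prev, score = max(
                ((p, last[p][0] + math.log(TRANS[p][t])) for p in TAGS),
                key=lambda pair: pair[1],
            )
            column[t] = (score + math.log(EMIT[t][tok]), prev)
        columns.append(column)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


if __name__ == "__main__":
    print(viterbi(["ramah", "vanam", "gacchati"]))
```

A real tagger for Sanskrit also has to resolve sandhi and split compounds before or during tagging; this sketch assumes the text has already been split into word forms.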
Please note that some parts of the program interface and of the help system of SanskritTagger are still
in German. They will be translated into English over the coming months.
SanskritTagger is described in the following publications:
- Oliver Hellwig: SanskritTagger, a stochastic lexical and POS tagger for Sanskrit. In: Proceedings of the First International Sanskrit Computational Linguistics Symposium, pp. 37-46.
- Oliver Hellwig: Performance of a lexical and POS tagger for Sanskrit. In: Proceedings of the Fourth International Sanskrit Computational Linguistics Symposium, pp. 162-172.