The program SanskritTagger
SanskritTagger generates lexical and part-of-speech (POS) analyses of digital Sanskrit texts using a stochastic language model. It has been used to build the annotated text corpus from which the Digital Corpus of Sanskrit (DCS) has been extracted. Please note that some parts of the program interface and of the help system are still in German; they will be rewritten in English in the coming months.
SanskritTagger is described in the following publications:
- Oliver Hellwig: SanskritTagger, a stochastic lexical and POS tagger for Sanskrit. In: Proceedings of the First International Sanskrit Computational Linguistics Symposium, pp. 37-46.
- Oliver Hellwig: Performance of a lexical and POS tagger for Sanskrit. In: Proceedings of the Fourth International Sanskrit Computational Linguistics Symposium, pp. 162-172.