The Russian Legislative Corpus

arXiv:2406.04855v3

Abstract: We present a comprehensive corpus of Russian primary and secondary legislation adopted between 1991 and 2025, comprising 304,382 texts (194,425,905 tokens). The corpus is available in two versions: the basic version contains texts with simple metadata, while the detailed version includes both the original texts and their equivalents converted to the Universal Dependencies CoNLL-U format, annotated with parts of speech, morphological features, and syntactic dependencies.
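As a rough illustration of what the CoNLL-U annotation layer looks like in practice, the following minimal sketch reads one sentence in the standard ten-column CoNLL-U layout using only the Python standard library. The sample sentence and its annotations are invented for illustration and are not taken from the corpus; the names `Token`, `parse_feats`, and `parse_sentence` are hypothetical helpers, and the corpus's actual file layout and distribution format are not assumed here.

```python
from dataclasses import dataclass

# Illustrative sentence only (NOT taken from the corpus). Columns follow the
# CoNLL-U order: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
SAMPLE_ROWS = [
    "1 Закон закон NOUN _ Animacy=Inan|Case=Nom|Gender=Masc|Number=Sing 2 nsubj _ _",
    "2 вступает вступать VERB _ Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin|Voice=Act 0 root _ _",
    "3 в в ADP _ _ 4 case _ _",
    "4 силу сила NOUN _ Animacy=Inan|Case=Acc|Gender=Fem|Number=Sing 2 obl _ _",
    "5 . . PUNCT _ _ 2 punct _ _",
]
# CoNLL-U columns are tab-separated; build the tabs explicitly so the
# example survives copy-and-paste.
SAMPLE = "# text = Закон вступает в силу.\n" + "\n".join(
    "\t".join(row.split()) for row in SAMPLE_ROWS
)


@dataclass
class Token:
    id: int
    form: str
    lemma: str
    upos: str      # universal part-of-speech tag
    feats: dict    # morphological features, e.g. {"Case": "Nom"}
    head: int      # ID of the syntactic head; 0 marks the root
    deprel: str    # dependency relation to the head


def parse_feats(raw):
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_sentence(block):
    """Parse one CoNLL-U sentence block into a list of Token objects."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        tokens.append(
            Token(
                id=int(cols[0]),
                form=cols[1],
                lemma=cols[2],
                upos=cols[3],
                feats=parse_feats(cols[5]),
                head=int(cols[6]),
                deprel=cols[7],
            )
        )
    return tokens


tokens = parse_sentence(SAMPLE)
for tok in tokens:
    print(tok.id, tok.form, tok.upos, tok.deprel, tok.head)
```

Each token carries the three annotation layers named in the abstract: `upos` for part of speech, `feats` for morphological features, and the `head`/`deprel` pair for syntactic dependencies. For real use, an established CoNLL-U reader would likely be a better choice than this hand-rolled parser.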
