Excited to share that we have released RuBLiMP (Russian Benchmark of Linguistic Minimal Pairs), a novel benchmark for evaluating Russian language models (LMs).
❓RuBLiMP consists of 45,000 minimal pairs covering 12 grammatical phenomena that are well represented in Russian linguistics, spanning morphology, syntax, and semantics. A minimal pair consists of a grammatical and an ungrammatical sentence (e.g., The cat is on the mat / *The cat are on the mat), and an LM is expected to assign the grammatical sentence a higher score under a chosen scoring function.
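For readers new to minimal-pair evaluation, here is a rough sketch of the idea in Python. It is not the RuBLiMP evaluation code: the scoring function is a toy add-one-smoothed bigram model standing in for the LM under test, and the corpus, the pairs, and the names `train_bigram`, `log_prob`, and `accuracy` are all made up for illustration. In practice the score would come from the LM itself, for example a sentence log-probability.

```python
import math
from collections import Counter

def tokenize(sentence):
    # Whitespace tokenization with sentence boundary markers (an assumption).
    return ["<s>"] + sentence.lower().split() + ["</s>"]

def train_bigram(corpus):
    """Count bigrams and their left contexts over a list of sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = tokenize(sentence)
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)

def log_prob(sentence, model):
    """Sentence log-probability under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = tokenize(sentence)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

def accuracy(pairs, score):
    """Share of pairs where the grammatical sentence gets the higher score."""
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)

# Toy stand-in for a pretraining corpus (hypothetical data).
CORPUS = [
    "the cat is on the mat",
    "the dog is on the rug",
    "the cats are on the mat",
]

# (grammatical, ungrammatical) minimal pairs (hypothetical data).
PAIRS = [
    ("the cat is on the mat", "the cat are on the mat"),
    ("the cats are on the mat", "the cats is on the mat"),
]

MODEL = train_bigram(CORPUS)
print(accuracy(PAIRS, lambda s: log_prob(s, MODEL)))
```

Swapping the `score` argument of `accuracy` for a function backed by a real LM keeps the rest of the loop unchanged.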
Our approach makes it possible to:
🔸generate minimal pairs at scale from any text domain
🔸estimate whether a grammatical sentence appears in the LM's pretraining corpus
💡RuBLiMP can be used for evaluating the sensitivity of LMs to grammatical phenomena in Russian and for developing ranking and grammatical error detection methods.
🔸 Read more in our pre-print: https://arxiv.org/abs/2406.19232
🔸 HuggingFace: https://huggingface.co/datasets/RussianNLP/rublimp
🔸 GitHub: https://github.com/RussianNLP/RuBLiMP
