fixed marathi dataset #39

Merged
master-wayne7 merged 1 commit into develop from fix/marathi-dataset on May 14, 2026

Conversation

master-wayne7 (Owner) commented May 14, 2026

Description

What does this PR do?
Fixes the Marathi language dataset.

Summary by CodeRabbit

  • Chores
    • Updated the language data file with new entries, alternate spellings, and expanded terminology coverage. Reorganized existing terms and variants.

master-wayne7 self-assigned this May 14, 2026
master-wayne7 merged commit 987190e into develop May 14, 2026
2 of 3 checks passed

coderabbitai Bot commented May 14, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1d490405-b65f-462b-8990-f119e5fc3a1c

📥 Commits

Reviewing files that changed from the base of the PR and between 920fe1b and b3c1432.

📒 Files selected for processing (1)
  • assets/data/mr.txt

📝 Walkthrough

This PR updates the Marathi word list in assets/data/mr.txt by replacing and expanding vocabulary across eight scattered line ranges. The changes rewrite early romanized slang entries, add transliteration variants and Devanagari mappings, and perform substantial rewrites of two large Devanagari vocabulary blocks, removing 75 lines and adding 36 new ones.

Changes

Marathi vocabulary list updates

| Layer / File(s) | Summary |
| --- | --- |
| Early romanized slang updates (assets/data/mr.txt) | Early word-list segments (lines 3–35, 48–52, 63–65, 75) are rewritten with new romanized slang terms, alternate spellings, and a single new entry, replacing prior entries throughout the initial portion of the file. |
| Mid-file transliteration and Devanagari variants (assets/data/mr.txt) | The transliteration section (lines 91–114) is rewritten with new romanized variants and additional Devanagari terms, replacing the prior "tuzya bapacha" block with a broader set of mappings. |
| Large Devanagari vocabulary block rewrites (assets/data/mr.txt) | Two substantial Devanagari sections (lines 123–163 and 175–238) are comprehensively rewritten with many new terms and removed entries, spanning from "कार्पेट मुन्चर" through "हेल हिटलर" and beyond. |
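
A rewrite of this kind can be sanity-checked locally by comparing the old and new versions of the list. The following is a minimal sketch, not part of this PR, and it assumes that assets/data/mr.txt stores one entry per line. The function names `load_entries` and `summarize_change` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def load_entries(text: str) -> list[str]:
    """Split raw word-list text into stripped, non-empty entries.

    Assumption: the file holds one entry per line.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def summarize_change(old_text: str, new_text: str) -> dict[str, list[str]]:
    """Report entries added, entries removed, and duplicates in the new list."""
    old = load_entries(old_text)
    new = load_entries(new_text)
    old_set, new_set = set(old), set(new)
    # An entry counts as a duplicate if it appears more than once in the new list.
    duplicates = sorted(entry for entry, count in Counter(new).items() if count > 1)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
        "duplicates": duplicates,
    }


if __name__ == "__main__":
    # Placeholder entries only; real input would be the two revisions of the file.
    report = summarize_change("alpha\nbeta\ngamma\n", "alpha\n beta \ndelta\ndelta\n")
    for key, entries in report.items():
        print(f"{key}: {entries}")
```

Comparing sets rather than line positions keeps the report stable when entries are only reordered, which matters for a change that reorganizes existing terms. The duplicate check is separate because a set comparison alone would hide repeated entries.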

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 A tome of words, both old and new,
Marathi phrases in every hue,
Lines replaced with flourished care,
From Devanagari to romanics fair,
The vocabulary grows, fresh and true! ✨
