This was my 2024 project, and it is very minimal because I overestimated my coding skills. As of 2025 I intend to work on other things, so the repository is archived.
NOTE TO SELF: Get your basics right 🤦♂️
Why CSI? | Features | Project Status | Getting Started | Contributing | Contact
The Code Similarity Index (CSI) is designed to simplify the detection of code similarity across extensive codebases. CSI is tailored to address the challenges of plagiarism detection and code quality assessment by recognizing not only exact matches but also structural and logical similarities in code. Whether you're an educator, a software developer, or a company, CSI provides a comprehensive solution to manage and assess code integrity.
- Advanced Code Matching: CSI leverages abstract syntax trees (AST) and machine learning models to identify code similarity, even if the code has been obfuscated or refactored.
- Multi-Language Support: Initially supporting Python and Java, with plans to extend compatibility to additional programming languages.
- Customizable Thresholds: Adjust similarity detection settings to fine-tune between strict plagiarism detection and permissible code reuse.
- User-Friendly Interface: An intuitive web interface for easy code uploads and detailed similarity reports.
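To make the AST-matching and threshold ideas above concrete, here is a minimal sketch in Python. It is not CSI's actual implementation: the function names (`node_types`, `structural_similarity`, `is_suspicious`) and the default threshold of 0.8 are assumptions chosen for illustration, and the comparison uses the standard library's `difflib.SequenceMatcher` rather than a machine learning model. The idea is that comparing sequences of AST node types ignores identifier names, so renamed variables do not hide a structural match.

```python
import ast
from difflib import SequenceMatcher
from typing import List


def node_types(source: str) -> List[str]:
    """Return the AST node type names of `source` in ast.walk order."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def structural_similarity(source_a: str, source_b: str) -> float:
    """Compare two snippets by node-type sequence; returns a ratio in [0, 1]."""
    matcher = SequenceMatcher(None, node_types(source_a), node_types(source_b))
    return matcher.ratio()


def is_suspicious(source_a: str, source_b: str, threshold: float = 0.8) -> bool:
    """Flag a pair whose similarity meets the (hypothetical) threshold."""
    return structural_similarity(source_a, source_b) >= threshold


if __name__ == "__main__":
    original = "def add(x, y):\n    return x + y\n"
    # Same structure, only the identifiers differ.
    renamed = "def total(first, second):\n    return first + second\n"
    print(structural_similarity(original, renamed))
    print(is_suspicious(original, renamed))
```

Raising the threshold makes the check stricter (fewer pairs flagged), while lowering it tolerates more shared structure, which is the trade-off the customizable thresholds feature describes.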
We're in the early stages of development, with a goal to establish core functionality within the next month. Stay tuned for updates as we make progress!
To get started with CSI, clone the repository and follow the setup instructions provided in the README:
git clone https://github.com/ryoari/Code.Similarity.Index.git
cd Code.Similarity.Index
For detailed setup and usage instructions, refer to the README file.
CSI is an open, collaborative project, and contributions are highly encouraged. Whether you're interested in coding, providing feedback, or simply exploring, we welcome your involvement!
- Join the Discussion: Participate in our GitHub Discussions.
- First-Time Contributors: Check out issues labeled "good first issue" to get started.
For questions, feedback, or collaboration inquiries, please reach out to us at 1syf04lap@mozmail.com.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.