Entify is a research laboratory for benchmarking and comparing two approaches to Named Entity Recognition (NER) in Natural Language Processing: a statistical model (CRF) and a neural model (spaCy).
- Dual-Engine Comparison: Real-time side-by-side analysis of CRF and spaCy models.
- Advanced Confidence Metrics: Probabilistic scores for both engines (CRF Marginals vs Neural Heuristics).
- Localized Presets: Specialized inference samples for News, Tech, Finance, and Business from a Pakistani perspective.
- Context-Aware Analytics: Intelligent metrics board that toggles between "Quality Leader" and "Unique Labels" based on mode.
- Export Capabilities: Seamlessly export findings in JSON or TXT formats.
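The export and "Unique Labels" features above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not Entify's actual code: the `findings` structure, its field names (`text`, `label`, `confidence`), and the helpers `export_findings` and `unique_labels` are all hypothetical.

```python
import json


def export_findings(findings, fmt="json"):
    """Serialize per-engine entity findings to a JSON or plain-text string.

    `findings` maps an engine name to a list of entity dicts. This schema
    is an assumption made for illustration only.
    """
    if fmt == "json":
        return json.dumps(findings, indent=2, ensure_ascii=False)
    if fmt == "txt":
        lines = []
        for engine, entities in findings.items():
            lines.append(f"[{engine}]")
            for ent in entities:
                # One tab-separated row per entity: text, label, confidence.
                lines.append(
                    f"{ent['text']}\t{ent['label']}\t{ent['confidence']:.2f}"
                )
        return "\n".join(lines)
    raise ValueError(f"unsupported format: {fmt}")


def unique_labels(entities):
    """Return the sorted set of labels an engine produced."""
    return sorted({ent["label"] for ent in entities})


# Hypothetical sample data, not real model output.
findings = {
    "crf": [{"text": "Lahore", "label": "LOC", "confidence": 0.97}],
    "spacy": [{"text": "Lahore", "label": "GPE", "confidence": 0.88}],
}

print(export_findings(findings, "txt"))
print(unique_labels(findings["crf"] + findings["spacy"]))
```

Keeping both engines in one dictionary means a single export call covers the side-by-side comparison, whichever format is chosen.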
- Backend: Flask (Python 3.9+)
- NLP Core:
  - Statistical: Custom Conditional Random Fields (CRF) model trained on CoNLL-2003 (90.3% F1).
  - Neural: spaCy `en_core_web_sm` (efficient ~12 MB CNN architecture).
- Frontend: Modern Glassmorphism UI (Tailwind CSS, Alpine.js, AOS).
- Deployment: AWS EC2 with Gunicorn + Apache.
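To make the statistical engine more concrete: a CRF tagger typically consumes one feature dictionary per token. The sketch below shows a common hand-crafted feature set for CoNLL-style NER. The feature names and the helpers `word2features` and `sentence2features` are assumptions for illustration; the features Entify's custom CRF actually uses are not documented here.

```python
def word2features(sentence, i):
    """Build a feature dict for token i of a tokenized sentence."""
    word = sentence[i]
    features = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),   # capitalized words often start names
        "word.isupper": word.isupper(),   # acronyms such as organizations
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],
    }
    if i > 0:
        prev = sentence[i - 1]
        features["prev.lower"] = prev.lower()
        features["prev.istitle"] = prev.istitle()
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(sentence) - 1:
        nxt = sentence[i + 1]
        features["next.lower"] = nxt.lower()
        features["next.istitle"] = nxt.istitle()
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence2features(sentence):
    """Feature dicts for every token in the sentence."""
    return [word2features(sentence, i) for i in range(len(sentence))]


tokens = ["Karachi", "hosts", "PSL", "matches"]
features = sentence2features(tokens)
print(features[0])
```

Lists of feature dictionaries like these are the input format accepted by CRF libraries such as sklearn-crfsuite, whose `CRF.predict_marginals` method returns per-token label probabilities that could serve as confidence scores. Whether Entify relies on that library is an assumption.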
- Clone the repository:
  `git clone https://github.com/softdevhassan/entify.git`
- Install dependencies:
  `pip install -r requirements.txt`
- Configure environment:
  - Create a `.env` file based on project metadata.
- Run the lab locally:
  `python app.py`
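Before running the app, it can help to confirm that the setup steps above succeeded. The following is a small optional preflight sketch using only the standard library. The package names checked (`flask`, `spacy`) are inferred from the stack listed above rather than from `requirements.txt`, and `missing_packages` is a hypothetical helper.

```python
import importlib.util
import os


def missing_packages(names):
    """Return the top-level package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package names; adjust to match requirements.txt.
required = ["flask", "spacy"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")

# The setup steps call for a .env file in the project root.
if not os.path.exists(".env"):
    print("No .env file found in the current directory.")
```

Note that this only checks the spaCy package itself; the `en_core_web_sm` model is installed separately and would still need to load successfully at runtime.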
- Institution: ILM College Sargodha (UOS Affiliated)
- Course: Artificial Intelligence (BSCS Semester 3)
- Mentor: Sir Abdur-Rehman
- Lead Developer: Hassan Ali
- Core Team: Mudassir Ali (Inference Analysis), Saad Ilyas (Design Systems).
For more details, visit the official documentation.