Entify | Comparative NER Engine Lab

Entify is a research lab for benchmarking and comparing two approaches to Named Entity Recognition (NER) in Natural Language Processing: a statistical model (CRF) and a neural model (spaCy).

🚀 Key Features

  • Dual-Engine Comparison: Real-time side-by-side analysis of CRF and spaCy models.
  • Advanced Confidence Metrics: Confidence scores for both engines (marginal probabilities for the CRF, heuristic scores for the neural model).
  • Localized Presets: Specialized inference samples for News, Tech, Finance, and Business from a Pakistani perspective.
  • Context-Aware Analytics: Intelligent metrics board that toggles between "Quality Leader" and "Unique Labels" based on mode.
  • Export Capabilities: Seamlessly export findings in JSON or TXT formats.
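
As a rough illustration of the side-by-side comparison and JSON export described above, here is a minimal sketch. It is not taken from the Entify codebase: the function name `compare_engines`, the entity dict shape (`text`, `label`, `confidence`), and the rule that the "quality leader" is the engine with the higher mean confidence are all assumptions made for the example. Note that CoNLL2003 labels (PER, ORG, LOC, MISC) differ from spaCy's (PERSON, ORG, GPE, ...), so exact label agreement only makes sense after mapping both engines to a shared scheme.

```python
import json


def compare_engines(crf_entities, spacy_entities):
    """Compare two entity lists (hypothetical shape: text, label, confidence)."""
    crf_keys = {(e["text"], e["label"]) for e in crf_entities}
    spacy_keys = {(e["text"], e["label"]) for e in spacy_entities}

    def mean_conf(entities):
        # Average confidence; 0.0 when an engine found nothing.
        if not entities:
            return 0.0
        return sum(e["confidence"] for e in entities) / len(entities)

    crf_mean, spacy_mean = mean_conf(crf_entities), mean_conf(spacy_entities)
    if crf_mean == spacy_mean:
        leader = "tie"
    else:
        leader = "crf" if crf_mean > spacy_mean else "spacy"

    return {
        "agreed": sorted(crf_keys & spacy_keys),
        "crf_only": sorted(crf_keys - spacy_keys),
        "spacy_only": sorted(spacy_keys - crf_keys),
        "unique_labels": sorted({label for _, label in crf_keys | spacy_keys}),
        "quality_leader": leader,  # assumed rule: higher mean confidence
    }


# Made-up sample predictions, for illustration only.
crf_out = [
    {"text": "Karachi", "label": "LOC", "confidence": 0.9},
    {"text": "State Bank", "label": "ORG", "confidence": 0.8},
]
spacy_out = [
    {"text": "State Bank", "label": "ORG", "confidence": 0.7},
    {"text": "Karachi", "label": "GPE", "confidence": 0.6},
]

report = compare_engines(crf_out, spacy_out)
exported = json.dumps(report, indent=2)  # JSON export of the findings
print(exported)
```

The same `report` dict could be flattened into plain lines for a TXT export; how Entify actually formats its exports is not specified here.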

🛠️ Technical Stack

  • Backend: Flask (Python 3.9+)
  • NLP Core:
    • Statistical: Custom Conditional Random Fields (CRF) model trained on CoNLL2003 (90.32% F1).
    • Neural: spaCy en_core_web_sm (Efficient ~12MB CNN architecture).
  • Frontend: Modern Glassmorphism UI (Tailwind CSS, Alpine.js, AOS).
  • Deployment: AWS EC2 with Gunicorn + Apache.
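
To give a feel for what the statistical engine works with, the sketch below shows hand-crafted token features of the kind commonly fed to a CRF tagger. The features Entify's custom CRF actually uses are not documented here, so `token_features` and `sentence_features` are hypothetical names and the feature set is only an assumption. CRF toolkits such as sklearn-crfsuite accept one feature dict per token in this style.

```python
def token_features(tokens, i):
    """Build a feature dict for tokens[i] (illustrative feature set only)."""
    word = tokens[i]
    features = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "suffix3": word[-3:],         # crude morphology cue
        "is_upper": word.isupper(),   # acronyms such as "UOS"
        "is_title": word.istitle(),   # capitalized words often start names
        "is_digit": word.isdigit(),
    }
    if i > 0:
        prev = tokens[i - 1]
        features["prev.lower"] = prev.lower()
        features["prev.is_title"] = prev.istitle()
    else:
        features["BOS"] = True        # beginning of sentence
    if i < len(tokens) - 1:
        nxt = tokens[i + 1]
        features["next.lower"] = nxt.lower()
        features["next.is_title"] = nxt.istitle()
    else:
        features["EOS"] = True        # end of sentence
    return features


def sentence_features(tokens):
    """One feature dict per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


feats = sentence_features(["Hassan", "visited", "Lahore"])
print(feats[0])
```

The neural engine needs no such feature engineering, since it learns its own representations from the text. That difference is the contrast this lab sets out to explore.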

📖 Getting Started

  1. Clone the repository:
    git clone https://github.com/softdevhassan/entify.git
  2. Install dependencies:
    pip install -r requirements.txt
  3. Configure environment:
    • Create a .env file based on project metadata.
  4. Run the lab locally:
    python app.py

🎓 Project Credit

  • Institution: ILM College Sargodha (UOS Affiliated)
  • Course: Artificial Intelligence (BSCS Semester 3)
  • Mentor: Sir Abdur-Rehman
  • Lead Developer: Hassan Ali
  • Core Team: Mudassir Ali (Inference Analysis), Saad Ilyas (Design Systems).

For more details, visit the official documentation.

About

A comparative NER engine benchmarking a custom CRF model (90.32% F1, trained on CoNLL2003) against spaCy's en_core_web_sm — built as an AI semester project exploring the gap between statistical and neural NLP.
