A powerful web application for natural language text analysis and preprocessing.
The NLP Text Processor offers a comprehensive suite of tools for analyzing and transforming text data:
- Word Tokenization: Split text into individual words using NLTK, spaCy, TextBlob, or simple whitespace splitting.
- Sentence Tokenization: Break down text into sentences for granular analysis.
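Both tokenization modes map onto standard tokenizer calls. Below is a minimal, illustrative sketch using NLTK (the sample text and variable names are examples, not the exact helpers in app.py):

```python
from nltk.tokenize import sent_tokenize, word_tokenize  # requires NLTK's "punkt" data

text = "Don't worry. Tokenization is the first step!"

sentences = sent_tokenize(text)   # ["Don't worry.", "Tokenization is the first step!"]
words = word_tokenize(text)       # ['Do', "n't", 'worry', '.', 'Tokenization', ...]

# Simple whitespace splitting needs no model at all:
whitespace_tokens = text.split()  # ["Don't", 'worry.', 'Tokenization', 'is', ...]
```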
Clean and standardize your text with multiple operations (a combined sketch of these steps appears after the feature list):
- Lowercasing: Convert all text to lowercase.
- Contraction Correction: Expand contractions (e.g., "don't" -> "do not").
- Punctuation Removal: Strip all punctuation marks.
- Whitespace Cleanup: Remove multiple spaces and trim text.
- Spelling Correction: Automatically correct spelling errors.
- Emoji Conversion: Convert emojis to their text description or remove them.
- Stop Words Removal: Filter out common non-informative words.
- POS Tagging: Identify parts of speech (Nouns, Verbs, Adjectives, etc.).
- Stemming: Reduce words to their root form (e.g., "running" -> "run").
- Lemmatization: Convert words to their base dictionary form (e.g., "better" -> "good").
- Sentiment Analysis: Detect if text is Positive, Negative, or Neutral with polarity and subjectivity metrics.
- Word Cloud: Visualize the most frequent words in your text.
- Instant text metrics including sentence count, token count, and average tokens per sentence.
- Interactive progress tracking during processing.
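Taken together, these operations form a fairly standard preprocessing pipeline. The sketch below is illustrative only, built from the libraries named in this README (contractions, emoji, NLTK, TextBlob); the `normalize` helper is hypothetical and app.py may organize the steps differently:

```python
import string

import contractions                      # expands "don't" -> "do not"
import emoji                             # converts emojis to text descriptions
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import word_tokenize
from textblob import TextBlob

def normalize(text: str) -> list[str]:
    """Chain the cleanup steps listed above and return processed tokens."""
    text = contractions.fix(text)                    # contraction expansion
    text = emoji.demojize(text)                      # emojis -> ":name:"-style text
    text = text.lower()                              # lowercasing
    text = text.translate(str.maketrans("", "", string.punctuation))  # punctuation removal
    text = " ".join(text.split())                    # whitespace cleanup
    stop = set(stopwords.words("english"))
    tokens = [t for t in word_tokenize(text) if t not in stop]        # stop word removal
    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(t) for t in tokens]                  # lemmatization

print(normalize("I don't LOVE waiting...   it's sooo slow 😩"))
print(PorterStemmer().stem("running"))                           # stemming -> "run"
print(nltk.pos_tag(word_tokenize("The quick brown fox jumps")))  # POS tags, e.g. ('quick', 'JJ')

blob = TextBlob("The interface is clean and really easy to use.")
print(blob.sentiment.polarity, blob.sentiment.subjectivity)      # positive polarity for this sentence
```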
- Frontend: Streamlit
- NLP Libraries: NLTK, spaCy, TextBlob
- Utilities: emoji, contractions, demoji
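All of these are pip-installable; a requirements.txt along the lines below would cover the stack (version pins are omitted, and wordcloud is an assumption made here for the word-cloud feature rather than a dependency confirmed by this README):

```text
streamlit
nltk
spacy
textblob
emoji
contractions
demoji
wordcloud
```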
```text
nlp-text-processor/
├── app.py              # Core NLP logic and helper functions
├── gui.py              # Main Streamlit interface application
├── requirements.txt    # Project dependencies
└── README.md           # Project documentation
```
- Clone the repository

  ```bash
  git clone https://github.com/utachicodes/nlp_text_processor.git
  cd nlp-text-processor
  ```

- Create a virtual environment (Recommended)

  ```bash
  # Windows
  python -m venv venv
  venv\Scripts\activate

  # macOS/Linux
  python3 -m venv venv
  source venv/bin/activate
  ```
- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```
- Download required NLP data. The app will attempt to download NLTK data automatically, but you may need to install the spaCy model manually if it fails:

  ```bash
  python -m spacy download en_core_web_sm
  ```
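  If the automatic NLTK download fails as well, the data can be fetched manually. The exact corpora app.py needs are not listed in this README; a typical set for tokenization, stop words, POS tagging, and lemmatization would be:

  ```python
  import nltk

  # Likely corpora/models for the features above (the exact list is an assumption):
  for pkg in ["punkt", "stopwords", "wordnet", "averaged_perceptron_tagger"]:
      nltk.download(pkg)
  ```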
- Run the application

  ```bash
  streamlit run gui.py
  ```
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.