Computer Vision Practice

📖 Overview

This repository contains 11 foundational OpenCV tutorials and 13 applied computer vision projects, progressing from basic image processing to deep-learning and cloud-based vision systems. Together they cover image I/O, filtering, thresholding, classification, OCR, object tracking, and gesture control.

✨ Features
🏗️ Repository Structure
🎯 OpenCV Tutorials (1-11)
🚀 Computer Vision Applications (1-13)
🛠️ Technologies Stack
⚙️ Environment Setup
🚀 Getting Started
🔧 Customization
📊 Expected Outputs
🤝 Contributing
🙏 Acknowledgments
📄 License

✨ Features

🧠 Foundational Learning: 11 step-by-step OpenCV tutorials covering core concepts

🚀 Real-World Applications: 13 applied computer vision projects

🛠️ Multi-Platform: Local processing, cloud services (AWS), and edge computing

📊 Diverse Domains: Medical imaging, gesture control, OCR, object tracking, and more

⚡ Performance Optimized: Efficient implementations with various AI frameworks

🏗️ Repository Structure

📁 Computer-Vision-Practice/
│
├── # 🎬 OPENCV TUTORIALS (11 Foundation Files)
├── 1)_OpenCV_Image_Inp_Outp.py
├── 2)_OpenCV_Video_Inp_Outp.py
├── 3)_OpenCV_Webcam_Inp_Outp.py
├── 4)_OpenCV_Resize_and_Crop.py
├── 5)_OpenCV_ColorSpaces.py
├── 6)_OpenCV_Blurs.py
├── 7)_OpenCV_Global_Threshold.py
├── 8)_OpenCV_Adaptive_Threshold.py
├── 9)_OpenCV_Edge_Detection.py
├── 10)_OpenCV_Drawing.py
├── 11)_OpenCV_Contours.py
│
├── 📁 # 🚀 APPLICATIONAL PROJECTS (13 Real-World Applications)
│   ├── 1)_Color_Detection_of_Objects/
│   ├── 2)_Face_Anonymizer_Image_Video_Webcam/
│   ├── 3)_Text_Detection_OCR/
│   ├── 4)_Image_Classifier_Empty_or_Not_Parking_Lot/
│   ├── 5)_Feature_Extraction_with_Inference/
│   ├── 6)_Emotion_Recognition_with_Face_Mask/
│   ├── 7)_Sign_Language_Detection_for_N_Alphabets/
│   ├── 8)_Pneumonia_Classifier_XRayIMGs/
│   ├── 9)_YoloV11Nano_Object_Tracking/
│   ├── 10)_AWS_Rekognition_FullAccess_IAM/
│   ├── 11)_Parking_Spot_Counter/
│   ├── 12)_AWS_Lambda_and_API_Gateway/
│   └── 13)_Hand_Gesture_Volume_Control/
│
├── 📁 # 🖼️ DEMO IMAGES (Visual Results Gallery)
│   │
│   ├── # Representative Image (1 Image)
│   │   └── Logo.png   → Image for This Readme 
│   │
│   ├── # OpenCV Tutorial Demos (11 Images)
│   │   ├── OpenCV1.png    → Image Input/Output Demo
│   │   ├── OpenCV2.png    → Video Processing Demo
│   │   ├── OpenCV3.png    → Webcam Capture Demo
│   │   ├── OpenCV4.png    → Resize & Crop Demo
│   │   ├── OpenCV5.png    → Color Spaces Demo
│   │   ├── OpenCV6.png    → Blur Effects Demo
│   │   ├── OpenCV7.png    → Global Thresholding Demo
│   │   ├── OpenCV8.png    → Adaptive Thresholding Demo
│   │   ├── OpenCV9.png    → Edge Detection Demo
│   │   ├── OpenCV10.png   → Drawing Functions Demo
│   │   └── OpenCV11.png   → Contour Detection Demo
│   │
│   └── # Applicational Project Demos (20 Images)
│       ├── AppProj1.png      → Color Detection Results
│       ├── AppProj2.png      → Face Anonymizer Output
│       ├── AppProj3.png      → Text Detection & OCR
│       ├── AppProj4.png      → Parking Lot Classification
│       ├── AppProj5.png      → Feature Extraction
│       ├── AppProj6.1.png    → Emotion Recognition - Happy
│       ├── AppProj6.2.png    → Emotion Recognition - Sad
│       ├── AppProj6.3.png    → Emotion Recognition - Angry
│       ├── AppProj7.1.png    → Sign Language - Letter A
│       ├── AppProj7.2.png    → Sign Language - Letter B
│       ├── AppProj7.3.png    → Sign Language - Letter C
│       ├── AppProj8.1.png    → Pneumonia X-Ray - Normal
│       ├── AppProj8.2.png    → Pneumonia X-Ray - Positive
│       ├── AppProj8.3.png    → Pneumonia X-Ray - Heatmap
│       ├── AppProj9.png      → YOLO Object Tracking
│       ├── AppProj10.png     → AWS Rekognition Dashboard
│       ├── AppProj11.png     → Parking Spot Counter UI
│       ├── AppProj12.png     → AWS Lambda API Response
│       ├── AppProj13.1.png   → Hand Gesture - Volume Up
│       └── AppProj13.2.png   → Hand Gesture - Volume Down
│
├── 📁 # 📥 INPUTS (8 Sample Input Files)
│   ├── dragon.jpg              → Colorful dragon for tutorials
│   ├── sample_video.mp4        → Sample video for processing
│   ├── cow_salt_pepper.png     → Noisy image for denoising
│   ├── bear.jpg                → Image for segmentation
│   ├── handwritten_text.png    → Handwritten notes for OCR
│   ├── messi.jpg               → Portrait for edge detection
│   ├── whiteboard.png          → Whiteboard for drawing demo
│   └── birds.jpg              → Multiple objects for contours
│
├── 📁 # 📤 OUTPUTS (13 Generated Output Files)
│   ├── dragon_bgr.jpg                    → BGR color space
│   ├── dragon_rgb.jpg                    → RGB color space  
│   ├── dragon_gray.jpg                   → Grayscale conversion
│   ├── dragon_hsv.jpg                    → HSV color space
│   ├── cleaned_cow_salt_pepper.png       → Denoised image
│   ├── bear_segmented.jpg                → Segmented bear
│   ├── handwritten_text_extracted_global.png → Global threshold OCR
│   ├── handwritten_text_extracted_adaptive.png → Adaptive threshold OCR
│   ├── messi_edge.jpg                    → Canny edge detection
│   ├── messi_edge_dilated.jpg            → Dilated edges
│   ├── messi_edge_eroded.jpg             → Eroded edges
│   ├── drawing_on_whiteboard.png         → Annotations demo
│   └── contoured_birds.jpg              → Detected contours
│
├── .gitignore             → Git ignore configuration
├── environment.yml        → Conda environment specification
├── requirements.txt       → Python package dependencies
└── README.md             → This documentation file

🎯 OpenCV Tutorials (1-11)

1. Image Input/Output

Learn to read, display, and save images in various formats using OpenCV's core functions.

2. Video Input/Output

Process video files frame-by-frame with efficient streaming techniques.

3. Webcam Processing

Real-time webcam capture and processing for interactive applications.

4. Image Manipulation

Resizing, cropping, and geometric transformations for image preprocessing.

5. Color Space Conversions

BGR, RGB, HSV, Grayscale conversions and their practical applications.

6. Image Blurring Techniques

Gaussian, Median, and Bilateral filtering for noise reduction and smoothing.

7. Global Thresholding

Binary and Otsu's thresholding methods for image segmentation.

8. Adaptive Thresholding

Local thresholding techniques for uneven lighting conditions.

9. Edge Detection

Canny, Sobel, and Laplacian edge detection algorithms.

10. Drawing Functions

Annotations, shapes, and text overlays on images and videos.

11. Contour Detection

Finding and analyzing object boundaries in images.

🚀 Computer Vision Applications (1-13)

1. Color Detection 🎨

Real-time object detection based on color ranges with adjustable HSV sliders.

2. Face Anonymizer 🎭

Privacy-preserving face blurring/masking for images, videos, and live streams.

3. Text Detection & OCR 📝

Multi-engine OCR (Tesseract, EasyOCR) with text localization and extraction.

4. Parking Lot Classifier 🅿️

Binary classification for parking space occupancy using custom CNN.

5. Feature Extraction 🔍

Deep feature extraction with pre-trained models for image retrieval.

6. Emotion Recognition 😷

Facial emotion classification with mask detection using MediaPipe.

Demo images: Happy Emotion, Sad Emotion, Surprised Emotion

7. Sign Language Detection 🤟

Real-time ASL alphabet recognition with custom dataset and CNN.

Demo images: Letter K, Letter R, Letter A

8. Pneumonia Classifier 🏥

Medical image analysis for pneumonia detection from chest X-rays.

Demo images: Site Header, Normal X-Ray, Positive Case

9. YOLOv11 Object Tracking 🎯

Real-time object detection and tracking with Ultralytics YOLO.

10. AWS Rekognition Integration ☁️

Cloud-based face analysis and comparison using AWS services.

11. Parking Spot Counter 🚗

Automated counting of available parking spaces with perspective correction.

12. Serverless CV API ⚡

AWS Lambda + API Gateway deployment for scalable computer vision.

13. Gesture Volume Control 🔊

Hand gesture recognition for system volume control using MediaPipe.
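One common way to turn a hand pose into a volume level is to map the thumb-to-index distance onto a 0 to 100 scale. The sketch below assumes normalized landmark coordinates such as those MediaPipe Hands reports (thumb tip is landmark 4, index fingertip is landmark 8); the function name and distance limits are hypothetical, and the platform-specific call that actually sets the system volume is omitted.

```python
import math

import numpy as np


def pinch_to_volume(thumb_tip, index_tip, min_dist=0.02, max_dist=0.25):
    """Map thumb-index distance (normalized coordinates) to a 0-100 volume."""
    distance = math.hypot(index_tip[0] - thumb_tip[0], index_tip[1] - thumb_tip[1])
    # np.interp clamps inputs outside [min_dist, max_dist] to the end values.
    return float(np.interp(distance, [min_dist, max_dist], [0.0, 100.0]))


closed = pinch_to_volume((0.50, 0.50), (0.50, 0.50))  # fingers touching
half = pinch_to_volume((0.00, 0.00), (0.135, 0.00))   # midway between the limits
spread = pinch_to_volume((0.00, 0.00), (1.00, 1.00))  # fingers far apart

print(closed, half, spread)
```

Clamping at both ends keeps the volume stable when the fingers are fully pinched or fully spread, and the limits would normally be tuned to the camera distance.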

Volume Up Gesture	Volume Down Gesture

🛠️ Technologies Stack

Complete Stack: OpenCV, TensorFlow, PyTorch, AWS (Lambda, S3, Rekognition, IAM, API Gateway), Tesseract OCR, EasyOCR, scikit-image, MediaPipe, Streamlit, Detectron2, Ultralytics YOLOv11, Pillow, NumPy, Pandas, Matplotlib, Seaborn, scikit-learn

⚙️ Environment Setup

Option 1: Using Conda (Recommended)

# Create environment from YAML file
conda env create -f environment.yml

# Activate the environment
conda activate opencvenv

# Verify installation
python -c "import cv2; print(f'OpenCV Version: {cv2.__version__}')"

Option 2: Using Pip/Virtualenv

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Linux/Mac)
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Option 3: Docker (Advanced)

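FROM python:3.9-slim
# Set the working directory first so later paths are relative to /app
WORKDIR /app
# Install dependencies before copying the code to reuse the layer cache
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "1)_OpenCV_Image_Inp_Outp.py"]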

🚀 Getting Started

Running Tutorials

# Run any tutorial
python "1)_OpenCV_Image_Inp_Outp.py"
python "9)_OpenCV_Edge_Detection.py"

Running Applications

# Navigate to application directory
cd "Applicational_Projects/1)_Color_Detection_of_Objects"

# Run the application
python main.py  # or specific script name

🔧 Customization

Each application is modular and can be easily customized:

# Example: Modify color detection ranges
LOWER_HSV = [20, 50, 50]  # Adjust for different colors
UPPER_HSV = [40, 255, 255]

# Example: Change model paths
MODEL_PATH = "custom_model.pth"
CONFIG_PATH = "custom_config.yaml"

📊 Expected Outputs

Each tutorial writes its output files to the Outputs/ directory. The "Demo Images/" folder contains screenshots of the expected results for all tutorials and applications.

🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (git checkout -b feature/AmazingFeature)
3. Commit changes (git commit -m 'Add AmazingFeature')
4. Push to the branch (git push origin feature/AmazingFeature)
5. Open a Pull Request

🙏 Acknowledgments

OpenCV Community for the incredible computer vision library
AWS for cloud infrastructure and AI services
Ultralytics for YOLO implementations
Google Research for MediaPipe and TensorFlow
All Contributors who have helped improve this repository

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

⭐ Star this repo if you find it helpful!

"The eye sees only what the mind is prepared to comprehend." - Henri Bergson


Kratugautam99/Computer-Vision-Practice
