To use your images instead of the provided ones:
- Fork and clone this repository to your local machine: `git clone git@github.com:hrushikesh009/TensorFlow-OCR-Invoice-Extractor.git`
- Install the required libraries: `pip3 install -r requirements.txt`
Step 1: Annotate Images
- Save photos with your objects to `./data/raw`.
- Resize the images to `(800, 600)` with: `python resize_images.py --raw-dir ./data/raw --save-dir ./data/images --ext jpg --target-size "(800, 600)"`
- Split the resized images into train and test sets under `./data/images/train` and `./data/images/test`.
- Annotate the resized images with labelImg and generate `xml` files.
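
The train/test split is not scripted in the steps above, so here is a minimal sketch of one way to do it with the Python standard library. The function name `split_images`, the 20% test ratio, and the fixed seed are assumptions for illustration, not part of this repository.

```python
import random
import shutil
from pathlib import Path


def split_images(images_dir, test_ratio=0.2, seed=42, ext="jpg"):
    """Move images from images_dir into train/ and test/ subfolders.

    test_ratio and seed are illustrative assumptions, not values from this repo.
    """
    images_dir = Path(images_dir)
    # Only files directly inside images_dir; sorted so the shuffle is reproducible.
    images = sorted(p for p in images_dir.glob(f"*.{ext}") if p.is_file())
    random.Random(seed).shuffle(images)

    n_test = int(len(images) * test_ratio)
    splits = {"test": images[:n_test], "train": images[n_test:]}

    for name, files in splits.items():
        target = images_dir / name
        target.mkdir(exist_ok=True)
        for path in files:
            shutil.move(str(path), str(target / path.name))

    # Report how many images ended up in each split.
    return {name: len(files) for name, files in splits.items()}
```

Calling `split_images("./data/images")` after the resize step would move the resized files into `./data/images/train` and `./data/images/test`. Run it before annotating so that the `xml` files labelImg writes sit next to the images they describe.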
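
Before moving on, it can help to sanity-check the annotations. The sketch below assumes labelImg was left on its default Pascal VOC output format; the function name `read_voc_boxes` is hypothetical and not part of this repository.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_source):
    """Return (image_size, boxes) from a Pascal VOC annotation.

    xml_source may be a file path or an open file object.
    """
    root = ET.parse(xml_source).getroot()

    # <size> holds the dimensions of the annotated image.
    size = root.find("size")
    image_size = (int(size.findtext("width")), int(size.findtext("height")))

    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            "xmin": int(bndbox.findtext("xmin")),
            "ymin": int(bndbox.findtext("ymin")),
            "xmax": int(bndbox.findtext("xmax")),
            "ymax": int(bndbox.findtext("ymax")),
        })
    return image_size, boxes
```

Comparing the returned `image_size` with the target size used in the resize step is a quick way to catch images that were annotated before resizing.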
Step 2: Open Colab Notebook
- Open the Colab notebook (Tensorflow-object-detection-training). It walks through each stage of building a TensorFlow OCR model, with a markdown explanation for every step.
Enjoy the journey!