Making a Custom Object Detector using Pre-trained Model in Tensorflow

Recent developments in computer vision have enabled exciting new technologies like self-driving cars, gesture recognition, and machine vision. The processing power required to build computer vision models used to be a barrier to entry for anyone interested in exploring the technology, but with today's pre-trained models, this is no longer the case.

Instead of training your own model from scratch, you can build on existing models and fine-tune them for your own purpose without requiring as much computing power.

In this tutorial, we’re going to get our hands dirty and train our own corgi detector using a pre-trained SSD MobileNet V2 model.

1. Installation

This tutorial is based on an Anaconda virtual environment with Python 3.6.

1.1 Tensorflow

Install Tensorflow using the following command:

$ pip install tensorflow

If you have a GPU that you can use with Tensorflow, install the GPU version instead:

$ pip install tensorflow-gpu

1.2 Other dependencies

$ pip install pillow Cython lxml jupyter matplotlib

# Install protobuf using Homebrew (macOS)
$ brew install protobuf

For protobuf installation on other operating systems, follow the instructions here.

1.3 Clone the Tensorflow models directory

In this tutorial, we're going to use resources in the Tensorflow models repository. Since it does not come with the Tensorflow installation, we need to clone it from its GitHub repo.

First, change into the Tensorflow directory:

# For example: ~/anaconda/envs/<your_env_name>/lib/python3.6/site-packages/tensorflow

$ cd <path_to_your_tensorflow_installation>

Then clone the Tensorflow models repository:

$ git clone https://github.com/tensorflow/models.git

From this point on, this cloned directory will be referred to as the models directory.

1.4 Setting up the environment

Every time you start a new terminal window to work with the pre-trained models, it is important to compile the Protobuf libraries and update your PYTHONPATH:

$ cd <path_to_your_tensorflow_installation>/models/research/

$ protoc object_detection/protos/*.proto --python_out=.

$ export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Run a quick test to confirm that the Object Detection API is working properly:

$ python object_detection/builders/model_builder_test.py

If the result looks like the following, you're ready to proceed to the next steps!

...............
----------------------------------------------------------------------
Ran 15 tests in 0.123s

OK

To make this tutorial easier to follow, create the following folder structure within the models/ directory you just cloned (research already exists; the other folders are new and empty):

models 
    ├── annotations
    |   └── xmls    
    ├── images
    ├── checkpoints
    ├── tf_record
    ├── research
    ...

These folders will be used to store required components for our model as we proceed.

2. Collect images

Data preparation is the most important part of training your own model. Since we're going to train a corgi detector, the first step is gathering pictures of corgis! About 200 of them would be sufficient.

I recommend using google-images-download to download images. It searches Google Images and downloads images based on the inputs you provide. In the inputs, you can specify search parameters such as keywords, number of images, image format, image size, and usage rights.

Since we're downloading more than 100 images at a time, we need chromedriver in the models directory (download here). You can use this sample command to download images; make sure all your images are in jpg format:

# From tensorflow/models

$ googleimagesdownload --keywords 'welsh corgi dog' \
    --limit 200 \
    --size medium \
    --chromedriver chromedriver \
    --format jpg

After downloading, save all images to models/images/. To make subsequent processing easier, let's rename the images with sequential numbers (e.g. 1.jpg, 2.jpg) by running the following script:
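
Here is a minimal sketch of such a script (run it from the models directory; it assumes the downloaded files still have their long descriptive names, so the new sequential names won't collide with existing files):

import os

image_dir = 'images'

# Rename every .jpg in models/images/ to 1.jpg, 2.jpg, ...
jpgs = sorted(f for f in os.listdir(image_dir) if f.endswith('.jpg'))
for i, filename in enumerate(jpgs, start=1):
    os.rename(os.path.join(image_dir, filename),
              os.path.join(image_dir, '{}.jpg'.format(i)))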

3. Label your data set

Once you've collected all the images you need, the next step is to label them manually. There are many packages that serve this purpose; labelImg is a popular choice.
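
One way to install labelImg is from PyPI and launch it from the command line (the labelImg repository documents platform-specific alternatives):

$ pip install labelImg
$ labelImg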

labelImg provides a user-friendly GUI. Plus, it saves label files (.xml) in the Pascal VOC format, which makes subsequent data conversion easier. Here's what a labelled image looks like in labelImg:

[Screenshot: a labelled corgi image in labelImg]

Double check that every image has a corresponding .xml file and save them in models/annotations/xmls/.

4. Create Label Map (.pbtxt)

Classes need to be listed in the label map. Since we're only detecting corgis, the label map should contain only one item like the following:
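
item {
  id: 1
  name: 'corgi'
}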

Note that id should start from 1, because 0 is a reserved id, and that name must match the label you used when annotating. Save this file as label_map.pbtxt in models/annotations/.

5. Create trainval.txt

trainval.txt is a list of image names without file extensions. Since we have sequential numbers for image names, the list should look like this:
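
1
2
3
...
200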

Save this file as trainval.txt in models/annotations/.
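
Rather than typing the list out by hand, you can generate it with a short Python snippet (a minimal sketch, assuming the images were renamed 1.jpg through 200.jpg earlier and that you run it from the models directory):

import os

# Collect image names without extensions and sort them numerically
names = sorted((os.path.splitext(f)[0] for f in os.listdir('images')
                if f.endswith('.jpg')), key=int)

with open('annotations/trainval.txt', 'w') as f:
    f.write('\n'.join(names))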

6. Create TFRecord (.record)

TFRecord is a binary data format designed for Tensorflow. (Read more about it here.) Before you can train your custom object detector, you must convert your data into the TFRecord format.

Since we need to both train and validate our model, the data set will be split into a training set (train.record) and a validation set (val.record). The purpose of the training set is straightforward: it is the set of examples the model learns from. The validation set is a set of examples used during training to assess model accuracy.

We're going to use create_tf_record.py to convert our data set into train.record and val.record. Download it here and save it to models/research/object_detection/dataset_tools/.

This script is preconfigured to do a 70-30 train-val split. Execute it by running:

# From tensorflow/models

$ python research/object_detection/dataset_tools/create_tf_record.py

If the script was successful, train.record and val.record should appear in your models/research/ directory. Move them into the models/tf_record/ directory.
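
To sanity-check the result, you can count the examples in each file with the TF 1.x record iterator (a quick sketch; with about 200 images and a 70-30 split, expect roughly 140 and 60 examples):

import tensorflow as tf

# Print the number of examples in each TFRecord file
for path in ['tf_record/train.record', 'tf_record/val.record']:
    count = sum(1 for _ in tf.python_io.tf_record_iterator(path))
    print(path, count)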

7. Download pre-trained model

There are many pre-trained object detection models available in the model zoo. In order to train them using our custom data set, we need to restore them in Tensorflow using their checkpoints (.ckpt files), which are records of a previous model state.

For this tutorial, we're going to download ssd_mobilenet_v2_coco here and save its model checkpoint files (model.ckpt.meta, model.ckpt.index, model.ckpt.data-00000-of-00001) to our models/checkpoints/ directory.
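
If you prefer the command line, you can fetch and unpack the checkpoint like this (the URL follows the model zoo's naming scheme for the March 2018 release of this model; verify it against the zoo page):

# From tensorflow/models
$ curl -O http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz
$ tar -xzf ssd_mobilenet_v2_coco_2018_03_29.tar.gz
$ cp ssd_mobilenet_v2_coco_2018_03_29/model.ckpt.* checkpoints/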

8. Modify Config (.config) File

Each of the pre-trained models comes with a config file that contains details about the model. To detect our custom class, the config file needs to be modified accordingly.

The config files are included in the models directory you cloned in the very beginning. You can find them in:

models/research/object_detection/samples/configs

In our case, we'll modify the config file for ssd_mobilenet_v2_coco. Make a copy of it first and save it in the models/ directory.

Here are the items we need to change:

  1. Since we're only trying to detect corgis, change num_classes to 1 (see the excerpt after this list)
  2. fine_tune_checkpoint tells the model which checkpoint file to use. Set this to checkpoints/model.ckpt
  3. The model also needs to know where the TFRecord files and label maps are for both the training and validation sets. Since our train.record and val.record are saved in the tf_record folder, our config should reflect that:
train_input_reader: {
  tf_record_input_reader {
    input_path: "tf_record/train.record"
  }
  label_map_path: "annotations/label_map.pbtxt"
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "tf_record/val.record"
  }
  label_map_path: "annotations/label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
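
For items 1 and 2, the corresponding fields live in the model and train_config sections of the same file (excerpted here; the ellipses stand for settings you leave unchanged):

model {
  ssd {
    num_classes: 1
    ...
  }
}

train_config: {
  fine_tune_checkpoint: "checkpoints/model.ckpt"
  ...
}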

9. Train

At this point, your models directory should look like this:

models 
    ├── annotations
    |   ├── label_map.pbtxt
    |   ├── trainval.txt
    |   └── xmls
    |       ├── 1.xml
    |       ├── 2.xml
    |       ├── ...
    |
    ├── images
    |   ├── 1.jpg
    |   ├── 2.jpg
    |   ├── ...    
    |
    ├── checkpoints
    |   ├── model.ckpt.data-00000-of-00001
    |   ├── model.ckpt.index
    |   └── model.ckpt.meta
    |
    ├── tf_record
    |   ├── train.record
    |   └── val.record
    |
    ├── research
    |   ├── ...
    ...

If you have successfully completed previous steps, you're ready to start training!

Follow the steps below:

# Change into models directory
$ cd tensorflow/models

# Make directory to store training progress
$ mkdir train

# Make directory to store validation results
$ mkdir eval

# Begin training
$ python research/object_detection/train.py \
    --logtostderr \
    --train_dir=train \
    --pipeline_config_path=ssd_mobilenet_v2_coco.config

Training time varies depending on the computing power of your machine.

10. Evaluation

Evaluation can be run in parallel with training. The eval.py script checks the train directory for progress and evaluates the model based on the most recent checkpoint.

# From the tensorflow/models/ directory
$ python research/object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=ssd_mobilenet_v2_coco.config \
    --checkpoint_dir=train \
    --eval_dir=eval

You can visualize the model training progress using Tensorboard:

# From .../tensorflow/models directory 

$ tensorboard --logdir=./

11. Model export

Once you finish training your model, you can export your model to be used for inference. If you've been following the folder structure, use the following command:

# From .../tensorflow/models

$ mkdir fine_tuned_model

$ python research/object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path ssd_mobilenet_v2_coco.config \
    --trained_checkpoint_prefix  train/model.ckpt-<the_highest_checkpoint_number> \
    --output_directory fine_tuned_model

12. Classify images

Now that you have a model, you can use it to detect corgis in pictures and videos! For the purpose of demonstration, we're going to detect corgis in an image. Before you proceed, pick an image you want to test the model with.

The models directory comes with a notebook file (.ipynb) that we can use to run inference with a few tweaks. It is located at models/research/object_detection/object_detection_tutorial.ipynb. Follow the steps below to tweak the notebook:

  1. MODEL_NAME = 'ssd_mobilenet_v2_coco_2018_03_29'
  2. PATH_TO_CKPT = 'path/to/your/frozen_inference_graph.pb'
  3. PATH_TO_LABELS = 'models/annotations/label_map.pbtxt'
  4. NUM_CLASSES = 1
  5. Comment out cell #5 completely (just below Download Model)
  6. Since we're only testing on one image, comment out PATH_TO_TEST_IMAGES_DIR and TEST_IMAGE_PATHS in cell #9 (just below Detection)
  7. In cell #11 (the last cell), remove the for-loop, unindent its content, and add the path to your test image:
image_path = 'path/to/image_you_want_to_test.jpg'
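
After the tweak, the final cell should look roughly like the sketch below. This is based on the helper functions the notebook itself defines (load_image_into_numpy_array, run_inference_for_single_image); exact contents vary between versions of the repo:

image_path = 'path/to/image_you_want_to_test.jpg'
image = Image.open(image_path)
# Convert the image into a numpy array the model can consume
image_np = load_image_into_numpy_array(image)
# Run detection, then draw the resulting boxes and scores on the image
output_dict = run_inference_for_single_image(image_np, detection_graph)
vis_util.visualize_boxes_and_labels_on_image_array(
    image_np,
    output_dict['detection_boxes'],
    output_dict['detection_classes'],
    output_dict['detection_scores'],
    category_index,
    instance_masks=output_dict.get('detection_masks'),
    use_normalized_coordinates=True,
    line_thickness=8)
plt.figure(figsize=IMAGE_SIZE)
plt.imshow(image_np)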

After making these changes, run the notebook and you should see the corgi in your test image highlighted by a bounding box!

[Image: the test image with the detected corgi outlined by a bounding box]

There you have your custom corgi detector! In the next tutorial, I'll be walking you through the set up of real-time object detection on your webcam. Stay tuned!

More details

Tensorflow Object Detection Model Documentation