Tutorial Use Case: COCO Detection and Segmentation
===================================================

Dataset
-------

The COCO dataset is a pioneering benchmark dataset in computer vision for object detection and segmentation. The input is an image sourced from the public Web. It has a variety of targets, but for this tutorial's use case the target is a complex structure comprising each object instance's category label (out of 80 categories), bounding box, and segmentation mask (a set of pixels).

The raw dataset files are sourced from `the COCO dataset website `_. There are 2 top-level directories:

* data: This has all the raw image files stored under the "train2017" and "val2017" subdirectories, e.g., "000000000009.jpg".

* metadata: This has the Example Structure Files (ESFs) for both the train and validation partitions, as well as the original COCO annotations file for the latter alone. Please read :doc:`API: Data Ingestion and Locators ` for more ESF-related details.

The specific metadata files are as follows:

* coco-train.csv: A labeled set with the following column names: (id, height, width, file_name, annotations).

* coco-val.csv: Also a labeled set with the same column names: (id, height, width, file_name, annotations).

* instances_val2017.json: The raw annotations file in its original format. It has 3 tables stored as nested dictionaries: Annotations, Images, and Categories. This file is used as is so that the :code:`pycocotools` library can compute evaluation metrics.

The following image depicts the schemas of all the tables involved. The ESF is derived by a simple join-project query over the Images and Annotations tables. This use case's training process requires all annotations of a given image to be used together to calculate the loss. Thus, they are stored as a list of dictionaries within the :code:`annotations` column (a standalone parsing sketch appears at the end of this page).

.. image:: /images/coco-datasets-schema.png

Model
-----

This tutorial notebook illustrates hyperparameter tuning in a single :func:`run_fit()` with a popular model architecture for detection+segmentation tasks: Mask R-CNN with a ResNet-50 backbone from the :code:`torchvision` library. This model was already trained on this dataset, but since we continue training on a smaller sample with potentially different hyperparameters, you are likely to see diverse learning behaviors. We also expose a user knob in the :func:`create_model()` function to optionally reinitialize its weights and show from-scratch training behavior (a minimal torchvision sketch appears at the end of this page). Such diversity is useful to help understand the utility of our :doc:`Interactive Control operations `.

Config Knobs
------------

We perform a simple hyperparameter tuning with :func:`GridSearch()`. We compare 2 different values each for :code:`batch_size` and :code:`lr` (learning rate). Feel free to modify the value of the "pretrained" user knob to compare more runs.

Step by Step Code in our API
----------------------------

Please check out the notebook named :code:`rf-tutorial-coco-detseg.ipynb` in the Jupyter home directory on your RapidFire AI cluster.
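
As a quick, standalone illustration of the :code:`annotations` column described in the Dataset section above, the sketch below loads the validation ESF with pandas and parses one row's list of annotation dictionaries. The file path and the assumption that the column is stored as a JSON or Python-literal string are illustrative; the actual serialization on your cluster may differ, and the notebook shows the canonical way to ingest this data.

.. code-block:: python

    import ast
    import json

    import pandas as pd

    # Load the validation ESF (path is illustrative; point it at the
    # metadata directory on your cluster).
    esf = pd.read_csv("metadata/coco-val.csv")


    def parse_annotations(cell):
        """Parse one row's annotations cell into a list of dictionaries.

        Assumption: the cell is serialized as a JSON or Python-literal
        string; the exact encoding in the ESF may differ.
        """
        try:
            return json.loads(cell)
        except (json.JSONDecodeError, TypeError):
            return ast.literal_eval(cell)


    row = esf.iloc[0]
    anns = parse_annotations(row["annotations"])
    print(row["file_name"], row["height"], row["width"])
    print(f"{len(anns)} object instances in this image")
    # Keys follow the COCO annotation format, e.g., category_id, bbox,
    # segmentation (exact keys depend on how the ESF was generated).
    print(anns[0].keys())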
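
Similarly, the torchvision portion of the model setup described in the Model section can be sketched as follows. This is not the tutorial's actual :func:`create_model()` implementation, only a minimal illustration of how a boolean "pretrained" knob could toggle between COCO-pretrained weights and random initialization; it assumes torchvision 0.13+ with the :code:`weights` argument, and the function name is hypothetical.

.. code-block:: python

    from torchvision.models.detection import (
        MaskRCNN_ResNet50_FPN_Weights,
        maskrcnn_resnet50_fpn,
    )


    def build_maskrcnn(pretrained: bool = True):
        """Illustrative Mask R-CNN builder driven by a ``pretrained`` knob.

        Hypothetical helper; see the tutorial notebook for the real
        :func:`create_model`.
        """
        if pretrained:
            # Start from weights already trained on COCO.
            weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
        else:
            # Random initialization to show from-scratch training behavior.
            weights = None
        return maskrcnn_resnet50_fpn(weights=weights)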