Tutorial Use Case: COCO Detection and Segmentation
===================================================

Dataset
-------

The COCO dataset is a pioneering benchmark dataset in computer vision for object detection and segmentation. The input is an image sourced from the public Web. It has a variety of targets, but for this tutorial's use case the target is a complex structure comprising each object instance's category label (out of 80 categories), bounding box, and segmentation mask (a set of pixels).

The raw dataset files are sourced from `the COCO dataset website `_. There are 2 top-level directories:

* data: This has all the raw image files stored under the "train2017" and "val2017" subdirectories, e.g., "000000000009.jpg".

* metadata: This has the Example Structure Files (ESFs) for both the train and validation partitions, as well as the original COCO annotations file for the latter alone. Please read :doc:`API: Data Ingestion and Locators ` for more ESF-related details.

The specific metadata files are as follows:

* coco-train.csv: A labeled set with the following column names: (id, height, width, file_name, annotations).

* coco-val.csv: Also a labeled set with the same column names: (id, height, width, file_name, annotations).

* instances_val2017.json: The raw annotations file in its original format. It has 3 tables stored as nested dictionaries: Annotations, Images, and Categories. This file is used as is so that the :code:`pycocotools` library can compute evaluation metrics.

The following image depicts the schemas of all the tables involved. The ESF is derived by a simple join-project query over the Images and Annotations tables. This use case's training process requires all annotations of a given image to be used together to calculate the loss. Thus, they are stored as a list of dictionaries within the :code:`annotations` column (a standalone parsing sketch appears at the end of this page).

.. image:: /images/coco-datasets-schema.png

Model
-----

This tutorial notebook illustrates hyperparameter tuning in a single :func:`run_fit()` with a popular model architecture for detection+segmentation tasks: Mask R-CNN with a ResNet-50 backbone from the :code:`torchvision` library. This model was already trained on this dataset, but since we continue training on a smaller sample with potentially different hyperparameters, you are likely to see diverse learning behaviors. We also expose a user knob in the :func:`create_model()` function to optionally reinitialize its weights and show from-scratch training behavior (a minimal torchvision sketch appears at the end of this page). Such diversity is useful to help understand the utility of our :doc:`Interactive Control operations `.

Config Knobs
------------

We perform a simple hyperparameter tuning with :func:`GridSearch()`. We compare 2 different values each for :code:`batch_size` and :code:`lr` (learning rate). Feel free to modify the value of the "pretrained" user knob to compare more runs.

Step by Step Code in our API
----------------------------

Please check out the notebook named :code:`rf-tutorial-coco-detseg.ipynb` in the Jupyter home directory on your RapidFire AI cluster.
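
As a quick, standalone illustration of the :code:`annotations` column described in the Dataset section above, the sketch below loads the validation ESF with pandas and parses one row's list of annotation dictionaries. The file path and the assumption that the column is stored as a JSON or Python-literal string are illustrative; the actual serialization on your cluster may differ, and the notebook shows the canonical way to ingest this data.

.. code-block:: python

    import ast
    import json

    import pandas as pd

    # Load the validation ESF (path is illustrative; point it at the
    # metadata directory on your cluster).
    esf = pd.read_csv("metadata/coco-val.csv")


    def parse_annotations(cell):
        """Parse one row's annotations cell into a list of dictionaries.

        Assumption: the cell is serialized as a JSON or Python-literal
        string; the exact encoding in the ESF may differ.
        """
        try:
            return json.loads(cell)
        except (json.JSONDecodeError, TypeError):
            return ast.literal_eval(cell)


    row = esf.iloc[0]
    anns = parse_annotations(row["annotations"])
    print(row["file_name"], row["height"], row["width"])
    print(f"{len(anns)} object instances in this image")
    # Keys follow the COCO annotation format, e.g., category_id, bbox,
    # segmentation (exact keys depend on how the ESF was generated).
    print(anns[0].keys())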
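
Similarly, the torchvision portion of the model setup described in the Model section can be sketched as follows. This is not the tutorial's actual :func:`create_model()` implementation, only a minimal illustration of how a boolean "pretrained" knob could toggle between COCO-pretrained weights and random initialization; it assumes torchvision 0.13+ with the :code:`weights` argument, and the function name is hypothetical.

.. code-block:: python

    from torchvision.models.detection import (
        MaskRCNN_ResNet50_FPN_Weights,
        maskrcnn_resnet50_fpn,
    )


    def build_maskrcnn(pretrained: bool = True):
        """Illustrative Mask R-CNN builder driven by a ``pretrained`` knob.

        Hypothetical helper; see the tutorial notebook for the real
        :func:`create_model`.
        """
        if pretrained:
            # Start from weights already trained on COCO.
            weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
        else:
            # Random initialization to show from-scratch training behavior.
            weights = None
        return maskrcnn_resnet50_fpn(weights=weights)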