Dense classification requires a dataset of images in which every pixel is labeled. Building one may sound tedious, but there is a trick to label pixels quickly: on a copy of each image, paint every kind of object you want to segment with a solid color, then convert the color code into labels programmatically.
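As a minimal sketch of that conversion, assuming the painted copy is already loaded as an RGB NumPy array: the colors, class ids, and the `colors_to_labels` helper below are hypothetical choices for illustration, not part of any particular tool.

```python
import numpy as np

# Hypothetical color code: these RGB values and class ids are assumptions.
COLOR_TO_LABEL = {
    (255, 0, 0): 0,  # hair painted red
    (0, 255, 0): 1,  # skin painted green
    (0, 0, 255): 2,  # background painted blue
}

def colors_to_labels(painted, color_to_label, unknown=255):
    """Convert an (H, W, 3) painted image into an (H, W) array of class ids."""
    # Pixels whose color is not in the mapping keep the `unknown` id.
    labels = np.full(painted.shape[:2], unknown, dtype=np.uint8)
    for color, label in color_to_label.items():
        # A pixel matches when all three channels equal the painted color.
        mask = np.all(painted == np.array(color, dtype=painted.dtype), axis=-1)
        labels[mask] = label
    return labels

# Tiny 2x2 array standing in for a real painted copy.
painted = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [10, 10, 10]]],
    dtype=np.uint8,
)
labels = colors_to_labels(painted, COLOR_TO_LABEL)
```

Exact color matching only works if the painted copy is saved in a lossless format such as PNG; JPEG compression shifts the colors near edges, and those pixels would end up with the `unknown` id.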

Although there is software for creating this kind of dataset, I preferred to search online for an existing one and found the LFW part labels dataset, which labels 2927 face images into hair, skin, and background.

Understanding how to use it was very difficult because of the lack of instructions. The page only gives you links to download the images and the label documents, so you need to process that data yourself to create a dataset that can be used to train models. Finding out how was the hard part: I searched a lot until I found a repository that used the dataset and, exploring its questions section, found a link to a processed dataset and the code used with it (although not the code that creates the labels).

The label documents were created using superpixels, which means the labels were generated automatically, so they are not precise, as you can see in the following image.

LFW sample

The problem with this is how to interpret the evaluation scores of the model. For example, imagine a cookie shaped like a perfect circle whose automatic label has a deformed shape, as if someone had bitten it. Suppose we also use Intersection over Union (IoU) as the metric to evaluate the model. If the model gets an almost perfect IoU score, its predictions are very similar to the ground-truth labels, which are imprecise (the deformed cookie), so the model is reproducing low-quality labels. On the other hand, if the model gets a good but not almost perfect IoU, it may be generating better labels than the ground truth (the perfectly circular cookie). So the question is: how do we choose an IoU threshold when there is no way to evaluate it automatically?
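The cookie example can be reproduced with synthetic masks. This is only an illustrative sketch: the grid size, the radii, the position of the bite, and the helper names `binary_iou` and `disk` are all assumptions made up for the demonstration.

```python
import numpy as np

def binary_iou(a, b):
    """Intersection over Union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def disk(size, center, radius):
    """Boolean mask of a filled circle on a size x size grid."""
    yy, xx = np.mgrid[:size, :size]
    return (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2

# The real cookie is a perfect circle; the automatic label has a bite missing.
true_cookie = disk(100, (50, 50), 30)
bite = disk(100, (50, 80), 12)
noisy_label = true_cookie & ~bite

# Two hypothetical models: one copies the flawed label, one finds the real shape.
pred_copy = noisy_label.copy()
pred_round = true_cookie.copy()

# Scores against the label we actually have.
iou_copy_vs_label = binary_iou(pred_copy, noisy_label)
iou_round_vs_label = binary_iou(pred_round, noisy_label)

# Scores against the real shape, which is not available in practice.
iou_copy_vs_truth = binary_iou(pred_copy, true_cookie)
iou_round_vs_truth = binary_iou(pred_round, true_cookie)
```

Measured against the noisy label, the model that copies the bite scores a perfect IoU and the model that predicts the round cookie scores lower; measured against the real shape, the ranking flips. Only the first pair of numbers can be computed from the dataset.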

To start experimenting, I will be using IoU and pixel accuracy as evaluation metrics.
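Both metrics are easy to compute from two label maps. The following is a rough sketch, assuming predictions and ground truth are integer arrays of class ids; the ids 0, 1, 2 for hair, skin, and background and the function names are my own assumptions, not something defined by the dataset.

```python
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    return float((pred == target).mean())

def per_class_iou(pred, target, num_classes):
    """IoU of each class; NaN when the class is absent from both maps."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        ious.append(np.logical_and(p, t).sum() / union if union else float("nan"))
    return ious

# Toy 3x3 label maps (assumed ids: 0 = hair, 1 = skin, 2 = background).
target = np.array([[0, 0, 1],
                   [2, 1, 1],
                   [2, 2, 2]])
pred = np.array([[0, 1, 1],
                 [2, 1, 1],
                 [2, 2, 0]])

acc = pixel_accuracy(pred, target)
ious = per_class_iou(pred, target, num_classes=3)
mean_iou = float(np.nanmean(ious))  # average over the classes that appear
```

Pixel accuracy counts every pixel equally, so a large background can dominate it, while the per-class IoU shows how well each individual class is segmented. That difference is one reason to look at both.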