Finally! The topic that I have been aching to learn since I started working on my research problem. Haha. The truth is, I've already tried learning this on my own since the summer of 2013. I had to present meaningful results to my advisers at the time, and I found out that color segmentation is one of the topics taught in AP186. I was actually ranting about why it has to be taken during the fifth year and not earlier. Well, that of course, is the selfish side of me talking. I knew there was a reason why it had to be taught in our last year.
My research is on the dynamics of granular collapse. The final configurations and the topology of the flow of granular particles are my main concerns. For that reason, I have to find the final position of all the particles resulting from the collapse of a granular column. In order for me to fully visualize and understand their dynamics, I had the grains color-coded and layered into three, both vertically and horizontally. An example of my raw data (compressed) is shown below:
Figure 1. Raw shots from my experiment on granular collapse
As can be observed from the pictures above, the red and yellow grains appear more apparent than the blue ones. My problem lies with the fact that the blue grains camouflage with the background. When I did the experiment, my initial plan was to make the background white or green. However, I have another experiment involving starch (colored white) which uses the same setup, so a white background was not a good option. My other experiment also involves the use of yellow-green grains, so a green background was not a good option either. I had to settle for black because it was the only color available at the time, too.
Anyway, enough with my rants and stories. Let's get back to the main topic of this blog post -- color segmentation. From the root word alone, segment, defined by thefreedictionary.com as "to divide or become divided into segments", image segmentation is the partitioning of a digital image into smaller regions and separating a certain region of interest (ROI). There are a number of ways in which one can segment an image. The simplest example is the thresholding method, where the desired regions of an image are characterized by a particular range of gray-level values. A cut-off graylevel value is chosen, and pixels whose values fall outside the chosen range are set to 0; otherwise, they are set to 1. Thus, thresholding converts a grayscale image to binary. This process is considered the simplest because it is very straightforward to perform, especially if the image involved has a bimodal histogram wherein the foreground and the background can be easily separated.
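To make the thresholding idea concrete, here is a minimal sketch in Python with NumPy (the toy pixel values and the cutoff of 128 are made up for illustration, not taken from my actual data):

```python
import numpy as np

def threshold_binary(gray, cutoff):
    """Binarize a grayscale image: pixels above the cutoff become 1, the rest 0."""
    return (gray > cutoff).astype(np.uint8)

# Toy 8-bit "image": dark background with a bright blob in the upper right
gray = np.array([[10,  20, 200],
                 [15, 210, 220],
                 [12,  18,  25]], dtype=np.uint8)

binary = threshold_binary(gray, cutoff=128)
```

For a real image with a bimodal histogram, the cutoff is usually chosen at the valley between the two peaks.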
If however, we are faced with images that include shading variations (for example: 3D objects), it is best to perform other methods that can separate the pure chromaticity information from brightness. In an RGB image, each pixel has a corresponding value for red, blue and green. If we then let I be equal to the sum of the values R, G, and B, then the normalized chromaticity coordinates are:
r = R / I,    g = G / I,    b = B / I        (NCC)
Thus, r + g + b = 1, so that the normalized blue coordinate is simply b = 1 - r - g.
We can therefore represent the chromaticity using just two coordinates, r and g. We note that the brightness information is stored in the I value. By moving to the normalized chromaticity coordinates, we transform the color information from RGB to rgI space, where the chromaticity is reduced from 3 dimensions to 2 (r and g) and the brightness is carried separately by I. The normalized chromaticity space is shown in the following figure, where the x-axis and the y-axis correspond to the r and g values, respectively.
Figure 2. Normalized chromaticity space
From the chromaticity space, a red colored pixel is therefore at the lower right corner with coordinates (1, 0), while the origin (0, 0) corresponds to a pure blue pixel since b = 1 - 0 - 0 = 1. One can also notice that a white pixel appears at the coordinate (0.33, 0.33), where r, g, and b all have the same value.
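The NCC conversion is easy to verify in code. Here is a sketch in Python with NumPy (my own analysis was not done this way, so treat it purely as an illustration of the equations above):

```python
import numpy as np

def to_ncc(img):
    """Convert an RGB image (H x W x 3 array) to normalized chromaticity coordinates.

    Returns the per-pixel r and g; b is implied since r + g + b = 1.
    """
    img = img.astype(np.float64)
    I = img.sum(axis=2)        # per-pixel brightness I = R + G + B
    I[I == 0] = 1.0            # avoid division by zero for pure black pixels
    return img[..., 0] / I, img[..., 1] / I

# One pure-red, one pure-green, and one white pixel
px = np.array([[[255, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.uint8)
r, g = to_ncc(px)
# red   -> (r, g) = (1, 0), the lower right corner of the chromaticity space
# green -> (r, g) = (0, 1)
# white -> (r, g) = (0.33, 0.33), where r = g = b
```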
We now proceed to the discussion of the different methods of segmentation based on color -- parametric and nonparametric. In performing these methods, one begins by choosing and cropping a region of interest (ROI) from the image. In parametric segmentation, the color histogram of the chosen ROI is normalized to obtain the probability distribution function of the color. This PDF would then serve as the basis in determining if a certain pixel in the image belongs to the region of interest. A joint probability p(r)p(g) for the red and green coordinates corresponds to the likelihood of a pixel belonging to the ROI, where p(r) is given by the following equation:

p(r) = 1 / (σ_r √(2π)) · exp( -(r - μ_r)² / (2σ_r²) )
In the above equation, we assume a Gaussian distribution independently along the r and g values. The means μ and standard deviations σ are therefore calculated first from the chosen ROI for both the r and g coordinates; the same equation then gives p(g). This method was performed on the image below and the result is displayed along with it.
Figure 3. A copy of the original image obtained from [2]
Figure 4. Parametric segmentation of the image from figure 3
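The parametric method can be sketched in Python/NumPy as follows. This mirrors the steps described above, but it is an illustrative reimplementation, not the exact code behind Figure 4:

```python
import numpy as np

def parametric_segment(img, roi):
    """Likelihood map p(r)p(g) for each pixel of img, with the Gaussian
    parameters (mean, standard deviation) estimated from the ROI."""
    def ncc(a):
        a = a.astype(np.float64)
        I = a.sum(axis=2)
        I[I == 0] = 1.0
        return a[..., 0] / I, a[..., 1] / I

    def gauss(x, mu, sigma):
        sigma = max(sigma, 1e-6)   # guard against a perfectly uniform ROI
        return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

    r_roi, g_roi = ncc(roi)
    r_img, g_img = ncc(img)
    # joint probability, assuming r and g are independent Gaussians
    return (gauss(r_img, r_roi.mean(), r_roi.std())
            * gauss(g_img, g_roi.mean(), g_roi.std()))
```

Thresholding the returned likelihood map then gives the binary segmented image.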
Since the probability distribution function of the color is dependent on the chosen ROI, it is safe to assume that the precision of the segmentation depends on the pixels included in the ROI. If, say, we increase the number of pixels in the ROI, there is a greater chance that a wider variety of shades will be present in the ROI, and the standard deviation would be higher. Consequently, the probability of a random pixel belonging to the ROI would also increase. If, however, the chosen ROI is uniform even as the number of pixels increases (that is, there is no increase in standard deviation), no difference should be observed. This is best illustrated by the following results.
Figure 5. Effect of reducing the number of pixels included in the ROI
Notice that decreasing the ROI resulted in the loss of some of the segmented regions that appear when the bigger ROI is used. Moreover, the concerned sections (red colored paint, bottom image) are not completely filled compared to the result with the bigger ROI.
Another method of segmenting a colored image is by obtaining the 2D histogram of the region of interest and from it, the pixel is tagged whether or not it belongs to the obtained histogram. The 2D histogram of the red ROI is shown below.
Figure 6. 2D histogram of the region of interest
We then employ the method of histogram backprojection, wherein each pixel location in the original image is given a value corresponding to its histogram value. Suppose a blue pixel maps to (0, 0); its corresponding value would then be 0 since the 2D histogram is 0 at (0, 0). The result of the nonparametric segmentation is shown in the following figure.
Figure 7. Nonparametric segmentation of the image from figure 3
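The 2D histogram and backprojection steps can also be sketched in Python/NumPy (again an illustrative reimplementation, not my actual code; the bin count of 32 is an arbitrary choice):

```python
import numpy as np

def backproject(img, roi, bins=32):
    """Nonparametric segmentation: tag each pixel of img with the value of the
    ROI's 2D (r, g) histogram at that pixel's own (r, g) location."""
    def ncc(a):
        a = a.astype(np.float64)
        I = a.sum(axis=2)
        I[I == 0] = 1.0
        return a[..., 0] / I, a[..., 1] / I

    # build the 2D chromaticity histogram of the ROI, normalized to [0, 1]
    r_roi, g_roi = ncc(roi)
    hist, _, _ = np.histogram2d(r_roi.ravel(), g_roi.ravel(),
                                bins=bins, range=[[0, 1], [0, 1]])
    hist /= hist.max()

    # backprojection: look up each image pixel's (r, g) bin in the histogram
    r_img, g_img = ncc(img)
    ri = np.clip((r_img * bins).astype(int), 0, bins - 1)
    gi = np.clip((g_img * bins).astype(int), 0, bins - 1)
    return hist[ri, gi]
```

Pixels whose chromaticity never occurs in the ROI get a value of 0, so they are excluded from the segmented result.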
The effect of reducing the number of pixels of the ROI for the nonparametric segmentation is also shown in the next figure.
Figure 8. Effect of reducing the number of pixels included in the ROI for nonparametric (reduce patch size)
It is intuitive that if we decrease the patch size that we use as ROI, some shades would also be removed from the histogram. This is apparent in Figure 8.
If we now compare the result of the nonparametric to the parametric segmentation, we can easily say that the nonparametric segmentation shows a more accurate result.
In conclusion, I would like to say that the key in segmentation is to find the right size and area for the ROI. I've learned this because I have used different sizes of ROI in my research. An ROI that is too big would include other unnecessary sections, while a very small ROI would exclude those that are supposed to be included.
For this activity, I give myself a grade of 12/10 for exploring the parameters that could further affect the result.
[1] Maricor Soriano, Color Image Segmentation Activity Manual
[2] Color Oil paints in opened paint buckets from http://www.flashcoo.com/cartoon/colorful_objects_and_designs/Colorful_oil_paints_Opened_paint_bucket.html on August 10, 2013