I've started getting my first segmentation results for my honours project. To sum it up in a few words, image segmentation is about breaking an image up into regions. This has many applications and is often the first step in an image-processing pipeline, with the end result relying heavily on the segmentation quality.
The problem is that it's very difficult to define what a good segmentation is. It depends a lot on, amongst other things, the nature of the image. Lighting conditions, noise, texture - these can all have a large impact on the results. A key problem in segmentation is splitting an image into too few regions (undersegmentation) or too many (oversegmentation). This is easily demonstrated by my first results, shown below.
The first image is the original we're trying to segment. It's the famous Lenna image, used throughout image processing as a standard test image. Notice the noise in the background, the texture in the hair and hat, the smooth texture of the skin, and the many other features that one aims to handle well in image processing.
These next three images are segmentation results I obtained from my implementation of the Watershed algorithm. The first result comes from the Watershed algorithm with no improvements to the original algorithm besides some linear filtering to reduce noise. A region is displayed by averaging the colours of its pixels and assigning that single colour to the entire region. As you can see, there isn't much difference between this and the original. Click on the image for a larger view and you should notice the many small regions.
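The region-colouring display step is simple enough to sketch. This is a minimal pure-Python version, not the code from my implementation; the `colour_regions` name and the list-of-tuples image representation are just assumptions for illustration.

```python
def colour_regions(image, labels):
    """Replace every pixel with the mean colour of its region.

    image  -- 2D list of (r, g, b) tuples
    labels -- 2D list of region labels, same shape as image
    """
    sums, counts = {}, {}
    for row, lab_row in zip(image, labels):
        for (r, g, b), lab in zip(row, lab_row):
            sr, sg, sb = sums.get(lab, (0, 0, 0))
            sums[lab] = (sr + r, sg + g, sb + b)
            counts[lab] = counts.get(lab, 0) + 1
    # integer mean colour per region label
    means = {lab: tuple(s // counts[lab] for s in sums[lab]) for lab in sums}
    return [[means[lab] for lab in lab_row] for lab_row in labels]
```

With severe oversegmentation almost every region contains only a handful of pixels, which is why the averaged output looks nearly identical to the original.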
This is an example of severe oversegmentation. It was the major problem with the Watershed algorithm when it was first developed. Since then there have been several improvements, one of which you'll see in my subsequent results.
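For readers unfamiliar with the algorithm: Watershed treats the gradient image as a landscape and floods it from its minima, each catchment basin becoming a region - so every shallow local minimum caused by noise spawns its own tiny region. A minimal marker-based priority-flood sketch (not my actual implementation; the `watershed` name and the markers argument are assumptions, and it omits ridge lines between basins) looks roughly like this:

```python
import heapq

def watershed(image, markers):
    """Marker-based watershed by priority flooding.

    image   -- 2D list of gradient magnitudes
    markers -- dict mapping (row, col) seed pixels to region labels
    Returns a 2D list of region labels.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    heap, order = [], 0  # 'order' is a tie-breaker for equal pixel values
    for (r, c), lab in markers.items():
        labels[r][c] = lab
        heapq.heappush(heap, (image[r][c], order, r, c))
        order += 1
    while heap:
        # always flood the lowest unprocessed pixel first
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]  # inherit the flooding region
                heapq.heappush(heap, (image[nr][nc], order, nr, nc))
                order += 1
    return labels
```

In the classical, unimproved form the markers are all the local minima of the gradient image, which is exactly why a noisy gradient produces thousands of regions.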
These next two results make use of a very simple smoothing technique to reduce the oversegmentation problem. Gradients play a key role in the algorithm, so here I threshold the gradient to reduce the effect of minor intensity differences. The two results below use different threshold values, and you should easily be able to spot the differences.
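The thresholding itself is trivial - something along these lines, where gradient values below the threshold are clamped to zero so shallow minima merge into their neighbours (a sketch, not my exact code; the function name is made up):

```python
def threshold_gradient(gradient, threshold):
    """Zero out gradient values below the threshold so that minor
    variations no longer create separate catchment basins."""
    return [[g if g >= threshold else 0 for g in row] for row in gradient]
```

Raising the threshold merges more basins, which is why too large a value tips the result from oversegmentation into undersegmentation.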
The first is closer to ideal; the second is undersegmented - look at how a large chunk of the hat is grouped together with the background region. Both, however, oversegment the hair, and this is where the problem with textures comes in.
There's still a very long way to go here and this is just the beginning of the results. I haven't even tried running on a different image yet. My eventual goal is to try to produce a single segmentation algorithm that covers a wide variety of images. That's where genetic algorithms will come into play, to help with the uncertainty in the image.
I must say it's nice to be able to show visual results that probably explain things better than words can.
UPDATE: Applying a median filter to the gradient image produces much better results. See the image below. The textured areas are segmented into far fewer regions, the boundary of the hat is fully intact, and the finer details such as the eyes and mouth have less impact on the segmentation. The number of tiny regions resulting from noise is also greatly reduced. This shows how much of an impact such a small change can make.
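A median filter replaces each pixel with the median of its neighbourhood, which removes isolated noise spikes without smearing edges the way linear (averaging) filters do - hence the cleaner gradient minima. A minimal 3x3 version in pure Python (a sketch assuming a 2D list-of-lists gradient image, not my project code) might look like:

```python
def median_filter3(image):
    """3x3 median filter; edge pixels use only their in-bounds neighbours."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[nr][nc]
                      for nr in range(max(0, r - 1), min(rows, r + 2))
                      for nc in range(max(0, c - 1), min(cols, c + 2))]
            window.sort()
            out[r][c] = window[len(window) // 2]
    return out
```

A lone noisy pixel never makes it into the median of any window, so the spurious minima it would have created in the gradient image simply disappear.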
Hey Marco,
I would like to ask you, can your segmentation method be adapted to segment handwritten letters?
I am using OpenCV for all my image tasks - but I am not an excellent mathematician :(
Do you have any suggestions for me? Would be great!
Thank you,
Qazi