The way we work, we try to define regions whose boundaries are somewhat regular, and in a way such that the pixel distribution of each region has distinct statistical properties.
In this library, we use level-set functions to represent regions. They are very flexible in that they allow for arbitrary topology of regions, are easy to modify and as (differentiable) functions they fit well into a PDE framework.
A level-set function is a scalar function defined over the whole image domain, and defines a separation of the image into two parts: those parts of the image where the level-set function takes on a negative value is defined as being a separate region from those parts where the level-set function takes on a positive value. In LSSEG, we choose the convention that all parts of the image domain where the level-set function is negative is called the interior of the region represented by the level-set function (all the rest thus constituting the exterior). Also, those parts of the image where the level-set function is zero is called the boundary of the domain.
in LSSEG, the level-set function is represented by the class LevelSetFunction. This class inherits from the Image class and can thus be regarded as a (greyscale) image itself; each of its pixels representing the value taken on by the level-set function at the position of that pixel. (NB: the LevelSetFunction has a restriction not present in its base class Image: it always has exactly one channel).
For a very good book on level-set functions and their use, refer to [Osher03].
. Remember our convention that the interior of the region described by our level-set function is the domain on which the level-set function is negative, and that the boundary of this region is described by the domain on which the level-set function is exactly zero (a curve, except in degenerate cases). Again, refer to the book by Osher for formal explanations and additional detail.
. Discretized, this describes the change in
from one timestep to the next. In the LSSEG library, the quantity
can be directly obtained from a LevelSetFunction through use of the member function LevelSetFunction::curvatureTimesGradXD(). Note that movement by mean curvature motion is completely determined by the shape of the level-set function itself, and is not influenced by any external factors.
Here,
may vary over the domain, hence the effect on
is not the same everywhere, which allows for controlling its development according to external factors. The incremental change in the level-set function due to normal direction flow can be computed in the LSSEG library by using the functions normal_direction_flow_XD(). In order to define a
function to use in segmentation, we use ForceGenerators.
is quite computationally expensive. However, we are only interested in the development of
around its current zero-set, where the current region boundary is located. Therefore, we only apply the above equations to a small band along this boundary. Since this boundary will move over time, the band will move along with it. All parts of the level-set function away from this band will be left as it is (except that it may be occationally reinitialized to the signed distance function).
In the above formula,
represents the whole image domain, whereas
represents the boundaries that make up an image partition.
is a simpler image, which we can consider as "a way to approximate the contents of each image region".
The Mumford-Shah is a functional that assigns one numerical value for each possible partition
of the image domain (with a given way
to approximate the contents of each region), and it is chosen in such a way that "better" partitions are assigned lower values. ("Better" in the sense that the bondaries are kept short and that each domain has distinct properties that can be modeled well by out simpler approximation
). Thus, in order to search for a good segmentation, we search for a partitioning of the image that gives a value that is as low as possible when applying the Mumford-Shah functional. You can read more about the Mumford-Shah functional in the glossary. In order to get a more formal and in-depth explanation about how it is used for our segmentation purposes, you should consult chapter 5 of [Brox05].
The Mumford-Shah functional provides a very general setting for describing image segmentation. Many segmentation methods can be seen as different ways of searching for minima of this functional, with different ways of representing
and different approximations
. In our PDE-based approach, we are interested in the functional's derivatives, which we suppose exist, and to use them in a minimization process in order to converge to a minimum, which we define as the segmentation we are looking for.
(region boundaries) and
(approximating model of each region). In our setting, for now, we aim to partition the image into exactly two, not necessarily connex, regions, and represent this partitioning by a level-set function
.
is then the zero-set of
.
to be piecewise constant on each of the two regions defined by
. This means that the second term of the Mumford-Shah functional above vanish (
zero everywhere), and that the constant value of
for a given region must be equal to the average pixel value for that region, in order to minimize the functional's first term. Through the Euler-Lagrange equations we obtain an expression for an evolving
that, in a slightly modified version, can be written:
here represents the image, so it varies over the domain.
and
are the constant values for
inside and outside the specified region. These values are obtained by averaging pixel values. This is the segmentation model proposed by Chan and Vese in 1999. For details on how we arrive at this equation, you can also consult chapter 12 of Osher's book, or subsection 5.1.2 of Brox' PhD thesis. We note that, by posing
, the above expression can be rewritten:
As you can see the right-hand side now consists of two terms which are the expressions for mean curvature motion and normal direction flow respectively, with
playing the role of deciding the weight of one term with respect to the other. As stated earlier, in LSSEG we use ForceGenerators in order to compute
for various segmentation schemes. The ForceGenerator in LSSEG corresponding to the Chan-Vese model is called ChanVeseForce.
in the Mumford-Shah functional. n If we let
represent some probability distribution of pixels, we obtain the following equation of evolution for
:
Here,
and
represent, respectively, the probability of a particular pixel value inside and outside of the region defined by our level-set function
. We see that, again, the equation of evolution is a sum of a term describing mean curvature motion and a term describing normal direction flow, with
being a parameter that determines the relative weight of the normal direction flow term with respect to the other term. Read subsection 5.1.2 of Brox' PhD thesis in order to see how we arrive at this expression.
If we use a normal probability distribution to model pixel distributions, the above expression becomes:
where
and
are the estimated average pixel values of the image inside and outside of the region, and
and
are the estimated variances. Contrary to the Chan-Vese model, this model is able to distinguish between two regions having the same average pixel values, as long as there is a distinct difference in the variances of the two regions. The LSSEG ForceGenerator corresponding to this model is NormalDistributionForce. We can choose other probability distributions as well; if we construct probability distributions based on the current histograms of pixels in each region, the Parzen density estimate, we obtain an even richer model, but which is more dependent on a good initalization. The LSSEG ForceGenerator used in this model is called ParzenDistributionForce. Again, for details, refer to Brox' PhD thesis.
based on a selected ForceGenerator, Image, initialization and smoothness value
is called develop_single_region_XD().
was greyscale, i.e., contained only one channel. It is however very easy to extend this approach to a multi-channel setting. In the evolution equation, the term expressing mean curve motion remains the same, whereas the term expressing normal direction flow is modified to take into account the contribution from each channel For the model using regional statistics, we will then obtain (supposing we have M different, independent channels):
Details are found in section 5.2 of Brox' PhD thesis. The LSSEG ForceGenerators are designed to work with multichanneled images (including, of course the special case of single-channeled images).
No matter which of the above mentioned models we use, we still need to ensure that the regions into which we want to separate the image have clearly discernable statistical properties. For a white mouse on a black carpet, this is easy - the average pixel intensity is very different between the "mouse" region and the "carpet" region, and a greyscale, single-channeled image should be sufficient to obtain a good result. For an image describing a red cup on a blue table, it is likewise a fairly easy task to obtain the segmentation by using (at least) the "RED" and "BLUE" color components of the color picture. However, often we would like to segment regions that are not easily distinguishable by color or luminosity. In this case, we would have to base our segmentation on different texture properties of the regions we want to obtain. For this purpose, we must find ways to filter the image in ways that enhance the pixel distribution statistical properties between regions of different textures. This is not an easy thing to do, but two possibilities are:
An image of a zebra, with its three structure tensor components.
An image of a zebra, with its computed scale measure (computed using 20 timesteps of duration 1)
In the original evolution equation for segmentation into two regions using regional statistics, we had:
In the multi-region setting, there is no outside anymore, just a number
of different competing regions described by
level-set functions
. For a given level-set function
, we modify its evolution equation in the following way:
We see that the term representing the outside (
) in the original equation has now been replaced by a term
representing the "strongest" competing candidate at the point in question. This leads to an evolution where stronger candidates "push away" weaker ones. As long as we take care to initialize the domains so that they are non-overlapping and together cover the whole image domain, this will lead to an evolution where regions stay separate while still covering the whole image domain at all times.
The multi-region setting has one important implication for the initialization of ForceGenerators when using the LSSEG library: by default the ForceGenerators NormalDistributionForce and ParzenDistributionForce compute the quantity corresponding to the normal direction flow for two-region segmentation, i.e.,
. In the multi-region setting, the outside is not defined anymore, and the corresponding term,
must not be added to the term. For this reason, these ForceGenerators need to be told if they are going to be used in a multiregion setting. This is specified upon the construction of the ForceGenerators by a boolean argument to their constructor.
The algorithm in LSSEG that carries out the evolution of several level-set functions
in a multiresolution setting is called develop_multiregion_XD().
initialize_levelsets(), which selects the use of one of the LSSEG functions:
an illustration of the multilevel segmentation approach
multichan in the code. The channels going into the assembly of this image can be specified by the user on the command line.multichan, it is smoothed using an anisotropic smoothing process that preserves edges in the image. The benefit of this smoothing is to remove fluctuations in order for the segmentation process to converge faster and prevent getting stuck in local minima. It also distributes more evenly the information found in the structure tensor and scale measure channels. The level of smoothing to be used can be set from the command line. The user specifies a number, and when the average pixel variation between two smoothing iterations is less that that number, the smoothing process stops. For this reason, smaller numbers lead to more smoothing . In general, smoothing is good, but too much smoothing may remove important cues for the segmentation. In addition, the smoothing is computationally expensive.multichan) for a given level of resolution has been assembled and smoothed, the segmentation process is started. This is done through calls to the develop_single_region_2D() function or the develop_multiregion_2D() function, depending on whether we segment into two or more regions.Below you find the segmentation result of an image of a guppy. The first segmentation contain two regions, the second contain four. The parameters for the four-region image had to be tuned more carefully. For a higher resolution image, click here.
Left: original image. Center: segmented into two regions.Right: segmented into four regions
segmentation2D guppy.png -init 10segmentation2D guppy.png -init 4 -nr 4 -s 0.1 -lr 60 -i1 10000
1.4.7