In this post I demonstrate CLASSIX’s new MATLAB app, available on MATLAB File Exchange and GitHub. We will segment an image depicting 24 Greek coins, following a demo in scikit-learn. That demo uses spectral clustering to identify connected regions of similar grayscale pixels in an image. Spectral clustering is well suited to image segmentation because it works with graph Laplacians, which can encode neighborhood information for the nodes (here, the nodes are the pixels of the image). The method therefore favors grouping nearby pixels together and tends to produce connected 2D regions in an image.
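To make the graph Laplacian idea concrete, here is a minimal Python sketch that builds the unweighted Laplacian L = D - A of a small pixel grid in which each pixel is linked to its horizontal and vertical neighbors. The function name `grid_laplacian` is made up for this illustration, and the unit edge weights are a simplifying assumption: a real segmentation pipeline would normally weight edges by pixel similarity.

```python
def grid_laplacian(ny, nx):
    """Unweighted graph Laplacian L = D - A of an ny-by-nx pixel grid.

    Pixels are numbered row by row; each pixel is connected to its
    right and lower neighbors (4-connectivity). Unit edge weights are
    an assumption made for illustration only.
    """
    n = ny * nx
    L = [[0] * n for _ in range(n)]
    for y in range(ny):
        for x in range(nx):
            i = y * nx + x
            for dy, dx in ((0, 1), (1, 0)):  # right and lower neighbor
                yy, xx = y + dy, x + dx
                if yy < ny and xx < nx:
                    j = yy * nx + xx
                    L[i][j] -= 1  # off-diagonal: minus adjacency
                    L[j][i] -= 1
                    L[i][i] += 1  # diagonal: node degree
                    L[j][j] += 1
    return L


# A 2-by-2 image has four pixels, each with two neighbors
for row in grid_laplacian(2, 2):
    print(row)
```

Every row sums to zero, and the nonzero off-diagonal entries mark exactly the neighboring pixel pairs. This is the sense in which the Laplacian carries the image's neighborhood structure.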
CLASSIX is a distance-based clustering method, so it is natural to encode pixel affinity as additional features of the data points. More specifically, we represent each grayscale pixel as a 3D point of the form [c,scl*x,scl*y], where c is the grayscale value and x, y are the pixel coordinates. The scaling parameter scl controls the relative weighting between a pixel’s color and its position.
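The effect of scl can be seen in a small Python sketch. The helper names and the pixel values below are invented for illustration. The sketch compares two distances in feature space: one between two pixels of the same gray value that lie ten pixels apart, and one between two adjacent pixels on either side of an intensity edge.

```python
import math


def pixel_feature(c, x, y, scl):
    """Map a grayscale pixel to the 3D point [c, scl*x, scl*y]."""
    return (c, scl * x, scl * y)


def feature_distance(p, q):
    """Euclidean distance between two feature points."""
    return math.dist(p, q)


# Hypothetical pixels given as (gray value, x, y)
same_color_a = (0.5, 10, 20)
same_color_b = (0.5, 20, 20)  # same gray value, 10 pixels away
edge_a = (0.2, 10, 20)
edge_b = (0.7, 11, 20)        # adjacent pixel, very different gray value

for scl in (0.01, 0.08, 1.0):
    d_far = feature_distance(pixel_feature(*same_color_a, scl),
                             pixel_feature(*same_color_b, scl))
    d_edge = feature_distance(pixel_feature(*edge_a, scl),
                              pixel_feature(*edge_b, scl))
    print(f"scl={scl}: same color, far apart: {d_far:.3f}; "
          f"across an edge: {d_edge:.3f}")
```

With a small scl the gray value dominates, so distant pixels of similar color appear close. With a large scl position dominates, so neighboring pixels appear close even across an edge. A useful value lies in between and has to be tuned for the image at hand.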
In this example we use MATLAB’s new Python interface to load the Greek coins image and preprocess it (smoothing and downsampling), just as in the scikit-learn image segmentation example.
clear all
addpath .. % get data
pyrun("from skimage.data import coins")
pyrun("from scipy.ndimage import gaussian_filter")
pyrun("from skimage.transform import rescale")
pyrun("orig_coins = coins()")
% Resize it to 20% of the original size to speed up the processing
% Applying a Gaussian filter for smoothing prior to down-scaling
% reduces aliasing artifacts.
pyrun("smoothened_coins = gaussian_filter(orig_coins, sigma=2)")
coins = pyrun("coins = rescale(smoothened_coins, 0.2, mode='reflect', anti_aliasing=False)","coins");
coins = double(coins);
% plot
imagesc(coins), colormap gray, title('24 Greek coins')
We now run CLASSIX. The parameters require some fine-tuning, but CLASSIX runs fast enough that this can be done interactively.
[ny,nx] = size(coins);
[X,Y] = meshgrid(1:nx,1:ny);
scl = 0.08; % locality weighting
data = [ coins(:), scl*X(:), scl*Y(:) ];
tic
[labels,explain,out] = classix(data,0.03,20);
fprintf('Runtime in seconds: %f\nNumber of clusters: %d', toc, length(unique(labels)))
This took just under 0.4 seconds. Note that 25 clusters were found: the 24 coins plus the image background.
Calling CLASSIX’s explain function gives a striking result: it essentially regenerates the original image without “knowing” that the provided data array corresponds to an image!
explain()
CLASSIX is an explainable clustering method that provides justification for its computations. Let’s find out why two data points (pixels) are assigned to the same cluster (= coin):
explain(4000,4313)
CLASSIX explains that there is a path of data points within cluster #24 that connects the two data points at indices 4000 and 4313, and the step sizes on this path are small (limited by the radius parameter we chose when calling CLASSIX).
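The idea behind such an explanation can be illustrated with a short Python sketch: two points are connected if a chain of points links them in which no step exceeds a given length. This is only a simplified illustration on raw data points, not CLASSIX's actual implementation; the function name `connecting_path` and the toy data are assumptions.

```python
import math
from collections import deque


def connecting_path(points, start, goal, max_step):
    """Breadth-first search for a chain of points from start to goal in
    which consecutive points are at most max_step apart.

    Returns the chain as a list of indices, or None if none exists.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        if i == goal:
            # Walk back through the parents to recover the chain
            path = []
            while i is not None:
                path.append(i)
                i = parent[i]
            return path[::-1]
        for j in range(len(points)):
            if j not in parent and math.dist(points[i], points[j]) <= max_step:
                parent[j] = i
                queue.append(j)
    return None


# Toy data: a chain of four points and one far-away point
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (5.0, 5.0)]
print(connecting_path(points, 0, 3, max_step=0.6))  # [0, 1, 2, 3]
print(connecting_path(points, 0, 4, max_step=0.6))  # None
```

Points 0 and 3 are too far apart to be linked directly, but the intermediate points provide a chain of small steps. The isolated point cannot be reached, so it would not be placed in the same cluster.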
The CLASSIX Python repository contains a demo that includes comparisons with spectral clustering: https://github.com/nla-group/classix/blob/master/demos/Segmenting_greek_coins.ipynb
When run on the same laptop,
- spectral clustering required about 3.17 seconds,
- CLASSIX.py required about 0.61 seconds, and
- CLASSIX.m (this example) required about 0.40 seconds
to segment the Greek coins image with similar quality.
We commonly observe that the MATLAB implementation of CLASSIX is slightly faster than the Python one. Even compared with sklearn’s spectral_clustering method using the cluster_qr option (the fastest we have tested), CLASSIX.py is about five times faster than spectral clustering in this example.
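For reference, the speedup factors follow directly from the timings quoted above. The short Python snippet below only redoes that arithmetic; the dictionary labels are chosen here for readability.

```python
# Runtimes in seconds, as reported in this post
timings = {
    "spectral clustering": 3.17,
    "CLASSIX (Python)": 0.61,
    "CLASSIX (MATLAB)": 0.40,
}

baseline = timings["spectral clustering"]
# Speedup of each method relative to spectral clustering
speedups = {name: baseline / t for name, t in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

This gives roughly 5.2x for the Python implementation and 7.9x for the MATLAB implementation, relative to spectral clustering.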
Finally, we note that clustering may not be the best approach for image segmentation. But the fact that the data has a natural image representation allows for nice visual interpretations of clustering results.
Acknowledgments. I am grateful to Andrew Knyazev for pointing me to the Greek coins demo, and to Mike Croucher for help with developing the CLASSIX MATLAB app.