
(Resource) Algorithmic Evaluation of Image Quality


In technical imaging, it is essential to be able to accurately describe the nature and amplitude of the degradations to image quality created by compression and reconstruction processes.

Many objective metrics have been devised for this purpose, but they do not always correlate well with subjective human evaluation.

Furthermore, many industrial and scientific sectors are more concerned with the interchangeability of images, i.e. the ability to reconstruct (after compression) a picture very true to the original, than with subjective quality. Other metrics exist to quantify this second aspect of image quality.

[Figure: MSE and SSIM image quality evaluation metrics]
This page describes the two types of metrics and clarifies their use and their practical meaning for evaluating the performance of a codec or an image compression library.

Mean squared error

MSE (Mean squared error) measures the average of squared errors between a reference image (the original) and a degraded image (the image reconstructed after compression). It is calculated pixel by pixel, by adding the squares of value differences between pixels and dividing the result by the total number of pixels.

For color images, the MSE is the average of each channel's MSE.
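The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming images are given as lists of rows of integer pixel values; the function names `mse_channel` and `mse_color` are hypothetical and not part of any Waaves library.

```python
def mse_channel(reference, degraded):
    """MSE between two single-channel images given as lists of rows."""
    if len(reference) != len(degraded):
        raise ValueError("images must have the same height")
    total = 0
    count = 0
    for ref_row, deg_row in zip(reference, degraded):
        if len(ref_row) != len(deg_row):
            raise ValueError("images must have the same width")
        for ref_pixel, deg_pixel in zip(ref_row, deg_row):
            # Add the squared difference of each pixel pair.
            total += (ref_pixel - deg_pixel) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    # Divide by the total number of pixels.
    return total / count


def mse_color(reference_channels, degraded_channels):
    """MSE of a color image: the average of each channel's MSE."""
    if len(reference_channels) != len(degraded_channels) or not reference_channels:
        raise ValueError("images must have the same, non-zero number of channels")
    scores = [
        mse_channel(ref, deg)
        for ref, deg in zip(reference_channels, degraded_channels)
    ]
    return sum(scores) / len(scores)


# Tiny 2x2 example: the differences are 2, 0, 0 and 4.
reference = [[10, 20], [30, 40]]
degraded = [[12, 20], [30, 36]]
print(mse_channel(reference, degraded))  # (4 + 0 + 0 + 16) / 4 = 5.0
```

Comparing an image with itself returns 0, which is the lossless case.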

An MSE of 0 indicates that the source and reconstructed images are identical (no information loss).

The MSE metric has one major drawback: it is highly sensitive to the bit depth of the image. An image with an MSE score of 100 can appear (subjectively) severely degraded if it is coded using 8 bits per channel, but almost perfect at 16 bits per channel, for instance. That is why the PSNR (below) is often preferred: it takes the bit depth into account in its calculation.

Peak signal-to-noise ratio

PSNR (Peak signal-to-noise ratio) indicates the ratio between the power of a signal and the power of the noise affecting the fidelity of its representation. It is used to express the reconstruction quality of a lossy image compression algorithm: the original image constitutes the signal and the errors introduced by the compression are the noise. Its value is undefined when the two images compared are identical.
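For reference, the commonly used definition of PSNR in terms of MSE is given below. The notation is an assumption, not taken from this page: I and K denote the original and reconstructed images of M by N pixels, and R is the maximum value a pixel can take (for example 255 for 8-bit data).

```latex
\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \bigl( I(i,j) - K(i,j) \bigr)^2
\qquad
\mathrm{PSNR} = 10 \log_{10} \left( \frac{R^2}{\mathrm{MSE}} \right) \ \text{dB}
```

With MSE = 0 the ratio is a division by zero, which is why PSNR is undefined for identical images.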

PSNR aims to approximate the human perception of image reconstruction quality more closely than MSE does.

Its advantage over MSE is that its calculation takes the bit depth of the image into account, through a term R representing the maximum value a pixel can take.
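The effect of bit depth can be illustrated with a short Python sketch. It assumes the usual definition PSNR = 10 * log10(R^2 / MSE) and takes R = 2^b - 1 for unsigned b-bit pixels; both the formula's notation and the function names `peak_value` and `psnr` are assumptions made for this example.

```python
import math


def peak_value(bit_depth):
    """Maximum value of an unsigned pixel with the given bit depth (assumed R)."""
    return 2 ** bit_depth - 1


def psnr(mse, bit_depth):
    """PSNR in dB from an MSE score and a bit depth."""
    if mse < 0:
        raise ValueError("MSE cannot be negative")
    if mse == 0:
        # Identical images: the ratio is a division by zero.
        raise ValueError("PSNR is undefined for identical images (MSE = 0)")
    r = peak_value(bit_depth)
    return 10 * math.log10(r ** 2 / mse)


# The same MSE of 100 read at two different bit depths.
print(round(psnr(100, 8), 1))   # about 28.1 dB
print(round(psnr(100, 16), 1))  # about 76.3 dB
```

The same MSE of 100 gives roughly 28 dB for 8-bit data and roughly 76 dB for 16-bit data, which matches the observation that such an error is severe in one case and almost invisible in the other.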

But PSNR doesn't make it easy to compare the compression performance of various codecs and doesn't correlate well with human analysis, because its calculation, like that of MSE, only focuses on pixel-by-pixel differences without taking any structural information into account. This type of information is essential in human vision.


Structural similarity

SSIM, the structural similarity index, makes it possible to measure the similarity between 2 images in a way that mirrors human perception much more closely than the previous 2 metrics. It is based on the observation that human vision is highly adapted to the analysis of structural information and thus aims at measuring the changes in this type of information between source and reconstructed images.

It is the product of 3 components: a luminance term (luminance variations, normalisation), a contrast term (contrast changes, gamma distortion), and a structural term (blur, noise, posterisation, sharpening, etc.).
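The three components are commonly written as follows. The notation is an assumption, not taken from this page: the means, standard deviations and covariance are computed over corresponding windows x and y of the two images, and C_1, C_2, C_3 are small constants that stabilise the divisions.

```latex
\mathrm{SSIM}(x, y) = l(x, y) \cdot c(x, y) \cdot s(x, y)

l(x, y) = \frac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}
\qquad
c(x, y) = \frac{2 \sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}
\qquad
s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}
```

Each term equals 1 when the two windows are identical, so identical images give an SSIM of 1.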

Adding a Gaussian blur, for example, has very little impact on MSE (and therefore on PSNR) since the pixel-to-pixel luminance variations cancel each other out. The SSIM score, on the other hand, will be severely affected, as human perception would be. All the images at the top of this page have an identical MSE of 144 but very different SSIM scores.
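The idea that two distortions can share an MSE of 144 and still score very differently can be reproduced with a simplified sketch. The code below is not a reference SSIM implementation: it treats the whole image as one window, whereas SSIM is normally computed over local windows and averaged. The constants K1 = 0.01 and K2 = 0.03 are commonly used defaults, and the test images and function names are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def mse(x, y):
    return mean([(a - b) ** 2 for a, b in zip(x, y)])


def global_ssim(x, y, max_value=255, k1=0.01, k2=0.03):
    """Single-window SSIM over two flat lists of pixels (simplified sketch)."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    c1 = (k1 * max_value) ** 2
    c2 = (k2 * max_value) ** 2
    c3 = c2 / 2
    mu_x, mu_y = mean(x), mean(y)
    var_x = mean([(a - mu_x) ** 2 for a in x])
    var_y = mean([(b - mu_y) ** 2 for b in y])
    cov_xy = mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure


# Hypothetical 8x8 low-contrast gradient, stored as a flat list.
reference = [100 + 2 * col for row in range(8) for col in range(8)]
# Distortion 1: uniform brightness shift of +12 (every error is 12).
shifted = [v + 12 for v in reference]
# Distortion 2: checkerboard noise of +12 / -12 (every error is also 12).
noisy = [
    v + (12 if (i // 8 + i % 8) % 2 == 0 else -12)
    for i, v in enumerate(reference)
]

print(mse(reference, shifted), mse(reference, noisy))  # both equal 144
print(global_ssim(reference, shifted))  # stays close to 1
print(global_ssim(reference, noisy))    # drops well below 1
```

Both distorted images have exactly the same MSE, yet the brightness shift preserves the structure of the gradient while the checkerboard noise destroys it, and only the structural score reflects that.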

MS-SSIM (Multi-Scale Structural SIMilarity) is an extension of SSIM that simulates the perceptual quality at various viewing distances rather than a single one. The algorithm calculates several SSIM values at varying image resolutions, and places less emphasis on the luminance term than on the contrast and structural terms. This modification leads to a higher correlation with subjective analysis, at the expense of a much higher processing cost.
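In its usual formulation, the multi-scale combination can be written as below. The notation is an assumption, not taken from this page: j indexes the M scales obtained by successive low-pass filtering and downsampling, l, c and s are the luminance, contrast and structural terms of SSIM, and the exponents are per-scale weights.

```latex
\mathrm{MS\text{-}SSIM}(x, y) =
  \bigl[ l_M(x, y) \bigr]^{\alpha_M}
  \prod_{j=1}^{M} \bigl[ c_j(x, y) \bigr]^{\beta_j} \bigl[ s_j(x, y) \bigr]^{\gamma_j}
```

The luminance term appears only at the coarsest scale M, while the contrast and structural terms contribute at every scale, which is one way to read the reduced emphasis on luminance.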
