Image Pre-Processing

Pre-processing, or data cleansing, is an important step, and most ML engineers devote a significant amount of time to it before building a model. Outlier detection, missing-value treatment, and the removal of unwanted or noisy data are typical examples of data pre-processing in general.

Image pre-processing refers to operations on images that are performed at the most basic level of abstraction. These operations do not increase image information content, but rather decrease it if entropy is used as a measure of information.

The goal of pre-processing is to improve image data by suppressing unwanted distortions or enhancing some image features that are important for subsequent processing and analysis tasks.

Image pre-processing techniques fall into the categories listed below.

1. Brightness corrections

These are also called pixel brightness transformations (PBT).

Brightness transformations change the brightness of pixels, and the transformation is determined by the properties of the pixels themselves.

In PBT, the value of the output pixel is determined solely by the value of the corresponding input pixel. Brightness and contrast adjustments, as well as color correction and transformations, are examples of such operators.
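A linear brightness and contrast adjustment is probably the simplest operator of this kind. The sketch below assumes an 8-bit grayscale image stored as a NumPy array; the function name and the `alpha` (contrast gain) and `beta` (brightness offset) parameters are illustrative choices, not something the article prescribes.

```python
import numpy as np

def adjust_brightness_contrast(image: np.ndarray, alpha: float = 1.0,
                               beta: float = 0.0) -> np.ndarray:
    """Point operator: each output pixel depends only on the matching input pixel.

    alpha scales contrast and beta shifts brightness (both names are assumptions).
    """
    out = alpha * image.astype(np.float64) + beta
    # Values outside the 8-bit range are clipped, which is where saturation occurs.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

pixels = np.array([[0, 100, 200]], dtype=np.uint8)
adjusted = adjust_brightness_contrast(pixels, alpha=1.5, beta=10)
```

Because no neighboring pixels are involved, the same function could be replaced by a 256-entry lookup table without changing the result.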

Contrast enhancement is a critical component of image processing for both human and computer vision. It is commonly used in medical image processing, and as a pre-processing step in speech recognition, texture synthesis, and a variety of other image/video processing applications.

There are two kinds of Brightness transformations, which are listed below.

A. Brightness corrections

B. Gray scale transformation

A. Brightness corrections

The most common Pixel brightness transformation operations are as follows:

A. Gamma correction or Power Law Transform

B. Histogram equalization

C. Sigmoid stretching

A. Gamma Correction

Gamma correction is a non-linear adjustment to the values of individual pixels. Whereas image normalization applies linear operations to each pixel, such as scalar multiplication and addition or subtraction, gamma correction applies a non-linear operation to the source image pixels, which can saturate parts of the image.

In the power-law form, the output intensity is the input intensity raised to the power gamma, so the relationship between the output image and gamma is not linear.
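As a rough illustration, the power-law transform can be written as s = 255 * (r / 255) ** gamma for 8-bit images. The sketch below assumes that normalization and a grayscale NumPy array; the function name is hypothetical.

```python
import numpy as np

def gamma_correct(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply s = 255 * (r / 255) ** gamma to an 8-bit image (assumed form)."""
    normalized = image.astype(np.float64) / 255.0    # scale to [0, 1]
    corrected = np.power(normalized, gamma)           # the non-linear step
    return np.clip(np.rint(corrected * 255.0), 0, 255).astype(np.uint8)

ramp = np.array([[0, 64, 128, 255]], dtype=np.uint8)
brighter = gamma_correct(ramp, 0.5)   # gamma < 1 lifts the mid-tones
darker = gamma_correct(ramp, 2.0)     # gamma > 1 darkens the mid-tones
```

Black and white stay fixed at 0 and 255, while everything in between is bent by the curve. That is why gamma changes perceived brightness without shifting the end points.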

B. Histogram equalization

Histogram equalization is a well-known contrast enhancement technique that works on almost all types of images. It modifies an image’s dynamic range and contrast by reshaping the image’s intensity histogram toward a desired shape, typically a flat one.

Histogram modelling operators, unlike contrast stretching, can use non-linear and non-monotonic transfer functions to map between pixel intensity values in the input and output images.
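For plain histogram equalization, the transfer function is usually built from the cumulative histogram of the image. The following is a minimal sketch under that assumption, for an 8-bit grayscale NumPy array; the function name is illustrative, and more general histogram modelling is not covered.

```python
import numpy as np

def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Equalize an 8-bit grayscale image using its cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=256)   # intensity histogram
    cdf = hist.cumsum()                                # cumulative distribution
    cdf_min = cdf[cdf > 0][0]                          # CDF at the darkest level present
    total = image.size
    if total == cdf_min:                               # constant image: nothing to spread
        return image.copy()
    # Build a lookup table that stretches the CDF over the full 0..255 range.
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]

low_contrast = np.array([[100, 100, 110],
                         [110, 120, 120],
                         [130, 130, 130]], dtype=np.uint8)
equalized = equalize_histogram(low_contrast)
```

The input only spans gray levels 100 to 130, while the output uses the whole 0 to 255 range, with the ordering of gray levels preserved.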

C. Sigmoid stretching

The sigmoid function is a continuous, non-linear function that is also widely used as an activation function. The name “sigmoid” comes from the fact that the function is “S” shaped. Statisticians know this function as the logistic function.

By adjusting the contrast factor ‘c’ and the threshold value, it is possible to tailor the amount of lightening and darkening and so control the overall contrast enhancement.
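The article does not give the formula, so the sketch below assumes one commonly used form, g = 1 / (1 + exp(c * (threshold - f))), with intensities normalized to [0, 1]. The function name and default values are illustrative only.

```python
import numpy as np

def sigmoid_stretch(image: np.ndarray, c: float = 10.0,
                    threshold: float = 0.5) -> np.ndarray:
    """Sigmoid contrast stretch (assumed form) for an 8-bit grayscale image."""
    f = image.astype(np.float64) / 255.0               # normalize to [0, 1]
    g = 1.0 / (1.0 + np.exp(c * (threshold - f)))      # S-shaped transfer curve
    return np.rint(g * 255.0).astype(np.uint8)

ramp = np.array([[0, 64, 128, 192, 255]], dtype=np.uint8)
stretched = sigmoid_stretch(ramp, c=10.0, threshold=0.5)
```

Pixels below the threshold are pushed darker and pixels above it are pushed lighter. A larger `c` makes the curve steeper and the contrast boost stronger.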

2. Geometric Transformations

The previous methods in this article address color and brightness/contrast. Geometric transformation modifies the positions of pixels in an image while leaving the colors unchanged.

Geometric transforms enable the removal of geometric distortion that occurs during the capture of an image. The most common geometric transformation operations are rotation, scaling, and distortion correction (undistortion).

A geometric transformation consists of two fundamental steps:

1. Spatial transformation: the physical rearrangement of pixels in the image.

2. Grey level interpolation: the assignment of grey levels to the pixels of the transformed image.
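To make the two steps concrete, here is a small resizing sketch that first maps every output pixel back to a (usually fractional) source coordinate, then assigns its grey level by bilinear interpolation. Bilinear interpolation and the corner-aligned coordinate mapping are choices made for this example; the article does not name a specific scheme.

```python
import numpy as np

def resize_bilinear(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Resize a 2D grayscale image; returns a float array."""
    in_h, in_w = image.shape
    img = image.astype(np.float64)

    # Step 1: spatial transformation. Each output row/column is mapped back
    # to a source coordinate (corner-aligned mapping, an assumption here).
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]        # vertical interpolation weights
    wx = (xs - x0)[None, :]        # horizontal interpolation weights

    # Step 2: grey level interpolation between the four nearest source pixels.
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

small = np.array([[0, 100],
                  [100, 200]], dtype=np.uint8)
enlarged = resize_bilinear(small, 3, 3)
```

Nearest-neighbor interpolation would be cheaper but blockier; bicubic would be smoother but more expensive. The split into the two steps stays the same in each case.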

Transformations include:

1. Scaling: Scaling is simply resizing an image.

2. Translation: The shifting of an object’s location.

3. Rotation: Rotating an object by an angle theta.

4. Shearing: Shifting pixels along one axis (for example, horizontally) by an amount proportional to their position along the other axis.

5. Affine Transformation: Rather than defining the scale factors, shearing factors, rotation angle, and translation offsets separately, it is common to combine them into a single matrix. The combination of these four transformations is known as an affine transformation.

6. Perspective Transformation: Changes the viewpoint from which an image or video frame appears to have been taken, which can make the information of interest easier to read. To apply it, you must provide the points in the image that delimit the region whose perspective you want to change.
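The idea of folding several transformations into one matrix can be sketched with 3x3 homogeneous matrices. The helper names below are hypothetical, points are given as (x, y) pairs, and the matrices are composed right to left.

```python
import numpy as np

def scaling(sx: float, sy: float) -> np.ndarray:
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def translation(tx: float, ty: float) -> np.ndarray:
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def rotation(theta_deg: float) -> np.ndarray:
    t = np.deg2rad(theta_deg)
    return np.array([[np.cos(t), -np.sin(t), 0],
                     [np.sin(t),  np.cos(t), 0],
                     [0, 0, 1]], dtype=float)

def shear_x(k: float) -> np.ndarray:
    # Shifts x by k times the y coordinate.
    return np.array([[1, k, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

def transform_points(matrix: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Apply a 3x3 matrix to an (N, 2) array of (x, y) points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ matrix.T
    # Dividing by the last coordinate is a no-op for affine matrices, but it
    # lets the same helper handle a perspective (homography) matrix.
    return mapped[:, :2] / mapped[:, 2:3]

# One combined matrix: scale by 2, then rotate by 90 degrees, then translate.
affine = translation(10, 5) @ rotation(90) @ scaling(2, 2)
corners = np.array([[0, 0], [1, 0], [0, 1]], dtype=float)
moved = transform_points(affine, corners)
sheared = transform_points(shear_x(0.5), corners)
```

The order of multiplication matters: rotating and then translating generally gives a different result from translating and then rotating.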

3. Image Filtering

The goal of using filters is to change or improve image properties and/or to extract valuable information from images, such as edges, corners, and blobs. A filter is defined by a kernel, a small array that is applied to each pixel and its neighbors within an image.

Some of the fundamental filtering techniques are as follows:

Low Pass Filtering (Smoothing): Most smoothing methods are based on a low pass filter. Smoothing an image involves reducing the disparity between pixel values by averaging nearby pixels.
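A 3x3 mean (box) kernel is one simple low-pass filter. The sketch below applies an arbitrary small kernel to a grayscale NumPy array; the helper name and the choice of edge padding at the borders are assumptions made for the example.

```python
import numpy as np

def apply_kernel(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide an odd-sized kernel over a 2D image (borders are edge-padded)."""
    kh, kw = kernel.shape
    h, w = image.shape
    padded = np.pad(image.astype(np.float64),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one shifted copy of the image per kernel entry.
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

box = np.full((3, 3), 1.0 / 9.0)          # every neighbor gets equal weight
noisy = np.full((5, 5), 10, dtype=np.uint8)
noisy[2, 2] = 100                         # a single bright outlier
smoothed = apply_kernel(noisy, box)
```

The outlier is spread over its neighborhood, so the difference between it and the surrounding pixels shrinks, which is the averaging effect described above.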

High Pass Filtering (Edge Detection, Sharpening): A high-pass filter can be used to sharpen an image. These filters emphasize fine details in the image, the opposite of what a low-pass filter does.

Directional Filtering: A directional filter is a type of edge detector that can be used to compute an image’s first derivatives. The first derivatives (or slopes) are most visible when there is a large difference between adjacent pixel values. Within a given space, directional filters can be designed for any direction.
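One way to sketch a directional filter is to estimate the first derivatives along rows and columns and then project them onto the direction of interest. The example below uses `np.gradient` (central differences) for the derivatives; the function name and the angle convention (0 degrees along the columns, 90 degrees along the rows) are assumptions.

```python
import numpy as np

def directional_derivative(image: np.ndarray, angle_deg: float) -> np.ndarray:
    """First derivative of a 2D image along the given direction."""
    # np.gradient returns one array per axis: rows first, then columns.
    gy, gx = np.gradient(image.astype(np.float64))
    t = np.deg2rad(angle_deg)
    return np.cos(t) * gx + np.sin(t) * gy

# A vertical step edge: dark on the left, bright on the right.
step = np.tile(np.array([0, 0, 0, 100, 100, 100], dtype=np.uint8), (4, 1))
across_edge = directional_derivative(step, 0)    # responds strongly at the edge
along_edge = directional_derivative(step, 90)    # sees no change
```

Kernel-based operators such as Sobel or Prewitt follow the same idea but add some smoothing across the edge direction.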

Laplacian Filtering: A Laplacian filter is an edge detector that computes an image’s second derivatives, that is, it measures the rate at which the first derivatives change. This indicates whether a change in adjacent pixel values comes from an edge or from a continuous progression.

Laplacian filter kernels typically contain negative values in a cross pattern centered within the array. The corner values are often zero, though some kernels use non-zero corners, and the center value can be either negative or positive depending on the sign convention.
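A minimal sketch of one such kernel, the 4-neighbor Laplacian with -1 in the cross, 4 at the center, and 0 in the corners, is shown below. This particular sign convention and the edge padding are choices made for the example.

```python
import numpy as np

def laplacian(image: np.ndarray) -> np.ndarray:
    """4-neighbor Laplacian: kernel [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]."""
    p = np.pad(image.astype(np.float64), 1, mode="edge")
    center = p[1:-1, 1:-1]
    up, down = p[:-2, 1:-1], p[2:, 1:-1]
    left, right = p[1:-1, :-2], p[1:-1, 2:]
    return 4 * center - up - down - left - right

# A smooth ramp (continuous progression) versus an abrupt step (edge).
ramp = np.tile(np.arange(0, 60, 10), (5, 1))
step = np.tile(np.array([0, 0, 0, 100, 100, 100]), (5, 1))
ramp_response = laplacian(ramp)
step_response = laplacian(step)
```

Inside the ramp the response is zero because the slope is constant, while the step produces a negative value on one side of the edge and a positive value on the other.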

4. Image Segmentation

Image segmentation is a technique used in digital image processing and analysis to divide an image into multiple parts or regions, often based on the properties of the pixels in the image. Image segmentation could entail separating the foreground from the background or clustering regions of pixels based on color or shape similarities.

Segmentation techniques fall into two broad groups, non-contextual and contextual, which are described below.

Non-contextual thresholding: Thresholding is the most basic non-contextual segmentation technique. With a single threshold, it converts a greyscale or color image into a binary image, known as a binary region map.

The binary map contains two potentially disjoint regions, one containing pixels with input data values less than a threshold and the other containing pixels with input data values equal to or greater than the threshold.

Thresholding techniques are classified as follows.

1. Simple thresholding

2. Adaptive thresholding

3. Colour thresholding
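The first two variants can be sketched in a few lines. Simple thresholding compares every pixel with one global value, while the adaptive version here compares each pixel with the mean of its local neighborhood. The local-mean rule, the function names, and the `block` and `offset` parameters are assumptions for illustration, since adaptive schemes differ between libraries.

```python
import numpy as np

def simple_threshold(image: np.ndarray, threshold: float) -> np.ndarray:
    """Binary region map: 1 where pixel >= threshold, else 0."""
    return (image >= threshold).astype(np.uint8)

def adaptive_threshold(image: np.ndarray, block: int = 3,
                       offset: float = 0.0) -> np.ndarray:
    """Compare each pixel with the mean of its block x block neighborhood."""
    h, w = image.shape
    pad = block // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    local_sum = np.zeros((h, w), dtype=np.float64)
    for dy in range(block):
        for dx in range(block):
            local_sum += padded[dy:dy + h, dx:dx + w]
    local_mean = local_sum / (block * block)
    return (image >= local_mean - offset).astype(np.uint8)

gray = np.array([[10, 200],
                 [128, 127]], dtype=np.uint8)
global_map = simple_threshold(gray, 128)

spot = np.full((3, 3), 10, dtype=np.uint8)
spot[1, 1] = 50                       # brighter than its surroundings
local_map = adaptive_threshold(spot, block=3)
```

A global threshold of 128 would miss the spot at intensity 50 entirely, whereas the local comparison still picks it out. That is the main reason to use adaptive thresholding under uneven illumination.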

Non-contextual thresholding groups pixels without regard for their relative positions in the image plane. Contextual segmentation is more effective at separating individual objects because it takes into account the closeness of pixels that belong to a single object. Contextual segmentation can be divided into two categories based on signal discontinuity or similarity.

Discontinuity-based techniques seek complete boundaries enclosing relatively uniform regions, assuming abrupt signal changes across each boundary. Similarity-based techniques attempt to create these uniform regions directly by grouping together connected pixels that meet certain similarity criteria. The two approaches mirror each other, in the sense that a complete boundary splits one region into two.

The types of contextual segmentation are listed below.

Pixel connectivity
Region similarity
Region growing
Split-and-merge segmentation
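Region growing combines connectivity and similarity in a direct way. The sketch below starts from a seed pixel and keeps adding 4-connected neighbors whose intensity is within a tolerance of the seed value. Comparing against the seed value (rather than, say, a running region mean) and the function name are assumptions made for the example.

```python
from collections import deque

import numpy as np

def region_grow(image: np.ndarray, seed: tuple, tolerance: float) -> np.ndarray:
    """Return a boolean mask of the 4-connected region grown from seed."""
    h, w = image.shape
    seed_value = float(image[seed])
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                # Similarity criterion: stay close to the seed intensity.
                if abs(float(image[ny, nx]) - seed_value) <= tolerance:
                    mask[ny, nx] = True
                    queue.append((ny, nx))
    return mask

image = np.array([[10, 12, 90, 11],
                  [11, 13, 95, 10],
                  [12, 11, 92, 12]], dtype=np.uint8)
region = region_grow(image, seed=(0, 0), tolerance=5)
```

The right-hand column has similar intensities to the seed but is cut off by the bright column in between, so it is not part of the region. A non-contextual threshold would have grouped both dark areas together.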

5. Fourier transform

The Fourier Transform is a useful image processing tool that decomposes an image into its sine and cosine components. The output of the transformation represents the image in the Fourier, or frequency, domain, while the input image is its spatial domain equivalent.

Each point in the Fourier domain image represents a specific frequency contained in the spatial domain image.

The Fourier Transform is used in a variety of applications, including image analysis, filtering, reconstruction, and compression.

Because the DFT (Discrete Fourier Transform) is a sampled Fourier Transform, it does not contain all of the frequencies that make up an image, but rather a set of samples large enough to fully describe the spatial domain image.
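As a small illustration, NumPy's `np.fft.fft2` computes the 2D DFT of an image, and `np.fft.ifft2` recovers the spatial image from it. The low-pass helper below, which zeroes every frequency farther than `radius` from the center of the shifted spectrum, is a hypothetical example of frequency-domain filtering rather than a method taken from the article.

```python
import numpy as np

def low_pass_fft(image: np.ndarray, radius: float) -> np.ndarray:
    """Keep only frequencies within `radius` of the zero frequency."""
    spectrum = np.fft.fftshift(np.fft.fft2(image.astype(np.float64)))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    # After fftshift the zero-frequency (DC) term sits at (h // 2, w // 2).
    distance = np.hypot(yy - h // 2, xx - w // 2)
    spectrum[distance > radius] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

image = np.arange(16, dtype=np.float64).reshape(4, 4)
spectrum = np.fft.fft2(image)                  # frequency domain samples
restored = np.real(np.fft.ifft2(spectrum))     # back to the spatial domain
only_dc = low_pass_fft(image, radius=0)        # keeps just the average level
```

The round trip reproduces the input, which matches the point that the DFT samples fully describe the spatial domain image. Keeping only the zero-frequency term leaves a flat image at the mean intensity.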
