Development of 3D Ear Recognition System using Matlab
A beginner's guide on processing 3D ear images
Gautam Kumar, Jul 10

Face, iris, and fingerprint are widely used biometric traits for the authentication of a person.
The use of state-of-the-art deep learning techniques in computer vision and image processing has pushed the accuracy of such systems close to 100%.
However, biometric systems based on these traits have drawbacks: fingerprint-based authentication in particular is easy to forge, and iris samples are difficult to acquire.
Therefore, researchers are trying to develop new traits for authentication.
Recent studies suggest that the human ear has a unique shape and pattern that can be used to recognize a person.
It is reported that the structure and shape of a person's two ears are almost the same, yet distinctly different from those of other people, which makes the ear a suitable trait for individual recognition.
Unlike the face, the ear is free from aging effects, i.e., its shape does not change with age and time.
Unlike fingerprint and iris, an ear-based biometric system does not need user cooperation to acquire samples; the ear can be captured without the user's knowledge in an unconstrained environment.
The geometric shape features that can be used to develop an ear-based biometric system are shown in Figure 1.
Figure 1: Key components of ear (source)
Figure 1 shows the key morphological components, including the outer helix (i.e., the outer rim of the ear), antihelix, lobe, antitragus, tragus, concha, and crus of helix.
Feature extraction from a 2D image is relatively easy because gray values corresponding to color, shape, and texture are available.
In a 3D image, we have only (X, Y, Z) points and depth values, which makes it difficult to analyze the geometric shape of the key components.
Before extracting features from a 3D ear image, I will discuss the pre-processing steps for the development of an ear-based biometric system.
One of the difficult parts is finding the ear position and cropping it from the entire image.
In this article, I used the depth value of the nose tip to crop the ear from the whole side-face image.
About Database
We used the University of Notre Dame (UND) databases, which are freely available for public use. The acquired database consists of several collections, as explained below:
Collection E: 114 human subjects, 464 visible-light face side profile (ear) images, captured in 2002.
Collection F: 302 human subjects, 942 3D (+ corresponding 2D) profile (ear) images, captured in 2003 and 2004.
Collection G: 235 human subjects, 738 3D (+ corresponding 2D) profile (ear) images, captured between 2003 and 2005.
Collection J2: 415 human subjects, 1800 3D (+ corresponding 2D) profile (ear) images, captured between 2003 and 2005.
However, in our experiment we used Collection F and Collection G samples.
Reading 3D depth images
Sample images in the database are stored as 3D raw images. A corresponding 2D RGB image is also provided by UND, but I have used only the 3D raw images for experimentation.
You can read the raw images and visualize each plane of X, Y, and Z as discussed here.
The whole process is the same as for 3D face visualization.
However, cropping the ear region is slightly different from cropping the facial part.
Next, I will discuss how to crop the ear region from the entire 3D image.
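The article's implementation is in Matlab, but the idea is easy to sketch in any array language. Below is a hedged Python/NumPy illustration, not the UND file parser: it assumes the range data has already been read into an N x 3 array of (X, Y, Z) points, with invalid pixels marked as NaN, and simply reports the value range of each axis, which is the first thing to check before cropping.

```python
import numpy as np

def depth_summary(points):
    """Print the value range of each axis of an (N, 3) point cloud.

    Assumes the UND range image has already been parsed into an
    N x 3 array of (X, Y, Z) points, with invalid pixels as NaN.
    """
    valid = points[~np.isnan(points).any(axis=1)]  # drop invalid rows
    for name, column in zip("XYZ", valid.T):
        print(f"{name}: min={column.min():.1f}, max={column.max():.1f}")
    return valid

# Toy stand-in for a parsed range image.
cloud = np.array([[0.0, -55.0, 2000.0],       # nose-tip-like point (max depth)
                  [5.0, -30.0, 1900.0],       # ear-tip-like point (min depth)
                  [np.nan, np.nan, np.nan]])  # invalid pixel
depth_summary(cloud)
```

The printed ranges correspond to what Figure 2 shows graphically, e.g. depth (Z) spanning roughly 1900 to 2000 units.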
Ear Detection and Cropping
Before cropping the ROI (Region of Interest), let's first look at the orientation of the images in the database.
Figure 2: 3D visualization of database image.
Figure 3: Visualization in Meshlab
Figures 2 and 3 show the same subject at different angles.
You can rotate the data and bring it into the same orientation using Matlab.
From the figures shown above, it is clear that each database image contains a side face, and only the ear part of the whole face is important for us.
Therefore, we have to crop only the ear from the entire face.
To crop the ear region, let's first understand the depth information available in each image.
If the camera used to acquire the image is placed on the z-axis facing the origin and the person is looking toward the increasing y-axis, the ear tip will have the minimum depth value and the nose tip the maximum depth value.
If you look closely at Figure 2, you can notice that the depth values start from 1900 and go up to 2000; the points around 1900 belong to the ear tip, which is closest to the camera.
To understand more about the depth of the ear tip, please have a look at Figure 5 of this article. The only difference is that there I calculated the nose tip, whereas here we are talking about the ear tip.
From Figure 3, it can be observed that the ear tip has the minimum z-value and the nose tip has the maximum y-value.
Looking at Figure 2, we can say that the nose tip has a value of around -55 units in the y-direction.
The distance between the ear and the nose tip is around 25 units, and it is almost fixed for each image, unless the image was captured at a different angle.
We will discuss that case later in this post. For now, I empirically found that a rectangle 50 units high and 25 units wide is sufficient to cover the entire ear region, and this rectangle is placed at +/- 25 units from the nose point.
Here, +25 is used when the nose tip has a negative y-value, i.e., the right ear is imaged (as shown in Figure 2), and -25 is used when the y-value of the nose tip is positive and the person is looking toward increasing y.
This is the case when the left ear is imaged by the sensor.
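Putting the numbers above together, here is a hypothetical NumPy sketch of the cropping heuristic. The function name, the choice of mapping the 50-unit height to the x-axis, and the toy coordinates are my own assumptions; the article's actual Matlab code is in the linked repository.

```python
import numpy as np

def crop_ear(points, height=50.0, width=25.0, offset=25.0):
    """Crop a rectangular ear ROI from a side-profile point cloud.

    Heuristic from the article: the nose tip is the point of maximum
    depth (z); the 50 x 25 unit ear rectangle starts `offset` units
    from the nose tip along y, toward the ear. Mapping the height to
    the x-axis is my assumption.
    """
    nose = points[np.nanargmax(points[:, 2])]    # nose tip: max depth
    sign = 1.0 if nose[1] < 0 else -1.0          # right ear vs. left ear
    y0 = nose[1] + sign * offset                 # near edge of the ROI
    y_lo, y_hi = sorted((y0, y0 + sign * width))
    x_lo, x_hi = nose[0] - height / 2, nose[0] + height / 2
    mask = ((points[:, 1] >= y_lo) & (points[:, 1] <= y_hi) &
            (points[:, 0] >= x_lo) & (points[:, 0] <= x_hi))
    return points[mask]

# Toy side-profile cloud: a "nose" point and two candidate ear points.
pts = np.array([[0.0, -55.0, 2000.0],   # nose tip (max depth, negative y)
                [2.0, -28.0, 1900.0],   # falls inside the ear rectangle
                [3.0, -60.0, 1950.0]])  # falls outside
roi = crop_ear(pts)
```

With the nose tip at y = -55, the rectangle spans y in [-30, -5], so only the second point survives the crop.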
To understand the cropping process of the ROI, please see Figure 4.
Figure 4: Cropping ROI
However, some image samples were collected at a different orientation, as shown in Figure 5.
Figure 5: Image captured at a different angle
For such images, I did not find any relation between the ear and the nose tip; therefore, I did not select those images for further processing. (Any help regarding the processing of these images would be appreciated.)
Using the ROI-cropping method shown in Figure 4, the resulting image is shown in Figure 6.
Figure 6(a), Figure 6(b)
Figures 6(a) and 6(b) show the cropped ear region at different angles.
These images may contain holes and noise; therefore, it is necessary to pre-process them carefully.
In the next section, we will discuss methods for spike removal, hole filling, and denoising.
Despiking, Filling Holes, and Denoising
Despiking: The 3D scans are noisy and contain spikes, so smoothing techniques must be applied. In our study, we extended the concept of the 2D weighted median filter to 3D face images: the technique filters the 3D dataset using a weighted median implementation of mesh median filtering.
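As a rough illustration of despiking, the sketch below applies a plain (unweighted) median filter to a gridded depth map and replaces only the outlier pixels. The article's mesh-based weighted median filter is more involved; the threshold and window size here are purely illustrative.

```python
import numpy as np
from scipy.ndimage import median_filter

def despike(depth, size=3, threshold=10.0):
    """Replace spikes in a gridded depth map with the local median.

    Simplified stand-in for the weighted mesh median filter described
    in the article: a pixel deviating from its local median by more
    than `threshold` is treated as a spike. Parameter values are
    illustrative.
    """
    med = median_filter(depth, size=size)   # local median of each pixel
    spikes = np.abs(depth - med) > threshold
    out = depth.copy()
    out[spikes] = med[spikes]               # replace only the outliers
    return out, spikes

# A flat 1900-unit depth patch with a single 2000-unit spike.
depth = np.full((5, 5), 1900.0)
depth[2, 2] = 2000.0
clean, spikes = despike(depth)
```

Replacing only flagged pixels, rather than median-filtering everything, preserves fine surface detail away from the spikes.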
Hole filling: Removing spikes creates holes, so it is necessary to fill them. For this purpose, we used 3D interpolation.
Among the available interpolation techniques, we used 'cubic': the interpolated value at a query point is based on a cubic interpolation of the values at the neighboring grid points in each dimension, implemented as a cubic convolution.
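A minimal Python sketch of cubic hole filling on a gridded depth map, using SciPy's `griddata`; this stands in for Matlab's cubic interpolation and is not the author's code.

```python
import numpy as np
from scipy.interpolate import griddata

def fill_holes(depth):
    """Fill NaN holes in a gridded depth map by cubic interpolation
    from the surrounding valid samples."""
    h, w = depth.shape
    gy, gx = np.mgrid[0:h, 0:w]
    valid = ~np.isnan(depth)
    filled = griddata((gy[valid], gx[valid]), depth[valid],
                      (gy, gx), method='cubic')
    # Cubic interpolation cannot extrapolate; fall back to the nearest
    # neighbour for any point still NaN (e.g. outside the convex hull).
    still = np.isnan(filled)
    if still.any():
        filled[still] = griddata((gy[valid], gx[valid]), depth[valid],
                                 (gy[still], gx[still]), method='nearest')
    return filled

# A linear depth ramp with one hole punched at the centre.
gy, gx = np.mgrid[0:5, 0:5]
depth = (gy + gx).astype(float)
depth[2, 2] = np.nan
filled = fill_holes(depth)
```

On this linear ramp, the interpolated value at the hole recovers the expected depth of 4.0.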
Noise removal: I used a 3D Gaussian filter to remove noise.
The code used for these pre-processing steps is available in my GitHub repository, linked at the end of this post.
Finally, the pre-processed ear image is shown in Figure 7.
We found that some holes were still present after the pre-processing steps, which means further improvement is possible; you can use other methods for this purpose that may generate better features.
Figure 7: Pre-processed ear image
Finally, I generated a mesh from the cloud points; this mesh is used to extract features from the cropped 3D ear image, which I will discuss in the next article.
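One common way to build such a mesh (not necessarily the method used in this project) is a 2D Delaunay triangulation over the (x, y) coordinates, treating z as the vertex depth:

```python
import numpy as np
from scipy.spatial import Delaunay

def mesh_from_cloud(points):
    """Triangulate an ear point cloud into a mesh.

    Illustrative approach only: 2D Delaunay triangulation on (x, y),
    keeping z as the vertex depth. Returns an (M, 3) array of triangle
    vertex indices into `points`.
    """
    tri = Delaunay(points[:, :2])
    return tri.simplices

# Four corner points of a unit square yield two triangles.
cloud = np.array([[0.0, 0.0, 1900.0],
                  [1.0, 0.0, 1905.0],
                  [0.0, 1.0, 1902.0],
                  [1.0, 1.0, 1908.0]])
faces = mesh_from_cloud(cloud)
```

This works because a range scan is a height field, so projecting to the (x, y) plane loses no connectivity information.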
The complete code is available at my GitHub repository.
You can download and use it for pre-processing of your 3D ear dataset.
Feel free to upvote if you find this post helpful.
I would like to thank Jayeeta Chakraborty, who contributed equally to the development of this project.
In the next post, I will share Iterative Closest Point (ICP) method for 3D ear recognition.
I hope this post helps beginners start working with 3D databases.