Document Type : Original Article
Authors
1 Department of Electronics and Electrical Communications Engineering Faculty of Electronic Engineering Menoufia University: Menouf, Egypt
2 Electronics and Electrical Communications Engineering Dept., Faculty of Electronics Engineering, Menouf, Menoufia University, EGYPT
3 Department of Electronics and Electrical Communication Engineering, Menoufia University, Menouf, Menoufia, Egypt
4 Department of Electronics and Electrical Communications, Faculty of Electronic Engineering, Menoufia University, 32952, Menouf, Egypt.
Abstract
Three cancelable speaker identification techniques are presented, each based on estimating the spectrogram of an encrypted speech signal. The techniques use chaotic encryption, the Radon transform, and the RSA algorithm, respectively. Extensive simulations have been carried out to verify the efficiency of the proposed encryption algorithms, and the techniques are compared to determine the most accurate one. The simulation study uses 20 different voice samples from men and women. First, the original signals are encrypted using the proposed encryption algorithms. Then, the spectrograms of the encrypted signals are estimated and stored in the database instead of the original ones. In the experimental results, the equal error rate (EER), false rejection rate (FRR), false acceptance rate (FAR), and area under the ROC curve (AROC) are estimated for each technique. The ROC curve and the genuine and impostor distributions are also estimated. The cancelable features of each technique are also evaluated at different noise levels. The comparison shows that the first technique, which uses chaotic encryption, is clearly sensitive to changes in the noise variance. The second technique, which uses the Radon transform, shows better results at the expense of a longer execution time. The RSA-based technique gives the most accurate results with the shortest execution time, which makes it better at recognizing the user and more resistant to attacks. It is also more secure because it maintains the confidentiality of the data.
Keywords
Main Subjects
Biometrics are used to establish a person’s identity. Biometric authentication is now widely used in applications such as border control, secure computer systems, secure banking services, mobile phones, and credit cards. With biometrics, personal identification is based on who a person is rather than on what the person has (a card, code, or key) or knows (a password), which makes it more secure. In addition, an individual’s biometrics are more difficult to copy [1]. Ideal biometric information has several characteristics: universality, meaning that every individual can be characterized by it; uniqueness, meaning that it is as dissimilar as possible between any two individuals; and permanency [2].
Speaker identification is the process of identifying a person from his or her speech signals. It is accomplished through training and testing operations [3], as shown in Fig. 1. Training requires feature extraction followed by speaker modeling with a certain classifier, whereas testing requires a matching operation.
Biometric systems can work in the verification or identification modes [5]. The solution to the problem of attacks is to adopt cancelable biometric systems. Information security and user privacy are very important concerns in biometric-based systems. One of the main solutions that meets these requirements is cancelable biometrics, which keeps protecting the user even if the biometric system is breached. The original biometric template is intentionally distorted before it is registered in the authentication system [7], so that only a distorted, revocable model is saved in the biometric database. In traditional biometric systems, most people are reluctant to present their biometric traits because they are worried about their privacy. Cancelable biometrics resolve these privacy issues because they prevent the system from storing the users’ original biometrics. Cancelable templates are generated through a set of intentional, systematic, and repeatable distortions of biometric signals with the aim of protecting user information [8]. Cancelable biometric schemes have four main objectives. The first is diversity: multiple cancelable patterns can be created from the same biometric for several applications. The second is renewability (revocability): a pattern can be directly cancelled and re-issued if it is breached. The third is non-invertibility, which is adopted to stop fraud: it has to be difficult to recover information about the original biometric features from the transformed forms. Finally, the recognition performance using a transformed template should remain high [9]. Cancelable biometrics achieve a high standard of privacy by allowing various versions to be associated with the same biometric data. In each registration, several transformations can be performed to create the protected templates, which makes it possible to generate different biometric templates for the same user in different databases. Transforms can be applied at two levels to create cancelable templates: the signal level and the feature level. The transformations are designed to make it hard to restore the original templates by inverting the transformed ones, thus providing the required resistance against potential attackers. In addition, the transformed templates have to maintain the discrimination ability between patterns [10-12].
II. PROPOSED CANCELABLE SPEAKER IDENTIFICATION TECHNIQUES
This section presents the three proposed techniques for cancelable speaker identification. All of them depend on spectrogram estimation preceded by encryption; they differ only in the encryption algorithm. The first depends on the chaotic Baker map for encryption. The second adds a Radon transform step after encryption. The third depends on the RSA algorithm for encryption.
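The spectrogram estimation step shared by the three techniques can be illustrated with a short sketch. It is not the authors' implementation: the function name `spectrogram`, the Hann window, the frame length of 256 samples, and the hop of 128 samples are all assumptions chosen for illustration, since the text does not state these settings.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Power spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative assumptions, not values from the paper.
    Returns an array of shape (frame_len // 2 + 1, n_frames).
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT of every frame; rows become frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2
```

In the proposed techniques, the input to such a routine would be the encrypted speech signal rather than the original one, so that only the spectrogram of the encrypted signal is stored.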
A. Proposed Technique Based on Chaotic Baker Map
This technique begins with chaotic Baker map encryption as a randomization tool for speech signals. Chaotic systems, in general, allow permutation of certain data in matrix format [13]. Chaotic Baker map is one of the most popular maps
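The permutation performed by a Baker map can be sketched as follows. The sketch assumes the commonly used discretized (generalized) Baker map, in which an N x N matrix is split into vertical strips whose widths form the secret key; the text does not state which formulation or key the authors use, so the function name `baker_map_permute` and the key `(2, 4, 2)` are hypothetical.

```python
import numpy as np

def baker_map_permute(block, key):
    """Permute a square matrix with a discretized Baker map.

    key is a tuple of strip widths that sum to N, each dividing N.
    This formulation is an assumption, not taken from the paper.
    """
    n = block.shape[0]
    if block.shape != (n, n):
        raise ValueError("block must be square")
    if sum(key) != n or any(n % k for k in key):
        raise ValueError("key entries must divide N and sum to N")
    out = np.empty_like(block)
    offset = 0  # index of the first row of the current strip
    for k in key:
        q = n // k
        for r in range(offset, offset + k):
            for s in range(n):
                # Stretch the k x N strip into an N x k strip.
                new_r = q * (r - offset) + s % q
                new_s = s // q + offset
                out[new_r, new_s] = block[r, s]
        offset += k
    return out

# Example: scramble an 8 x 8 block with the hypothetical key (2, 4, 2).
scrambled = baker_map_permute(np.arange(64).reshape(8, 8), (2, 4, 2))
```

To apply such a map to speech, one possible approach (again an assumption) is to reshape a segment of N*N samples into an N x N matrix, permute it, and flatten it back into a one-dimensional signal before spectrogram estimation.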
B. Proposed Technique Based on Radon Transform
This technique applies the Radon transform after Baker map encryption. The Radon transform is an integral transform that aggregates spectrogram values along projection lines. Fig. 6 illustrates three projections of a matrix, M, as an example; each projection captures a different view of the information in M, one for each of the three angles. Clearly, the more projections are used, the more information is obtained from the image. This leads to better performance, but it is very time-consuming. To address this problem, we first choose eight projections; a comprehensive study has been carried out on this choice. Second, the eight projections are applied over a reduced search space. This greatly reduces the computational cost [15-17].
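A projection-based transform with eight angles can be approximated in a few lines. The sketch below is a simplified nearest-bin approximation of the Radon transform, written only to show how a matrix is reduced to eight one-dimensional projections; it is not the authors' implementation, and the function name `radon_projections` and the equally spaced angles over 180 degrees are assumptions.

```python
import numpy as np

def radon_projections(matrix, n_angles=8):
    """Approximate Radon projections of a 2-D array.

    Each pixel is added to the bin nearest to its signed distance
    x*cos(theta) + y*sin(theta) from the matrix centre.
    Returns (angles_in_degrees, sinogram) with one column per angle.
    """
    matrix = np.asarray(matrix, dtype=float)
    rows, cols = matrix.shape
    y, x = np.mgrid[0:rows, 0:cols]
    x = x - (cols - 1) / 2.0  # centre the coordinates
    y = y - (rows - 1) / 2.0
    half_diag = int(np.ceil(np.hypot(rows, cols) / 2.0))
    n_bins = 2 * half_diag + 1
    angles = np.arange(n_angles) * 180.0 / n_angles  # assumed spacing
    sinogram = np.zeros((n_bins, n_angles))
    for j, theta in enumerate(np.deg2rad(angles)):
        t = x * np.cos(theta) + y * np.sin(theta)
        bins = np.floor(t + 0.5).astype(int) + half_diag
        sinogram[:, j] = np.bincount(
            bins.ravel(), weights=matrix.ravel(), minlength=n_bins
        )
    return angles, sinogram
```

Every projection sums the whole matrix along a different direction, so increasing `n_angles` adds information at a roughly proportional computational cost, which is the trade-off described above. A production implementation would more likely rely on an interpolating routine such as `skimage.transform.radon`.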
C. Proposed Technique Based on RSA Algorithm
This technique uses the RSA algorithm to encrypt the speech signals.
The Rivest-Shamir-Adleman (RSA) algorithm is widely used by modern computers to encrypt and decrypt messages. It is an asymmetric encryption algorithm, meaning that it uses two different keys. This is called public-key cryptography, because one of the keys can be given to anyone, while the other key has to be kept secret. The algorithm is built on the fact that finding the factors of a large composite number is hard. When the factors are prime numbers, the problem is called prime factorization [19]. RSA involves a public key and a private key. The public key can be known to everybody and is used to encrypt messages. Messages encrypted with the public key can only be decrypted with the private key, which has to be kept secret. Calculating the private key from the public key is very hard. RSA is essentially an authentication system suitable for the Internet. The algorithm was introduced by its inventors in 1977. It is one of the most prevalent asymmetric-key cryptosystems and has been included in the Netscape and Microsoft Web browsers. To generate the public and private key pair for encryption and decryption, the algorithm first selects two large prime numbers and multiplies them [20].
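The key generation and encryption steps described above can be illustrated with textbook RSA on toy numbers. This is a sketch only: the primes 61 and 53 and the exponent 17 are far too small to be secure, no padding is used, and the way speech samples are mapped to integers (here, 8-bit quantized values) is an assumption, since the text does not specify it. The function names are hypothetical.

```python
def make_toy_rsa_keys(p=61, q=53, e=17):
    """Build a textbook RSA key pair from two small primes (insecure toy values)."""
    n = p * q                  # public modulus, the product of the two primes
    phi = (p - 1) * (q - 1)    # Euler's totient of n
    d = pow(e, -1, phi)        # private exponent (requires Python 3.8+)
    return (e, n), (d, n)

def rsa_apply(values, key):
    """Raise each integer to the key exponent modulo n.

    With the public key this encrypts; with the private key it decrypts.
    Every value must be smaller than n.
    """
    exponent, n = key
    return [pow(v, exponent, n) for v in values]

# Example: encrypt and recover a few assumed 8-bit quantized speech samples.
public_key, private_key = make_toy_rsa_keys()
samples = [0, 65, 128, 255]
encrypted = rsa_apply(samples, public_key)
recovered = rsa_apply(encrypted, private_key)
```

Recovering the private exponent from the public key requires factoring `n`, which is trivial for these toy values but infeasible when the two primes are large.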
III. RESULTS AND PERFORMANCE ANALYSIS
This section presents the simulation results for all proposed cancelable speaker identification techniques. A unified simulation scenario is adopted. Twenty speech signals are used and encrypted first. The spectrograms of the encrypted signals are then estimated and stored in the database. All simulation experiments on all techniques depend on genuine and impostor tests. Correlation values for genuine and impostor records are estimated, and hence the probability density functions (PDFs) of the correlation values for the genuine and impostor tests are obtained. The intersection points of these PDFs are taken as the threshold values. Based on these values, new records are classified as either genuine or impostor records.
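The thresholding procedure can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the correlation score is taken to be the Pearson correlation between flattened spectrograms, each PDF is approximated by a Gaussian fitted to the scores, and the genuine mean is assumed to lie above the impostor mean. All function names are hypothetical.

```python
import numpy as np

def correlation_score(spec_a, spec_b):
    """Pearson correlation between two spectrograms of equal shape."""
    a = np.asarray(spec_a, dtype=float).ravel()
    b = np.asarray(spec_b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])

def normal_pdf(x, mean, std):
    """Gaussian density, used here as an assumed model of the score PDFs."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def intersection_threshold(genuine, impostor, n_grid=2001):
    """Locate where the fitted genuine and impostor PDFs cross.

    The search is limited to the interval between the two means.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    grid = np.linspace(impostor.mean(), genuine.mean(), n_grid)
    gap = (normal_pdf(grid, genuine.mean(), genuine.std())
           - normal_pdf(grid, impostor.mean(), impostor.std()))
    return float(grid[np.argmin(np.abs(gap))])

def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores at or above the threshold are accepted."""
    far = float(np.mean(np.asarray(impostor) >= threshold))
    frr = float(np.mean(np.asarray(genuine) < threshold))
    return far, frr
```

A new record would then be scored against the stored spectrogram with `correlation_score` and accepted as genuine only if its score is at or above the threshold.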