FPGA-based speech encrypting and decrypting embedded system
The Texas A&M team describes an FPGA embedded system design project intended to assess the technology’s suitability for use in real-time, high-performance applications.
The test application is a speech encryption system intended for use in military and high-security environments. The team used a Xilinx Virtex II Pro platform FPGA for the project, and was also particularly interested in using the peripheral technologies made available.
We have undertaken a design to assess the viability of using FPGAs in embedded systems with real-time requirements by using one as the basis for a digital speech encryption and decryption system (Figure 1). Digital speech encryption is one of the most powerful countermeasures against eavesdropping on telephonic communications, and was therefore a good test of both the FPGA and the surrounding infrastructural technology. For the purpose of the exercise, we selected the Xilinx Virtex II Pro platform FPGA.
FPGAs as building blocks
The use of FPGAs and configurable processors has become an increasingly interesting option for embedded system development. FPGAs offer all of the features needed to implement even the most complex designs. Clock management is facilitated by on-chip phase-locked loop (PLL) or delay-locked loop (DLL) circuitry. Dedicated memory blocks can be configured as basic single-port RAMs, ROMs, FIFOs, or CAMs. Data processing capabilities, as embodied in the devices’ logic fabric, can vary widely.
The ability to link an FPGA with backplanes, high-speed buses and memories is provided through both on-chip and development kit support for various single-ended and differential I/O standards. Also, today’s FPGAs feature such system-building resources as high-speed serial I/Os, arithmetic modules, embedded processors and large amounts of memory. There are also options from leading vendors for embedded cores and configurable cores.
We developed our FPGA-based embedded system as a fully programmed, single-chip implementation of a complex function. The aim was to keep development delays to a minimum while exploiting what FPGAs offer such systems, particularly in situations that demand high accuracy and precision.
Speech encryption and decryption principles
Speech encryption has always been a very important military communications technology (Figure 2). Given the technology available today, digital encryption is considered the best technological approach. Here, an original speech signal, x(t), is first digitized into a sequence of bits, x(k), which is then encrypted digitally into a different sequence of bits, y(k), and finally transmitted.
But while digital encryption techniques can offer a very high degree of security, they are not entirely or immediately compatible with all of today’s communications networks. Most telephone systems are still largely analog. Most practical speech digitizers operate at bit rates higher than those that can easily be transmitted over standard analog telephone channels. Meanwhile, low bit-rate speech digitizers still entail relatively high design complexity but offer relatively poor quality results.
Furthermore, almost all digital encryption techniques rely on accurate synchronization between the transmitter and the receiver. Exactly the same block of bits must be processed by the encryption and decryption devices. This not only increases design complexity, but also makes transmission much more sensitive to channel conditions: a slight synchronization error due to channel impairment can completely break the transmission.
A second type of speech encryption technique is scrambling. The original speech signal, x(t), is scrambled directly into a different signal, y(t), in analog form before transmission.
Since the scrambled signal is analog, with similar bandwidth and characteristics to the original speech signal, this type of technique can be easily used with existing analog telephone systems. Some conventional scrambling techniques (e.g., frequency inversion, band splitting) do not require synchronization but today offer only relatively low levels of security. More advanced scrambling techniques have recently been developed (e.g., sample data scrambling) and are now used extensively because they continue to offer relative ease of implementation alongside improved levels of security.
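Frequency inversion, one of the conventional techniques mentioned above, can be illustrated on sampled data. The following is a minimal sketch rather than the analog circuit itself: it assumes an 8 kHz sampling rate and a 300 Hz test tone (both chosen only for illustration) and inverts the spectrum by flipping the sign of every other sample. Applying the same operation a second time restores the signal, and no frame alignment is involved.

```python
import math

def frequency_invert(samples):
    """Flip the sign of every other sample.

    Multiplying x[n] by (-1)**n shifts the spectrum by half the sampling
    rate, so a component at frequency f reappears at fs/2 - f.
    """
    return [s if n % 2 == 0 else -s for n, s in enumerate(samples)]

fs = 8000  # assumed sampling rate in Hz (illustrative)
tone = [math.sin(2 * math.pi * 300 * n / fs) for n in range(64)]

scrambled = frequency_invert(tone)       # 300 Hz tone now sits at 3700 Hz
recovered = frequency_invert(scrambled)  # the operation is its own inverse
```

Because the mapping is fixed and carries no key, anyone who applies the same inversion recovers the speech, which is why such schemes offer only a low level of security.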
A typical advanced scrambling sequence is as follows. The original speech, x(t), is first sampled into a series of sample data, x(n). This is then scrambled into a different series of sample data, y(n), which is converted back into an analog signal, y(t), for transmission.
These techniques offer a relatively high level of security and are compatible with today’s technical environment. However, like digital techniques, there is again a heavy dependence on synchronization between transmitter and receiver. The transformation from x(n) into y(n) has to be performed frame-by-frame, and exactly the same frame of sample data has to be used in the scrambling and descrambling processes for the signal to be recovered. As with digital approaches, this complicates implementation and makes transmissions very sensitive to channel conditions.
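The dependence on frame alignment can be seen in a small sketch of frame-by-frame sample scrambling. The frame length of 8 and the fixed permutation are hypothetical stand-ins for whatever a real scrambler would derive from its key; the point is only that descrambling succeeds when the receiver uses the same frame boundaries and fails when it starts one sample late.

```python
# Hypothetical frame permutation shared by transmitter and receiver.
PERM = [3, 0, 6, 1, 7, 4, 2, 5]
FRAME = len(PERM)

def scramble(samples, perm=PERM):
    """Permute samples within each complete frame: y[i] = x[perm[i]]."""
    out = []
    for start in range(0, len(samples) - len(perm) + 1, len(perm)):
        frame = samples[start:start + len(perm)]
        out.extend(frame[p] for p in perm)
    return out

def descramble(samples, perm=PERM):
    """Undo the per-frame permutation, assuming frames start at index 0."""
    out = []
    for start in range(0, len(samples) - len(perm) + 1, len(perm)):
        frame = samples[start:start + len(perm)]
        restored = [0] * len(perm)
        for i, p in enumerate(perm):
            restored[p] = frame[i]
        out.extend(restored)
    return out

x = list(range(32))                 # placeholder sample data x(n)
y = scramble(x)                     # scrambled sample data y(n)
aligned = descramble(y)             # receiver in sync: equals x
misaligned = descramble(y[1:])      # receiver one sample late: garbled
```

With the receiver in sync, `aligned` equals `x`. With a one-sample offset, every frame straddles two transmitted frames and the output no longer matches the original.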
Recently, two new sample data scrambling techniques have emerged. One scrambles the speech in the frequency domain, and the other scrambles it in the time domain. Both preserve the advantages of traditional sample data scrambling, while eliminating the requirement for synchronization in the receiver. This simplifies the system structure, and significantly improves the feasibility and reliability of sample data scrambling techniques. The basic point here is that the synchronization is only necessary as long as the scrambling and descrambling are performed frame-by-frame. It becomes unnecessary when a ‘frame’ is not defined in the operation.
Scrambling based on frequency band swapping of the analog signal can be used in a wide variety of analog and digital systems since the method can transmit speech signals over a standard telephone line with acceptable quality.
In the time domain method, a digital speech signal is encrypted. It relies on a redundant bit to protect the speech information effectively. Although the method is secure, it is hard to apply to a conventional analog transmission line because the bandwidth it requires is wide. To bring the bit rate within the bandwidth of an analog line, the speech encryption system needs a low bit-rate coding algorithm.
The main components involved in cryptography are:
- a sender;
- a receiver;
- plain text (the message before it is encrypted);
- cipher text (the message that has been encrypted);
- encryption and decryption algorithms; and
- a key or keys.
All methods of cryptographic encryption are then divided into two groups:
- symmetric-key cryptography (a.k.a. private key cryptography); and
- public-key cryptography.
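In public-key cryptography, the encryption and decryption algorithms are public, and each user holds a pair of keys: a public key, which anyone may use to encrypt, and a private key, which is kept secret and used to decrypt. Only the private key needs to be protected, not the encryption and decryption algorithms.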
In symmetric-key cryptography (Figure 3), a common key is shared by both sender and receiver. Particular advantages of a symmetric-key cryptography algorithm include the following:
- Less time is needed to encrypt a message than when using a public-key algorithm.
- The key is usually smaller, so symmetric-key algorithms are well suited to encrypting and decrypting long messages.
The main disadvantages are:
- Each pair of users must have a unique key.
- So many keys may be required that their distribution between the parties becomes difficult.
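The key-distribution burden grows quickly with the number of users. If every pair of users needs its own key, n users need n(n-1)/2 keys in total. The short sketch below works this out; the function name is purely illustrative.

```python
def pairwise_keys(users):
    """Number of unique keys needed if every pair of users shares one."""
    return users * (users - 1) // 2

for n in (2, 10, 100, 1000):
    print(n, "users need", pairwise_keys(n), "keys")
```

Ten users already need 45 keys, and a thousand users need 499,500.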
Symmetric-key algorithms can be divided into stream ciphers and block ciphers. Stream ciphers encrypt the bits of the message one at a time. Block ciphers take several bits and encrypt them as a single unit. Blocks of 64 bits have been commonly used. Today, some advanced algorithms can encrypt blocks of 128 bits.
We worked on two simulation processes for our system:
- An offline simulation using the Xilinx ISE 9.1i Project Navigator.
- An online simulation using the Virtex II Pro platform FPGA.
The offline simulation process is shown in Figure 4. It did not include a hardware implementation but was intended only to allow for self-test of the encryption/decryption algorithm. Assuming this was successful, our plan was then to move on to the online process.
The following steps were required:
- In the recording process, we first took a voice source input and then used Matlab code to convert the recorded speech (.wav file) into a text (.txt) file. The parameters supplied when creating the .txt file were the voice duration and its bit rate.
- Once the text file had been created, a testbench read it character by character and mapped the values for the encryption/decryption operation.
- After the encryption/decryption operation had been performed, the testbench again created a text file but this one contained the encrypted version of the original voice input.
- In the final step, Matlab code again read the encrypted version of the text file and played it at the specified bit rate.
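The offline flow above can be mimicked end to end in a few lines. This sketch is written in Python rather than Matlab and VHDL, and it makes several assumptions that are not in the original project: the text file holds one integer sample per line, the samples are 12-bit placeholder values, and a single hypothetical key is XORed with every sample in place of the real testbench logic.

```python
import os
import tempfile

def write_samples(path, samples):
    """Store one integer sample per line (an assumed text layout)."""
    with open(path, "w") as f:
        for s in samples:
            f.write(f"{s}\n")

def read_samples(path):
    """Read the integer samples back, skipping blank lines."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]

def process_file(src, dst, key):
    """Stand-in for the testbench: read, XOR each sample with key, write."""
    write_samples(dst, [s ^ key for s in read_samples(src)])

workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, "speech.txt")
cipher_path = os.path.join(workdir, "speech_enc.txt")
restored_path = os.path.join(workdir, "speech_dec.txt")

speech = [0, 17, 2048, 4095, 1234]   # placeholder 12-bit sample values
KEY = 0b101101001011                 # hypothetical 12-bit key

write_samples(plain_path, speech)                # "recording" step
process_file(plain_path, cipher_path, KEY)       # encryption pass
process_file(cipher_path, restored_path, KEY)    # decryption pass
```

Running the same pass twice returns the original samples, which is the self-test the offline simulation was intended to provide.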
The online process (Figure 5) included the software as well as the hardware implementation. The board onto which the VHDL code was loaded has built-in analog-to-digital and digital-to-analog converter (ADC and DAC) ports. The VHDL code written for the transmitting end therefore takes part in three operations:
- It converts analog voice into digital data by an ADC.
- It encrypts the data using symmetric-key cryptography.
- The encrypted values are sent to the outside world via a DAC.
Similarly, at the receiver end, the code participates in these three tasks:
- analog-to-digital conversion;
- use of the standard decryption algorithm; and
- digital-to-analog conversion.
This is a real-time process. Input comes continuously from a microphone and is fed to the ADC on a Virtex II Pro board. We preferred symmetric-key cryptography because the same code can perform both the encryption and the decryption operations. As noted, symmetric-key cryptography involves both sender and recipient using a common key. The sender uses an encryption algorithm and the key to encrypt the data, and the recipient uses a decryption algorithm and the same key to decrypt it. In this form of cryptography, the decryption algorithm is the reverse of the encryption algorithm. The shared key must be set up in advance and kept secret from all other parties.
A stream cipher form of the Data Encryption Standard algorithm was used here. Stream ciphers are a class of ciphers in which encryption or decryption is performed using a succession of separate keys produced by a key generator.
In our context, the key space consists of 30 different keys for 30 data samples, and the key space repeats for each subsequent group of data samples. A shift algorithm is used to obtain these 30 different keys. The algorithm therefore consists of generating the key space and XORing it with the data samples. It was designed, simulated, implemented on an FPGA, and then tested on hardware.
In this implementation, the keys used for encryption and decryption were each 14 bits long. Since the ADC gives a 14-bit output, our keys are confined to 14 bits. However, the DAC on the Virtex II Pro board takes a 12-bit input, so we discard the two most significant bits (MSBs) of the ADC output. The DAC then sends the encrypted version of the speech to the outside world.
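The key-space scheme can be sketched as follows. The article does not specify the shift algorithm, so this example assumes a 14-bit linear feedback shift register with one commonly tabulated tap set (14, 13, 12, 2) and a hypothetical seed; the actual design may well differ. It assumes 14-bit samples and keys, generates 30 keys by shifting, and XORs sample i with key i mod 30.

```python
WIDTH = 14       # assumed sample and key width in bits
NUM_KEYS = 30    # size of the key space, as stated in the text
MASK = (1 << WIDTH) - 1

def next_key(state):
    """Advance a 14-bit shift register one step (assumed LFSR taps)."""
    bit = (state ^ (state >> 1) ^ (state >> 2) ^ (state >> 12)) & 1
    return (state >> 1) | (bit << (WIDTH - 1))

def make_key_space(seed, count=NUM_KEYS):
    """Collect `count` successive register states as the key space."""
    keys, state = [], seed & MASK
    for _ in range(count):
        keys.append(state)
        state = next_key(state)
    return keys

def xor_cipher(samples, keys):
    """XOR sample i with key i mod len(keys); the same call decrypts."""
    return [(s & MASK) ^ keys[i % len(keys)] for i, s in enumerate(samples)]

keys = make_key_space(seed=0x2555)                       # hypothetical seed
speech = [(37 * n * n + 11) & MASK for n in range(75)]   # placeholder samples
encrypted = xor_cipher(speech, keys)
decrypted = xor_cipher(encrypted, keys)                  # equals speech
```

Because XOR is its own inverse, the receiver only needs the same seed and the same starting position in the key space. A repeating 30-key sequence is simple to implement in logic, but it is far weaker than a keystream that does not repeat.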
Encryption and decryption in practice
In the offline process, we used Matlab to record and play speech, to store the recorded speech in a text file, and to play a speech waveform by reading it from a text file.
As noted, Matlab scripts write to and read from the text files. However, we then had to write some VHDL code as well. One module handled the speech encryption/decryption process, and the others were needed so that the Xilinx testbench could read the text file we had generated in Matlab.
We observed that the larger the number of bits used to represent the discrete values of a sampled speech signal, the greater the clarity of the speech, although a more complex algorithm was required for higher bit representations. Ultimately, we used a 12-bit representation for each sample value and took 20 samples at a time for encryption with 20 different random bit sequences.
In the online process, we first programmed the ADC and DAC on the Virtex II Pro board to sample the incoming analog speech signal at a frequency of 1 MHz. Once we had the sampled data value for the speech signal, we encrypted it using a random bit sequence. We took a 20-state code for this purpose with a key space of 20 random bit sequences. The encrypted data values were then sent simultaneously through a DAC to speakers. For decryption, since we had used an XORing scheme, we could reuse the same code we had developed for encryption. The ADC output is a 14-bit number whereas the DAC input had to be 12 bits wide, so we had to convert the 14-bit number to a 12-bit number by rounding off the two LSBs.
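The 14-bit to 12-bit conversion is described in the text both as discarding the two MSBs and as rounding off the two LSBs. The sketch below shows what each option does to a sample value. It is an illustration with made-up function names, assuming unsigned samples, and is not taken from the project's VHDL.

```python
def drop_lsbs(sample14):
    """Keep the 12 most significant bits (discard the two LSBs)."""
    return (sample14 & 0x3FFF) >> 2

def round_lsbs(sample14):
    """Round to the nearest 12-bit value, saturating at full scale."""
    return min(((sample14 & 0x3FFF) + 2) >> 2, 0xFFF)

def drop_msbs(sample14):
    """Keep the 12 least significant bits (discard the two MSBs)."""
    return sample14 & 0x0FFF

sample = 0x2ABF                 # arbitrary 14-bit example value
print(hex(drop_lsbs(sample)))   # 0xaaf
print(hex(round_lsbs(sample)))  # 0xab0
print(hex(drop_msbs(sample)))   # 0xabf
```

Removing LSBs keeps the overall shape of the waveform at reduced resolution. Removing MSBs keeps full resolution only for values below 4096, and larger values wrap around.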
Department of Electrical & Computer Engineering
Dwight Look College of Engineering
Texas A&M University
Zachry Engineering Center
T: 1 979 845 7441