Side-channel attacks
What is a side-channel attack?
A side-channel attack is a form of reverse engineering. Electronic circuits are inherently leaky – they produce emissions as byproducts that make it possible for an attacker without access to the circuitry itself to deduce how the circuit works and what data it is processing. Heat and electromagnetic emissions are both viable sources of information for an attacker. Because these emissions play no part in the operation of the circuit itself – they are simply side effects of it working – their use in reverse engineering has earned the term ‘side-channel analysis’ or ‘side-channel attack’. The difference between the two is largely one of intention.
How realistic is the risk?
The risk is, at the very least, costly to reputation. A number of security researchers actively use side-channel analysis to determine the vulnerability of commercial ICs. They often pick as primary targets products that are sold as possessing high security, able to protect sensitive information or financial value, in order to demonstrate that they have unforeseen vulnerabilities. A recent example is the analysis of a Microsemi field-programmable gate array (FPGA) by researchers at the University of Cambridge, UK and private company Quo Vadis Labs. The researchers claimed their analysis demonstrated the existence of a backdoor to the FPGA’s stored content. Microsemi denied that this was the case, arguing that the researchers had simply analysed a test function that could be disabled permanently by customers before deployment.
The year before, researchers at the Horst Görtz Institute for IT Security at Ruhr University in Bochum, Germany, described how they cracked the AES-256 encryption that designers can use to protect the circuits downloaded into Xilinx’s Virtex-4 and Virtex-5 FPGAs.
There is even an annual competition to determine who can crack a programmable logic-based AES implementation the fastest using side-channel analysis. The organizers claim the purpose is to help improve countermeasures to attacks on cryptographic functions.
Aside from the reputational risks to silicon vendors, there are risks to users of secure devices, as attackers may use side-channel analysis to reconstruct encryption keys and attack the system for financial gain or to gain access to secret data.
How does side-channel analysis work?
All attacks take advantage of the changes in processing behavior that are exhibited at different times during algorithm execution. There are two broad classes of side-channel analysis: simple and differential. Within those classes, attackers can use a range of side-channel properties, such as heat generated, power consumed, or execution time. For embedded systems where the attacker has access to the hardware, heat and power represent the most important sources of leaks, although timing-based attacks are likely to become more common on multitasking and multiprocessor systems where the attacker is able to load their own code or use interactions between existing applications to track behavior.
For networked systems, time-based attacks are the most feasible and have been exploited. Systems that use memory caches are particularly vulnerable to timing-based attacks because the performance of a given section of code differs significantly depending on whether its accesses hit in the cache or miss and force a slower read or write to main memory.
If an attacker is able to run their own code on the system, they can exploit timing-based attacks not just by observing the runtime of the target application but also by timing memory accesses of their own, as these will be affected by cache behavior. This can even be achieved on cloud servers. The attacker can force the cache into a particular state by running software that fills the lines in a predetermined way and then observe how the target application displaces the attacker's data.
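The fill-then-observe approach described above can be illustrated with a toy model. The sketch below is a simulation rather than an attack. It assumes a hypothetical direct-mapped cache with 64 sets and 64-byte lines, a victim that reads one 4-byte entry from a 256-entry lookup table at a secret index, and an attacker buffer that maps onto every set. All class and function names are invented for this illustration, and the simulated cache reports hits and misses directly, where a real attacker would have to infer them from access latency.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each set holds the tag of one line."""

    def __init__(self, num_sets=64, line_size=64):
        self.num_sets = num_sets
        self.line_size = line_size
        self.tags = [None] * num_sets

    def access(self, address):
        """Return True on a hit, False on a miss; a miss fills the line."""
        line = address // self.line_size
        index = line % self.num_sets
        tag = line // self.num_sets
        hit = self.tags[index] == tag
        self.tags[index] = tag
        return hit


def attacker_address(cache, set_index, way_offset=16):
    # Assumed layout: the attacker's buffer starts at a multiple of the cache
    # size, so its line number s lands in cache set s with a distinct tag.
    return (way_offset * cache.num_sets + set_index) * cache.line_size


def prime(cache):
    """Fill every set with one of the attacker's own lines."""
    for s in range(cache.num_sets):
        cache.access(attacker_address(cache, s))


def probe(cache):
    """Re-read the attacker's lines; a miss means the victim evicted that set."""
    return [s for s in range(cache.num_sets)
            if not cache.access(attacker_address(cache, s))]


def victim_lookup(cache, secret_index, table_base=0, entry_size=4):
    """Victim reads table[secret_index]; the address depends on the secret."""
    cache.access(table_base + secret_index * entry_size)


def spy_on_lookup(secret_index):
    cache = DirectMappedCache()
    prime(cache)
    victim_lookup(cache, secret_index)
    return probe(cache)
```

With these assumed parameters, `spy_on_lookup(0xA7)` reports the single set the victim touched. That reveals the upper four bits of the 8-bit table index; the lower four bits stay hidden because 16 table entries share each 64-byte line.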
The group led by G Edward Suh at Cornell University has identified network-on-chip (NoC) and shared-memory controllers as vulnerable to timing-related side-channel attacks.
Similar to systems with caches, low-end microcontrollers without dedicated encryption circuitry often take varying lengths of time to perform the computations needed to encrypt or decrypt data. Commonly used encryption systems employ a mixture of exponentiation – typically squaring – and multiplication, proceeding one bit at a time. As squaring can be achieved on a binary system using just shift operations, it takes many fewer cycles than the shift-and-add serial multiplication algorithm that will be used by low-end microcontrollers – so the attacker can look at the time it takes to process each instruction. If the microcontroller has a dedicated multiplier, this will consume more energy than the squaring operation, drawing more current and generating more heat and EMI.
In an algorithm such as RSA decryption, a multiplication will only be performed if the exponent bit being processed is 1. The attacker can simply measure changes in current to derive the key, one bit at a time.
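The key-dependent multiplication can be made concrete with a short sketch. The code below is a simplified model under stated assumptions: it uses left-to-right square-and-multiply, and it stands in for a power or timing trace with a list of operation labels ("S" for square, "M" for multiply), on the assumption that the attacker can tell the two operations apart. The function names are hypothetical and the numbers are far too small for real RSA.

```python
def modexp_with_trace(base, exponent, modulus):
    """Left-to-right square-and-multiply that logs each operation performed."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:          # most significant bit first
        result = (result * result) % modulus
        trace.append("S")                  # a square happens for every bit
        if bit == "1":
            result = (result * base) % modulus
            trace.append("M")              # a multiply happens only for 1 bits
    return result, trace


def recover_exponent(trace):
    """Rebuild the exponent from the observed operation sequence."""
    bits = []
    i = 0
    while i < len(trace):
        # Each bit starts with "S"; a following "M" marks the bit as 1.
        if i + 1 < len(trace) and trace[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)
```

For example, calling `modexp_with_trace(7, secret, 1000003)` and passing the returned trace to `recover_exponent` yields `secret` again, without the attacker ever seeing the exponent itself.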
Simple power analysis works for low-integration ICs where there is little other on-chip activity to mask the behaviour of the target circuit. For this reason, simple power analysis is not generally very useful, although it has served to uncover the encryption keys processed by low-end microcontrollers.
Differential power analysis (DPA) is a statistical method that has proved devastatingly effective at uncovering sensitive information about target circuits even when other surrounding gates are actively switching. DPA involves the attacker making a hypothesis about the behaviour or state of the target circuit – a guess at part of a full encryption key, for example, on the basis that most systems work on keys in a series of, say, 8-bit chunks.
If the guess is correct, the emissions associated with the electrical activity inside the chip will be correlated with it. If not, the activity will be uncorrelated. Over a large number of guesses and measurements, the correlated results will separate out, providing the attacker with clues as to the key value. The more measurements are taken, the more the uncorrelated noise is averaged out.
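The guess-and-correlate procedure can be sketched as a small simulation. The leakage model here is an assumption made for illustration, not something stated in the text: each measurement is taken to be the Hamming weight of a known input byte XORed with one secret key byte, plus Gaussian noise standing in for unrelated switching activity. The attacker predicts the leakage for each of the 256 possible key bytes and keeps the guess whose predictions correlate best with the measurements. All names and parameters are hypothetical.

```python
import random


def hamming_weight(value):
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def simulate_traces(secret_byte, count, noise_sigma, rng):
    """Simulated measurements: assumed Hamming-weight leakage plus noise."""
    inputs = [rng.randrange(256) for _ in range(count)]
    leaks = [hamming_weight(p ^ secret_byte) + rng.gauss(0.0, noise_sigma)
             for p in inputs]
    return inputs, leaks


def recover_key_byte(inputs, leaks):
    """Score every 8-bit key guess by correlation; return the best guess."""
    scores = {}
    for guess in range(256):
        predicted = [hamming_weight(p ^ guess) for p in inputs]
        scores[guess] = pearson(predicted, leaks)
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `simulate_traces(0x3C, 1000, 1.0, random.Random(1))` and passing the result to `recover_key_byte` illustrates the separation described above: the correct guess stands out from the 255 wrong ones, and the margin grows as the number of traces increases or the noise decreases.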
The attack on the Microsemi ProASIC3 FPGAs involved a variant of DPA called pipeline emission analysis, developed by the sponsor of the research, Quo Vadis Labs.
Are there countermeasures?
A range of countermeasures can be used to defeat, or at least slow down, side-channel analysis. They concentrate mainly on the reduction of differences between leakage values – in which the operation sequences are made less dependent on key values or intermediates – and randomisation, where the order of operations on the data is constantly changing. The latter technique is generally better at defeating the correlation techniques used in DPA, as leakage mitigation can be overcome through the use of more measurements.
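Randomising the order of operations can be illustrated with a minimal sketch. It assumes a hypothetical step that XORs a key into a block of state bytes, and it processes the byte positions in a freshly shuffled order on every call, so that a given point in time no longer corresponds to a fixed key byte from one measurement to the next. The function name is invented for this example, and Python is used only to show the structure; it makes no claim about the physical leakage of any real device.

```python
import secrets


def add_key_shuffled(state, key):
    """XOR key into state, visiting byte positions in a random order."""
    if len(state) != len(key):
        raise ValueError("state and key must have the same length")
    order = list(range(len(state)))
    # Draw a new permutation for every call from the OS random source.
    secrets.SystemRandom().shuffle(order)
    out = bytearray(len(state))
    for position in order:
        out[position] = state[position] ^ key[position]
    # The order is returned only so the example can be inspected.
    return bytes(out), order
```

The result is identical to processing the bytes in sequence; only the timing of each byte's contribution changes between runs.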
The techniques used for leakage mitigation include pre-charging registers and buses to prevent the generation of power-leakage signatures based on the change in bus values as data values are passed. Fixed-time algorithms that have no data-dependent delays can reduce the ability to detect data-related timing signatures. Performing more operations in parallel or even dummy operations will also reduce the attacker’s effective signal-to-noise ratio.
Companies such as Cryptography Research (CRI) have patented side-channel countermeasures that can be licensed by OEMs. Others, such as ESCRYPT, have developed IP cores that they claim to be far more resistant to side-channel analysis than conventional designs.
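One well-known way to remove the key-dependent operation sequence from modular exponentiation is the Montgomery ladder, which performs one multiplication and one squaring for every exponent bit regardless of its value. The sketch below is an illustration of that structure only, with a hypothetical function name and a fixed, caller-supplied bit length. Python's big-integer arithmetic is not constant-time, and a hardened implementation would also replace the branch on the key bit with a branch-free conditional swap.

```python
def ladder_modexp(base, exponent, modulus, bit_length):
    """Montgomery ladder: same operation sequence for every exponent."""
    if exponent < 0 or exponent >> bit_length:
        raise ValueError("exponent does not fit in bit_length bits")
    r0, r1 = 1 % modulus, base % modulus   # invariant: r1 == r0 * base (mod m)
    trace = []
    for i in reversed(range(bit_length)):  # always process bit_length bits
        if (exponent >> i) & 1:
            r0 = (r0 * r1) % modulus
            r1 = (r1 * r1) % modulus
        else:
            r1 = (r0 * r1) % modulus
            r0 = (r0 * r0) % modulus
        trace.extend(["M", "S"])           # one multiply, one square per bit
    return r0, trace
```

Two different exponents of the same declared length, for example `0b10110010` and `0b00000111` with `bit_length=8`, produce identical operation traces, so the sequence of squares and multiplies alone no longer reveals the key bits.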
At the 22nd Usenix Security Symposium in 2013, researchers from the IMDEA Software Institute and Saarland University presented an auditing tool – CacheAudit – for caches to demonstrate how effectively countermeasures such as preloading could mitigate cache-oriented timing-based attacks.
Boris Köpf and colleagues from IMDEA have also developed techniques to quantify the upper bounds on the amount of information that may be leaked by a particular cache implementation for a given program using the AbsInt Timing Explorer tool.
In general, the team has observed that larger caches increase information leakage because they improve the resolution of the data that the attacker can obtain. Conversely, longer line sizes reduce leakage.