Undervolting ARM Processors


Scrooge Attack:

Undervolting ARM Processors for Profit


(Practical experience report)

Christian Göttel∗, Konstantinos Parasyris†, Osman Unsal‡, Pascal Felber∗, Marcelo Pasin∗, Valerio Schiavoni∗
† Lawrence Livermore National Laboratory, [email protected]
‡ Barcelona Supercomputing Center, [email protected]
∗ Université de Neuchâtel, Switzerland, [email protected]

arXiv:2107.00416v2 [cs.DC] 2 Jul 2021

©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Presented in the 40th IEEE International Symposium on Reliable Distributed Systems (SRDS ’21).

Abstract—Latest ARM processors are approaching the computational power of x86 architectures while consuming much less energy. Consequently, supply follows demand with Amazon EC2, Equinix Metal and Microsoft Azure offering ARM-based instances, while Oracle Cloud Infrastructure is about to add such support. We expect this trend to continue, with an increasing number of cloud providers offering ARM-based cloud instances. ARM processors are more energy-efficient, leading to substantial electricity savings for cloud providers. However, a malicious cloud provider could intentionally reduce the CPU voltage to further lower its costs. Running applications malfunction when the undervolting goes below critical thresholds. By avoiding critical voltage regions, a cloud provider can run undervolted instances in a stealthy manner.

This practical experience report describes a novel attack scenario: an attack launched by the cloud provider against its users to aggressively reduce the processor voltage for saving energy to the last penny. We call it the Scrooge Attack and show how it could be executed using ARM-based computing instances. We mimic ARM-based cloud instances by deploying our own ARM-based devices using different generations of Raspberry Pi. Using realistic and synthetic workloads, we demonstrate to which degree of aggressiveness the attack is relevant. The attack is unnoticeable by our detection method up to an offset of −50 mV. We show that the attack may even remain completely stealthy for certain workloads. Finally, we propose a set of client-based detection methods that can identify undervolted instances. We support experimental reproducibility and provide instructions to reproduce our results.

Index Terms—ARM, undervolting, attack, detection

I. INTRODUCTION

Cloud providers continuously upgrade their commercial offerings to adapt to market and customer needs. While the vast majority of them offer computing instances based on x86 processors, the availability of ARM-based cloud instances is quickly expanding. ARM processors are increasing their market share of server-grade machines [1, 2, 3, 4, 5, 6, 7, 8], thanks to additional energy and performance improvements. Amazon [5] deploys ARM-based processors currently shipped in off-the-shelf ARM hardware (their AWS Graviton is essentially a more powerful quad-Raspberry Pi 4B [9]). Scaleway offered instances based on custom-made ARM SoCs with servers smaller than a business card [10]. Table I summarizes a subset of available server-grade ARM processors, supported instruction set architectures (ISA), and providers deploying this hardware. Several generations of ARM processors [1, 2, 5, 6, 7] are currently available across cloud providers. ARM processors also started reaching into the supercomputing market segment. We expect an increasing availability of ARM Neoverse [11] processors and future server-grade ARM instances to close the performance gap to x86.

Table I: List of server-grade and mimicking ARM processors with their supported ISA. ‘*’: used in our evaluation (see §V).

Processor                ISA        Cloud provider
Ampere Altra             ARMv8.2+   Equinix, Oracle
Ampere eMAG 8180         ARMv8      Equinix
AWS Graviton             ARMv8      AWS
AWS Graviton 2           ARMv8.2    AWS
Fujitsu A64FX            ARMv8.2    -
Huawei Kunpeng 920       ARMv8.2    -
Marvell ThunderX         ARMv8      Equinix
Marvell ThunderX2        ARMv8.1    Microsoft Azure
NVIDIA Grace             TBA        -
Broadcom BCM2837(B0)*    ARMv8      -
Broadcom BCM2711*        ARMv8      -

Processors offer different power management mechanisms to adjust frequencies and voltages. The energy footprint of a single execution step (i.e., one single instruction on a processor) is fairly independent of the CPU frequency but dependent on the CPU voltage [12]. Decreasing the CPU voltage below the nominal value to conserve power is called undervolting¹. Besides energy savings, undervolting directly influences core temperature and can also reduce core aging [13]. Undervolting, however, incurs the risk of introducing soft [14] and hard errors related to timing violations [15]. These types of errors can be mitigated by carefully analyzing the guardband of processors [16].

¹ Notice that Dynamic Voltage and Frequency Scaling (DVFS) differs from undervolting by decreasing frequency as well as voltage.

In this practical experience report, we consider a scenario where processors supporting a cloud infrastructure are undervolted by an excessively economic and malicious cloud provider (a scrooge, see §III-A) to profit from additional electricity bill savings, while cloud users (from here on referred to as users) observe similar performance. Unfortunately, undervolting cannot be applied arbitrarily. In fact, it comes at the cost of processor reliability when the supplied voltage is insufficient to drive the processor’s frequency. We believe this is a risk that malicious cloud providers are willing to take. For users, undervolting opens up a new attack vector against their cloud
applications (see our threat model in §III). The main research questions we address in this work are:

RQ1: What is necessary for a malicious cloud provider in order to pull off a stealthy undervolting strategy?

RQ2: Does a cloud user have the ability to uncover such an undervolting strategy?

To answer those questions, we need to lay the foundation to better understand consequences of (arbitrary) undervolting, both from the cloud provider and client perspective. In fact, depending on supply voltage, frequency, load, and temperature of the CPU, execution steps can yield erroneous computations. While recent attacks [17, 18] have demonstrated how undervolting can be effectively exploited to gain access to sensitive information, we deal with a different threat model: the infrastructure is undervolted on purpose by a powerful attacker (i.e., the cloud provider), at the risk of exposing hard-to-detect unreliable computing instances for users. Without physical access to instances, and without being able to directly manipulate the supply voltage or frequency, a user’s options remain limited. Nevertheless, a user can adjust the processor’s load and operating performance points (§II-B) to influence its heat dissipation. In order to operate under full load, the processor has to be set to the highest operating performance point, which implies the highest frequency and supply voltage setting. Consequently, undervolted processors present a higher probability for erroneous computations to occur because they are unable to maintain high frequencies. This probability is further increased by the propagation delay due to high operating temperature. If erroneous computations result in faults, one can observe application crashes or kernel panics, leading to cloud instance unavailability. While service level agreements (SLA) [19] typically cover such scenarios, a malicious provider might try to balance its actions to only yield erroneous computations not resulting in faults, basically overcoming SLA protections. For this reason, we designed a non-selective fault injection method for detecting the scrooge attack. The sole purpose of the detection method is to yield intentional application crashes or kernel panics on undervolted instances such that the user is covered by the SLA. While interesting, we consider cloud providers or users exploiting undervolting to leak sensitive information [20, 21] to be out of scope of this work.

Interestingly, ARM-based Raspberry Pis have already been collocated in cloud data centers [22]. With the intent to reproduce and study the dynamics of such deployments (and, on a smaller scale, mimic AWS using ARM nodes), we first study the effects of undervolting on three different ARM processors, focusing on energy savings. Figure 1 shows different normalized energy to throughput ratios (ETR) [12] obtained with ARM Cortex-A processors for the three latest Raspberry Pi models (3B, 3B+, and 4B [9]) at their lowest operational undervolting setting (−75 mV for 3B and 3B+, and −15 mV for 4B) compared to nominal voltage (i.e., 0 mV, no undervolting). As shown, undervolting directly influences energy spent per operation, without negatively affecting throughput. Lower normalized ETR values indicate higher energy efficiency for a given throughput. On average across different throughput values, undervolting achieved 5 % to 13 % better energy efficiency on the 3B and 3B+ and 0 % to 3 % on the 4B. In essence, these results suggest that a cloud provider can indeed undervolt ARM-based instances, without directly compromising the observed performance.

[Figure 1: Normalized energy to throughput ratio (ETR) for undervolted Raspberry Pi model B platforms operating at maximum throughput. Series: 3B, 3B+, 4B. x-axis: Throughput [Mop/s]; y-axis: Energy (normalized).]

Our contributions are as follows:
• We describe a novel attack scenario based on undervolting by a scrooge cloud provider to lower energy costs.
• We demonstrate how cloud users can with a certain probability detect this novel scrooge attack.
• We provide a temperature-based guardband analysis to narrow down the operation voltage range of an ARM-based processor (§V-D).
• We describe how our analysis can be used to automatically identify undervolted instances (§V-E).
• We present potential energy gains of undervolting systems using a reliability benchmark (§V-F). In general gains can reach up to 37 %.

This practical experience report is organized as follows. Section II provides background on the low-level mechanisms used to undervolt a processor and the Raspberry Pi platform as well as the associated side-effects. Our threat model is given in Section III. We overview our detection method in Section IV. Our in-depth experimental evaluation is presented in Section V. We discuss and review related work in Section VI and Section VII, before concluding in Section VIII.

II. BACKGROUND

This section defines more precisely a few concepts related to power management (§II-B), i.e., frequency and voltage scaling and associated techniques such as Dynamic Voltage and Frequency Scaling (DVFS) and Adaptive Voltage Scaling (AVS). In §II-D we explain the relation between such techniques and how they affect the overall reliability of a system.

A. ARM in data centers

Collocation offers allow users to either ship or buy Raspberry Pis in order to deploy lightweight workloads on this
low-energy hardware and thus free up resources on high-energy x86 hardware. Furthermore, Raspberry Pis are the size of credit cards and have much lower cooling demands, which allows hosting a large number of units in a single rack. Such off-the-shelf hardware setups allow for large-scale node deployments as needed in data processing or cloud computing workloads. While off-the-shelf hardware typically lacks in performance and storage capability, its energy consumption remains comparable to server hardware.

Figure 2 compares the energy consumption of ARM-based off-the-shelf hardware (i.e., three different Raspberry Pi models) against server-grade hardware using different x86 architectures. We run a cryptographic (CPU-bound) and a memory allocation (Memory-bound) stressor while measuring the entire device power consumption. The x86 processors used were an AMD EPYC and three different Intel Xeon processor generations, i.e., Broadwell, Kaby Lake and Harpertown. This is a direct comparison of the execution of two distinct binaries of the same source code on two different architectures based on a common metric (J/op). We observe no major difference for CPU-bound operations between different architectures [23]. However, memory-bound operations on off-the-shelf hardware have higher energy consumption. In the case of the Raspberry Pi models, these are due to cache size and memory transfer rate. Nevertheless, off-the-shelf hardware achieves lower energy consumption for both operations compared to older server-grade x86 hardware, i.e., Harpertown. These results indicate that replacing old x86 hardware with recent off-the-shelf ARM-based nodes in data centers will result in energy savings.

[Figure 2: Energy comparison of off-the-shelf and server-grade devices on CPU-bound and memory-bound workloads. Panels: CPU-bound, Memory-bound. y-axis: Energy [J/op]; x-axis: the Raspberry Pi models (3B, 3B+, 4B) and the x86 servers.]

B. Power management

The power dissipated by an integrated circuit depends on static power (leakage current) and dynamic power (switching power). Since about 2005 [24] the power dissipation contribution of dynamic power has become much higher than static power. Nowadays, with the decreased transistor size and lowered threshold voltages, static power is becoming more and more important [25]. In the following, we outline techniques to reduce dynamic power.

Frequency scaling regulates (dynamically) the frequency of an integrated circuit in order to change performance, conserve power or reduce the amount of heat dissipation. Reducing the frequency at a constant voltage is called underclocking or throttling, while increasing the frequency is called overclocking. The dynamic power dissipated by an integrated circuit over a period of time is given by P = CV²f, where C is the capacitance, V is the voltage, and f is the frequency. Thus, increasing the frequency results in a higher power consumption and operating temperature.

Voltage scaling is an open loop system, in which the voltage of an integrated circuit is regulated (dynamically) based on an external setting. Increasing or decreasing the voltage while keeping the frequency constant is called overvolting and undervolting, respectively. Regulating the voltage enables increasing the frequency or conserving power of an integrated circuit, a particularly useful aspect for battery-powered devices. Changing the voltage influences the rate at which capacitances can be charged and discharged. Thus voltage determines the speed and frequency at which an integrated circuit can be operated. Modern operating systems do not provide direct support to adjust a processor’s voltage individually. The processor’s voltage is either regulated by model-specific registers [26] or through firmware.

DVFS is the simultaneous software-controlled regulation of voltage and frequency scaling of an integrated circuit. Depending on the process variation (variation of integrated circuits when fabricated) ARM system on a chip (SoC) manufacturers specify a set of operating performance points (OPPs) under worst case conditions. These OPPs are pairs of clock frequencies and voltages under which the integrated circuit is operational with a sufficiently large margin while taking into account thermal conditions. In Linux the CPUFreq kernel driver [27] will choose a set of OPPs based on a specified governor. DVFS has been extensively studied [25, 28, 29] to accelerate multi-threaded applications. x86 manufacturers use their own DVFS implementations [30, 31, 32].

AVS [33] is a closed loop system where the voltage is regulated based on its process variation, aging and a feedback loop of sensor data. A hardware monitor or software backed by sensor data determines if the changes made to the system are sufficient or if additional changes are necessary. AVS requires support from both the processor and the power regulators, in order to adjust the voltage accordingly. The Raspberry Pi models B used in this report are equipped with an AVS system.

C. Raspberry Pi

The Raspberry Pi’s firmware is configured at boot time by a text file containing property-value pairs. For example, the frequency and voltage can be set in this configuration file. A particularity is that voltages can only be set to a nominal offset in steps of 25 mV. This configuration is then parsed by the firmware. This undervolting configuration is specific to the Raspberry Pi and other hardware can more easily be undervolted dynamically at runtime. Notice that the requested CPU frequency in the operating system can deviate from
the actual frequency regulated by the firmware. This is in particular the case if the device reaches the thermal hard limit at 85 °C. Additionally, the 3B+ has a soft limit temperature at 60 °C that will throttle the CPU frequency and voltage.

D. Reliability

There are several approaches to determine a processor’s reliability in an undervolted operating regime. Known benchmarks (e.g., SPEC CPU2006 [34], PARSEC [35], etc.) are still used [16, 36, 37]. Recently, new specialized software systems [38, 39] have been proposed to maximize power consumption and voltage noise. Even small proof-of-concept programs are sufficient for fault detection under dynamic voltage and/or frequency [17, 20]. Finally, such programs can also be used to characterize the guardband of a system [20, 40].

In this practical experience report we distinguish between three regions with respect to the guardband: safe, critical, and failure. A safe region has a sufficiently high voltage margin, such that erroneous computations or transient faults cannot occur. The critical region designates a small voltage band in which the processor occasionally experiences erroneous results or transient faults. Inside the failure region it is impossible to boot the operating system either because the voltage cannot support the processor’s frequency or because erroneous computations and transient faults lead to kernel crashes or panics.

Undervolted instances become unavailable in case a transient fault leads to an instance crash. From our perspective, current SLAs cover single instances that have crashed because of undervolted hardware, provided users can sufficiently support these claims. The situation is trickier with multiple instances. Deployed instances would have to crash simultaneously, yet process variation plays into the cloud provider’s hands. These crashes are non-deterministic and, therefore, process variation helps obfuscate the undervolted setup. Only simultaneous crashes satisfy today’s cloud provider restrictions in order for users to be covered by the SLA.

III. THREAT MODEL

In this section we discuss our threat model. In particular, we intend to clarify: (1) which techniques a malicious cloud provider can use to hide an undervolted processor from an unsuspecting user, and (2) which methods a curious user can use to reveal an undervolted processor. Notice that we validate these methods on a specific hardware configuration (i.e., Raspberry Pi boards using Broadcom BCM2837/BCM2711 processors), but the discussion holds for other platforms relying on similar voltage regulation mechanisms.

A. The scrooge cloud provider

We assume the cloud provider has full access to the physical infrastructure and can connect remotely to the physical machines [41]. Furthermore, the cloud provider purposefully undervolts its ARM-based hardware to benefit from additional savings. Firmware configurations can be hidden from users for malicious or security purposes. By maliciously intercepting any voltage reading requests (see §III-C), the cloud provider ensures that the undervolted state of the cloud infrastructure remains hidden from users. A cloud provider must find the sweet spot [16] for the undervolt configuration in or near the critical region to provide sufficiently stable instances.

B. The curious cloud user

The curious cloud user is suspicious of the cloud provider and intends to uncover its potentially obfuscated activity. Instances of the cloud provider can exclusively be accessed remotely by the user. The only way for a user to detect an undervolted processor is by querying the firmware, normally using a specific executable command file for that. By reading values from the firmware and comparing them to values in the boot configuration file, a user can detect an undervolted processor. If results of firmware queries can be forged, it becomes difficult for a user to uncover the scrooge cloud provider. A confidential and tamper-proof message exchange with the firmware is essential to detect an undervolted processor.

A user can suspect an undervolted processor to operate in the critical region in case of kernel warnings or kernel panics appearing during the system boot or while the system is running, despite these being generic kernel warnings rather than specific ones. In particular, if the booting time is longer than expected, then this might hint at a failed boot attempt where the kernel crashed. Most systems have a kernel log that can be consulted by the system administrator. However, a cloud provider can tamper with those kernel logs, and the system utilities are outside the trusted computing base.

C. The scrooge attack

The scrooge cloud provider makes undervolted ARM instances available to users. These undervolted instances should be indistinguishable from nominal voltage instances. This includes configuration, firmware, and tools querying CPU voltages. Thus, the undervolt configuration needs to be exchanged for a nominal configuration and any CPU voltage reading request needs to be intercepted. Figure 3 shows different actions the cloud provider has to perform during an instance lifecycle in order to hide the undervolt configuration. When a user boots such an instance, the cloud provider must ensure that the undervolt configuration is loaded by the firmware on the machine the instance is running on. However, this undervolt configuration should not be accessible once the user is connected to the instance. The undervolt configuration has to be swapped for the nominal configuration ❶. Depending on the configuration mechanism, the file system that was booted may be different from the file system the user finds after booting. This includes firmware, operating system kernel, binaries, etc. A hidden or obfuscated system service could perform this task while the operating system is booting. An even stealthier approach involves a trusted operating system [42] or auxiliary devices [43] which exchange the configurations before the operating system is booted. Therefore, without proper system attestation, there is no guarantee about the authenticity of the system users believe they have booted. During reboot
or shutdown of the machine, configurations might have to be swapped back ❷ again.

[Figure 3: State machine with cloud provider actions to obfuscate undervolted machine configuration. States and transitions: deploy, boot, reboot, shutdown, deployed, voltage reading, firmware request; numbered actions ❶ to ❻.]

Any CPU voltage reading request needs to be intercepted and substituted by a plausible nominal voltage value. This will typically involve a kernel driver that will handle the communication with the firmware or accessing model-specific registers. The request can then be intercepted directly in the user space tool or the kernel driver ❸. From the kernel driver the request is forwarded ❹ and the actual undervolted CPU voltage value is returned to the kernel driver ❺. The kernel driver then substitutes this value by some nominal voltage value, e.g., by adding the undervolt offset to the value. The alleged nominal voltage value is then returned to the user ❻. A more costly but stealthier variant involves the trusted operating system, to which the cloud provider could delegate voltage reading requests instead of a kernel driver.

If users are allowed to deploy their own kernels, then the cloud provider needs a different approach. Voltage reading requests can no longer be intercepted in kernel space. Instead, the cloud provider needs to use the hypervisor to intercept CPU voltage requests and substitute them similarly to the kernel driver approach.

In our threat model we assume that the cloud provider will make use of these mechanisms and obfuscate as much as possible the undervolted state of the infrastructure from users, a practical effort with significant benefits. Without access to the firmware configuration or any untampered message exchange with the CPU voltage regulating mechanism, a user can never be sure to obtain a genuine voltage reading.

D. Relevance of discussed techniques

While a scrooge cloud provider has powerful mechanisms in place to hide its undervolted instances, the curious user can still expose this misbehaviour. For instance, the processor’s frequency and package temperature are viable options to test for undervolted conditions. The techniques presented in Section V demonstrate to which extent users can deploy applications stressing aforementioned options on instances and how accurately conclusions can be drawn.

IV. SCROOGE ATTACK DETECTION

We describe the user’s detection method in this section. Furthermore, we describe under which conditions the detection method works and where difficulties may arise.

We assume that users cannot trust any firmware or system reading on instances. As such, users have no reference to any parameters for adjusting the detection method to the attack. Users can for this reason make use of simple CPU-bound programs that will put the processor under maximum load while monitoring for faults. Inspired by [17] we propose implementing an arithmetic computation (i.e., multiplication) for which we can validate the result. First we generate two random numbers which are then multiplied until the instance crashes while alternating the position of multiplier and multiplicand. Murdock et al. have observed that the position of the multiplier and the multiplicand can lead to a faulting instruction. After each multiplication the result is compared to the original result. While the processor operates at maximum load it will run at the highest frequency and dissipate heat which will raise its temperature. Under these conditions we achieve the highest probability to inject faults related to timing violations. Depending on the complexity of the RISC circuitry in the ARM processor, certain instructions are more likely to fault than others. To this end, the detection method might not inject faults in its own computation (due to its simple nature), but more likely in other processes. This behavior is favorable, as it allows running the detection method until the instance itself becomes unavailable due to multiple critical faults in system relevant processes. Thus, we detect an undervolted cloud instance using the detection method by gradually failing processes to crash the instance and make it unavailable.

The detection method depends strongly on how aggressively machines are undervolted and the cooling system employed by the cloud provider. The less a machine is undervolted, the higher the temperature needs to be raised by the detection method to fault processes and vice versa. A good cooling system is a lesser problem than a weakly undervolted machine. With a good cooling system the detection method requires a longer time to raise the processor’s temperature. On the one hand, implementing a soft limit temperature throttle in order to prevent this detection method is not an ideal solution. Users are less inclined to pay for a service which underperforms compared to alternative services. On the other hand, weakly undervolting machines defies the scrooge cloud provider’s original idea of minimizing the electricity bill.

The cloud provider has few options to completely prevent the detection method from unveiling the scrooge attack. Even the powerful setup of the cloud provider to tamper with CPU voltage readings is not sufficient to defeat the detection method. The scrooge attack has the disadvantage that detection methods have a simple design, but it has the advantage that proving the undervolt state without the firmware is difficult.

V. EVALUATION

In this section we explore the behavior of Raspberry Pi processors under different nominal and undervolted setups. The information gained from these experiments allows quantifying the attack parameters and determining the type of processes to use for the detection method. Then, we derive the probability at which our detection method can successfully uncover the attack. We begin by describing our experimental setup to undervolt Raspberry Pis before evaluating the firmware’s throttling behavior when reaching the soft limit and limit temperatures.
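As a rough illustration of the multiplication check described in §IV, the following Python sketch multiplies two random operands repeatedly, alternates the operand order, and compares each product against a reference computed once up front. The function name `count_multiplication_faults`, the 64-bit operand width, and the bounded iteration count are assumptions made for this example only; the report runs the check until the instance crashes and does not prescribe an implementation language. An interpreted loop also exercises the hardware multiplier far less directly than a native program would, so this is a sketch of the logic rather than a usable stressor.

```python
import random
from typing import Optional


def count_multiplication_faults(iterations: int, seed: Optional[int] = None) -> int:
    """Count products that differ from a reference product.

    Hypothetical helper: name, operand width and iteration bound are
    illustrative assumptions, not taken from the report.
    """
    rng = random.Random(seed)
    a = rng.getrandbits(64)   # first random operand
    b = rng.getrandbits(64)   # second random operand
    expected = a * b          # reference ("original") result

    faults = 0
    for i in range(iterations):
        # Alternate the position of multiplier and multiplicand.
        product = a * b if i % 2 == 0 else b * a
        # Compare every result against the reference result.
        if product != expected:
            faults += 1
    return faults


if __name__ == "__main__":
    print(count_multiplication_faults(1_000_000))
```

On correctly operating hardware the function returns 0; a non-zero count, or a crash of the instance while the loop is running, would be the signal of interest.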
Table II: Soft limit (SL) firmware throttling on the 3B+

OV level   V_arm [V]   V_arm^SL [V]   f_arm [MHz]   f_arm^SL [MHz]
 0         1.3750      1.2688         1400          1200
-1         1.3500      1.2375         1400          1200
-2         1.3188      1.2125         1400          1200
-3         1.2938      1.1875         1400          1200

Table III: Limit temperature (L) throttling on the 3B and 4B

Model   V_arm^L [V]   f_arm^L [MHz]                     f_core^L [MHz]
3B      1.2813        {1034, 1087, 1141, 1195, 1200}    {400}
4B      0.8500        {1000, 1500}                      {333, 500}

The temperature-based guardband analysis allows detecting the critical region of the device and defines the margin for an undervolt setup. Faults that occurred during the guardband analysis are analyzed to describe the fault injection of the detection method. Finally, we measure the energy efficiency of the undervolted hardware with a reliability benchmark. The dataset gathered for this evaluation will be made publicly available at https://github.com/ChrisG55/Scrooge-Attack.

A. Experimental settings

We use the three Raspberry Pi models 3B, 3B+, and 4B, while booting from the same SD card a Raspbian Buster distribution (https://github.com/raspberrypi/linux). All units rely on recent firmware releases (since June 2020). To simulate a realistic cloud scenario we take all measurements in an air-conditioned room at (24 ± 1) °C and connect the Raspberry Pis to Ethernet and run the SSH daemon. The Raspberry Pis are monitored over UART from an auxiliary machine. No other peripherals are connected to the Pis in order to minimize any interference. Both the Raspberry Pi’s and the auxiliary machine’s clocks are synchronized using NTP, to easily correlate the power consumption logs recorded on the auxiliary machine to the benchmark running on the Raspberry Pi. The power consumption of the Raspberry Pi is recorded by an Alciom PowerSpy2 [44] over Bluetooth.

B. Soft limit temperature throttling

We start by evaluating the firmware behavior when reaching the soft limit temperature while running under the CPUFreq performance governor. Understanding the throttling behavior helps evaluate the viability of possible mitigation techniques by the cloud provider. The Raspberry Pi documentation mentions that frequency and voltage of the SoC are reduced to decrease heat dissipation but without indicating by how much. The Raspberry Pi 3B+ is the only model with a soft limit

C. Limit temperature throttling

Next, we evaluate the firmware behavior when reaching the limit temperature while running under the CPUFreq performance governor. At the limit temperature, the firmware will throttle the processor to prevent thermal runaway. Notice that model 3B+ is not included here, as it takes too much time to reach the limit temperature while already being throttled for going beyond the soft limit temperature. Neither the 3B nor the 4B reduces the voltage when reaching the limit temperature, as shown in Table III. However, both models reduce their frequency. The 3B reduces its ARM CPU frequency f_arm^L in steps of about 54 MHz (except for the first step) while the 4B significantly reduces its frequency by 500 MHz. In addition the 4B also reduces its GPU frequency f_core^L by 167 MHz. We find that reaching the limit temperature will reduce the load put on the processor by the detection mechanism and reduce its temperature, which leads to a lower fault injection rate.

D. Temperature-based guardband analysis

The temperature-based guardband analysis helps identify voltage margins of the Raspberry Pi models. While this analysis supports the cloud provider in selecting an undervolt offset, its core principle can also be exploited by users to uncover the scrooge attack. This benchmark consists of three stages: 1) booting the operating system while undervolted before 2) adjusting the SoC’s temperature either actively or passively and 3) running a billion iterations of the multiplication benchmark described in §IV as a single-threaded process. We set the CPUFreq governor to performance right before starting the multiplication benchmark. This will guarantee that the multiplication benchmark is started at a well defined temperature and that it runs at a constant, maximum frequency. Once the benchmark is finished we repeat the process, reducing the ARM CPU voltage level in the configuration before each reboot, until the Raspberry Pi no longer boots because the supply voltage has gone below threshold.

The results of this analysis are shown in Figure 4. All Raspberry Pi models keep a sufficient margin with their nominal voltage (connected black bullets) configuration to the critical region. Further undervolting of the ARM CPU into the critical region results in occasionally failing processes. Undervolting the ARM CPU beyond threshold voltage makes it impossible to boot the hardware. Our multiplication benchmark, which verifies the correct operation of the CPU, never detected an incorrect result. We explain this characteristic
temperature programmed into its firmware, therefore, other of the multiplication benchmark, which is purely based on
models are not included in the overvoltage level to OPP arithmetic operations, by not being on a timing-critical path
mapping reported in Table II. The values indicate that the to force an incorrect operation of the CPU.
ARM CPU frequency farm is reduced by 200 MHz and the We can also see that the undervolting depends directly on
CPU voltage Varm is lowered by about 106 mV (four levels). the SoC’s temperature. For instance on the Raspberry Pi 3B we
The voltage stepping of 25 mV remains the same with two clearly observe slightly rising regions, which result from small
exceptions from nominal level −1 → −2 with −31.2 mV and adjustments made by the AVS system. This is mainly due to
from soft limit level 0 → −1 with −31.3 mV. the resistivity of the circuitry that increases with temperature.
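The sweep described in the guardband analysis (lower the voltage, reboot, run the benchmark, repeat until the board no longer boots) can be illustrated with a small simulation. This is only a sketch under stated assumptions: `ToyBoard`, its three voltage thresholds, the 25 mV step, and the 50 % failure chance are hypothetical illustration values, not measurements from this paper, and the temperature adjustment stage is left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyBoard:
    """Hypothetical board model; all voltages are made-up illustration values."""
    nominal_mv: float = 1300.0   # assumed nominal ARM CPU voltage
    critical_mv: float = 1225.0  # assumed lower edge of the safe region
    boot_min_mv: float = 1190.0  # assumed threshold below which boot fails

    def boots(self, mv: float) -> bool:
        return mv >= self.boot_min_mv

    def run_benchmark(self, mv: float, rng: random.Random) -> bool:
        """Return True if the run finished without a failing process."""
        if mv >= self.critical_mv:
            return True
        return rng.random() >= 0.5  # assumed 50 % failure chance when critical


def guardband_sweep(board: ToyBoard, step_mv: float = 25.0, seed: int = 0):
    """Lower the voltage step by step until the board no longer boots."""
    rng = random.Random(seed)
    mv = board.nominal_mv
    log = []
    while board.boots(mv):
        region = "safe" if mv >= board.critical_mv else "critical"
        log.append((mv, region, board.run_benchmark(mv, rng)))
        mv -= step_mv  # edit the configuration and reboot
    log.append((mv, "failure", False))  # first voltage that no longer boots
    return log


sweep_log = guardband_sweep(ToyBoard())
for mv, region, ok in sweep_log:
    print(f"{mv:7.1f} mV  {region:8s}  benchmark ok: {ok}")
```

On real hardware the `boots` and `run_benchmark` calls would be replaced by an actual reboot and benchmark run observed from the auxiliary machine, and the sweep would be repeated at several SoC temperatures.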

[Figure 4 plots omitted: voltage [V] versus temperature [°C] for (a) Raspberry Pi 3B, (b) Raspberry Pi 3B+, and (c) Raspberry Pi 4B. Legend: safe, critical, failure, nominal for (a) and (b); safe, nominal, undervolted for (c).]

Figure 4: Temperature-dependent guardband measurements of the latest Raspberry Pi B models. Triangles indicate lower and upper frontier measurements while bullets (•) indicate nominal measurements.

[Figure 5 plot omitted: failure rate (0 to 1) versus temperature [°C] (30 to 80) for the 3B and 3B+ at 0 mV, −75 mV, and −100 mV.]

Figure 5: Temperature-dependent failure rate of Raspberry Pi models at different undervolt levels.

The Raspberry Pi 4B can only be undervolted once, to level −1 at −15 mV, due to missing overclocking² support in the firmware. This undervolt limit is indicated by the blue line. However, we observe some basic overheating protection mechanism that slightly lowers the nominal voltage by −12.5 mV in the range of 50 °C to 70 °C.

² https://www.raspberrypi.org/documentation/configuration/config-txt/overclocking.md Last accessed on 2021-04-23

E. Implications on the detection method success rate

Overall we ran 741 guardband analyses, of which 265 failed while operating in the critical region. Among the 265 failed runs we identified 407 process failures of the following 5 types: (1) NULL pointer dereferences (20.3 %), (2) paging requests (46.4 %), (3) read from unreadable memory (5.4 %), (4) write to read-only memory (0.9 %), and (5) freeze during boot (26.7 %). These types of failures, with the exception of (5), usually generate a kernel oops. A kernel oops happens when the operating system kernel detects an incorrect behavior of a process and can possibly resume execution of the system. In some cases execution cannot be resumed because of system dependencies or unavailable system resources as a result of the failing process. The kernel will raise a panic and halt the system if a kernel oops occurs in an interrupt handler.

The ARM architecture provides the Exception Syndrome Register [45], which the operating system can consult to diagnose the type of exception generated by a process. If possible, the operating system kernel will log this kernel oops diagnosis in the system log. We used this information to analyze the guardband failures and summarized it in Figure 5. Notice that the 4B is not included, as its firmware does not provide undervolting support and we could not provoke any failures in this system. Our analysis indicates that the probability of provoking a failure in a system operating in the critical region is highest at 60 °C, with a 40 % chance on the 3B+ and a 90 % chance on the 3B. With an even more aggressive undervolting at −100 mV, failures can already be reliably provoked at 40 °C on both devices. These failures were provoked in at least 33 different processes (34 % user, 15 % kernel processes and 51 % unknown processes), and the multiplication benchmark is never among the known failed processes. For the 3B+ the failure rate drops at 70 °C because the temperature soft-limit throttling in the firmware brings the system back into the safe region. At nominal voltage no failures could be provoked in any system.

F. Energy efficiency and “reliability”

Despite stressing the systems for several hours, we could not provoke any failures in these systems using STRESS-NG. The heat map in Table IV shows the energy efficiency for all three Raspberry Pi models based on the ETR ratio of the undervolted to the nominal setup. For the measurements we use two cooling setups: active and passive cooling. We ran up to 169 stressors sequentially, of which 27 are shown in the heat map. Each stressor was configured with a timeout of 60 s.

Our results show that when the Raspberry Pis are actively cooled, we can achieve higher energy efficiency. It is even possible to undervolt the device further. For example, with the 3B+ we were able to undervolt up to −100 mV and in some rare instances also run the benchmark successfully. Occasionally we observe better energy efficiency in the passive cooling setup than in the active cooling setup (e.g., futex with −34 %, −3 % and −16 % under passive cooling versus 2 %, −1 % and −2 % under active cooling). We noticed that some stressors have a large variance in the number of operations. Hence, if these stressors achieve a higher than average number of operations during the measurement, their energy efficiency improves proportionally. With the 4B we notice only minor improvements in energy efficiency. Again, this is due to the lack of overclocking support but also because of the lower core supply voltage compared to the other models. On average across the 27

Table IV: STRESS-NG ETR heat map indicating the relative energy efficiency for an undervolted setup compared to a nominal setup. The darker the shade, the more energy-efficient the stressor ran.

[Column headers (the 27 stressor names, printed rotated in the original) are not recoverable from the extracted text. Each row lists the 27 ETR values in column order.]

Cooling | Model | Undervolt | ETR per stressor
active | 3B | −75 mV | 0.94 0.95 0.96 0.95 0.92 1.02 1.03 0.90 0.95 0.95 0.99 0.93 0.93 0.94 1.02 0.94 0.91 0.96 0.95 0.93 0.94 0.91 0.94 0.99 0.95 0.96 0.94
active | 3B+ | −75 mV | 0.89 0.94 0.93 0.93 0.87 0.99 0.94 1.00 0.95 0.94 1.01 0.93 0.96 0.94 0.97 0.94 0.95 0.92 0.95 0.93 0.93 0.92 0.94 0.95 0.95 0.97 0.94
active | 4B | −15 mV | 1.01 0.99 1.02 0.99 1.02 0.98 1.00 1.06 1.00 0.98 0.96 1.00 1.04 0.99 0.96 1.00 0.97 0.70 0.98 0.98 0.99 1.00 0.97 0.91 0.98 1.00 0.99
passive | 3B | −75 mV | 0.88 0.95 0.92 0.93 0.91 0.66 0.76 0.63 0.94 0.94 0.97 0.93 0.94 0.94 0.93 0.93 0.92 1.03 0.93 0.95 0.92 0.95 0.92 0.96 0.94 0.96 0.93
passive | 3B+ | −75 mV | 0.95 0.95 0.96 0.95 0.98 0.97 0.99 0.80 0.95 0.95 0.94 0.95 0.95 0.95 0.98 0.95 0.95 0.96 0.94 0.95 0.94 0.91 0.95 0.97 0.96 0.97 0.95
passive | 4B | −15 mV | 1.00 0.97 1.02 0.99 1.01 0.84 1.00 1.10 0.99 1.03 0.99 0.98 1.00 1.00 0.97 1.00 1.03 1.01 1.00 1.05 1.04 0.87 1.00 0.91 0.99 0.97 0.99
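As a worked check of how the heat map relates to the savings quoted in the text, the following sketch averages the two 3B rows of Table IV. It assumes that an ETR below 1 can be read as the fraction of energy still needed relative to the nominal setup, so that `1 - ETR` is the saving; that reading is an interpretation, although it agrees with the 3B figures given in the text. The helper name `percent_saved` is illustrative.

```python
from statistics import mean

# ETR values of the Raspberry Pi 3B at -75 mV, copied from Table IV.
etr_3b_active = [
    0.94, 0.95, 0.96, 0.95, 0.92, 1.02, 1.03, 0.90, 0.95,
    0.95, 0.99, 0.93, 0.93, 0.94, 1.02, 0.94, 0.91, 0.96,
    0.95, 0.93, 0.94, 0.91, 0.94, 0.99, 0.95, 0.96, 0.94,
]
etr_3b_passive = [
    0.88, 0.95, 0.92, 0.93, 0.91, 0.66, 0.76, 0.63, 0.94,
    0.94, 0.97, 0.93, 0.94, 0.94, 0.93, 0.93, 0.92, 1.03,
    0.93, 0.95, 0.92, 0.95, 0.92, 0.96, 0.94, 0.96, 0.93,
]


def percent_saved(etr: float) -> float:
    """Assumed reading: an ETR of 0.95 corresponds to 5 % energy saved."""
    return (1.0 - etr) * 100.0


for label, row in (("active", etr_3b_active), ("passive", etr_3b_passive)):
    average = percent_saved(mean(row))  # saving averaged over all stressors
    best = percent_saved(min(row))      # saving of the best single stressor
    print(f"3B {label}: average {average:.1f} %, best stressor {best:.1f} %")
```

Rounded to whole percent, the averages come out at 5 % (active) and 9 % (passive), and the best single stressors at 10 % and 37 %.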

[Figure 6 plots omitted: histograms of frequency versus run-time [s] and versus temperature [°C]. Panels: (a) Bare-metal 3B run-time, (b) Bare-metal 3B temperature, (c) Kubernetes 3B run-time, (d) Kubernetes 3B temperature, (e) Bare-metal 3B+ run-time, (f) Bare-metal 3B+ temperature, (g) Kubernetes 3B+ run-time, (h) Kubernetes 3B+ temperature.]

Figure 6: Run-time and temperature histograms of bare-metal and container instances

stressors 5 % / 9 % (active/passive) were saved on the 3B, 6 % / 6 % were saved on the 3B+, and 2 % / 1 % were saved on the 4B. The highest energy efficiency observed on the 3B was −10 % / −37 % on the hrtimers stressor. On the 3B+ −13 % / −20 % were saved on the fork / hrtimers stressor. Finally, −30 % / −16 % were saved on the 4B with the hrtimers / futex stressor.

G. Detection method parameters

In this subsection we quantify the detection method parameters (i.e., run-time and temperature) based on undervolted bare-metal and container instances. Deploying virtual machines on the Raspberry Pi is impracticable, and they were therefore not included in our evaluation. To run containers on the Raspberry Pi we deployed a small Kubernetes cluster.

Figure 6 shows histograms with crashes on bare-metal and container instances deployed on the 3B and 3B+. We show the run-time of our detection method and the temperature at which instances crashed. Our observations made with the temperature-based guardband analysis in subsection V-E are confirmed by the temperature histograms. The run-time strongly depends on the processor's capability to heat up to a certain temperature and is therefore not an ideal parameter. We observe clear differences between the thermal designs of the two models. For the 3B our detection method requires about 175 s / 30 s to reach 62 °C to crash bare-metal or container instances. On the 3B+ we require about 145 s / 250 s to reach 62 °C to crash bare-metal or container instances. Interestingly, container instances crash on the 3B earlier than bare-metal instances. We assume the computing requirements from the container environment work in favor of the detection method.

VI. DISCUSSION

From our evaluation we conclude that the detection method is best used in combination with other processes such as in STRESS-NG. The user even has the option to scale the number of threads in the detection method to adjust the crash time of an instance as well as the injection rate. A simple CPU-bound program like the multiplication benchmark turns out to be ideal for injecting faults in an undervolted setup. The advantage of such a simple CPU-bound program is that it is unlikely to inject faults during its own execution and can run until a kernel panic while raising heat dissipation. In terms of energy efficiency we observed that by undervolting the cloud provider can save on average 5 % and up to 37 % for specific workloads on ARM processors.

RA1: as shown by our extensive experimental evaluation, in order to pull off a stealthy undervolting strategy, a malicious cloud provider must exchange any firmware configuration to undervolt the hardware and intercept any voltage requests coming from users.
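The kind of simple CPU-bound, self-checking workload discussed above can be sketched as follows. This is a minimal illustration and not the benchmark from §IV, which is not reproduced in this section: the function name, the operand values, and the choice of Python are all assumptions, and an interpreter adds overhead that a native implementation would avoid.

```python
def multiplication_check(iterations: int,
                         a: int = 0x1234_5678_9ABC,
                         b: int = 0x0FED_CBA9_8765) -> int:
    """Repeat one multiplication and count results that differ from the
    reference product computed once up front (hypothetical operands)."""
    expected = a * b
    mismatches = 0
    for _ in range(iterations):
        # CPU-bound work: recompute the product and compare it.
        if a * b != expected:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    # On correctly operating hardware no mismatch is expected; the load
    # mainly serves to raise heat dissipation.
    print("mismatches:", multiplication_check(1_000_000))
```

In line with the evaluation, such a loop is not expected to report mismatches itself; its role is to keep the processor busy and hot so that faults surface in other processes. Running several copies in parallel corresponds to scaling the number of threads mentioned in the discussion.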

RA2: a cloud user can uncover such an undervolting strategy by running a simple CPU-bound benchmark until enough processes have failed to render the cloud instance unavailable. The drawback of this detection method is that it is non-selective and cloud instances can fail either soon or late.

VII. RELATED WORK

Undervolting the supply voltage for energy savings has been explored on CPUs for ARM [46, 47], x86 processors [16, 48], the Itanium micro-architecture [49], and for POWER-7 processors [36]. This experimental undervolting approach has been extended to GPUs [50] and FPGAs [40] as well. On the CPU side, frameworks to automate and optimize the process of undervolting have been developed [14, 46]. Recently, AMD has announced an undervolting product/framework for their most recent Ryzen 5000 CPUs [51]. In [52] the authors discuss the trade-off between the reduced energy cost and the SLA violation penalties introduced by higher node failures of undervolted x86 and ARM nodes. In CLKSCREW [20], the undervolting capabilities of modern ARM processors are exploited to compromise system security, by targeting undervolting faults to specific hardware components to extract cryptographic keys.

VIII. CONCLUSION AND OPEN CHALLENGES

A cloud provider can obfuscate the undervolting of processors and even run workloads up to 37 % more energy-efficiently. However, by undervolting its infrastructure, the cloud provider incurs a major risk. Not only does the cloud provider reduce the margin of error but also the system's stability is at stake. Cloud users can with high probability detect such situations and exploit them using a simple CPU-bound benchmark. To some extent, the cloud provider can mitigate stability issues with appropriate cooling systems. However, it is questionable if the gains of undervolting the infrastructure outweigh the costs of such cooling systems.

Cloud users' options to detect an undervolted ARM instance remain limited and, as shown in this paper, essentially depend on the probability to inject faults non-selectively in processes. As our temperature-based guardband analysis and failure evaluation have shown, the higher the processor's temperature, the more likely faults can be injected into processes. Despite such a powerful cloud provider attacker model, cloud users have an exploitable weak link. Their only option to reveal a potentially undervolted instance is to increase the processor's heat dissipation. Heat dissipation is increased by tuning the CPU frequency and load to the processor's limit. Under these thermal conditions and an undervolted setup, the fault injection probability in processes rises. Ideally, cloud instances will become unavailable and violate the SLA as a result of continuously failing processes. Our detection method depends strongly on hardware and how systems such as firmware and AVS react to excessive heat dissipation. As future plans, we intend to expand this study to a more diverse set of ARM-based hardware targets, focusing in particular on current and future cloud offerings. We would also like to make our detection method more deterministic by injecting faults in processes more selectively.

ACKNOWLEDGMENTS & DISCLAIMER

The views and opinions of the authors do not necessarily reflect those of the U.S. government or Lawrence Livermore National Security, LLC, neither of whom nor any of their employees make any endorsements, express or implied warranties or representations or assume any legal liability or responsibility for the accuracy, completeness, or usefulness of the information contained herein. This work was partially prepared by LLNL under Contract DE-AC52-07NA27344 (LLNL-CONF-817551) and by the European Union's Horizon 2020 research and innovation programme under the LEGaTO Project (legato-project.eu), grant agreement No 780681.

REFERENCES

[1] Ampere eMAG 8180 64-bit Arm Processor, Amp 2018-0007 ed., Ampere Computing, 4655 Great America Parkway, Suite 601, Santa Clara, CA 95054, 2018.
[2] “Ampere Altra: The World’s First Cloud Native Processor,” https://amperecomputing.com/altra/, Nov 2020, last accessed on 2021-04-23.
[3] “Huawei Unveils Industry’s Highest-Performance ARM-based CPU,” https://www.huawei.com/en/news/2019/1/huawei-unveils-highest-performance-arm-based-cpu, Jan 2019, last accessed on 2021-04-23.
[4] “NVIDIA Grace CPU,” https://www.nvidia.com/en-us/data-center/grace-cpu/, Apr 2021, last accessed on 2021-04-23.
[5] “AWS Graviton Processor,” https://aws.amazon.com/ec2/graviton/, last accessed on 2021-04-23.
[6] J. Barr, “Coming Soon - Graviton2-Powered General Purpose, Compute-Optimized, & Memory-Optimized EC2 Instances,” https://aws.amazon.com/de/blogs/aws/coming-soon-graviton2-powered-general-purpose-compute-optimized-memo Dec 2019, last accessed on 2021-04-23.
[7] ThunderX Family of Workload Optimized Processors, Cavium, 2315 N. First Street, San Jose, CA 95131, 2016.
[8] T. Yoshida, “Fujitsu High Performance CPU for the Post-K Computer,” https://old.hotchips.org/hc30/2conf/2.13_Fujitsu_HC30.Fujitsu.Yoshida.rev1.2.pdf, Aug 2018, last accessed on 2021-04-23.
[9] “Raspberry Pi Products,” https://www.raspberrypi.org/products/, last accessed on 2021-04-23.
[10] Y. Léger, “Public Preview,” https://blog.scaleway.com/online-labs-public-preview, Oct. 2014, last accessed on 2021-04-23.
[11] “Neoverse N1,” https://developer.arm.com/ip-products/processors/neoverse/neoverse-n1, last accessed on 2021-04-23.
[12] T. Burd and R. Brodersen, “Energy efficient CMOS microprocessor design,” in 2014 47th Hawaii International Conference on System Sciences, vol. 1. Los Alamitos, CA, USA: IEEE Computer Society, Jan 1995, p. 288. [Online]. Available: https://doi.ieeecomputersociety.org/10.1109/HICSS.1995.375385
[13] V. M. van Santen, H. Amrouch, N. Parihar, S. Mahapatra, and J. Henkel, “Aging-aware voltage scaling,” in 2016 Design, Automation Test in Europe Conference Exhibition (DATE), 2016, pp. 576–581.
[14] K. Parasyris, P. Koutsovasilis, V. Vassiliadis, C. D. Antonopoulos, N. Bellas, and S. Lalis, “A framework for evaluating software on reduced margins hardware,” in 2018 48th Annual

IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 2018, pp. 330–337.
[15] G. Papadimitriou, A. Chatzidimitriou, D. Gizopoulos, V. J. Reddi, J. Leng, B. Salami, O. S. Unsal, and A. C. Kestelman, “Exceeding conservative limits: A consolidated analysis on modern hardware margins,” IEEE Transactions on Device and Materials Reliability, vol. 20, no. 2, pp. 341–350, 2020.
[16] P. Koutsovasilis, K. Parasyris, C. D. Antonopoulos, N. Bellas, and S. Lalis, “Dynamic undervolting to improve energy efficiency on multicore x86 CPUs,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 12, pp. 2851–2864, 2020.
[17] K. Murdock, D. Oswald, F. D. Garcia, J. Van Bulck, D. Gruss, and F. Piessens, “Plundervolt: Software-based Fault Injection Attacks against Intel SGX,” in Proceedings of the 41st IEEE Symposium on Security and Privacy (S&P’20), 2020.
[18] Z. Kenjar, T. Frassetto, D. Gens, M. Franz, and A.-R. Sadeghi, “V0ltpwn: Attacking x86 processor integrity from software,” in 29th USENIX Security Symposium (USENIX Security 20), 2020, pp. 1445–1461.
[19] P. Patel, A. H. Ranabahu, and A. P. Sheth, “Service level agreement in cloud computing,” 2009.
[20] A. Tang, S. Sethumadhavan, and S. Stolfo, “CLKSCREW: exposing the perils of security-oblivious energy management,” in 26th USENIX Security Symposium (USENIX Security 17), 2017, pp. 1057–1074.
[21] Z. Chen, G. Vasilakis, K. Murdock, E. Dean, D. Oswald, and F. D. Garcia, “VoltPillager: Hardware-based fault injection attacks against Intel SGX Enclaves using the SVID voltage scaling interface.” USENIX Association, Aug. 2021. [Online]. Available: https://www.usenix.org/conference/usenixsecurity21/presentation/chen-zitai
[22] L. Upton, “Raspberry Pi colocation,” https://www.raspberrypi.org/blog/raspberry-pi-colocation, Apr 2013, last accessed on 2021-04-23.
[23] E. Blem, J. Menon, and K. Sankaralingam, “Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures,” in 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), 2013, pp. 1–12.
[24] H. Esmaeilzadeh, E. Blem, R. S. Amant, K. Sankaralingam, and D. Burger, “Dark silicon and the end of multicore scaling,” in 2011 38th Annual International Symposium on Computer Architecture (ISCA), 2011, pp. 365–376.
[25] E. Le Sueur and G. Heiser, “Dynamic Voltage and Frequency Scaling: The Laws of Diminishing Returns,” in Proceedings of the 2010 International Conference on Power Aware Computing and Systems, ser. HotPower’10. USA: USENIX Association, 2010, pp. 1–8.
[26] M. Eleršič, “linux-intel-undervolt,” https://github.com/mihic/linux-intel-undervolt, Aug 2017, last accessed on 2021-04-23.
[27] D. Brodowski, “Linux CPUFreq - CPU frequency and voltage scaling code in the Linux kernel,” https://www.kernel.org/doc/html/latest/cpu-freq/index.html, last accessed on 2021-04-23.
[28] M. Weiser, B. Welch, A. Demers, and S. Scott, “Scheduling for Reduced CPU Energy,” in First Symposium on Operating Systems Design and Implementation (OSDI 94). Monterey, CA: USENIX Association, Nov. 1994. [Online]. Available: https://www.usenix.org/conference/osdi-94/scheduling-reduced-cpu-energy
[29] J.-T. Wamhoff, S. Diestelhorst, C. Fetzer, P. Marlier, P. Felber, and D. Dice, “The TURBO Diaries: Application-Controlled Frequency Scaling Explained,” in Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference, ser. USENIX ATC’14. USA: USENIX Association, 2014, pp. 193–204.
[30] “Frequently Asked Questions about Enhanced Intel SpeedStep Technology for Intel,” https://www.intel.com/content/www/us/en/support/articles/000007073/processors.html, Jun. 2020, last accessed on 2021-04-23.
[31] Cool’n’Quiet Technology Installation Guide for AMD Athlon 64 Processor Based Systems, 0th ed., Advanced Micro Devices Inc., Jun. 2004.
[32] AMD PowerNow! Technology, A ed., Advanced Micro Devices Inc., Nov. 2000.
[33] L. S. Nielsen, C. Niessen, J. Sparso, and K. Van Berkel, “Low-power operation using self-timed circuits and adaptive scaling of the supply voltage,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 2, no. 4, pp. 391–397, 1994.
[34] J. L. Henning, “SPEC CPU2006 benchmark descriptions,” ACM SIGARCH Computer Architecture News, vol. 34, no. 4, pp. 1–17, 2006.
[35] C. Bienia, S. Kumar, J. P. Singh, and K. Li, “The PARSEC Benchmark Suite: Characterization and Architectural Implications,” in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, October 2008.
[36] Y. Zu, C. R. Lefurgy, J. Leng, M. Halpern, M. S. Floyd, and V. J. Reddi, “Adaptive guardband scheduling to improve system-level efficiency of the POWER7+,” in 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 2015, pp. 308–321.
[37] L. Tan, N. DeBardeleben, Q. Guan, S. Blanchard, and M. Lang, “Using virtualization to quantify power conservation via near-threshold voltage reduction for inherently resilient applications,” Parallel Computing, vol. 73, pp. 3–15, 2018, Parallel Programming for Resilience and Energy Efficiency. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167819117300996
[38] Y. Kim, L. K. John, S. Pant, S. Manne, M. Schulte, W. L. Bircher, and M. S. S. Govindan, “Audit: Stress Testing the Automatic Way,” in 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012, pp. 212–223.
[39] Z. Hadjilambrou, S. Das, M. A. Antoniades, and Y. Sazeides, “Sensing CPU Voltage Noise Through Electromagnetic Emanations,” IEEE Computer Architecture Letters, vol. 17, no. 1, pp. 68–71, 2018.
[40] B. Salami, E. B. Onural, I. E. Yuksel, F. Koc, O. Ergin, A. C. Kestelman, O. S. Unsal, H. Sarbazi-Azad, and O. Mutlu, “An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration,” in 2020 50th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2020.
[41] K. Beer, “How encryption works in AWS,” https://image.slidesharecdn.com/repeat-how-encryption-works-, Jun. 2019, last accessed on 2021-04-23.
[42] “Open Portable Trusted Execution Environment,” https://www.op-tee.org, last accessed on 2021-04-23.
[43] “AWS Nitro System,” https://aws.amazon.com/ec2/nitro/, last accessed on 2021-04-23.
[44] PowerSpy2, 1st ed., Alciom, 4 Mar. 2013.
[45] Arm Architecture Reference Manual: Armv8, for Armv8-A architecture profile, Arm ddi 0487f.b (id040120) ed., Arm Limited, Mar. 2020.
[46] G. Papadimitriou, M. Kaliorakis, A. Chatzidimitriou, D. Gizopoulos, P. Lawthers, and S. Das, “Harnessing voltage margins for energy efficiency in multicore CPUs,” in Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017, pp. 503–516.
[47] G. Papadimitriou, A. Chatzidimitriou, M. Kaliorakis, Y. Vastakis, and D. Gizopoulos, “Micro-viruses for fast system-level voltage margins characterization in multicore CPUs,” in 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). IEEE, 2018, pp. 54–63.
[48] G. Papadimitriou, M. Kaliorakis, A. Chatzidimitriou, C. Magdalinos, and D. Gizopoulos, “Voltage margins identification on commercial x86-64 multicore microprocessors,” in 2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS). IEEE, 2017, pp. 51–56.
[49] A. Bacha and R. Teodorescu, “Dynamic reduction of voltage margins by leveraging on-chip ECC in Itanium II processors,” in Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013, pp. 297–307.
[50] J. Leng, A. Buyuktosunoglu, R. Bertran, P. Bose, and V. J. Reddi, “Safe limits on voltage reduction efficiency in GPUs: a direct measurement approach,” in 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 2015, pp. 294–307.
[51] I. Cutress, “AMD Precision Boost Overdrive 2: Adaptive Undervolting For Ryzen 5000 Coming Soon,” https://www.anandtech.com/show/16267/amd-precision-boost-overdrive-2-adaptive-undervolting-for-ryzen-5000-coming-soon, Nov 2020, last accessed on 2021-04-23.
[52] C. Kalogirou, P. Koutsovasilis, C. D. Antonopoulos, N. Bellas, S. Lalis, S. Venugopal, and C. Pinto, “Exploiting CPU voltage margins to increase the profit of cloud infrastructure providers,” in 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019, pp. 302–311.
