
JANUS: Benchmarking Commercial and Open-Source Cloud and Edge Platforms for

Object and Anomaly Detection Workloads

Karthick Shankar, Pengcheng Wang, Ran Xu, Ashraf Mahgoub, Somali Chaterji
Purdue University, West Lafayette, IN, USA

arXiv:2012.04880v1 [cs.CV] 9 Dec 2020

Abstract—With diverse IoT workloads, placing compute and analytics close to where data is collected is becoming increasingly important. We seek to understand the performance and cost implications of running analytics on IoT data on the various available platforms. These workloads can be compute-light, such as outlier detection on sensor data, or compute-intensive, such as object detection from video feeds obtained from drones. In our paper, JANUS, we profile the performance/$ and the compute versus communication cost for a compute-light IoT workload and a compute-intensive IoT workload. In addition, we also look at the pros and cons of some of the proprietary deep-learning object detection packages, such as Amazon Rekognition, Google Vision, and Azure Cognitive Services, to contrast with open-source and tunable solutions, such as Faster R-CNN (FRCNN). We find that AWS IoT Greengrass delivers at least 2X lower latency and 1.25X lower cost compared to all other cloud platforms for the compute-light outlier detection workload. For the compute-intensive streaming video analytics task, an open-source solution to object detection running on cloud VMs saves on dollar costs compared to proprietary solutions provided by Amazon, Microsoft, and Google, but loses out on latency (up to 6X); if it runs on a low-powered edge device, the latency is up to 49X higher.

Keywords—sensor data outlier detection, object detection, AWS EC2, AWS IoT Greengrass, AWS Lambda

I. INTRODUCTION

Cloud computing (a.k.a. Infrastructure-as-a-Service) is becoming the main execution environment for many users due to its ease of management, scalability, and fault-tolerance. By removing the need for hardware and cluster management, users can now focus on their application needs and have a finer-granularity pricing model for their resource usage, even more so with the advent of serverless computing, where the billing is compute-driven rather than time-driven [2, 22]. For example, Amazon AWS provides a virtualized server-based computing service, Amazon Elastic Compute Cloud (Amazon EC2). Amazon also provides a serverless computing service, AWS Lambda, which allows users to execute their code without having to provision or manage servers. Users essentially pay for the exact amount of allocated resources and the compute time (in 100 ms increments), with no charge for idle time. With serverless computing, applications can automatically and instantaneously scale by running code in response to events or triggers. Amazon IoT Greengrass further extends the AWS infrastructure to the edge and provides lower-latency computation by running code on the same IoT device(s) that collect the data.

The three services vary in their strengths, and deciding which service to use for a given workload is not trivial for several reasons: (1) Users have different $ budgets and performance requirements. (2) Real-world workload characteristics often vary over time [25, 26, 27, 33], e.g., streaming video analytics can be compute-intensive in the case of fast-changing scenes and becomes lighter for relatively static scenes. (3) The $ cost of each service varies with time and geographical region. (4) Different services have different types of limitations, which may make it impossible to run a particular application on some service. For example, in AWS Lambda, a function has a time limit of 15 minutes, which makes running complex, stateful algorithms difficult. Also, picking exact configurations for the instances that run the serverless code is not possible (users can only specify the memory requirement, and other resources are scaled accordingly). Therefore, quantitative evaluation with representative applications is needed to identify the appropriate computing framework for an application and to explore the trade-off between accuracy, performance, $ cost, and configurability.

In JANUS¹, we compare several cloud computing services for two representative IoT applications that vary in complexity. The first is a simple outlier detection application for sensor data, and the second is a complex object detection application on streaming video. Both of these algorithms are ubiquitous in IoT, rely on online data streaming, and provide contrasting bandwidth requirements and algorithmic processing capabilities [29, 36]. Therefore, we select these two applications as representative workloads for compute-light vs. compute-intensive IoT applications. IoT devices are used for simple data acquisition in many scenarios, like in farms [11, 21] and for self-driving cars [9, 12]. Since the volume of data acquired through these sensors is high, it is often run through an outlier detection program to ensure proper analysis of the data and to discover faulty sensors. For instance, in the case of farm sensor data, a farmer would want to know the real-time temperature and humidity of the farm, and any delayed intervention may lead to losses in yield and consequent financial losses.

¹Our system's name is inspired by the Greek god who presides over passages, doors, gates, and endings, since he looks to the future and to the past. We aim for our system JANUS to carve out the correct transition to right-sized algorithms, to get maximum performance per dollar cost.
For the object detection workload, many works like [19] use different object detection algorithms on IoT devices for a variety of situations like gaze detection and surveillance in smart cities. Real-time object detection is essential in security-critical or latency-critical scenarios like self-driving cars, or, less critically yet increasingly prevalent, in large crowds at mass entertainment events. Thus, these two workloads represent popular options for IoT applications while showing variety in compute requirements.

We perform benchmarking experiments with these applications on two different platform types—edge computing and cloud computing platforms [31]. Within the edge computing platform type, we explore the commercial offerings, AWS Greengrass and Google IoT Edge, and two different types of compute nodes, a Raspberry Pi and a Docker container, the latter to emulate more resource-rich devices like the Nvidia Jetson series. Within the cloud computing platform type, we conduct experiments on three commercial offerings, Amazon EC2, Google Compute, and Microsoft Azure Virtual Machine. Our goal is to aid in the selection of the best platform for each target application. In addition, given the huge increase in demand for streaming video analytics (such as object detection), we profile three leading commercial offerings, Amazon Rekognition, Google Vision, and Azure Cognitive Services, to benchmark against a popular open-source region-based CNN using attention mechanisms called Faster R-CNN (FRCNN) [28]. We also use FRCNN to show possible trade-offs between latency and accuracy that can impact the end-to-end $ cost. We use FRCNN, as opposed to other popular object detection algorithms, e.g., YOLO and SSD, since it has higher accuracy, with classification and bounding-box regression in consecutive stages, at the expense of computational complexity (useful to showcase JANUS's compute-intensive use case).

In this paper, we ask three questions vis-à-vis the computing platforms and software packages described above.

1) What platform should an IoT workload run on, on the cloud and on the edge, respectively, for a compute-intensive and for a compute-light workload?
2) What is the latency and $ cost of running on each platform?
3) What is the advantage of using an open-source object detection framework on a cloud-based virtual machine over using the commercial offerings?

Following are the chief insights that come out of JANUS.

1) Our benchmarking of the compute-light IoT workload reveals that AWS Greengrass delivers at least 2X lower latency and 1.25X lower cost compared to all other platforms for this workload (Tables III and IV).
2) Our benchmarking of the compute-intensive object detection algorithm on streaming video on Amazon Rekognition, Google Vision, and Faster R-CNN (on Amazon EC2) reveals that Faster R-CNN is 12.8X to 21.0X cheaper than the Amazon Rekognition and Google Vision solutions but is also much slower than the others (Table IX). Also, we propose a novel approximation of Faster R-CNN and show that we can flexibly navigate the space of latency versus accuracy. In contrast, for the commercial offerings, no such tradeoff is possible. Among the commercial offerings, Google Vision is faster but less performant in latency/$ terms than Amazon Rekognition (55% less) and Azure Cognitive Services (11% less).
3) To delve deeper into the open-source Faster R-CNN, we execute it on the three commercial cloud platforms—Amazon EC2, Microsoft Azure, and Google Compute. We find that we can execute more frames per $ on Google Compute than EC2 (94%) and Azure (153%). We also see that one approximation knob (the number of region proposals in FRCNN) has a significant effect on the running time—the running time is reduced by 57.3% when approximating aggressively compared to the default parameter value, while the accuracy is reduced by only 9% (Table IX). This tunability also has an additional interpretability benefit, which is helpful in several domains [23].

II. BACKGROUND

Here we give a brief description of the foundational platform, the different commercial edge computing platforms, and the vision services that we benchmark in this paper. We also describe the open-source object detection software package, Faster R-CNN. Since the commercial cloud computing platforms we consider here are so commonplace, we omit their background information.

Edge computing is the practice of placing computing resources at the edges of the Internet, in close proximity to devices and information sources. This, much like a cache on a CPU, increases bandwidth and reduces latency for applications, but at a potential cost of dependability and capacity [10]. This is because these edge devices are often not as well maintained, dependable, powerful, or robust as centralized, server-class cloud resources.

The edge paradigm supports the large scale of IoT devices, where real-time data is generated based on interactions with the local environment. This complements the more heavy-duty processing and analytics occurring at the cloud level. This structure serves as the backbone for applications, such as augmented reality and home automation, which utilize complex information processing to analyze the local environment to support decision making. In the IoT domain, functional inputs and outputs are physically tied to geographically distributed sensors and actuators. If this data is processed
in a central location, immense pressure will be placed on "last mile" networks, and cloud-leveraged IoT deployments will become impractical.

AWS Greengrass is a service offered by Amazon (initially as an IoT gateway, now morphed into an edge computing service) that enables data management, durable storage, cloud analytics, and local computing capabilities for connected edge devices. Notice that Greengrass does not provide any compute power itself and should be looked upon more as an orchestrator among devices that are outside of the AWS framework and provided by the user. Connected devices can run AWS Lambda functions or Docker containers, while data and control flow to these devices through the Greengrass framework. Subsets of the information generated at the edge can be communicated back to the AWS Cloud. Greengrass also keeps devices' data in sync and securely communicates with other devices, even when not connected to the Internet. This means that Greengrass-connected IoT devices can still respond quickly to local triggers, interact with local resources, and minimize the costs associated with transmitting IoT data to the cloud. Its architecture is shown in Figure 1.

Figure 1: AWS Greengrass Architecture [5]

Cloud IoT Core is a service offered by Google that allows secure connectivity, management, and ingestion of data from millions of globally dispersed devices for operational efficiency [16]. Cloud IoT Core runs on Google's serverless infrastructure, which adaptively scales horizontally in response to real-time events. As in the Greengrass case, the user has to provide the device on which the computation of Cloud IoT Core will run. Cloud IoT Core supports the standard MQTT (Message Queue Telemetry Transport, essentially the messaging protocol for IoT) and HTTP protocols, making it easy for devices to be registered to Cloud IoT Core. Its architecture is shown in Figure 2.

Figure 2: Google IoT Core Architecture [16]

AWS Greengrass versus Cloud IoT Core: For AWS Greengrass, AWS IoT Greengrass Core provides local services (compute, messaging, state, security) and communicates locally with devices that run the AWS IoT Device SDK [6]. For Google Cloud, the IoT Core provides the services to communicate with the various IoT devices that have been registered to it. As such, a key difference between the two is that Greengrass works with devices by running AWS Lambda functions and communicating through the SDK, while Google IoT Core uses the standard MQTT (a machine-to-machine telemetry protocol) or HTTP protocol for communications. AWS Greengrass and Google IoT Core both present a gateway between the edge IoT devices and more powerful cloud services. They act as connectors to move data between the edge and the cloud. With AWS Greengrass, Lambda functions are run between the edge machines through the AWS Greengrass SDK, while with Google IoT Core, MQTT commands are used.
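To make the Cloud IoT Core side concrete, the sketch below shows how a device might publish one telemetry message over the MQTT bridge. This is a minimal illustration assuming the standard IoT Core conventions (JWT-based authentication, the long-form client ID, and the per-device events topic); the project, registry, device, and key-file names are placeholders, not artifacts from our deployment.

```python
# Minimal device-side publish to Google Cloud IoT Core over MQTT (a sketch;
# project/registry/device names and the key file are placeholders).
import datetime
import ssl

import jwt                      # PyJWT, used to mint the connection password
import paho.mqtt.client as mqtt

project, region = "my-project", "us-central1"
registry, device = "janus-registry", "device-30"

# IoT Core ignores the MQTT username; the password is a short-lived JWT
# signed with the private key registered for this device.
token = jwt.encode(
    {"iat": datetime.datetime.utcnow(),
     "exp": datetime.datetime.utcnow() + datetime.timedelta(minutes=60),
     "aud": project},
    open("rsa_private.pem").read(),
    algorithm="RS256",
)

client = mqtt.Client(client_id=f"projects/{project}/locations/{region}"
                               f"/registries/{registry}/devices/{device}")
client.username_pw_set(username="unused", password=token)
client.tls_set(tls_version=ssl.PROTOCOL_TLSv1_2)
client.connect("mqtt.googleapis.com", 8883)

# Telemetry goes to the per-device events topic.
client.publish(f"/devices/{device}/events", '{"temperature": 21.4}', qos=1)
```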
Pricing differences: In Amazon's AWS IoT Greengrass, payments are structured per the number of AWS IoT Greengrass Core devices that are connected to and interact with the AWS Cloud in a given month. This price depends on the region that is configured with Greengrass and ranges from $0.16–$0.18 per month per IoT Greengrass Core device. There is no additional cost for the number of AWS IoT SDK-enabled devices locally connected to any IoT Greengrass Core [7]. However, there can be additional charges with AWS IoT Greengrass if data transfer or any other AWS service is involved in the application, the pricing of which depends on the service used. Amazon S3 is commonly used when there is a large quantity of data that needs to be stored and processed elsewhere. In compute-light applications where the edge device does some processing and sends back just a simple result, e.g., outlier detection where the number of outliers is sent as the result, S3 is not needed. In contrast, for compute-intensive applications that involve images or large data sets, S3 can be used to store the entire data set while the edge devices download and process small chunks. In our evaluations, we do all the processing on the device with no extra storage overhead for outlier detection. For object detection, we store the videos on the sensor device and upload them to the service's API for processing, thus incurring no cost for cloud storage. In contrast, Google's IoT Core pricing is tiered according to the data volume used in a calendar month [17]. This volume is based on all data exchanges between the devices connected to Google IoT Core. Cloud IoT Core is priced per MB of data exchanged by IoT devices with the service after a 250MB free tier. Beyond that, in the initial range, the price is $0.0045 per MB and goes down to $0.00045 per MB at the high end of the range, beyond a certain threshold of data usage.
Amazon Rekognition, Google Vision, and Azure Vision Services for our object detection application: Amazon Rekognition provides an API for analyzing images (Amazon Rekognition Image), which we use for streaming video analysis. Rekognition uses deep learning algorithms, provides SDKs for many programming languages, and requires no machine learning expertise. With Amazon Rekognition, one can identify objects, people, text, scenes, and activities in images and videos, as well as flag inappropriate content. Amazon Rekognition also provides facial analysis and facial search capabilities that one can use to detect, analyze, and compare faces for user verification, people counting, and public safety use cases [4]. It provides an easy-to-use API that returns the results of computation, but without giving the user the ability to control the backend of the computation. In addition, although we are able to control the AWS availability zone for the VM selection (for latency considerations), the selection may be too coarse for applications with strict low-latency requirements, such as autonomous driving.

Google Vision and Azure Cognitive Services are similar to Amazon Rekognition in that they are also image analysis services that offer powerful pre-trained machine learning models through REST and RPC APIs. Google Vision can detect objects and faces, read printed and handwritten text, and build valuable metadata into an image catalog [18]. Azure Cognitive Services also provides form and ink recognition to analyze written documents and handwriting.

The use cases for these services greatly depend on the application and the problem that needs to be solved. Amazon Rekognition has support for popular tasks like object detection, celebrity recognition, face recognition, content moderation, and text detection. Amazon Rekognition also offers the Pathing option, which allows users to run videos through the service and see the paths that the people in the video take [4]. Google Vision, on the other hand, offers product search options to scan a product and quickly find similar listings [18]. Azure Vision also has face recognition technology similar to Amazon Rekognition.

Customized vision applications versus commercial offerings—engineering challenges and solutions for the data engineer: The APIs in all three platforms ease the process of prototyping a computer vision application. However, we also notice that developers are not able to specify the backend compute infrastructure on which the application will run. For example, we are not able to leverage our edge device to force the services to run next to our data storage. Developers are also not able to use their own model, select a model to run, or tune the configuration knobs of the model for desired accuracy/runtime/energy specifications. Considering the two challenges above, data engineers can leverage their own computer vision applications on AWS EC2, AWS Lambda, or AWS IoT Greengrass. This is where an open-source software package like Faster R-CNN comes into play.

Faster R-CNN (FRCNN) [28]: FRCNN is a state-of-the-art object detection algorithm based on using region proposal networks to hypothesize object locations. It thus speeds up its earlier versions—both R-CNN and Fast R-CNN use selective search to find region proposals [14, 15], and selective search is slow, affecting the performance of the network. In contrast, FRCNN uses a separate network to predict the region proposals (Region Proposal Networks, RPNs). RPNs are designed to efficiently predict region proposals (with a high recall) over a wide range of sizes and aspect ratios, by using novel "anchor" boxes to serve as references at different scales and aspect ratios. Region proposals are then reshaped using a region of interest (RoI) pooling layer, which essentially uses inputs of non-uniform sizes to obtain fixed-size feature maps. The network then classifies the image within each proposed region and further refines the bounding boxes (regressor). We add to FRCNN different levels of approximation that can be tuned at runtime, for different points in the latency vs. accuracy space. An easy-to-adjust approximation parameter is the number of proposals that an RPN generates, which by convention is set to the largest possible number of objects in the image. Since the classifier and bounding-box regressor are region-wise, a smaller number of proposals reduces the execution cost, at the risk of reduced accuracy when a large number of objects exist in the image. This notion of context-aware approximation has been introduced in some domains, like genomics [24], and, closer to our application context, streaming video processing [34, 35]. Here our objective is to expose this novel approximation knob and to show that this kind of configurability is present only in the open-source options.
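To make this knob concrete, the sketch below shows how the proposal budget can be dialed down in torchvision's reference FRCNN implementation; torchvision is used here purely for illustration, and the parameter names are those of that library rather than of the exact codebase we benchmark.

```python
# Trading accuracy for latency via the proposal-count knob, illustrated with
# torchvision's reference Faster R-CNN (an assumption; not our exact codebase).
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    pretrained=True,
    rpn_post_nms_top_n_test=10,  # aggressive: keep only 10 region proposals
    box_detections_per_img=10,   # cap the final detections to match
)
model.eval()

with torch.no_grad():
    frame = torch.rand(3, 480, 640)   # stand-in for one decoded video frame
    out = model([frame])[0]           # dict with 'boxes', 'labels', 'scores'
```

Since the classification and box-regression heads run once per surviving proposal, shrinking this budget directly shrinks the per-frame compute, which is the effect quantified later in Table IX.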
III. EXPERIMENTAL SETUP

Here we describe the benchmark data sets used in the study, the workload analysis performed (outlier detection vs. object detection), and the experimental setup for the different platforms used in our study. These platforms include:

• Edge Devices
  1) Raspberry Pi 4 Model B
  2) Emulated edge device (using Docker containers)
• Cloud Platforms
  1) Amazon EC2
  2) Google Compute
  3) Microsoft Azure Virtual Machine
• IoT Managers
  1) AWS Greengrass
  2) Google IoT Core
• Other Commercial Offerings
  1) Amazon AWS Lambda (serverless functions)
  2) Amazon Rekognition
  3) Google Vision
  4) Microsoft Azure Cognitive Services
We do an exhaustive assessment of these cloud, edge, and IoT orchestration platforms to evaluate the efficacy of the different vendors' hardware platforms, architectures, and networking protocols. Figures 3 and 4 show the setups for both workloads.

A. Outlier-Detection data description

The data we used for this benchmarking contains 21k points collected from February to October 2019 using temperature and humidity sensors deployed in sensorized farms and manufacturing units on Purdue University's campus. We apply extreme value analysis (EVA), a popular and simple statistical analysis, to identify outliers in the data. In this analysis, we fit a Gaussian distribution to the data and use standard statistical outlier detection. Under the Gaussian distribution assumption, we expect 68% of the data points to be within one standard deviation of the mean and 95% to be within two standard deviations of the mean. We use this distance from the mean as our outlier cut-off threshold. Tables I–VIII show the number of outliers in both temperature and humidity readings with varying cut-off thresholds. Figure 5 shows the temperature and humidity variation of a single device.

B. Object-Detection data description

For the compute-intensive workload analysis, we use video data from the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2015 [30] for video object detection. The evaluation data set contains 555 video snippets with labels from 30 object categories as ground truth. These videos are good representations of real captured videos from surveillance cameras or drone cameras. We perform object detection on these videos—i.e., we classify rectangular regions on each frame into one of the 30 object categories. We assume the video data is stored on the sensor device, and the processing (object detection) is done using different proprietary algorithms through commercial offerings or using variants of the Faster R-CNN model, a widely used custom whitebox model, on a cloud virtual machine.

C. Infrastructure setup

In our experiments, we use a Raspberry Pi 4 Model B as our edge device. This model has a Broadcom BCM2711 quad-core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz with 4GB LPDDR4-3200 SDRAM, and is one of the most popular edge devices among developers. We also use Docker containers as additional emulated edge devices with higher compute capability, similar to higher-powered devices like the NVIDIA Jetson. We use this strategy to be able to adaptively control the edge specs while not needing additional hardware. This, in addition to the real edge device, gives us a platform for trying out different edge specifications.

We use 1 CPU and 1 GB RAM as our Docker containers' specification for sensor and edge devices. Our server, with a six-core Intel Xeon CPU E5-2440 clocked at 2.40GHz and 48GB RAM, is powerful enough to simulate multiple sensor and edge devices. Ismail et al. [20] show that Docker containers provide fast deployment, small footprints, elasticity, and good performance, which enable them to simulate edge devices. Furthermore, Docker images are small and lightweight, making the CPU, memory, storage, and network performance similar to physical edge devices [13].
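The following sketch launches an emulated edge device with the 1-CPU/1-GB budget described above, using the Docker SDK for Python; the image name and command are placeholders for our packaged workload rather than actual artifacts.

```python
# Pin an emulated edge device to a 1 CPU / 1 GB budget (image and command
# are hypothetical placeholders for the packaged workload).
import docker

client = docker.from_env()
container = client.containers.run(
    "janus-edge:latest",
    command="python outlier_detection.py",
    nano_cpus=1_000_000_000,   # 1e9 nano-CPUs = 1 CPU
    mem_limit="1g",            # 1 GB RAM, matching our edge spec
    detach=True,
)
container.wait()               # block until the workload finishes
print(container.logs().decode())
```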
We also use the IoT orchestrators (AWS Greengrass and Google IoT Core) for both the outlier detection and object detection experiments on the Raspberry Pi and the Docker container (emulated edge device). This is to account for scalability if processing is done on multiple edge devices. For the cloud platform experiments, this is not necessary since the data is sent to a central storage location and not individually processed by each device. Our cloud infrastructure is as follows for the different platforms:

1) Amazon EC2: c5.large (2 vCPUs, 4 GiB memory)
2) Google Compute: e2-standard (2 vCPUs, 8 GiB memory)
3) Microsoft Azure Virtual Machine: Standard F2s v2 (2 vCPUs, 4 GiB memory)

In the cases where we use commercial offerings that have more sophisticated object detection algorithms, we are forced to use the APIs provided by the vendors, without the ability to control the backend device or any parameters.

IV. EVALUATION

A. Data Preprocessing

For the outlier-detection application, we use temperature and humidity data collected from 26 WHIN-IoT devices. We divide the data into monthly segments. Next, we assume that the temperature and humidity are normally distributed (Gaussian), and we compute the mean and standard deviation of the monthly measurements to identify outliers (a minimal sketch of this computation appears at the end of this subsection). Here, the goal and motivation of outlier detection is to track whether the sensors are malfunctioning.

For the object-detection application, we download the widely used ILSVRC 2015 video dataset [1] to the sensor device as a stand-in for videos captured on the sensors. We use Amazon Rekognition, Google Vision, Azure Cognitive Services, and our custom "service" (Faster R-CNN on cloud VMs) for object detection on these videos. The video data is processed in a streaming manner, with each frame processed separately.

Overall, we wanted to benchmark using the ubiquitous sensor data sets that are generated in different urban and rural IoT settings, such as smart factories [32] or connected farms [11]. We do the analysis on different platforms offered by Amazon, Microsoft, and Google. These platforms provide different virtual machine specifications and price structures.
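For concreteness, this is a minimal sketch of the EVA pass described above; the file layout and column names are illustrative assumptions, not the exact format of the WHIN sensor data.

```python
# Minimal extreme value analysis (EVA): flag points beyond k standard
# deviations from the mean (file name and column names are illustrative).
import pandas as pd

def count_outliers(values: pd.Series, k: int) -> int:
    mu, sigma = values.mean(), values.std()
    outside = (values < mu - k * sigma) | (values > mu + k * sigma)
    return int(outside.sum())

# One monthly segment for one device (hypothetical file name).
readings = pd.read_csv("device_30_2019_07.csv")
for k in (1, 2, 3):                          # the three cut-off thresholds
    print(f"mu +/- {k} sigma:",
          count_outliers(readings["temperature"], k), "temperature,",
          count_outliers(readings["humidity"], k), "humidity")
```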
Figure 3: Overall Setup for Outlier Detection Evaluation. Data is collected from the IoT sensor devices (temperature and
humidity) and stored/streamed to the appropriate storage option for different platforms. For example, we store the data in
Amazon S3, which is a remote storage, for cases where the processing is on the cloud or on AWS Lambda. We store the
data in the device's local storage when the processing is on the edge device. We then perform the same analysis across
all the platforms and report the latency and $ cost for each service. By tuning the threshold for outlier detection, we can
get different proportions of outliers.

Table I: Performance and cost metrics for running our compute-light operation, specifically, outlier detection, on AWS Lambda.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | 549.845 | 1,045.64 | 1,100 | 92 | $0.000004587
µ±2×σ | 446 (2.101%) | 561 (2.643%) | 605.557 | 1,104.7 | 1,200 | 92 | $0.000005004
µ±3×σ | 6 (0.028%) | 5 (0.024%) | 545.787 | 1,063.51 | 1,100 | 93 | $0.000004587

B. Experiments and Results (Outlier Detection)

1) Processing on AWS Lambda: In this section, we evaluate the performance and $ cost of running our compute-light workload (outlier detection) on the AWS Lambda service. We use the temperature and humidity readings from a single device in the sensor network. Since AWS Lambda has a number of limitations [8] (such as the maximum timeout of 15 minutes on a single Lambda execution), analyzing the data points from all 26 devices on our campus network is infeasible for a single Lambda. Therefore, we use a single Lambda per device and report the average performance and $ cost across all devices. We set the Lambda's max memory to 256 MB; all other resources (such as CPU compute capacity) are scaled proportionally to the max memory specified. We store the data in Amazon S3 and have the Lambda function download it directly from S3 (sketched below). We draw several insights. First, the number of outliers decreases as the cut-off threshold increases (Table I). However, the runtime is almost identical across the three thresholds. Another major advantage of using AWS Lambda is that it has finer-granularity billing for short-lived jobs vis-à-vis other platforms that have a per-minute minimum-charge duration (e.g., EC2 charges for a 60s minimum duration [3]).
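A simplified sketch of such a per-device function follows; the bucket name, key layout, and event fields are illustrative placeholders rather than the exact artifacts behind Table I.

```python
# Per-device outlier-detection Lambda: pull the device's readings from S3,
# apply the mu +/- k*sigma cut-off, and return only the counts. Bucket and
# key names are hypothetical.
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    obj = s3.get_object(Bucket="janus-sensor-data",
                        Key=f"device_{event['device_id']}.csv")
    rows = obj["Body"].read().decode().splitlines()[1:]   # skip CSV header
    temps = [float(r.split(",")[1]) for r in rows]
    mu = sum(temps) / len(temps)
    sigma = (sum((t - mu) ** 2 for t in temps) / len(temps)) ** 0.5
    k = float(event.get("threshold", 2))                  # cut-off in sigmas
    outliers = sum(1 for t in temps if abs(t - mu) > k * sigma)
    return {"statusCode": 200, "body": json.dumps({"outliers": outliers})}
```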
2) Processing on an emulated device with AWS IoT Greengrass: Next we show the performance and $ cost of running our outlier detection workload on an emulated edge device using the AWS Greengrass IoT platform. Moreover, we load the data directly from the container's file-system instead of querying the data from S3. This is done to show the advantage of running the analysis on the edge, closer to where the data is being collected, and hence having a lower execution latency. Table II shows the corresponding performance and execution costs for the three cut-off thresholds. AWS Greengrass provides flat pricing per device, so the costs are independent of the execution time. Moreover, we notice the very low execution time compared to AWS Lambda (52–54 msec for the emulated edge device with Greengrass vs. > 1 sec for Lambda), since there is no data-passing overhead.

3) Processing on Raspberry Pi 4B with AWS IoT Greengrass: Now we show the analysis using a Raspberry Pi 4B edge device connected to the AWS Greengrass IoT platform. Again, we load the data directly from the device's file-system rather than querying the data from S3. Table III shows the corresponding performance and execution costs for the three cut-off thresholds. We notice that the execution times are higher than for the emulated edge device but lower than for AWS Lambda (1 sec for AWS Lambda > 163–178 msec for Raspberry Pi with Greengrass > 52–54 msec for emulated edge device with Greengrass), since there is no data-passing overhead.
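The edge-side variant of the function reads from local storage and publishes only the aggregate result upstream, which is precisely what removes the data-passing overhead; the sketch below assumes the AWS IoT Greengrass Core SDK for Python, with the file path and MQTT topic as placeholders.

```python
# Edge-side outlier detection under Greengrass: read locally, publish only
# the count (file path and topic are hypothetical placeholders).
import json
import greengrasssdk

client = greengrasssdk.client("iot-data")

def lambda_handler(event, context):
    with open("/data/device_30.csv") as f:
        temps = [float(line.split(",")[1]) for line in f.readlines()[1:]]
    mu = sum(temps) / len(temps)
    sigma = (sum((t - mu) ** 2 for t in temps) / len(temps)) ** 0.5
    outliers = sum(1 for t in temps if abs(t - mu) > 2 * sigma)
    client.publish(topic="janus/outliers",
                   payload=json.dumps({"device": 30, "outliers": outliers}))
```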
Table II: Performance and cost metrics for performing outlier detection on an emulated edge device with AWS Greengrass as the IoT manager. Greengrass provides flat monthly billing.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | - | 52.099 | - | 92 | $0.0000037
µ±2×σ | 446 (2.101%) | 561 (2.643%) | - | 54.577 | - | 92 | $0.0000037
µ±3×σ | 6 (0.028%) | 5 (0.024%) | - | 52.490 | - | 93 | $0.0000037

Table III: Performance and cost metrics for performing outlier detection on Raspberry Pi 4B with AWS Greengrass as the IoT manager.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | - | 178.27 | - | 12 | $0.0000037
µ±2×σ | 446 (2.101%) | 561 (2.643%) | - | 172.09 | - | 12 | $0.0000037
µ±3×σ | 6 (0.028%) | 5 (0.024%) | - | 163.72 | - | 12 | $0.0000037

Table IV: Performance and cost metrics for outlier detection on an emulated edge device with Google IoT. Unlike AWS Greengrass, Google IoT charges only for the data transfer between edge devices or between the edge and the cloud. We estimate the price and neglect the initial free data volume per account.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | - | 95.47 | - | 32 | $0.0045
µ±2×σ | 446 (2.101%) | 561 (2.643%) | - | 102.7 | - | 30 | $0.0045
µ±3×σ | 6 (0.028%) | 5 (0.024%) | - | 85.3 | - | 31 | $0.0045

Table V: Performance and cost metrics for performing outlier detection on Raspberry Pi 4B with Google IoT as the IoT manager.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | - | 326.67 | - | 32 | $0.0045
µ±2×σ | 446 (2.101%) | 561 (2.643%) | - | 351.41 | - | 30 | $0.0045
µ±3×σ | 6 (0.028%) | 5 (0.024%) | - | 291.87 | - | 31 | $0.0045

Table VI: Performance and cost metrics for performing outlier detection on Amazon EC2. We notice that the billed duration is 1 minute (unlike AWS Lambda), as that is the minimum per-instance charge in EC2.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | 404 | 657 | 60000 | 37 | $0.001417
µ±2×σ | 446 (2.101%) | 561 (2.643%) | 404 | 666 | 60000 | 37 | $0.001417
µ±3×σ | 6 (0.028%) | 5 (0.024%) | 404 | 675 | 60000 | 37 | $0.001417

Table VII: Performance and cost metrics for performing outlier detection on Google Compute Engine.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | 512 | 770 | 60000 | 32 | $0.00112
µ±2×σ | 446 (2.101%) | 561 (2.643%) | 512 | 776 | 60000 | 30 | $0.00112
µ±3×σ | 6 (0.028%) | 5 (0.024%) | 512 | 768 | 60000 | 31 | $0.00112

Table VIII: Performance and cost metrics for performing outlier detection on Microsoft Azure Virtual Machine.

Metric | Temperature outliers (%) | Humidity outliers (%) | Data-Passing Duration (ms) | Duration (ms) | Billed Duration (ms) | Memory Size (MB) | $ cost
µ±1×σ | 5,978 (28.165%) | 5,706 (26.883%) | 373 | 633 | 1000 | 32 | $0.000003
µ±2×σ | 446 (2.101%) | 561 (2.643%) | 373 | 645 | 1000 | 30 | $0.000003
µ±3×σ | 6 (0.028%) | 5 (0.024%) | 373 | 463 | 1000 | 31 | $0.000003

Figure 4: Overall Setup for Object Detection Evaluation. The video data set is ILSVRC VID 2015 and is stored on the embedded device (Raspberry Pi or an emulated device using a Docker container). It is directly uploaded using the vendor's API if needed, and uploaded to AWS S3 for processing on cloud platforms. Our custom algorithm, Faster R-CNN (FRCNN), where the customization knob is the number of proposals, is used for comparison with three commercial services on their corresponding platforms. We report execution time/frame and $ cost for each service on each platform.

Figure 5: Device 30's temperature and humidity variation and outlier detection thresholds. (a) Humidity variation for Device 30. (b) Temperature variation for Device 30.

4) Processing on an emulated device with Google IoT Core: Here we use the Google IoT platform and a Docker container that emulates an edge device. We use the same data set as with the previous three platforms and show the performance and cost metrics in Table IV. We notice that running outlier detection on an emulated device with Google IoT performs slightly slower than with AWS Greengrass. We also notice that, in terms of price, Google IoT's pricing model (which is based on the volume of data transfer) shows the highest $ cost across the four platforms. This is because the minimum data size used for billing is 1 MB (which costs $0.0045). However, Google IoT still provides significantly better performance (i.e., lower latency) compared to AWS Lambda and AWS EC2.

5) Processing on Raspberry Pi 4B with Google IoT Core: Now we evaluate the performance and $ cost for outlier detection on Raspberry Pi 4B using the Google IoT platform. Similar to the previous subsection, we notice that running this on the Raspberry Pi with Google IoT performs slightly slower than on the corresponding platform in AWS.

6) Processing on Amazon EC2: Here we execute the outlier detection application on an AWS EC2 instance (c5.large). As stated earlier, EC2 has a minimum billing duration of 60 sec [3], which makes it more expensive for short-lived jobs compared to AWS Greengrass or AWS Lambda. Accordingly, we find EC2 to be the most expensive service compared to the other platforms (Table VI). In terms of latency, EC2 also suffers from the data-passing overhead (similar to AWS Lambda). However, it performs better than AWS Lambda since it has higher compute capacity.
AWS-Lambda since it has higher compute capacity. compared to the default parameter value for the number

8
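For reference, the commercial per-frame path can be as simple as the sketch below, which sends one uploaded frame to Amazon Rekognition's label-detection API; the bucket and key are placeholders, and the MaxLabels/MinConfidence settings are illustrative rather than the exact values we used.

```python
# One frame through Amazon Rekognition (bucket/key and thresholds are
# illustrative placeholders).
import boto3

rekognition = boto3.client("rekognition")

def detect_objects(bucket: str, frame_key: str):
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": frame_key}},
        MaxLabels=30,         # on the order of the 30 ILSVRC categories
        MinConfidence=50.0,
    )
    return [(label["Name"], label["Confidence"])
            for label in response["Labels"]]

print(detect_objects("janus-frames", "video_001/frame_0001.jpg"))
```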
We also show results with different numbers of proposals for Faster R-CNN to highlight the tunability advantage of open-source algorithms. This can have an impact on the execution time, as shown in Table IX: the running time decreases by 57.3% when approximating aggressively compared to the default parameter value for the number of proposals. It can also have an impact on the mean Average Precision (mAP), as seen in the table: for the same aggressive approximation setting, the accuracy decreases by 9% compared to the default value.

Table IX: Performance and cost metrics for running compute-intensive operations, specifically, object detection, on Amazon Rekognition, Google Vision, and Microsoft Cognitive Services, and on an open-source software package, FRCNN.

Type | Platform | Accuracy | Frames/$ | Time/frame
Open-source, Edge | FRCNN (100 proposals) on Raspberry-Pi 4B | 59.11% | - | 23.984 sec
Open-source, Edge | FRCNN (50 proposals) on Raspberry-Pi 4B | 58.53% | - | 16.945 sec
Open-source, Edge | FRCNN (10 proposals) on Raspberry-Pi 4B | 50.13% | - | 10.234 sec
Open-source, Cloud | FRCNN (100 proposals) on Amazon EC2 | 59.11% | 77,266 | 2.318 sec
Open-source, Cloud | FRCNN (100 proposals) on Google Compute Engine | 59.11% | 24,666 | 2.178 sec
Open-source, Cloud | FRCNN (100 proposals) on Microsoft Azure Virtual Machine | 59.11% | 59,306 | 3.02 sec
Commercial, Cloud | Amazon Rekognition | - | 1000 | 0.633 sec
Commercial, Cloud | Google Vision | - | 444 | 0.471 sec
Commercial, Cloud | Microsoft Azure Cognitive Services | - | 500 | 0.488 sec

Our first observation for performance is that our custom service is up to 49X slower than Amazon's, Google's, and Microsoft's commercial object detection services. However, this is offset by the fact that running a large job with many images can cost more on the commercial offerings, as shown by their lower frames/$ as opposed to running open-source algorithms on the cloud. The decision is left up to the user to evaluate the tradeoff between runtime and cost, with the help of benchmarking efforts like JANUS. This shows the advantage of the more evolved commercial services in reducing latency and providing speedy detection results. However, some drawbacks of the commercial offerings are that they are a black box and do not offer tuning knobs that can trade latency for accuracy or price. Furthermore, they do not provide a metric for accuracy, and they do not allow the user to pick the backend on which they run.

The next observation is that among the commercial services, Google's is less $ efficient than Amazon's (55%) and Microsoft's (11%). However, it is the fastest performer—3% faster than Microsoft Azure Cognitive Services and 25% faster than Amazon Rekognition. This can also be seen when running the open-source algorithm on the cloud, where FRCNN on EC2 is the most $ efficient while FRCNN on Google Compute is the fastest per frame.

V. CONCLUSION

In this paper, we presented JANUS, the first benchmarking effort of edge computing platforms for different kinds of IoT workloads. We profile Amazon's and Google's edge offerings for a compute-light IoT workload (outlier detection on sensor data) and a compute-intensive IoT workload (object detection on streaming video). For the object detection workload, we also use the proprietary Amazon, Google, and Microsoft computer vision offerings and benchmark them against an open-source package called Faster R-CNN. Our results show that for compute-light workloads, edge-based services like AWS Greengrass and Google IoT provide the best performance and $ cost, with AWS Greengrass delivering up to 2X lower latency and up to 1.25X lower cost compared to Google IoT. In contrast, for compute-intensive workloads, the magnitude of the tradeoff between latency/execution time and cost is non-trivial. We show that a custom service can be up to 49X slower if run on a slow edge device and up to 6X slower if run on a cloud virtual machine vis-à-vis proprietary solutions by Google, Amazon, or Microsoft. We also show how to speed up the open-source solution by approximating aggressively, reducing runtime by 57.3% at the cost of a 9% drop in accuracy, which highlights the tunability of custom solutions.

REFERENCES

[1] ImageNet: Large Scale Visual Recognition Challenge 2015 (ILSVRC2015). http://image-net.org/challenges/LSVRC/2015/#vid, 2015.
[2] Akkus, I. E., Chen, R., Rimac, I., Stein, M., Satzke, K., Beck, A., Aditya, P., and Hilt, V. SAND: Towards high-performance serverless computing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18) (2018), pp. 923–935.
[3] Amazon. Amazon EC2 per second billing. https://aws.amazon.com/about-aws/whats-new/2017/10/announcing-amazon-ec2-per-second-billing/.
[4] Amazon. Amazon Rekognition. https://aws.amazon.com/rekognition/.
[5] Amazon. AWS IoT Greengrass. https://aws.amazon.com/greengrass/.
[6] Amazon. AWS IoT Greengrass FAQs. https://aws.amazon.com/greengrass/faqs/.
[7] Amazon. AWS IoT Greengrass Pricing. https://aws.amazon.com/greengrass/pricing/.
[8] Amazon. AWS Lambda Limits. https://docs.aws.amazon.com/lambda/latest/dg/limits.html.
[9] Bagchi, S., Aggarwal, V., Chaterji, S., Douglis, F., Gamal, A. E., Han, J., Henz, B. J., Hoffmann, H., Jana, S., Kulkarni, M., et al. Grand challenges of resilience: Autonomous system resilience through design and runtime measures. arXiv preprint arXiv:1912.11598 (2019).
[10] Bagchi, S., Siddiqui, M.-B., Wood, P., and Zhang, H. Dependability in edge computing. Communications of the ACM 63, 1 (2019), 58–66.
[11] Chaterji, S., DeLay, N., Evans, J., Mosier, N., Engel, B., Buckmaster, D., and Chandra, R. Artificial intelligence for digital agriculture at scale: Techniques, policies, and challenges. arXiv preprint arXiv:2001.09786 (2020).
[12] Chaterji, S., Naghizadeh, P., Alam, M. A., Bagchi, S., Chiang, M., Corman, D., Henz, B., Jana, S., Li, N., Mou, S., et al. Resilient cyberphysical systems and their application drivers: A technology roadmap. arXiv preprint arXiv:2001.00090 (2019).
[13] Felter, W., Ferreira, A., Rajamony, R., and Rubio, J. An updated performance comparison of virtual machines and Linux containers. In 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (March 2015), pp. 171–172.
[14] Girshick, R. Fast R-CNN. In Proc. of the IEEE Conf. on Computer Vision (2015), pp. 1440–1448.
[15] Girshick, R., Donahue, J., Darrell, T., and Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proc. of the IEEE Conf. on Computer Vision & Pattern Recognition (2014), pp. 580–587.
[16] Google. Cloud IoT Core. https://cloud.google.com/iot-core/.
[17] Google. Cloud IoT Core pricing. https://cloud.google.com/iot/pricing.
[18] Google. Google Vision. https://cloud.google.com/vision/.
[19] Hu, L., and Ni, Q. IoT-driven automated object detection algorithm for urban surveillance systems in smart cities. IEEE Internet of Things Journal 5, 2 (2018), 747–754.
[20] Ismail, B., Mostajeran, E., Karim, M., Tat, W., Setapa, S., Luke, J.-Y., and Ong, H. Evaluation of Docker as edge computing platform.
[21] Jiang, X., Zhang, H., Yi, E. A. B., Raghunathan, N., Mousoulis, C., Chaterji, S., Peroulis, D., Shakouri, A., and Bagchi, S. Hybrid low-power wide-area mesh network for IoT applications. IEEE Internet of Things Journal (2020).
[22] Jonas, E., Schleier-Smith, J., Sreekanti, V., Tsai, C.-C., Khandelwal, A., Pu, Q., Shankar, V., Carreira, J., Krauth, K., Yadwadkar, N., et al. Cloud programming simplified: A Berkeley view on serverless computing. arXiv preprint arXiv:1902.03383 (2019).
[23] Kim, S. G., Theera-Ampornpunt, N., Fang, C.-H., Harwani, M., Grama, A., and Chaterji, S. Opening up the blackbox: an interpretable deep neural network-based classifier for cell-type specific enhancer predictions. BMC Systems Biology 10, 2 (2016), 54.
[24] Koo, J., Zhang, J., and Chaterji, S. Tiresias: Context-sensitive approach to decipher the presence and strength of microRNA regulatory interactions. Theranostics 8, 1 (2018), 277.
[25] Mahgoub, A., Medoff, A., Kumar, R., Mitra, S., Klimovic, A., Chaterji, S., and Bagchi, S. OPTIMUSCLOUD: Heterogeneous configuration optimization for distributed databases in the cloud. In 2020 USENIX Annual Technical Conference (USENIX ATC '20) (2020), pp. 1–16.
[26] Mahgoub, A., Wood, P., Ganesh, S., Mitra, S., Gerlach, W., Harrison, T., Meyer, F., Grama, A., Bagchi, S., and Chaterji, S. Rafiki: A middleware for parameter tuning of NoSQL datastores for dynamic metagenomics workloads. In Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference (2017), ACM, pp. 28–40.
[27] Mahgoub, A., Wood, P., Medoff, A., Mitra, S., Meyer, F., Chaterji, S., and Bagchi, S. SOPHIA: Online reconfiguration of clustered NoSQL databases for time-varying workloads. In 2019 USENIX Annual Technical Conference (USENIX ATC '19) (2019), pp. 223–240.
[28] Ren, S., He, K., Girshick, R., and Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems (2015), pp. 91–99.
[29] Roady, R., Hayes, T. L., Vaidya, H., and Kanan, C. Stream-51: Streaming classification and novelty detection from videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2020), pp. 228–229.
[30] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115, 3 (2015), 211–252.
[31] Suryavansh, S., Bothra, C., Chiang, M., Peng, C., and Bagchi, S. Tango of edge and cloud execution for reliability. In Proceedings of the 4th Workshop on Middleware for Edge Clouds & Cloudlets (2019), pp. 10–15.
[32] Thomas, T. E., Koo, J., Chaterji, S., and Bagchi, S. Minerva: A reinforcement learning-based technique for optimal scheduling and bottleneck detection in distributed factory operations. In 2018 10th International Conference on Communication Systems & Networks (COMSNETS) (2018), IEEE, pp. 129–136.
[33] Xu, M., Zhang, X., Liu, Y., Huang, G., Liu, X., and Lin, F. X. Approximate query service on autonomous IoT cameras. In Proceedings of the 18th International Conference on Mobile Systems, Applications, and Services (2020), pp. 191–205.
[34] Xu, R., Koo, J., Kumar, R., Bai, P., Mitra, S., Meghanath, G., and Bagchi, S. ApproxNet: Content and contention aware video analytics system for the edge. arXiv preprint arXiv:1909.02068 (2019).
[35] Xu, R., Koo, J., Kumar, R., Bai, P., Mitra, S., Misailovic, S., and Bagchi, S. VideoChef: Efficient approximation for streaming video processing pipelines. In USENIX Annual Technical Conference (USENIX ATC) (2018), pp. 43–56.
[36] Yu, T., Wang, X., and Shami, A. Recursive principal component analysis-based data outlier detection and sensor data aggregation in IoT systems. IEEE Internet of Things Journal 4, 6 (2017), 2207–2216.
