Master's Thesis: Houcem Eddine TESTOURI
SERCOM Laboratory
2020-2021
Acknowledgement
Also, I would like to thank my fiancée Meriem Ata for her wise counsel
and sympathetic ear. You were always there for me.
Finally, I could not have completed this work without the support of my family and friends, who provided me with stimulating discussions as well as happy distractions to rest my mind outside of my research.
Contents
List of Figures

Introduction

1 Background
    Introduction
    I    Internet of Things
         I.1   Definition
         I.2   Internet of Multimedia Things
    II   Big data
         II.1  Definition
         II.2  Big Data characteristics
         II.3  Definition and characteristics of multimedia big data
         II.4  Big Data applications in multimedia
    III  Multimedia big data and IoT
    Conclusion

3 Related work
    Introduction
    I    AMSEP
    II   RAM3S
    III  Rate adaptive multicast video streaming over software defined network
    IV   Real-time Data Infrastructure at Uber
    Conclusion

4 Solution approach
    Introduction
    I    System architecture
    II   MAFC architecture components
         II.1  Receiver
         II.2  Flow controller
         II.3  Policy adapter
         II.4  Control policies database
         II.5  Message broker
         II.6  Stream processing engine
         II.7  Third parties services
    III  Application to surveillance systems
         III.1 Implementation of multimedia stream controller
         III.2 Message broker
         III.3 Implementation of multimedia data processing engine
    IV   Experimental evaluation
         IV.1  Processing time
         IV.2  End-to-end latency
         IV.3  Flow controller impact
         IV.4  Data Loss
    Conclusion

Bibliography
List of Figures
Introduction
Big data and the Internet of Things (IoT) are evolving rapidly, affecting
many aspects of technology and business and increasing the benefits to com-
panies and individuals. The explosion of data generated by the IoT has had
a significant impact on the big data landscape [7]. Multimedia data, such as text, audio, images and videos, is rapidly evolving and becoming one of the main channels for controlling, processing, exchanging and storing information in the modern era [8]. The fact that 2.5 quintillion bytes (2.5 billion GB) of data are generated every day poses a huge challenge in itself. Approximately 90 percent of the world's data has been created in the last two years [9]. Video streaming and downloads are expected to account for 82 percent of global internet traffic by 2022, according to estimates. To put this in perspective, in 2018, 56 exabytes (one exabyte being one billion gigabytes) of internet video were consumed each month [10].
We can therefore predict the amount of data that will be generated in the
coming years. The enormous development of media data accessible to users
poses a whole new set of problems in terms of data control and processing.
A great deal of crucial data is transferred in the IoT context. Due to the growing relevance of public security in many locations, the demand for intelligent visual surveillance has increased. It is necessary to build computer-vision-based control and processing tools for the real-time detection of fraud, suspicious movements, and criminal suspects' vehicles, among other things. The detection and subsequent tracking of moving objects is an important task for computer-vision-based surveillance systems. Recognising a moving object means identifying an object in a video whose position varies relative to the field of view of the scene; the detection process can be characterised as tracing the trajectory of an object through the video.
While traditional real-world multimedia systems provide real-time multimedia processing, scalability remains a serious problem due to the volume and velocity of this data: such systems cannot keep pace with the addition of multimedia sources, such as cameras, with exponentially higher bandwidth and processing rates. Higher bandwidth demands more memory, computing power and energy consumption [1]. The problem is that the incoming data rate can exceed the processing rate, which can lead to data loss or delays in real-time processing. In addition, intensive data processing can impact power consumption and even damage the system hardware. For such mission-critical applications, information loss or long notification delays after a security breach are not acceptable. However, in a big data context, with a large number of data sources and heterogeneous networks, deploying solutions that deal with these constraints is difficult.
In this study, we are interested in the real-time control and processing of massive multimedia data in order to extract interesting events with low latency and without data loss. We propose MAFC, a Multimedia Adaptable
Flow Controller for big data systems. Compared to conventional multime-
dia applications, we show that MAFC enables scalability and improves the
processing performance of big data multimedia systems.
The rest of the report is organized as follows:
Chapter 1
Background
Introduction
In this chapter, we will present relevant background on the Internet of Things and the Internet of Multimedia Things. Furthermore, we will introduce basic big data concepts: the definition and characteristics of big data, multimedia big data, and its applications.
I Internet of Things
I.1 Definition
The Internet of Things (IoT) designates a group of interconnected objects, each possessing a unique identifier and the ability to exchange data. This interconnection is guaranteed by a network that does not require human supervision. First, a group of embedded sensors collects the data; the data is then stored, processed and shared through the network in order to achieve a certain functionality depending on the studied case.
network architecture. Quality of experience (QoE) refers to how users perceive QoS. QoE can be assessed objectively or subjectively. The subjective QoE of customers is difficult to quantify and varies greatly with needs; service providers evaluate subjective QoE through the network's Mean Opinion Score (MOS). The amount of multimedia data is multiplying, which creates additional problems in terms of data transmission, processing, storage and sharing. New processing algorithms are needed for edge, fog, and cloud devices, and new compression and decompression algorithms are being introduced for multimedia data storage. The standard routing protocol for the IoT is the Routing Protocol for Low-Power and Lossy Networks (RPL); it requires further work to cover IoMT deployment scenarios, taking energy, fault tolerance and latency into account.
Multimedia communications are supported by IoT features, although
multimedia applications are bandwidth intensive and delay sensitive. The
increasing expansion of multimedia traffic in the Internet of Things has neces-
sitated further development of new techniques to meet its requirements. To
process data, IoMT devices need more bandwidth, more memory and faster
processing resources. Multipoint-to-point and multipoint-to-multipoint scenarios are common communication scenarios. Emergency response systems, traffic monitoring, crime inspection, smart cities, smart homes, smart hospitals, smart agriculture, surveillance systems, the Internet of Bodies (IoB), and the industrial IoT (IIoT) are examples of real-world multimedia applications.
Figure 1.1: Key Data Characteristics of IoT and IoMT [1]
Multimedia communication in the IoT faces enormous challenges due to
network dynamics, device and data diversity, QoS stringency, latency sensi-
tivity, and reliability requirements in a resource-constrained IoT. Figure 1.2
shows a variety of IoMT uses.
II Big data
II.1 Definition
The term "big data" refers to datasets that are too large to be processed by traditional database management tools and too unwieldy to use. It also implies datasets with great variety and velocity, which requires the development of solutions to extract value and insights from large, rapidly changing datasets [12]. According to the Oxford English Dictionary, big data is described as "very massive data sets that can be processed by computer to reveal patterns, trends, and relationships, especially in human behavior and relationships." However, according to Arunachalam et al.
(2018) [13], this definition does not capture all of big data, as big data must
be distinguished from data that is difficult to process using typical data ana-
lytics. Due to the exponential increase in complexity, big data requires more
sophisticated approaches to process it.
Velocity: refers to the time it takes to analyze a large amount of data. Fast processing maximizes efficiency because some activities are extremely vital and require instantaneous responses. Big data streams should be examined and leveraged as they come into businesses for time-sensitive activities, such as fraud detection, to maximize the value of the information (e.g., processing 5 million business events created each day to identify potential fraud, or analyzing 500 million daily call detail records in real-time to predict customer choices more quickly).
Variety: refers to the types of data found in big data. This information can be either structured or unstructured. Any kind of data, structured or unstructured, such as text, sensor data, audio, video, click streams and log files, is classified as big data (e.g., tracking hundreds of live video streams from surveillance cameras to target points of interest, or using the 80% growth in data in photos, videos and documents to improve customer satisfaction). Analyzing the merged data types generates new challenges, scenarios, etc.
Value: the added value that the acquired data can provide to the intended process, business, or predictive analysis/hypothesis is an important element of the data. The value of the data is determined by the events or processes it represents, such as stochastic, probabilistic, regular or random occurrences or processes. Depending on this, obligations may be imposed to collect all the data, store it for a longer period (for a potentially interesting occurrence), and so on. In this regard, the volume and diversity of the data are closely related to its value.
size. Compared to ordinary big data, it contains a wide variety of data types.
Compared to machines, humans can better interpret multimedia datasets. In
addition, multimedia big data is more complex to manage than traditional
big data, which includes a variety of audio and video files such as interactive
videos, three-dimensional stereoscopic movies, social videos, etc. Multimedia
big data is difficult to describe and categorize because it is collected from
a variety of (heterogeneous) sources, including ubiquitous mobile devices,
sensor-embedded gadgets, the Internet of Things (IoT), the Internet, digital
games, virtual worlds and social media. Analyzing the content and context of
multimedia big data, which is inconsistent across time and space, is challenging. Due to the increasing growth of sensitive video communication data, securing multimedia big data is complex. To keep up with the speed of network transmission, multimedia big data must be processed quickly and continuously; to transmit massive amounts of data in real-time, it must also be stored in a way that supports real-time computing.
Based on the above characteristics, it is clear that scientific multimedia
big data poses some fundamental challenges, such as cognition and com-
plexity understanding, complex and heterogeneous data analysis, distributed
data security management, quality of experience, quality of service, detailed
requirements, and performance constraints. The problems listed above are
related to the processing, storage, transmission, and analysis of multimedia
big data, which leads to new avenues of study in this area. The term "big data" here again refers to datasets that, due to their size and complexity, cannot be processed by typical data processing and analysis application software. These vast data sets, both structured and unstructured, make various tasks such as querying and sharing extremely difficult.
Smartphones: have recently surpassed the use of other electronic devices
such as desktops and laptops. Smartphones are equipped with modern
technology and features such as Bluetooth, a camera, network connec-
tion, GPS, and a large capacity central processing unit (CPU), among
others. Users can edit, process, and access heterogeneous multimedia
data via smartphones. In the big data context, challenges arise related to mobile smartphone sensors and data analysis, such as data sharing, influence, security, and privacy issues. Other smartphone-related topics, such as big data, security, and multimedia cloud computing, are also explored.
for the implementation of IoT applications in the development of multime-
dia big data. The rapid growth of the Internet of Things, along with a huge
volume of multimedia data, has created new opportunities for the growth of
multimedia big data. These two well-known technological advances are mu-
tually dependent and should be developed in combination, providing more
opportunities for IoT research.
Conclusion
This chapter gave us a better idea of the concepts treated in this work, such as the Internet of Multimedia Things and multimedia in big data systems.
Chapter 2
Introduction
Throughout this chapter, we will focus on the processing and analysis of data streams: their definition, key features, and some stream processing technologies.
Programming models
A stream processing engine's programming model consists of a processing job that accepts stream items from a source, processes them, and emits items to a sink [18]. A programming model, in this case, is an abstraction of the stream processing engine that enables stream transformations and processing. Given the unbounded nature of streaming data, having a global perspective of the incoming streaming data is not possible [19]. As a result, data is handled in pieces on top of an endless data stream by constructing discrete, bounded time periods. In most stream processing engines, (i) a window is specified by either a time length or a count of records. The count-based window is sized by the number of events included in the window, whereas the time-based window defines a moving view that decomposes a stream into subsets and computes a result over each subset [20] [21]. This makes it possible to run transformations on the records in the window [18]. In stream applications, time-based windows are common [21]. (ii) Sliding windows divide the incoming data stream into fixed-size segments, each with a window length and a sliding interval. A sliding window is equivalent to a fixed window when the sliding interval equals the window length. If the sliding interval is less than the window length, one or more pieces of data will be included in more than one sliding window; because the windows overlap, aggregation transformations in a sliding window provide a smoother outcome than in a fixed window. (iii) Unlike fixed and sliding windows, session windows do not have a predetermined window length; instead, a session window is usually delimited by a period of inactivity that exceeds a certain threshold. A session is defined as a burst of user activity followed by inactivity or a timeout [18].

Many stream transformations are stateless because they are performed independently on each record. The most frequent stateless transformations are (i) Map, which converts one record to another, (ii) Filter, which discards records that do not fulfill a given predicate, and (iii) FlatMap, which converts each record to a stream and then flattens the stream elements into a single stream. Stateful transformations are those that require the processor to retain internal state across many records. Common stateful transformations are (i) Aggregation, which combines all the records into a single value, (ii) Group-and-aggregate, which extracts a grouping key from each record and computes a separate aggregated value per key, (iii) Join, which joins same-keyed records from multiple streams, and (iv) Sort, which sorts the records observed in the stream. In stateful transformations, all ingested records within a specified time range are included in the computation. Windowing represents distinct or partially overlapping time frames and meaningfully limits the scope of the aggregation [18].

In summary, stream processing engines can do real-time streaming data processing, in which tuples are processed as they arrive; batch data processing, in which time windows are used to divide tuples into batches; or both. A batch of stream elements for a given stream is a time-interval-based subset of the stream. Incoming data streams are usually processed natively (one tuple at a time) or in micro-batches by distributed stream processing engines. A query or process is represented as a DAG, which consists of a set of node operators as well as data stream source and sink operators. Operators can be either stateless or stateful. The continuous operator model, in which streaming calculations are done by a set of stateful operators, is used by the majority of distributed stream processing engines. Each operator updates its internal state and emits new records in response to data stream records as they arrive.
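The window semantics described above can be sketched in a few lines of plain Python. This is an illustrative toy, not any engine's API; real engines assign records to windows incrementally rather than rescanning the whole stream as done here.

```python
def sliding_windows(stream, window_length, sliding_interval):
    """Split a list of (timestamp, value) records into time-based
    windows of `window_length` time units, advancing the window start
    by `sliding_interval` each step. When the sliding interval equals
    the window length, this degenerates into fixed (tumbling) windows."""
    if not stream:
        return []
    start, end = stream[0][0], stream[-1][0]
    windows = []
    t = start
    while t <= end:
        # collect every record whose timestamp falls inside [t, t + length)
        windows.append([(ts, v) for ts, v in stream if t <= ts < t + window_length])
        t += sliding_interval
    return windows

records = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
overlapping = sliding_windows(records, window_length=2, sliding_interval=1)
tumbling = sliding_windows(records, window_length=2, sliding_interval=2)
```

Because the overlapping windows share records, an aggregation (say, a sum per window) over `overlapping` changes more gradually between consecutive windows than the same aggregation over `tumbling`, which is exactly the smoothing effect mentioned above.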
State management
Operators can be stateless or stateful, as mentioned at the start of this section. Stateless operators are purely functional in that they produce output solely dependent on the input they receive. Stateful operators, on the other hand, compute their output based on a series of inputs and may employ additional side information stored in an internal data structure known as state. In a data stream application, a state is a data structure that keeps track of previous operations and influences the processing logic for future calculations. In traditional stream processing engines, state was isolated from the application logic that computed and manipulated the data, and the state information was kept in a centralized database management system to be shared among applications. Managing state effectively poses a number of technical difficulties. The state management facilities in different stream processing engines naturally fall along a complexity continuum, from a simple in-memory-only option to a persistent state that can be queried and replicated.
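The stateless/stateful distinction can be made concrete with a minimal sketch in plain Python (not any particular engine's API): the operator below is stateful because each output depends on a per-key count it retains across records.

```python
class GroupAndCount:
    """A minimal stateful operator: it keeps a per-key running count as
    its internal state and emits the updated count for each incoming
    record. A stateless operator such as Map would instead compute its
    output from the current record alone."""

    def __init__(self):
        self.state = {}  # internal state: key -> count seen so far

    def process(self, key):
        self.state[key] = self.state.get(key, 0) + 1
        return key, self.state[key]

op = GroupAndCount()
outputs = [op.process(k) for k in ["a", "b", "a", "a"]]
# the same input key "a" yields different outputs over time,
# which is only possible because the operator carries state
```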
Fault tolerance and recovery
Data stream processing engines must remain up and running, and must recover from failures as quickly as possible. Bugs in the system logic, network problems, the crash of a compute node in a cluster environment, other hardware issues, dependencies on third-party software, and bottlenecks caused by the volume and speed of incoming data are all causes of failure. A stream processing engine's fault-tolerance qualities are demonstrated by its capacity to remain functional in the face of failures; it should handle faults with little impact and overhead. Data loss (for example, due to a node crash, or because the data was held in memory) and loss of access to resources (for example, due to a cluster manager fault) are common consequences of stream processing engine failures. Failure recovery requires resources beyond those of regular processing, and repeated processing adds overhead. For real-time applications, reprocessing from the start of the pipeline is impractical; ideally, a system should be able to restore a state prior to the fault and repeat only the failed parts of the data processing pipeline. There are two forms of fault tolerance in distributed stream processing engines: passive techniques (such as checkpointing, upstream buffering, and source replay) and active techniques (such as replicas).
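The passive techniques can be illustrated with a toy operator, a hypothetical sketch rather than any engine's actual recovery protocol: it checkpoints its state together with the stream offset, and after a crash restores the last checkpoint and replays only the records the upstream source buffered since then.

```python
class CheckpointingCounter:
    """Toy stateful operator with passive fault tolerance: it
    checkpoints (state, offset) every few records; on recovery it
    restores the last checkpoint and replays only the suffix of the
    source after the checkpointed offset, instead of the whole stream."""

    def __init__(self, checkpoint_every=3):
        self.state = 0            # running sum over the stream
        self.offset = 0           # number of records consumed so far
        self.checkpoint_every = checkpoint_every
        self.checkpoint = (0, 0)  # last durable (state, offset) pair

    def process(self, value):
        self.state += value
        self.offset += 1
        if self.offset % self.checkpoint_every == 0:
            self.checkpoint = (self.state, self.offset)  # "persist"

    def recover(self, source):
        # restore the checkpoint, then replay only the missed suffix
        self.state, self.offset = self.checkpoint
        for value in source[self.offset:]:
            self.process(value)

source = [1, 2, 3, 4, 5]       # upstream buffer enabling source replay
op = CheckpointingCounter()
for v in source:
    op.process(v)
op.state = op.offset = -1      # simulate a crash losing in-memory state
op.recover(source)             # state is rebuilt from checkpoint + replay
```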
features are still very important today, but other features have also been added to the framework, allowing Kafka to perform simple stream processing. Kafka offers stream processing through the Kafka Streams library, and also offers streaming SQL via its own SQL language, Kafka SQL (KSQL). Kafka's coordinator, ZooKeeper, is responsible for coordinating brokers and for failure recovery. Brokers are nodes that manage topics, store events and perform the processing. This is illustrated in Figure 2.1. The figure shows that there can be multiple brokers and topics, where topics can be distributed over several brokers. A topic can be considered a hub, with some systems publishing to it and others subscribing to it. This allows Kafka to simplify event processing: a producer publishes to one topic, Kafka Streams processes it, and the result is published to another topic and forwarded to a consumer. In short, producers write events to the Kafka cluster, while consumers read them. Kafka can guarantee "exactly-once" semantics, which means that an event will be processed exactly once. On the other hand, this requires considerable coordination, which slows down operation noticeably; as this is a relatively new feature of Kafka, it will probably be improved in the future. Otherwise, Kafka is extremely adaptable, allowing different semantics such as "at-most-once", which guarantees that an event is processed no more than once, and "at-least-once", which guarantees that an event is processed at least once.
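The topic-based decoupling described above can be mimicked with a small in-memory sketch. This is not the Kafka client API; the `MiniBroker` name and its methods are invented for illustration. Because each consumer group tracks its own offset into a topic's append-only log, producers and consumers remain fully independent of one another.

```python
from collections import defaultdict

class MiniBroker:
    """In-memory sketch of Kafka's topic model (not the real client
    API): producers append events to a topic's log, and each consumer
    group reads from its own committed offset."""

    def __init__(self):
        self.topics = defaultdict(list)   # topic -> append-only event log
        self.offsets = defaultdict(int)   # (group, topic) -> next offset

    def publish(self, topic, event):
        self.topics[topic].append(event)

    def poll(self, group, topic):
        log = self.topics[topic]
        start = self.offsets[(group, topic)]
        events = log[start:]
        # committing the offset after delivery gives at-least-once
        # behaviour: a crash between delivery and commit causes a redelivery
        self.offsets[(group, topic)] = len(log)
        return events

broker = MiniBroker()
broker.publish("camera-frames", {"camera": 1, "frame": "f0"})
broker.publish("camera-frames", {"camera": 1, "frame": "f1"})
first = broker.poll("detector", "camera-frames")
second = broker.poll("detector", "camera-frames")  # nothing new yet
```

Note that a second group (say, an archiver) polling the same topic would receive both events again from offset 0, since offsets are tracked per group; this is what lets several independent consumers share one topic.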
II.2 MQTT
In 1999, Andy Stanford-Clark and Arlen Nipper created the MQTT protocol as a bandwidth-efficient means of communication. The protocol supports several quality-of-service levels and provides a lightweight publish/subscribe messaging mechanism to create links between different systems. The importance of MQTT becomes apparent when connecting remote sites with small hardware. To ensure this connection, it decouples the publisher (the transmitter) from the subscriber (the receiver); these two elements become totally independent and do not need to know each other to exchange data. An MQTT session is divided into four stages:
Termination: When a publisher or subscriber wishes to terminate an MQTT session, it sends a DISCONNECT message to the broker and then closes the connection.
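Subscriptions in MQTT are matched against hierarchical topic names using the wildcards `+` (exactly one level) and `#` (all remaining levels, only valid as the last level of a filter). A small sketch of this matching rule as defined by the MQTT specification:

```python
def mqtt_match(topic_filter, topic):
    """Check whether an MQTT topic name matches a subscription filter:
    '+' matches exactly one topic level, '#' matches any number of
    remaining levels."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True                     # swallow everything remaining
        if i >= len(topic_levels):
            return False                    # topic ran out of levels
        if level != "+" and level != topic_levels[i]:
            return False                    # literal level mismatch
    # every filter level matched; the topic must not have extra levels
    return len(filter_levels) == len(topic_levels)
```

A subscriber to `sensors/+/temperature`, for instance, receives messages from every sensor's `temperature` sub-topic without knowing the publishers' identities, which is the decoupling described above.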
Spouts, like producers in Kafka, are the inputs to the stream, while bolts are the processing engines for the stream. The capacity of a bolt is comparable to Hadoop's MapReduce, with a single bolt being able to Map and Reduce at the same time. In addition, to ensure maximum speed, the workload of spouts and bolts is spread across all workers in the architecture. However, without the Trident high-level API, which relies on micro-batching, Storm cannot guarantee exactly-once semantics. The Storm core, on the other hand, has built-in at-least-once semantics, which is more than sufficient in most cases.
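How a single bolt can combine Map-like and Reduce-like work can be sketched in plain Python. This mimics the idea only; it is not the Storm bolt API, and the class name is invented for illustration.

```python
class WordCountBolt:
    """Sketch of a Storm-style bolt: for each incoming tuple it both
    maps (splits a sentence into words) and reduces (accumulates
    per-word counts), illustrating how one bolt can play the role of
    Hadoop's Map and Reduce at the same time."""

    def __init__(self):
        self.counts = {}  # per-word running counts (the bolt's state)

    def execute(self, sentence):
        emitted = []
        for word in sentence.split():                         # map step
            self.counts[word] = self.counts.get(word, 0) + 1  # reduce step
            emitted.append((word, self.counts[word]))         # emit downstream
        return emitted

# A spout would normally feed sentences in; here we drive the bolt directly.
bolt = WordCountBolt()
emitted = bolt.execute("to be or not to be")
```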
Figure 2.4: Flink architecture [4]
streaming, with each batch including many events. Due to its fundamental batching structure, it cannot compete with pure streaming implementations, which have significantly lower latency; this may rule Spark out for systems with strict latency requirements. On the other hand, data in Spark is processed consistently, and Spark can guarantee exactly-once semantics, among other delivery guarantees. These and other features make Spark a viable option for streaming systems. Spark Structured Streaming, which is part of the Spark SQL library, is another streaming library. This library did not support streaming prior to Spark 2.x, but has since become a well-known streaming library. Structured Streaming is an attempt to unify batching and streaming, with only minor variations in method calls and syntax.
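Micro-batching itself is easy to picture: buffer the incoming stream into small chunks and hand each chunk to an ordinary batch computation. A count-based toy version follows (Spark actually batches by time interval, so this is a simplification):

```python
def micro_batches(stream, batch_size):
    """Group an incoming stream into micro-batches, the way Spark
    Streaming buffers events for a short interval and processes each
    buffer as one small batch job (batching here by count, not time,
    for simplicity)."""
    return [stream[i:i + batch_size] for i in range(0, len(stream), batch_size)]

events = list(range(10))
batches = micro_batches(events, batch_size=4)
# each batch is then handed to the batch execution engine, e.g. a sum:
totals = [sum(b) for b in batches]
```

The latency trade-off is visible even in this sketch: an event arriving just after a batch closes waits a full batch before being processed, which is why pure record-at-a-time engines achieve lower latency.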
II.6 Apache Samza
Apache Samza [6] is a scalable data processing engine that allows you to
process and analyze your data in real-time. Here is a summary of Samza’s
features that simplify application creation:
Unified API: Use a simple API to describe your application logic indepen-
dent of your data source. The same API can handle both batch and
streaming data.
Pluggability at all levels: Process and transform data from any source.
Samza offers built-in integrations with Apache Kafka, AWS Kinesis,
Azure EventHubs, ElasticSearch and Apache Hadoop. In addition, it
is quite easy to integrate your own sources.
Fault tolerance: Seamlessly migrate jobs and their associated state in the
event of failure. Samza supports host affinity and incremental check-
pointing to enable rapid recovery from failures.
Massive scale: The solution has been tested on applications that use multi-
ple terabytes of state and run on thousands of cores. It powers several
large enterprises, including LinkedIn, Uber, TripAdvisor, Slack, and
more.
II.7 Others
Apache Apex is another stream processor that we did not consider. In many ways, it is identical to Spark in terms of functionality. However, it is a relatively new system that is only used by a few major companies. In addition, there are few comparable benchmarks between Apex and other stream processors.
Figure 2.6: Apache Samza stack [6]
Conclusion
This chapter made us grasp the importance of stream analysis for real-time data processing. In the next chapter, we present some previous works that address issues similar to those targeted by our proposed solution.
Chapter 3
Related work
Introduction
Many research works have studied multimedia big data systems using several methods, such as incoming stream control and preprocessing method optimization, while others have built their own architectures for stream processing systems. In this chapter, some of these works are presented along with their results.
I AMSEP
AMSEP [25] is an Automated Multi-level Security Management system for Multimedia Event Processing that provides end-to-end secure publish/subscribe communication. Each broker topic is associated with a notable event. Surveillance cameras generate multimedia messages at the gateways; to avoid unnecessary broadcasting of video data, scalar sensors activate them only when needed. Based on a set of Deep Learning (DL) processors, AMSEP enables intelligent and autonomous labeling of messages. Multimedia event messages are sent to DL classifiers, which determine the probability of an event occurring. If any of the DL detectors detects an event (e.g., human presence, fire, flood, etc.), a new message containing the event class and event metadata (e.g., when, where) is sent to the security manager. At this point, the security manager parses a security policy to assign the appropriate label to the multimedia event message. In the presented scenario, a DL processor evaluates the likelihood that a human being is present in the office. The system generates a message that includes the type of event (intrusion), the office number and the time the event occurred. The security manager assigns a security label to the event based on these three pieces of information. For example, a nighttime intrusion into the president's office is classified as a high-risk security event that must be reported with the highest level of confidentiality. Consumers, on the other hand, must have a sufficient security level on a topic to read its communications. Only security managers with a high security level are informed of the intrusion, which in some scenarios may be criminal or legitimate (for secrecy reasons). Consumer security levels and approved topics are assigned and controlled by the security manager. Therefore, consumers do not discover any information that could compromise the security of the system.
The architecture of the proposed publish/subscribe secure messaging
II RAM3S
RAM3S [26] is a software framework for the real-time analysis of massive multimedia streams. Using RAM3S in conjunction with a big data streaming platform, information from many data sources (such as sensors and cameras) can be efficiently integrated, with the goal of discovering new and hidden information in the output of the data sources as it happens, i.e., with very low latency. RAM3S thus serves as a bridge between two challenging technologies, one applied (multimedia stream analysis) and the other architectural (big data platforms), allowing the former to be implemented seamlessly on top of the latter.
The Analyzer and Receiver interfaces are the core of RAM3S; used together, they can function as a general-purpose multimedia flow processor. The Analyzer component is responsible for analyzing the individual
Figure 3.3: RAM3S implementation for face detection system
media server. The video data is broadcast by the media server, and users
join a session based on their ability to process it. Compared to unicast trans-
mission, the bandwidth consumption in multi-bitrate multicast transmission
is lower.
While this research work provides a frame-rate adaptation mechanism, such flow control may affect important video streams that must be sent at the highest rate.
IV Real-time Data Infrastructure at Uber
In this paper, the authors present the overall architecture of Uber's real-time data infrastructure and identify three scaling challenges that they need to continuously address for each component of the architecture, relying on open-source technologies for the key areas of the infrastructure. On top of this open-source software, they add significant improvements and customizations to make the open-source solutions fit Uber's environment and bridge the gaps to meet Uber's unique scale and requirements. For the implementation, Uber's engineers use Apache Kafka for streaming storage, associated with Apache Flink for stream processing. They also employ Apache Pinot for OLAP, HDFS for the archival store, and Presto for interactive SQL queries. The proposed solution is used in several real-time use cases across the four broad categories (as in Figure 3.4) in production at Uber, such as analytical applications (surge pricing), dashboards (UberEats Restaurant Manager), machine learning (real-time prediction monitoring) and ad-hoc exploration (UberEats Ops automation).
Despite using robust data processing systems such as Apache Flink combined with Kafka to power streaming applications that calculate updated pricing, improve driver dispatch and combat fraud on its platform, Uber did not apply this infrastructure to multimedia data.
Conclusion
While these related works propose efficient solutions in the fields of flow
control and multimedia data processing, they show some limitations and
point to the need for another solution approach to improve the efficiency
of multimedia processing in big data systems. For this reason, we propose
the MAFC architecture in the next chapter, drawing inspiration from these
previous works.
Chapter 4
Solution approach
Introduction
Based on the previous chapter, very few research works provide a flow con-
trol solution for multimedia big data systems. In this context, we propose
MAFC, an adaptable multimedia flow controller for Big Data systems. In
this chapter, we present the architecture of the MAFC system, an overview
of its components, and a description of its workflow. We then implement
this architecture by applying it to a surveillance system use case, which gives
a global view of some of the essential concepts adopted in the chosen archi-
tecture. Finally, we present a series of performance and adaptability tests,
and we conclude with a discussion of the results.
I System architecture
We propose MAFC, a multimedia adaptable flow controller for Big Data
systems that ensures adaptability with system scalability, without data loss,
and with performance optimization according to the application context. As
illustrated in Figure 4.1, we assume that the multimedia data comes only
from video sources. For data collection, the proposed architecture can ingest
video data either from unbounded video streams or from stored multimedia
sequences (1). Once the multimedia data has been acquired, it is decomposed
into a sequence of individual multimedia objects that are fed one by one to
the flow controller (2). Based on the control policies calculated by the policy
adapter (8), the flow controller chooses which frames are published to the
message broker on the topic called "Frame Processing" and which are not
(3). Then, once a frame has been published to the message broker (4), the
real-time data processing engine extracts the specific events from the incom-
ing frames in real time and publishes them to the concerned topics (6), where
they are consumed by the media stream controller in order to automatically
update the control policies, and by third-party services to notify the con-
cerned parties (7).
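The flow-controller decision in steps (2), (3), and (8) can be sketched in Python. The `Frame` type, the policy signature, and the `every_nth` sampling policy below are illustrative assumptions, not the actual MAFC implementation (which publishes admitted frames to the "Frame Processing" Kafka topic):

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Frame:
    # Hypothetical multimedia object produced by the receiver.
    camera_id: int
    timestamp: float
    data: bytes = b""

# A control policy maps a frame to True (publish) or False (drop).
Policy = Callable[[Frame], bool]

@dataclass
class FlowController:
    """Decides which frames go to the 'Frame Processing' topic (step 3)."""
    policies: List[Policy] = field(default_factory=list)

    def update_policies(self, policies: List[Policy]) -> None:
        # Called by the policy adapter (step 8) with recomputed policies.
        self.policies = policies

    def admit(self, frame: Frame) -> bool:
        # A frame is published only if every active policy accepts it.
        return all(policy(frame) for policy in self.policies)

def every_nth(n: int) -> Policy:
    """Example policy: keep every n-th frame to cap the processing rate."""
    counter = {"i": 0}
    def policy(frame: Frame) -> bool:
        counter["i"] += 1
        return counter["i"] % n == 0
    return policy

controller = FlowController(policies=[every_nth(3)])
frames = [Frame(camera_id=1, timestamp=float(t)) for t in range(9)]
admitted = [f for f in frames if controller.admit(f)]
print(len(admitted))  # 3 of the 9 frames pass the sampling policy
```

In the real system the admitted frames would be serialized and sent to the message broker instead of being collected in a list.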
Figure 4.1: MAFC architecture
II MAFC architecture components
II.1 Receiver
The main role of the receiver is to break down an incoming multimedia
stream into discrete elements, called multimedia objects, that can be exam-
ined one by one. What counts as an "individual multimedia object" depends
on the application concerned: in a surveillance system scenario, the object is
a single image extracted from a camera video sequence. In this case, the
receiver acts as a video-to-image converter.
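As a minimal sketch, the receiver's decomposition step can be modeled as a generator; the frame-iterable source here stands in for a real camera feed (a production receiver would wrap something like OpenCV's `VideoCapture` to pull images from the video sequence):

```python
from typing import Iterable, Iterator, Tuple

def receive(stream: Iterable[bytes], camera_id: int) -> Iterator[Tuple[int, int, bytes]]:
    """Break an incoming multimedia stream into individually numbered
    multimedia objects (here: single images from one camera sequence)."""
    for index, image in enumerate(stream):
        # Each yielded tuple is one multimedia object, examined one by one.
        yield (camera_id, index, image)

# Simulated video sequence: three raw image payloads from camera 7.
objects = list(receive([b"img0", b"img1", b"img2"], camera_id=7))
print(objects[0])  # (7, 0, b'img0')
```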
Time-based controller: the incoming stream is controlled based on time.
For example, in a surveillance system, we increase the multimedia
stream rate in the daytime, when there is heavy movement of people,
and decrease it at night, which makes it possible to add indoor cameras
to secure prohibited areas.
In addition to these principal policies, the user can add other custom policies
according to their needs, and prioritize those policies to choose the order in
which they control the multimedia stream.
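A hypothetical time-based policy can be sketched as follows; the day/night boundaries and the frame rates are illustrative assumptions, not values prescribed by MAFC:

```python
from datetime import datetime, time

def time_based_rate(now: time = None, day_fps: int = 25, night_fps: int = 5,
                    day_start: time = time(6, 0), day_end: time = time(20, 0)) -> int:
    """Return the target frame rate for the current time of day.
    Daytime (heavy movement of people) gets a higher rate; at night the
    rate is lowered so extra indoor cameras can be accommodated."""
    now = now or datetime.now().time()
    return day_fps if day_start <= now < day_end else night_fps

print(time_based_rate(time(12, 0)))  # 25 (daytime rate)
print(time_based_rate(time(23, 0)))  # 5  (night rate)
```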
on a substructure or component called a message queue, which stores and or-
ders the messages until the consuming applications can process them. In a
message queue, messages are stored in the exact order in which they were
transmitted and remain in the queue until receipt is confirmed.
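This FIFO behavior can be illustrated with Python's standard `queue` module; note this is a toy in-process queue used only to show the ordering and acknowledgement semantics, not the actual broker used by MAFC:

```python
import queue

# FIFO message queue: messages are stored in the exact order transmitted
# and stay queued until the consumer acknowledges them with task_done().
mq = queue.Queue()
for msg in ("frame-1", "frame-2", "frame-3"):
    mq.put(msg)

consumed = []
while not mq.empty():
    msg = mq.get()
    consumed.append(msg)
    mq.task_done()  # confirm receipt of the message

print(consumed)  # ['frame-1', 'frame-2', 'frame-3'] -- order preserved
```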
messaging system, it can handle a large volume of data, which enables you to
send messages end to end. Apache Kafka was developed at LinkedIn and is
available as an open-source project with the Apache Software Foundation.
The following points describe why we chose Kafka.
Figure 4.2: MAFC implementation
III.3 Implementation of the multimedia data processing engine
In order to process all the real-time data coming through Kafka, we have
built a stream processing platform on top of Apache Flink [42]. Apache
Flink is an open-source, distributed stream processing framework with a
high-throughput, low-latency engine widely adopted in industry. We
adopted Apache Flink for a number of reasons. First, it is robust enough to
continuously support a large number of frames, with built-in state manage-
ment and checkpointing features for failure recovery. Second, it is easy
to scale and can handle back pressure efficiently when faced with a massive
input Kafka lag. Third, it has a large and active open-source community as
well as a rich ecosystem of components and tooling. Based on comparisons
with Apache Storm and Apache Spark in previous research, Flink was
deemed the better choice of technology for this layer: Storm performed
poorly in handling back pressure when faced with a massive input backlog
of millions of messages, taking several hours to recover where Flink took
only 20 minutes, and Spark jobs consumed 5-10 times more memory than
the corresponding Flink jobs for the same workload.
On top of Flink, we use TensorFlow with SSD-MobileNet as the object de-
tection model, for its high accuracy at low latency.
In the system implementation, we associate Kafka with Flink for its persis-
tent channel, which prevents immediate back pressure on the processing
engine and avoids data loss.
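The per-frame detection step that would run inside the Flink job can be sketched as a map function. The `dummy_model` below is a stand-in for real SSD-MobileNet inference; an actual deployment would load a TensorFlow model and consume frames from the Kafka topic:

```python
from typing import Callable, Dict, List

# Stand-in signature for SSD-MobileNet inference: raw frame bytes in,
# list of {"label", "score"} detections out.
Detector = Callable[[bytes], List[Dict]]

def make_detection_map(model: Detector, threshold: float = 0.5):
    """Build the map function applied to each frame of the stream:
    run the detector and keep detections above the confidence threshold."""
    def detect(frame: bytes) -> List[Dict]:
        return [d for d in model(frame) if d["score"] >= threshold]
    return detect

# Dummy detector standing in for real SSD-MobileNet inference.
def dummy_model(frame: bytes) -> List[Dict]:
    return [{"label": "person", "score": 0.91},
            {"label": "car", "score": 0.32}]

detect = make_detection_map(dummy_model, threshold=0.5)
print(detect(b"frame-bytes"))  # only the 'person' detection survives
```

In the real pipeline, `detect` would be registered as the transformation applied to each incoming frame, and the surviving detections would be published to the concerned output topics.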
IV Experimental evaluation
In order to choose the optimal parameters and to show the impact of MAFC
on system performance for the situation at hand, we performed a series
of experiments on our use case on different machines, varying CPU and
RAM to test their impact. These tests were performed on four virtual
machines. The first was a VirtualBox VM on a PC, allocated 2 virtual
processors, 8 GB of RAM, and 20 GB of storage. For the second, we doubled
the number of CPUs to 4 virtual processors. The third was a virtual machine
on Microsoft Azure cloud with 4 virtual processors, 16 GB of RAM, and
20 GB of storage. For the fourth, we doubled the RAM to 32 GB. The first
set of experiments evaluates the time spent in each processing step, from
acquiring the video to sending the extracted objects to third parties.
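The per-stage time measurement can be instrumented as sketched below; the stage names and the sleeps standing in for real work are illustrative assumptions:

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(stage: str):
    """Accumulate wall-clock time spent in one processing stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + time.perf_counter() - start

# Simulated pipeline stages for one frame (sleeps stand in for real work).
with timed("extract"):
    time.sleep(0.01)
with timed("detect"):
    time.sleep(0.02)
with timed("send"):
    time.sleep(0.005)

total = sum(timings.values())
shares = {k: round(100 * v / total) for k, v in timings.items()}
print(shares)  # per-stage percentage of the end-to-end processing time
```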
IV.1 Processing time
Figure 4.3 plots the four evaluation measures for the different virtual ma-
chines. On the first one, it took half a second to process an image, of which
almost 0.45 seconds went to detection and classification. The second VM
was faster, taking a quarter of a second to process a frame; most of this
time was again spent on classification and detection, and the rest on ex-
tracting the images from the video and sending them. This demonstrates
that the processing time is halved when we double the RAM or processor
performance.
According to Figure 4.4, the bottleneck of the system is the object detection
and classification phase: about 90% of the latency is spent in object
detection, while the image splitting and image sending tasks are relatively
fast. Flow control therefore takes a relatively negligible share of the
processing time.
Figure 4.4: Percentage of processing time per stage
video itself. On the second virtual machine, the system can process 4 fps in
real time without latency; beyond that, the latency increases as the video
rate increases. The same behavior was observed for the other virtual ma-
chines.
IV.3 Flow controller impact
Our third test aimed to evaluate the adaptation of a surveillance system
with the Multimedia Adaptable Flow Controller in big data systems by
changing the number of cameras and measuring the end-to-end latency.
Figure 4.6 shows that a traditional surveillance system, which processes all
incoming multimedia streams without control, was not able to handle the
increasing number of cameras: the end-to-end latency increased exponen-
tially, to limits intolerable for a surveillance system. With the addition of
the multimedia flow controller, in contrast, our system was able to operate
efficiently without latency by adapting the total incoming flow to the number
of cameras.
IV.4 Data Loss
In the last experiment, we implemented our system without a Kafka broker
in order to show the impact of a persistent channel for managing the stream
buffer before processing. We bombarded the system with a frame throughput
higher than it can handle, simulating a flow of 30 fps while the system can
only process 13 fps. Figure 4.7 shows that without a persistent channel,
Flink's memory cannot hold all the incoming streams; it was quickly over-
whelmed, causing the system to crash.
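This overflow scenario can be illustrated with a bounded in-memory buffer; the 30 fps and 13 fps figures come from the experiment, while the one-second window and the simple queue model are simplifying assumptions:

```python
import queue

# Without a persistent channel, the processing engine's in-memory buffer
# must absorb the whole incoming stream. If the producer (30 fps) outruns
# the consumer (13 fps), a bounded buffer overflows and frames are lost.
buffer = queue.Queue(maxsize=13)  # what the engine can absorb per second

dropped = 0
for frame in range(30):           # one second of 30 fps input
    try:
        buffer.put_nowait(frame)
    except queue.Full:
        dropped += 1              # frame lost: no persistent channel to park it

print(dropped)  # 17 of the 30 frames are dropped
```

A persistent channel such as Kafka avoids this by durably parking the excess frames until the consumer catches up, at the cost of added consumer lag rather than data loss.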
Conclusion
In this chapter, we have presented the proposed solution. First, we intro-
duced the workflow. Then, in order to clarify the contribution, we gave a
detailed description of each component of the architecture. All of the above
facilitated our next step, implementation and evaluation, where we used
open-source technologies to build the MAFC system. Through experimen-
tation, we were able to adapt the system to run efficiently with low latency
despite the high resource consumption of the processing stage.
Conclusions and Further
Directions
ASFC
As a continuation of this research, it would be possible to extend the MAFC
architecture to make it generic and ready to use with any type of data, not
only multimedia data. It could then be called ASFC, for "Adaptable
Streams Flow Controller in big data systems".
Bibliography
[3] “Apache Storm,” last accessed: June 21, 2021. [Online]. Available:
https://storm.apache.org/
[5] “Apache Spark™ - Unified Analytics Engine for Big Data,” last
accessed: June 21, 2021. [Online]. Available: https://spark.apache.org/
[6] “Apache Samza,” last accessed: June 21, 2021. [Online]. Available:
http://samza.apache.org
[9] “Big Data, for better or worse: 90% of world’s data generated over last
two years.”
[11] Y. A. Qadri, A. Nauman, Y. B. Zikria, A. V. Vasilakos, and S. W.
Kim, “The future of healthcare internet of things: A survey of emerging
technologies,” IEEE Communications Surveys Tutorials, vol. 22, no. 2,
pp. 1121–1167, 2020.
[14] M. Cheung, J. She, and Z. Jie, “Connection discovery using big data of
user-shared images in social media,” IEEE Transactions on Multimedia,
vol. 17, no. 9, pp. 1417–1428, 2015.
[15] Z. Tufekci, “Big questions for social media big data: Representativeness,
validity and other methodological pitfalls,” CoRR, vol. abs/1403.7400,
2014.
[22] S. Hamdi, E. Bouazizi, and S. Faiz, “Real-Time Data Stream Partition-
ing over a Sliding Window in Real-Time Spatial Big Data,” in Algo-
rithms and Architectures for Parallel Processing, ser. Lecture Notes in
Computer Science, J. Vaidya and J. Li, Eds. Cham: Springer Interna-
tional Publishing, 2018, pp. 75–88.
[23] “Apache Kafka,” last accessed: June 21, 2021. [Online]. Available:
https://kafka.apache.org/
[24] “How does Flink maintain rapid development even as one of the most
active Apache projects?”