Conference Papers by Thomas Pasquier
Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018
Identifying the root cause and impact of a system intrusion remains a foundational challenge in computer security. Digital provenance provides a detailed history of the flow of information within a computing system, connecting suspicious events to their root causes. Although existing provenance-based auditing techniques provide value in forensic analysis, they assume that such analysis takes place only retrospectively. Such post-hoc analysis is insufficient for real-time security applications; moreover, even for forensic tasks, prior provenance collection systems exhibited poor performance and scalability, jeopardizing the timeliness of query responses. We present CamQuery, which provides inline, real-time provenance analysis, making it suitable for implementing security applications. CamQuery is a Linux Security Module that offers support for both userspace and in-kernel execution of analysis applications. We demonstrate the applicability of CamQuery to a variety of runtime security applications, including data loss prevention, intrusion detection, and regulatory compliance. In evaluation, we demonstrate that CamQuery reduces the latency of real-time query mechanisms, while imposing minimal overheads on system execution. CamQuery thus enables the further deployment of provenance-based technologies to address central challenges in computer security.
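As a toy illustration of the kind of inline analysis described above, the sketch below propagates a "sensitive" taint over a stream of provenance edges and raises an alert when tainted data reaches a network sink, i.e. a minimal data-loss-prevention policy. The edge format, node names and policy are hypothetical; CamQuery's actual interface is an in-kernel graph-processing API, not this.

```python
# Minimal sketch of inline provenance analysis for data loss prevention.
# Node names and the (src, dst) edge format are illustrative assumptions,
# not CamQuery's real schema.

def process_stream(edges, sensitive, sinks):
    """Propagate a 'sensitive' taint along provenance edges as they
    arrive, alerting when tainted data reaches a sink (e.g. a socket)."""
    tainted = set(sensitive)   # entities known to carry sensitive data
    alerts = []
    for src, dst in edges:     # each edge: information flows src -> dst
        if src in tainted:
            tainted.add(dst)
            if dst in sinks:
                alerts.append((src, dst))
    return alerts

# Example: a secret file is read by a shell, which writes to a socket.
edges = [("/etc/shadow", "bash"), ("bash", "socket:1234")]
alerts = process_stream(edges, sensitive={"/etc/shadow"}, sinks={"socket:1234"})
```

Because the analysis runs per-edge as events arrive, it never needs to materialise the whole graph, which is what makes inline, real-time operation plausible.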
Proceedings 2020 Network and Distributed System Security Symposium, 2020
Advanced Persistent Threats (APTs) are difficult to detect due to their "low-and-slow" attack patterns and frequent use of zero-day exploits. We present UNICORN, an anomaly-based APT detector that effectively leverages data provenance analysis. From modeling to detection, UNICORN tailors its design specifically for the unique characteristics of APTs. Through extensive yet time-efficient graph analysis, UNICORN explores provenance graphs that provide rich contextual and historical information to identify stealthy anomalous activities without predefined attack signatures. Using a graph sketching technique, it summarizes long-running system execution with space efficiency to combat slow-acting attacks that take place over a long time span. UNICORN further improves its detection capability using a novel modeling approach to understand long-term behavior as the system evolves. Our evaluation shows that UNICORN outperforms an existing state-of-the-art APT detection system and detects real-life APT scenarios with high accuracy.
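The graph-sketching idea can be sketched in miniature: relabel nodes with one Weisfeiler-Lehman round (so labels encode local structure), then hash the labels into a fixed-size histogram, so graphs of any size compare in constant space. This mirrors the spirit of UNICORN's fixed-size summaries only; its actual pipeline streams WL features into a similarity-preserving sketch, which this toy does not attempt.

```python
# Toy graph sketch: one Weisfeiler-Lehman relabelling round plus a
# fixed-size label histogram. Illustrative only, not UNICORN's algorithm.
import hashlib

def wl_relabel(nodes, edges):
    """One WL round: a node's new label combines its own label with the
    sorted labels of its in-neighbours (edges are (src, dst) flows)."""
    return {
        n: lab + "(" + ",".join(sorted(nodes[s] for s, d in edges if d == n)) + ")"
        for n, lab in nodes.items()
    }

def sketch(nodes, edges, size=8):
    """Hash WL labels into a fixed-size histogram, so a long-running
    system's growing graph still summarises into `size` counters."""
    hist = [0] * size
    for lab in wl_relabel(nodes, edges).values():
        hist[int(hashlib.md5(lab.encode()).hexdigest(), 16) % size] += 1
    return hist
```

Two structurally identical graphs always produce identical sketches, which is the property an anomaly detector then exploits by clustering sketches of normal behaviour.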
Proceedings of the 17th International Middleware Conference, 2016
Internet of Things (IoT) applications, systems and services are subject to law. We argue that for the IoT to develop lawfully, there must be technical mechanisms that allow the enforcement of specified policy, such that systems align with legal realities. The audit of policy enforcement must assist the apportionment of liability, demonstrate compliance with regulation, and indicate whether policy correctly captures legal responsibilities. As both systems and obligations evolve dynamically, this cycle must be continuously maintained. This poses a huge challenge given the global scale of the IoT vision. The IoT entails dynamically creating new services through managed and flexible data exchange. Data management is complex in this dynamic environment, given the need to both control and share information, often across federated domains of administration. We see middleware playing a key role in managing the IoT. Our vision is for a middleware-enforced, unified policy model that applies end-to-end, throughout the IoT. This is because policy cannot be bound to things, applications, or administrative domains, since functionality is the result of composition, with dynamically formed chains of data flows. We have investigated the use of Information Flow Control (IFC) to manage and audit data flows in cloud computing; a domain where trust can be well-founded, regulations are more mature and associated responsibilities clearer. We feel that IFC has great potential in the broader IoT context. However, the sheer scale and the dynamic, federated nature of the IoT pose a number of significant research challenges.
Proceedings of the 20th International Middleware Conference, 2019
System-level provenance is of widespread interest for applications such as security enforcement and information protection. However, testing the correctness or completeness of provenance capture tools is challenging and currently done manually. In some cases there is not even a clear consensus about what behavior is correct. We present an automated tool, ProvMark, that uses an existing provenance system as a black box and reliably identifies the provenance graph structure recorded for a given activity, by a reduction to subgraph isomorphism problems handled by an external solver. ProvMark is a first step in the much-needed area of testing and comparing the expressiveness of provenance systems. We demonstrate ProvMark's usefulness in comparing three capture systems with different architectures and distinct design philosophies.
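The reduction at the heart of the approach can be illustrated with a brute-force subgraph isomorphism check: does the small "template" graph recorded for an activity embed in the larger captured graph? ProvMark itself hands this to an external solver over richer property graphs; the version below uses plain edge sets and exhaustive search, which is only feasible for tiny graphs.

```python
# Brute-force subgraph isomorphism over edge sets -- an illustrative
# stand-in for the solver-backed reduction ProvMark uses.
from itertools import permutations

def subgraph_isomorphic(small, big):
    """True if `small`'s edges appear in `big` under some injective
    mapping of node names. Exhaustive, so only usable on tiny graphs."""
    s_nodes = sorted({n for e in small for n in e})
    b_nodes = sorted({n for e in big for n in e})
    for perm in permutations(b_nodes, len(s_nodes)):
        m = dict(zip(s_nodes, perm))          # candidate node mapping
        if all((m[a], m[b]) in big for a, b in small):
            return True
    return False
```

A template edge process-writes-file, for instance, should be found inside any capture of a real write, whatever the capture system named its nodes.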
Proceedings of the ACM Symposium on Cloud Computing, 2021
Despite the wide usage of container-based cloud computing, container auditing for security analysis relies mostly on built-in host audit systems, which often lack the ability to capture high-fidelity container logs. State-of-the-art reference-monitor-based audit techniques greatly improve the quality of audit logs, but their system-wide architecture is too costly to be adapted for individual containers. Moreover, these techniques typically require extensive kernel modifications, making them difficult to deploy in practical settings. In this paper, we present saBPF (secure audit BPF), an extension of the eBPF framework capable of deploying secure system-level audit mechanisms at container granularity. We demonstrate the practicality of saBPF in Kubernetes by designing an audit framework, an intrusion detection system, and a lightweight access control mechanism. We evaluate saBPF and show that it is comparable in performance and security guarantees to audit systems from the literature that are implemented directly in the kernel.
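Container-granularity auditing ultimately means partitioning a host-wide event stream by the container each event came from. The sketch below does this in userspace over already-collected records; the field names ("cgroup", "syscall") are assumptions for illustration, not saBPF's record layout, and saBPF performs the equivalent attribution in-kernel.

```python
# Toy per-container partitioning of a host audit stream. Record fields
# are hypothetical, not saBPF's actual schema.

def per_container(events):
    """Split a host-wide audit stream into per-container logs, keyed by
    the cgroup the event originated from."""
    logs = {}
    for ev in events:
        logs.setdefault(ev["cgroup"], []).append(ev["syscall"])
    return logs

events = [
    {"cgroup": "web", "syscall": "open"},
    {"cgroup": "db", "syscall": "read"},
    {"cgroup": "web", "syscall": "connect"},
]
logs = per_container(events)
```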
Many users implicitly assume that software can only be exploited after it is installed. However, recent supply-chain attacks demonstrate that application integrity must be ensured during installation itself. We introduce SIGL, a new tool for detecting malicious behavior during software installation. SIGL collects traces of system call activity, building a data provenance graph that it analyzes using a novel autoencoder architecture with a graph long short-term memory network (graph LSTM) for the encoder and a standard multilayer perceptron for the decoder. SIGL flags suspicious installations as well as the specific installation-time processes that are likely to be malicious. Using a test corpus of 625 malicious installers containing real-world malware, we demonstrate that SIGL has a detection accuracy of 96%, outperforming similar systems from industry and academia by up to 87% in precision and recall and 45% in accuracy. We also demonstrate that SIGL can pinpoint the processes most...
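The detection principle, stripped of the neural machinery, is reconstruction-error scoring: learn a model of benign installations, then flag traces the model reconstructs poorly. The "model" below is just a mean feature vector, a deliberately minimal stand-in for SIGL's graph-LSTM autoencoder, and the feature vectors are hypothetical.

```python
# Reconstruction-error anomaly scoring in miniature. The averaged
# profile stands in for a trained autoencoder; features are invented.

def fit(benign):
    """'Train' by averaging benign feature vectors."""
    dim = len(benign[0])
    return [sum(v[i] for v in benign) / len(benign) for i in range(dim)]

def score(model, v):
    """Reconstruction error: squared distance from the normal profile.
    High scores indicate behaviour the model cannot explain."""
    return sum((a - b) ** 2 for a, b in zip(model, v))

benign = [[1.0, 0.0], [1.0, 0.2]]      # hypothetical benign traces
model = fit(benign)
normal_score = score(model, [1.0, 0.1])
anomaly_score = score(model, [5.0, 3.0])
```

An installer would be flagged when its score exceeds a threshold calibrated on held-out benign installations.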
2017 IEEE International Conference on Cloud Engineering (IC2E), 2017
Unikernels are a rapidly emerging technology in the world of cloud computing. Unikernels build on research into library operating systems to deliver smaller, faster and more secure virtual machines, specifically optimised for a single application service. These features are especially useful in cost- or resource-constrained environments. However, as with any new technology, early adopters need to master many technical details, and understand many aspects of the mechanisms used to build and deploy unikernels. Both of these factors may slow adoption rates. In this paper, we present our initial experiments into the use of an approach for building unikernels that is accessible to those whose technical expertise is focused on web development. We present PHP2Uni: a tool chain that takes a website built from PHP files (PHP remains the most widely used web language) and builds a resource-efficient unikernel image from them, while requiring little knowledge of the underlying operating system software complexity.
Proceedings of the 3rd International Workshop on Practical Reproducible Evaluation of Computer Systems
Host-based anomaly detectors generate alarms by inspecting audit logs for suspicious behavior. Unfortunately, evaluating these anomaly detectors is hard. There are few high-quality, publicly available audit logs, and there are no pre-existing frameworks that enable push-button creation of realistic system traces. To make trace generation easier, we created Xanthus, an automated tool that orchestrates virtual machines to generate realistic audit logs. Using Xanthus' simple management interface, administrators select a base VM image, configure a particular tracing framework to use within that VM, and define post-launch scripts that collect and save trace data. Once data collection is finished, Xanthus creates a self-describing archive, which contains the VM, its configuration parameters, and the collected trace data. We demonstrate that Xanthus hides many of the tedious (yet subtle) orchestration tasks that humans often get wrong; Xanthus avoids mistakes that lead to non-replicable experiments.
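The orchestration the abstract describes (pick an image, enable a tracer, run scripts, collect, archive) is essentially a fixed plan whose steps must happen in order. A minimal sketch of such a plan builder follows; the step names and structure are illustrative assumptions, not Xanthus' actual interface.

```python
# Hypothetical orchestration plan for Xanthus-style trace generation.
# Step names are invented for illustration.

def build_plan(image, tracer, scripts):
    """Return the ordered steps a reproducible trace-generation run
    needs: boot the VM, enable tracing, run scenario scripts, then
    collect and archive everything self-describingly."""
    return (
        [f"boot {image}", f"enable {tracer}"]
        + [f"run {s}" for s in scripts]
        + ["collect traces", f"archive {image}+traces"]
    )

plan = build_plan("ubuntu-20.04.qcow2", "camflow", ["setup.sh", "attack.sh"])
```

Encoding the sequence as data rather than as manual steps is precisely what removes the "subtle tasks humans get wrong".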
2020 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 2020
The rise of IoT has led to large volumes of personal data being produced at the network's edge. Most IoT applications process data in the cloud, raising concerns over privacy and security. As many IoT applications are event-based and are implemented on cloud-based, serverless platforms, we have seen a number of proposals to deploy serverless solutions at the edge to address concerns over data transfer. However, conventional serverless platforms use container technology to run user-defined functions. Containers introduce their own issues regarding security (due to a large trusted computing base) and performance, including long initialisation times. Additionally, OpenWhisk, a popular and widely used container-based serverless platform available for edge devices, performs relatively poorly, as we demonstrate in our evaluation. In this paper, we propose to investigate unikernels as a solution for building serverless platforms at the edge, addressing in particular performance and security concerns. We present UniFaaS, a prototype edge-serverless platform that leverages unikernels (tiny, single-address-space library operating systems that contain only the parts of the OS needed to run a given application) to execute functions. The result is a serverless platform with extremely low memory and CPU footprints, and excellent performance. UniFaaS has been designed to be deployed on low-powered single-board computer devices, such as the Raspberry Pi or Arduino, without compromising on performance.
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020
This experimental study presents a number of issues that pose a challenge for practical configuration tuning and its deployment in data analytics frameworks. These issues include: 1) the assumption of a static workload or environment, ignoring the dynamic characteristics of the analytics environment (e.g., increase in input data size, changes in allocation of resources); 2) the amortization of tuning costs and how this influences which workloads can be tuned in practice in a cost-effective manner; and 3) the need for a comprehensive incremental tuning solution for a diverse set of workloads. We adapt different ML techniques in order to obtain efficient incremental tuning in our problem domain, and propose Tuneful, a configuration tuning framework. We show how it is designed to overcome the above issues and illustrate its applicability by running a wide array of experiments in cloud environments provided by two different service providers.
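The cost-amortization issue can be made concrete with a toy tuning loop: each configuration evaluation costs a real workload run, so the search must stay within a budget and keep the best configuration seen so far. This greedy sketch is a stand-in for Tuneful's ML-guided search (which uses techniques such as Gaussian processes); the cost function and the `executor_memory` parameter are invented for illustration.

```python
# Budget-limited configuration tuning sketch. The cost function and
# parameter names are hypothetical.

def tune(cost, candidates, budget):
    """Evaluate at most `budget` candidate configurations, returning
    the cheapest one seen. Each cost() call models a real workload run,
    which is why the budget matters for amortization."""
    best, best_cost = None, float("inf")
    for conf in candidates[:budget]:
        c = cost(conf)
        if c < best_cost:
            best, best_cost = conf, c
    return best, best_cost

# Hypothetical workload whose runtime is minimised near 4 GB executors.
cost = lambda conf: (conf["executor_memory"] - 4) ** 2 + 1
cands = [{"executor_memory": m} for m in (1, 2, 4, 8)]
best, c = tune(cost, cands, budget=3)
```

An incremental tuner would additionally carry `best` forward as the workload or environment drifts, rather than restarting the search from scratch.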
Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems, 2021
Edge computing is rapidly changing the IoT-Cloud landscape. Various testbeds are now able to run multiple Docker-like containers developed and deployed by end-users on edge devices. However, this capability may allow an attacker to deploy a malicious container on the host and compromise it. This paper presents a dataset based on the Linux Auditing System, which contains malicious and benign container activity. We developed two malicious scenarios, a denial-of-service and a privilege-escalation attack, where an adversary uses a container to compromise the edge device. Furthermore, we deployed benign user containers to run in parallel with the malicious containers. Container activity can be captured through the host system via system calls. Our time-series auditd dataset contains partial labels for the benign and malicious system calls. Generating the dataset is largely automated using the provided AutoCES framework. We also present a semi-supervised machine learning use case with the collected data to demonstrate its utility. The dataset and framework code are open-source and publicly available.
With the rapid increase in uptake of cloud services, issues of data management are becoming increasingly prominent. There is a clear, outstanding need for specified policy to control and track data as it flows throughout cloud infrastructure, to ensure that those responsible for data are meeting their obligations. This paper introduces Information Flow Audit, an approach for tracking information flows within cloud infrastructure. It builds upon CamFlow (Cambridge Flow Control Architecture), a prototype implementation of our model for data-centric security in PaaS clouds. CamFlow enforces Information Flow Control policy both intra-machine, at the kernel level, and inter-machine, on message exchange. Here we demonstrate how CamFlow can be extended to provide data-centric audit logs, akin to provenance metadata, in a format in which analyses can easily be automated through the use of standard graph-processing tools. This allows detailed understanding of the overall system. Combining a continuously enforced data-centric security mechanism with meaningful audit empowers tenants and providers to both meet and demonstrate compliance with their data management obligations.
Data provenance describes how data came to be in its present form. It includes data sources and the transformations that have been applied to them. Data provenance has many uses, from forensics and security to aiding the reproducibility of scientific experiments. We present CamFlow, a whole-system provenance capture mechanism that integrates easily into a PaaS offering. While there have been several prior whole-system provenance systems that captured a comprehensive, systemic and ubiquitous record of a system's behavior, none have been widely adopted. They either A) impose too much overhead, B) are designed for long-outdated kernel releases and are hard to port to current systems, C) generate too much data, or D) are designed for a single system. CamFlow addresses these shortcomings by: 1) leveraging the latest kernel design advances to achieve efficiency; 2) using a self-contained, easily maintainable implementation relying on a Linux Security Module, NetFilter, and other existing kernel facilities; 3) providing a mechanism to tailor the captured provenance data to the needs of the application; and 4) making it easy to integrate provenance across distributed systems. The provenance we capture is streamed and consumed by tenant-built auditor applications. We illustrate the usability of our implementation by describing three such applications: demonstrating compliance with data regulations; performing fault/intrusion detection; and implementing data loss prevention. We also show how CamFlow can be leveraged to capture meaningful provenance without modifying existing applications.
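A tenant-built auditor over captured provenance typically answers ancestry queries: given a suspicious object, what could have influenced it? The backward traversal below sketches that core query over a plain edge list; the (src, dst) record format is an illustrative simplification of CamFlow's actual W3C-PROV-style output.

```python
# Root-cause ancestry query over a provenance graph. The flat edge-list
# input is an assumption, simplified from real provenance records.
from collections import defaultdict, deque

def ancestors(edges, node):
    """Return every entity that could have influenced `node`, by
    walking provenance edges backwards (edges are src -> dst flows)."""
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)
    seen, queue = set(), deque([node])
    while queue:
        for p in parents[queue.popleft()] - seen:
            seen.add(p)
            queue.append(p)
    return seen
```

Because the result is exactly the influence set, a compliance auditor can, for example, check that no ancestor of exported data carries a regulated classification.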
The usual approach to security for cloud-hosted applications is strong separation. However, it is often the case that the same data is used by different applications, particularly given the increase in data-driven ('big data' and IoT) applications. We argue that access control for the cloud should no longer be application-specific but should be data-centric, associated with the data that can flow between applications. Indeed, the data may originate outside cloud services from diverse sources such as medical monitoring, environmental sensing, etc. Information Flow Control (IFC) potentially offers data-centric, system-wide data access control. It has been shown that IFC can be provided at operating system level as part of a PaaS offering, with an acceptable overhead. In this paper we consider how IFC can be integrated with application-specific access control, transparently to application developers, while building, from simple IFC primitives, access control policies that align with the data management obligations of cloud providers and tenants.
International Conference on Cloud Computing Technology and Science
There is a clear, outstanding need for new security mechanisms that allow data to be managed and controlled within the cloud-enabled Internet of Things. Towards this, we propose an approach based on Information Flow Control (IFC) that allows: (1) the continuous, end-to-end enforcement of data flow policy, and (2) the generation of provenance-like audit logs to demonstrate policy adherence and contractual/regulatory compliance. Further, we discuss the role of Trusted Platform Modules (TPMs) in supporting such a system, by providing hardware roots of trust. TPMs can be leveraged to validate software configurations, including the IFC enforcement mechanism, both in the cloud and externally via remote attestation.
Concern about data leakage is holding back more widespread adoption of cloud computing by companies and public institutions alike. To address this, cloud tenants/applications are traditionally isolated in virtual machines or containers. But an emerging requirement is for cross-application sharing of data, for example, when cloud services form part of an IoT architecture. Information Flow Control (IFC) is ideally suited to achieving both isolation and data sharing as required. IFC enhances traditional Access Control by providing continuous, data-centric, cross-application, end-to-end control of data flows. However, large-scale data processing is a major requirement of cloud computing and is infeasible under standard IFC. We present a novel, enhanced IFC model that subsumes standard models. Our IFC model supports 'Big Data' processing, while retaining the simplicity of standard IFC and enabling more concise, accurate and maintainable expression of policy.
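The core IFC check that these abstracts rely on can be stated in a few lines: a flow from one entity to another is safe only if the destination is at least as secret and the source at least as trustworthy. The sketch below uses tag-set labels, a common formulation in tag-based IFC models; the entities and tags are hypothetical examples, not this paper's enhanced model.

```python
# Classic tag-based IFC flow check. Entities carry a secrecy tag set
# and an integrity tag set; example labels are invented.

def can_flow(src, dst):
    """Data may flow src -> dst only if dst's secrecy dominates src's
    (no declassification) and src's integrity dominates dst's
    requirements (no endorsement)."""
    return (src["secrecy"] <= dst["secrecy"]
            and dst["integrity"] <= src["integrity"])

sensor = {"secrecy": {"medical"}, "integrity": {"hospital"}}
app = {"secrecy": {"medical", "research"}, "integrity": set()}
```

So the sensor's readings may flow into the research app (which is labelled at least as secret), but the untrusted app's output may not flow back and masquerade as sensor data, which is exactly the continuous, data-centric control the abstract describes.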
To realise the full potential of the Internet of Things (IoT), IoT architectures are moving towards open and dynamic interoperability, as opposed to closed application silos. This is because functionality is realised through the interactions, i.e. the exchange of data, between a wide range of 'things'. Data sharing requires management. Towards this, we are exploring distributed, decentralised Information Flow Control (IFC) to enable controlled data flows, end-to-end, according to policy. In this paper we make the case for IFC, as a data-centric control mechanism, for securing IoT architectures. Previous research on IFC focuses on a particular system or application, e.g. within an operating system, with little concern for wide-scale, dynamic systems. To render IFC applicable to IoT, we present a certificate-based model for secure, trustworthy policy specification, one that also reflects real-world IoT concerns such as 'thing' ownership. This approach enables decentralised, distributed, verifiable policy specification, crucial for securing the wide-ranging, dynamic interactions of future IoT applications.
Security concerns are widely seen as an obstacle to the adoption of cloud computing solutions, and although a wealth of law and regulation has emerged, the technical basis for enforcing and demonstrating compliance lags behind. Our CloudSafetyNet project aims to show that Information Flow Control (IFC) can augment existing security mechanisms and provide continuous enforcement of extended, finer-grained application-level security policy in the cloud.
We present FlowK, a loadable kernel module for Linux, as part of a proof of concept that IFC can be provided for cloud computing. Following the principle of policy-mechanism separation, IFC policy is assumed to be expressed at application level and FlowK provides mechanisms to enforce IFC policy at runtime. FlowK’s design minimises the changes required to existing software when IFC is provided. To show how FlowK can be integrated with cloud software we have designed and evaluated a framework for deploying IFC-aware web applications, suitable for use in a PaaS cloud.
given to the security, privacy and personal safety risks that arise beyond these subsystems; that is, from the wide-scale, cross-platform openness that cloud services bring to IoT.
In this paper we focus on security considerations for IoT from the perspectives of cloud tenants, end-users and cloud providers, in the context of wide-scale IoT proliferation, working across the range of IoT technologies (be they things or entire IoT subsystems). Our contribution is to analyse the current state of cloud-supported IoT to make explicit the security considerations that require further work.
In this paper we describe the properties of cloud computing—Platform-as-a-Service clouds in particular—and review a range of IFC models and implementations to identify opportunities for using IFC within a cloud computing context. Since IFC security is linked to the data that it protects, both tenants and providers of cloud services can agree on security policy, in a manner that does not require them to understand and rely on the particulars of the cloud software stack in order to effect enforcement.