Serverless Execution Engine for MULE
Ata Gunay
2024
MSc in Computer Science (Software Engineering)
Declaration
I hereby certify that this material, which I now submit for assessment on the program of study leading to the award of M.Sc. in Software Engineering, is entirely my own work and has not been taken from the work of others, save and to the extent that such work has been cited and acknowledged within the text of my work.
Date: 11/07/2024
Acknowledgements
Abstract
This project describes the creation of a serverless execution engine for MULE, with the aim of improving the programming education platform used at Maynooth University. MULE enables students to write code in a web-based setting, relying on the Virtual Programming Lab's Jail System for running the code. The current system's reliance on old, on-premise resources presents scalability, maintenance, and security issues, and the server is not dedicated solely to MULE. This project proposes a serverless, cloud-based solution to these problems that increases scalability, enhances security with a code-maliciousness rejection mechanism, and improves the validation framework. The project examines how the MULE/Virtual Programming Lab client and the Jail System communicate in order to determine the best AWS services for a scalable, cost-efficient serverless setup. It emphasizes the shift from a single unified system to a decentralized one, and relies on continuous integration and deployment for smooth updates to ensure the project's longevity. The design of the execution engine uses large language models (LLMs) for code security, WebSocket for real-time communication, and a multi-language execution approach. The evaluation demonstrates the system's effectiveness through unit, integration, and manual tests, affirming its ability to greatly improve programming education at Maynooth University.
Table of Contents
Declaration
Acknowledgements
Abstract
Table of Contents
1 Introduction
1.1 Objectives
1.2 Research Question
1.3 Motivation
1.4 Thesis Organization
2 Research & Background
3 Solution
3.1 Objectives
3.2 Discover the Existing System
3.2.1 MULE Client
3.2.2 Moodle VPL Plugin
3.2.3 VPL Jail Server
3.2.4 Reverse Proxy
3.2.5 Observations
3.3 System Design
3.3.1 Proxy Service
3.3.2 Dispatcher Service (Lambda)
3.3.3 Executor Service (Lambda managed ECS)
3.4 System Construction
3.4.1 Construction of the Software
3.4.2 Construction of the Infrastructure
3.5 Documentation
4 Evaluation
4.1 Objectives
4.2 Unit Tests
4.3 Integration Tests
4.4 CloudWatch Monitoring Dashboard
4.5 CloudWatch Alarms
5 Conclusion
5.1 Objectives
5.2 Epilogue
Appendices
Appendix A: Installation of the Client Side
Appendix B: Separation of the Tasks
Appendix C: An Example of README File
Appendix D: Handler Interface
References
Chapter 1
Introduction
1.1 Objectives
This chapter introduces the research topic and our main motivation for undertaking this project, together with the project's objectives and key results.
1.3 Motivation
This project aims to architect and develop a serverless, cloud-based solution that eliminates the need for on-premise server management, offers much better scalability, and qualitatively improves the user experience with the execution engine by providing a better validation framework. That framework should be private to Maynooth University and managed by the authorized people assigned by the university. This project will also deliver a code-maliciousness rejection mechanism for submitted student code to prevent malicious actions on the platform.
To tackle the old system's rigidity, this project will also offer an extensibility mechanism through self-mutating Continuous Integration and Continuous Delivery (CI/CD) pipelines, so that future updates to the project can easily be tested and deployed automatically.
1.4 Thesis Organization
This thesis is organized into five chapters, including the introduction and conclusion. The content and purpose of each chapter are briefly described as follows:
• Chapter 1: Introduction
In this chapter, we introduce the research topic, outline the research questions or hypothe-
ses, and explain the significance and objectives of the study. A brief overview of the
thesis structure is also provided.
• Chapter 2: Research & Background
This chapter encompasses research on various aspects of software and infrastructure essential for ensuring the project's maintainability and scalability. The discussed services and technologies are integral to the project's development and decision-making processes. Detailed explanations of their usage and implementation are provided in the subsequent chapters.
Chapter 2
Research & Background
2.1 Objectives
This section includes the research on different aspects of the software and infrastructure needed to make the project maintainable and scalable. The mentioned services and technologies are used in building the project or in decision-making steps. Usage and implementation details will be discussed in the next chapters.
2.2 Infrastructure as Code (IaC)
In this project, IaC is implemented with the AWS Cloud Development Kit (AWS CDK) because it is officially supported by AWS.
The AWS Cloud Development Kit (CDK) is an open-source software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation [3]. It allows developers to define cloud infrastructure using familiar programming languages such as TypeScript, Python, Java, C#, and others.
An overview of the AWS CDK architecture is shown in the figure below.
2.3 Continuous Integration and Continuous Delivery (CI/CD)
The project code is tracked with Git and hosted on GitHub. GitHub Actions runs the unit tests and updates some services on AWS. New code changes trigger the AWS pipeline, which can access the code stored on GitHub via an AWS CodeStar connection. Integration tests are run by AWS Step Functions, and the pipeline requires manual approval to deploy the latest changes. Manual approval is critical because end-users may experience connection loss during deployment, depending on the deployment strategy. Therefore, the deployment time should be decided manually.
Here you can find explanations of the services mentioned above.
• Git
Git lets developers see the entire timeline of their changes, decisions, and progression of any project in one place. From the moment they access the history of a project, the developer has all the context they need to understand it and start contributing [4].
• GitHub
GitHub hosts Git repositories and provides developers with tools to ship better code through command line features, issues (threaded discussions), pull requests, code review, or the use of a collection of free and for-purchase apps in the GitHub Marketplace [4].
• GitHub Actions
A GitHub Actions workflow is triggered when an event occurs in your repository, such as a pull request being opened or an issue being created. Your workflow contains one or more jobs, which can run in sequential order or in parallel. Each job runs inside its own virtual machine runner or inside a container [5].
• AWS CodeStar Connection
Each connection is a resource that you can give to AWS services to connect to a third-party repository, such as GitHub. For example, you can add a connection to a pipeline so that it starts your pipeline when a code change is made to your third-party code repository [6].
• AWS CodePipeline
AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline automates the build, test, and deploy phases of your release process every time there is a code change, based on the release model you define [6].
• AWS Step Functions
AWS Step Functions is a serverless orchestration service that lets developers create and manage multi-step application workflows in the cloud. A pipeline can automatically invoke all of the Lambda functions in the step function concurrently and collect their results.
• Manual Approval
In AWS CodePipeline, you can add an approval action to a stage in a pipeline at the point where you want the pipeline execution to stop, so that someone with the required AWS Identity and Access Management permissions can approve or reject the action.
Deployment strategies define how you want to deliver your software. AWS supports different types of deployment strategies, each with unique pros and cons, as listed below.
There were no real end-users during the development process of the project, so selecting a deployment strategy was not a critical question for us. We chose blue/green deployment because it is the default deployment strategy for many services on Amazon Web Services, and because it helps minimize downtime during application updates, mitigating risks around downtime and rollback.
• In-place deployments
In this strategy, the previous version of the application on each compute resource is stopped, the latest application is installed, and the new version of the application is started and validated [7].
• Blue/green deployment
Blue/green deployments enable you to launch a new version (green) of your application alongside the old version (blue), and monitor and test the new version before you reroute traffic to it, rolling back on issue detection [7].
• Canary deployment
The method will incrementally deploy the new version, making it visible to new users in a slow fashion. As you gain confidence in the deployment, you will deploy it to replace the current version in its entirety [7].
• Linear deployment
Linear deployment means traffic is shifted in equal increments with an equal number of minutes between each increment [7].
• All-at-once deployment
All-at-once deployment means all traffic is shifted from the original environment to the replacement environment all at once [7].
Cloud Computing is the practice of using a network of remote servers hosted on the Internet
to store, manage, and process data, rather than on a local server or personal computer.
Different types of services supplied by Amazon Web Services are used for the project.
A serverless architecture is a way to build and run applications and services without hav-
ing to manage infrastructure. Your application still runs on servers, but all the server
management is done by your cloud service provider. You no longer have to provision,
scale, and maintain servers to run your applications, databases, and storage systems. [8]
• AWS Lambda Function
AWS Lambda is a type of serverless compute service. The development team is only responsible for the code and the data, nothing else. Under the hood, these functions are managed virtual machines running managed containers. AWS Lambda functions can be triggered and exposed to the internet using the AWS API Gateway (APIGW) service, which supports both HTTP and WebSocket Lambda integrations.
Lambda follows a pay-per-use pricing model: you are charged based on the number of requests for your functions and the duration it takes for your code to execute.
For example, assume the underlying Lambda function is configured with 128 MB of memory, which equals 0.125 GB. The system has 1000 students, and these users trigger the Lambda 100 times each over a month, where, on average, each execution takes 5 seconds.
Total GB-seconds: 1000 × 100 × 5 s × 0.125 GB = 62,500 GB-seconds per month.
Total cost: at the published x86 rate of $0.0000166667 per GB-second plus $0.20 per million requests, this comes to roughly 62,500 × $0.0000166667 + 0.1 × $0.20 ≈ $1.06 per month, before the free tier is applied.
Figure 2.3: AWS Lambda Pricing Table
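The same estimate can be reproduced with a few lines of Go; this is a sketch in which the per-GB-second and per-million-request rates are assumptions taken from AWS's published x86 pricing, and the free tier is ignored:

package main

import "fmt"

func main() {
	const (
		users              = 1000
		invocationsPerUser = 100
		secondsPerRun      = 5.0
		memoryGB           = 0.125        // 128 MB
		pricePerGBSecond   = 0.0000166667 // assumed published x86 rate
		pricePerMillionReq = 0.20
	)

	requests := float64(users * invocationsPerUser)
	gbSeconds := requests * secondsPerRun * memoryGB
	cost := gbSeconds*pricePerGBSecond + requests/1_000_000*pricePerMillionReq

	fmt.Printf("GB-seconds: %.0f, estimated monthly cost: $%.2f\n", gbSeconds, cost)
	// Output: GB-seconds: 62500, estimated monthly cost: $1.06
}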
Firstly, Lambda has a runtime limit: functions cannot run longer than 15 minutes. This is not an issue for this use case, but in general Lambda functions are not suitable for long-running or stateful processes. Lambda also has a limit on HTTP payload size of 6 MB per payload. This means that when the client submits source and validation code and the request is larger than 6 MB, the request will be rejected, rendering the submission useless. An intermediate service could be built to intercept the message, upload it to S3, and then invoke the Lambda with an S3 URL. This, however, defeats the purpose of serverless in the first place, as the intermediate service would have to be some form of long-running application. Lastly, Lambda functions are an event-driven, inherently stateless platform, while code execution and real-time interaction via a terminal are inherently stateful operations.
• Amazon Elastic Container Service (ECS)
Amazon ECS is an opinionated container orchestration service that delivers the easiest way for organizations to build, deploy, and manage containerized applications at any scale [9]. When you choose to use Amazon ECS with AWS Fargate, Amazon ECS supports serverless container orchestration, so you can leverage more of AWS's operational excellence when it comes to scaling, maintaining availability, and securing your containerized workloads. You can define the scalability rules with the ECS task definitions. Furthermore, Elastic Load Balancers (ELB) can be configured with 'sticky sessions' such that requests from the same client always go to the same container regardless of the scaling policy.
For example, suppose 4 containers are enough to handle the load of 1000 students, the containers run for up to 2 hours each per day at peak load, and 1 container with a minimal configuration is always active. That amounts to roughly 1 × 24 + 3 × 2 = 30 container-hours per day.
Daily cost for vCPU: 30 container-hours × (vCPUs per task) × (Fargate price per vCPU-hour); at the published on-demand rate of roughly $0.04 per vCPU-hour, this is about $1.20 per day with 1 vCPU per task.
2.6 Storage
• Amazon Elastic Container Registry (ECR)
An Amazon ECR private registry hosts your container images in a highly available and scalable architecture. It allows you to track the history of the images via tags, and it is compatible with third-party services like GitHub Actions, CircleCI, etc.
We built pipelines to dockerize the existing code on each update. Therefore, the latest versions of the code are stored in ECR.
• Amazon S3
Amazon S3 is an object storage service. Object storage stores data in a flat structure, using unique identifiers to look up objects when requested. An object is simply a file combined with metadata, and you can store as many of these objects as you'd like. All of these characteristics of object storage are also characteristics of Amazon S3.
In Amazon S3, you store your objects in containers called buckets. You choose, at the very minimum, two things: the bucket name and the AWS Region you want the bucket to reside in. When you choose a Region for your bucket, all objects you put inside that bucket are redundantly stored across multiple devices, across multiple Availability Zones. A bucket name must be unique across all AWS accounts; AWS stops you from choosing a bucket name that has already been chosen by someone else in another AWS account.
Note that you can have folders inside of buckets to help you organize objects. However, remember that there is no actual file hierarchy supporting this on the back end; it is instead a flat structure where all files and folders live at the same level. Using buckets and folders implies a hierarchy, which makes the structure easy for people to understand.
The code sent by the students to our system is stored in the S3 service.
• Amazon DynamoDB
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. All of your data is stored on solid-state disks (SSDs) and is automatically replicated across multiple Availability Zones in an AWS Region, providing built-in high availability and data durability.
Execution requests and live containers are stored in the DynamoDB service.
2.7 Monitoring
• Amazon CloudWatch
Amazon CloudWatch can be used to collect and track metrics and logs in order to monitor application performance. You can build many custom dashboards, each one focusing on a distinct view of your environment. You can even pull data from different Regions into a single dashboard in order to create a global view of your architecture.
• AWS Regions
Regions are geographic locations worldwide where AWS hosts its data centers.
Here are a few examples of Region codes:
us-east-1: This is the first Region created in the east of the US. The geographical name
for this Region is N. Virginia.
ap-northeast-1: The first Region created in the northeast of Asia Pacific. The geographi-
cal name for this Region is Tokyo.
Inside every Region is a cluster of Availability Zones (AZ). An AZ consists of one or more
data centers with redundant power, networking, and connectivity. Since they’re located
inside Regions, they can be addressed by appending a letter to the end of the Region code
name. For example:
us-east-1a: an AZ in us-east-1 (Northern Virginia Region)
sa-east-1b: an AZ in sa-east-1 (São Paulo Region in South America)
• Amazon Virtual Private Cloud (VPC)
A VPC is an isolated network you create in the AWS cloud. Each VPC spans multiple Availability Zones within the Region you choose. After you create your VPC, you need to create subnets inside of this network. Think of subnets as smaller networks inside your base network, like virtual local area networks (VLANs) in a traditional on-premises network.
• AWS API Gateway
Amazon API Gateway is an AWS service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale [10].
VPL establishes HTTP and WebSocket connections to the same endpoint. This poses a problem when we want to automatically route all types of requests to the same Lambda, as we would need two API Gateway endpoints targeting the Lambda function: the HTTP API Gateway endpoint and the WebSocket API Gateway endpoint.
• Design Problem: AWS API Gateway and AWS Lambda Function
Even if the client were adjusted to support sending different protocol requests to various endpoints, the issue of maintaining state persists. Since Lambda operates on a managed platform, developers cannot control the routing of requests to different instances of the same Lambda function. Consequently, requests from the same user might not always reach the same instance, but rather trigger a new function invocation. This scenario poses no problem for stateless use cases, as the function does not rely on previous executions. However, when dealing with executing programs and transmitting I/O information, a problem arises: instance X must be aware of the actions performed by prior instances, and those instances must also be aware of how to proceed if they are not the most recent instance.
Figure 2.7: Green bubbles A-F represent different Lambda instances of the same function (definition). Yellow bubbles 1-10 represent requests. It can be seen that Lambda is scaled up, to the account concurrency limit, for any request happening during an existing request invocation.
If user B sends his first request to upload his code, the request lands on instance A. His second request to compile his code may land on instance B, which does not have the code. If user B's program relies on n inputs from the user, some inputs land on different instances and some on the same one. How should each instance respond? The problem of inputs landing on the wrong instance can be solved by storing a queue of user input messages: if a message lands on an instance that does not contain the program, the instance could fetch the source code, compile, run, update and empty the queue into the program, and then return the continuation of the program output.
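A minimal sketch of that queueing idea is shown below (hypothetical; the final design avoids the problem altogether by running executions in long-lived containers):

package sketch

import "sync"

// InputQueue buffers user input that arrived at a Lambda instance which does
// not currently host the running program. Once an instance has (re)compiled
// and started the program, it drains the queue into the process's stdin.
type InputQueue struct {
	mu     sync.Mutex
	inputs []string
}

// Push appends one pending input line.
func (q *InputQueue) Push(line string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.inputs = append(q.inputs, line)
}

// Drain returns all buffered inputs and empties the queue.
func (q *InputQueue) Drain() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	drained := q.inputs
	q.inputs = nil
	return drained
}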
Chapter 3
Solution
3.1 Objectives
This chapter includes our solution and the latest system design, as well as development steps.
You will gain a deep understanding of the new system’s working principles.
3.2 Discover the Existing System
To replace the VPL Jail Server with a brand-new serverless system, we had to understand the communication between the MULE client and the VPL Jail Server. As it is an outdated system, there is little documentation available online. This discovery step gave us strong insights into how to build the serverless system.
3.2.1 MULE Client
MULE is a Moodle-based app with the VPL plugin. Therefore, we need to set up Moodle first. A docker-compose file was generated to run the Moodle system with its database dependencies. It was then run on a Linux virtual machine.
You can find the docker-compose file and detailed set-up instructions in the Appendix A: Installation of the Client Side section.
After the setup, users can log in to the system with the username "user" and the password "bitnami".
3.2.2 Moodle VPL Plugin
VPL (Virtual Programming Lab) is an activity module that manages programming assignments [11]. As mentioned above, MULE is a Moodle-based app with the VPL plugin; therefore, the VPL plugin should be installed on Moodle to execute and test the code. Detailed set-up instructions are available in the VPL documentation [11].
After the setup, users can test the plugin by executing code snippets.
3.2.3 VPL Jail Server
The VPL Jail Server handles all the execution and evaluation phases. It communicates with the VPL plugin running on Moodle. When you set up the VPL plugin, you can use the default VPL Jail Server hosted publicly. However, we spun up a new server and used it for discovery; that approach allowed us to make more granular observations.
You should change the default host address of the VPL Jail Server to your own server's address on the VPL plugin settings page.
3.2.4 Reverse Proxy
A reverse proxy was built with Golang. The main idea is to forward the requests sent by the client to the server and wait for the server's response, then send the response generated by the server back to the client. In the meantime, it logs each request and response for the discovery step. The reverse proxy was located between the MULE client and the VPL Jail Server, so we had a chance to observe their communication.
3.2.5 Observations
After all these setups and demonstrations, we can present the outcomes of the discovery step. There are two different communication types between the client and the server: HTTP and WebSocket. All requests use the JSON-RPC format, which does not utilise HTTP path information.
JSON-RPC Protocol
All transfer types are single objects, serialized using JSON. A request is a call to a specific method provided by a remote system [12]. It can contain three members:
• method - A String with the name of the method to be invoked. Method names that begin with "rpc." are reserved for rpc-internal methods.
• params - An Object or Array of values to be passed as parameters to the defined method. This member may be omitted.
• id - A string or non-fractional number used to match the response with the request that it is replying to. This member may be omitted if no response should be returned.
The response, in turn, carries the following members:
• result - The data returned by the invoked method. If an error occurred while invoking the method, this member must not exist.
• error - An error object if there was an error invoking the method, otherwise this member must not exist. The object must contain the members code (integer) and message (string).
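To illustrate, the Go structures below mirror the members listed above and produce the kind of payload observed in the captured traffic; they are a sketch rather than the client's actual types:

package main

import (
	"encoding/json"
	"fmt"
)

// Request and Response mirror the JSON-RPC members described above.
// The "available" method name matches the first call made by the VPL client.
type Request struct {
	JSONRPC string         `json:"jsonrpc"`
	Method  string         `json:"method"`
	Params  map[string]any `json:"params,omitempty"`
	ID      string         `json:"id,omitempty"`
}

type Response struct {
	JSONRPC string `json:"jsonrpc"`
	Result  any    `json:"result,omitempty"`
	Error   any    `json:"error,omitempty"`
	ID      string `json:"id"`
}

func main() {
	req := Request{JSONRPC: "2.0", Method: "available", ID: "1"}
	body, _ := json.Marshal(req)
	fmt.Println(string(body)) // {"jsonrpc":"2.0","method":"available","id":"1"}
}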
HTTP Communication
The HTTP communication was responsible for initiating a session and uploading the student code, alongside specifying what type of execution was to be performed. Lastly, it was responsible for fetching evaluation results, if the session was an evaluation session.
Figure 3.6: HTTP Request 2
Websocket Communication
The client also establishes one or two WebSocket connections, depending on the situation: /monitor and /execute. The /monitor WebSocket connection is responsible for supplying compilation information such as compilation failures. It also monitors the elapsed time for both compilation and execution of a program, relaying this information back to the client every second. The /execute WebSocket connection is responsible for transmitting the stdout and stderr output streams back to the client, while also handling the stdin input stream from client input to program input and back.
If a compilation error occurs while the program is compiling, the system returns the error message to the client over the /monitor connection.
Figure 3.10: Websocket Execute Request and Response
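To illustrate how program output might be relayed over the /execute connection, the following Go sketch uses the gorilla/websocket package; the package choice, the endpoint path, and the hard-coded student program are assumptions, not the executor's actual implementation:

package main

import (
	"bufio"
	"log"
	"net/http"
	"os/exec"

	"github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{CheckOrigin: func(r *http.Request) bool { return true }}

// relay streams a child process's stdout to the /execute WebSocket client.
// The real executor also forwards stderr and stdin and reports elapsed time on /monitor.
func relay(w http.ResponseWriter, r *http.Request) {
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		log.Println(err)
		return
	}
	defer conn.Close()

	cmd := exec.Command("python3", "main.py") // hypothetical student program
	stdout, _ := cmd.StdoutPipe()             // error handling omitted in this sketch
	if err := cmd.Start(); err != nil {
		log.Println(err)
		return
	}

	scanner := bufio.NewScanner(stdout)
	for scanner.Scan() {
		if err := conn.WriteMessage(websocket.TextMessage, scanner.Bytes()); err != nil {
			return
		}
	}
	cmd.Wait()
}

func main() {
	http.HandleFunc("/execute", relay)
	log.Fatal(http.ListenAndServe(":8080", nil))
}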
3.3 System Design
Before going into the details of the services, I would like to show the diagram of the general structure. If you need it in the following sections, you can come back to this point and review the diagram again.
Figure 3.11: System Design
3.3.1 Proxy Service
The proxy service serves as a middleman to enable the proper routing of requests, both WebSocket and HTTP. Since Lambda accepts requests in a very specific format, it cannot be used to handle both HTTP and WebSocket at the same time. Furthermore, since the client uses JSON-RPC, there is no way to dynamically configure the application gateway, which is a load balancer in this case, to properly route requests. For this reason, this service accepts all requests and routes them based on session metadata stored in DynamoDB. To decrease latency, it temporarily caches each session's metadata so that it only needs to fetch the metadata once.
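A minimal sketch of this cache-then-lookup idea is shown below; the type and field names are assumptions, and the DynamoDB call itself is abstracted behind a lookup function:

package proxy

import "sync"

// SessionRoute is the metadata the proxy needs to route a request:
// which executor container (IP address) owns a given session ticket.
type SessionRoute struct {
	Ticket      string
	ContainerIP string
}

// RouteCache keeps recently used session metadata in memory so that the
// proxy only has to read DynamoDB once per session.
type RouteCache struct {
	mu     sync.RWMutex
	routes map[string]SessionRoute
	lookup func(ticket string) (SessionRoute, error) // e.g. a DynamoDB query
}

func NewRouteCache(lookup func(string) (SessionRoute, error)) *RouteCache {
	return &RouteCache{routes: make(map[string]SessionRoute), lookup: lookup}
}

// Resolve returns the cached route for a ticket, falling back to the lookup
// function on a cache miss and remembering the result.
func (c *RouteCache) Resolve(ticket string) (SessionRoute, error) {
	c.mu.RLock()
	route, ok := c.routes[ticket]
	c.mu.RUnlock()
	if ok {
		return route, nil
	}
	route, err := c.lookup(ticket)
	if err != nil {
		return SessionRoute{}, err
	}
	c.mu.Lock()
	c.routes[ticket] = route
	c.mu.Unlock()
	return route, nil
}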
3.3.2 Dispatcher Service (Lambda)
The main idea of this project is to replace the existing system with a cost-effective serverless architecture, and we have already discussed why serverless is so important for this project. For this purpose, the dispatcher service is of vital importance. The service orchestrates container lifecycle, scaling, and routing metadata. This approach enables the service to be serverless, as the Lambda can spin up the desired containers on demand, and each container can be used by multiple customers simultaneously, based on load. You can find the Lambda dispatcher service as defined with the AWS CDK below.
this.lambda = new Function(this, `mule-serverless-${props.stage}-${props.region}`, {
  functionName: `${props.functionName}-${props.stage}-${props.region}`,
  runtime: Runtime.FROM_IMAGE,
  architecture: Architecture.X86_64,
  code: Code.fromEcrImage(Repository.fromRepositoryName(this, 'mule-serverless-lambda', REPOS.LAMBDA.name), {
    tagOrDigest: 'latest'
  }),
  timeout: Duration.minutes(5),
  handler: Handler.FROM_IMAGE,
  environment: {
    OPENAI_API_KEY: openaiSecret.secretValue.unsafeUnwrap(),
    S3_BUCKET: this.userContent.bucketName,
    DYNAMO_EXECUTIONS_TABLE: this.executions.tableName,
    DYNAMO_CONTAINERS_TABLE: this.containers.tableName,
    CLUSTER_NAME: this.executionCluster.clusterName,
    TASK_DEFINITION_ARN: executionTaskDefinition.taskDefinitionArn,
    SUBNETS: vpc.privateSubnets.map((subnet: ISubnet) => subnet.subnetId).join(','),
    SECURITY_GROUPS: executorSecurityGroup.securityGroupId,
  }
});
The dispatcher service responds to three different requests: the first two HTTP requests (method: available and method: request) and the test-results HTTP request (method: getresult).
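A sketch of how the dispatcher might branch on these method names is shown below; the handler function names are illustrative assumptions:

package dispatcher

import (
	"encoding/json"
	"fmt"
)

// The three handlers are placeholders standing in for the real logic
// described in the following subsections.
func handleAvailable(body map[string]any) (any, error) { return "ready", nil }
func handleUpload(body map[string]any) (any, error)    { return "uploaded", nil }
func handleGetResult(body map[string]any) (any, error) { return "results", nil }

// HandleRequest branches on the JSON-RPC method name carried in the request body.
func HandleRequest(payload []byte) (any, error) {
	var body map[string]any
	if err := json.Unmarshal(payload, &body); err != nil {
		return nil, err
	}
	switch body["method"] {
	case "available":
		return handleAvailable(body) // first handshake request
	case "request":
		return handleUpload(body) // upload source files and pick a container
	case "getresult":
		return handleGetResult(body) // fetch evaluation results
	default:
		return nil, fmt.Errorf("unsupported method: %v", body["method"])
	}
}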
This is the first request between MULE and the server (method: available); we examined it as "HTTP Request 1" in the HTTP Communication section. The request aims to initialize the session between them and get a status-ready message from the server, after which MULE can send the student's code. At that point, the dispatcher service returns a proper JSON response, including a ready message and execution configuration such as the maximum file size or maximum memory size, with an OK status code. You can find the definition of the data transfer object that is returned below.
res := APIHandShakeResponse{
JSONRPC: "2.0",
ID: body["id"].(string),
Result: struct {
Status string `json:"status"`
Load int `json:"load"`
MaxTime int `json:"maxtime"`
MaxFileSize int `json:"maxfilesize"`
MaxMemory int `json:"maxmemory"`
MaxProcesses int `json:"maxprocesses"`
SecurePort int `json:"secureport"`
}{
Status: "ready",
Load: 0,
MaxTime: 1800,
MaxFileSize: 67108864,
MaxMemory: 2097152000,
MaxProcesses: 500,
SecurePort: 443,
},
}
This is the second request between MULE and the server (method: request); we examined it as "HTTP Request 2" in the HTTP Communication section. It aims to upload the student's code successfully. The dispatcher service extracts the source files and uploads these files to the S3 service.
Then, it checks whether there are any active containers in the ECS cluster. If there are no active containers, it creates a new container and returns its IP address. If there are active containers, it returns the IP address of the container with the lowest number of tasks running. If the cluster load is above 80%, it creates a new container asynchronously and returns the IP of the least loaded running container.
Finally, it gathers the session information into one data object and saves it to DynamoDB so that this information can be used later.
You can find the definition of the data transfer object that is returned below.
ticketInfo := TicketInfo{
AdminTicket: uuid.NewString(),
MonitorTicket: uuid.NewString(),
ExecutionTicket: uuid.NewString(),
Port: 80,
SecurePort: 443,
}
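The container-selection rule described above can be sketched as follows; the struct fields and the way the 80% threshold is reported are assumptions based on the description, not the project's exact code:

package dispatcher

// Container describes one running executor task as tracked in DynamoDB.
type Container struct {
	IP          string
	ActiveTasks int
	MaxTasks    int
}

// pickContainer returns the least loaded container's IP and reports whether a
// new container should be created asynchronously because load exceeds 80%.
func pickContainer(containers []Container) (ip string, needNewContainer bool) {
	if len(containers) == 0 {
		return "", true // no active containers: start one and wait for it
	}

	best := containers[0]
	active, capacity := 0, 0
	for _, c := range containers {
		if c.ActiveTasks < best.ActiveTasks {
			best = c
		}
		active += c.ActiveTasks
		capacity += c.MaxTasks
	}

	// Scale out when the cluster is above 80% load, but still answer
	// with the least loaded running container.
	overloaded := capacity > 0 && float64(active)/float64(capacity) > 0.8
	return best.IP, overloaded
}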
Also, to prevent malicious code uploads and improve security, a Large Language Model interrupts this step and verifies the supplied code, discarding the submission if it detects anything suspicious. The dispatcher service sends the user code to the OpenAI API to query whether the code contains anything malicious and/or outside of the usual lab settings. The template query used by the application is:
“Given the following code snippet from a file named %s, determine if it performs any operations outside of standard university-level coding questions. All programs supplied below should be relatively simple, student-level programs.
• outbound network operations are ALLOWED such as querying from known trusted sources
• Infinite loops and buffer overflows are acceptable as long as they are not used to exploit
the system.
• Memory leaks are acceptable as long as they are not used to exploit the system.
• Accessing external dependencies (except for allowed operations like package installa-
tions).
• Please answer ONLY with ‘True’ if the code is considered unsafe or malicious within the
context of standard university-level coding questions, ‘False’ otherwise. Do NOT provide
any additional explanations. Code: %s
It is very important to note that this is an experimental feature and should not be fully trusted. It is not yet known how well LLMs perform security validation, and they probably do not do very well on well-obfuscated examples. Industry-standard Static Application Security Testing (SAST) tools can be used to statically analyse user code for known malicious patterns found in known viruses. An example of a tool with this capability is Kiuwan, a subscription-based service offering static code analysis [13].
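A sketch of this check, using only the Go standard library to call the OpenAI chat completions endpoint, might look like the following; the model name and function signature are assumptions, and the full prompt template is the one quoted above:

package dispatcher

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strings"
)

const promptTemplate = "Given the following code snippet from a file named %s, " +
	"determine if it performs any operations outside of standard university-level " +
	"coding questions. ... Code: %s" // abbreviated; the full template is quoted above

// isMalicious sends the templated prompt to the OpenAI chat completions
// endpoint and treats an answer of "True" as a rejection.
func isMalicious(fileName, code string) (bool, error) {
	reqBody, _ := json.Marshal(map[string]any{
		"model": "gpt-4o-mini", // model name is an assumption
		"messages": []map[string]string{
			{"role": "user", "content": fmt.Sprintf(promptTemplate, fileName, code)},
		},
	})

	req, err := http.NewRequest(http.MethodPost,
		"https://api.openai.com/v1/chat/completions", bytes.NewReader(reqBody))
	if err != nil {
		return false, err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	var out struct {
		Choices []struct {
			Message struct {
				Content string `json:"content"`
			} `json:"message"`
		} `json:"choices"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return false, err
	}
	if len(out.Choices) == 0 {
		return false, fmt.Errorf("empty response from model")
	}
	return strings.EqualFold(strings.TrimSpace(out.Choices[0].Message.Content), "true"), nil
}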
This is the request for the evaluation of the code (method: getresult); it aims to test the code and receive the test results.
The dispatcher service searches for the session in DynamoDB using the "adminTicket" parameter: it finds the items whose adminTicket attribute equals the provided adminTicket value. Then, it pulls the results.txt file from the S3 service. The results are uploaded to S3 by the executor container (another service, discussed later), after which this service receives the request.
Figure 3.14: DynamoDB Result Record Example
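As an illustration, such a lookup could be written with the AWS SDK for Go v2 roughly as follows; the table name and the use of a scan with a filter expression are assumptions about the schema:

package dispatcher

import (
	"context"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

// findSessionByAdminTicket looks up the session record whose adminTicket
// attribute matches the given value. A real table would more likely expose
// adminTicket as a key or secondary index and use Query instead of Scan.
func findSessionByAdminTicket(ctx context.Context, adminTicket string) (map[string]types.AttributeValue, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return nil, err
	}
	client := dynamodb.NewFromConfig(cfg)

	out, err := client.Scan(ctx, &dynamodb.ScanInput{
		TableName:        aws.String("mule-serverless-executions"), // illustrative name
		FilterExpression: aws.String("adminTicket = :t"),
		ExpressionAttributeValues: map[string]types.AttributeValue{
			":t": &types.AttributeValueMemberS{Value: adminTicket},
		},
	})
	if err != nil {
		return nil, err
	}
	if len(out.Items) == 0 {
		return nil, fmt.Errorf("no session found for adminTicket %s", adminTicket)
	}
	return out.Items[0], nil
}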
The Lambda dispatcher may look very appealing after all these steps. However, in addition to the many advantages it brings us, there are also side effects that we should keep in mind. The most notable of these side effects is the cold start.
The cold start problem in AWS Lambda refers to the latency experienced when a Lambda function is invoked for the first time or after a period of inactivity. This latency occurs because AWS needs to provision resources to run the function. Here is a more detailed explanation:
• 1. Initialization: When a Lambda function is invoked, AWS must first allocate an exe-
cution environment, which includes provisioning the required CPU, memory, and other
resources.
• 2. Container Bootstrapping: Next, AWS needs to load the function’s code into a new
container and initialize any dependencies.
• 3. Function Initialization: Finally, any initialization code within the function (such as
establishing database connections or loading configuration files) must be executed.
You can find the general overview of the Lambda dispatcher service below.
Figure 3.15: DynamoDB Record Example
3.3.3 Executor Service (Lambda managed ECS)
The executor service executes and validates user code. The service is deployed as a large Docker Ubuntu image with the supported language compilers preinstalled, so that it can compile and execute user code.
All WebSocket requests are redirected to this service by the proxy service. The executor service accepts WebSocket requests and then processes them. The websocketHandler function handles all WebSocket requests.
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
    sessionId, err := websocketHandler(w, r)
    // ... error handling and session cleanup omitted ...
})
If it is the first request that the executor service receives from the proxy service, the message should include the session ID. This request did not exist in the original communication flow, because the files used to be sent to the server at runtime; in this system, however, we need to retrieve the files before the execution steps.
func IsFirstMessage(message string) bool {
return strings.HasPrefix(message, "session_id")
}
The executor service fetches the source files from S3 and the session details from DynamoDB according to the session ID.
lang, maxTimeSeconds, isValidation, _ := getSessionDetails(sessionId)
sourceFiles, _ := getFiles(sessionId, "source")
There are different language handlers to compile and execute the source files. The executor selects a language handler according to the information fetched from DynamoDB. Each handler implements an interface for setting up, compiling, and executing the source code. You can find the handler interface in the appendix.
handler, err := language.GetHandlerFor(lang)
session.Handler = handler
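A plausible shape for GetHandlerFor is a simple registry lookup, sketched below; this is an assumption about the implementation, and the real Handler interface has more methods (see Appendix D):

package language

import "fmt"

// Handler abstracts per-language setup, compilation, and execution; only the
// lookup-relevant method is shown in this sketch.
type Handler interface {
	Handles(language string) bool
}

// registry holds one handler per supported language (populated elsewhere).
var registry []Handler

// GetHandlerFor returns the first handler that claims the given language.
func GetHandlerFor(lang string) (Handler, error) {
	for _, h := range registry {
		if h.Handles(lang) {
			return h, nil
		}
	}
	return nil, fmt.Errorf("no handler registered for language %q", lang)
}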
If the request is a validation request, the executor fetches the validation files from the S3
service. Validation and execution files must have different names in this system version.
if isValidation {
validationFiles, _ := getFiles(sessionId, "validation")
for k, v := range validationFiles {
if _, ok := sourceFiles[k]; ok {
return fmt.Errorf("[%s] source and validation files should not have the same name (%s)", session.Id, k)
}
sourceFiles[k] = v
}
}
Monitor Request
In order for the request to be understood by the executor service, the proxy service must prepend a "context:monitor" marker to the request.
func IsContextMonitor(message string) bool {
return strings.HasPrefix(message, "context:monitor")
}
This request aims to initiate the compilation of the source code or the running of the validation tests; it was examined in the "Websocket Communication" part of Section 3.2. The request awaits an approval message, after which it can either launch an interactive terminal session between the student and the server or retrieve results if it is a validation request. The request type, whether validation or standard execution, is specified by the type parameter in the body of the second HTTP request (method: request).
if IsContextMonitor(message) {
if err := session.CompileProgram(); err != nil {
return session.Id, err
}
}
Execute Request
The request aims to execute the program and transfer the user inputs to the server, and the server outputs to the user's interactive terminal. The server calls the related execute function according to the language handler. For example, if the source files include Python code, the server calls the execute function of the Python handler.
if IsContextExecute(message) {
if HasContent(message) {
if err := session.ForwardMessageToProcess(message); err != nil {
return session.Id, err
}
} else {
if err := session.ExecuteProgram(); err != nil {
return session.Id, err
}
}
}
3.4 System Construction
Various services and approaches have been used for the construction of the entire system.
3.4.1 Construction of the Software
We use GitHub to store our code as Git repositories. The application is designed with modularity and separation of concerns in mind. The mono-repo approach of the existing solution was replaced with a repository per component. The structure is as follows:
1. [Typescript] mule-serverless-cdk
2. [Go] mule-serverless-lambda (dispatcher)
3. [Go] mule-serverless-proxy
4. [Go, Python] mule-serverless-executor
5. [Go] mule-serverless-lambda-integ
6. [Go] mule-serverless-proxy-integ
7. [Go] mule-serverless-executor-int
All the repositories can be found under the Mule Serverless Execution organization. However, you should send a request to join the organization, as the repositories are private.
We use GitHub Actions to deploy code to AWS for every repository except the CDK repository, because the CDK repository has its own trigger mechanism through the AWS CodeStar connection.
GitHub Actions is a continuous integration and continuous delivery (CI/CD) platform that
allows you to automate your build, test, and deployment pipeline. You can create workflows that
build and test every pull request to your repository, or deploy merged pull requests to production
[5].
I want to explain a workflow from one of our repositories for a deeper understanding.
name: Deploy on PR Merge
on:
  pull_request:
    types: [closed]
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    if: github.event.pull_request.merged == true
    steps:
      - uses: actions/checkout@v4
      # ... intermediate steps (build the image, configure AWS credentials, log in to ECR) are omitted in this excerpt ...
      - name: Push to ECR
        run: docker push 354384408511.dkr.ecr.eu-west-1.amazonaws.com/mule-serverless-lambda
This GitHub Actions workflow is designed to automatically deploy updates to an AWS Lambda function when a pull request (PR) is merged into the main branch of a repository. The workflow is triggered by pull request events, specifically when a PR is closed. It defines a job named deploy that runs on the latest version of an Ubuntu runner. The job has the necessary permissions to read the repository contents and write an ID token for authentication, and it only runs if the pull request has been successfully merged.
The workflow includes several steps. First, it checks out the repository to access its contents. It then builds a Docker image named mule-serverless-lambda using the repository's Dockerfile. Next, it configures AWS credentials for the workflow using a specified IAM role (oidcLambdaRole) and sets the AWS region to eu-west-1.
After configuring AWS credentials, the workflow logs in to Amazon ECR (Elastic Container Registry) to enable pushing Docker images to ECR. It tags the Docker image with the ECR repository URI and pushes the tagged image to the specified ECR repository. Finally, the workflow updates the AWS Lambda function (mule-serverless-lambda-gamma-eu-west-1) with the new Docker image from the ECR repository.
3.4.2 Construction of the Infrastructure
The infrastructure created within the scope of the project may need to be moved to different Amazon accounts. For this reason, we used the AWS Cloud Development Kit (AWS CDK), an infrastructure-as-code (IaC) approach, to build the system. You can find detailed information about these terms in the Research & Background chapter.
All the infrastructure the system uses is maintained in an Amazon account by a main self-mutating pipeline created with AWS CDK.
The CDK pipeline handles any subsequent changes to the deployment (or even in the pipeline
process itself). Therefore, developers modify the CDK deployment only through version control
after the pipeline creation. This feature makes it self-mutating, i.e., self-updating, as the pipeline
can automatically reconfigure itself [14].
Triggering the Pipeline
The pipeline can be triggered manually or automatically. The pipeline connects to the GitHub repository where all the infrastructure code is stored. The connection is established via the AWS CodeStar Connection service; you can find the definition of the connection below. Thanks to this connection, any update to the GitHub repository triggers the pipeline.
/**
 * Fetch GitHub repository
 * The secret key must be added to the AWS account
 * @param repo the information about the GitHub repository
 */
export const createSource = (repo: Repo) => {
  return CodePipelineSource.connection(
    repo.owner + '/' + repo.name, repo.branch,
    {
      connectionArn: "arn:aws:codestar-connections:eu-west-1:354384408511:connection/61bd033c-4790-4412-b8b0-b8510088116d",
    }
  )
}
Pipeline Stages
• 1. Source
The source stage fetches the code from the source code repository. Whenever we push something to the main branch of these repositories, i.e., merge a pull request, our pipeline will run, as we have configured the main branch as the pipeline trigger.
• 2. Build (created automatically by AWS)
The build stage has two roles: it converts the CDK code to CloudFormation templates and builds any assets that end up in S3 buckets. The developer can define this workflow, but the steps must produce the CDK synthesizing output. The phase in which the CDK tooling converts the CDK code to CloudFormation templates is called synthesizing. In this step, we also run the unit tests with the "npm run test" command.
• 3. UpdatePipeline
UpdatePipeline makes any changes to the pipeline, i.e., modifies it with new stages and assets if necessary. The developer cannot alter this stage. One thing to notice is that the pipeline process always initially runs with the currently saved version. If a change in version control introduces changes to the pipeline, the pipeline execution is cancelled in this stage and restarted with the new version.
• 4. Assets
In the assets stage, the pipeline analyzes the application stack and publishes all files to S3 and Docker images to ECR that the application needs for deployment. CDK Pipelines stores these assets using its own buckets and ECR registries. By default, they have no lifecycle policies, so the CDK developer should ensure that the assets do not increase the AWS bill unexpectedly.
Figure 3.17: Gamma Stage with the App and Monitoring Stacks
• 5. Gamma
The gamma stage creates and updates the application infrastructure resources. There are Application and Monitoring stacks; a stack is a collection of AWS resources you can manage as a single unit. The application stack contains all the infrastructure definitions necessary for the proper operation of the system, such as the Lambda dispatcher, proxy, and executor services. The monitoring stack, on the other hand, contains a CloudWatch dashboard with the metrics we selected.
• 6. Integration
This stage aims to run all integration tests concurrently and collect the results, so that the pipeline can decide whether to stop or continue. The integration stage contains only a stack named integration. The integration stack pulls the GitHub repositories that include 'INTEG' in their names and builds Lambda functions from this code. These Lambda functions run concurrently in a state machine supplied by the AWS Step Functions service. You can find the diagram of the state machine below.
Figure 3.18: Step Functions Graph
The integration stage checks the status of the state machine by a post step in the stage. You
can find the definition of the post step below.
integStageDeployment.addPost(integTestStep);
• 7. Manual Approval
The system needs manual approval by the system admin after all the infrastructure is deployed and the tests have passed. Test teams can apply user acceptance tests (UAT) on the gamma stack and then complete the UAT step by giving approval.
• 8. Production
This is the latest and most stable version of the app after all UATs and integration tests are
done. It includes whatever the gamma stage includes (application and monitoring stacks).
3.5 Documentation
Carlos and I organized many discussion and brainstorming sessions during the development process, and we published the meeting notes after each meeting. Our main goal was to prepare wiki pages and documents to flatten the learning curve for new participants in the project. We use Confluence to keep all the notes related to the project. In addition to the project papers, there is a README.md file for each repository. This file includes general information about the project, such as the test coverage percentage, and answers some important questions, such as how a developer builds the system or runs the tests. Last but not least, these files explain important business logic to make developers comfortable contributing. You can find an example of a README.md file in the Appendix C: An Example of README File section. Also, you can send an email to [email protected] if you come across any problem while accessing the resources.
Chapter 4
Evaluation
4.1 Objectives
This chapter includes several approaches used during the development process to answer the question: are we building the project right? You will find our project's verification steps.
Verification is the process of checking that software achieves its goal without any bugs. It is the process of ensuring that the product being developed is right, and it verifies whether the developed product fulfils the requirements that we have. Verification is static testing [15].
4.2 Unit Tests
Unit testing is the process where you test the smallest functional unit of code. Software testing helps ensure code quality, and it is an integral part of software development. It is a software development best practice to write software as small, functional units and then write a unit test for each code unit [16].
Each project in our system has several unit tests. These tests are run by the developer in the development environment before a new pull request, and by GitHub Actions in the CI/CD environment after a new pull request is opened. A GitHub Action calculates the code coverage percentage after the related tests are run on a new pull request; if the calculated code coverage is less than eighty percent, it throws an error and blocks the merge. You can find an example of our pipelines below.
Figure 4.1: GitHub Actions That Were Run On A New PR
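As an illustration, a table-driven unit test for the IsFirstMessage helper shown in Chapter 3 might look like this; it is a sketch placed in a _test.go file, and the real tests live in the private repositories:

package executor

import (
	"strings"
	"testing"
)

// IsFirstMessage is copied from the executor snippet shown in Chapter 3 so
// that this example test is self-contained.
func IsFirstMessage(message string) bool {
	return strings.HasPrefix(message, "session_id")
}

func TestIsFirstMessage(t *testing.T) {
	cases := []struct {
		message string
		want    bool
	}{
		{"session_id:1234", true},
		{"context:monitor", false},
	}
	for _, c := range cases {
		if got := IsFirstMessage(c.message); got != c.want {
			t.Errorf("IsFirstMessage(%q) = %v, want %v", c.message, got, c.want)
		}
	}
}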
4.3 Integration Tests
Automated integration testing validates system components and boosts confidence for new software releases. It resembles end-to-end testing of the application: the critical parts or main components can be tested by the integration tests.
In our case, we implement the integration tests as three different AWS Lambda functions. They all run in parallel in the same state machine as part of AWS Step Functions. Our app is built on three critical components: the Executor, the Proxy, and the Lambda dispatcher. Thus, we have three other repositories, Executor-Integration-Test, Proxy-Integration-Test, and Lambda-Integration-Test, which are run as AWS Lambda functions.
Any updates in the integration test repositories are pushed to the corresponding AWS Lambda functions via GitHub Actions.
4.4 CloudWatch Monitoring Dashboard
We can reach the system logs and usage metrics via the AWS CloudWatch dashboard; you can find a part of our dashboard below. It makes the development process more transparent, as developers can debug their code and see the system logs immediately in different environments like Alpha, Beta, or Gamma. Teams can also make data-driven technical decisions: in our case, if we would like to apply vertical scaling to one of our services, we can calculate the desired infrastructure precisely, because we know the usage of the existing system thanks to CloudWatch.
4.5 CloudWatch Alarms
CloudWatch can also notify you if your application is behaving differently than you expected. The goal for you as a developer or DevOps engineer should always be to be notified about errors in your system before your customers experience them [18].
We set different AWS CloudWatch alarms to track any unexpected behaviour or infrastructure usage level in the system. For example, we are notified when the CPU usage of an executor service exceeds 80%.
Chapter 5
Conclusion
5.1 Objectives
This chapter includes a short conclusion about the project. You can find information about the final version of the system, my growth areas, special thanks, and more.
5.2 Epilogue
In conclusion, we successfully delivered the features that affect the main use cases. The project also went beyond expectations by providing an integrated CI/CD pipeline that enables easy future updates to the service, a CloudWatch monitoring dashboard that allows infrastructure resources and logs to be monitored, and an Infrastructure as Code repository that programmatically defines the infrastructure for the entire application, making it easy to migrate the project to other AWS accounts and to expand availability to other Regions if the service is to be used outside of Ireland by other universities.
This project has significantly contributed to my growth in areas such as Amazon Web Services, Golang, and effective communication. It was my first project using Golang, AWS, and TypeScript for CDK. I had also never worked in an international team before, so explaining or understanding the current situation was sometimes hard for me. However, it was fun, and I am proud to be a part of this project and to make new students' code execution processes smoother. Thank you!
Appendices
Appendix A: Installation of the Client Side
version: '2'
services:
mariadb:
image: docker.io/bitnami/mariadb:11.0
environment:
# ALLOW_EMPTY_PASSWORD is recommended only for development.
- ALLOW_EMPTY_PASSWORD=yes
- MARIADB_USER=bn_moodle
- MARIADB_DATABASE=bitnami_moodle
- MARIADB_CHARACTER_SET=utf8mb4
- MARIADB_COLLATE=utf8mb4_unicode_ci
volumes:
- 'mariadb_data:/bitnami/mariadb'
moodle:
image: docker.io/bitnami/moodle:4.2
ports:
- '9922:8080'
- '443:8443'
environment:
- MOODLE_DATABASE_HOST=mariadb
- MOODLE_DATABASE_PORT_NUMBER=3306
- MOODLE_DATABASE_USER=bn_moodle
- MOODLE_DATABASE_NAME=bitnami_moodle
# ALLOW_EMPTY_PASSWORD is recommended only for development.
- ALLOW_EMPTY_PASSWORD=yes
volumes:
- 'moodle_data:/bitnami/moodle'
- 'moodledata_data:/bitnami/moodledata'
depends_on:
- mariadb
volumes:
mariadb_data:
driver: local
moodle_data:
driver: local
moodledata_data:
driver: local
Appendix B: Separation of the Tasks
Lambda Dispatcher
2. DynamoDB Adapter
3. S3 Adapter
4. CloudWatch Adapter
5. ECS Adapter
API Proxy
3. DynamoDB Adapter
VPL Proxy
CDK
1. Create Pipeline
(a) Gamma and Production Stage (CARLOS)
2. Create App Stack
(a) S3 Bucket (CARLOS)
3. Create Definitions for:
(a) Session/Execution Table (ATA)
(b) Available Container Table (CARLOS)
(c) Permissions (ATA/CARLOS)
4. Create VPC (CARLOS)
5. Create Executor and Proxy Cluster (CARLOS)
6. Create Task Definitions for Executor and Proxy (CARLOS)
7. Pass Executor Task Definition and VAC to Lambda Environment (CARLOS)
8. Define Fargate Service for Proxy Task (CARLOS)
9. Create ALB with Target Proxy Cluster (CARLOS)
10. Create Monitoring Stack for Above Services (CARLOS)
11. Create GitHub Actions Workflows
(a) Build, Test on Push, and Deploy on Pull Request Close (ATA)
12. Create Integrations Stack with Step Function (CARLOS)
13. Create ECR Repositories and GitHub Actions Roles for Image Deployment (ATA)
Executor
3. Develop Utility Functions for Stdout, Stderr, Stdin Interaction with Web Socket (CARLOS)
4. Develop Language Handler Abstract Usage Flow in Web Socket Loop (CARLOS)
6. DynamoDB Adapter
7. S3 Adapter
Appendix C: An Example of README File
Mule-Serverless-Executor
What?
The service allows remote communication by transmitting the compilation, execution, and
validation process I/O over WebSocket to the outgoing system. It is designed to mimic the
original VPL jailserver functionality in a more modern way.
5. Utility functions to help with creating language handlers can be found in ./internal/language/utils (look at example implementations for java, c, python, etc.)
7. To test the functionality of the new language handler, create a new handler_test.go in
the same directory as the new handler.go and write the tests (look at examples for other
language implementations as a guide).
Validation mechanism
The Validation mechanism’s goal is to provide a better lecturer experience when writing
tests to validate student labs. It replaces the existing shell script-based testing with Python and
dynamic, variable input user code execution alongside weighted test cases.
In order to validate user-supplied source code, this service exports its source code execution
mechanism as a C-library used by the Python-based runner script. This ensures that the same
underlying mechanism is used by the validation runner and the service, providing consistency
and a more stable experimental environment.
The validation scripts take in the following environment variables, which are mandatory:
3. Collects tests from the validation code
4. Dynamically loads the tests and execution library
5. Executes and aggregates test scores
6. Writes the results.txt file in /<os temporary folder>/<session id>
Build
Testing
To test the system locally, run go test ./... from the root directory.
If you wish to invalidate the test cache and re-run all test cases, run go clean --testcache
&& go test ./....
Appendix D: Handler Interface
// (earlier methods of the interface, such as those for setup and compilation, are omitted in this excerpt)

// Execute executes the project and returns the stdin, shouldContinue channel and an error
Execute(executablePath string,
    onStdout func(string) error,
    onStderr func(string) error,
    onClose func(error) error,
    timeoutMonitor *utils.TimeoutMonitor) (stdin io.WriteCloser, shouldContinue <-chan struct{}, err error)

// Handles checks if the handler can handle the given language.
// For example, if the language is java, the Java handler should return true and false for every other language.
Handles(language string) bool
}
References
[1] University of Las Palmas de Gran Canaria, “Vpl jail system 2.7.1 documentation,” 2024.
Accessed: 2024-06-18.
[2] K. Morris, Infrastructure as Code, 2E: Dynamic Systems for the Cloud Age. Sebastopol,
CA: O’Reilly Media, 12 2020.
[3] Amazon Web Services, “Aws cloud development kit (cdk) - developer guide.” https:
//docs.aws.amazon.com/cdk/v2/guide/home.html, 2024. Accessed: 2024-06-13.
[4] S. Chacon and B. Straub, Pro Git. New York, NY: Apress, 2nd ed., 2014.
[5] GitHub, “Understanding github actions,” 2024. Accessed: 2024-06-13.
[6] Amazon Web Services, “Overview of amazon web services - compute services,” 2024.
Accessed: 2024-06-13.
[7] Amazon Web Services, “Introduction to devops on aws: Deployment strategies,” 2023.
[8] Amazon Web Services, “Serverless architectures with aws lambda,” 2024. Accessed:
2024-06-17.
[9] Amazon Web Services, “Amazon ecs serverless recommendation guide,” 2024. Accessed:
2024-06-18.
[10] Amazon Web Services, “Amazon api gateway developer guide,” 2024. Accessed: 2024-
06-18.
[11] University of Las Palmas de Gran Canaria, “What is vpl?.” https://vpl.dis.ulpgc.
es/index.php/about/what-is-vpl, 2024. Accessed: 2024-06-19.
[12] Wikipedia contributors, “Json-rpc.” https://en.wikipedia.org/wiki/JSON-RPC,
2024. Accessed: 2024-06-20.
[13] Kiuwan, “Code security (sast).” https://www.kiuwan.com/code-security-sast/,
2024. Accessed: 2024-06-22.
[14] Findy Network, “Deploying with cdk pipeline.” https://findy-network.github.io/
blog/2023/05/08/deploying-with-cdk-pipeline/, 2023. Accessed: 2024-06-22.
[15] Infyom, “Verification and validation: What’s the difference?.” https://infyom.com/
blog/verification-and-validation-what%E2%80%99s-the-difference, 2024.
Accessed: 2024-07-01.
[16] Amazon Web Services, “What is unit testing?.” https://aws.amazon.com/what-is/
unit-testing/, 2024. Accessed: 2024-07-01.
[17] Amazon Web Services, “What is amazon cloudwatch?.” https://aws.amazon.com/
cloudwatch/, 2024. Accessed: 2024-07-05.
[18] AWS Fundamentals, “The benefits of using cloudwatch alarms in
your aws environment.” https://blog.awsfundamentals.com/
the-benefits-of-using-cloudwatch-alarms-in-your-aws-environment,
2024. Accessed: 2024-07-05.