Networking Case Study in Stem Education - Application Layer Protocol Labs


NETWORKING CASE STUDY IN STEM EDUCATION - APPLICATION LAYER PROTOCOL LABS


M. Mikac
University North (CROATIA)

Abstract
The TCP/IP network stack offers different communication services through defined standard transport
and network layer protocols. It represents a networking subsystem that is currently used by most of the
application layer protocols - for example, DNS (Domain Name System), FTP (File Transfer Protocol),
HTTP (Hypertext Transfer Protocol), different e-mail protocols, and many others. All application
protocols rely on transport layer protocols (TCP or UDP), but in their standard form, they completely lack
any data security mechanisms. That is why nowadays networks include "security layer" protocols such
as TLS (Transport Layer Security).
When introducing application protocols to our undergraduate STEM students, we provide clear lab
examples in which basic protocol facts can be checked by simply recording and analysing network traffic
at the host. This paper gives an introduction to a few often-used application layer protocols and their
relation to the TCP/IP network stack. It includes the description of our standard networking lab - using
the well-known network analyzer, Wireshark, students have to notice and try to explain some specific
events and protocol behaviours. Two application protocol-related labs are described - DNS and HTTP
labs. As part of the HTTP lab, the basics of securing transferred data are depicted, explained, and
hopefully, confirmed by students.
Generally, the goal of the paper is to provide an explanatory example that can help STEM students
understand the basics of application protocols that are used every day.
Keywords: TCP/IP, application protocol, Wireshark, STEM, networking, HTTP, DNS.

1 INTRODUCTION
The Internet is mainly based on ideas and protocols defined in the TCP/IP network stack ([1], [2]). Even
though those concepts date back to the 1980s, they are still used every day as the dominant underlying
foundation for application layer protocols, such as DNS (Domain Name System, [3]), HTTP (Hypertext
Transfer Protocol, [4]), FTP (File Transfer Protocol, [5]), different e-mail protocols and other proprietary
application layer protocols used in computer networks. The TCP/IP stack itself can be seen as a
layer-based system, as shown in Fig. 1. The lowest layer relates to the physical network technology used to
implement communication links (e.g. Ethernet, optical, wireless or mobile), while the network/Internet layer
always uses IP (Internet Protocol, IPv4 [6] or IPv6 [7], with current Google statistics [8] showing the
availability of IPv6 connectivity at around 35%).

[Figure 1 diagram: TCP/IP layers (RFC 1122) with example protocols/standards. Application: HTTP (port 80), DNS (port 53), HTTPS (port 443), IMAP, POP, SMTP, FTP; TLS sublayer between application and transport; Transport: TCP, UDP; Internet: IP (IPv4, IPv6), with ICMP and ARP; Link: IEEE 802.3, IEEE 802.11, LTE, HSDPA.]

Figure 1. TCP/IP network stack and some standard application protocols.

Some of the interesting issues and functionalities of IP that are presented to our students were described
in [9]. The transport layer of the TCP/IP stack provides TCP (Transmission Control Protocol, [10]) and UDP
(User Datagram Protocol, [11]); an introduction to transport layer protocols and the accompanying labs in our
networking course was given in [12]. The transport layer offers a communication
service to the highest layer protocols, the application protocols. As depicted in Fig. 1, there is also a
sublayer, which we could call a “security layer”, between the application and transport layers, offering secure
end-to-end communication without changing application protocols. It is currently dominated by the TLS
(Transport Layer Security) protocol.

Proceedings of EDULEARN21 Conference, 5th-6th July 2021. ISBN: 978-84-09-31267-2

When introducing application protocols to our undergraduate STEM students, we try to provide a clear
overview, followed by lab examples that give students the ability to check all basic protocol facts by
simply analysing the recorded network traffic. While for most labs in this course we use a test
environment based on the multiprotocol emulator IMUNES [13], using the principles described in [14], the
application layer labs can be performed on any network-enabled host, using a standard operating system
and the network analyzer tool Wireshark [15].
The paper is organised as follows: after this introductory section, the application protocols covered in
this paper (DNS, HTTP) are briefly described, followed by a short methodology section proposing
methods for performing lab tests. The results section covers the lab exercises related to DNS and HTTP,
confirming the theory and principles described in all previous sections.

2 APPLICATION LAYER PROTOCOLS


Application layer protocols are implemented on the highest layer of the TCP/IP stack and, by definition,
use lower layer protocols to communicate with the network (and, of course, other hosts or users). That
kind of communication among neighbouring layers is called vertical communication and in practice it can be
depicted as the encapsulation procedure [14]. In short, in order to communicate with the network, certain
protocol data needs to be transferred to the media (via the layer closest to the media, the physical/link layer).
After that, the data travels through the network (as a signal through some kind of media) and finally
gets to the destination, where the inverse process has to be performed: the data is extracted from
the media and pushed up to the appropriate layer and accompanying protocol.
Communication among hosts is possible only between the same protocols on the same layers;
that is so-called horizontal communication. For example, HTTP at the application
layer on one host can communicate only with HTTP at the
application layer on another host. In practice, an HTTP client sends a request to the HTTP
server; the server accepts the request and sends a response back to the client (using HTTP as
the set of communication rules). All the end-to-end communication is done using HTTP but,
clearly, lower layers are involved too. Without their assistance (remember, lower layers actually provide
a service for higher layers) HTTP data would not be able to get to the destination!
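The vertical communication described above can be illustrated with a minimal sketch. The byte strings used as "headers" are placeholders chosen for readability (real TCP, IP and link layer headers are binary structures), and the host name example.org is only illustrative.

```python
# Placeholder headers, ordered from the transport layer down to the link layer.
# Real headers are binary structures; these labels are for illustration only.
TCP_HEADER = b"[TCP]"
IP_HEADER = b"[IP]"
LINK_HEADER = b"[ETH]"
HEADERS_TOP_DOWN = [TCP_HEADER, IP_HEADER, LINK_HEADER]


def encapsulate(app_data):
    """Sender side: each lower layer prepends its own header."""
    unit = app_data
    for header in HEADERS_TOP_DOWN:
        unit = header + unit
    return unit


def decapsulate(frame):
    """Receiver side: strip the headers in reverse order, link layer first."""
    unit = frame
    for header in reversed(HEADERS_TOP_DOWN):
        if not unit.startswith(header):
            raise ValueError("unexpected header")
        unit = unit[len(header):]
    return unit


request = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
frame = encapsulate(request)      # what travels over the media
received = decapsulate(frame)     # what the HTTP layer on the other host sees
```

The receiving HTTP layer gets exactly the bytes the sending HTTP layer produced, which is the horizontal communication between the same protocol on two hosts.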

2.1 Confusing terms - application protocols, applications, apps


Being at the top of the networking subsystem, the application protocols can be considered the closest
to users (all of us). As a matter of fact, we all use different application layer protocols to perform many
of our everyday network-related activities: reading and sending e-mails via different e-mail protocols
(or HTTP when using webmail), checking the news and browsing social networks (all using
HTTP) etc. That being said, some often-used terms could cause confusion. As well
as using different network services, most of us use cellular phones with modern operating systems.
Remember installing some apps (applications) on your mobile phone?! The same may apply to our
desktop computers, where we install different programs, or applications. So the term application, or app, may
seem confusing to some of us, not only to our students! That is why we suggest explaining “the wider
picture” to the students, as in the lecture slide snapshot given in Fig. 2.
We all use some kind of computer and networking equipment (our tablets, laptops, cell phones, TVs), and
on those devices we use different programs (software, applications) to perform different operations.
Among them, we often use a web browser, an application that is in fact a program that visualizes web
pages (we shall not add to the confusion by mentioning web applications). But the web
browser needs to fetch those web pages from somewhere, from the network! For that, the browser uses
the networking subsystem of the operating system of the device (based on the TCP/IP stack) to perform the
part of its functionality related to network communication. That is depicted in Fig. 2: the user interacts
with the web browser (step 1, as depicted) and finally the web browser shows the result on screen (the final step,
depicted as step ? since it is unclear how many steps we could identify; that depends on how deeply we want
to analyse the process).

More precisely, the web browser uses HTTP (an application layer protocol) to obtain the web page, by
sending an HTTP request to the web server! Prior to that, it has to use DNS to obtain the required
information about the web server the user wants to access (basically, it has to obtain the logical IP address for
the given symbolic web server (domain) address). In each case, the web browser contacts the
application layer of the networking subsystem (step 2 in Fig. 2)!

Figure 2. Schematic view – “wider picture”– user, software application and TCP/IP stack.

In short, when talking about application protocols in our networking class, we refer to application layer
protocols of the TCP/IP stack (even though the term application protocol could also be used in
other contexts, e.g. for rules of communication between two programs). The term application relates to a certain
user-oriented piece of software: a program installed on the device, running on a certain operating system (e.g.
Windows on a desktop PC, Android on a mobile phone, some kind of smart or proprietary OS on a TV etc.).
Additionally, Fig. 2 shows how the operating system relates to the networking subsystem. While users
interact with user apps (programs, such as the previously mentioned web browser, e-mail clients or any
other software requiring networking) using some user interface (UI), the app, when necessary, must
interact with the networking subsystem (when handling user requests by sending something to the network
and receiving data from the network, before presenting it to the user). The app uses only the highest
layer protocols (HTTP, DNS, FTP etc.). Based on the communication process, it can be considered a
client or a server. A client application (or software process) usually initiates the communication by sending
some kind of request to the server, while the server waits for requests, processes them and finally
sends the response (most of the apps we use every day are clients; for example, the web browser is
an HTTP client). Less important for this paper, Fig. 2 also depicts that the networking subsystem at its lowest
layer must be able to use hardware-related functionalities of the operating system; for example,
it sends the data to device drivers, which are responsible for working with network cards etc.

2.2 Domain Name System – DNS


DNS is an important global public network subsystem responsible for associating symbolic domain
names with the various entity information necessary for the Internet to work properly. In essence, it is a
hierarchical and decentralized “database” with precisely defined rules for obtaining required control
information, such as public IP addresses, for publicly available hosts, using so-called domain names.
It is out of the scope of this paper to describe DNS in detail. Fig. 3 depicts the idea of the hierarchy:
at the top, there are 13 global root DNS servers [16], each containing the same database with all
top-level domain (TLD) server details. For each top-level domain (com, net, org and other generic TLDs
(gTLD) and, to mention only a few, hr (Croatia), es (Spain), de (Germany) as country code TLDs (ccTLD)),
the appropriate TLD server contains the database of all “subdomains” (e.g. facebook.com, google.com and
many others, for the .com generic TLD). Each domain must have its own authoritative name server (NS)
responsible for providing all DNS-related information for that domain. The closest to the end-users
are local DNS servers, which are responsible for contacting all required servers, when and if necessary,
and sending the response to the clients.

Figure 3. Concept of DNS hierarchy and communication – example of facebook.com lookup.

When our application (a web browser, for example) gets a user request to fetch information from some
web server (for instance, contacting Facebook with facebook.com or Google with google.com as the domain
name entered in the address field of the browser, as shown in Fig. 3), it has to connect to the server. But in
order to connect, it must have the server's public IP address (symbolic domain names are here primarily to
make it easier for us humans to remember the most used addresses). In general, it has to obtain the
server IP address based on the entered domain name, and that is why it has to contact the DNS before
making the connection. Additionally, each device can store a DNS cache locally (simply put, by recording
all previously obtained domain name and IP address pairs), which may allow skipping some of the steps
described next. For the sake of clarity, we do not consider the device's local DNS cache here.
Our browser requests the server IP address from the DNS: as a DNS client, it contacts the local DNS
server and sends a query (step 1 in Fig. 3). The local DNS server has to collect the necessary information and
respond with the results. In the worst-case scenario, the local DNS server has to contact a root NS in
order to get information about the TLD NS responsible for the .com domain (steps 2 and 3 in Fig. 3). After that,
it has to contact the .com TLD NS in order to get the address of the authoritative NS responsible for
facebook.com (steps 4 and 5). Finally, it has to contact the authoritative NS to get the IP address of
the web server (steps 6 and 7). The obtained IP address is sent from the local DNS server to the client (step 8)
and, finally, the browser can produce and send an HTTP request (and encapsulate it into TCP and IP, being
able to fill in all required control information in protocol headers, including the destination IP address in the IP
header). On the transport layer, DNS can use both UDP and TCP. Over UDP, DNS uses well-known port 53 on the
server, and UDP is usually used for sending queries and retrieving results, while TCP is used for
administrative purposes (the same port number, 53, is assigned to DNS over TCP).
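The worst-case lookup from Fig. 3 can be sketched as a small simulation. The three tables below are hypothetical in-memory stand-ins for the root, TLD and authoritative name servers; the server names and the address 192.0.2.10 (from a documentation address range) are illustrative assumptions, not real DNS data, and no network traffic is generated.

```python
# Hypothetical in-memory model of the DNS hierarchy; all names and the
# IP address are illustrative placeholders, not real DNS records.
ROOT_NS = {"com": "tld-ns.example"}                             # TLD -> TLD name server
TLD_NS = {"tld-ns.example": {"facebook.com": "auth-ns.example"}}
AUTH_NS = {"auth-ns.example": {"facebook.com": "192.0.2.10"}}


def resolve(domain):
    """Follow the steps a local DNS server takes when nothing is cached."""
    name = domain.rstrip(".")                   # accept FQDN with trailing dot
    tld = name.rsplit(".", 1)[-1]
    steps = ["1: client queries local DNS server for " + name]
    tld_ns = ROOT_NS[tld]
    steps.append("2-3: root NS refers to TLD NS " + tld_ns)
    auth_ns = TLD_NS[tld_ns][name]
    steps.append("4-5: TLD NS refers to authoritative NS " + auth_ns)
    address = AUTH_NS[auth_ns][name]
    steps.append("6-7: authoritative NS answers " + address)
    steps.append("8: local DNS server returns " + address + " to client")
    return address, steps


address, steps = resolve("facebook.com")
for line in steps:
    print(line)
```

A device-local or server-side cache would simply short-circuit this chain by returning a stored address before step 2.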

2.3 Hypertext Transfer Protocol – HTTP


Today, HTTP is probably the most used application protocol – initially for browsing the Internet and
nowadays with wider usage in different web and mobile applications for communicating with servers
without standard hypertext document fetch and visualization.
HTTP is encapsulated into TCP on the transport layer, meaning that any HTTP-based communication
includes a logical TCP connection [12]: before exchanging HTTP messages, the underlying TCP connection
needs to be established. The well-known port used by the default configuration of an HTTP server is 80. HTTP
communication is based on a request-response exchange: the client sends an HTTP request and the
server responds with an HTTP response. Based on configuration settings, the TCP connection can be “kept
alive” to handle multiple client requests. That is the default for HTTP/1.1, while the older HTTP/1.0 insisted on
closing the connection after each request-response exchange. Since TCP connection establishment is a
time-consuming operation [12], continuous establishment and closure of connections may impact
performance.
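As a minimal sketch of the request format and the keep-alive rule just described, the helpers below build the plain-text bytes of a GET request and decide whether the TCP connection stays open. The function names are our own, the host and path are those of the lab page used later in this paper, and the treatment of an explicit keep-alive token for HTTP/1.0 reflects a common extension rather than the base protocol.

```python
def build_get_request(host, path="/", keep_alive=True):
    """Return the plain-text bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        "GET " + path + " HTTP/1.1",
        "Host: " + host,
        "Connection: " + ("keep-alive" if keep_alive else "close"),
        "",          # empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def connection_persists(version, headers):
    """HTTP/1.1 keeps the connection open unless 'Connection: close' is sent.
    HTTP/1.0 closes it unless the keep-alive extension is requested."""
    token = headers.get("Connection", "").lower()
    if version == "HTTP/1.1":
        return token != "close"
    return token == "keep-alive"


request = build_get_request("unin.com.hr", "/edulearn")
print(request.decode("ascii"))
```

The printed request is exactly the kind of readable text that a network analyser shows inside the TCP payload.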
For standard application protocols, it is usual that protocol control information consists of simple text
keywords in the header, followed by the data payload. Since it is a clean, plain text format, control
information can easily be read using any network analyser tool. For educational purposes, that fits
perfectly, allowing students to check what is really being transferred. But from a security and privacy point
of view, that may be considered a flaw. That is why an additional secure sublayer is used to protect and
encrypt data being sent through the network.

2.4 Secure HTTP communication


Any transfer of content-sensitive or privacy-sensitive data using HTTP or any other insecure application layer
protocol represents a security risk. That is where a secure sublayer based on TLS comes in very handy: it
offers real-time secure communication between client and server, encrypting the data payload
encapsulated into TCP segments. When TLS is used in combination with HTTP, secure HTTP is abbreviated as
HTTPS. Instead of the well-known port 80, the HTTPS server's well-known port is 443 by default. The basic idea
of establishing secure communication in HTTPS is shown in Fig. 4.

Figure 4. The concept of client/server negotiation when establishing secure HTTPS communication.

The figure shows only the idea, not the technical details. When HTTPS is used, a TCP connection has to be
established, just as when using HTTP. But prior to exchanging any data, the client and the server
have to negotiate secure communication parameters. If both sides are able to meet the requirements (for
example, the server can insist on a newer version of TLS; if the client cannot support it, the secure
connection cannot be established), the procedure of exchanging and checking certificates starts. If
all the checks are successful, a symmetric key is generated and securely exchanged, and the secure
communication can start!
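The version requirement mentioned above can also be expressed on the client side. The sketch below uses Python's standard ssl module to create a context that checks the server certificate and refuses TLS versions older than a chosen minimum; TLS 1.2 as the minimum is our assumption for illustration. The second function shows the order of operations (TCP first, then the TLS negotiation) and is not called here because it needs network access.

```python
import socket
import ssl


def make_client_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Client-side TLS settings: verify the server certificate and
    refuse protocol versions older than `minimum`."""
    context = ssl.create_default_context()   # certificate and hostname checks on
    context.minimum_version = minimum
    return context


def open_secure_connection(host, port=443):
    """Establish the TCP connection, then run the TLS handshake on top of it.
    Raises an ssl.SSLError if negotiation or certificate checks fail."""
    raw = socket.create_connection((host, port))               # TCP connection
    return make_client_context().wrap_socket(raw, server_hostname=host)


context = make_client_context()
```

If the two sides share no acceptable TLS version, the handshake fails and no application data is exchanged, which matches the negotiation outcome described for Fig. 4.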

3 METHODOLOGY
While we use an adapted test environment for standard student labs, as described in [9], [12] and [14], labs
related to the analysis of application layer protocols can be performed in a standard desktop environment.
No emulated network is required and students can perform the analysis in a real-world network.
For the labs related to DNS and HTTP, a network-enabled PC is required, with the network analyzer
tool Wireshark [15] installed. The DNS lab example described in 4.1 uses the nslookup tool to connect to
DNS. It is a standard tool available on all major platforms, providing direct access to DNS by sending queries
and receiving DNS answers. It can be used for educational purposes to allow students to record and
analyse network traffic and discuss some details of DNS (detecting the transport protocol used, checking
DNS-related control information such as the type of query etc.). The official reference for the Windows OS
implementation of nslookup is given in [17].
Labs related to HTTP can be performed using any available web browser. In this case, Mozilla Firefox
was used, but the results should not differ significantly when using other browsers. For the provided
example, a shared hosting server with prepared content and a simple HTML page was used: the commercial
domain unin.com.hr is used, with Apache [18] as the HTTP server. For preparing HTML documents, the Visual
Studio Code editor was used [19].

In order to record network traffic, Wireshark is started, the proper network interface is selected and capturing
is activated. After completing each lab or part of a lab, the captured data is saved and analysed, following
the suggestions given in the official lab notes. Since the labs run in a real-world environment, a large
amount of unrelated network traffic is expected to be captured, so it is suggested that the recorded data be
filtered. The simplest way is to filter the data so that it matches the server IP address (the static IPv4 address of
the hosting server for unin.com.hr is 185.62.75.251). Additional functionalities of Wireshark, such as “Follow
TCP stream”, can be used if necessary.
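For students who prepare their filters in advance, a small helper can generate the Wireshark display filter text for a given server address. This is only a sketch: the function names are our own, and the output uses the standard display filter fields ip.src, ip.dst (ipv6.src, ipv6.dst for IPv6) and tcp.port, in the same form as the filter used in the DNS lab.

```python
import ipaddress


def host_filter(address):
    """Display filter matching packets sent to or from `address`."""
    ip = ipaddress.ip_address(address)          # raises ValueError on bad input
    proto = "ip" if ip.version == 4 else "ipv6"
    return "{0}.dst == {1} or {0}.src == {1}".format(proto, str(ip))


def server_port_filter(address, port):
    """Narrow the host filter to one TCP port, e.g. 80 for HTTP."""
    return "({}) and tcp.port == {}".format(host_filter(address), int(port))


print(host_filter("185.62.75.251"))
print(server_port_filter("185.62.75.251", 80))
```

The resulting strings are meant to be pasted into the Wireshark filter field.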

4 RESULTS
Using the methodology described in the previous section, students are encouraged to study the official lab
documentation and work on the given assignments. In this paper, only a few lab examples are covered, but they
should give enough hints and ideas for expanding or adapting. The results are briefly described and
accompanied by screenshots of the tools used. All the examples presented here relate to the domain unin.com.hr.

4.1 DNS
The nslookup tool allows us to interact with DNS. Depending on the implementation, it supports
different optional parameters that can affect the results; for example, the default DNS server can be
defined, the type of the DNS query selected etc.
The DNS lab example presented here queries the DNS for details about the domain name unin.com.hr.
When analysing the domain, a student who has followed the lectures should be able to determine the
top-level domain (hr) and its type (country code domain). Based on Fig. 3, it should be clear that the host
(DNS client) communicates only with the local DNS server (or the default server set with the proper parameter
when calling nslookup).
For this example, instead of using the local DNS server (in home networks, usually the server administered
by our Internet service provider), we selected the Google public DNS server [20] with IPv4 address 8.8.8.8
as the default “local” DNS server. After that, we send a default query to the DNS server by simply entering
the known domain name (the complete “trace” of the nslookup command prompt session can be seen in Fig. 5).

Figure 5. Possible usage of nslookup tool to query for unin.com.hr

The result, showing filtered traffic captured in Wireshark, is shown in Fig. 6. As can be noticed, the filter
used in Wireshark is ip.dst == 8.8.8.8 or ip.src == 8.8.8.8. Why use the IPv4 address 8.8.8.8?
Because it is the address of the DNS server we are using as the default in this example.
What can be read and concluded from the captured data shown in Fig. 6? Actually, a lot! First of all,
as seen directly in the central part of the Wireshark window, the result shows and proves that DNS uses
UDP as the transport protocol and well-known port 53 on the server side. Another thing that can be seen is that
a standard query including only the domain name (unin.com.hr) produces a total of 4 queries. The first two
try to get information about unin.com.hr.Home. The reason is that the name was entered without the trailing dot
of a fully qualified domain name (FQDN): the correct lookup query for our domain would use unin.com.hr. instead
of unin.com.hr. That behaviour is related to Windows networking and should be ignored. Apart from that, it can
be seen that two different queries about unin.com.hr were sent, type A and type AAAA: A asks
for the IPv4 address, while AAAA is a query for the IPv6 address.
Fig. 6 shows the details of the server response (DNS answer) to the type A query. As an answer, it returns
the public IPv4 address of unin.com.hr to the client. Also, when analysing the data, the content view at the bottom
of the Wireshark window can be used to determine that plain text data (e.g. the domain name) is transferred
as packet payload. The AAAA query does not return any IPv6 address since the server has no public IPv6 address.
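To connect the capture with the message format defined in RFC 1035 [3], the sketch below builds the bytes of a standard DNS query such as the ones recorded in this lab. The query ID 0x1234 is an arbitrary value chosen for illustration, and the function names are our own; the record type codes (1 for A, 28 for AAAA) and the label encoding follow the DNS specifications.

```python
import struct

QTYPE = {"A": 1, "AAAA": 28}      # record type codes
QCLASS_IN = 1                     # Internet class


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    labels = name.rstrip(".").split(".")
    return b"".join(bytes([len(label)]) + label.encode("ascii")
                    for label in labels) + b"\x00"


def build_query(name, qtype="A", query_id=0x1234):
    """Build a standard recursive query, ready to be sent in a UDP datagram."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPE[qtype], QCLASS_IN)
    return header + question


query_a = build_query("unin.com.hr", "A")
query_aaaa = build_query("unin.com.hr.", "AAAA")
print(query_a)
```

The labels unin, com and hr appear as readable ASCII inside the payload, which is what the content view in Wireshark reveals. Sending such a datagram to port 53 of a DNS server with a UDP socket would produce the traffic analysed above; that step is omitted here to keep the sketch free of network access.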

Figure 6. DNS query recordings in Wireshark (UDP at transport layer).

Additionally, negative server responses (the “No such name” response to the unin.com.hr.Home query) could
be analysed, and it could be seen that the default server (8.8.8.8) returns the name of one of the root DNS
servers (a.root-servers.net) as an administrative server that could have additional information about the domain.

4.2 HTTP
HTTP traffic for the analysis is generated using a web browser connecting to the prepared HTML example
page at the address http://unin.com.hr/edulearn (the index.html file located in the folder edulearn on the server). The
web page includes an HTML form (used to fill in some data, an imaginary username and password, and send
it to the server using the HTTP POST method) and two images: the EduLearn21 logo and our institution's logo.
Both images are located on the server. When rendered in the browser, the example page looks as shown
in Fig. 7, while the HTML code used for the web page is listed in Fig. 8.
A couple of tests can be done as part of this lab exercise. First, the web page access can be analysed. It
should show the TCP connection establishment and a few HTTP requests: first for the web page
(index.html), and then two more requests over the same TCP connection (HTTP/1.1 is used) for
fetching the image files (edulearn_logo.png and unin_logo.png).

Figure 7. Sample HTTP page at https://unin.com.hr/edulearn

Figure 8. Source HTML for sample page (Visual Studio Code snapshot).

Another example is recording the traffic after sending data through the HTML form to the server. The
screenshot of the traffic recorded in Wireshark, filtered so that it includes only the established TCP
connection, is given in Fig. 9. The selected packet shows plain text data being sent from the form to the
server (the username and password are clearly visible, and that kind of HTTP usage represents a security risk:
the password is visible to anyone able to record our network traffic).
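The reason the credentials are readable can be shown with a short sketch that builds the bytes of such a POST request and then reads the form fields back, as anyone capturing the traffic could. The field names username and password, their values, the target path and the default form encoding (application/x-www-form-urlencoded) are assumptions made for illustration; the actual form on the lab page may differ.

```python
from urllib.parse import parse_qs, urlencode


def build_post_request(host, path, fields):
    """Return the plain-text bytes of an HTTP/1.1 POST carrying form data."""
    body = urlencode(fields)                    # e.g. "username=...&password=..."
    head = "\r\n".join([
        "POST " + path + " HTTP/1.1",
        "Host: " + host,
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: " + str(len(body)),
        "",                                     # empty line ends the headers
        "",
    ])
    return (head + body).encode("ascii")


def read_form_fields(raw_request):
    """What an observer of unencrypted traffic can recover from the payload."""
    body = raw_request.split(b"\r\n\r\n", 1)[1].decode("ascii")
    return {key: values[0] for key, values in parse_qs(body).items()}


# Hypothetical credentials and path, used only for this illustration.
raw = build_post_request("unin.com.hr", "/edulearn/",
                         {"username": "student", "password": "secret123"})
seen = read_form_fields(raw)
print(seen)
```

No key or special tool is needed to recover the fields, only access to the captured bytes.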

Figure 9. Unsecure HTTP – POST to the server with readable username and password.

4.2.1 Secured HTTP - HTTPS


The same procedure used to record the traffic shown in Fig. 9 is used when analysing secure
communication. But instead of pointing the browser to http://unin.com.hr/edulearn, the address has to
be https://unin.com.hr/edulearn. By that, the browser is informed that we want to use secure HTTP.
That activates the process previously explained using Fig. 4. The recorded data (Fig. 10) shows that
a TCP connection is established, using server port 443, but no plain text can be found! That is because
the connection was successfully secured and all the payload, including the username and password, is
encrypted.

Figure 10. Secured HTTP – TCP connection, negotiation and “missing” username and password.

5 CONCLUSIONS
As part of our networking course, we try to provide a clear overview of all topics, followed by practical lab
examples used to confirm the theory presented during the lectures. When covering topics related to
application layer protocols, students are encouraged to study the official lab documentation and work on
the given assignments on their own, in their real networking environment. This paper described a few quite
simple and easy-to-understand lab exercises, covering and explaining the expected results. It is our intention
to organize the labs so that any student can confirm the principles learned during the course. Due to
space limitations, just two application protocols, DNS and HTTP, were included in the paper, but that
should give a proper overview of the methods used in our education process.

ACKNOWLEDGEMENTS
The registration fee and publication costs were paid by the author's private business, Inter-biz, and not
his home institution (University North). The author can be contacted via [email protected] or
[email protected].
Graphical elements in some figures (Fig. 3) were downloaded from the UxWing free icon library [21], while
other illustrative elements visible in figures that are snapshots of lecture notes were found in and imported
directly from Microsoft PowerPoint.

REFERENCES
[1] IETF, Network Working Group: “Requirements for Internet Hosts – Communication Layers”, RFC
1122 – status: Internet Standard, 1989. Retrieved from https://tools.ietf.org/html/rfc1122
[2] IETF, Network Working Group: “A TCP/IP Tutorial”, RFC 1180 – status: Informational, 1991.
Retrieved from https://datatracker.ietf.org/doc/html/rfc1180
[3] IETF, Network Working Group: “Domain Names – Implementation and specification”, RFC 1035 –
status: Internet Standard, 1987. Retrieved from https://datatracker.ietf.org/doc/html/rfc1035
[4] W3C (World Wide Web Consortium), HTTP Specifications and Drafts, Retrieved from
https://www.w3.org/Protocols/Specs.html
[5] IETF, Network Working Group: “File Transfer Protocol (FTP)”, RFC 959 – status: Internet Standard,
1985. Retrieved from https://datatracker.ietf.org/doc/html/rfc959

[6] Internet Protocol Specification, RFC 791 – status: Internet Standard, 1981. Retrieved from
https://datatracker.ietf.org/doc/html/rfc791
[7] Internet Protocol version 6 Specification, RFC 8200 – status: Internet Standard, 2017. Retrieved
from https://datatracker.ietf.org/doc/html/rfc8200
[8] Google IPv6 Statistics. Accessed 10. May 2021. Retrieved from
https://www.google.com/intl/en/ipv6/statistics.html
[9] M. Mikac, M. Horvatić, V. Mikac, “Networking case study in STEM education – IP fragmentation”, in
INTED 2020 Proceedings, IATED Academy, pp. 1068-1077, 2020.
[10] IETF – Information Sciences Institute (ISI), University of Southern California, “Transmission Control
Protocol – DARPA Internet Program Protocol Specification”, status: Internet Standard, 1981.
Retrieved from https://tools.ietf.org/html/rfc793
[11] IETF – J. Postel, ISI, “User Datagram Protocol”, status: Internet Standard, 1980. Retrieved from
https://tools.ietf.org/html/rfc768
[12] M. Mikac, “Networking case study in STEM education – transport layer protocol (TCP and UDP)
labs”, in EDULEARN20 Proceedings, IATED Academy, pp. 2328-2337, 2020.
[13] IMUNES – Integrated Multiprotocol Network Emulator/Simulator. Accessed 12. May 2021.
Retrieved from http://imunes.net
[14] M. Mikac, M. Horvatić, “An approach for teaching and understanding computer networks using
realistic emulation tool”, in ICERI 2019 Proceedings, IATED Academy, pp. 1209-1219, 2019.
[15] Wireshark Network Analyzer. Accessed 1. May 2021. Retrieved from https://www.wireshark.org/
[16] Internet Assigned Numbers Authority - IANA, “List of root servers”. Accessed 12. May 2021.
Retrieved from https://www.iana.org/domains/root/servers
[17] Microsoft Docs. Accessed 12. May 2021., Retrieved from https://docs.microsoft.com/en-
us/windows-server/administration/windows-commands/nslookup
[18] Apache HTTP Server. Accessed 13. May 2021. Retrieved from https://apache.org
[19] Microsoft Visual Studio Code, Development IDE. Accessed 12. May 2021. Retrieved from
https://code.visualstudio.com/
[20] Google Public DNS. Accessed 13. May 2021. Retrieved from
https://developers.google.com/speed/public-dns
[21] UxWing Free Download icons. Accessed 12. May 2021. Retrieved from
https://uxwing.com/tag/networking-icons/
