OCR PGOnline Full A-Level Textbook
Computer
Science
OCR AS and A Level
Computer Science
P.M. Heathcote
R.S.U. Heathcote
Published by
PG Online Limited
The Old Coach House
35 Main Road
Tolpuddle
Dorset
DT2 7EW
United Kingdom
[email protected]
www.pgonline.co.uk
2016
Acknowledgements
We are grateful to the OCR (Oxford Cambridge and RSA Examinations) for permission to use questions
from past papers.
The answers in the Teacher’s Supplement are the sole responsibility of the authors and have neither
been provided nor approved by the examination board.
We would also like to thank the following for permission to reproduce copyright photographs:
Screenshots of Arriva Bus App © Arriva PLC
Colossus photograph © The National Archives
Google Maps ‘StreetView’ © Google 2015
Screenshot from Roboform website © Roboform
Alan Turing © By kind permission of the Provost and Fellows, King’s College, Cambridge
from Archives Centre, King’s College, Cambridge. AMT/K/7/12
Trans-continental Internet connections © Telegeography
Internet registries map © Ripe NCC
thetrainline.com screenshot © by kind permission of thetrainline.com
Other photographic images © Shutterstock
A catalogue entry for this book is available from the British Library
ISBN: 978-1-910523-05-6
Copyright © P.M.Heathcote and R.S.U.Heathcote 2016
All rights reserved
No part of this publication may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means without the prior written permission of the copyright owner.
Printed and bound in Great Britain by Lightning Source Inc., Milton Keynes
Preface
The aim of this book is to provide detailed coverage of the topics in the new OCR AS and A Level
Computer Science specification.
The book is divided into twelve sections and within each section, each chapter covers material that can
comfortably be taught in one or two lessons. Material that is applicable only to the second year of the full
A Level is clearly marked. Sometimes this may include an entire chapter and at other times, just a small
part of a chapter.
Each chapter contains exercises and questions, some new and some from past examination questions.
Answers to all these are available to teachers only in a free Teacher’s Pack which can be ordered from
our website www.pgonline.co.uk.
This book has been written to cover the topics which will be examined in the written papers at both
AS and A Level. Sections 10, 11 and 12 relate principally to problem solving skills, with programming
techniques covered in sufficient depth to allow students to answer questions in Component 02.
Pseudocode, rather than any specific programming language, is used in the algorithms given in the
text. Sample Python programs which implement many of the algorithms are included in a folder with the
Teacher’s Pack.
This resource is endorsed by OCR for use with specifications H046/H446 AS Level Computer Science
and A Level Computer Science. In order to gain OCR endorsement, this resource has undergone
an independent quality check. Any references to assessment and/or assessment preparation are
the publisher’s interpretation of the specification requirements and are not endorsed by OCR. OCR
recommends that a range of teaching and learning resources are used in preparing learners for
assessment. OCR has not paid for the production of this resource nor does OCR receive any royalties
from its sale. For more information about the endorsement process, please visit the OCR website,
www.ocr.org.uk.
Contents
Section 1
Components of a computer
Chapter 1 Processor components
Chapter 2 Processor performance
Chapter 3 Types of processor
Chapter 4 Input devices
Chapter 5 Output devices
Chapter 6 Storage devices
Section 2
Systems software
Chapter 7 Functions of an operating system
Chapter 8 Types of operating system
Chapter 9 The nature of applications
Chapter 10 Programming language translators
Section 3
Software development
Chapter 11 Systems analysis methods
Chapter 12 Writing and following algorithms
Chapter 13 Programming paradigms
Chapter 14 Assembly language
Section 4
Exchanging data
Chapter 15 Compression, encryption and hashing
Chapter 16 Database concepts
Chapter 17 Relational databases and normalisation
Chapter 18 Introduction to SQL
Chapter 19 Defining and updating tables using SQL
Chapter 20 Transaction processing
Section 5
Networks and web technologies
Chapter 21 Structure of the Internet
Chapter 22 Internet communication
Chapter 23 Network security and threats
Chapter 24 HTML and CSS
Chapter 25 Web forms and JavaScript
Chapter 26 Search engine indexing
Chapter 27 Client-server and peer-to-peer
Section 6
Data types
Chapter 28 Primitive data types, binary and hexadecimal
Chapter 29 ASCII and Unicode
Chapter 30 Binary arithmetic
Chapter 31 Floating point arithmetic
Chapter 32 Bitwise manipulation and masks
Section 7
Data structures
Chapter 33 Arrays, tuples and records
Chapter 34 Queues
Chapter 35 Lists and linked lists
Chapter 36 Stacks
Chapter 37 Hash tables
Chapter 38 Graphs
Chapter 39 Trees
Section 8
Boolean algebra
Chapter 40 Logic gates and truth tables
Chapter 41 Simplifying Boolean expressions
Chapter 42 Karnaugh maps
Chapter 43 Adders and D-type flip-flops
Section 9
Legal, moral, ethical and cultural issues
Chapter 44 Computing related legislation
Chapter 45 Ethical, moral and cultural issues
Chapter 46 Privacy and censorship
Section 10
Computational thinking
Chapter 47 Thinking abstractly
Chapter 48 Thinking ahead
Chapter 49 Thinking procedurally
Chapter 50 Thinking logically, thinking concurrently
Chapter 51 Problem recognition
Chapter 52 Problem solving
Section 11
Programming techniques
Chapter 53 Programming basics
Chapter 54 Selection
Chapter 55 Iteration
Chapter 56 Subroutines and recursion
Chapter 57 Use of an IDE
Chapter 58 Use of object-oriented techniques
Section 12
Algorithms
Chapter 59 Analysis and design of algorithms
Chapter 60 Searching algorithms
Chapter 61 Bubble sort and insertion sort
Chapter 62 Merge sort and quick sort
Chapter 63 Graph traversal algorithms
Chapter 64 Optimisation algorithms
Index
Section 1
Components of a computer
In this section:
Chapter 1 Processor components
SECTION 1 – COMPONENTS OF A COMPUTER
• control unit
• buses
• arithmetic/logic unit (ALU)
• dedicated registers
Control Unit
The Control Unit controls and coordinates the activities of the CPU, directing the flow of data between
the CPU and other devices. It accepts the next instruction, decodes it into several sequential steps such
as fetching addresses and data from memory, manages its execution and stores the resulting data back
in memory or registers.
Buses
A bus is a set of parallel wires connecting two or more components of a computer. It typically consists of
8, 16, 32 or 64 lines.
The processor is connected to main memory by three separate buses. When the CPU wishes to access
a particular main memory location, it sends this address to memory on the address bus. The data in that
location is then returned to the CPU on the data bus. Control signals are sent along the control bus.
In the figure below, you can see that data, address and control buses connect the processor, memory
and I/O controllers. These three buses are known collectively as the system bus. Each bus is a shared
transmission medium, so that only one device can transmit along a bus at any one time.
Data and control signals travel in both directions between the processor, memory and I/O controllers.
Addresses, on the other hand, travel only one way along the address bus: the processor sends the
address of an instruction, or of data to be stored or retrieved, to memory or to an I/O controller.
CHAPTER 1 – PROCESSOR COMPONENTS
[Figure: The system bus, comprising the control bus, data bus and address bus]
Control bus
The control bus is a bi-directional bus, meaning that signals can be carried in both directions. The data
and address buses are shared by all components of the system. Control lines must therefore be provided
to ensure that access to and use of the data and address buses by the different components of the
system does not lead to conflict.
The purpose of the control bus is to transmit command, timing and specific status information between
system components.
Control lines include:
• Bus Request: indicates that a device is requesting the use of the data bus
• Bus Grant: indicates that the CPU has granted access to the data bus
• Memory Write: causes data on the data bus to be written into the addressed location
• Memory Read: causes data from the addressed location to be placed on the data bus
• Interrupt request: indicates that a device is requesting access to the CPU
• Clock: used to synchronise operations
Data bus
The data bus, typically consisting of 8, 16, 32 or 64 separate lines, provides a bi-directional path for
moving data and instructions between system components.
Address bus
Memory is divided up internally into units called words. A word is a fixed size group of digits, typically 16,
32 or 64 bits, which is handled as a unit by the processor, and different types of processor have different
word sizes.
Each word in memory has its own specific address. The address bus transmits the memory addresses
of words that are used as operands in program instructions, so that the data can be retrieved and sent
back to the processor. When an instruction has been performed and its result is to be stored at a
particular memory location, the address of that location is sent on the address bus and the result itself
is transmitted via the data bus.
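The way the three buses cooperate during a memory access can be sketched in a few lines of Python. This is a simplified model for illustration only, not a description of real hardware: the Memory class, its bus_cycle method, the "READ" and "WRITE" signal names and the 16-word memory size are all assumptions.

```python
class Memory:
    """Toy main memory, addressed one word at a time (hypothetical model)."""

    def __init__(self, size):
        self.words = [0] * size

    def bus_cycle(self, control, address, data=None):
        """Perform one transfer over the system bus.

        control -- signal on the control bus: "READ" or "WRITE"
        address -- value the processor places on the address bus
        data    -- value on the data bus (only needed for a write)
        Returns the value a read places on the data bus, otherwise None.
        """
        if control == "WRITE":
            self.words[address] = data      # Memory Write
            return None
        if control == "READ":
            return self.words[address]      # Memory Read
        raise ValueError("unknown control signal: " + control)


memory = Memory(16)
memory.bus_cycle("WRITE", address=5, data=42)   # store 42 at address 5
value = memory.bus_cycle("READ", address=5)     # fetch it back
print(value)   # 42
```

Note that the address only ever travels from the processor to memory, while the data travels in either direction depending on the control signal.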
Registers
Registers are special memory cells that operate at very high speed. All arithmetic, logical or shift
operations take place in registers and there are typically up to 16 general purpose registers in the CPU.
However, although most modern computers have many registers, some special-purpose processors
still use a single accumulator, in order to simplify the design. The accumulator takes the place of the
general purpose registers. For simplicity, we will assume that all calculations take place in a single
register called the accumulator.
Carrying out instructions one after the other requires many different pieces of information to be held.
As well as the accumulator, there are several other special-purpose registers:
• the program counter (PC), which holds the address of the next instruction to be executed.
This may be the next instruction in a sequence of instructions, or, if the current instruction is a
branch or jump instruction, the address to jump to, copied from the current instruction register (CIR)
to the PC.
• the current instruction register (CIR), which holds the current instruction being executed, divided
into operand and opcode.
• the memory address register (MAR), which holds the address of the memory location from which
data (or an instruction) is to be fetched or to which data is to be written.
• the memory data register (MDR), which is used to temporarily store the data read from or written
to memory. It is also sometimes known as the memory buffer register.
A simplified diagram showing the connections between these registers is shown below.
[Figure: Connections between the accumulator, MDR, PC, CIR, MAR and main memory]
Q1: Which registers hold and transfer data or instructions? Which registers hold and transfer
the memory addresses of data or instructions?
[Figure: The fetch-decode-execute cycle]
(Decode phase)
4. The instruction held in the CIR is decoded. The instruction is split into opcode and operand and
the opcode is used to determine the type of instruction and what hardware to use to execute it. The
operand holds either:
• the address of the data to be used with the operation, which is then copied to the MAR, or
• the actual data to be operated on, which will be copied to the MDR
• the data to be operated on may be passed to the ALU/accumulator
(Execute phase)
5. The appropriate instruction/opcode is carried out on the operand.
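The whole cycle can be illustrated with a very small simulation. This is only a sketch under stated assumptions: the four-instruction set (LDA, ADD, STA, HLT), the use of Python tuples for instructions and the memory layout are invented for the example, and the fetch steps follow the standard sequence (PC to MAR, memory to MDR, MDR to CIR, PC incremented).

```python
def run(memory):
    """Run a toy program using the registers named in this chapter."""
    pc = 0      # program counter
    acc = 0     # accumulator
    while True:
        # Fetch: copy the PC to the MAR, read memory into the MDR,
        # copy the MDR to the CIR and increment the PC.
        mar = pc
        mdr = memory[mar]
        cir = mdr
        pc += 1

        # Decode: split the instruction into opcode and operand.
        opcode, operand = cir

        # Execute: carry out the operation on the operand.
        if opcode == "LDA":       # load the accumulator from an address
            mar = operand
            mdr = memory[mar]
            acc = mdr
        elif opcode == "ADD":     # add the contents of an address
            mar = operand
            mdr = memory[mar]
            acc += mdr
        elif opcode == "STA":     # store the accumulator at an address
            mar = operand
            mdr = acc
            memory[mar] = mdr
        elif opcode == "HLT":     # stop and report the accumulator
            return acc
        else:
            raise ValueError("unknown opcode: " + opcode)


program = [
    ("LDA", 4),   # address 0
    ("ADD", 5),   # address 1
    ("STA", 6),   # address 2
    ("HLT", 0),   # address 3
    7,            # address 4: data
    8,            # address 5: data
    0,            # address 6: the result is stored here
]
result = run(program)
print(result, program[6])   # 15 15
```

Instructions and data share the same memory list here, which mirrors the way a single main memory holds both.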
Q2: At which stage in this process is the ALU needed? At which stage is the accumulator involved?
Exercises
1. (a) In the context of computer architecture, explain what is meant by the term bus. [2]
(b) Name three control lines used by the control bus. [3]
(c) What is the data bus used for? [2]
[Figure: Diagram of components numbered 1 to 5, connected to main memory]
(a) Provide full names for the components numbered 1 to 5 by completing the table below.
step 2a
2
step 2b
3 step 3
4 step 4
CHAPTER 2 – PROCESSOR PERFORMANCE
• Clock speed
• The number of cores, or duplicate processors, linked together on a single chip
• The amount and type of cache memory
Clock speed
The system clock generates a series of signals, switching between 0 and 1 millions or even billions of
times per second and synchronising CPU operations. Each CPU operation starts as the clock changes from
0 to 1 (or in some systems from 1 to 0), and the CPU cannot perform operations faster than the clock cycle
(the time the clock takes to go from 0 to 1 and back to 0).
All processor activities begin on a clock pulse, although some activities may take more than one clock
cycle to complete. One clock cycle per second = 1 hertz (Hz), and clock speed is usually measured in
gigahertz (GHz), where 1 GHz is 1 billion cycles per second. Typical speeds for a PC are between 2 and 4 GHz.
The greater the clock speed, the faster instructions will be executed.
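To get a feel for these figures, the length of one clock cycle can be worked out from the clock speed. The short Python sketch below does the arithmetic; the function name cycle_time_ns is an assumption chosen for the example.

```python
def cycle_time_ns(clock_speed_ghz):
    """Return the duration of one clock cycle in nanoseconds."""
    cycles_per_second = clock_speed_ghz * 1_000_000_000
    return 1_000_000_000 / cycles_per_second   # 1 second = 1,000,000,000 ns


for ghz in (2, 4):
    print(ghz, "GHz ->", cycle_time_ns(ghz), "ns per cycle")
# 2 GHz -> 0.5 ns per cycle
# 4 GHz -> 0.25 ns per cycle
```

Doubling the clock speed halves the time available for each cycle.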
Number of cores
In a traditional computer (von Neumann machine), instructions are fetched and executed one at a time
in a serial manner. However, many computers nowadays have multiple cores. A dual-core processor
has two processors linked together in the same integrated circuit, and a quad-core computer has four
linked processors.
Each core is theoretically able to process a different instruction at the same time with its own
fetch-execute cycle, making the processor two or even four times faster with a quad-core chip.
However, although a dual-core processor has twice the power, it does not always perform twice as
fast, because the software may not always be able to take full advantage of both processors.
Q1: Explain why computers with slower processors but larger cache memory are often found to
be faster in system performance tests than computers with faster processors but more limited
cache.
[Figure: CPU, cache and main memory]
A-Level only
Pipelining
Pipelining is a technique used by some processors to improve performance. Without pipelining, the steps
in the Fetch-Execute cycle take place one after the other. While the next instruction is being fetched, the
ALU, the arithmetic part of the processor, is idle.
Using pipelining, the computer architecture allows the next instructions to be fetched at the same time as
the processor is performing arithmetic or logic operations, holding them in a buffer close to the processor
until the instruction can be performed.
Processor pipelining is sometimes divided into an instruction pipeline and an arithmetic pipeline.
The instruction pipeline consists of the stages in which an instruction is moved through the processor,
including its being fetched, buffered and then executed. The arithmetic pipeline represents the parts
of an arithmetic operation that can be broken down and overlapped as they are performed.
Pipelining is now common in microprocessors used in personal computers. Intel’s Pentium chip uses
pipelining to execute as many as six instructions simultaneously.
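The benefit of overlapping the stages can be estimated with a simple count of clock cycles. The sketch below is an idealised model, assuming three stages (fetch, decode, execute) that each take exactly one clock cycle and a pipeline that never has to wait; real processors are more complicated, and the function names are hypothetical.

```python
def cycles_without_pipelining(instructions, stages=3):
    """Each instruction passes through every stage before the next starts."""
    return instructions * stages


def cycles_with_pipelining(instructions, stages=3):
    """The first instruction fills the pipeline; after that, one
    instruction completes on every clock cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


print(cycles_without_pipelining(10))   # 30
print(cycles_with_pipelining(10))      # 12
```

Under these assumptions the saving grows with the length of the instruction sequence, because the cost of filling the pipeline is paid only once.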
Data bus
The data bus transmits the data held in a word of memory, between processor components and memory.
The largest operand (which is either an address or an actual value) that can be held in a word is therefore
related to the size of the data bus. If the data bus is 16 bits wide, a word cannot hold an integer greater
than 2^16 - 1 (65,535), or more than two characters. A wider data bus can transmit larger values, or more
characters at a time, or allow more bits per instruction.
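The limits for other word sizes follow the same pattern. The Python sketch below assumes unsigned integers and 8-bit characters; the function names are chosen for the example only.

```python
def largest_unsigned(bits):
    """Largest unsigned integer that fits in a word of the given width."""
    return 2 ** bits - 1


def characters_per_word(bits, bits_per_character=8):
    """Number of whole characters that fit in one word."""
    return bits // bits_per_character


for width in (8, 16, 32, 64):
    print(width, "bits:", largest_unsigned(width),
          "max value,", characters_per_word(width), "characters")
```

For a 16-bit word this gives a maximum value of 65,535 and room for two characters, matching the figures in the text.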
[Figure: A 16-bit instruction, 0100010100000011, divided into the basic machine operation, the addressing mode and the operand]
In assembly language, the operation code (opcode) will be expressed as a mnemonic such as ADD,
SUB, LDA (load into the accumulator) etc. With only six bits for the opcode, there cannot be more than
2^6 (64) different instructions. The operand has to be held in only 8 bits. This would clearly not be practical in a
general purpose computer which is more likely to have a word size of 32, 64 or 128 bits.
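The fields of such an instruction can be picked apart with shifts and masks. The sketch below assumes a 16-bit layout of 6 bits for the basic machine operation, 2 bits for the addressing mode and 8 bits for the operand; the 2-bit addressing mode is inferred from the remaining bits rather than stated in the text, and the function name decode is hypothetical.

```python
def decode(instruction):
    """Split a 16-bit instruction into (operation, mode, operand)."""
    operation = (instruction >> 10) & 0b111111   # top 6 bits
    mode = (instruction >> 8) & 0b11             # next 2 bits
    operand = instruction & 0b11111111           # bottom 8 bits
    return operation, mode, operand


word = 0b0100010100000011
print(decode(word))   # (17, 1, 3)
print(2 ** 6)         # 64 possible operations with a 6-bit field
```

Masks of this kind are covered in more detail in Chapter 32.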
Exercises
1. Name and describe briefly three of the main factors affecting processor performance. [9]
(b) The machine for which this program is written has limited addressing capability.
What are the highest and lowest memory addresses that can be addressed by this
machine? [2]
(c) What is the width of the address bus in this machine? [1]
In a von Neumann machine, the same data bus is used to transfer both data and instructions.
Similarly, a single address bus is used to transfer the addresses of data and instructions. The same
word length is used for all memory, whether it holds data or instructions.
CHAPTER 3 – TYPES OF PROCESSOR
Harvard architecture
The Harvard architecture is a computer architecture with physically separate memories for instructions
and data. Harvard architecture is used extensively with embedded Digital Signal Processing (DSP)
systems. DSP applications include audio and speech signal processing, sonar and radar signal
processing, biomedical signal processing, seismic data processing and digital image processing.
• The two different memories can have different characteristics; for example, in embedded systems
instructions may be held in read-only memory while data memory requires read-write memory
• In some systems, there is much more instruction memory than data memory so a larger word size is
used for instructions
• the instruction address bus may be wider than the data bus
Embedded systems include special-purpose computers built into devices often operating in real time,
such as those used in navigation systems, traffic lights, aircraft flight control systems and simulators.
Harvard architecture can be faster than von Neumann architecture because data and instructions can be
fetched in parallel instead of competing for the same bus.
[Figure: Harvard architecture, showing the ALU, control unit, instructions memory, data memory and I/O]
Q1: Which kind of assembly language is generally easier for programmers to code in? An assembly
language for a CISC or RISC processor? Why?
Co-processor systems
A co-processor is an extra processor used to supplement the functions of the primary processor
(the CPU). It may be used to perform floating point arithmetic, graphics processing, digital signal
processing and other functions. It may not be a general-purpose processor with the ability to fetch its
own instructions, do input and output operations and so on. It generally carries out only a limited range
of functions.
[Figure: Job scheduler and job despatcher]
Some browsers such as Google Chrome and Mozilla Firefox can run several concurrent processes, and
a quad-core CPU-based mobile device will deliver higher performance than a single- or dual-core device.
All four CPUs may operate when tabbed browsing is used, for example.
A-Level only
A GPU is a form of co-processor, and can be used with a CPU to accelerate scientific, engineering,
analytics and other applications, offloading compute-intensive parts of an application to the GPU while
the remainder of the code runs on the CPU. From a user perspective, performance is significantly better.
[Figure: Application code divided between the GPU (compute-intensive functions, 10% of code) and the CPU (rest of sequential CPU code)]
Q2: The image on a computer screen is typically made up of about a million pixels at a common
resolution setting. Explain why a graphics card will improve the performance of a computer
running a 3-D game.
In March 2016 world champion Go player Lee Se-dol from South Korea was defeated by Google’s
DeepMind AlphaGo program. This was the first time a computer had been able to beat a top-ranked
(9-dan) professional player at the game. DeepMind started by taking a huge database of professional
Go matches and training a program to try to predict what move would come next in any given situation.
Exercises
1. (a) (i) Give the name of the computer architecture that uses the fetch-execute cycle with
a single control unit. [1]
(ii) Registers used during the fetch-execute cycle include the current instruction
register (CIR), memory address register (MAR), memory data register (MDR) and
program counter (PC).
Place ticks in the table to show which statements are correct during processing. [4]
Statement                                          CIR   MDR   PC
Holds a binary value
Always holds only an address
May change more than once during a single cycle
May pass a value to the MAR
(b) (i) Compare a Complex Instruction Set Computer (CISC) architecture with a Reduced
Instruction Set Computer (RISC) architecture. [4]
(ii) Explain one advantage, other than cost, of RISC compared with CISC. [2]
Barcodes
Barcodes first started appearing on grocery items in the 1970s, and today they are used for identification
in thousands of applications from tracking parcels, shipping cartons, passenger luggage, blood, tissue
and organ products around the world to the sale of items in shops and the recording of the details
of people attending events. Keeping track of anything accurately is now almost unimaginable
without barcodes.
There are two different types of barcode: linear (1D) barcodes such as the one shown above and 2D
barcodes such as the Quick Response (QR) code, which can hold more information than the 1D barcode.
A 2D barcode
2D barcodes are used for example in ticketless entry to concerts, or access through gates to board
a Eurostar train or passenger airline. They are also used in mobile phone apps that enable the user to
take a photo of the code which may then provide them with further information such as a map of their
location, product details or a website URL.
Barcode readers
There are four different types of barcode reader available, each using a slightly different technology for
reading and decoding a barcode. The four types are pen-type readers, laser scanners, CCD readers and
camera-based readers.
CHAPTER 4 – INPUT DEVICES
Pen-type readers
In a pen-type reader, a light source and a photo diode are placed next to each other in the tip of a pen.
To read a barcode, the tip of the pen is dragged across all the bars at an even speed. The photo diode
measures the intensity of the light reflected back from the light source and generates a waveform that is
used to measure the widths of the bars and spaces in the barcode.
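The idea of turning a waveform into bar widths can be sketched in Python. This is an illustration under stated assumptions rather than the method used by any particular reader: the light intensity is taken to be sampled at regular intervals as a number between 0 and 1, a threshold of 0.5 separates dark bars from pale spaces, and the function name bar_widths is hypothetical.

```python
from itertools import groupby


def bar_widths(samples, threshold=0.5):
    """Convert reflected-light samples into (colour, width) runs.

    Widths are measured in samples, so they are only proportional to
    the real widths if the pen moves at an even speed.
    """
    # Dark bars reflect little light; pale spaces reflect a lot.
    colours = ["space" if s >= threshold else "bar" for s in samples]
    return [(colour, len(list(run))) for colour, run in groupby(colours)]


samples = [0.9, 0.9, 0.1, 0.1, 0.1, 0.8, 0.2, 0.2, 0.9, 0.9, 0.9]
print(bar_widths(samples))
# [('space', 2), ('bar', 3), ('space', 1), ('bar', 2), ('space', 3)]
```

The sequence of relative widths, rather than the raw intensities, is what a decoder would then match against the barcode standard in use.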
Because of their simple design, pen-type scanners are the most durable type of barcode scanner, and
can be tightly sealed against dust, dirt, and other environmental hazards. However, their applications are
limited because they must come into direct contact with a barcode to read it.
Their small size and low weight make this type of barcode scanner ideally suited for use with portable
(laptop) computers or very low volume scanning applications.
Laser scanners
Laser scanners work in the same way as pen scanners except they use a laser beam as the
light source. They are available in a variety of forms, the most familiar being the in-counter units
in supermarkets. They are reliable and economical for low-volume applications.
A laser scanner
Camera-based readers
A camera-based imaging scanner uses a camera and image processing techniques to decode a 1D or
2D barcode. An imaging scanner can read a barcode on any surface, printed or onscreen, and can also
read a code that is damaged or poorly printed. Imaging scanners are used in multiple applications such as:
Digital cameras
A digital camera uses a CCD or CMOS (Complementary Metal Oxide Semiconductor) sensor comprising
millions of tiny light sensors arranged in a grid. The binary data from each sensor is recorded onto the
camera’s memory card so that the image can be reproduced using suitable software at a computer.
CCD sensors tend to produce higher quality images and are used in higher-end cameras. They are
also more reliable since the technology has been around for much longer. This, however, comes at the cost of
power consumption: a CCD sensor can use up to 100 times as much power as a CMOS sensor.
Q1: Suggest a suitable sensor type for use in a mobile phone camera and give a reason why.
Mastercard is testing a new app that allows customers to make purchases online by taking a selfie rather
than entering a password. Currently, Mastercard customers enter a password at the point of sale to
verify their identity, but these can be forgotten, stolen or intercepted.
Participants in the trial are prompted to take a photograph of their face using the Mastercard app, which
is then converted to a binary code using facial recognition technology. This is then compared with a
stored code and if the two match up, the purchase is approved.
RFID chip
Exercises
1. Describe three different input devices that are used by police for crime detection and prevention. [6]
2. Describe three different input devices used at a self-checkout in a supermarket, stating for what
purpose each of them is used. [6]
Output devices
Output devices take data produced by the computer and turn it into a form that humans can understand.
This could be, for example, written or spoken text, an image on a screen, music or a multimedia
presentation. A different type of output device is an actuator, which might respond to an input signal to
turn on a sprinkler, open or close windows in a greenhouse, or perform any number of other actions.
Common output devices include screens, printers, multimedia projectors, speakers and actuators.
Screens
There are various different screen technologies used for computers, phones and other devices.
LCD monitors
Liquid crystal display (LCD) monitors use groups of red, green and blue sub-pixels to form each pixel.
The screen is typically back-lit using light-emitting diodes (LEDs). These have several advantages over
older technology:
CHAPTER 5 – OUTPUT DEVICES
OLED screens can be used wherever LCD screens are used, for TV and computer screens and for MP3 player
and cell phone displays. In the future they may be used to make inexpensive animated billboards, super-thin
pages for electronic books and magazines, as paintings on a wall that can be updated from a computer
or even in clothes – so-called “wearable technology”.
They have many advantages over LCDs:
• when made of plastic rather than glass, they are theoretically flexible enough to print onto clothing
• they are much thinner
• they are brighter and need no backlighting, so they consume less power, which translates into longer
battery life in a portable device
• they respond up to 200 times faster than LCDs, which can be slow to refresh (a problem in
fast-moving sports or computer games)
• they produce truer colours through a much bigger viewing angle, unlike LCDs where the colours
darken and disappear if you look from the side
One drawback is that OLEDs do not last as long, tending to wear out around four times faster than
LCDs. They are also very sensitive to water, which is a potential problem in a cellphone.
Printers
Laser printers
Laser printers offer high-quality, high-speed printing. Their function is similar to that of a photocopier,
using powdered ink called toner.
This type of printer is becoming increasingly affordable and is frequently used as a home printer, in
businesses and in professional printing services. Colour laser printers are far more expensive to run
than black and white versions. They contain four toner cartridges (Cyan, Magenta, Yellow and Black or
CMYK) and the paper must go through a similar process to the black-only printer four times; once for
each colour.
The use of laser printers for print jobs other than text is limited by the quality of the print produced:
at about 1200 dpi, photorealistic prints are not achievable and are best left to inkjet printers.
[Figure: Laser printer, showing the laser, mirror unit, toner hopper, print drum, heated fuser, paper tray and paper exit]
Inkjet printers
Inkjet printers work by spraying minute dots of ink onto paper to create an image. Depending on the
resolution (dots per inch) of the model, the number of colour cartridges used and the quality of the
paper being used, they can produce excellent, photo-realistic images. They are cheaper than laser
printers but much slower, and the ink cartridges have to be replaced quite frequently.
Given the choice, it is preferable to use a laser printer when a lot of text needs to be printed, and an
inkjet printer to produce high quality photographic images.
3D printers
3D printers have been used to create car and aeroplane parts, medical equipment, prosthetic limbs,
fashion accessories and a multitude of other items. They have even, controversially, been used to
produce working firearms and other weapons.
They are used for creating spare parts for obsolete equipment and to produce prototypes of new
products. They can be used in many situations where a one-off item is required, for example to fill in the
missing parts of a dinosaur skeleton or a 2000-year-old artefact.
Multimedia projectors
What are the benefits of using a multimedia presentation in a classroom? There are many benefits both to
teachers and students:
• in the bad old days 20 or more students would crowd around a desk trying to catch a glimpse of
what the teacher was demonstrating on a 16” screen.
• copying down notes written on a chalkboard or whiteboard was a chore
• having an image to focus on while the teacher is explaining something can aid concentration
• watching educational videos or even live webcams adds interest to the lesson
From the teacher’s point of view, being able to prepare the lesson in advance and deliver it to several
different groups without having to write the same thing on the board every lesson, means the lessons are
consistent in quality. With the aid of a projector, the teacher can present text, graphics, audio and video
on the screen, display images or videos from the Internet, display PC applications or programs, and use
the screen interactively, adding impact to every lesson.
Multimedia projectors are now viewed as essential classroom tools.
Computer speakers
PCs, smartphones and other portable devices generally have a basic inbuilt speaker which can be used
to output music, voice or soundtracks from a video. High quality speakers can be bought separately and
when in use, they disable the inbuilt speakers.
Apart from playing music and video soundtracks, uses include giving verbal instructions in a sat-nav
system, reading text from the screen for visually impaired people, giving warning beeps and notification
alerts (e.g. when you receive an email).
Actuators
Actuators are motors that are commonly used in
conjunction with sensors to control a mechanism,
for example:
Exercises
1. Computer software is used in Geography lessons to teach students about weather systems.
(a) (i) State the purpose of an input device in a computer system when using
this software. [1]
(ii) State the purpose of an output device in a computer system when using this software. [1]
(b) Describe how the following forms of output will be used by the software.
2. State, with reasons, what type of printer you would recommend for the following applications:
(a) Invoice/delivery note printed on 3-part paper with 2 carbon copies. [3]
3. What type of screen would you recommend for an in-flight entertainment system?
Give reasons for your choice. [5]
CHAPTER 6– STORAGE DEVICES
Hard disk
A hard disk uses rigid rotating platters coated with magnetic material. Ferrous (iron) particles on the disk
are polarised to become either a north or south state. These two states represent 0 and 1. The disk is divided into
tracks in concentric circles, and each track is subdivided into sectors. The disk spins very quickly, at
speeds of up to 10,000 RPM. A drive head (like the needle on an old record player)
moves across the disk to access different tracks and sectors. Data is read from or written to the disk as
it passes under the drive head. When the drive head is not in use, it is parked to one side of the disk in
order to prevent damage from movement. A hard disk may consist of several platters, each with its own
drive head.
[Figure: a hard disk, showing the platters, the read/write head, a track, a sector and a cluster of four sectors]
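As a rough worked example of why the spin speed matters, the sketch below calculates the time taken for one revolution at the 10,000 RPM quoted above, and the average wait for a sector to arrive under the drive head (taken here to be half a revolution). The function name is made up for illustration.

```python
def rotational_latency_ms(rpm):
    """Return (time for one revolution, average wait) in milliseconds."""
    revolution_ms = 60_000 / rpm      # there are 60,000 ms in one minute
    average_ms = revolution_ms / 2    # on average the sector is half a turn away
    return revolution_ms, average_ms


revolution, average = rotational_latency_ms(10_000)
print(f"One revolution: {revolution:.1f} ms, average wait: {average:.1f} ms")
# One revolution: 6.0 ms, average wait: 3.0 ms
```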
Although hard disks are less portable than optical or solid state media, their huge capacity makes them
very suitable for desktop purposes. Smaller, denser surface areas spinning under the read-write heads
mean that newer 3.5 inch disks have capacities of up to 640GB.
Optical disk
Optical disks come in three different formats: read-only (e.g. CD-ROM), recordable (e.g. CD-R) and
rewritable (e.g. CD-RW). An optical disk works by using a high powered laser to “burn” (change the
chemical properties of) sections of its surface, making them less reflective. A laser at a lower power is
used to read the disk by shining light onto the surface and a sensor is used to measure the amount of
light that is reflected back. A read-only CD-ROM disk pressed during manufacture has pits in its surface.
Those areas that have not been pitted are called lands. At the point where a pit starts or ends, light is
scattered and therefore not reflected so well. Reflective and non-reflective areas are read as 1s and 0s.
There is only one single track on an optical disk, arranged as a tight spiral.
A CD-ROM holds about 700MB of data, whereas a Blu-Ray disk (designed to supersede the DVD disk)
can hold 50GB. These disks are the same size; their added capacity is owing to the shorter wavelength
in the laser they use. This creates much smaller pits, enabling a greater number to fit in the same space
along the track and also means that the track can be more tightly wound, and therefore much longer.
Recordable disks use a reflective layer with a transparent dye coating that becomes less reflective when
a laser “burns” a spot in the track.
Rewritable compact disks use a laser and a magnet in order to heat a spot on the disk and then set
its state to become a 0 or a 1 using the magnet before it cools again. A DVD-RW uses a phase
change alloy that can change between amorphous and crystalline states by changing the power
of the laser beam.
Optical storage is very cheap to produce and easy to send through the post for distribution purposes.
It can however be corrupted or damaged easily by excessive sunlight or scratches.
Q1: What type of storage medium is commonly used to hold a feature-length film?
Inside, however, instead of platters and a read-write head, there is an array of chips arranged on a board.
These components are put into the standard size “housing” so that they fit into existing laptops and
desktop PCs. Solid state memory comprises millions of NAND flash memory cells, and a controller that
manages pages and blocks of memory. Each cell works by delivering a current along the bit and word
lines to activate the flow of electrons from the source towards the drain. The current on the word line
however is strong enough to force a few electrons across an insulated oxide layer into a floating gate.
Once the current is turned off, these electrons are trapped. The state of the NAND cell is determined
by measuring the charge in the floating gate. No charge (with no electrons) is considered a 1 and some
charge is considered a 0.
Data is stored in pages (typically 4KiB each), grouped into blocks of say, 512KiB. NAND flash memory
cannot overwrite existing data. The old data must be erased before data can be written to the same
location, and although data can be written in pages, the technology requires the whole block to be
erased. As writing to a specific block of NAND cells cannot be done directly, a separate block is created
to mirror the data to be transferred to the solid state memory and the data is then written to the new
block. The contents of the original block are marked as “invalid” or “stale” and are erased when the user
wants to write new data to the drive.
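The write-then-erase behaviour described above can be sketched as a small model. This is only an illustration under simplifying assumptions: the class and function names are invented, and a block here holds 4 pages rather than the 128 pages implied by 4 KiB pages in a 512 KiB block.

```python
PAGES_PER_BLOCK = 4   # assumed small value to keep the example readable


class Block:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK   # None means erased (writable)
        self.stale = False

    def write_page(self, index, data):
        if self.pages[index] is not None:
            raise ValueError("page must be erased before it can be rewritten")
        self.pages[index] = data

    def erase(self):
        # Erasing always clears the whole block, never a single page
        self.pages = [None] * PAGES_PER_BLOCK
        self.stale = False


def update_page(old_block, index, new_data):
    """Copy the block into a fresh one with one page changed; mark the old block stale."""
    new_block = Block()
    for i, data in enumerate(old_block.pages):
        value = new_data if i == index else data
        if value is not None:
            new_block.write_page(i, value)
    old_block.stale = True
    return new_block


block = Block()
block.write_page(0, "A")
block.write_page(1, "B")
updated = update_page(block, 1, "B2")
print(updated.pages, block.stale)   # ['A', 'B2', None, None] True
```

The stale block would be erased later, when the space is needed for new data.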
[Figure: a NAND flash memory cell, showing the word line, control gate, oxide layers, floating gate, bit line, source, drain and ground]
Although capacity is still relatively low, solid state media have faster access speed than hard disks. With
no need to move a read-write head across the disk, one piece of data can be accessed just as quickly as
any other, even if they are not close together.
SSDs consume far less power than traditional hard drives, meaning that in a laptop, for example, battery
life is extended and they stay cooler. In addition, they are less susceptible to damage.
They are also silent in operation, lighter and highly portable – all considerable advantages in personal
devices such as mobile phones and MP3 players for example.
Q2: Look up some specifications and prices for hard drives and SSDs.
Virtual storage
In some cases, the computer’s RAM may not be large enough to store all these programs
simultaneously, so the hard disk is used as an extension of memory – called virtual memory. MS Word
may be open on your desktop but if you are not actually using it at a particular time, the operating system
may copy the Word software and data to hard disk to free up RAM for the browser software, the VB
compiler or whatever you as the user have requested. When you switch back to Word, the operating
system will reload it into memory.
Exercises
1. (a) Describe how data is written to and read from a CD-R disk.
(b) A school has archived all its students’ reports on to CD-R. Some years later, a copy of a
particular student’s reports is requested. Unfortunately it is found that the documents cannot
be opened.
2. If you are considering purchasing a high-end desktop or laptop you might be offered the option
of a solid-state drive (SSD) rather than a traditional hard disk drive.
(a) Describe briefly how a solid-state drive differs from a hard disk in its operation. [6]
(b) Ignoring any differences in price and assuming that both drives have the same capacity,
state four reasons why you might choose the solid-state drive. [4]
Section 2
Systems software
In this section:
Chapter 7 Functions of an operating system
SECTION 2 – SYSTEMS SOFTWARE
[Figure: layers of a computer system: User, Application software, Operating system, Hardware]
Q1: A PC is a multitasking machine. What does this mean? Is it the same thing as multiprocessing?
CHAPTER 7 – FUNCTIONS OF AN OPERATING SYSTEM
Memory management
A PC allows a user to be working on several tasks at the same time. You may be listening to music via
a streaming site such as Spotify, entering a Python program, checking your emails every so often and
running Word so that you can document your program design.
Each program, open file or copied clipboard item, for example, must be allocated a specific area of
memory whilst the computer is running. Should a user wish to switch from one application to another
in a separate window, each application must be stored in memory simultaneously. The allocation and
management of space is controlled by the operating system.
[Figure: pages of a program mapped to locations in physical memory (RAM)]
Segmentation is the logical division of address space into varying length segments which depend on
the program structure. As with paging, it is possible to load only a part of a program into memory initially.
Virtual memory
Memory is not limitless, so as more and more jobs are loaded into memory, the operating system may
swap pages of temporarily inactive jobs out to disk, thus using secondary storage as an extension of
memory to make room for the next job which has a share of processor time.
If a large number of jobs are loaded and the computer has insufficient memory, you may notice a
deterioration in performance as pages are swapped in and out of RAM, to the point where the operating
system is spending most of its time swapping pages in and out and performance slows right down.
On a PC you can look at “System Information” to see how much RAM and virtual memory is available.
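The slowdown caused by constant swapping can be illustrated with a small simulation. The sketch below counts how often a page has to be fetched from disk (a page fault) for different amounts of RAM. It assumes that the least recently used page is the one swapped out, which is one common policy but is not stated in the text; the function name is made up.

```python
from collections import OrderedDict


def count_page_faults(references, frames):
    """Count page faults using least-recently-used replacement (an assumed policy)."""
    memory = OrderedDict()                  # pages currently in RAM, oldest use first
    faults = 0
    for page in references:
        if page in memory:
            memory.move_to_end(page)        # mark as most recently used
        else:
            faults += 1                     # page must be fetched from disk
            if len(memory) >= frames:
                memory.popitem(last=False)  # swap out the least recently used page
            memory[page] = True
    return faults


references = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
for frames in (3, 4):
    print(frames, "frames:", count_page_faults(references, frames), "page faults")
# 3 frames: 12 page faults
# 4 frames: 4 page faults
```

With room for only three pages, every single access causes a swap; one extra frame of memory removes all but the initial loads.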
Interrupts
An interrupt is a signal from a software program, hardware device or internal clock to the CPU.
A software interrupt occurs when an application program terminates or requests certain services from
the operating system. A hardware interrupt may occur, for example, when an I/O operation is complete
or an error such as ‘Printer out of paper’ occurs.
Interrupts are also triggered regularly by a timer, to indicate that it is the turn of the next process to have
processor time (see ‘Processor scheduling’ below). It is because a processor can be interrupted that
multi-tasking can take place.
Interrupts are assigned priorities, and lower priority interrupts may be disabled while a higher priority
interrupt is being serviced. Examples of interrupts, in descending order of priority, are given below:
• Power-fail interrupt
• Clock interrupt
• An I/O device sends a signal requesting service or signalling end of I/O operation
Once the interrupt has been serviced, the original values of the registers are retrieved from the stack and
the process resumes from the point that it left off.
A test for the presence of interrupts is carried out at the end of each fetch-decode-execute cycle.
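The sequence described above can be sketched as a much simplified simulation: after each instruction the processor checks for pending interrupts, saves its place on a stack, services the interrupts in priority order and then resumes. The instruction names, the `run` function and the numeric priorities are all invented for illustration; a real processor saves the full set of registers.

```python
import heapq

# Lower number = higher priority, following the order of the list in the text
PRIORITY = {"power fail": 0, "clock": 1, "I/O": 2}


def run(instructions, interrupts_at):
    """Execute instructions, testing for interrupts at the end of each cycle."""
    log, saved_state, pending = [], [], []
    for pc, instruction in enumerate(instructions):
        log.append(f"execute {instruction}")
        for name in interrupts_at.get(pc, []):
            heapq.heappush(pending, (PRIORITY[name], name))
        if pending:
            saved_state.append(pc + 1)            # push the return point onto the stack
            while pending:
                _, name = heapq.heappop(pending)  # highest priority first
                log.append(f"service {name} interrupt")
            resume = saved_state.pop()            # restore the saved state
            log.append(f"resume at instruction {resume}")
    return log


log = run(["LOAD", "ADD", "STORE"], {1: ["I/O", "clock"]})
print("\n".join(log))
```

Although the I/O interrupt is raised first in this example, the clock interrupt is serviced first because it has the higher priority.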
Q2: Do you notice that response time slows down on your PC when you have a lot of programs
running? What would be the effect of installing more RAM on your computer? Why?
Processor scheduling
With computers able to run multiple applications simultaneously, the operating system is responsible
for allocating processor time to each one as they compete for the CPU. While one application is busy
using the CPU for processing, the OS can queue up the next process required by another application to
make the most efficient use of the processor. A computer with a single processor can only process one
instruction at a time, but by carrying out small parts of multiple larger tasks in turn, the processor can
give the appearance of carrying out several tasks simultaneously. This is what is meant by multi-tasking.
The scheduler is the operating system module responsible for making sure that processor time is used
as efficiently as possible. Of course, this is a much more complex task on a large multi-user network
where many users may, for example, be accessing the same database or running different applications.
Round robin
In round robin scheduling, processes are despatched on a first in first out (FIFO) basis, with each process
in turn being given a limited amount of CPU time called a time slice or quantum. If the process does not
complete before its time expires, or before a higher priority interrupt occurs, the despatcher gives the
CPU to the next process.
In order to do this, the operating system sets an interrupting clock or interval timer to generate
interrupts at specific times. This method of scheduling helps to guarantee a reasonable response time to
all users of the system. In some systems, a system of priorities may allow a high priority job to have more
than one consecutive time slice when its turn comes round.
[Figure: processor time shared between processes A, B and C in time slices, in the order A B C A B C B C]
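Round robin scheduling can be sketched with a simple queue. The number of time slices each process needs (2 for A, 3 for B and 3 for C) is an assumption, chosen so that the result matches the sequence A B C A B C B C; the function name is also illustrative.

```python
from collections import deque


def round_robin(jobs):
    """jobs maps a process name to the number of time slices it still needs."""
    queue = deque(jobs.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)                        # the process runs for one time slice
        if remaining > 1:
            queue.append((name, remaining - 1))   # unfinished: rejoin the back of the queue
    return order


print(" ".join(round_robin({"A": 2, "B": 3, "C": 3})))
# A B C A B C B C
```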
First come first served
Jobs are processed in the order in which they arrive, with no system of priorities.
Shortest remaining time
The process with the smallest estimated time to completion is run next. This tends to reduce the number
of waiting jobs, and the number of small jobs waiting behind big jobs. Its disadvantage is that it requires
knowledge of how long a job will take, so the user has to estimate the job time. This is possible for
batch jobs such as payroll, which are performed regularly and usually run overnight, or for any scientific,
commercial or other jobs which are run regularly.
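The effect of small jobs waiting behind big jobs can be shown with a little arithmetic. The sketch below compares the average waiting time when three jobs are run in order of arrival with the time when the shortest is run first. The job lengths are hypothetical, and all three jobs are assumed to arrive at the same moment, in which case choosing the shortest remaining time is the same as choosing the shortest job.

```python
def average_wait(burst_times):
    """Average time that jobs spend waiting when run in the given order."""
    waited, clock = 0, 0
    for burst in burst_times:
        waited += clock       # this job waited for everything before it
        clock += burst
    return waited / len(burst_times)


arrival_order = [24, 3, 3]                   # hypothetical job lengths in minutes
print(average_wait(arrival_order))           # first come first served: 17.0
print(average_wait(sorted(arrival_order)))   # shortest job first: 3.0
```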
Peripheral management
Different applications will require different input or output devices throughout their operation. If you send a
file to print, the operating system will need to communicate with the printer to check that it is switched on
and online, check that it is a printer and not, say, the keyboard and begin communication to send it the
correct data to print.
The data to be printed will then be transferred to an area of memory called a buffer, so that the CPU
can continue with another task. The purpose of the buffer is to compensate for the difference in speed
between the printer, or other output device, and the CPU.
Exercises
1. (a) An operating system uses interrupts which have priorities.
Describe the sequence of steps which would be carried out by the interrupt handler when an
interrupt is received and handled. [6]
(b) The operating system of a personal computer supports multi-tasking. One of the operating
system functions is memory management.
Describe two different strategies which could be used to manage the available memory. [6]
2. (a) An operating system uses scheduling. One method of scheduling is first come, first served.
(i) Explain why the first come, first served scheduling method may not be efficient. [2]
(ii) Explain why an interrupt is necessary during the transfer of data from the computer
to the printer. [3]
CHAPTER 8 – TYPES OF OPERATING SYSTEM
[Figure: LANs and servers connected to the Internet through a router/firewall]
A multi-tasking system
A multi-tasking operating system may run on a standalone computer such as a PC or laptop.
The Windows operating system, for example, can run many jobs simultaneously, switching between
them so that each one appears to be the only one running. You may be playing music, entering a Python
or VB program, and checking your emails occasionally. At any one time if you look at the Task Manager
(press Ctrl-Shift-Esc) you will probably find it has several programs in memory, most of which are not
currently executing.
Most mobile operating systems are tied to specific hardware. Smartphones have two operating
systems – the main system operating the user interface and running the application software
and a second, low-level proprietary real-time operating system which operates the radio and
other hardware. These low-level systems have a range of security vulnerabilities permitting others
to gain control over a mobile device.
The operating system on the aircraft or similar safety-critical system must have the following features:
• it must respond very quickly to any inputs or sensors
• it must be able to deal with many inputs simultaneously
• it must have “failsafe” mechanisms designed to detect and take appropriate action if a hardware
component fails
• it must incorporate redundancy – that is, if one component fails, it must automatically switch to
backup hardware
Device drivers
A device driver is a computer program that provides a software interface to a particular hardware device.
This enables operating systems to access hardware functions without needing to know the details of
the hardware being used. When you attach a new printer to your computer, for example, you will have
to install the device driver program that comes with it before it will work. Sometimes the OS will do this
automatically if it detects that the printer is one for which it already has a driver.
Drivers are hardware dependent and operating system specific. A driver typically communicates with
the device via the system bus or communications subsystem to which the hardware connects. When
a calling program invokes a routine in the driver, the driver issues commands to the device. Once the
device sends data back to the driver, the driver may invoke routines in the original calling program.
Virtual machine
A virtual machine can be defined as any instance where software is used to take on the function of the
machine, including executing intermediate code or running an operating system within another to emulate
different hardware.
Exercises
1. List four features of the user interface which you would expect to find on a smartphone
but not on a PC. [5]
2. Compare and contrast the functions of operating systems designed for a personal computer
and a satellite-navigation system in a car. In this question you will also be assessed on your
ability to use good English and to organise your answer clearly in complete sentences,
using specialist vocabulary where appropriate. [7]
CHAPTER 9 – THE NATURE OF APPLICATIONS
Categories of software
Software may be grouped into separate categories, illustrated in the figure below.
[Figure: Classification of software. Software divides into Systems Software and Applications Software.]
Software can be broadly classified into systems software and applications software.
Systems software
System software is the software needed to run the computer’s hardware and application programs.
This includes the operating system, utility programs, libraries and programming language translators.
Libraries and programming language translators will be considered in the next chapter.
Operating system
In the last two chapters we looked at different types of operating system and the function of an operating
system. The OS is a set of programs that lies between applications software and the computer hardware,
and has many different functions, including:
• resource management – managing all the computer hardware including the CPU, memory, disk
drives, keyboard, monitor, printer and other peripheral devices
• provision of a user interface (e.g. Windows) to enable users to perform tasks such as running
application software, changing settings on the computer, downloading and installing new
software, etc.
Utility programs
Utility software is system software designed to optimise the performance of the computer or perform
tasks such as backing up files, restoring corrupted files from backup, compressing or decompressing
data, encrypting data before transmission, providing a firewall, etc.
Disk defragmentation
A disk defragmenter is a program that will reorganise a magnetic hard disk so that files which have
been split up into blocks and stored all over the disk will be recombined in a single series of sequential
blocks. This makes reading a file quicker. The software utility Optimise Drives, previously called Disk
Defragmenter, runs automatically on a weekly schedule on the latest versions of Windows. You can also
optimise drives on your PC manually.
Automatic backup
Several free automatic backup utilities are available for personal and commercial use. An automatic
backup utility will allow the user to specify:
• Where you want to store the backup (the destination)
• What you want to back up (the sources)
• How you want to run the backup (using full backup that zips the files, or mirror backup that doesn’t
zip them)
• When you want to run the backup (you can schedule it to run automatically or run it manually)
You can then run the backup manually (typically by using a function key) or schedule it to run
automatically. (See for example http://www.fbackup.com/ )
Automatic updating
An automatic update utility makes sure that any software installed on the computer is up-to-date. For any
software already installed on the computer, the automatic update utility will regularly check the Internet
for updates. These will be downloaded and installed if they are newer than the version already on
the computer.
Firewalls and antivirus software must be updated regularly as new viruses and threats are constantly
being devised and discovered.
Application software should also be updated as there will be bug fixes and improvements that become
available to people with a licence for that package.
Virus checker
A virus checker utility checks your hard drive and, depending on the level of protection offered,
incoming emails and internet downloads, for viruses and removes them. Windows 8.1 comes with built-in
virus protection called Windows Defender.
Compression software
Several utility programs are supplied as part of the operating system. These include utilities to copy, move
and delete files, create, move and delete folders, and provide screensavers. Other utility programs such as
WinZip for compressing and sharing files have to be purchased from independent suppliers.
Zipped or compressed files can be transmitted much more quickly over the Internet. Sometimes there is
a limit to the size of a file which can be transmitted – if you have a 15MB photograph, you will not be able
to email it to a friend if there is a 5MB limit on the attachments they can receive. Even if they can receive
the file, it may take several minutes to download if they do not have a broadband connection.
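The saving from compression can be seen in a short experiment using Python's standard `zlib` module. The sample text is invented and deliberately repetitive, so it compresses very well; a photograph that is already stored in a compressed format would shrink far less.

```python
import zlib

text = b"The quick brown fox jumps over the lazy dog. " * 200
compressed = zlib.compress(text)

print(len(text), "bytes before,", len(compressed), "bytes after")

# Compression of this kind is lossless: the original is recovered exactly
restored = zlib.decompress(compressed)
print(restored == text)
```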
Applications software
Applications software can be categorised as general-purpose, special-purpose or custom-written
(bespoke) software.
General-purpose software, such as a word-processor, spreadsheet or graphics package, can be used
for many different purposes. For example, a graphics package may be used to produce advertisements
or animations, manipulate photographs, or draw vector or bitmapped images.
Special-purpose software performs a single specific task or set of tasks. Examples include payroll
and accounts packages, hotel booking systems, fingerprint scanning systems, browser software and
hundreds of other applications. Software may be bought “off-the-shelf”, ready to use, or it may be
specially written by a team of programmers for a particular organisation. If, say, a hotel wants to buy
some visitor booking software, they may be able to find a ready-made package that is quite suitable, or
they may want a bespoke software package that will satisfy their particular requirements.
Off-the-shelf software | Bespoke software
Less expensive since the cost is shared among all the other people buying the package | More costly and requires expertise to analyse and document requirements
May contain a lot of unwanted features, and some desirable but non-essential features may be missing | Features customised to user requirements and other features can be added as needs arise
Well documented, well-tested and error-free | May contain errors which do not surface immediately
Q1: Name an open source operating system and an open source word processing package.
Selecting an application
How would you select suitable software for a particular purpose? You might use some of the following
criteria:
• Does it provide all the necessary functionality?
• Does it run on the available hardware?
• Is it available “off the shelf” or will it have to be specially written?
• How much will it cost?
• Is it well-used, tried and tested?
Q2: (a) What criteria would you use when deciding which word-processing software to
install on your PC?
(b) What criteria might a school use when deciding on a system for the school library?
Exercises
1. (a) Software can be classified as either system or application software. What is meant by
2. A company sells widgets via an online web store. The process of updating the website and
processing sales involves many different types of software.
Complete the table below by writing one software category beside each use. You should
not use a category more than once. [4]
Software Category
3. Describe three reasons why a company might choose to purchase an “off-the-shelf” special
purpose software package rather than a suite of programs written specifically for their needs. [6]
State the purpose of each of the following types of utility software and describe how the
student would use them.
Assembler
Assembly code is a low level language, with each instruction in assembly code almost always being
equivalent to one machine code instruction. The machine code instructions that a particular computer
can execute (the instruction set) are completely dependent on its hardware, and therefore each different
type of processor will have a different instruction set and a different assembly code.
Before an assembly code program can be executed, it must be translated into the equivalent
machine code. This is done by a program called an assembler. The assembler program takes each
assembly code instruction and converts it to the 0s and 1s of the corresponding machine code
instruction. The input to the assembler is called the source code and the output (machine code)
the object code.
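The one-to-one translation performed by an assembler can be sketched in a few lines. The instruction set below is entirely hypothetical (a 4-bit opcode followed by a 4-bit operand for an imaginary processor), and the `assemble` function is an illustration rather than how any real assembler is written.

```python
# Hypothetical 4-bit opcodes for an imaginary processor (not a real instruction set)
OPCODES = {"LDA": 0b0001, "ADD": 0b0010, "STA": 0b0011, "HLT": 0b1111}


def assemble(source):
    """Translate each 'MNEMONIC operand' line into one 8-bit machine code string."""
    object_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        opcode = OPCODES[parts[0]]
        operand = int(parts[1]) if len(parts) > 1 else 0   # HLT has no operand
        object_code.append(f"{opcode:04b}{operand:04b}")
    return object_code


source_code = """
LDA 5
ADD 6
STA 7
HLT
"""
print(assemble(source_code))
# ['00010101', '00100110', '00110111', '11110000']
```

Each line of source code produces exactly one machine code instruction in the object code.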
Compiler
A compiler is a program that translates a high-level language such as Visual Basic, Python etc. into
machine code. The code written by the programmer, the source code, is input as data to the compiler,
which scans through it several times, each time performing different checks and building up tables of
information needed to produce the final object code. Different hardware platforms will require different
compilers, since the resulting object code will be hardware-specific. For example, Windows and the Intel
microprocessors comprise one platform, Apple and PowerPC processors another, so separate compilers
are required for each.
The object code can then be saved and run whenever needed without the presence of the compiler.
CHAPTER 10 – PROGRAMMING LANGUAGE TRANSLATORS
Interpreter
An interpreter is a different type of programming language translator. Once the programmer has written
and saved a program, and instructs the computer to run it, the interpreter looks at each line of the source
program, analyses it and, if it contains no syntax errors, translates it into machine code and runs it.
For example, the following Python program contains an error at line 5.
1 a = 1
2 b = 2
3 c = a + b
4 print("a + b = ", c)
5 d = a - n
6 print("a - b = ", d)
7 print("goodbye")
When the program runs, it produces the following output:
a + b = 3
Traceback (most recent call last):
File "C:/Users/A Level sample programs/prog1.py", line 5, in <module>
d = a - n
NameError: name 'n' is not defined
The program produces output at line 4, gets as far as line 5 and then crashes.
However, it is not always quite that simple. If we modify the program to introduce a syntax error at line 6,
(missing closing bracket) the interpreter does not attempt to run any of the program until this is fixed.
1 a = 1
2 b = 2
3 c = a + b
4 print("a + b = ", c)
5 e = a - b
6 print("a - b = ", d
7 print("goodbye")
When the program runs, it does not execute any of the code but produces the following output:
From this we can deduce that the translator has scanned through the whole program checking for certain
types of error before executing any of it.
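This behaviour can be demonstrated from within Python itself using the built-in `compile` and `exec` functions. In the sketch below a shortened version of the program is held in a string; the file name prog2.py is made up, and the variable `outcome` is only there to record what happened.

```python
source = '''a = 1
b = 2
print("a + b = ", a + b)
print("a - b = ", a - b
print("goodbye")
'''

try:
    code = compile(source, "prog2.py", "exec")   # the whole program is checked first
    exec(code)                                   # only reached if there are no syntax errors
    outcome = "program ran"
except SyntaxError:
    outcome = "syntax error found; nothing was executed"

print(outcome)
# syntax error found; nothing was executed
```

Note that the first three lines are perfectly valid, yet none of them is run.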
Bytecode
Many languages are not only compiled or only interpreted – there are various possibilities in between.
Interpreting each line of code just before executing it has become much less common. Most interpreted
languages such as Python and Java use an intermediate representation which combines compiling
and interpreting. The resulting bytecode is then executed by a bytecode interpreter.
The bytecode may be compiled once and for all (as in Java) or each time a change in the source code is
detected before execution (as in Python).
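Python's bytecode can be inspected using the standard `dis` module. The function `add` below is just an example; the exact instructions listed depend on the version of Python being used, so no output is shown here.

```python
import dis


def add(a, b):
    return a + b


dis.dis(add)                        # lists the bytecode instructions for add
print(add.__code__.co_code.hex())   # the raw bytecode, shown in hexadecimal
```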
A big advantage of bytecode is that you can achieve platform independence; any computer that
can run Java programs has a Java Virtual Machine (JVM), a piece of software which masks inherent
differences between different computer architectures and operating systems. The JVM understands
bytecode and converts it into the machine code for that particular computer.
A second advantage of using, for example, Java bytecode is that it acts as an extra security layer
between your computer and the program. You can download an untrusted program and you then
execute the Java bytecode interpreter rather than the program itself, which guards against any
malicious programs.
It is also possible to compile from Python into Java bytecode (using the Jython compiler) and then use
the Java interpreter to interpret and execute it.
Q1: Why would a company or an individual programmer not want to distribute the source code
when they sell a software package?
Disadvantages of an interpreter
The program may run slower than a compiled program, because each statement has to be translated to
machine code each time it is encountered. So if a loop of 10 statements is performed 20 times, all 10
statements are interpreted 20 times.
A-Level only
Stages of compilation
There are three stages of compilation: lexical analysis, syntax analysis, and code generation and
optimisation. These stages are described below.
Lexical analysis
Lexical analysis performs the following functions.
1. Superfluous spaces are removed.
print (total_mark, average) will be converted to
print(total_mark,average)
2. All comments, identified for example by # or //, will be removed from the program.
3. Some simple error-checking is performed; for example:
• an illegal identifier (such as X&Y or ten% in Python) would be flagged as an error
• the lexical analyser will detect an attempt to assign an illegal value to a constant, such as a value
of the wrong type or one that causes overflow or underflow
(The lexical analyser will not detect misspelt keywords or undeclared variables; this is the job of the
syntax analyser.)
4. All keywords, constants and identifiers used in the source code are replaced by ‘tokens’ (unique
symbols). For example, numbers will be converted to their run-time representation, and identifiers will
be replaced by a pointer to an address in the symbol table. Keywords such as input and print will be
replaced by a single item-code.
item | name     | kind of item | type of item | run-time address or value | pointer
1    | input    | keyword      |              |                           |
2    | pi       | constant     | real         | 3.14159                   |
3    | radius   | variable     | real         | (?)                       |
4    | =        | operator     |              |                           |
5    | area     | variable     | real         |                           |
6    | numSides | array        | integer      | (?)                       | (?)
7    | *        | operator     |              |                           |
8    |          |              |              |                           |
The statements
    input(radius)
    area = pi * radius * radius
could be ‘tokenised’ and stored as the lexical string
    1 3
    5 4 2 7 3 7 3
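The tokenising step can be sketched in Python. The item numbers are taken from the symbol table above; the `tokenise` function and the regular expression are illustrative simplifications, and brackets and spaces are simply discarded, as they are in the lexical string shown.

```python
import re

# Item numbers taken from the symbol table in the text
SYMBOL_TABLE = {"input": 1, "pi": 2, "radius": 3, "=": 4, "area": 5, "numSides": 6, "*": 7}


def tokenise(statement):
    """Replace each keyword, identifier and operator with its item number."""
    lexemes = re.findall(r"[A-Za-z_]\w*|[=*]", statement)   # brackets and spaces are dropped
    return " ".join(str(SYMBOL_TABLE[lexeme]) for lexeme in lexemes)


print(tokenise("input(radius)"))                  # 1 3
print(tokenise("area = pi * radius * radius"))    # 5 4 2 7 3 7 3
```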
Q2: What further entries to the symbol table will the lexical analyser make on encountering the
statement
circumference = 2 * pi * radius
Add the entries to the symbol table and then tokenise the statement.
Note that the lexical analyser puts the identifier and its run-time address in the symbol table, so that it
can replace them in the source code by ‘tokens’. It will not fill in the ‘kind of item’ and ‘type of item’; this
is done later by the syntax analyser.
effect as that specified by the source program but not by the same means. The disadvantages of code
optimisation are:
• it will increase compilation time, sometimes quite considerably
• it may sometimes produce unexpected results. Consider the following program extract, which is
supposed to measure the speed of the object program. Assume GetTime is a function which
returns the current time set in the operating system:
start = GetTime
for count = 1 to 100000
    x = 0
endfor
finish = GetTime
print(start, finish)
The effect of code optimisation may be to detect that it is quite unnecessary to perform the loop 100000
times to set x equal to 0, and optimise the code so that it is only done once!
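For comparison, here is a runnable Python version of the extract, using `time.perf_counter` in place of the GetTime function. As far as we are aware, the standard Python interpreter does not remove the loop, so the difference between the two times reflects all 100000 iterations; an optimising compiler for another language might produce a very different result.

```python
import time

start = time.perf_counter()
for count in range(100000):
    x = 0                     # the same assignment is repeated every time
finish = time.perf_counter()

print(start, finish)
print("elapsed:", finish - start, "seconds")
```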
Use of libraries
Library programs are ready-compiled programs which can be run when needed, and which are grouped
together in software libraries. In Windows these often have a .dll extension. Most compiled languages
have their own libraries of pre-written functions which can be invoked in a defined manner from within the
user’s program.
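As a small illustration, Python's standard `math` library is made available with an `import` statement and its functions are then called by name. The variable name `root` is just an example.

```python
import math  # make the standard maths library available to the program

root = math.sqrt(19321)  # call a pre-written library function
print(root)              # 139.0
```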
Q4: What library programs or routines have you used in any of the programs you have written?
How is the library invoked by the program?
Exercises
1. A programmer is asked to write a program and can choose between using a low-level language or an
imperative high-level language.
Outline the major differences between these two types of languages, naming an example of each.
• a situation when each one would be the most appropriate choice [10]
Explain the need for intermediate code and its purpose in a virtual machine.
The quality of written communication will be assessed in your answer to this question. [8]
(b) State three benefits of using library routines when a program is written. [3]
Using lines of code from the above program to illustrate your answer, state two things that would be
done in each of the following stages of compilation:
4. The process of compilation involves a number of stages. Name the stage at which each of the
following would be detected.
(b) An arithmetic operator is applied to an operand of the data type Boolean. [1]
Section 3
Software development
In this section:
Chapter 11 Systems analysis methods
SECTION 3 – SOFTWARE DEVELOPMENT
Analysis
Before a problem can be solved, it must be defined. The requirements of the system that solves the
problem must be established. In the case of a data processing system, or for example the construction
of a website, this could cover:
Design
Depending on the type of project, the systems designer may consider some or all of the following:
• processing: the algorithms and appropriate modular structure for the solution, specifying
modules with clear documented interfaces
• data structures: how data will be held and how it will be accessed – for example in a dynamic data
structure such as a queue or tree, or in a file or database
• output: content, format, sequence, frequency, medium (e.g. screen or hard copy) etc.
• input: volume, frequency, documents used, input methods
• user interface: screens and dialogues, menus, special-purpose requirements
• security: how the data is to be kept secure from accidental corruption or deliberate
tampering or hacking
• hardware: selection of an appropriate configuration
CHAPTER 11 – SYSTEMS ANALYSIS METHODS
Alpha testing
Alpha testing is carried out by the software developer's in-house testing team. It is essential because it often
reveals both errors and omissions in the system requirements definition. The user may discover that the
system does not in fact have the required functionality because the requirements were not specified carefully
enough, or because the developer has overlooked or misunderstood something in the specification.
Beta testing
When new software is being developed for release as a commercial package, beta testing is often used.
This involves giving the package to a number of potential users who agree to use the system and
report any problems to the developers. Microsoft, for example, delivers beta versions of its products to
hundreds of sites for testing. This exposes the product to real use and detects problems and errors that
may not have been anticipated by the developers. The product can then be modified and sent out for
further beta testing until the developer is confident enough in the product to put it on the market.
Implementation
Coding and testing will be carried out, errors traced and corrected. When all is thought to be satisfactory
the software will be installed on the user’s system and more testing will be done. At this stage new
weaknesses and omissions are almost bound to surface and more work will be carried out.
Evaluation
The evaluation may include a post-implementation review, which is a critical examination of the system
three to six months after it has been put into operation. This waiting period allows users and technical
staff to learn how to use the system, get used to new ways of working and understand the new
procedures required. It allows management a chance to evaluate the usefulness of the reports and
on-line queries that they can make, and go through several ‘month-end’ periods when various routine
reports will be produced. Shortcomings of the system, if there are any, will become apparent at all
levels of the organisation, and users will want a chance to air their views and discuss improvements.
The solution should be evaluated on the basis of effectiveness, usability and maintainability.
• a comparison of the system’s actual performance with the anticipated performance objectives
• an assessment of each aspect of the system against preset criteria
• errors which were made during system development
• unexpected benefits and problems
[Figure: the stages of development in sequence: Analysis, Design, Implementation, Evaluation, Maintenance]
Spiral model
The Spiral Model uses the same structured steps but introduces the idea of developing the software in
iterative (repeating) stages. At the start of the process the requirements are defined and the developers
work towards an initial prototype. Each successive loop around the spiral generates a refined prototype
until the product is finished.
[Figure: the spiral model. Each loop of the spiral passes through Analyse, Design, Implement and Evaluate, producing prototype 1, prototype 2, prototype 3, etc.]
Each time around the spiral the following activities are performed:
Agile modelling
At all the stages of analysis, design and implementation, an agile approach may be adopted, as the
stages of software development may not be completed in a linear sequence. It might be that some
analysis is done and then some parts of a system are designed and implemented while other parts are
still being analysed and then, for example, implementation and testing may be intermixed. The developer
may then go back to design another aspect of the system.
Throughout the process, feedback will be obtained from the user; this is an iterative process during
which changes made are incremental as the next part of the system is built. Typically the software
developers do just enough modelling at the start of the project to make sure that the system is clearly
understood by both themselves and the users.
At each stage, a prototype is built with user participation to ensure that the system is being developed in
line with what the user wants. The success of the software development depends on
• keeping the model simple, and not trying to incorporate features which may come in useful at a
later date
• rapid feedback from the user
• understanding that user requirements may change during development as they are forced to
consider their needs in detail
• being prepared to make incremental changes as the model develops
Extreme programming
Extreme programming (XP) is a software development methodology which is intended to improve
software quality and responsiveness to changing customer requirements. It is a type of agile software
development in which frequent “releases” of the software are made in short development cycles. This is
intended to improve productivity and introduce checkpoints at which new customer requirements can
be adopted.
• workshops and focus groups to gather requirements rather than a formal requirement document
• the use of prototyping to continually refine the system in response to user involvement and feedback
• producing within a strict time limit each part of the system, which may not be perfect but which is
good enough
• reusing any software components which have already been used elsewhere
Extreme programming and rapid application development are good methodologies for large
projects where there is a danger of getting bogged down or sidetracked by suggested improvements, so
that developers are continually chasing a moving target.
Exercises
1. A systems analyst/developer is planning a system for the administration of student courses to be
used in an office in a college.
Describe three tasks that may be carried out by the analyst to establish the requirements of
the system. [6]
2. (a) Explain what is meant by the prototyping/agile approach to system analysis and design. [4]
(b) What are the advantages of this approach? [4]
(c) (i) Describe briefly two other approaches to systems development. [6]
(ii) Describe the advantages and disadvantages of each of these approaches. [4]
(iii) State circumstances in which each of the methods you have described
would be appropriate. [2]
3. Explain the difference between black box testing and white box testing. [4]
CHAPTER 12 – WRITING AND FOLLOWING ALGORITHMS
Properties of an algorithm
A recipe for chocolate cake, a knitting pattern for a sweater or a set of directions to get from A to B, are
all algorithms of a kind.
Computational algorithms
A good algorithm has the following properties:
• It has clear and precisely stated steps that produce the correct output for any set of valid inputs
• It should allow for invalid inputs
• It must always terminate at some point
• It should execute efficiently, in as few steps as possible
• It should be designed in such a way that other people will be able to understand it and modify
it if necessary
• Internet-related algorithms. Algorithms are used to manage and manipulate the huge amount
of data stored on the Internet. How does a search engine find all the pages on which particular
information resides in a fraction of a second?
• Route-finding algorithms. Given two locations, how does a route-finder determine the shortest
or best route between the two points? There may be thousands of possible routes. This type of
algorithm is used not only for driving a vehicle from A to B, but also for many other applications, for
example, finding the best route to transmit packets of data from A to B over a network.
• Compression algorithms. These are used to compress data files so that they can be transmitted
faster or held in a smaller amount of storage space. For example, MP3 files are compressed so that
you can hold thousands of tracks on a mobile phone.
• Encryption algorithms. When someone purchases something over the Internet and sends their
credit card number and other personal details to the store, the data needs to be encrypted so that
even if it is intercepted, it cannot be read.
Q2: Write pseudocode for the above algorithm to find the square root of an integer, when you know
that the answer is an integer. Write and test the program for the integer 19321.
Q3: Which of the properties of a good algorithm, stated at the start of the chapter, does this
algorithm not satisfy?
The algorithm described will do the job, but a better solution is based on the well-known binary
search algorithm.
[Chart: Find square root]
The chart represents the blocks of program code that we will use to solve the problem. The solution is
short, so it’s not necessary to put each block in a separate subroutine.
number = 19321
low = 1
high = number
guess = int((low+high)/2)
nsquared = guess**2          # BLOCK 1: SEQUENCE
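One way the remaining steps might be completed is sketched below. It uses the same variable names as Block 1, but the loop, the function name `integer_square_root` and the decision to return `None` when there is no integer root are assumptions made for this sketch rather than the book's own listing.

```python
def integer_square_root(number):
    """Binary search for the integer square root of number.

    Returns None if number is not a perfect square.
    """
    low = 1
    high = number
    while low <= high:
        guess = (low + high) // 2
        nsquared = guess ** 2
        if nsquared == number:
            return guess          # found the root
        if nsquared < number:
            low = guess + 1       # root lies in the upper half
        else:
            high = guess - 1      # root lies in the lower half
    return None

print(integer_square_root(19321))  # 139
```

Each guess halves the range still to be searched, which is why this approach needs far fewer guesses than trying 1, 2, 3 and so on in turn.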
Q4: Add statements to calculate and print the number of guesses it took to find the answer.
Try writing and running the program for different values of number.
Is there a formula for calculating how many guesses it should take to find the square root?
Interpreting algorithms
A useful skill is to be able to look at someone else’s algorithm and decide what it does and how it works.
Of course, if the programmer has put in lots of useful comments, used meaningful variable names and
split a complicated algorithm into separate modules, that should not be too difficult!
Exercises
1. In a football league, the results of each match are input to the computer, which updates each team’s
points.
In the case of a draw, each team (Team A and Team B) gets one point.
If Team A wins, then Team A gets 3 points and Team B gets no points.
2. Expert jugglers learn new juggling patterns according to certain rules represented by numbers. In this
example, the rules for patterns of three numbers are:
Rule 1: the total value of the numbers in the list must be a multiple of 3
Rule 2: No number must be one less than the previous number, even if the pattern is
repeated indefinitely.
4 4 1
6 5 1 (5 is one less than the previous number, so this does not obey rule 2)
6 2 7 (when this is repeated, 6 2 7 6 2 7 6 2 7… 6 is one less than the previous number, so this
does not obey rule 2)
(a) State why the following lists of 3 numbers are not valid patterns of numbers.
(i) 5 1 6 [1]
(ii) 4 4 2 [1]
3. José works for a company that provides loans to its customers. When customers take out a
loan they decide how much money to borrow and for how many years.
The interest rate is currently 10% but it may change in the future.
José writes the following program to calculate the monthly payment for a loan.
01 program loanCalculator
02
03 CONST INTEREST_RATE = 10
04
05 begin
06 amount = input("Enter amount: ")
07 years = input("Enter years: ")
08 annualInterest = amount * interestRate / 100
09 totalToPay = (annualInterest * years) + amount
10 monthlyPayment = totalToPay / (years * 12)
11 print("Monthly Payment:", monthlyPayment)
12 end
(a) Using the code above, show the value that will be output if the inputs are:
Amount: 600
Years: 5
(i) State why the parentheses in line 09 are not essential. [1]
Identify the constant, and explain why a constant has been used. [3]
(d) The company also offers a savings plan. Customers pay a fixed amount each year into
the savings plan. At the end of each year, the company adds the value of the savings plan
at the start of the year to the amount paid, and then adds interest of 10% to obtain the final
value for the year.
For example, if a customer saves £100 each year, the value of the savings plan for 5 years is
shown in the table below
Write an algorithm which allows the user to input the amount saved each year and the number
of years, and outputs the growth of the savings plan in the format shown above. [7]
A-Level only
Programming paradigms
A programming paradigm is a style of computer programming. Different programming languages support
tackling problems in different ways, and there are four major programming paradigms, each supported by
a number of different languages. Some languages such as Python, Delphi and Java support more than
one programming paradigm.
Different types of problem require different types of language, and hundreds of different languages have
been developed for different types of application. Assembly language was the first language to
be developed after machine code, and the next step was the development of procedural languages
in the 1960s.
Procedural languages
A procedural language has built-in data types such as integer, real or floating point numbers,
character, Boolean and string. In addition, it typically has data structures such as array and record.
Programmers can define their own abstract data types such as queue, stack, tree or hash table, all
of which you will study during this course.
Consider an abstract data structure such as a stack. This can be visualised like a stack of plates.
You can only add an item to the top of a stack, and you can only remove an item from the top.
CHAPTER 13 – PROGRAMMING PARADIGMS
The programmer might decide there is a limit to how large the stack is allowed to get, so you can’t add
to a full stack, and obviously, you can’t remove an item from an empty stack.
This abstract data structure can be implemented in different ways. In Python, it could be implemented
with the built-in list data structure. In Pascal, you could use an array, with a pointer to the top of the
stack. The important point is that someone using this data structure should not need to know how it is
implemented, any more than they need to know how a square root is worked out when they press the √
(square root) button on a calculator. All the user needs to know is the state and behaviour of the
data structure.
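A minimal sketch of the Python list implementation mentioned above is given here. The names `push`, `pop` and `MAX_SIZE`, and the limit of 5 items, are assumptions chosen for illustration.

```python
MAX_SIZE = 5  # assumed limit on how large the stack may grow

def push(stack, item):
    """Add an item to the top of the stack."""
    if len(stack) >= MAX_SIZE:
        raise OverflowError("cannot add to a full stack")
    stack.append(item)

def pop(stack):
    """Remove and return the item on top of the stack."""
    if not stack:
        raise IndexError("cannot remove from an empty stack")
    return stack.pop()

plates = []
push(plates, "plate 1")
push(plates, "plate 2")
top = pop(plates)
print(top)  # plate 2
```

Code that calls `push` and `pop` does not need to know that a list is used underneath.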
Clearly, it is a waste of time for every programmer who needs to use a stack to have to decide how to
implement it and write their own subroutines to add and remove items from it. This is where an object-
oriented approach comes in.
A-Level only
Object-oriented languages
In an object-oriented language, we define a class as the description of what the data looks like (the
state) and what the data can do (the behaviour). The user of a class sees only the state and behaviour of
a data item. Data items are called objects, where an object is an instance of a class.
Programming in an object-oriented language requires thinking in terms of the objects that will carry out
the required tasks, rather than thinking about data structures and algorithms. We are familiar with the
concept of objects in the physical world – they could be cats, dogs, plates, cars, patients, doctors,
students, and so on.
In a hospital system, objects might be patient, ward, doctor, nurse and so on. Each of these objects can
be defined as a class, with its own set of behaviours. Each individual ward will be a single instance of
the class called ward. The class will have attributes such as name, number_of_beds,
number_of_patients, location, type. A particular instance of the ward may have name Bramford,
number_of_beds 6, location Block E, number_of_patients 5, type Children’s. Its behaviours might
include admit_patient, discharge_patient.
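The ward example could be written in Python roughly as follows. The attribute and method names come from the text. The rule that a patient can only be admitted when a bed is free, the True/False return values and the parameter name `ward_type` are assumptions added for the sketch.

```python
class Ward:
    def __init__(self, name, number_of_beds, number_of_patients,
                 location, ward_type):
        # State: the attributes listed in the text.
        self.name = name
        self.number_of_beds = number_of_beds
        self.number_of_patients = number_of_patients
        self.location = location
        self.type = ward_type

    # Behaviour: the methods listed in the text.
    def admit_patient(self):
        if self.number_of_patients < self.number_of_beds:
            self.number_of_patients += 1
            return True
        return False  # assumed rule: no free bed, so no admission

    def discharge_patient(self):
        if self.number_of_patients > 0:
            self.number_of_patients -= 1
            return True
        return False

# One instance (object) of the class.
bramford = Ward("Bramford", 6, 5, "Block E", "Children's")
bramford.admit_patient()
print(bramford.number_of_patients)  # 6
```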
Inheritance
Below is a simple example of a class and its subclasses. Suppose an object-oriented program used
by an estate agent defines a class called Property. Property has attributes including address,
owner, type, number_of_bedrooms, price.
The class Property has two subclasses called Property_For_Rent and Property_For_Sale.
The subclasses have the same attributes as the superclass Property, and in addition, each has some
attributes of its own. The subclasses are said to inherit properties from the superclass, and we can draw
an inheritance diagram.
Property
Property_For_Rent Property_For_Sale
Class diagram
Example
A class called DataStructure is created in an object-oriented language. The DataStructure class
has two subclasses called Stack and Queue. Both Stack and Queue inherit attributes name, size,
isEmpty and isFull from the superclass. They also inherit methods called addItem, removeItem.
DataStructure
Stack Queue
Classes are defined differently in different programming languages, but in the pseudocode that is used on
this course, the superclass might be defined like this:
class DataStructure
    private size
    private isFull
    private isEmpty
    public procedure new(structureSize)
        size = structureSize
    endprocedure
    public procedure addItem(parameter)
        (instructions to add an item to end of data structure)
    endprocedure
    public function removeItem
        (instructions to remove first item from data structure)
    endfunction
endclass
In the definition, attributes are generally described as private. This means that users cannot directly
access them. They are changed through statements within the various methods. Methods fall into one
of two categories – functions, which return a value, and procedures, which do not. For example, when
an item is to be added to a data structure, the item which is to be added is passed as a parameter,
but nothing is returned. If an item is to be removed, no parameter is needed, and the item removed is
returned from the function.
The class Stack could be defined as follows:
class Stack inherits DataStructure
    public function removeItem
        (instructions to remove item from end of stack)
    endfunction
endclass
The attributes size, isFull, isEmpty and the methods new, addItem are inherited from the
DataStructure class.
A-Level only
Polymorphism
Polymorphism refers to a programming language’s ability to process objects differently depending on
their class.
A class of objects has behaviours or methods, all of which will be inherited by its subclasses.
In this example the class Stack defines its own method removeItem. This is because, although
there is a method of the same name in the superclass which it could inherit, it will process a Stack
object differently. In a stack, the last item in the stack will be removed. In the superclass, assume that
the method removeItem removes the first item in the data structure. In the case of a queue, this is
fine, but it is not what is required for the stack. However, both Stack and Queue objects carry out the
method addItem in an identical way, adding the item to the end of the data structure. Hence, the method
addItem does not need to be redefined in the Stack class definition.
This is what is meant by polymorphism; the subclass Stack redefines the method removeItem
defined in the superclass DataStructure to process objects in the class differently.
The attributes size, isFull, isEmpty are all defined in the superclass DataStructure.
These attributes cannot be accessed directly if they are declared private; they can only be accessed
through the class methods. This is known as encapsulation.
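The ideas of inheritance, polymorphism and encapsulation can be sketched together in Python. This is an illustration only: Python has no `private` keyword, so a leading underscore is used by convention to mark attributes that should not be accessed from outside the class, and the internal list `_items` is an assumption about how the items might be stored.

```python
class DataStructure:
    def __init__(self, structure_size):
        self._size = structure_size  # "private" by convention
        self._items = []             # assumed internal storage

    def add_item(self, item):
        # Inherited unchanged by both Stack and Queue.
        if len(self._items) >= self._size:
            raise OverflowError("data structure is full")
        self._items.append(item)

    def remove_item(self):
        # Removes the first item: correct for a queue.
        return self._items.pop(0)

class Queue(DataStructure):
    pass  # inherits both methods as they are

class Stack(DataStructure):
    def remove_item(self):
        # Redefined (overridden): a stack removes the last item.
        return self._items.pop()

queue = Queue(3)
stack = Stack(3)
for value in (1, 2, 3):
    queue.add_item(value)
    stack.add_item(value)

print(queue.remove_item())  # 1
print(stack.remove_item())  # 3
```

The same call, `remove_item()`, behaves differently depending on the class of the object it is applied to.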
• The object-oriented methodology forces designers to go through an extensive planning phase, which
makes for better designs with fewer weaknesses
• Encapsulation: the source code for an object can be written, tested and maintained independently of
the code for other objects
• Once an object is created, knowledge of how its methods are implemented is not necessary in order
for a programmer to use it
• New objects can easily be created with small differences to existing ones
• Reusability: objects that are already defined, coded and tested may be used in many
different programs
• OOP provides a good framework for code libraries with a range of software components that can
easily be adapted by a programmer
• Software maintenance: an object-oriented program is much easier to maintain than one written in a
procedural language because of its modular structure
A-Level only
Exercises
1. A programming paradigm is a style of computer programming. Procedural programming, supported
by languages such as Python or Pascal, which have a series of instructions that tell the computer
what to do with the input in order to solve the problem, is one example of a paradigm.
Name and briefly describe two other programming paradigms, giving an example of an
application of each and a language which supports it. [6]
2. (a) Explain what is meant by the term class in object-oriented programming. [2]
(b) An institution categorises its staff as either Academic or Administration. Administration
staff may be either Salaried or HourlyPaid.
(iv) Explain how this might apply to a method called CalculatePay in the
class Administration. [2]
3. The system used by a garden centre to store and retrieve details of its products is written in an
object-oriented language. Part of the design is shown on the class diagram.
Product
    ProductCode
    Name
    Price
    setPrice( )
    findPrice( )
Plant (derived from Product)
    FlowerColour
    Variety
    findVariety( )
Tool (derived from Product)
    Type
Explain the terms class, derived class, inheritance and encapsulation, using examples from the
garden centre.
The quality of written communication will be assessed in your answer to this question. [8]
CHAPTER 14 – ASSEMBLY LANGUAGE
Mnemonic  Instruction                       Numeric  Description
code                                        code
ADD       ADD                               1xx      Add the contents of the memory address to the Accumulator
SUB       SUBTRACT                          2xx      Subtract the contents of the memory address from the Accumulator
STA       STORE                             3xx      Store the value in the Accumulator in the memory address given
LDA       LOAD                              5xx      Load the Accumulator with the contents of the memory address given
BRA       BRANCH (unconditional)            6xx      Branch: use the address given as the address of the next instruction
BRZ       BRANCH IF ZERO (conditional)      7xx      Branch to the address given if the Accumulator is zero
BRP       BRANCH IF POSITIVE (conditional)  8xx      Branch to the address given if the Accumulator is zero or positive
INP       INPUT                             901      Input into the Accumulator
OUT       OUTPUT                            902      Output contents of the Accumulator
HLT       HALT                              0        Stops the execution of the program
DAT       DATA                                       Used to indicate a location that contains data
Table 14.1
Q1: Write an assembly code program to input a number x and calculate and output 6x – 5.
Branch instructions
The flow of the program can be altered using a conditional or unconditional branch instruction.
The conditional branch instructions BRP (Branch if positive), BRZ (Branch if zero) cause a branch to
a given label in the program depending on the value held in the accumulator.
An unconditional branch instruction (BRA) will cause a branch whatever the value held in
the accumulator.
Example 2
Compare the two numbers held in memory locations num1 and num2, and output the larger. If they are
equal, either one can be output.
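         LDA num1      ; load num1 into ACC
         SUB num2      ; ACC = num1 - num2
         BRP firstmax  ; branch if num1 is greater than or equal to num2
         LDA num2      ; otherwise num2 is the larger
         OUT           ; output contents of ACC
         HLT
firstmax LDA num1
         OUT
         HLT
num1     DAT           ; the two numbers to be compared
num2     DAT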
Q2: Write an assembly code program to input two numbers and output the minimum number.
Example 3
Write an assembly code program which performs integer division. The program inputs two numbers big
and small, and outputs the result of big divided by small, ignoring the remainder.
There is no division instruction in this instruction set, so we have to repeatedly subtract small from
big until big becomes less than zero. Each time the result of the subtraction is zero or positive, we add 1
to a variable called answer.
INP ; Input the number 1
STA one ; store in one
INP ; Input the number 0
STA answer ; Store in answer
INP ; Input the divisor
STA small ; Store in small
INP ; Input the number to be divided
STA big ; store value in big
next SUB small ; subtract small from ACC which contains big
STA big ; Store in big
BRP more ; Branch if ACC positive or zero to more
LDA answer
OUT ; Output the answer in ACC
HLT ; Halt
more LDA answer ; Load answer into ACC
ADD one ; Increment ACC
STA answer ; Store in answer
LDA big ; load what is left of big
BRA next ; Branch to next
x DAT
one DAT
big DAT
small DAT
answer DAT
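The same repeated-subtraction logic can be checked in a high-level language. The Python sketch below mirrors the loop in the assembly program and uses the same variable names. The function name `integer_divide` is invented, and it assumes, as the assembly program does, that small is a positive number.

```python
def integer_divide(big, small):
    """Integer division by repeated subtraction, ignoring the remainder."""
    answer = 0
    big = big - small          # SUB small
    while big >= 0:            # BRP more
        answer = answer + 1    # ADD one
        big = big - small      # BRA next, then SUB small
    return answer

print(integer_divide(7, 2))  # 3
```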
[Figure: instruction format, with the opcode divided into the basic machine operation and the addressing mode]
0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 1
The LMC instruction set has only 11 instructions, and the imaginary machine has only 100 memory
locations. The maximum data value is 999, which can be held in 10 bits. Four bits would be enough to
store the operation code, and 7 bits would be enough to store the operand. A word size of 16 bits would
be more than enough to hold an instruction or a data value.
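These bit counts can be checked with a short calculation. The sketch assumes that the 11 instructions are numbered 0 to 10 and the 100 memory locations 0 to 99. The helper name `bits_needed` is invented.

```python
def bits_needed(largest_value):
    """Number of bits needed to hold values from 0 up to largest_value."""
    return max(1, largest_value.bit_length())

opcode_bits = bits_needed(11 - 1)    # 11 instructions, numbered 0 to 10
operand_bits = bits_needed(100 - 1)  # 100 locations, numbered 0 to 99
data_bits = bits_needed(999)         # largest data value

print(opcode_bits, operand_bits, data_bits)  # 4 7 10
```

An instruction therefore needs 4 + 7 = 11 bits and a data value needs 10 bits, so both fit comfortably in a 16-bit word.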
In a real computer, there will be considerably more than 11 instructions in the instruction set. It will
include, for example, multiply and divide in the arithmetic instructions, and shift instructions to shift bits
left or right.
There will also normally be up to 16 registers in which calculations can be carried out, rather than a
single accumulator.
A-Level only
Addressing modes
The operation code (opcode) consists of binary digits representing the basic operation such as ADD or
LOAD, and a 2-bit code representing the addressing mode.
There are four different addressing modes, which are indicated by the bit pattern that is the last two bits
of the opcode:
• Using immediate addressing, the operand is the actual value to be operated on, say 3 or 75.
• Using direct addressing, the operand holds the memory address of the value to be operated on. This
is the only addressing mode used in the LMC assembly language.
• Using indirect addressing, the operand is the location (typically a register) which holds the address of
the data we want. This enables a larger range of addressable locations.
• Using indexed addressing, the address of the operand is obtained by adding a constant value to the
contents of a general register (called the index register). The number of the index register and the
constant value are included in the instruction code. Indexed addressing mode is used to access an array
whose elements are in successive memory locations.
Examples of the use of each of these are given below.
Suppose contents of accumulator, index register and a section of memory are as follows:
Accumulator ACC holds 25
Index register holds 6
Register 0 (R0) contains 0
Memory location Contents
1
2 3
3
4 8
5
6 15
7 32
8 27
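The four addressing modes can be tried out on the memory contents shown above. The Python sketch below is an illustration under stated assumptions: only the locations with known contents are included, indirect addressing is modelled through a memory location rather than a register, and the function name `fetch_operand` and the mode names are invented for the example.

```python
# Memory contents copied from the table above (known locations only).
memory = {2: 3, 4: 8, 6: 15, 7: 32, 8: 27}
index_register = 6

def fetch_operand(mode, operand):
    """Return the value an instruction would operate on."""
    if mode == "immediate":
        return operand                           # the operand is the value
    if mode == "direct":
        return memory[operand]                   # the operand is the address
    if mode == "indirect":
        return memory[memory[operand]]           # the operand points to the address
    if mode == "indexed":
        return memory[index_register + operand]  # index register plus constant
    raise ValueError("unknown addressing mode")

print(fetch_operand("immediate", 4))  # 4
print(fetch_operand("direct", 4))     # 8
print(fetch_operand("indirect", 4))   # 27 (location 4 holds 8, location 8 holds 27)
print(fetch_operand("indexed", 1))    # 32 (6 + 1 = 7, location 7 holds 32)
```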
Exercises
1. (a) In a particular machine code, the opcode is stored in 6 bits and the operand is stored in
12 bits. What is the maximum number of operations in the machine’s instruction set? [1]
A-Level only
(b) Explain, with the aid of examples, the difference between immediate, direct and
indirect addressing. [4]
2. Using instructions ADD x (Add number stored in x to the accumulator)
write an assembly language program that adds together the values stored in memory locations
num1 and num2, storing the resulting total in memory location num3. [3]
3. Write an assembly language program which counts and outputs the number of values entered
by the user, and the total of the values input. End of input is signalled by dummy value 0. You
may assume that memory locations called increment, total and numvals contain
1, 0 and 0 respectively. (Use LMC assembly language instructions.) [7]
Section 4
Exchanging data
In this section:
Chapter 15 Compression, encryption and hashing
CHAPTER 15 – COMPRESSION, ENCRYPTION AND HASHING
Chapter 15 – Compression, encryption and hashing
Objectives
• Know why sound and images are often compressed
• Understand how other files can be compressed
• Understand the difference between lossless and lossy compression
A • Explain the advantages and disadvantages of different compression techniques
A • Explain run length encoding and dictionary based compression
A • Define symmetric and asymmetric encryption
A • Understand how and why hashing may be used to encrypt data
Lossy compression
Lossy compression works by removing non-essential information. The two JPG images below are clearly
identifiable as the same thing, but one has been heavily compressed, displaying untidy and blocky
compression artefacts as a consequence. We can still make out the subject of the image well,
but the heavier the compression, the greater the loss of quality.
SECTION 4 – EXCHANGING DATA
The compression of sound and video works in a similar way. MP3 files use lossy compression to remove
frequencies too high for most of us to hear and to remove quieter sounds that are played at the same
time as louder sounds. The resulting file is about 10% of original size, meaning that 1 minute of MP3
audio equates to roughly 1MB in size.
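The figure of roughly 1MB per minute can be checked with a rough calculation. The sketch assumes that the uncompressed original is CD-quality audio (44,100 samples per second, 16 bits per sample, two channels), which is not stated in the text, and takes the 10% figure quoted above.

```python
SAMPLE_RATE = 44100    # samples per second (assumed CD quality)
BYTES_PER_SAMPLE = 2   # 16 bits per sample
CHANNELS = 2           # stereo
SECONDS = 60           # one minute of audio

uncompressed_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS * SECONDS
compressed_bytes = uncompressed_bytes // 10   # about 10% of original size

print(uncompressed_bytes)  # 10584000, roughly 10MB
print(compressed_bytes)    # 1058400, roughly 1MB
```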
Voice is transmitted over the Internet or mobile telephone networks using lossy compression and
although we have no problem in understanding what the other person is saying, we can recognise the
difference in quality of a voice over a phone rather than in person. The apparent difference is lost data.
Lossless compression
Lossless compression works by recording patterns in data rather than the actual data. Using these
patterns and a set of instructions on how to use them, the computer can reverse the procedure and
reassemble an image, sound or text file with exact accuracy and no data is lost. This is most important
with the compression of program files, for example, where a single lost character would result in an error
in the program code. A pixel with a slightly different colour would not be of huge consequence in most
cases. Lossless compression usually results in a much larger file than a lossy file, but one that is still
significantly smaller than the original.
Q1: What type of compression is likely to be used for the following: a website image, a zipped file of
long text documents and images, a PDF instruction manual?
A-Level only
For this section of the balloon image, the encoding for the first row might crudely translate to:
6 green, 8 yellow and 17 orange, using one binary value for the colour value and another for the number
of contiguous matching pixels in the run. This would reduce the data necessary to store this row to 6
bytes (00000110 00000001 00001000 00000010 00010001 00000011) rather than 31 bytes assuming a
bit depth of 8 and values for each colour of 00000001, 00000010 and 00000011.
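The row described above can be reproduced with a short run length encoder. This is a minimal sketch, not a real image format: the function names are hypothetical, and each run is stored as one byte for the run length followed by one byte for the colour value, as in the example.

```python
def rle_encode(pixels):
    """Return a list of (run_length, value) pairs for a sequence of pixel values."""
    runs = []
    for value in pixels:
        # Extend the current run if the value repeats (capped at 255 so it fits in one byte).
        if runs and runs[-1][1] == value and runs[-1][0] < 255:
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            runs.append((1, value))
    return runs


def rle_decode(runs):
    """Rebuild the original pixel sequence from (run_length, value) pairs."""
    pixels = []
    for length, value in runs:
        pixels.extend([value] * length)
    return pixels


GREEN, YELLOW, ORANGE = 1, 2, 3            # colour values used in the text
row = [GREEN] * 6 + [YELLOW] * 8 + [ORANGE] * 17

runs = rle_encode(row)
encoded = bytes(b for run in runs for b in run)
binary = " ".join(f"{b:08b}" for b in encoded)

print(runs)
print(binary)
print(len(row), "bytes reduced to", len(encoded), "bytes")
```

Decoding the runs gives back exactly the same 31 pixel values, which is what makes the method lossless.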
Using the dictionary table above, the saying “Do unto others as you would have others do unto
you” would be compressed as 1 2 3 4 5 6 7 3 8 2 5 or in binary using only 49 bits. This compares
to 51 characters or 51 bytes – a reduction of 92%. This still ignores the fact that the dictionary must
also be stored with the text, but with a longer body of text to be compressed, a dictionary becomes
quite insignificant in size compared with the original, and the original message can still be
reassembled perfectly.
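The idea can be sketched in a few lines. The sketch assumes that dictionary entries are numbered from 1 in order of first appearance and that “Do” and “do” count as different words; both assumptions are chosen because they reproduce the code sequence quoted above. The function names are hypothetical.

```python
def build_dictionary(text):
    """Give each distinct word a number, starting at 1, in order of first appearance."""
    dictionary = {}
    for word in text.split():
        if word not in dictionary:
            dictionary[word] = len(dictionary) + 1
    return dictionary


def compress(text, dictionary):
    """Replace each word with its dictionary number."""
    return [dictionary[word] for word in text.split()]


def decompress(codes, dictionary):
    """Look each number up in the dictionary to rebuild the text exactly."""
    lookup = {code: word for word, code in dictionary.items()}
    return " ".join(lookup[code] for code in codes)


saying = "Do unto others as you would have others do unto you"
dictionary = build_dictionary(saying)
codes = compress(saying, dictionary)

print(dictionary)
print(codes)
print(decompress(codes, dictionary))
```

Because the dictionary is stored alongside the codes, the original text is recovered without any loss.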
Encryption
Encryption is the transformation of data from one form to another to prevent an unauthorised third party
from being able to understand it. The original data or message is known as plaintext. The encrypted
data is known as ciphertext. The encryption method or algorithm is known as the cipher, and the
secret information to lock or unlock the message is known as a key.
The Caesar cipher and the Vernam cipher offer polar opposite examples of security. Where the Vernam
offers perfect security, the Caesar cipher is very easy to break with little or no computational power.
There are many other methods of encryption – some of which may take many computers many years
to break, but almost all of these are still breakable and the principles behind them are similar.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
A-Level only
Q2: Using the table above, what is the ciphertext for ‘JULIUS CAESAR’ using a shift of 5?
Q3: What word can be translated from the following ciphertext using a key of -2: ZYBECP
You will no doubt be able to see the ease with which you can decrypt a message using this system.
DGYDQFH WR ERUGHU DQG DWWDFN DW GDZQ
Even if you had to attempt a brute force attack on the message above, there are only 26 different
possibilities. Otherwise you might begin by guessing the likelihood of certain characters first and go from
there. Using cryptanalysis on longer messages, you would quickly find the most common ciphertext letter
and could start by assuming this was an E, for example; or perhaps an A. (Hint.)
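Both attacks are easy to sketch in code. The example below uses a made-up message rather than the ciphertext above, and the function names are hypothetical. The frequency guess only works when E really is the most common plaintext letter, which is not guaranteed for short messages.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def caesar(text, shift):
    """Shift each letter A-Z by `shift` places, wrapping round; other characters pass through."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)


def brute_force(ciphertext):
    """Return every (key, candidate plaintext) pair for keys 0 to 25."""
    return [(key, caesar(ciphertext, -key)) for key in range(26)]


def guess_key(ciphertext, assumed_plain_letter="E"):
    """Guess the key by assuming the most common ciphertext letter stands for E."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index(assumed_plain_letter)) % 26


plaintext = "MEET ME NEAR THE GREEN TREE"   # hypothetical message
ciphertext = caesar(plaintext, 7)

key = guess_key(ciphertext)
print("Guessed key:", key)
print("Decrypted:  ", caesar(ciphertext, -key))

# The brute force attack simply lists all 26 candidates for a human to scan.
for candidate_key, candidate in brute_force(ciphertext)[:3]:
    print(candidate_key, candidate)
```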
One-time pad
The encryption key or one-time pad must be equal to or longer in characters than the plaintext, be truly
random and be used only once. One-time pads are used in pairs where the sender and recipient are both
party to the key. Both must meet in person to securely share the key and destroy it after encryption or
decryption. Since the key is random, so will be the distribution of the characters meaning that no amount
of cryptanalysis will produce any meaningful results.
Q4: Using the ASCII chart and the XOR operator, what ciphertext character will be produced
from the letter E with the key w?
Using this method, the message “Meet on the bridge at 0300 hours” encrypted using a one-time pad
of +tkiGeMxGvnhoQ0xQDIIIVdT4sIJm9qf will produce the ciphertext:
A-Level only
The encryption process will often produce strange symbols or unprintable ASCII characters as in the
above example, but in practice it is not necessary to translate the encrypted code back into character
form, as it is transmitted in binary. To decrypt the message, the XOR operation is carried out on the
ciphertext using the same one-time pad, which restores it to plaintext.
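The XOR step can be sketched as follows. The message is made up, the function name is hypothetical, and the pad comes from Python's `secrets` module, which draws on the operating system's random source. That is an approximation of the truly random pad the method requires, so treat this as an illustration only.

```python
import secrets


def xor_bytes(data, pad):
    """XOR each byte of data with the corresponding byte of the pad."""
    if len(pad) < len(data):
        raise ValueError("one-time pad must be at least as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"Attack at dawn"                 # hypothetical plaintext
pad = secrets.token_bytes(len(message))     # used once, then destroyed

ciphertext = xor_bytes(message, pad)        # encrypt
recovered = xor_bytes(ciphertext, pad)      # decrypt with the same pad

print(ciphertext.hex())                     # often unprintable, so shown as hex
print(recovered.decode("ascii"))
```

Applying the same pad twice restores the plaintext because any value XORed with itself gives zero.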
[Figure: asymmetric encryption of a message. Labels: “Encrypted message”; “Recipient’s public key used to
encrypt data before sending”; “Data encrypted with user’s public key can only be decrypted with the
user’s private key”.]
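The public and private key relationship in the figure can be illustrated with a toy calculation. The text does not name an algorithm. RSA is used here purely as a familiar example, with deliberately tiny primes that would offer no security in practice. All variable and function names are hypothetical.

```python
# Toy RSA key generation with tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)            # published to everyone
private_key = (d, n)           # kept secret by the recipient


def encrypt(m, key):
    """Encrypt a number smaller than n with the recipient's public key."""
    exponent, modulus = key
    return pow(m, exponent, modulus)


def decrypt(c, key):
    """Decrypt with the matching private key."""
    exponent, modulus = key
    return pow(c, exponent, modulus)


message = 65                   # a message encoded as a small number
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)

print("ciphertext:", ciphertext)
print("recovered: ", recovered)
```

Anyone can encrypt with the public key, but only the holder of the private key can reverse the operation.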
Q5: Governments sometimes demand copies of encryption keys in order to decrypt messages if
necessary. What reasons are there for and against governments doing this?
A-Level only
Hashing
A hashing function provides a mapping between an arbitrary length input and a usually fixed length or
smaller output. Unlike the encryption techniques described above, it is one-way; you cannot get back
to the original. This is useful for storing encrypted PINs and passwords so that they cannot be read
by a hacker. To verify a user’s password, the software applies the hash function to the user input and
compares it with the one stored.
Methods of hashing are discussed in Section 7, Chapter 37.
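The verification step described above might be sketched as follows. The text does not name a hash function. SHA-256 from Python's `hashlib` with a random salt is an assumption made for illustration, and real systems normally use deliberately slow password hashing schemes. The function names and the PIN are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_password(password, salt):
    """Return a fixed-length hash of the salt followed by the password."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# When the PIN is first set, only the salt and the hash are stored.
salt = secrets.token_bytes(16)
stored_hash = hash_password("1234", salt)


def verify(attempt):
    """Hash the user's input and compare it with the stored hash."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)


print(len(stored_hash))   # the same length whatever the input length
print(verify("1234"))
print(verify("4321"))
```

The original PIN is never stored, so an attacker who reads the stored hash cannot simply read the PIN from it.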
Q6: Assuming their private key has not been compromised, a digital signature authenticates the
sender beyond legal doubt. How might this help protect against viruses?
Hoax digital signatures could be created using a bogus private key claiming to be that of a trusted
individual. In order to mitigate against this, a digital certificate verifies that a sender’s public key is
formally registered to that particular sender.
A-Level only
Digital certificates
While digital signatures verify the trustworthiness of message content, a digital certificate is issued by
official Certificate Authorities (CAs) such as Symantec or Verisign and verifies the trustworthiness of
a message sender or website. This certificate allows the holder to use the Public Key Infrastructure
or PKI. The certificate contains the certificate’s serial number, the expiry date, the name of the holder,
a copy of their public key, and the digital signature of the CA so that the recipient can authenticate the
certificate as real. Digital certificates operate within the Transport layer of the TCP/IP protocol stack.
TCP/IP is covered in Chapter 22.
Exercises
1. (a) Explain why compression is considered necessary for images on the web. [2]
(b) Explain why lossy compression techniques would not be suitable for use with files
containing large bodies of text. [1]
(c) Suggest a suitable lossless method for compressing text. [1]
A-Level only
2. (a) Explain the difference between lossy and lossless data compression. [2]
(b) Run-length encoding (RLE) is a pattern substitution compression algorithm. Data is stored
in the format (colour,run).
a. (0,1),(1,5),(0,1),
b. (1,7),
c. (1,1),(0,2),(1,1),(0,2),(1,1),
d. (1,7),
e. (0,1),(1,1),(0,1),(1,1),(0,1),(1,1),(0,1),
f. (0,1),(1,1),(0,1),(1,1),(0,1),(1,1),(0,1),
g. (0,1),(1,1),(0,3),(1,1),(0,1)
Reassemble the encoded sequence above to form a 7x7 web icon image in the grid below. [3]
(c) RLE encoding is a lossless compression method. Give one disadvantage of lossless
compression over lossy methods for the compression of images. [1]
3. (a) State what is meant by symmetric encryption and explain with the aid of an example how
it can be implemented. [4]
(b) (i) Explain what is meant by asymmetric encryption. [4]
(ii) Explain why this form is more secure than symmetric encryption. [2]
Example 1
A dentist’s surgery employs several dentists, and an appointments system is required to allow patients to
make appointments with a particular dentist.
Entities in this system include Dentist, Patient and Appointment. The attributes of Dentist may include
Title, Firstname, Surname, Qualification.
Attributes of Patient may include Title, Firstname, Surname, Address, Telephone.
Entity descriptions
An entity description is normally written using the format
Entity1 (Attribute1, Attribute2…)
The entity description for Dentist is therefore written
Dentist (Title, Firstname, Surname, Qualification)
CHAPTER 16 – DATABASE CONCEPTS
Q3: Is National Insurance Number a suitable primary key for Patient? If not, why not?
Secondary key
A database needs to be set up so that it can be searched quickly. An index of all the primary keys in
the database, and where the record is held, is automatically maintained by the database software.
However, more than one index may be needed.
If, for example, a patient rings up to make an appointment with the dentist, they are unlikely to know their
patient ID. A secondary index on surname is therefore likely to be held.
• One-to-one Examples of such a relationship include the relationship between Husband and
Wife, Country and Prime Minister.
• One-to-many Examples include the relationship between Mother and Child, Customer and
Order, Borrower and Library Book.
• Many-to-many Examples include the relationship between Student and Course, Stock Item and
Supplier, Film and Actor.
Headteacher   in charge of   School     One-to-one
Dentist       treats         Patient    One-to-many
Customer      orders         Product    Many-to-many
Foreign key
A foreign key is an attribute that creates a join between two tables. It is the attribute that is common to
both tables, and the primary key in one table is the foreign key in the table to which it is linked.
Example: In the one-to-many relationship between Dentist and Patient, the entity on the ‘many’ side of
the relationship will have DentistID as an extra attribute. This is the foreign key.
Dentist          Patient
DentistID*       PatientID*
Title            Title
FirstName        FirstName
Surname          Surname
Qualification    Address
                 Telephone
                 DentistID (foreign key)
Note that the primary key is indicated by an asterisk, and the foreign key is labelled.
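The Dentist and Patient tables above can be tried out with SQLite from Python. This is a sketch under assumptions: the column types and the sample rows are made up, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite only enforces foreign keys when asked

conn.executescript("""
    CREATE TABLE Dentist (
        DentistID     INTEGER PRIMARY KEY,
        Title         TEXT,
        FirstName     TEXT,
        Surname       TEXT,
        Qualification TEXT
    );
    CREATE TABLE Patient (
        PatientID INTEGER PRIMARY KEY,
        Title     TEXT,
        FirstName TEXT,
        Surname   TEXT,
        Address   TEXT,
        Telephone TEXT,
        DentistID INTEGER REFERENCES Dentist (DentistID)   -- foreign key
    );
    -- Made-up sample rows: one dentist treating two patients.
    INSERT INTO Dentist VALUES (1, 'Dr', 'Asha', 'Patel', 'BDS');
    INSERT INTO Patient VALUES (10, 'Mr', 'Tom', 'Reed', '1 High St', '01234 567890', 1);
    INSERT INTO Patient VALUES (11, 'Ms', 'Ann', 'Cole', '2 Low Rd', '01234 098765', 1);
""")

# The foreign key lets each patient be matched to their dentist.
rows = conn.execute("""
    SELECT Patient.Surname, Dentist.Surname
    FROM Patient JOIN Dentist ON Patient.DentistID = Dentist.DentistID
    ORDER BY Patient.Surname
""").fetchall()
print(rows)
```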
Student       takes          Course     Many-to-many
In this case, an extra table is needed to link the Student and Course tables. We could call this
StudentCourse, or Enrolment, for example.
The three tables will now have attributes something like those shown below:
Student (StudentID, Name, Address)
Enrolment (StudentID, CourseID)
Course (CourseID, Subject, Level)
Composite key
In this data model, the table linking Student and Course has two foreign keys, each linking to one of
the two main tables. The two foreign keys also act as the primary key of this table. A primary key which
consists of more than one attribute is called a composite primary key.
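A composite primary key can be declared directly when the link table is created. The sketch below uses SQLite from Python. The column types and sample rows are assumptions, and only the table and attribute names come from the entity descriptions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
    CREATE TABLE Course  (CourseID TEXT PRIMARY KEY, Subject TEXT, Level TEXT);
    CREATE TABLE Enrolment (
        StudentID INTEGER REFERENCES Student (StudentID),
        CourseID  TEXT    REFERENCES Course (CourseID),
        PRIMARY KEY (StudentID, CourseID)      -- composite primary key
    );
    -- Made-up sample rows.
    INSERT INTO Student VALUES (1, 'Sam Lee', '3 Mill Lane');
    INSERT INTO Course VALUES ('CS1', 'Computer Science', 'A Level');
    INSERT INTO Course VALUES ('MA1', 'Maths', 'A Level');
    INSERT INTO Enrolment VALUES (1, 'CS1');
    INSERT INTO Enrolment VALUES (1, 'MA1');
""")

# The same student can take several courses, but the same pair cannot be stored twice.
try:
    conn.execute("INSERT INTO Enrolment VALUES (1, 'CS1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enrolments = conn.execute(
    "SELECT StudentID, CourseID FROM Enrolment ORDER BY CourseID"
).fetchall()
print(duplicate_rejected, enrolments)
```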
Example 2
A hospital inpatient system may involve entities Ward, Nurse, Patient and Consultant. A ward is
staffed by many nurses, but each nurse works on only one ward. A patient is in a ward and has many
nurses looking after them, as well as a consultant, who sees many patients on different wards.
Ward          holds          Patient
Ward          staffed by     Nurse
Consultant    sees           Patient
Referential integrity
When tables are linked in a relational database, it is important to ensure that, for example, a particular
component is not deleted if it is used in a product in the Product table. This is known as referential
integrity.
The screenshot above shows a relationship being created in MS Access between two tables linked
by School ID.
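Referential integrity can also be demonstrated in code. The sketch below uses SQLite from Python with a simplified, hypothetical pair of tables in which each product row names one component. The component names and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # ask SQLite to enforce referential integrity

conn.executescript("""
    CREATE TABLE Component (CompID TEXT PRIMARY KEY, CompName TEXT);
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        ProductName TEXT,
        CompID      TEXT REFERENCES Component (CompID)
    );
    -- Made-up rows: G56 is used in a product, ZZ99 is not.
    INSERT INTO Component VALUES ('G56', 'Glass eyes');
    INSERT INTO Component VALUES ('ZZ99', 'Unused part');
    INSERT INTO Product VALUES (123, 'Teddy bear', 'G56');
""")


def try_delete(comp_id):
    """Attempt to delete a component; return False if the database refuses."""
    try:
        with conn:   # commits on success, rolls back on error
            conn.execute("DELETE FROM Component WHERE CompID = ?", (comp_id,))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_delete("G56"))    # refused: still used in the Product table
print(try_delete("ZZ99"))   # allowed: nothing refers to it
remaining = [row[0] for row in conn.execute("SELECT CompID FROM Component")]
print(remaining)
```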
Exercises
1. An estate agent keeps a database of all the properties it has for sale, the owners of the properties,
and all the prospective buyers.
Details about the properties for sale, including address, number of bedrooms, type of property,
asking price are held in a table called Property.
(a) Suggest a suitable primary key for the Property table. [1]
(b) Suggest two attributes in the Property table that may be defined as secondary keys,
justifying why each should be defined in this way. [4]
Data on prospective buyers include name, telephone, address, type of property required,
lower and upper limit for price.
(d) Write entity descriptions for each of the entities Property, Vendor, Buyer and Viewing.
In each case, identify any primary and foreign keys. [8]
(e) Draw an entity relationship diagram showing relationships between these four entities. [4]
2. A library plans to set up a database to keep track of its members, books and loans. Entities are
defined as follows:
(a) Draw an entity relationship diagram showing the relationships between the entities. [3]
(b) A relational database is created with tables for each of these entities. The key in the Loan
table is made up of two fields.
What is the name given to a key that is made up of multiple attributes? [1]
(c) What is meant by a foreign key? Identify a foreign key in one of the tables. [3]
3. An exam board wants to set up a database to hold data about its courses, exam papers, exam
entries, candidates and results. For the purpose of this exercise, assume that each candidate
can sit each exam once only. A course may have several exam papers (Comp 1, Comp 2, etc.).
The data to be stored for the candidate are CandidateNumber, FirstName, Surname, DateOfBirth.
The data to be stored for the course are CourseID, Subject, Level.
The data held for each individual exam paper includes CourseID, ExamPaperID, DateOfExam,
Title, TotalMarks, ExamPaperWeighting.
(b) Draw an entity-relationship diagram showing the relationships between the entities. [3]
(c) Write an entity description for a Results entity which will store the exam mark that candidates
receive for each exam paper. [2]
4. (a) Discuss the suitability of flat files and relational databases for use by a family at home and
for use in a large mail order company.
The quality of written communication will be assessed in your answer to this question. [8]
(b) In any relational database, primary and foreign keys are used.
CHAPTER 17 – RELATIONAL DATABASES AND NORMALISATION
Indexing
In order that a record with a particular primary key can be quickly located in a database, an index of
primary keys will be automatically maintained by the database software, giving the position of each
record according to its primary key.
One or more secondary indexes may be defined when the database is created, for any attribute that is
often used as a search criterion. For example, in the above table both Author and Title might be defined
as secondary keys. This would speed up searches on either of these fields, which would otherwise have
to be searched sequentially.
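Secondary indexes of this kind can be created with a `CREATE INDEX` statement. The sketch below uses SQLite from Python. The Book table, its BookID key, the index names and the rows are all hypothetical, with only the Author and Title attributes taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (BookID INTEGER PRIMARY KEY, Author TEXT, Title TEXT);

    -- Secondary indexes on the attributes most often searched.
    CREATE INDEX idx_book_author ON Book (Author);
    CREATE INDEX idx_book_title  ON Book (Title);

    -- Made-up rows.
    INSERT INTO Book VALUES (1, 'Author A', 'Title X');
    INSERT INTO Book VALUES (2, 'Author B', 'Title Y');
""")

index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    )
]
print(index_names)

# Ask SQLite how it would run a search on Author; the last column of each
# plan row is a description that names any index being used.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT Title FROM Book WHERE Author = ?", ("Author B",)
).fetchall()
plan_detail = plan[0][-1]
print(plan_detail)
```

The exact wording of the query plan varies between SQLite versions, but it should mention the index on Author rather than a full scan of the table.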
• no data is unnecessarily duplicated (i.e. the same data item held in more than one table)
• data is consistent throughout the database (e.g. a customer is not recorded as having different
addresses in different tables of the database). Consistency should be an automatic consequence
of not holding any duplicated data. This means that anomalies will not arise when data is inserted,
amended or deleted.
• the structure of each table is flexible enough to allow you to enter as many or as few items (for
example, components making up a product) as required
• the structure should enable a user to make all kinds of complex queries relating data from
different tables
There are three basic stages of normalisation known as first, second and third normal form.
Example 1
A company manufacturing soft toys buys the component parts (fake fur, glass eyes, stuffing, growl etc.)
from different suppliers. Each component may be used in the manufacture of several different toys (teddy
bear, dog, duck etc.) Each component comes from a sole supplier.
Sample data to be held in the database is shown in the table:
Table 17.1
A-Level only
As the first stage in normalisation, we need to note that there are repeating groups of attributes in this
table; for example, ProductID 123 has three components with IDs ST01, G56 and FF77. We need to split
the data into two tables to get rid of the repeating groups.
Note that a table in a relational database may be referred to as a relation.
Two entities, Product and Component, can be identified. These have the following relationship:
Product       has            Component
Similarly, Component (CompID, CompName, SupplierID, SupplierDetails, ProductID)
is no good either because each component is used in a number of different products.
One obvious solution (and unfortunately a bad one) springs to mind. How about allowing space for four
components in the record for each product?
Product (ProductID, ProductName, CostPrice, SellingPrice, CompID1, CompQty1,
CompID2, CompQty2, CompID3, CompQty3, CompID4, CompQty4)
This table contains repeating attributes, which are not allowed in first normal form. The attributes
CompID and CompQty are repeated four times. The table is therefore NOT in first normal form.
It would be represented in standard notation with a line over the repeating attributes:
Product (ProductID, ProductName, CostPrice, SellingPrice, CompID, CompQty)
To put the data into first normal form, the repeating attributes must be removed.
A-Level only
The design is now in 1NF because it contains no repeating attribute or groups of attributes.
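Removing a repeating group amounts to giving each product and component pair a row of its own. The sketch below shows the idea in plain Python. ProductID 123 and its three component IDs come from the text, while the product names, the second product and the quantities are made up for illustration.

```python
# Unnormalised records: each product carries a repeating group of components.
unnormalised = [
    {"ProductID": 123, "ProductName": "Teddy bear",
     "Components": [("ST01", 1), ("G56", 2), ("FF77", 1)]},
    {"ProductID": 124, "ProductName": "Duck",
     "Components": [("ST01", 1), ("G56", 2)]},
]

# Product table: one row per product, with no component attributes.
product = [(p["ProductID"], p["ProductName"]) for p in unnormalised]

# Link table: one row per (product, component) pair, holding the quantity.
product_component = [
    (p["ProductID"], comp_id, qty)
    for p in unnormalised
    for comp_id, qty in p["Components"]
]

for row in product:
    print(row)
for row in product_component:
    print(row)
```

A product can now have any number of components without changing the structure of either table.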
Q2: Draw three tables representing these three entities and put the test data from Table 1 in the
correct tables.
orders
A B
will become:
A Link B
A-Level only
The entity relationship diagram showing the relationships between these four tables in third normal form
is shown below. Each entity has its own table.
Supplier
Example 2
A school plans to keep records of Sports Day events for different years in a database. The data that
needs to be held for each event in a particular year is illustrated in the following table:
Q4: Show how the database may be normalised by writing entity descriptions for each relation.
Draw an entity relationship diagram.
No data redundancy
One of the aims of normalising a database design is to remove the possibility of redundant data from any
of the tables. Redundant data is data that appears in more than one database table, which can cause
inefficiencies and inconsistencies in the data, as explained in the next paragraph.
A-Level only
Deleting records
A normalised database with correctly defined relationships between tables will not allow records in a
table on the ‘one’ side of a one-to-many relationship to be deleted accidentally. For example, a customer
who still has unresolved transactions on file, such as an unpaid invoice, cannot be deleted.
Exercises
1. The publisher of several magazines has a relational database in which the details of each magazine
are held. One of the tables in the database holds details of all the major articles in each magazine.
(a) Write a description for entities Magazine and Article, showing for each table the primary key, a
foreign key if applicable, and at least two other attributes, using the format
(b) Suggest, with a reason, an attribute in either table which it would be useful to define as a
secondary index. [2]
A-Level only
2. A college department wishes to create a database to hold information about students and the
courses they take. The relationship between students and courses is shown in the following entity
relationship diagram.
attends
Student Course
(a) Show how the data may be rearranged into relations which are in third normal form. [6]
(b) State two properties that the tables in a fully normalised database must have. [2]
A-Level only
3. A museum has permanent displays but also runs a programme of special events. People may pay
an annual fee to become Friends of the Museum. Friends can attend events, which they must book
in advance. This, and other data about the museum, is stored in a relational database. Part of the
entity-relationship (E-R) diagram is shown.
(a) (i) State the type of relationship between FRIEND and TICKET. [1]
(ii) Explain the use of primary and foreign keys in FRIEND and TICKET. [4]
(b) When the database was being designed, an initial version of the diagram showed a direct
relationship between FRIEND and EVENT.
Draw this initial E-R diagram with FRIEND and EVENT only. [1]
CHAPTER 18 – INTRODUCTION TO SQL
A-Level only
SQL
SQL, or Structured Query Language (pronounced either as S-Q-L or Sequel) is a declarative
language used for querying and updating tables in a relational database. It can also be used to create
tables. In this chapter, we will look at SQL statements used in querying a database.
The tables shown in Tables 18.1, 18.2 and 18.3 below will be used to demonstrate some SQL
statements. They are part of a database used by a retailer to store details of CDs so that
information about the CDs can be extracted. The four entities CD, CDSong, Song and Artist are
connected by the following relationships:
[Figure 18.1: entity relationship diagram connecting CD, CDSong, Song and Artist]
The CD table is shown below.
A-Level only
Example 1
SELECT CDTitle, RecordCompany, DatePublished
FROM CD
WHERE DatePublished BETWEEN #01/01/2015# AND #31/12/2015#
ORDER BY CDTitle
This will return the following records:
Conditions
Conditions in SQL are constructed from the following operators:
A-Level only
ArtistID ArtistName
A123 Fred Bates
A134 Maria Okello
A154 Bobby Harris
A315 Jo Morris
A318 JJ
A334 Rapport
Using SQL you can combine data from two or more tables, by specifying which table the data is held in.
For example, suppose you wanted SongTitle, ArtistName and MusicType for all Art Pop music. When
more than one table is involved, SQL uses the syntax tablename.fieldname. (The table name is
optional unless the field name appears in more than one table.)
SELECT Song.SongTitle, Artist.ArtistName, Song.MusicType
FROM Song, Artist
WHERE (Song.ArtistID = Artist.ArtistID) AND (Song.MusicType = "Art Pop")
The condition Song.ArtistID = Artist.ArtistID provides the link between the Song and
Artist tables so that the artist’s name corresponding to the ArtistID in the Song table can be found in the
Artist table. This will produce the following results:
SongTitle ArtistName MusicType
Volcano Maria Okello Art Pop
Gentle Waves Maria Okello Art Pop
Right Here Maria Okello Art Pop
Clouds Jo Morris Art Pop
Here with you Bobby Harris Art Pop
SQL JOIN
JOIN provides an alternative method of combining rows from two or more tables, based on a common
field between them. The query above could be written as follows:
SELECT Song.SongTitle, Artist.ArtistName, Song.MusicType
FROM Song
JOIN Artist
ON Song.ArtistID = Artist.ArtistID
WHERE Song.MusicType = "Art Pop"
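The two forms of the query can be compared by running both against SQLite from Python. This is a sketch under assumptions: the Artist rows and the five Art Pop songs are taken from the tables above, but the SongID values and the extra Rock song are made up, and single quotes are used for the text value because that is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artist (ArtistID TEXT PRIMARY KEY, ArtistName TEXT);
    CREATE TABLE Song (SongID TEXT PRIMARY KEY, SongTitle TEXT,
                       ArtistID TEXT, MusicType TEXT);
""")
conn.executemany("INSERT INTO Artist VALUES (?, ?)", [
    ("A123", "Fred Bates"), ("A134", "Maria Okello"), ("A154", "Bobby Harris"),
    ("A315", "Jo Morris"), ("A318", "JJ"), ("A334", "Rapport"),
])
conn.executemany("INSERT INTO Song VALUES (?, ?, ?, ?)", [
    ("T01", "Volcano", "A134", "Art Pop"),        # SongIDs are hypothetical
    ("T02", "Gentle Waves", "A134", "Art Pop"),
    ("T03", "Right Here", "A134", "Art Pop"),
    ("T04", "Clouds", "A315", "Art Pop"),
    ("T05", "Here with you", "A154", "Art Pop"),
    ("T06", "Example Tune", "A318", "Rock"),      # made-up row that should be filtered out
])

where_rows = conn.execute("""
    SELECT Song.SongTitle, Artist.ArtistName, Song.MusicType
    FROM Song, Artist
    WHERE (Song.ArtistID = Artist.ArtistID) AND (Song.MusicType = 'Art Pop')
""").fetchall()

join_rows = conn.execute("""
    SELECT Song.SongTitle, Artist.ArtistName, Song.MusicType
    FROM Song
    JOIN Artist
    ON Song.ArtistID = Artist.ArtistID
    WHERE Song.MusicType = 'Art Pop'
""").fetchall()

for row in sorted(join_rows):
    print(row)
print("Same rows from both queries:", sorted(where_rows) == sorted(join_rows))
```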
Q2: Write an SQL query which will give the SongTitle, ArtistName, MusicType of all songs by JJ or
Rapport, sorted by ArtistName and SongTitle.
The fourth table in the database is the table CDSong which links the songs to one or more of the CDs.
A-Level only
CDNumber SongID
CD14356 S1234
CD14356 S1258
CD14356 S1415
CD19998 S1234
CD19998 S1389
CD19998 S1423
CD19998 S1456
CD25364 S1256
CD25364 S1392
CD34512 S1392
CD34512 S1234
CD34512 S1389
CD34512 S1444
CD77233 S1256
CD77233 S1344
CD77233 S1399
CD77233 S1456
Table 18.4: CDSong table
Example 2
We can make a search to find the CDNumbers and titles of all the CDs containing the song Waterfall, sung
by JJ.
SELECT Song.SongID, Song.SongTitle, Artist.ArtistName, CDSong.CDNumber,
CD.CDTitle
FROM Song, Artist, CDSong, CD
WHERE CDSong.CDNumber = CD.CDNumber
AND CDSong.SongID = Song.SongID
AND Artist.ArtistID = Song.ArtistID
AND Song.SongTitle = "Waterfall"
This will produce the following results:
Note that in the SELECT statement, it does not matter whether you specify Song.SongID or
CDSong.SongID since they are connected. The same is true of CDSong.CDNumber and CD.CDNumber.
The Boolean conditions CDSong.SongID = Song.SongID and
Artist.ArtistID = Song.ArtistID are required to specify the relationships between the data tables.
(See the entity relationship diagram in Figure 18.1.)
A-Level only
Exercises
1. A school keeps records of school trips on a database. There are four tables on the database
named PUPIL, TRIP, TEACHER, PUPILTRIP, defined as follows:
(a) Draw an entity relationship diagram showing the relationship between the entities. [4]
(i) find the first name and surname of all pupils who went on a trip with TripID 14. [4]
(ii) find all the trips for which the teacher with surname “Black” has been in charge, giving
teacher’s title and surname, trip description and start date, sorted in descending order
of start date. [4]
(iii) find the firstnames and surnames of all the pupils who went on any trip with “Year 7” in
the description (e.g. “Year 7 Geography field trip” in May 2015), showing the firstname
and surname of the teacher in charge. [6]
CHAPTER 19 – DEFINING AND UPDATING TABLES USING SQL
A-Level only
Chapter 19 – Defining and updating tables using SQL
Objectives
A • Be able to use SQL to define a database table
A • Be able to use SQL to update, insert and delete data from multiple tables of a relational database
Example 1
Use SQL to create a table named Employee, which has four columns: EmpID (a compulsory int field
which is the primary key), Name (a compulsory character field of length 10), HireDate (an optional date
field) and Salary (an optional real number field).
CREATE TABLE Employee
(
EmpID INTEGER NOT NULL PRIMARY KEY,
Name VARCHAR(10) NOT NULL,
HireDate DATE,
Salary CURRENCY
)
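The effect of the NOT NULL and PRIMARY KEY constraints can be checked by creating the table in SQLite from Python. This is a sketch under assumptions: the column names follow the description in Example 1, and REAL stands in for the salary type because data types differ between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee
    (
        EmpID    INTEGER NOT NULL PRIMARY KEY,
        Name     VARCHAR(10) NOT NULL,
        HireDate DATE,
        Salary   REAL
    )
""")

# PRAGMA table_info lists each column as
# (position, name, declared type, not-null flag, default value, primary-key flag).
columns = conn.execute("PRAGMA table_info(Employee)").fetchall()
summary = [(col[1], col[3], col[5]) for col in columns]
print(summary)

# Leaving out a compulsory field is rejected by the database.
try:
    conn.execute("INSERT INTO Employee (EmpID) VALUES (1)")
    missing_name_rejected = False
except sqlite3.IntegrityError:
    missing_name_rejected = True
print("Row without a Name rejected:", missing_name_rejected)
```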
Data types
Some of the most commonly used data types are described in the table below. (The data types vary
depending on the specific implementation.)
A-Level only
Q1: Use SQL to create a table called Student which is defined as follows:
StudentID 6 characters (Primary key)
Surname 20 characters
FirstName 15 characters
DateOfBirth Date
Q2: Write an SQL statement to add a new column named YearGroup, of type Integer.
Example 2
Suppose that an extra table is to be added to the Employee database which lists the training courses
offered by the company. A third table shows which date an employee attended a particular course.
Employee      Course Attendance      Course
A-Level only
The structure of the Course table is:
CourseID 6 characters, fixed length (Primary key)
CourseTitle 30 characters maximum (must be entered)
OnSite Boolean
Q3: Write the SQL statements to create the Course table.
Note that if all the fields are being added in the correct order you would not need the field names in the
brackets above to be specified. INSERT INTO Employee would be sufficient.
Example: add a record for employee number 1125, Cully, who was hired on 1/1/2001. Salary and
Department are not known.
INSERT INTO Employee (EmpID, Name, HireDate)
VALUES (1125, "Cully", #1/1/2001#)
A-Level only
Example: Update the record for Bloggs, who has moved to Administration.
UPDATE Employee
SET Department = "Administration"
WHERE EmpID = 1122
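The INSERT and UPDATE statements can be tried out in SQLite from Python. This is a sketch under assumptions: the Employee table is given a Department column as the examples imply, dates are stored as text in year-month-day form instead of using # delimiters, and the existing details for Bloggs are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee
    (
        EmpID      INTEGER NOT NULL PRIMARY KEY,
        Name       TEXT NOT NULL,
        HireDate   TEXT,
        Salary     REAL,
        Department TEXT
    )
""")

# A full record with every field supplied in order (values are hypothetical).
conn.execute("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)",
             (1122, "Bloggs", "1999-06-01", 25000.0, "Sales"))

# A partial record: fields that are not listed are left empty (NULL).
conn.execute("INSERT INTO Employee (EmpID, Name, HireDate) VALUES (?, ?, ?)",
             (1125, "Cully", "2001-01-01"))

# Bloggs moves to Administration; the WHERE clause limits the change to one record.
conn.execute("UPDATE Employee SET Department = ? WHERE EmpID = ?",
             ("Administration", 1122))
conn.commit()

rows = conn.execute(
    "SELECT EmpID, Name, Salary, Department FROM Employee ORDER BY EmpID"
).fetchall()
for row in rows:
    print(row)
```

Without the WHERE clause, the UPDATE would change the Department of every record in the table.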
A-Level only
Exercises
1. A car dealer accepts orders for new vehicles from its customers, and puts in an order to the
manufacturer for the customised vehicle(s). There may be more than one vehicle on the customer
order if for example a company is replacing its fleet of hire cars. When a car arrives, a member of staff
telephones or emails the customer to inform them that it is ready for collection.
Details of the vehicles, customers and orders are to be stored in a relational database using the
following four relations:
(ii) Why is it important that the relations in a relational database are in Third Normal Form? [2]
(b) On the incomplete entity relationship diagram below show the degree of any three relationships
that exist between the entities. [3]
Vehicle CustomerOrder
Customer CustomerOrderLine
(c) Complete the following SQL statement to create the Vehicle relation, including the key field.
(d) A fault has been identified with all cars of Model 10765. The manager needs
a list of the names and telephone numbers of all the customers who have purchased this
type of car so that they can be contacted and the car recalled for modification. This list
should contain no additional details and must be presented in alphabetical order of the
names of the customers.
Capturing data
Before data is added to a database, it has to be captured or input by some means or other. Manual
methods include transcribing data from a form that has been filled in, for example by a customer ordering
items from a catalogue or a market researcher filling in forms on the High Street.
Cheques paid in at a bank are scanned using magnetic ink character recognition (MICR); the bank
number, customer account number and cheque number are printed in special magnetic ink along the
bottom of the cheque. The amount of the cheque has to be manually entered by the bank clerk.
Some forms such as lottery tickets, multiple choice questionnaires or exams may be read using optical
mark recognition (OMR), and other types of form using optical character recognition (OCR).
Other automated methods include smart card readers, scanners such as those used at airports to scan
passports and barcode readers or scanners.
Q1: Describe some other automated methods of capturing data to be stored on a database.
CHAPTER 20 – TRANSACTION PROCESSING
Exchanging data
A common method of transferring data between one computer system and another (usually via the
Internet) without the need for human intervention is EDI (Electronic Data Interchange). Using standardised
message formatting, documents can be exchanged electronically. Transaction software processes
the information and the software on the receiving end looks up details of, for example, items to be
purchased, price, buyer’s name and address etc. in an order processing system.
EDI can be used in countless different applications, such as by Exam Boards to send results to schools,
or by insurance companies to check that an applicant has a driver’s licence.
In the context of databases, a single logical operation on data is defined as a transaction. For example,
a customer booking a cinema ticket, and making an online payment using a credit card, is a single
transaction even though it involves multiple actions.
The database system has to ensure that it is not possible to complete only part of a transaction, for
example booking the cinema ticket without paying for it. ACID (Atomicity, Consistency, Isolation,
Durability) is a set of properties that guarantees that transactions are processed reliably.
Atomicity
Atomicity requires that a transaction must be processed in its entirety or not at all. Atomicity must
guarantee that in any situation, including power cuts or hard disk crashes, it is not possible to process
only part of a transaction.
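Atomicity can be illustrated with the cinema booking example using SQLite from Python. This is a sketch under assumptions: the table names, columns and amounts are hypothetical, and a failed payment is simulated by a rule that the amount must be greater than zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Screening (ScreeningID INTEGER PRIMARY KEY, SeatsSold INTEGER NOT NULL);
    CREATE TABLE Payment (
        PaymentID   INTEGER PRIMARY KEY,
        ScreeningID INTEGER NOT NULL,
        Amount      REAL NOT NULL CHECK (Amount > 0)
    );
    INSERT INTO Screening VALUES (1, 40);
""")


def book_ticket(screening_id, amount):
    """Record the seat and the payment together, or neither of them."""
    try:
        with conn:   # commits if both statements succeed, rolls back if either fails
            conn.execute(
                "UPDATE Screening SET SeatsSold = SeatsSold + 1 WHERE ScreeningID = ?",
                (screening_id,),
            )
            conn.execute(
                "INSERT INTO Payment (ScreeningID, Amount) VALUES (?, ?)",
                (screening_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def seats_sold():
    return conn.execute(
        "SELECT SeatsSold FROM Screening WHERE ScreeningID = 1"
    ).fetchone()[0]


first = book_ticket(1, 8.50)     # both parts succeed
second = book_ticket(1, 0)       # payment fails, so the seat update is undone
print(first, second, seats_sold())
```

After the failed booking the seat count is unchanged, because the seat update was rolled back along with the rejected payment.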
Consistency
The consistency property ensures that no transaction can violate any of the defined validation rules for
maintaining the integrity of the database. When a database is created, referential integrity rules will be
specified between linked tables (see Chapter 16). Thus it will not be possible, for example, to record a
mark in a RESULTS table for a student who is not in the STUDENT table in the database. Similarly, it will
not be possible to delete a record from the STUDENT table if they have marks on the RESULTS table.
Isolation
The isolation property ensures that concurrent execution of transactions leads to the same results as if
transactions were processed one after the other.
Durability
The durability property ensures that once a transaction has been committed, it will remain so, even
in the event of a power cut. For example, if the online sale of a cinema ticket is in the process of being
completed, it should not be possible for the number of seats sold to be updated but the customer’s debit
card not processed. As each part of the transaction is completed, it is held in a buffer on disk until all
elements of the transaction are completed. Only then will the changes to the database tables be made.
SECTION 4 – EXCHANGING DATA
A-Level only
Q2: What state will the record be in? (i.e. which address and credit limit will it hold?)
There are several methods which may be employed to avoid updates being lost.
Record locks
Record locking is the technique of preventing simultaneous access to objects in a database in order to
prevent updates being lost or inconsistencies in the data arising. In its simplest form, a record is locked
whenever a user retrieves it for editing or updating. Anyone else attempting to retrieve the same record is
denied access until the transaction is completed or cancelled.
Problems with record locking
If two users are attempting to update two records, a situation can arise in which neither can proceed,
known as deadlock. Suppose a bank clerk is updating Customer A’s record with a transfer to Customer
B’s account. Meanwhile a second bank clerk is trying to update Customer B’s record, as he needs to
transfer money to Customer A’s account.
User1                                  User2
locks Customer A’s record              locks Customer B’s record
tries to access Customer B’s record    tries to access Customer A’s record
waits ...                              waits ...
DEADLOCK!
The DBMS must recognise when this situation has occurred and take action. Serialisation, timestamp
ordering or commitment ordering may be used.
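One way a DBMS might recognise the situation is to record who is waiting for whom and look for a circular wait. This 'wait-for' approach is a common technique but is not one named in this chapter, and the function names below are invented for the sketch.

```python
def build_waits_for(holds, requests):
    """Map each waiting user to the user holding the record they want."""
    return {
        user: holds[record]
        for user, record in requests.items()
        if record in holds and holds[record] != user
    }

def find_deadlock(waits_for):
    """Return the users stuck in a circular wait, or an empty list."""
    for start in waits_for:
        seen = []
        user = start
        while user in waits_for and user not in seen:
            seen.append(user)
            user = waits_for[user]
        if user == start and seen:
            return seen  # the chain of waits led back to where it began
    return []

# The bank clerk example: each user holds one record and wants the other
holds = {"Customer A": "User1", "Customer B": "User2"}
requests = {"User1": "Customer B", "User2": "Customer A"}
deadlocked = find_deadlock(build_waits_for(holds, requests))
```

Once the circular wait is found, the DBMS could cancel one of the two transactions so that the other can proceed.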
Serialisation
This is a technique which ensures that transactions do not overlap in time and therefore cannot interfere
with each other or lead to updates being lost. A transaction cannot start until the previous one has
finished. It can be implemented using timestamp ordering.
CHAPTER 20 – TRANSACTION PROCESSING
A-Level only
Timestamp ordering
Whenever a transaction starts, it is given a timestamp, so that if two transactions affect the same object
(for example record or table), the transaction with the earlier timestamp should be applied first.
In order to ensure that transactions are not lost, every object in the database has a read timestamp and
a write timestamp, which are updated whenever an object in a database is read or written.
When a transaction starts, it reads the data from a record causing the read timestamp to be set. When
it writes the updated data back to the record it will check the read timestamp. If this is not the same as
the value that was saved when this transaction started, it will know that another transaction is also taking
place on the record. A range of potential problems can thus be identified and avoided.
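The check described above can be sketched in a few lines. This is a simplified model rather than the way any particular DBMS works: a counter stands in for real timestamps, and the class names Record and Transaction are assumptions.

```python
import itertools

clock = itertools.count(1)  # logical clock standing in for real timestamps

class Record:
    def __init__(self, data):
        self.data = data
        self.read_ts = 0
        self.write_ts = 0

class Transaction:
    def __init__(self, record):
        self.record = record
        self.data = dict(record.data)      # private working copy
        record.read_ts = next(clock)       # reading sets the read timestamp
        self.saved_read_ts = record.read_ts

    def commit(self):
        # A changed read timestamp means another transaction has touched the record
        if self.record.read_ts != self.saved_read_ts:
            return False                   # caller must re-read and try again
        self.record.data = self.data
        self.record.write_ts = next(clock)
        return True

record = Record({"address": "1 High St", "credit_limit": 500})
t1 = Transaction(record)                   # user 1 starts editing the address
t2 = Transaction(record)                   # user 2 starts editing the credit limit
t2.data["credit_limit"] = 1000
t2_saved = t2.commit()
t1.data["address"] = "2 Low Rd"
t1_saved = t1.commit()                     # refused: the record was read again after t1 began

retry = Transaction(record)                # user 1 re-reads the up-to-date record
retry.data["address"] = "2 Low Rd"
retry_saved = retry.commit()
```

Because the first user's stale copy is refused rather than written back, the new credit limit is not overwritten, and after the retry the record holds both changes.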
Commitment ordering
This is another serialisation technique used to ensure that transactions are not lost when two or more
users are simultaneously trying to access the same database object. Transactions are ordered in terms
of their dependencies on each other as well as the time they were initiated. It can be used to prevent
deadlock by blocking one request until another is completed.
Redundancy
Very many organisations such as banks, airport systems, hospitals, and others cannot afford to have their
computer systems go down even for a few seconds, with consequent loss of transaction data. These
organisations maintain two or even three identical systems in different geographical locations, so that
every transaction is written to two or three different storage facilities. This built-in hardware redundancy
protects against loss of data in the event of power failure or other disasters.
If one system fails, the backup system automatically takes over and processing can continue.
Exercises
1. (a) Explain how, in a client-server database with multiple users, an update made by one user
may not be recorded if the database management system does not have measures in place
to ensure the integrity of the database. [3]
(b) Explain what is meant by deadlock and how this can arise. [2]
(c) Name and describe briefly a method of preventing this from happening. [2]
(d) Describe what is meant by the ACID model in database theory. [6]
Section 5
Networks and web technologies
In this section:
Chapter 21 Structure of the Internet
CHAPTER 21 – STRUCTURE OF THE INTERNET
Q1: Give one example of where the Internet can be used without the World Wide Web.
[Figure: chart of Internet users (millions) and percentage of the world's population, 1995 to 2015]
SECTION 5 – NETWORKS AND WEB TECHNOLOGIES
Trans-continental Internet connections, TeleGeography
http://www.domainname.com/folder/subfolder/webpage.html#element
URL
[Figure: map of the five Regional Internet Registries: www.arin.net, www.ripe.net, www.apnic.net, www.lacnic.net and www.afrinic.net]
<root>
Generic TLDs: .com .edu .org
Country TLDs: .uk .fr .de
A hierarchical domain system from Top Level Domains (TLDs) to 3rd Level Domains (3LDs)
Each domain name has one or more equivalent IP addresses. The DNS catalogues all domain names
and IP addresses in a series of global directories that domain name servers can access in order to find
the correct IP address location for a resource. When a webpage is requested using the URL a user
enters, the browser requests the corresponding IP address from a local DNS. If that DNS does not
have the correct IP address, the search is extended up the hierarchy to another larger DNS database.
The IP address is located and a data request is sent by the user’s computer to that location to find the
web page data. A webpage can be accessed within a browser by entering the IP address if it is known.
Try entering 74.125.227.176 into a browser.
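The lookup process can be modelled very simply, with each DNS server represented as a dictionary of names and IP addresses. The domain names and addresses below are made up for illustration, and storing the answer in the local table afterwards is an assumption of this sketch.

```python
def resolve(name, servers):
    """Ask each DNS server in turn, starting with the local one."""
    for level, table in enumerate(servers):
        if name in table:
            ip = table[name]
            servers[0][name] = ip      # assumption: the local DNS remembers the answer
            return ip, level
    raise KeyError(f"no DNS record found for {name}")

# Hypothetical tables: a small local DNS and a larger one further up the hierarchy
local_dns = {"intranet.example": "10.0.0.5"}
larger_dns = {"www.example.org": "203.0.113.10"}
servers = [local_dns, larger_dns]

first = resolve("www.example.org", servers)    # found at level 1, the larger database
second = resolve("www.example.org", servers)   # found at level 0, the local DNS
```

The first request has to go up the hierarchy, whereas the repeat request is answered locally.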
mail.websitename.co.uk
Website domain name
Q2: Why are IP addresses not used to access websites instead of alphanumeric addresses?
130.142.37.108
The IP address indicates where a packet of data is to be sent or has been sent from. Routers can use
this address to direct the data packet accordingly. If a domain name is associated with a specific IP
address, the IP address is the address of the server that the website resides on.
[Figure: network diagram of computers, a printer and a file/print server connected through a switch]
For example, any variety of Ethernet uses a logical bus topology when components communicate,
regardless of the physical layout of the cable.
Q3: (a) What topology does your school use? It may be a hybrid of different topologies.
(b) Can you tell whether the physical layout differs from the logical layout?
Wi-Fi
Wi-Fi is a local area wireless technology that enables you to connect a device such as a PC, smartphone,
digital audio player, laptop or tablet computer to a network resource or to the Internet via a wireless
network access point (WAP). An access point has a range of about 20 metres indoors, and
more outdoors.
In 1999, the Wi-Fi Alliance was formed to establish international standards for interoperability and
backward compatibility. The Alliance consists of a group of several hundred companies around the world,
and enforces the use of standards for device connectivity and network connections.
A-Level only
Exercises
1. A Uniform Resource Locator (URL) is the address of a resource on the Internet. For example,
http://www.pgonline.co.uk/courses/alevel/computing_test.html.
2. A village hall committee is considering purchasing a lease on a web domain to set up a new
website to advertise their events. They have been advised to contact an Internet registrar.
(b) What is the primary role of an Internet Service Provider (ISP)? [1]
(b) Suggest two items of hardware that would be required to create a wireless LAN. [2]
CHAPTER 22 – INTERNET COMMUNICATION
Circuit switching
Circuit switching creates a direct link between two devices for the duration of the communication.
The public telephone system is an example of a circuit switched network. When a caller dials a
number, various switches in telephone exchanges set up a path between the caller and the recipient.
The connection is set up for the entire duration of the call including periods of silence and pauses.
This enables two people to hold a call without any delay in the delivery of speech.
If two computers use the circuit switching principle, bandwidth is wasted during the periods when no
data is being sent. The two devices must also transmit and receive data at the same rate, so circuit
switched networks can only connect computers or devices that operate at the same transfer rate.
On the other hand, since this is an exclusive connection between the two devices for the duration of the
communication, data segments (or packets) arrive in the same order that they are sent, simplifying the
process of reconstructing the message at the recipient end.
Because switches are used to connect and disconnect the circuits, electrical interference is produced
and although this is not a serious problem for speech, it may produce corrupt or lost data if the path is
being used to transmit data. If this is likely to be a problem, a leased line may be used instead.
Packet switching
Packet switching is a method of communicating packets of data across a network on which other
similar communications are happening simultaneously. Website data that you receive arrives as a series
of packets and an email will leave you in a series of packets.
Data packets
Data that is to be transmitted across a network is broken down into more manageable chunks called
packets. The size of each packet in a transmission can be fixed or variable, but most are between 500
and 1500 bytes. Each packet contains a header and a payload containing the body of data being sent.
Some packets may also use a trailer section with a checksum or Cyclical Redundancy Check (CRC)
to detect transmission errors. The sender calculates a check value from the data contained in the
packet and attaches it. In its simplest form, such a check might add up the number of 1s in the
data; a CRC uses a more robust calculation based on binary division. The check value is
recalculated for each packet upon receipt and compared with the value in the trailer to help
verify that the payload data has not changed during transmission. If the two values differ, the packet is
rejected as suspected corrupt and a new copy is requested from the sender.
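A short sketch shows the idea of attaching a check value and recalculating it on receipt. It assumes the CRC-32 function in Python's standard zlib module; real network hardware may use a different CRC, and the dictionary layout of the packet is invented for illustration.

```python
import zlib

def make_packet(payload: bytes) -> dict:
    """Attach a CRC-32 value, calculated from the payload, as a trailer."""
    return {"payload": payload, "trailer": zlib.crc32(payload)}

def is_intact(packet: dict) -> bool:
    """Recalculate the CRC on receipt and compare it with the trailer."""
    return zlib.crc32(packet["payload"]) == packet["trailer"]

def count_ones(payload: bytes) -> int:
    """A much simpler check: the total number of 1 bits in the payload."""
    return sum(bin(byte).count("1") for byte in payload)

packet = make_packet(b"HELLO")
arrived_ok = is_intact(packet)

corrupted = dict(packet, payload=b"HELLP")    # one character changed in transit
arrived_corrupt = is_intact(corrupted)

# Swapping two bytes leaves the number of 1s unchanged but alters the CRC
same_ones = count_ones(b"AB") == count_ones(b"BA")
same_crc = zlib.crc32(b"AB") == zlib.crc32(b"BA")
```

The last two lines suggest why a CRC is preferred to a simple count of 1s: reordered data has the same count but a different CRC value.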
The header (much like the box(es) of a consignment you might send or receive through the post) includes
the sender’s and the recipient’s IP addresses, the protocol being used with this type of packet and the
number of the packet in the sequence being sent, e.g. packet 1 of 8. It also includes the Time To Live
(TTL) or hop limit, after which point the data packet expires and is discarded.
[Figure: three packets, each made up of a header, a payload and a trailer]
The payload of the packet contains the actual data being sent. Upon receipt, the packets are
reassembled in the correct order and the data is extracted.
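Splitting and reassembly can be sketched as follows. The field names seq and total are assumptions standing in for the sequence information in a real header, and random.shuffle simply imitates packets arriving out of order.

```python
import random

def split_into_packets(message: bytes, size: int) -> list:
    """Break a message into numbered packets of at most `size` bytes."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"seq": n, "total": len(chunks), "payload": chunk}
        for n, chunk in enumerate(chunks, start=1)
    ]

def reassemble(packets: list) -> bytes:
    """Put the packets back in sequence order and join their payloads."""
    if len(packets) != packets[0]["total"]:
        raise ValueError("some packets are missing")
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)

message = b"Packets may arrive in any order."
packets = split_into_packets(message, size=8)
random.shuffle(packets)                 # imitate arrival in a different order
received = reassemble(packets)
```

However the packets are shuffled, sorting on the sequence number restores the original message.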
[Figure: packets 1, 2 and 3 travelling across a network of routers (nodes)]
Q2: What information is included in the packet header to enable the receiving computer to
reassemble packets in the correct order?
A-Level only
Routers
Each node in the diagram above represents a router. Routers are used to connect at least two networks,
commonly two LANs or WANs, or to connect a LAN and its ISP’s network. The act of traversing between
one router and another across a network is referred to as a hop. The job of a router is to read the
recipient’s IP address in each packet and forward it on to the recipient via the fastest and least congested
route to the next router, which will do the same until the packet reaches its destination. Routers use
routing tables to store and update the locations of other network devices and the most efficient routes
to them. A routing algorithm is used to find the optimum route. The routing algorithm used to decide
the best route can become a bottleneck in network traffic since the decision making process can be
complicated. A common shortest path algorithm used in routing is Dijkstra’s algorithm. (See Chapter 64.)
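As a brief preview, a minimal version of Dijkstra's algorithm is sketched below. The four routers and their link costs are invented, and a real router would work from its routing table rather than a hand-written dictionary.

```python
import heapq

def shortest_route(links, source, destination):
    """Return (total cost, list of routers) for the cheapest path, or None."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, router, path = heapq.heappop(queue)   # cheapest route found so far
        if router == destination:
            return cost, path
        if router in visited:
            continue
        visited.add(router)
        for neighbour, link_cost in links.get(router, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None

# Hypothetical network: each router lists its neighbours and the cost of each link
links = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}
best = shortest_route(links, "A", "D")
```

Here the cheapest route from A to D goes through C and then B, at a total cost of 8, even though it involves more hops than going through B or C alone.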
When a router is connected to the Internet, the IP address of the port connecting it must be registered
with the Internet registry because this IP address must be unique over the whole Internet.
Gateways
Routing packets from one network to another requires a router if the networks share the same protocols, for
example TCP/IP. Where these protocols differ between networks, a gateway is used rather than a router
to translate between them. All of the header data is stripped from the packet leaving only the raw data and
new data is added in the format of the new network before the gateway sends the packet on its way again.
Gateways otherwise perform a similar job to routers in moving data packets towards their destination.
Media Access Control (MAC) addresses
Every computer device, whether it’s a PC, smartphone, laptop, printer or other device which is capable
of being part of a network, must have a wired or wireless Network Interface Card (NIC). Each NIC has a
unique Media Access Control address (MAC address), which is assigned and hard-coded into the card
by the manufacturer and which uniquely identifies the device. The address is 48 bits long, and is written
as 12 hex digits, for example:
00-09-5D-E3-F7-62
You can find out the MAC address of your PC by selecting Command Prompt from the Start menu in
Windows, and then typing ipconfig /all. This will display the physical address, i.e. MAC address.
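The link between 12 hex digits and 48 bits can be checked with a short sketch. The function name parse_mac is made up, and the address used is the example given above.

```python
def parse_mac(text: str) -> int:
    """Convert a MAC address such as 00-09-5D-E3-F7-62 into a 48-bit integer."""
    digits = text.replace("-", "").replace(":", "")
    if len(digits) != 12:
        raise ValueError("a MAC address needs exactly 12 hex digits")
    return int(digits, 16)          # each hex digit represents 4 bits

mac = parse_mac("00-09-5D-E3-F7-62")
fits_in_48_bits = mac < 2 ** 48
as_hex_again = f"{mac:012X}"        # back to 12 hex digits, without separators
```

Since each hex digit represents 4 bits, 12 digits give 12 x 4 = 48 bits.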
Q3: Do some research to find out whether you can change the MAC address of your PC.
Why might someone want to do this?
• Application layer
• Transport layer
• Network layer
• Link layer
[Figure: the TCP/IP stack at Terminal A and Terminal B, linked by two routers]
“Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.”
Albert Einstein
At the receiving end, the MAC address is stripped off by the link layer, which passes the packets on
to the network layer. The IP addresses are then removed by the network layer which passes them on
to the transport layer to remove the port numbers and reassemble the packets in the correct order.
The resulting data is then passed to the application which presents the data for the user. Since routers
operate on the network layer, source and destination MAC addresses are changed at each router node.
Packets, therefore, move up and down the lower layers in the stack as they pass through each router or
switch between the client and the server as shown in Figure 22.1.
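The wrapping and unwrapping carried out by the layers can be modelled with nested dictionaries. This is a simplified sketch: the function names and all of the addresses and port numbers are invented, and real headers contain many more fields.

```python
def send(data, ports, ips, macs):
    """Wrap the data in transport, network and link layer headers in turn."""
    segment = {"src_port": ports[0], "dst_port": ports[1], "payload": data}
    packet = {"src_ip": ips[0], "dst_ip": ips[1], "payload": segment}
    frame = {"src_mac": macs[0], "dst_mac": macs[1], "payload": packet}
    return frame

def route(frame, router_mac, next_hop_mac):
    """A router keeps the IP packet but replaces the MAC addresses."""
    return {"src_mac": router_mac, "dst_mac": next_hop_mac, "payload": frame["payload"]}

def receive(frame):
    packet = frame["payload"]       # link layer strips off the MAC addresses
    segment = packet["payload"]     # network layer removes the IP addresses
    return segment["payload"]       # transport layer removes the port numbers

frame = send(
    "Hello",
    ports=(1040, 80),
    ips=("192.168.0.2", "203.0.113.10"),
    macs=("AA-AA-AA-AA-AA-01", "AA-AA-AA-AA-AA-02"),
)
hop = route(frame, "AA-AA-AA-AA-AA-02", "AA-AA-AA-AA-AA-03")
delivered = receive(hop)
```

The MAC addresses change at the router, whereas the IP addresses and port numbers inside the frame are untouched until the packet reaches its destination.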
Q4: Imagine you are sending a friend a consignment of 5000 widgets in five boxes via a shipping
agent. What information would you, the shipping agent, an intermediary depot and the delivery
drivers write on the boxes or on a cover note inside? How does this relate to the TCP/IP stack?
Q5: Georgina is trying to contact her brother Nick, who is currently travelling overseas. Georgina
has no idea where Nick is exactly and sends an email to his webmail account. Explain, with
reference to the protocols involved, how Nick is able to pick up Georgina’s message.
Exercises
1. All Internet connected devices communicate via the TCP/IP protocol stack. This has four
layers – the application, transport, network and link layers.
(a) Describe the roles of each layer when two devices are communicating over the Internet. [8]
(b) (i) Give the name of one piece of network hardware that operates on the Network layer. [1]
(ii) Give the name of one piece of network hardware that operates on the Link layer. [1]
2. Major parts of the Internet run on a packet switched network that relies on routers and
gateways to communicate.
(b) A data packet contains a header and a payload. The header contains data that is used to
route the packet to its destination.
State three data items that might be contained in a data packet’s header. [3]
(c) Explain the difference between a router and a gateway. [2]
A-Level only
• Discuss worms, Trojans and viruses and the vulnerabilities that they exploit
Firewalls
A firewall is a security checkpoint designed to prevent unauthorised access between two networks,
usually an internal trusted network and an external, deemed untrusted, network; often the Internet.
Firewalls can be implemented in hardware, software or both. A router may contain a firewall.
A typical firewall consists of a separate computer containing two Network Interface Cards (NICs), with
one connected to the internal network, and the other connected to the external network. Using special
firewall software, each data packet that attempts to pass between the two NICs is analysed against
preconfigured rules (packet filters), then accepted or rejected. A firewall may also act as a
proxy server.
Packet filtering
Packet filtering, also referred to as static filtering, controls network access according to network
administrator rules and policies by examining the source and destination IP addresses in packet headers.
If the IP addresses match those recorded on the administrator’s ‘permitted’ list, the packets are accepted. Static
filtering can also block packets based on the protocols being used and the port numbers they are trying
to access. A port is similar to an airport gate, where an incoming aircraft reaches the correct airport (the
computer or network at a particular IP address) and is directed to a particular gate to allow passengers
into the airport, or in this case to download the packet’s payload data to the computer.
[Figure: a packet travelling from a client to a server]
Packet header
Source address: 192.168.0.2:1040
Destination address: 24.120.63.37:80
Certain protocols use particular ports. Telnet, for example, is used to remotely access computers and
uses port 23. If Telnet is disallowed by a network administrator, any packets attempting to connect
through port 23 will be dropped or rejected to deny access. A dropped packet is quietly removed,
whereas a rejected packet will cause a rejection notice to be sent back to the sender.
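A static packet filter can be sketched as a function that inspects each header against the administrator's rules. The 'permitted' list, the rule table and the field names are all assumptions for illustration, as is the choice to drop packets from unknown sources and reject Telnet.

```python
PERMITTED_SOURCES = {"192.168.0.2", "192.168.0.3"}   # hypothetical 'permitted' list
BLOCKED_PORTS = {23: "reject"}                        # Telnet; could be "drop" instead

def filter_packet(header: dict) -> str:
    """Return 'accept', 'drop' or 'reject' for one packet header."""
    if header["source_ip"] not in PERMITTED_SOURCES:
        return "drop"                  # quietly removed; the sender is not told
    action = BLOCKED_PORTS.get(header["dest_port"])
    if action:
        return action                  # 'reject' means a notice goes back to the sender
    return "accept"

web_request = filter_packet({"source_ip": "192.168.0.2", "dest_port": 80})
telnet_attempt = filter_packet({"source_ip": "192.168.0.2", "dest_port": 23})
stranger = filter_packet({"source_ip": "198.51.100.7", "dest_port": 80})
```

The web request on port 80 is accepted, the Telnet attempt is rejected and the packet from an address that is not on the list is dropped.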
CHAPTER 23 – NETWORK SECURITY AND THREATS
A-Level only
Proxy servers
A proxy server intercepts all packets entering and leaving a network, hiding the true network addresses
of the source from the recipient. This enables privacy and anonymous surfing. A proxy can also maintain
a cache of websites commonly visited and return the web page data to the user immediately without the
need to reconnect to the Internet and re-request the page from the website server. This speeds up user
access to web page data and reduces web traffic. If a web page is not in the cache, then the proxy will
make a request of its own on behalf of the user to the web server using its own IP address and forward
the returned data to the user, adding the page to its cache for other users going through the same proxy
server to access. A proxy server may serve hundreds, if not thousands of users.
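The caching behaviour can be modelled with a small class. The class name CachingProxy and the stand-in function fake_web_server are invented; a real proxy would make network requests and would also need to decide when cached pages are out of date.

```python
class CachingProxy:
    def __init__(self, fetch_from_web):
        self.fetch_from_web = fetch_from_web   # function standing in for a real web request
        self.cache = {}
        self.web_requests = 0

    def get(self, url: str) -> str:
        if url not in self.cache:
            # Not cached: the proxy makes the request itself, on the user's behalf
            self.web_requests += 1
            self.cache[url] = self.fetch_from_web(url)
        return self.cache[url]

def fake_web_server(url):
    return f"<html>page for {url}</html>"

proxy = CachingProxy(fake_web_server)
page_for_user_1 = proxy.get("http://www.example.org/")
page_for_user_2 = proxy.get("http://www.example.org/")   # served from the cache
```

Both users receive the same page, but only one request is made to the web server.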
[Figure: a proxy server with a cache forwarding a client’s request to a web server]
Packet header
Source address: 72.214.61.117
Destination address: 24.120.63.37
Proxy servers are often used to filter requests providing administrative control over the content that users
may demand. A common example is a school web-proxy that filters undesirable or potentially unsafe
online content in accordance with the school usage policies. Such proxies may also log user data with
their requests.
Encryption
Encryption is one way of making messages travelling over the Internet secure. Different encryption
methods are covered in Section 4, Chapter 15.
A-Level only
The Cascade virus caused text characters to fall from the top of the screen
A worm can reside within a data file of another application and will usually enter the computer through a
vulnerability or by tricking the user into opening a file; often an attachment in an email. Rather than simply
infecting other files like a virus on your own machine, a worm can replicate itself and send copies to other
users from your computer; commonly by emailing others in your electronic address book.
Owing to the ability of a worm to copy itself, worms are often responsible for using up bandwidth, system
memory or network resources, causing computers to slow and servers to stop responding.
Q1: Look up the ILOVEYOU, Melissa, Blaster and Cascade viruses or worms. Why should you
exercise caution in opening attachments in emails or data files containing macro code?
Trojans
A Trojan is so-called after the story of the great horse of Troy, according to which soldiers hid inside a
large wooden horse offered as a gift to an opposition castle. The castle guards wheeled the wooden
horse inside their castle walls, and the enemy soldiers jumped out from inside the horse to attack. A
Trojan is every bit as cunning and frequently manifests itself inside a seemingly useful file, game or
utility that you want to install on your computer. When installed, the payload is released, often without
any obvious irritation. A common use for a Trojan is to open a back door to your computer system
that the Trojan creator can exploit. This can be in order to harvest your personal information, or to use
your computer power and network bandwidth to send thousands of spam emails to others. Groups of
Internet-enabled computers used like this are called botnets. Unlike viruses and worms, Trojans cannot
self-replicate.
Giovanni Domenico Tiepolo - The Procession of the Trojan Horse in Troy, c.1760
A-Level only
System vulnerabilities
Malware exploits vulnerabilities in our systems, be they human error or software bugs. People may
switch off their firewalls or fail to renew virus protection which will create obvious weaknesses in their
systems. Poorly configured administrative rights can also fail to prevent access to certain file areas, which
may then be breached by malware. Weaknesses can also arise where data is passed from one function,
module or application to another, and is assumed by the receiver to have been checked and trusted
by the source; these gaps may open opportunities for attackers.
People are often the weakest point in security. Passwords are no guarantee of protection against
unauthorised access since these are sometimes written down, guessed or dishonestly ‘blagged’ using
social engineering techniques to persuade the password holder to divulge their
authentication credentials.
Q2: In what ways are social engineering methods used to fraudulently elicit a password or
authentication details from a user?
Regular operating system and antivirus software updates will also help to reduce the risk of attack.
Virus checkers usually scan for all other malware types and not just viruses, and since new variants are
created all the time to exploit vulnerabilities in systems software, it is vital that your system has the latest
protection. In the worst cases, a lack of monitoring and protection within a company can make
national headlines.
Exercises
1. A large company network uses a firewall as part of its security.
(b) The company also uses anti-virus software as protection against worms, viruses and Trojans.
(i) Give one reason why the anti-virus software should be kept up-to-date. [1]
(ii) State the difference between worms, viruses and Trojans. [3]
2. Malicious attacks on systems are frequently identified and blocked by various systems.
(a) How might a proxy server reduce the risks of malware attacks on a network? [1]
(b) Give three methods that school systems administrators can use to reduce the threat of malware. [3]
(c) Explain how the use of a proxy server may make access to websites faster for users. [2]
HTML Tags
HTML is made up of tags written in angle brackets, often in opening and closing pairs,
e.g. <html> and </html>.
A standard web page comprises two sections – a head and a body. The head contains the title of the
webpage that may appear in a window header or browser tab, and any script that may enrich your page
content. The body contains the main content of the page, defining text, images and hyperlinks. An HTML
file can be created using a text editor such as Notepad, or using software such as Adobe Dreamweaver.
CHAPTER 24 – HTML AND CSS
Q1: Create a simple HTML file in Notepad as shown above. Change the <title> text and add text
to the <body> section. What effect does this have when viewed in a browser?
CSS Script
CSS (Cascading Style Sheets) is a style sheet language, used alongside HTML, that describes the layout and styles of a web page.
Styles can be applied to existing HTML elements such as <h1>, <p> or <div>.
The styles for the id selector called page are listed within curly brackets within the CSS document:
#page{max-width:800px; margin: 20px auto; padding: 30px;
background-color: #cc6633;}
(Refer to line 8 of the HTML script on the next page, and lines 13-19 of the CSS script overleaf.) Any
HTML content within the page divider will be styled accordingly.
Identifiers
Identifiers are defined with a hash symbol (#) preceding the id name, e.g. #header (CSS lines 21-26).
An identifier must be unique within a webpage. In this ‘Germ theory’ example, #header is a good example
of a unique element since a webpage is likely only to contain one header.
Classes
Classes work in a similar way to an identifier but use a full stop as a prefix to the class name e.g. .list
(CSS Script lines 35-38). Classes can be used multiple times on a webpage. In the example within this
chapter, there are two lists which share common formatting unique to the list element such as the font
colour. This can be defined in the CSS and applied to all list <div> regions on the page. See HTML
Script lines 22 and 32.
[Figure: annotated page layout showing the regions <div id="header">, <div id="page">, <div id="left-column">, <div id="right-column"> and <div class="list">, the headings <h1>, <h2> and <h3>, and the image style img {border: double 10px white;}]
HTML Script
Q2: Explain the function of the HTML code on lines 14 and 16.
Q3: The webpage owners would like to change the font colour of the numbered and bulleted text to
white. Explain what change needs to be made in order to achieve this.
CSS script
Exercises
1. A website has the following HTML code.
<html>
<head>
<title>Garden Roses</title>
</head>
<body>
<h1 style="font-family:Arial; color:red">Species</h1>
<p>There are over 100 species of rose.</p>
<!-- Part b -->
<ul>
<li>Climbing roses</li>
<li>Shrub roses</li>
<li>Rambling roses</li>
</ul>
</body>
</html>
(a) Sketch and annotate the website as it would appear in a browser. [4]
(b) The site owner would like to add a hyperlinked image rose.jpg in place of the comment
<!-- Part b -->. The image would link to the website http://www.roses.com.
Write the code to enable this. [3]
(c) Heading 1 <h1> has had some styles applied using inline CSS.
(i) Give one advantage of using CSS styles within the HTML document. [1]
(ii) Give two advantages of using an external CSS style sheet. [2]
(d) An external CSS style sheet is added to the web page. This contains three rules. Describe
what effect if any these rules will have on the appearance of the web page. Where there is no
effect, this should be stated.
(e) The text within the <ul> tags needs to be styled in green with the intention that any other lists
added to the page share the same style. Explain how this can be achieved. [3]
(a) Explain the difference between them giving an example of when each might be used. [2]
(b) Explain how a CSS style defined as a class or identifier may be applied to a specific section of
HTML content. [2]
Web forms
Web forms enable websites to collect user input data and selections. Input types include textboxes,
check boxes and radio buttons, for example.
Input can be validated and submitted to the website owner’s database or processed as part of a search
query to find, for example, train times or your nearest shop branch when you enter a postcode.
CHAPTER 25 – WEB FORMS AND JAVASCRIPT
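The HTML script below should be compared with the screenshot of the page below.
<h1>Register</h1>
<form action="process.php" method="post">
  <!-- the for attribute ties the label to the input with the matching id -->
  <label for="email">Enter your email to register:</label>
  <!-- a name attribute is needed for the value to be sent with the form -->
  <input type="text" id="email" name="email" value="" size="40" />
  <button type="submit">Submit</button>
  <button type="reset">Reset</button>
</form>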
Q1: The buttons are created in the browser window using built-in formatting. How might
customised styling be applied?
JavaScript
JavaScript is a scripting language that uses all of the same programming constructs that are familiar in
languages such as Python and VB. It should not be confused with the language Java. JavaScript is
commonly used to add interactivity to websites, including the manipulation of page objects, animations,
navigation tools and form validation. JavaScript is interpreted rather than compiled. Compilers produce
object code which is specific to a particular type of processor. JavaScript needs to be translated
into the object code for whichever computer the browser is running on, and will be translated by the
interpreter when the web page is displayed. An interpreter in the browser reads the JavaScript code,
interprets each line and runs it. Some of the latest browsers however, use ‘Just-In-Time’ compilation
which compiles JavaScript into executable bytecode just before execution.
Input
JavaScript can be used to process input data on the client’s computer. This may change the local page
interactively or post data to a server. The advantages of processing input data before it is posted to a
server are that:
• the local computer can detect erroneous data before it is submitted to a database
• a busy server is relieved from having to process everything itself.
Output
JavaScript can reference and interact with HTML elements to edit, style or move them. For example, a
validation script may change a ‘postcode’ input label to become red if a user has entered invalid data:
document.getElementById("postcode").style.color="red";
The HTML form elements are given ids in order for the JavaScript to reference them. (See lines 16-19
of the HTML form script below.) Buttons are given onClick attributes in order to execute JavaScript
functions when they are pressed. Their type has also been changed to become "button" rather than
submit or reset actions. (See lines 20-21.)
JavaScript code
JavaScript functions and commands are added to HTML documents within <script> tags.
JavaScript output
JavaScript commands can access and edit HTML elements outside of the <script> tags, and write
directly to the web page document using the command document.write("Hello World"); for
example. The attribute .innerHTML of an HTML element can be edited directly. (See line 28 above.)
Another method is to cause the browser to display a pop-up alert box with a custom message requiring
the user’s attention. Line 54 displays an alert box once the user has submitted valid details.
Q4: What are the identifiers of two variables used in the JavaScript code?
Q5: The function setupForm is called and executed using the command setupForm(); on line 63.
(a) What is the purpose of the function?
(b) Looking at the HTML form script, when else is this function used?
Validation
Validation routines are commonly built in to webpages using JavaScript since the script is executed
locally on the client’s machine. The function validateForm() checks the user input and either
changes form labels and styling in response to an invalid entry, or displays the alert box above.
Arrays in JavaScript
JavaScript arrays can hold any type of data. In this example there are two arrays – one to hold a set of
three captcha images and the other to hold the answers to each of them.
var captcha=["captcha1.jpg","captcha2.jpg","captcha3.jpg"];
var captchaAnswer=["weasel","moose","polecat"];
On line 35, a math function generates a random number between 1 and the length of the array (i.e. 3
in this case), and assigns it to variable j. JavaScript array indexes begin at 0, so 1 is subtracted from j
using the shorthand command j-- to decrement j by 1 in order to reference array indices 0-2.
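The index calculation described above might be sketched as follows. This is an illustration only: the function name pickIndex is an assumption and is not taken from the form script, and Math.floor with Math.random is one common way of producing a whole number in a given range.

```javascript
var captcha = ["captcha1.jpg", "captcha2.jpg", "captcha3.jpg"];
var captchaAnswer = ["weasel", "moose", "polecat"];

// Hypothetical helper: returns a random valid index for an array of the given length
function pickIndex(length) {
    // Math.random() returns a value from 0 up to (but not including) 1,
    // so this gives a whole number between 1 and length
    var j = Math.floor(Math.random() * length) + 1;
    j--;    // arrays are indexed from 0, so move the range to 0..length-1
    return j;
}

var j = pickIndex(captcha.length);
console.log(captcha[j] + " has the answer " + captchaAnswer[j]);
```

Because the same index j is used for both arrays, the image and its answer always stay paired.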
Exercises
1. A website contains JavaScript code.
(a) Describe what is meant by the term JavaScript. [2]
(b) Explain why JavaScript is usually interpreted rather than compiled. [2]
2. The website www.postrates.com offers a rate check service for sending letters and parcels.
The homepage contains a button hyperlinked to the following webpage:
A-Level only
Search engines
Search engines such as Google are systems that locate resources on the Internet. These resources could
be web pages, documents, images or other files.
Q1: What resources and file types might be accessed via your school website?
Q2: If you create a new website, why might it not appear in search results immediately?
Meta tags and descriptions are a list of keywords or concise phrases specified by the website owner
that are built into each webpage. Descriptions are displayed with the page title in search results as
shown above. These can be defined in the HTML documents within the <head> section to help searches.
CHAPTER 26 – SEARCH ENGINE INDEXING
Search results
There are believed to be over 200 factors affecting search results that may help position your own
website nearer the top of the results list. Other than metatags and descriptions, these include:
• using keywords in the <title> tag
• the age of your website and date of last update (or frequency of updates)
• the number and relevancy of keywords appearing in <h1> tags and
• the relevancy of the domain name to the content
[Figure: link diagram connecting web pages A, B, C, D and E]
Using PageRank, B has a higher page rank than C because it is a more authoritative source.
By 2015, Google was processing 40,000 search queries every second, worldwide. David Vise, the author
of The Google Story noted that “Not since Gutenberg* … has any new invention empowered individuals,
and transformed access to information, as profoundly as Google.”
Q3: Page X has outbound links to ten well-regarded and high-ranking websites. Page Y has 10 inbound
links from the same ten websites. Which is more likely to have a higher page rank and why?
Calculating PageRank
PageRank is effectively a popularity contest between websites defined by the number of votes or inbound
links they receive, with a weighting to give more importance to some votes than others. This weighting
is swayed by either the number of outbound links a site has or the importance (or PageRank) of a site.
A website with a good reputation and high PageRank will have a higher weighting assigned to its ‘votes’
but its total vote is shared or diluted amongst all of the sites it links to.
The PageRank algorithm itself is defined as:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
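where:
• PR(A) is the PageRank of page A
• T1 ... Tn are the pages that link to page A, and PR(Ti) is the PageRank of page Ti
• C(Ti) is the total number of outbound links on page Ti
• d is the damping factor, commonly set to 0.85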
Q4: Google’s PageRank algorithm uses a damping factor. What is the purpose of the damping factor?
Example 1
In this simplest of examples with a hypothetical world wide web consisting of just two web pages,
pages A and B would have equal ranking if there is one inbound and one outbound link between them.
[Figure: pages A and B, each linking to the other]
This can be calculated using the PageRank algorithm to give an equal ranking of 1:
d = 0.85
PR(A) = (1 – d) + d(PR(B)/1) PR(A) = 0.15 + 0.85 * 1 = 1
PR(B) = (1 – d) + d(PR(A)/1) PR(B) = 0.15 + 0.85 * 1 = 1
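The calculation can also be carried out iteratively, starting from a guess and reapplying the formula until the values settle. The sketch below assumes a simple representation that is not part of the original text: links is an object mapping each page to the list of pages it links to, and the function name pageRank is hypothetical. It also assumes every page has at least one outbound link.

```javascript
// links maps each page to the pages it links to (its outbound links)
function pageRank(links, d, iterations) {
    var pages = Object.keys(links);
    var pr = {};
    pages.forEach(function (p) { pr[p] = 0; });   // arbitrary starting guess

    for (var i = 0; i < iterations; i++) {
        var next = {};
        pages.forEach(function (a) {
            var sum = 0;
            pages.forEach(function (t) {
                // Page t 'votes' for page a if it links to a; its vote is
                // shared between all C(t) of its outbound links
                if (links[t].indexOf(a) !== -1) {
                    sum += pr[t] / links[t].length;
                }
            });
            next[a] = (1 - d) + d * sum;
        });
        pr = next;   // use the new values in the next pass
    }
    return pr;
}

// Example 1: A links to B and B links to A
var ranks = pageRank({ A: ["B"], B: ["A"] }, 0.85, 100);
console.log(ranks.A.toFixed(3), ranks.B.toFixed(3));
```

Even though the starting guess is 0, both values settle at 1 after repeated passes, which agrees with the result of Example 1.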
Example 2 shows the iterative process used to calculate and recalculate the PageRank (PR) of a group of
webpages where the starting point is unknown.
Example 2
As the number of web pages grows, more complex link structures are created. After the addition of
one extra web page, the PageRank is recalculated and adjusted to reflect the new pages and links.
[Figure: link diagram connecting web pages A, B, C and D]
Q5: What factors may result in a web page A’s rank rising or falling over time as it is revised?
Exercises
1. The owner of website www.inflatablecastle.com is trying to improve the positioning of his homepage
inflatablecastle.com/index.html in search engine listings.
(a) Other than PageRank, give three design factors that may affect the company homepage’s
positioning in search results. [3]
(b) With reference to the diagram below, explain which page is likely to have the highest
PageRank. You are not expected to perform any calculations. [2]
[Figure: link diagram connecting inflatablecastle.com and pages A, B and C]
(c) Looking at the algorithm, what factors directly influence the PageRank of the homepage
index.html at inflatablecastle.com? [2]
(d) PageRank uses a damping factor d in its algorithm. Explain the purpose of d. [2]
2. Search engines provide a listing of all web pages with content relevant to a set of search terms.
(b) With reference to the screenshot below, state which line of code contains metatags. [1]
CHAPTER 27 – CLIENT-SERVER AND PEER-TO-PEER
A-Level only
Client-server networking
In a client-server network, one or more computers known as clients are connected to a powerful
central computer known as the server. Each client may hold some of its own files and resources such as
software, and can also access resources held by the server. In a large network, there may be several
servers, each performing a different task.
[Figure: client-server network diagram. Labels: Computer, Computer, Laptop, Printer, Switch]
• File server holds and manages data for all the clients
• Print server manages print requests
• Web server manages requests to access the Web
• Mail server manages the email system
• Database server manages database applications
In a client-server network, the client makes a request to the server which then processes the request.
Cloud computing
Cloud computing refers to a growing service-based industry providing access to software or files via the
Internet using the client-server model. File storage companies such as DropBox, OneDrive or Google
Drive offer file storage facilities where users’ files are kept on remote servers. Other companies offer
software via the cloud, a provision known as Software as a Service (SaaS). Microsoft, for example, offers
cloud-based Office applications. Accounting packages are also available through website logins where all
the company data and application are stored offsite.
Q1: Cloud-based storage facilities such as DropBox and Google Drive store files for users. What are
the advantages of using cloud-based services? What happens when a user requests a file?
Peer-to-peer networks
In a peer-to-peer network, there is no central server. Individual computers are connected to each other,
either locally or over a wide area network so that they can share files. In a small local area network, such
as in a home or small office, a peer-to-peer network is a good choice because:
• it is cheap to set up
• it enables users to share resources such as a printer or router
• it is not difficult to maintain
Peer-to-peer networks are also used by companies providing, for example, video on demand. A problem
arises when thousands of people simultaneously want to download the latest episode of a particular TV
show. Using a peer-to-peer network, hundreds of computers can be used to hold parts of the video and
so share the load. This is the main principle behind dozens of torrent websites that enable the sharing of
files, often containing copyright material.
[Figure: Napster network diagram. Labels: Napster clients, your computer, song request, central index server, Internet, file transfer]
Q2: Look up the Copyright, Designs and Patents Act 1988. What types of work are protected by
this Act? For what period of time is a work protected by copyright?
Web servers
A client will send a request message to a server which should respond with the data requested or a
suitable message otherwise.
[Figure: two HTTP clients each send a request to an HTTP server and receive a response]
This is commonly seen when a client browser sends an HTTP request to a web server for dynamic web
page data or a web resource, or when using a web page with an online search facility such as checking
availability via a booking form.
The page data is sent back from the HTTP server by way of response and the browser renders the web
page on the client’s computer.
Client-side processing
Client-side processing describes situations where data is processed on the client computer, rather than
on the server. This may happen because the client computer has specific software that can process the
information, or to lighten the load on the server’s processor. Processing data on the client-side can also
improve security as it can avoid unnecessary data transfer. JavaScript is a client-side language and is
frequently used to provide interactivity on a web page. Client-side processing can also adjust styles for
different platforms or screen sizes.
JavaScript validation
JavaScript is commonly used for processing data on the client side to validate data entry before it is
sent to the server.
<script>
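// Returns false to cancel submission when the field is empty
function validate() {
    // The "arrival" field of the form named "departure"
    var airport = document.forms["departure"]["arrival"];
    if (airport.value == "") {
        airport.style.borderColor = "red";   // highlight the empty field
        alert("Departure and arrival airports cannot be left blank.");
        return false;
    }
    return true;   // the field has a value, so the form may be sent
}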
</script>
Q3: What are the advantages of validating data on the client side before it is sent to the server?
Server-side processing
Servers often process an enormous volume of data on behalf of multiple clients. They can also process
the data much faster than a client computer. There are specific languages that are used for server-side
processing such as SQL or PHP. Search requests (e.g. for a search engine or a company database) may
be sent to the server where they may be applied to a database using SQL. Database search results are
then sent back to the client browser. Validation may also be carried out on the server where an
entry must be compared with data already on a server database. Examples may include checking user
credentials, or looking up valid airport locations. JavaScript may also be circumvented maliciously so
server-side validation is important for the integrity of server data.
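A look-up check of this kind might be sketched as follows. The example is written in JavaScript only for consistency with the other listings in this chapter, since the text names SQL and PHP as typical server-side languages. The array validAirports stands in for data held in a server database, and both it and the function name validateAirport are assumptions made for illustration.

```javascript
// Hypothetical list standing in for airport codes held in a server database
var validAirports = ["LHR", "LGW", "MAN", "EDI"];

// Hypothetical server-side check: the server cannot assume that the
// client-side script actually ran, so it repeats the checks itself
function validateAirport(input) {
    if (typeof input !== "string") {
        return false;
    }
    var code = input.trim().toUpperCase();
    if (code === "") {
        return false;                            // presence check
    }
    return validAirports.indexOf(code) !== -1;   // look-up check against stored data
}

console.log(validateAirport("lhr"));   // true
console.log(validateAirport(""));      // false
console.log(validateAirport("XXX"));   // false
```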
Server-side validation
Q4: How might you design a mobile GPS navigation app in order to optimise its use, given the
advantages and disadvantages of thin- and thick-client systems?
Thin-client
Advantages:
• Easy to set up, maintain and add terminals to a network with little installation required locally.
• Software and updates can be installed on the server and automatically distributed to each client terminal.
• More secure since data is all kept centrally in one place.
Disadvantages:
• Reliant on the server, so if the server goes down, the terminals lose functionality.
• Requires a very powerful and reliable server, which is expensive.
• Server demand and bandwidth increased.
• Maintaining network connections for portable devices consumes more battery power than local data processing.
Thick-client
Advantages:
• Robust and reliable, providing greater up-time.
Disadvantages:
• More expensive, higher specification client computers required.
Exercises
1. Explain the difference between client-server and peer-to-peer networking, and give an example of
where each might be used. [6]
2. A travel agency is planning to install a new computer system based on the client-server model, for
its agents to use for flight and hotel bookings and enquiries at multiple workstations.
After some consideration, the company has decided to use a thin-client network.
(c) How would the decision to use a thin- rather than thick-client network affect the choice
of hardware? [2]
3. A company is designing a website which will allow its customers to place orders online.
The individual web pages that describe each product will be generated dynamically using
server-side scripting.
4. A website is set up to enable users to access on-demand television programs. Users can sign
up to the website and download a recent series episode or film. Programs are downloaded and
stored on the user’s device. When others choose to download the same program, parts of the
program data may come from multiple devices belonging to other users.
(b) (i) Give one advantage to the company of this model. [1]
(c) JavaScript is used to validate that the user’s email address is in a valid format when a
booking is made.
(ii) Client- and server-side validation should happen in partnership. Explain why it is
important to validate the email address again once it reaches the server. [1]
Section 6
Data types
In this section:
Chapter 28 Primitive data types, binary and hexadecimal
CHAPTER 28 – PRIMITIVE DATA TYPES, BINARY AND HEXADECIMAL
Chapter 28
Primitive data types, binary and hexadecimal
Objectives
• List and define primitive data types
• Represent positive integers in binary and hexadecimal
• Convert between binary, hexadecimal and denary
Number bases
Our familiar decimal (or denary) number system uses the digits 0 through 9 and therefore has a
base of 10. Binary uses only the digits 0 and 1 and has a base of 2. Hexadecimal uses a base
of 16 with digits 0-9 and letters A to F. A number’s base can be written as a subscript to denote
its value in the correct number system. For example, 11₁₀ denotes the number eleven in denary. 11₂ would
denote a binary value (with a denary equivalent of three) and 11₁₆ would denote a hexadecimal value
(17 in denary).
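The same digits interpreted in different bases can be checked quickly in JavaScript. This short sketch uses the built-in parseInt function, which takes the base as its second argument, and the toString method of a number, which accepts the base to write out in. The variable name n is simply illustrative.

```javascript
// parseInt(text, base) reads a string of digits written in the given base
console.log(parseInt("11", 10));   // 11
console.log(parseInt("11", 2));    // 3
console.log(parseInt("11", 16));   // 17

// toString(base) writes a number out in another base
var n = 203;
console.log(n.toString(2));                  // 11001011
console.log(n.toString(16).toUpperCase());   // CB
```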
SECTION 6 – DATA TYPES
The principle is exactly the same in the binary number system. As we move from right to left, each digit is
worth twice as much as the previous one, instead of ten times as much.
128 64 32 16 8 4 2 1
1 1 0 0 1 0 1 1
128 + 64 + 8 + 2 + 1 = 203
The minimum and maximum values that can be represented in n bits using unsigned binary are 0 and
2ⁿ – 1 respectively.
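The place-value method can be written out step by step, as in the sketch below. The function names binaryToDenary and maxUnsigned are assumptions chosen for this illustration.

```javascript
// Adds up the place value of every column that contains a 1
function binaryToDenary(bits) {
    var total = 0;
    var placeValue = 1;    // the rightmost column is worth 1
    for (var i = bits.length - 1; i >= 0; i--) {
        if (bits.charAt(i) === "1") {
            total += placeValue;
        }
        placeValue *= 2;   // each column to the left is worth twice as much
    }
    return total;
}

// Largest unsigned value that fits in n bits
function maxUnsigned(n) {
    return Math.pow(2, n) - 1;
}

console.log(binaryToDenary("11001011"));   // 203
console.log(maxUnsigned(4));               // 15
```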
Q1: Convert the binary numbers 0011 1001 and 1111 1111 into denary.
128 64 32 16 8 4 2 1
0 1 0 0 1 0 0 1 = 1001
Hex colour code: 364DB2
Exercises
1. A school keeps data about each of its pupils. State the most suitable data type for each of the
following data items:
Pupil’s surname
A single letter indicating whether they are male or female
The amount owed for school trips
The number of school trips they have participated in
Whether or not the pupil is entitled to free school meals [5]
3. How many different denary numbers can be represented using 8-bit binary? [1]
5. Why are bit patterns often displayed using hexadecimal instead of binary? [2]
1 0 1 0 0 1 1 1
Figure 1
What is the denary equivalent of the contents of this memory location if it represents an
unsigned binary integer? [1]
7. What is the hexadecimal equivalent of the binary pattern shown in Figure 1? [1]
CHAPTER 29 – ASCII AND UNICODE
The number of values that can be represented with n bits is 2ⁿ. Two bits can represent 4 different values:
00, 01, 10 and 11. Three bits can represent 8 values and four bits can represent 16 different values,
since 2 x 2 x 2 x 2 = 16.
Unit nomenclature
Although we frequently refer to 1024 bytes as a kilobyte, it is in fact a kibibyte. To avoid any confusion
between references to 1024 bytes rather than 1000 bytes, an international collaboration between
standards organisations decided in 1996 that kibi would represent 1024, and kilo would represent 1000.
Kibi is a combination of the words kilo and binary. The same is true of the other familiar names Mega,
Giga and Tera being replaced by mebi, gibi and tebi. The table below outlines the nomenclature for
increasing quantities of bytes, in which a KiB is a kibibyte and a MiB, a mebibyte.
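The difference between the binary and decimal prefixes can be seen by calculating the values directly. In this sketch the constant names and the helper describeBytes are assumptions introduced for illustration.

```javascript
var KIBIBYTE = Math.pow(2, 10);   // 1024 bytes
var MEBIBYTE = Math.pow(2, 20);   // 1,048,576 bytes
var KILOBYTE = Math.pow(10, 3);   // 1000 bytes
var MEGABYTE = Math.pow(10, 6);   // 1,000,000 bytes

// Hypothetical helper: expresses a byte count in both KiB and kB
function describeBytes(bytes) {
    return (bytes / KIBIBYTE) + " KiB or " + (bytes / KILOBYTE) + " kB";
}

console.log(describeBytes(4096));    // 4 KiB or 4.096 kB
console.log(MEBIBYTE - MEGABYTE);    // 48576
```

The gap between the two systems widens with each prefix: a mebibyte is 48,576 bytes larger than a megabyte.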
ASCII DEC Binary ASCII DEC Binary ASCII DEC Binary ASCII DEC Binary
NULL 000 000 0000 space 032 010 0000 @ 064 100 0000 ` 096 110 0000
SOH 001 000 0001 ! 033 010 0001 A 065 100 0001 a 097 110 0001
STX 002 000 0010 " 034 010 0010 B 066 100 0010 b 098 110 0010
ETX 003 000 0011 # 035 010 0011 C 067 100 0011 c 099 110 0011
EOT 004 000 0100 $ 036 010 0100 D 068 100 0100 d 100 110 0100
ENQ 005 000 0101 % 037 010 0101 E 069 100 0101 e 101 110 0101
ACK 006 000 0110 & 038 010 0110 F 070 100 0110 f 102 110 0110
BEL 007 000 0111 ' 039 010 0111 G 071 100 0111 g 103 110 0111
BS 008 000 1000 ( 040 010 1000 H 072 100 1000 h 104 110 1000
HT 009 000 1001 ) 041 010 1001 I 073 100 1001 i 105 110 1001
LF 010 000 1010 * 042 010 1010 J 074 100 1010 j 106 110 1010
VT 011 000 1011 + 043 010 1011 K 075 100 1011 k 107 110 1011
FF 012 000 1100 , 044 010 1100 L 076 100 1100 l 108 110 1100
CR 013 000 1101 - 045 010 1101 M 077 100 1101 m 109 110 1101
SO 014 000 1110 . 046 010 1110 N 078 100 1110 n 110 110 1110
SI 015 000 1111 / 047 010 1111 O 079 100 1111 o 111 110 1111
DLE 016 001 0000 0 048 011 0000 P 080 101 0000 p 112 111 0000
DC1 017 001 0001 1 049 011 0001 Q 081 101 0001 q 113 111 0001
DC2 018 001 0010 2 050 011 0010 R 082 101 0010 r 114 111 0010
DC3 019 001 0011 3 051 011 0011 S 083 101 0011 s 115 111 0011
DC4 020 001 0100 4 052 011 0100 T 084 101 0100 t 116 111 0100
NAK 021 001 0101 5 053 011 0101 U 085 101 0101 u 117 111 0101
SYN 022 001 0110 6 054 011 0110 V 086 101 0110 v 118 111 0110
ETB 023 001 0111 7 055 011 0111 W 087 101 0111 w 119 111 0111
CAN 024 001 1000 8 056 011 1000 X 088 101 1000 x 120 111 1000
EM 025 001 1001 9 057 011 1001 Y 089 101 1001 y 121 111 1001
SUB 026 001 1010 : 058 011 1010 Z 090 101 1010 z 122 111 1010
ESC 027 001 1011 ; 059 011 1011 [ 091 101 1011 { 123 111 1011
FS 028 001 1100 < 060 011 1100 \ 092 101 1100 | 124 111 1100
GS 029 001 1101 = 061 011 1101 ] 093 101 1101 } 125 111 1101
RS 030 001 1110 > 062 011 1110 ^ 094 101 1110 ~ 126 111 1110
US 031 001 1111 ? 063 011 1111 _ 095 101 1111 DEL 127 111 1111
Unicode
By the 1980s, several coding systems had been introduced all over the world that were all incompatible
with one another. This created difficulty as multilingual data was being increasingly used and a new,
unified format was sought. As a result, a new 16-bit code called Unicode (UTF-16) was introduced.
This allowed for 65,536 different combinations and could therefore represent alphabets from dozens of
languages including Latin, Greek, Arabic and Cyrillic alphabets. The first 128 codes were the same as
ASCII so compatibility was retained. A further version of Unicode called UTF-32 was also developed to
include just over a million characters, and this was more than enough to handle most of the characters
from all languages, including Chinese and Japanese.
This meant that whilst there is now just one globally recognised system to maintain, one character in this
scheme uses four bytes instead of two, significantly increasing file sizes and data transmission times.
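Character codes can be inspected directly in JavaScript, whose strings are held as 16-bit code units. The sketch below uses the built-in charCodeAt and String.fromCharCode. The helper name toSevenBitBinary is an assumption, and the Greek letter is written as an escape sequence purely to keep the listing in plain ASCII.

```javascript
// Returns the 7-bit binary pattern for an ASCII character
function toSevenBitBinary(ch) {
    var code = ch.charCodeAt(0);                // numeric code of the character
    return code.toString(2).padStart(7, "0");   // pad with leading zeros to 7 bits
}

console.log("A".charCodeAt(0));          // 65
console.log(toSevenBitBinary("A"));      // 1000001
console.log(String.fromCharCode(97));    // a

// A character outside the ASCII range, Greek capital omega, has a code above 127
console.log("\u03A9".charCodeAt(0));     // 937
```

The first three results match the ASCII table, showing that the first 128 codes are shared.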
Q3: Using the Unicode UTF-16 system, how much memory would be used for the word ‘Mouse’?
Exercises
1. The ASCII system uses 7 bits to represent a character. The ASCII code in denary for the numeric
character '0' is 48; other numeric characters follow on from this in sequence.
(a) Using 7 bits, what is the ASCII code for the character '2' in binary? [1]
(b) How many different characters can be represented using ASCII? [1]
2. One character encoding scheme is Unicode. An alternative character encoding scheme is ASCII.
3. How many times greater is the storage capacity of a 1 terabyte hard disk drive than that of a
256 megabyte hard disk drive?
Show each stage of your working. [2]
Binary addition
Binary addition works in a similar way to denary addition. If two numbers added together are equal to or
greater than the base value, (in the case of denary, 10) then the ‘tens’ are carried. In binary, an addition
that equals 2 or more results in a carry over to the next column.
In binary, the rules for addition are as follows:
1. 0 + 0 = 0
2. 0 + 1 = 1
3. 1 + 0 = 1
4. 1 + 1 = 0 Carry 1 (This is 2 in denary or 10 in binary.)
5. 1 + 1 + 1 = 1 Carry 1 (This is 3 in denary or 11 in binary.)
Use the following worked example as a guide to where and how each of the rules is implemented.
Carry         1 1
                1 1 0 0      12
            +   1 1 1 0      14
            = 1 1 0 1 0      26
Reading the result from left to right, the columns show: Carry bit, Rule 5, Rule 4, Rule 2 or 3, Rule 1.
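The addition rules can be applied column by column in code. The following is a minimal sketch in which the function name addBinary is an assumption; it works on strings of 0s and 1s so that each column is visible.

```javascript
// Adds two binary strings column by column, right to left, carrying as needed
function addBinary(a, b) {
    var width = Math.max(a.length, b.length);
    a = a.padStart(width, "0");
    b = b.padStart(width, "0");

    var result = "";
    var carry = 0;
    for (var i = width - 1; i >= 0; i--) {
        var sum = Number(a.charAt(i)) + Number(b.charAt(i)) + carry;   // 0, 1, 2 or 3
        result = (sum % 2) + result;    // the bit written in this column
        carry = Math.floor(sum / 2);    // 1 if the column total was 2 or 3
    }
    if (carry === 1) {
        result = "1" + result;          // a final carry becomes an extra leading bit
    }
    return result;
}

console.log(addBinary("1100", "1110"));   // 11010 (12 + 14 = 26)
```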
Overflow
In the following example, 8 bits are used to store the result of an addition. The result of the addition is
greater than 255, and an overflow error occurs where a carry from the most significant bit requires a
ninth bit.
Carry         1 1 1       1
                1 1 1 0 0 0 1 0      226
            +   1 0 1 1 1 0 1 0      186
            = 1 1 0 0 1 1 1 0 0      412
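The effect of storing this result in only 8 bits might be sketched as follows. The function name add8Bit and the shape of the object it returns are assumptions for illustration; the bitwise AND with 0xFF keeps only the lowest eight bits.

```javascript
// Adds two unsigned values and keeps only the lowest 8 bits of the result
function add8Bit(x, y) {
    var total = x + y;
    return {
        stored: total & 0xFF,    // the 8 bits that fit in the byte
        overflow: total > 255    // true when a ninth bit would be needed
    };
}

var outcome = add8Bit(226, 186);
console.log(outcome.stored.toString(2).padStart(8, "0"), outcome.overflow);   // 10011100 true
```

The stored byte holds 156 rather than 412, which is why an overflow must be detected and reported rather than ignored.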
CHAPTER 30 – BINARY ARITHMETIC
Binary arithmetic using the sign and magnitude representation does not work as you would expect. A
much better way of representing numbers in binary is called two’s complement.
In binary:
11111101 = -3
11111110 = -2
11111111 = -1
00000000 = 0
00000001 = 1
00000010 = 2
00000011 = 3
Q3: Add together the two’s complement numbers 3 and -3. What is the result?
-(2ⁿ⁻¹) … 2ⁿ⁻¹ - 1
With eight bits, the maximum denary range that can be represented is -128 to 127 because the leftmost
bit is used as a sign bit to indicate whether a number is negative. If the leftmost bit is a 1, it is a
negative number. Thus 10000000 represents -128.
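One way to read a two’s complement pattern in code is to treat it as unsigned and then subtract 2ⁿ if the leftmost bit is set. The function name fromTwosComplement below is an assumption made for this sketch.

```javascript
// Interprets a string of bits as a two's complement integer
function fromTwosComplement(bits) {
    var n = bits.length;
    var value = parseInt(bits, 2);    // value if the pattern were unsigned
    if (bits.charAt(0) === "1") {
        value -= Math.pow(2, n);      // leftmost bit is 1, so the number is negative
    }
    return value;
}

console.log(fromTwosComplement("11111101"));   // -3
console.log(fromTwosComplement("10000000"));   // -128
console.log(fromTwosComplement("01111111"));   // 127
```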
-9
Positive binary : 00001001
Flip the bits   : 11110110
Add one         :        1
                  11110111

11100101
Flip the bits   : 00011010
Add one         :        1
Convert         : - 00011011
                  -27
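The flip-and-add-one method shown in these two conversions can be sketched as a single function. The name negate is an assumption; the function works on a string of bits so that each step mirrors the working above.

```javascript
// Negates a two's complement bit pattern: flip every bit, then add one
function negate(bits) {
    var flipped = "";
    for (var i = 0; i < bits.length; i++) {
        flipped += (bits.charAt(i) === "0") ? "1" : "0";
    }
    // Add one, working from the right: 1s become 0 (carry on) until a 0 becomes 1
    var result = flipped.split("");
    for (var j = result.length - 1; j >= 0; j--) {
        if (result[j] === "0") {
            result[j] = "1";
            break;
        }
        result[j] = "0";
    }
    return result.join("");
}

console.log(negate("00001001"));   // 11110111 (9 becomes -9)
console.log(negate("11100101"));   // 00011011 (-27 becomes 27)
```

Applying the function twice returns the original pattern, so the same steps convert in either direction.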
Q7: Convert the following pairs of denary numbers to binary, and subtract the second
number from the first
(i) 14 and 27 (ii) 14 and 8
In the binary example above, the left hand section before the point is equal to 5 (4+1) and the right hand
section is equal to ½ + ¼ (¾), or 0.5 + 0.25 = 0.75. So, using four bits after the point, 0101 1100 is 5.75
in denary. A useful table with some denary fractions and their equivalents is given below:
Q8: How is 19.25 represented using a single byte with 3 bits after the point?
It is worth noticing that this system is not only less accurate than the denary system, but some fractions
cannot be represented at all. 0.2, 0.3 and 0.4, for example, will require an infinite number of bits to the
right of the point. The number of fractional places would therefore be truncated and the number will not
be accurately stored, causing rounding errors. In our denary system, two denary places can hold all
values between .00 and .99. With the fixed point binary system, 2 digits after the point can only represent
0, ¼, ½, or ¾ and nothing in between.
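A fixed point pattern can be converted by reading it as a whole number and then dividing by 2 for each bit after the point. In this sketch the function name fixedToDenary is an assumption, and only unsigned values are handled.

```javascript
// Reads an unsigned fixed point bit pattern with the given number of bits after the point
function fixedToDenary(bits, fractionBits) {
    return parseInt(bits, 2) / Math.pow(2, fractionBits);
}

console.log(fixedToDenary("01011100", 4));   // 5.75

// With 2 bits after the point, only four fractional values can be stored
var fractions = [];
for (var i = 0; i < 4; i++) {
    fractions.push(i / 4);
}
console.log(fractions);                      // [ 0, 0.25, 0.5, 0.75 ]
```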
The range of a fixed point binary number is also limited by the fractional part. For example, if you have
only 8 bits to store a number to 2 binary places, you would need 2 digits after the point, leaving only
6 bits before it. 6 bits only gives a range of 0-63. Moving the point one place to the left to improve accuracy
within the fractional part only serves to halve the range to just 0-31. Even with 32 bits used for each
number, including 8 bits for the fractional part after the point, the maximum value that can be stored is
only about 8 million. Another format called floating point binary can hold much larger numbers, with
greater accuracy.
Floating point form is covered in the next chapter.
Exercises
1. Represent the denary value -19 as an 8-bit two's complement binary integer. [2]
2. What is the largest positive denary value that can be represented using 8-bit two's
complement binary? [1]
3. Describe how 8-bit two's complement binary can be used to subtract one number from another
number. In your answer show how the calculation 25 – 49 would be completed using the method
that you have described. [2]
4. A computer stores the current temperature of a supermarket delivery van. The temperature in °C
is stored as a two’s complement integer using a single byte.
(a) Convert the freezer temperature value of -19 into binary. [2]
(b) State the range of temperature values that can be stored using 8 bits. [1]
5. A memory location contains the value 10101011. What is its denary equivalent if it represents
a two’s complement binary integer? [2]
CHAPTER 31 – FLOATING POINT ARITHMETIC
A-Level only
precision, or accuracy, of the fractional part and vice versa.
16 8 4 2 1 ½ ¼ ⅛
In the example above, only numbers which are multiples of 1/8 can be represented. The value 4.9, for
example, would be ‘rounded’ to 4.875 or 00100111 with three fractional bits to the right of the point.
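The rounding described here might be sketched as follows. The function name toFixedPoint and its parameters are assumptions for illustration; the sketch handles positive values only and rounds to the nearest multiple of the smallest fractional step.

```javascript
// Converts a positive denary value to unsigned fixed point, rounding to the nearest step
function toFixedPoint(value, totalBits, fractionBits) {
    var scale = Math.pow(2, fractionBits);    // 8 steps per whole number for 3 bits
    var steps = Math.round(value * scale);    // nearest whole number of steps
    return {
        bits: steps.toString(2).padStart(totalBits, "0"),
        stored: steps / scale                 // the value actually held
    };
}

var fixed = toFixedPoint(4.9, 8, 3);
console.log(fixed.bits, fixed.stored);   // 00100111 4.875
```

The difference between 4.9 and the stored 4.875 is the rounding error caused by having only three fractional bits.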
Q1: Using 1 byte to hold each number with the three least significant bits to the right of the point,
convert the following binary numbers to denary:
(a) 01010100 (b) 01011101 (c) 00111011 (d) 01010111
Q2: Convert the following numbers to 8-bit binary assuming four bits after the point:
(a) 2.75 (b) 10.875 (c) 7.5625 (d) 3.4375
Q3: What are the largest and smallest unsigned numbers that can be held in two bytes with four
bits after the point? See Figure 31.1
Figure 31.1
When ordinary denary numbers become very large, they are written in a more convenient scientific
notation m × 10ⁿ where m is known as the mantissa or coefficient, and n is the exponent or order of
magnitude. 5000 can therefore be written as 0.5 × 10⁴, and 42,750.254 can be written as 0.42750254 × 10⁵,
moving the decimal point five places to the left.
This technique can easily be applied to binary numbers too, where the mantissa and exponent are
represented for example using 12 bits, with 8 bits for the mantissa and 4 bits for the exponent.
The leftmost bit of both the mantissa and the exponent is a sign bit, with 0 indicating a positive number,
and 1 a negative number. In a computer, of course, many more bits than this will be used to represent a
floating point number, with 32-, 64- and 128-bit floating point numbers all being common.
In all the examples below, eight bits are used for the mantissa and four bits for the exponent. The implied
binary point is to the right of the sign bit.
Sign bit   Mantissa         Exponent
0        • 1 0 1 1 0 1 0    0 0 1 1
Q4: Convert the following floating point numbers to denary: You can use Figure 1 to help you.
(a) 0 • 1101010 0100 (b) 0 • 1001100 0011
Negative exponents
If the exponent is negative, the binary point must be moved left instead of right.
The example above has a positive mantissa of 0.1000000 and a negative exponent of -2.
• Find the two’s complement of the exponent. (Remember that to convert a positive to negative binary
number using two’s complement you must flip the bits and add 1.) Exponent = -2
• Move the binary point of the mantissa two places to the left, to make it smaller. The mantissa is
therefore 0.001 (You can ignore the trailing zeros)
• Translate this to denary with the help of Figure 31.1. The answer is 0.125.
Q5: Convert the following floating point number to denary: 0 • 1100000 1110
The example above has a negative mantissa of 1.0101101 and a positive exponent of 0101.
• Find the two’s complement of the mantissa. It is 0.1010011, so the bits represent -0.1010011
• Translate the exponent to denary, 0101 = 5
• Move the binary point 5 places to the right to make it larger. The mantissa is -10100.11
• Translate this to denary with the help of Figure 31.1. The answer is -20.75.
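The conversion steps used in this chapter can be combined into one calculation: read the mantissa and exponent as two’s complement integers, place the binary point after the mantissa’s sign bit, then multiply by 2 raised to the exponent. The sketch below is one possible way of doing this; the function names signedValue and floatToDenary are assumptions.

```javascript
// Interprets a bit string as a two's complement integer
function signedValue(bits) {
    var value = parseInt(bits, 2);
    return (bits.charAt(0) === "1") ? value - Math.pow(2, bits.length) : value;
}

// Decodes a floating point number whose mantissa has its binary point
// immediately after the sign bit
function floatToDenary(mantissa, exponent) {
    var m = signedValue(mantissa) / Math.pow(2, mantissa.length - 1);
    var e = signedValue(exponent);
    return m * Math.pow(2, e);   // moves the point e places right, or left if e is negative
}

console.log(floatToDenary("01011010", "0011"));   // 5.625
console.log(floatToDenary("01000000", "1110"));   // 0.125
console.log(floatToDenary("10101101", "0101"));   // -20.75
```

The second and third calls reproduce the negative exponent and negative mantissa examples, giving 0.125 and -20.75.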
Normalisation
Normalisation is the process of moving the binary point of a floating point number to provide the
maximum level of precision for a given number of bits. This is achieved by ensuring that the first digit after
the binary point is a significant digit. To understand this, first consider an example in denary.
In the denary system, a number such as 5,842,130₁₀ can be represented with a 7-digit mantissa in many
different ways:
0.584213 × 10⁷ = 5,842,130
0.058421 × 10⁸ = 5,842,100
0.005842 × 10⁹ = 5,842,000
The first representation, with a significant (non-zero) digit after the decimal point, has the maximum
precision.
A number such as 0.00000584213 can be represented as 0.584213 × 10⁻⁵.
Example 1
Normalise the binary number 0.0001011 0101, held in an 8-bit mantissa and a 4-bit exponent.
• The binary point needs to move 3 places to the right so that there is a 1 following the binary point.
• Making the mantissa larger means we must compensate by making the exponent smaller, so subtract
3 from the exponent, resulting in an exponent of 0010.
• The normalised number is 0.1011000 0010
Example 2
Normalise the binary number 1.1110111 0001, held in an 8-bit mantissa and a 4-bit exponent.
• Move the binary point right 3 places, so that it is just before the first 0 digit. The mantissa is now
1.0111000
• Moving the binary point to the right makes the number larger, so we must make the exponent smaller
to compensate. Subtract 3 from the exponent. The exponent is now 1 – 3 = -2 = 1110
• The normalised number is 1.0111000 1110
A normalised negative number has a sign bit of 1 and the next bit is always 0.
The mantissa of a negative number in normalised form always lies between -½ and -1.
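Both examples follow the same rule: shift the mantissa left until its first two bits differ, reducing the exponent by one for each shift. A minimal sketch is given below, in which the function name normalise is an assumption, the mantissa is held as a string of bits and the exponent as a denary integer. It does not check whether the final exponent still fits in 4 bits.

```javascript
// Normalises a two's complement mantissa (bit string) with a denary exponent.
// A normalised mantissa starts 01 (positive) or 10 (negative).
function normalise(mantissa, exponent) {
    // An all-zero mantissa has no significant bit to move, so leave it alone
    if (mantissa.indexOf("1") === -1) {
        return { mantissa: mantissa, exponent: exponent };
    }
    while (mantissa.charAt(0) === mantissa.charAt(1)) {
        mantissa = mantissa.substring(1) + "0";   // shift left, filling with 0 on the right
        exponent--;                               // compensate by reducing the exponent
    }
    return { mantissa: mantissa, exponent: exponent };
}

var first = normalise("00001011", 5);
console.log(first.mantissa, first.exponent);     // 01011000 2
var second = normalise("11110111", 1);
console.log(second.mantissa, second.exponent);   // 10111000 -2
```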
Example 3
What does the following binary number (with a 5-bit mantissa and a 3-bit exponent) represent in denary?
This is the largest positive number that can be held using a 5-bit mantissa and a 3-bit exponent, and
represents 0.1111 × 2³ = 7.5
Example 4
The most negative number that can be held in a 5-bit mantissa and 3-bit exponent is:
Q7: Normalise the following numbers, using an 8-bit mantissa and a 4-bit exponent
(a) 0.0000110 0001
(b) 1.1110011 0011
Q9: Using the above rules, add together the denary numbers 1.562 × 10⁴ and 3.128 × 10².
Example 7
Convert the denary numbers 0.25 and 10.5 to normalised floating point binary form using an 8-bit
mantissa and a 4-bit exponent. Add together the two normalised binary numbers, giving the result in
normalised floating point binary form.
Step 1: The numbers in normalised form are:
0 1 0 0 0 0 0 0   1 1 1 1   (0.25)
0 1 0 1 0 1 0 0   0 1 0 0   (10.5)
Step 2: Write the mantissas with a binary point, and convert the exponents to denary, giving
0.1000000 exponent -1 and
0.1010100 exponent 4
Step 3: Make both exponents 4 and shift the binary points accordingly
0.0000010 (make the number smaller as you increase the exponent)
0.1010100
Step 4: Add the numbers, giving 0.1010110 exponent 4 (In this case it’s already normalised)
Result is 0 1 0 1 0 1 1 0 0 1 0 0
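Steps 3 and 4 above might be sketched in code as follows. To keep the listing short, each mantissa is held as an 8-bit two’s complement integer (the real mantissa multiplied by 128) and each exponent as a denary integer. The function name addFloats and this representation are assumptions, and bits shifted off the right-hand end are simply lost.

```javascript
function addFloats(m1, e1, m2, e2) {
    // Step 3: make both exponents equal to the larger one, shifting the
    // mantissa of the smaller number right to compensate
    var e = Math.max(e1, e2);
    m1 = m1 >> (e - e1);
    m2 = m2 >> (e - e2);

    // Step 4: add the mantissas
    var m = m1 + m2;

    // If the sum no longer fits in 8 bits, shift right and raise the exponent
    if (m > 127 || m < -128) {
        m = m >> 1;
        e++;
    }
    // Normalise: shift left until the mantissa lies in 64..127 or -128..-65
    while (m !== 0 && m >= -64 && m < 64) {
        m = m * 2;
        e--;
    }
    return { mantissa: m, exponent: e };
}

// 0.1000000 exponent -1 (0.25) plus 0.1010100 exponent 4 (10.5)
var sum = addFloats(64, -1, 84, 4);
console.log((sum.mantissa & 0xFF).toString(2).padStart(8, "0"), sum.exponent);   // 01010110 4
```

The result is the mantissa 0.1010110 with exponent 4, which is 10.75 in denary. A subtraction can be carried out with the same function by negating the second mantissa before adding.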
Q10: Add together the two binary numbers given below, leaving the result in normalised floating
point binary form.
0 1 1 0 0 0 0 0   1 1 1 1
0 1 1 0 1 0 0 0   0 0 1 0
Example 8
Subtract the second of the two numbers given below from the first, giving the result in normalised floating
point binary form.
0 1 0 0 0 1 0 0   0 1 1 0
0 1 0 0 0 0 1 0   0 1 0 1
Step 4: Add the numbers
0.1000100
1.1011111
(1)0.0100011 exp 6 (ignore the carry)
Now normalise the number by moving the binary point right 1 place, which increases the
number, and decrease the exponent by 1
Result is 0 1 0 0 0 1 1 0 0 1 0 1
Exercises
1. A normalised floating point representation uses an 8-bit mantissa and a 4-bit exponent, both stored
using two’s complement format.
(a) This is a floating point representation of a number:
Mantissa: 1 0 1 1 0 0 0 0    Exponent: 0 0 1 0
Calculate the denary number. Show your working. [2]
(b) Write the normalised representation of the denary value 12.75 in the boxes below:
2. Convert the following denary numbers to normalised floating point binary form, using an 8-bit
mantissa and a 4-bit exponent.
(a) -18.75 [2]
(b) 0.0625 [2]
Example 1
Before: 1 0 1 1 0 0 0 1    carry bit: 0
After:  0 1 0 1 1 0 0 0    carry bit: 1
It is useful for examining the least significant bit of a number. After the operation, the carry bit can be
tested and a conditional branch executed.
Example 2
A logical shift left works in the same way, but the bits move left. The most significant bit (msb) moves into
the carry bit and a zero moves into the lsb. You can visualise the carry bit as being on the left of the byte.
carry
bit
0 1 0 0 0 1 1 1 0 1 0 0 0 1 1 1 0 0
Before After
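Logical shifts on an 8-bit value can be sketched in Python. The function names are illustrative, and the carry bit is returned as a separate value because Python integers have no carry flag.

```python
def logical_shift_right(byte):
    """Return (new_byte, carry) after shifting an 8-bit value right one place."""
    carry = byte & 1                    # the lsb moves into the carry bit
    return byte >> 1, carry             # a zero moves into the msb

def logical_shift_left(byte):
    """Return (new_byte, carry) after shifting an 8-bit value left one place."""
    carry = (byte >> 7) & 1             # the msb moves into the carry bit
    return (byte << 1) & 0xFF, carry    # keep 8 bits; a zero moves into the lsb

result, carry = logical_shift_right(0b10110001)
print(format(result, "08b"), carry)     # 01011000 1
```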
Q1: Shift the binary pattern 0100 0111 right twice and then left once. What are the contents of the
byte and the carry bit after these shifts?
Example 3
An arithmetic shift right has the effect of dividing by 2. If the sign bit is 1, a 1 is moved in from the left instead of a 0.
Before: 1 0 1 1 0 0 0 1    carry bit: 0
After:  1 1 0 1 1 0 0 0    carry bit: 1
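A minimal Python sketch of an arithmetic shift right on an 8-bit two's complement value is given below. The function name is illustrative. Note that odd negative values round down: in Example 3, -79 becomes -40.

```python
def arithmetic_shift_right(byte):
    """Return (new_byte, carry) for an 8-bit two's complement value."""
    sign = byte & 0x80                  # remember the sign bit
    carry = byte & 1                    # the lsb moves into the carry bit
    return (byte >> 1) | sign, carry    # the sign bit is copied back into the msb

result, carry = arithmetic_shift_right(0b10110001)
print(format(result, "08b"), carry)     # 11011000 1
```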
Q2: Convert the number -16 to binary, and divide by 8 using arithmetic shifts.
CHAPTER 32 – BITWISE MANIPULATION AND MASKS
Example 4
Shifting left multiplies by 2. The shift bypasses the sign bit, leaving the msb the same whatever the value
of the other bits.
Before: carry bit: 0    1 0 1 1 0 0 0 1
After:  carry bit: 0    1 1 1 0 0 0 1 0
Q3: Convert the number 14 to binary and then multiply it by 4 using arithmetic shifts. Convert the result
back to denary.
Example 5
Multiply 9 by 5 using shifts and addition:
Multiply 9 x 1 0000 1001
Multiply 9 by 4 with 2 left shifts: 0010 0100
Add together: 0010 1101 = 45
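The shift-and-add method of Example 5 generalises to any multiplier. The sketch below assumes non-negative integers; multiply_by_shifts is a hypothetical name.

```python
def multiply_by_shifts(value, multiplier):
    """Multiply two non-negative integers using only shifts and addition."""
    total = 0
    shift = 0
    while multiplier > 0:
        if multiplier & 1:               # this bit of the multiplier is set
            total += value << shift      # add value x 2^shift
        multiplier >>= 1
        shift += 1
    return total

print(multiply_by_shifts(9, 5))          # 9 x 1 + 9 x 4 = 45
```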
carry bit
1 0 1 0 0 0 1 1 0
Example 6
Assume that R0 and R1, shown below, are two 8-bit registers being used as a double register to hold
a 16-bit binary integer, with R0 holding the high half of the number. Show how a combination of shift
instructions may be used to divide the 16-bit integer by 2.
R0: 0 1 0 1 1 0 0 1    carry bit: (empty)    R1: 1 1 0 1 1 0 0 0
Answer: First perform an arithmetic shift right on R0. 1 is shifted into the carry bit.
R0: 0 0 1 0 1 1 0 0    carry bit: 1    R1: 1 1 0 1 1 0 0 0
Then perform a circular shift on R1. This places the carry bit into the msb of R1, and the carry bit is
replaced with the lsb of R1.
R0: 0 0 1 0 1 1 0 0    carry bit: 0    R1: 1 1 1 0 1 1 0 0
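The two steps of Example 6 can be modelled in Python. This is a sketch only: the registers are plain integers, the carry bit is an ordinary variable, and the function name halve_double_register is an assumption.

```python
def halve_double_register(r0, r1):
    """Divide the 16-bit integer held in R0 (high byte) and R1 (low byte) by 2."""
    carry = r0 & 1                       # lsb of R0 goes into the carry bit
    r0 = (r0 >> 1) | (r0 & 0x80)         # arithmetic shift right on R0
    new_carry = r1 & 1                   # lsb of R1 will replace the carry bit
    r1 = (carry << 7) | (r1 >> 1)        # circular shift: carry enters msb of R1
    return r0, r1, new_carry

r0, r1, carry = halve_double_register(0b01011001, 0b11011000)
print(format(r0, "08b"), format(r1, "08b"), carry)   # 00101100 11101100 0
```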
Q5: A 16-bit integer is held in R0 and R1 which are being used as a double register. Show the effect
on the registers of doing an arithmetic shift right of 1 place in R0, followed by a circular shift
right of 1 place in R1.
R0: 1 0 0 1 1 0 1 1    carry bit: (empty)    R1: 0 1 0 1 1 1 0 1
Logical instructions
Boolean algebra is covered in Section 8, Chapters 40 and 41.
The instructions NOT, AND, OR and XOR (exclusive OR) have the following effects:
Explanation: In Boolean logic, 1 represents True and 0 represents False. A NOT instruction has only
one input. If the input is True (i.e. 1), the output is False (i.e. 0), and vice versa. With the AND gate, if both
inputs are True (i.e. 1), the output is True. Otherwise, the output is False. With the OR gate,
if either of the inputs is True (i.e. 1), the output is True. Otherwise, the output is False. With
the XOR gate, if either, but not both, of the inputs is True, the output is True. Otherwise,
the output is False.
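The four operations correspond to Python's bitwise operators &, |, ^ and ~. The sketch below builds the truth table described in the explanation; the function names are illustrative, and NOT is limited to 8 bits because Python integers are not fixed width.

```python
def truth_table():
    """Return rows of (A, B, A AND B, A OR B, A XOR B)."""
    rows = []
    for a in (0, 1):
        for b in (0, 1):
            rows.append((a, b, a & b, a | b, a ^ b))
    return rows

def bitwise_not(byte):
    """Invert every bit of an 8-bit value."""
    return ~byte & 0xFF

for row in truth_table():
    print(*row)
print(format(bitwise_not(0b10110010), "08b"))   # 01001101
```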
Masks
The OR function may be used to set selected bits to 1 without affecting the other bits.
Example 7
A system has 8 lights that can be turned ON (output 1) or OFF (output 0), controlled by an 8-bit binary
code. At present, lights 1 to 4 are ON, lights 5 to 8 are OFF. Lights 5 and 6 are to be turned ON.
Light number    1 2 3 4 5 6 7 8
Present state   1 1 1 1 0 0 0 0
OR with         0 0 0 0 1 1 0 0
Result          1 1 1 1 1 1 0 0
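Example 7 can be reproduced with Python's | operator. The variable names are illustrative.

```python
lights = 0b11110000        # lights 1 to 4 ON, lights 5 to 8 OFF
mask = 0b00001100          # a 1 in each position to be switched ON

result = lights | mask     # OR sets the masked bits and leaves the others alone
print(format(result, "08b"))   # 11111100
```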
The AND function may be used to mask particular bits, by setting them to zero.
Example 8
The ASCII bit pattern for the number “5” is 0011 0101. Convert this to a pure binary number using a
mask.
We need to mask out the first four bits. This can be done with an AND operation.
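A short Python check of this idea, using an AND mask of 0000 1111 to clear the first four bits (variable names are illustrative):

```python
ascii_five = ord("5")              # 0011 0101
mask = 0b00001111                  # 0 in each position to be cleared

value = ascii_five & mask          # AND forces the masked-out bits to zero
print(format(value, "08b"), value) # 00000101 5
```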
Example 9
Convert an uppercase letter represented in ASCII to its lowercase equivalent.
The letter “C”, for example, is 0100 0011 in ASCII. The lowercase letter “c” is 0110 0011. We want to
change the third bit (counting from the left) from 0 to 1.
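Since OR sets selected bits to 1, one way to change that bit is to OR the character code with 0010 0000. The sketch below assumes the input is an uppercase letter A to Z; to_lowercase is a hypothetical name.

```python
def to_lowercase(letter):
    """Set the third bit from the left of an uppercase letter's ASCII code."""
    return chr(ord(letter) | 0b00100000)

print(to_lowercase("C"))   # c
```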
Exercises
1. An 8-bit word holds the binary pattern 10110010. Start with this bit pattern in each of
parts (a), (b) and (c). There is no need to show the contents of the carry bit.
(a) State the contents of the word after a logical left shift of 2 bits. [1]
(b) Interpreting the word as a number in two’s complement form, state the contents of the
word after an arithmetic right shift of 2 bits. [1]
(c) State the contents of the word after a circular shift left of 3 bits. [1]
2. In a particular computer, characters are represented in 8 bits using the ASCII code.
The codes for uppercase letters are from 0100 0001 for A to 0101 1010 for Z.
The codes for lowercase letters are from 0110 0001 for a to 0111 1010 for z.
Give an 8-bit mask and the appropriate logical operation which will:
(a) change any uppercase letter into its lowercase equivalent [2]
(b) change any lowercase letter into its uppercase equivalent. [2]
3. A 32-bit register holds a four byte value. The bytes are numbered so that the first byte is
leftmost. What mask and logical operator is required to achieve each of the following results:
Section 7
Data structures
In this section:
Chapter 33 Arrays, tuples and records
Chapter 39 Trees
Chapter 33 – Arrays, tuples and records
Data structures
Computer languages such as Python, Pascal and VB have built-in elementary data types such as
integer, real, Boolean and char. They also have some built-in structured data types such as string,
array and record. These are made up of a number of elements of a specified type such as integer, real
or string.
1-dimensional arrays
An array is defined as a finite, ordered set of elements of the same type, such as integer, real or char.
Finite means that there is a specific number of elements in the array. Ordered implies that there is a
first, second, third etc. element of the array.
For example (assuming the first element of the array is myArray[0]):
myArray = [51, 72, 35, 37, 0, 3]
x = myArray[2] #assigns 35 to x
Example 1
Every year the RSPB organises a Big Garden Birdwatch to involve the public in
counting the number of birds of different types that they see in their gardens on a
particular weekend. During 30-31 January 2016, more than 8 million birds were
counted and reported.
The scientists add all the sightings together, and once the data has been analysed,
they can discover trends and understand how different birds and other wildlife are faring.
An array of strings could be used to hold the names of the birds, and an array of integers to hold the
results as they come in. As a simple example we will hold the names of 8 birds in an array:
birdName = ["robin", "blackbird", "pigeon", "magpie", "bluetit",
"thrush", "wren", "starling"]
We can reference each element of the array using an index. For example:
birdName[2] = "pigeon" #the index here is 2
Most languages have a function which will return the length of an array, so that
numSpecies = len(birdName)
will assign 8 to numSpecies.
To find at which position of the array a particular bird is, we could use the following pseudocode
algorithm:
bird = input("Enter bird name: ")
birdFound = False
numSpecies = len(birdName)
for count = 0 to numSpecies - 1
    if bird == birdName[count] then
        birdIndex = count
        birdFound = True
    endif
next count
if birdFound == False then
    print("Bird species not in array")
else
    print("Bird found at", birdIndex)
endif
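The same search can be written in Python. This sketch wraps the loop in a hypothetical function find_bird which returns -1 when the bird is absent; unlike the pseudocode, it stops as soon as a match is found.

```python
bird_names = ["robin", "blackbird", "pigeon", "magpie", "bluetit",
              "thrush", "wren", "starling"]

def find_bird(names, bird):
    """Return the index of bird in names, or -1 if it is not present."""
    for index, name in enumerate(names):
        if name == bird:
            return index
    return -1

print(find_bird(bird_names, "pigeon"))   # 2
print(find_bird(bird_names, "eagle"))    # -1
```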
We need a second array of integers to accumulate the totals of each bird species observed. We can
initialise each element to zero.
birdCount = [0,0,0,0,0,0,0,0]
To add 5 to the blackbird count (the second element in the list) we can write a statement
birdCount[1] = birdCount[1] + 5
The following algorithm enables a member of the Birdwatch team to enter results as they come in from
members of the public.
birdName = ["robin", "blackbird", "pigeon", "magpie", "bluetit",
            "thrush", "wren", "starling"]
birdCount = [0,0,0,0,0,0,0,0]
bird = input("Please input name of bird (x to end): ")
while bird != "x"
    birdFound = False
    for count = 0 to 7
        if bird == birdName[count] then
            birdFound = True
            //convert the input to an integer so that it can be added to the total
            birdsObserved = int(input("number observed: "))
            birdCount[count] = birdCount[count] + birdsObserved
        endif
    next count
    if birdFound == False then
        print("Bird species not in array")
    endif
    bird = input("Please input name of bird (x to end): ")
endwhile
#now print out the totals for each bird
for count = 0 to 7
    print(birdName[count], birdCount[count])
next count
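A Python sketch of the tallying step is shown below. To keep it testable it takes the bird name and number as arguments instead of reading the keyboard; record_sighting is a hypothetical helper and the sightings are made-up data.

```python
bird_names = ["robin", "blackbird", "pigeon", "magpie", "bluetit",
              "thrush", "wren", "starling"]
bird_counts = [0] * len(bird_names)      # one total per species, all zero

def record_sighting(names, counts, bird, number_observed):
    """Add a sighting to the matching total; return False if the bird is unknown."""
    for index in range(len(names)):
        if names[index] == bird:
            counts[index] += number_observed
            return True
    return False

record_sighting(bird_names, bird_counts, "blackbird", 5)
record_sighting(bird_names, bird_counts, "wren", 2)
for name, count in zip(bird_names, bird_counts):
    print(name, count)
```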
2-dimensional arrays
An array can have two or more dimensions. A two-dimensional array can be visualised as a table, rather
like a spreadsheet.
Imagine a 2-dimensional array called numbers, with 3 rows and 4 columns. Elements in the array can be
referred to by their row and column number, so that numbers[1,3] = 8 in the example below.
Example 2
Write a pseudocode algorithm for a module which prints out the quarterly sales figures (given in integers)
for each of 3 sales staff named Anna, Bob and Carol, together with their total annual sales. Assume that
the sales figures are already in the 2-dimensional array quarterSales. The staff names are held in a
1-dimensional array staff.
staff = ["Anna","Bob","Carol"]
quarterSales = [[100,110,120,110],
                [350,355,360,360],
                [200,210,220,220]]
for s = 0 to 2
    annualSales = 0
    #output staff name
    (insert statement here)
    for q = 0 to 3
        print("Quarter ", q, quarterSales[s,q])
        annualSales = annualSales + quarterSales[s,q]
    next q
    print("Annual sales: ", annualSales)
next s
Q2: What statement needs to be inserted after the comment #output staff name in order to
output the staff name?
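In Python, a 2-dimensional array is commonly held as a list of lists, and an element is written quarter_sales[s][q] where the pseudocode writes quarterSales[s,q]. The sketch below shows only the totalling step; annual_totals is a hypothetical name.

```python
quarter_sales = [[100, 110, 120, 110],
                 [350, 355, 360, 360],
                 [200, 210, 220, 220]]

def annual_totals(sales):
    """Return a list holding the sum of each row of a 2-dimensional array."""
    totals = []
    for row in sales:
        annual = 0
        for figure in row:
            annual += figure
        totals.append(annual)
    return totals

print(quarter_sales[1][3])             # row 1, column 3: 360
print(annual_totals(quarter_sales))    # [440, 1425, 850]
```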
Tuples
A tuple is an ordered set of values, which could be elements of any type such as strings, integers or real
numbers, or even graphic images, sound files or arrays. Unlike arrays, the elements do not all have to be
of the same type. However, a tuple, like a string, is immutable, which means that its elements cannot be
changed, and you cannot dynamically add elements to or delete elements from a tuple.
In Python a tuple is written in parentheses, for example:
pupil = ("John", 78, "a")
You can refer to individual elements of a tuple, for example:
name = pupil[0]
but the following statement is invalid:
pupil[0] = "Mary"
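In Python the attempted assignment raises a TypeError, which the following sketch catches so that the effect can be seen without the program stopping:

```python
pupil = ("John", 78, "a")
name = pupil[0]                 # reading an element is allowed

try:
    pupil[0] = "Mary"           # tuples are immutable, so this fails
    changed = True
except TypeError:
    changed = False

print(name, changed)            # John False
```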
Records
If you want to store data permanently so that you can read or update it at a future date, the data needs
to be stored in a file on disk. The most common way of storing large amounts of data conveniently is to
use a database, but sometimes you need to create and interrogate your own files.
Generally, a file consists of a number of records. A record contains a number of fields, each holding
one item of data. For example, in a file holding data about students, you might have the following
record structure:
The table shows a file containing three records, each record having 5 fields. In some languages, a record
type will be declared in the following manner:
studentType = record
    integer ID
    string firstname
    string surname
    date dateOfBirth
    string class
end record
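Python has no record keyword, but a dataclass gives a comparable structure. This is one possible sketch: the field class is renamed class_name because class is a reserved word in Python, and the student shown is made-up data.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Student:
    ID: int
    firstname: str
    surname: str
    dateOfBirth: date
    class_name: str      # "class" itself is reserved in Python

# Made-up example record
student = Student(1, "Ada", "Example", date(2000, 1, 31), "13A")
print(student.firstname, student.dateOfBirth.year)   # Ada 2000
```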
Exercises
1. Referring to the BirdWatch program given earlier in this chapter:
(a) Explain why the for… next loop repeated below is not the most efficient type of loop
in this situation. [1]
for count = 0 to 7
    if bird == birdName[count] then
        birdFound = True
        birdsObserved = int(input("Enter number of birds observed: "))
        birdCount[count] = birdCount[count] + birdsObserved
    endif
next count
2. The birth weights in grams of 100 babies, which vary between 1500 and 4000 grams, are held
in an array weight.
Write pseudocode for an algorithm which calculates the average birth weight, and then prints
out the number of babies who are more than 500 grams below the average weight, together
with the average weight of these. [5]
3. The marks for 3 assignments, each marked out of 10, for a class of 5 students are to be input
into a two-dimensional array mark so that mark[3,1], for example, holds the second mark
achieved by the 4th student. Any missing assignments are given a mark of zero.
Draw a table representing this array, and fill it with test data. [2]
Write a pseudocode algorithm which allows the user to enter the marks for the class.
Calculate the average mark for each student, and the class average. [4]
4. In a certain game, treasure is hidden in a 10x10 grid. The grid coordinates are given by
grid[row,col] where grid[0,0] represents the top left hand corner and grid[9,9] the
bottom right corner. The grid coordinates of the treasure are signified by a 1 at grid[row,col].
All other grid elements are filled with zeros.
for row = 0 to 9
    for col = 0 to 9
        if grid[row, col] == 1 then
            print("row ", row, " column ", col)
        endif
    next col
next row
Write pseudocode statements to initialise the grid and “hide the treasure” at a random location
inside the grid. [5]
Chapter 34 – Queues
Objectives
• Understand the concept of an abstract data type
• Be familiar with the concept and uses of a queue
• Describe the creation and maintenance of data within a queue (linear, circular, priority)
• (A-Level only) Describe and apply the following to a linear, circular and priority queue:
   ◦ add an item
   ◦ remove an item
   ◦ test for an empty queue
   ◦ test for a full queue
Queues
A queue is a First In First Out (FIFO) data structure. New elements may only be added to the end of a
queue, and elements may only be retrieved from the front of a queue. The sequence of data items in a
queue is determined, therefore, by the order in which they are inserted. The size of the queue depends
on the number of items in it, just like a queue at traffic lights or at a supermarket checkout.
Queues are used in a variety of applications:
• Output waiting to be printed is commonly stored in a queue on disk. In a room full of networked
computers, several people may send work to be printed at more or less the same time. By putting
the output into a queue on disk, the output is printed on a first come, first served basis as soon as
the printer is free.
• Characters typed at a keyboard are held in a queue in a keyboard buffer.
• Queues are useful in simulation problems. A simulation program is one which attempts to model
a real-life situation so as to learn something about it. An example is a program that simulates
customers arriving at random times at the check-outs in a supermarket store, and taking random
times to pass through the checkout. With the aid of a simulation program, the optimum number of
check-out counters can be established.
Operations on a queue
The abstract data type queue is defined by its structure and the operations which can be performed
on it. It is described as an ordered collection of items which are added at the rear of the queue, and
removed from the front.
front = 0 rear = 3
When Eli leaves the queue, the front pointer is made to point to Jason; the elements themselves do
not move. When Adam joins the queue, the rear pointer points to Adam. Think of a queue in a doctor’s
surgery – people leave and join the queue, but no one moves chairs.
front = 1 rear = 4
Q1: Complete the following table to show the queue contents and the value returned by the
function or method. The queue is named q and is of length 6.
Queue operation       Queue contents               Return value
q.isEmpty()           []                           True
q.enQueue("Blue")     ["Blue"]                     (none)
q.enQueue("Red")      ["Blue", "Red"]              (none)
q.enQueue("Green")
q.isFull()
q.isEmpty()           ["Blue", "Red", "Green"]
q.deQueue()
q.enQueue("Yellow")
Implementing a linear queue
There are basically two ways to implement a linear queue in an array or list:
1. As items leave the queue, all of the other items move up one space so that the front of the queue
is always the first element of the structure, e.g. q[0]. With a long queue, this may require significant
processing time.
2. A linear queue can be implemented as an array with pointers to the front and rear of the queue. An
integer holding the size of the array (the maximum size of the queue) is needed, as well as a variable
giving the number of items currently in the queue. However, clearly a problem will arise as many items
are added to and deleted from the queue, as space is created at the front of the queue which cannot
be filled, and items are added until the rear pointer points to the last element of the data structure.
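The second method can be sketched in Python as follows. The class and method names are illustrative, and the convention that rear starts at -1 is an assumption. The final lines show the problem described: the queue reports that it is full even though an element at the front is free.

```python
class LinearQueue:
    def __init__(self, max_size):
        self.items = [None] * max_size   # fixed-size array
        self.max_size = max_size
        self.front = 0                   # index of the first item
        self.rear = -1                   # index of the last item
        self.size = 0                    # number of items currently held

    def is_empty(self):
        return self.size == 0

    def is_full(self):
        # Full as soon as rear reaches the last element of the array
        return self.rear == self.max_size - 1

    def enqueue(self, item):
        if self.is_full():
            return False
        self.rear += 1
        self.items[self.rear] = item
        self.size += 1
        return True

    def dequeue(self):
        if self.is_empty():
            return None
        item = self.items[self.front]
        self.items[self.front] = None
        self.front += 1                  # the other items do not move
        self.size -= 1
        return item

q = LinearQueue(3)
for letter in ("A", "B", "C"):
    q.enqueue(letter)
q.dequeue()
print(q.size, q.is_full())               # 2 True
```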
Q2: The queue of names pictured above containing Jason, Milly, Bob and Adam has space for six
names. What will be the situation when Jason and Milly leave the queue, and Jack joins it? How
many names are now in the queue? How many free spaces are left?
A circular queue
One way of overcoming the limitation of a static data structure such as an array is to implement the
queue as a circular queue, so that when the array fills up and the rear pointer points to the last element
of the array, say q[5], it will be made to point to the first element, q[0], when the next person joins
the queue, assuming this element is empty. This solution requires some extra effort on the part of the
programmer, and is less flexible than a dynamic data structure if the maximum number of items is not
known in advance.
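A circular queue differs from the linear version only in how the pointers are advanced. The sketch below uses the modulo operator to wrap each pointer back to element 0; the class name, method names and the initial value of rear are assumptions.

```python
class CircularQueue:
    def __init__(self, max_size):
        self.items = [None] * max_size
        self.max_size = max_size
        self.front = 0
        self.rear = -1
        self.size = 0

    def is_empty(self):
        return self.size == 0

    def is_full(self):
        return self.size == self.max_size

    def enqueue(self, item):
        if self.is_full():
            return False
        self.rear = (self.rear + 1) % self.max_size    # wrap round to 0
        self.items[self.rear] = item
        self.size += 1
        return True

    def dequeue(self):
        if self.is_empty():
            return None
        item = self.items[self.front]
        self.items[self.front] = None
        self.front = (self.front + 1) % self.max_size  # wrap round to 0
        self.size -= 1
        return item

q = CircularQueue(3)
for letter in ("A", "B", "C"):
    q.enqueue(letter)
q.dequeue()
q.enqueue("D")                    # reuses the space freed at element 0
print(q.items, q.front, q.rear)   # ['D', 'B', 'C'] 1 0
```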
Q3: A circular queue is implemented in a fixed size array of six elements, indexed from 0. Show
the contents of the queue and the front and rear pointers for a circular queue of 6 items when
(a) it is empty
(b) Ali, Ben, Charlie, Davina, Enid, Fred join the queue. Ali, Ben and Charlie leave, and
Greg joins the queue.
Priority queues
In some situations where items are placed in a queue, a system of priorities is used. For example an
operating system might schedule jobs in order of priority, or a printer may give shorter print jobs priority
over longer ones.
A priority queue acts like a queue in that items are dequeued by removing them from the front of the
queue. However, the logical order of items within the queue is determined by their priority, with the
highest priority items at the front of the queue and the lowest priority items at the back. It is therefore
possible that a new item joins the queue at the front, rather than at the rear.
Q5: In what circumstances would an item join a priority queue at the front? In what circumstances
would the item join the queue at the rear?
Such a queue could be implemented by checking the priority of each item in the queue, starting at the
rear, and moving each item with a lower priority than the new item along one place. When an item with
the same or higher priority is found, the new item can be inserted immediately behind it.
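One possible Python sketch of this insertion is shown below. It assumes that a larger number means a higher priority and that index 0 is the front of the queue; Python's list.insert performs the "moving along" of lower priority items. The function name and job names are made up.

```python
def insert_by_priority(queue, item, priority):
    """Insert (priority, item) behind all items of the same or higher priority."""
    position = len(queue)                        # start at the rear
    # Move towards the front past every item with a lower priority
    while position > 0 and queue[position - 1][0] < priority:
        position -= 1
    queue.insert(position, (priority, item))

jobs = []
insert_by_priority(jobs, "report", 2)
insert_by_priority(jobs, "email", 1)
insert_by_priority(jobs, "urgent", 3)
insert_by_priority(jobs, "memo", 2)
print(jobs)   # [(3, 'urgent'), (2, 'report'), (2, 'memo'), (1, 'email')]
```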
Exercises
1. (a) Explain why a queue may be implemented as a circular queue. [2]
(b) Explain what is meant by a dynamic data structure and why an inbuilt dynamic data
structure in a programming language may be useful in implementing a queue. [2]
(c) Print jobs are put in a queue to be printed. The queue is implemented in an array, indexed
from 0, as a circular queue which can hold 5 jobs. Jobs enter the queue in the sequence Job1,
Job2, Job3, Job4, Job5. Pointers front and rear point to the first and last items in
the queue respectively.
(i) Draw a diagram to show how the print jobs are stored. Include pointers in your diagram. [3]
(ii) Two jobs are printed and leave the queue. Another job, Job6, joins the queue.
Draw a diagram representing the new situation. [2]
A-Level only
2. The size of some data structures is fixed when the structure is created.
(a) State the term used to describe such data structures.
Give one example of a type of data structure whose size is always fixed.
Give one advantage of using a fixed size data structure. [3]
(b) A queue data structure has two pointers called front and next which are defined as:
front points to the first item in the queue
next points to the next available space
The queue is defined as a first in, first out (FIFO) data structure.
(i) State the condition of the pointers when the queue is empty. [1]
(ii) Write an algorithm to remove one data item from a queue. [4]
(c) The queue may be represented by a fixed size data structure.
data structure
front next
Explain, with the aid of a diagram, what happens when attempting to add 3 data
items to the queue. [5]
OCR F453-01 Qu 5 June 2012
Definition of a list
In computer science, a list is an abstract data type consisting of a number of items in which the same
item may occur more than once. The list is sequenced, so we can refer to the first, second, third, … item,
and we can also refer to the last element of the list.
A list is a very useful data type for a wide variety of operations, and can be used, for example, to
implement other data structures such as a queue, stack or tree. Some languages such as Python have a
built-in list data type, so that for example a list of numbers could be shown as
List operation   Description                             Example        List contents             Return value
isEmpty()        Test for empty list                     a.isEmpty()    [45, 13, 19, 13, 8]       False
append(item)     Add a new item to the end of the list   a.append(33)   [45, 13, 19, 13, 8, 33]
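The two operations above can be tried with Python's built-in list. Python lists have no isEmpty() method, so the sketch tests the length instead; append is a genuine list method.

```python
a = [45, 13, 19, 13, 8]

is_empty = len(a) == 0    # stands in for a.isEmpty()
a.append(33)              # add a new item to the end of the list

print(is_empty, a)        # False [45, 13, 19, 13, 8, 33]
```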
CHAPTER 35 – LISTS AND LINKED LISTS
Q2: Assume that list names holds the values James, Paul, Sophie, Holly, Nathan.
What does the list hold after each of the following consecutive operations?
(i) names.append("Tom")
(ii) names.pop(3)
(iii) names.insert(1, "Melissa")
Using an array
It is possible to maintain an ordered collection of data items using an array, which is a static data
structure. This may be an option if the programming language does not support the list data type and
if the maximum number of data items is small, and is known in advance.
The programmer then has to work out and code algorithms for each list operation. The empty array
must be declared in advance as being a particular length, and this could be used, for example, to hold a
priority queue.
Ken
Holly James Nathan Paul Sophie
Q3: Suggest a different algorithm for adding a new element to a sequenced list.
Q4: How could the given algorithm be adapted to insert an item in a priority queue?
Q5: Why not simply leave the array element names[2] blank after deleting Ken?
First, items are moved up to fill the empty space by copying them to the previous spot in the array:
Finally the last element, which is now duplicated, is replaced with a blank.
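Inserting into and deleting from a sequenced list held in a fixed-length array can be sketched in Python as below. This is one possible approach, not necessarily the one the figures illustrate; the function names are hypothetical, count is the number of elements in use, and blanks are represented by empty strings.

```python
def insert_in_order(names, count, new_name):
    """Insert new_name in alphabetical order; return the new count."""
    if count == len(names):
        raise ValueError("list is full")
    position = count
    # Move larger items down one place, starting from the end
    while position > 0 and names[position - 1] > new_name:
        names[position] = names[position - 1]
        position -= 1
    names[position] = new_name
    return count + 1

def delete_item(names, count, name):
    """Delete name, closing the gap; return the new count."""
    position = names.index(name, 0, count)     # raises ValueError if absent
    for i in range(position, count - 1):
        names[i] = names[i + 1]                # copy each item to the previous spot
    names[count - 1] = ""                      # blank the duplicated last element
    return count - 1

names = ["Holly", "James", "Nathan", "Paul", "Sophie", ""]
count = insert_in_order(names, 5, "Ken")
print(names)      # ['Holly', 'James', 'Ken', 'Nathan', 'Paul', 'Sophie']
count = delete_item(names, count, "Ken")
print(names)      # ['Holly', 'James', 'Nathan', 'Paul', 'Sophie', '']
```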
A-Level only
Linked lists
Definition
A linked list is a dynamic data structure used to hold a sequence, as described below:
• The items which form the sequence are not necessarily held in contiguous data locations, or in the
order in which they occur in the sequence
• Each item in the list is called a node and contains a data field and a next address field called a link
or pointer field (the data field may consist of several subfields.)
• The data field holds the actual data associated with the list item, and the pointer field contains the
address of the next item in the sequence
• The link field in the last item indicates that there are no further items by the use of a null pointer
• Associated with the list is a pointer variable which points to (i.e. contains the address of) the first
node in the list
The array is initialised prior to entering any names, and it will consist of just one linked list of free space.
After initialisation, nextfree points to the first free space in the list, Names[0].
A pointer named start will point to the first data item in the list. This will be initialised to null,
indicating that the list is empty. The last item in the free space list also has a pointer of null, indicating
that this is the last available free space in the list.
The array holding the linked list now looks like this:
After the names Browning, Turner, Johnson and Cray have been added, the array will look like this:
Notice that we now have two linked lists going; the list linking the nodes containing names and the list
linking the free nodes.
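The initialisation step can be sketched in Python, with each node held as a dictionary and None standing for the null pointer. The array size of 6 and the helper names initialise and follow are assumptions. The same helper can walk either of the two lists, since both are chains of pointers.

```python
def initialise(size):
    """Return (array, start, nextfree) for an empty linked list."""
    names = [{"name": "", "pointer": i + 1} for i in range(size)]
    names[size - 1]["pointer"] = None    # last free node has a null pointer
    return names, None, 0                # start is null; nextfree is element 0

def follow(names, pointer):
    """Return the indices visited when following pointers from the given node."""
    indices = []
    while pointer is not None:
        indices.append(pointer)
        pointer = names[pointer]["pointer"]
    return indices

names, start, nextfree = initialise(6)
print(follow(names, nextfree))   # free space list: [0, 1, 2, 3, 4, 5]
print(follow(names, start))      # list of names is empty: []
```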
Q6: Show the state of the table and pointers after insertion of the name Allen. Write down the steps
involved in inserting a new name at the front of the list, so that alphabetical sequence
is maintained.
Inserting an item
We’ll now work out an algorithm for inserting a name into the middle of the list. As an example, we’ll
insert Mortimer between Johnson and Turner. The pointers will have to be changed so that it is linked into
the correct place.
After insertion of Mortimer, the list will appear as in Figure 35.3.
Names
index   name       pointer
0       Browning   3
1       Turner     null
2       Johnson    4
3       Cray       2
4       Mortimer   1
5                  null
start = 0, nextfree = 5
Figure 35.3
Before insertion:  start → 0 → 3 → 2 → 1 → null                   nextfree → 4 → 5 → null
After insertion:   start → 0 → 3 → 2 → 4 (Mortimer) → 1 → null    nextfree → 5 → null
Figure 35.4
Extra steps will be needed to be added to the algorithm to cope with the special cases of inserting a
name at the very front of the list (e.g. Allen), or inserting the first name into an empty list.
Before we go further and express this algorithm in more formal pseudocode, you need to make sure you
clearly understand the notation used.
Names[p].name holds the name in node[p], that is, the node pointed to by p
Names[p].pointer holds the value of the pointer in node[p]
Notice how you can ‘peek ahead’ using the pointers to see what name is in the next node, or even the
node after that one, and so on.
This is crucial because you need to know where you have come from (the previous node), when you get
to the node that has a name “greater” than the new one to be inserted.
Here's a simplified algorithm to add a new name to the list. The complications of inserting at the head of
a list and dealing with a full list are dealt with in the algorithm on the next page.
The comments in the algorithm refer to inserting the name Mortimer in the linked list shown in Figures
35.2 and 35.4.
Names[nextfree].name = newName           //store name in next free node
p = start
follow pointers until Names[p].pointer points to a name > new name
temp = nextfree                          //put 4 in temp (Step 1)
nextfree = Names[temp].pointer           //put 5 in nextfree (Step 2)
Names[temp].pointer = Names[p].pointer   //put 1 in Mortimer’s pointer field (Step 3)
Names[p].pointer = temp                  //put 4 in Johnson’s pointer field (Step 4)
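The simplified algorithm translates into Python as follows. The sketch holds each node as a dictionary, uses None for null, and starts from the state before Mortimer is added. Like the pseudocode, it assumes the list is neither empty nor full and that the new name does not belong at the head; the function name is hypothetical.

```python
names = [
    {"name": "Browning", "pointer": 3},
    {"name": "Turner",   "pointer": None},
    {"name": "Johnson",  "pointer": 1},
    {"name": "Cray",     "pointer": 2},
    {"name": "",         "pointer": 5},
    {"name": "",         "pointer": None},
]
start, nextfree = 0, 4

def insert_name(names, start, nextfree, new_name):
    """Insert new_name after the head of the list; return the new nextfree."""
    names[nextfree]["name"] = new_name           # store name in next free node
    p = start
    # Follow pointers until the next node holds a name > new_name (or is null)
    while (names[p]["pointer"] is not None
           and names[names[p]["pointer"]]["name"] < new_name):
        p = names[p]["pointer"]
    temp = nextfree                               # Step 1
    nextfree = names[temp]["pointer"]             # Step 2
    names[temp]["pointer"] = names[p]["pointer"]  # Step 3
    names[p]["pointer"] = temp                    # Step 4
    return nextfree

nextfree = insert_name(names, start, nextfree, "Mortimer")
print(nextfree, names[2]["pointer"], names[4])
# 5 4 {'name': 'Mortimer', 'pointer': 1}
```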
Diagrammatically:
[Figure 35.5: diagram of Steps 1 to 4, showing the pointer changes to nextfree, temp, Names[temp].pointer and Names[p].pointer]
Deleting an item
Returning to the table as in Figure 35.4, shown again below, we will delete Johnson.
Before deletion:  start → 0 → 3 → 2 → 1 → null    nextfree → 4 → 5 → null
After deletion:   start → 0 → 3 → 1 → null        nextfree → 2 → 4 → 5 → null
Figure 35.7
Here is the simplified algorithm:
p = start
follow pointers until Names[p].pointer points to the name to delete
temp = Names[p].pointer                   //put 2 in temp
Names[p].pointer = Names[temp].pointer    //put 1 in Cray’s pointer field
Names[temp].pointer = nextfree            //put 4 in Johnson’s pointer field
nextfree = temp                           //put 2 in nextfree
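A Python sketch of the simplified deletion is given below, using dictionaries for the nodes and None for null. As with the pseudocode, no special cases are handled, and the function name is hypothetical.

```python
names = [
    {"name": "Browning", "pointer": 3},
    {"name": "Turner",   "pointer": None},
    {"name": "Johnson",  "pointer": 1},
    {"name": "Cray",     "pointer": 2},
    {"name": "",         "pointer": 5},
    {"name": "",         "pointer": None},
]
start, nextfree = 0, 4

def delete_name(names, start, nextfree, target):
    """Unlink the node holding target and return the new nextfree."""
    p = start
    # Follow pointers until the node after p holds the name to delete
    while names[names[p]["pointer"]]["name"] != target:
        p = names[p]["pointer"]
    temp = names[p]["pointer"]                    # put 2 in temp
    names[p]["pointer"] = names[temp]["pointer"]  # put 1 in Cray's pointer field
    names[temp]["pointer"] = nextfree             # put 4 in Johnson's pointer field
    return temp                                   # 2 becomes the new nextfree

nextfree = delete_name(names, start, nextfree, "Johnson")
print(nextfree, names[3]["pointer"], names[2]["pointer"])   # 2 1 4
```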
Q8: Draw a diagram similar to Figure 35.5 to show the steps taken to adjust the pointers.
What are the special cases that the “Delete Item” algorithm will need to deal with?
Q9: Write an algorithm to count and print the number of items in the linked list Names.
Exercises
1. A list data structure can be represented using an array.
The pseudocode algorithm below can be used to carry out one useful operation on a list.
p = 1
if ListLength > 0 then
    while p <= ListLength AND List[p] < NewItem
        p = p + 1
    endwhile
    for q = ListLength downTo p
        List[q + 1] = List[q]
    next q
endif
List[p] = NewItem
ListLength = ListLength + 1
(a) The initial values of the variables for one particular execution of the algorithm are shown in the
trace table below, labelled Table 1.
Draw the trace table for the execution of the algorithm. The first line is given and you will need to
draw extra rows.
Table 1
                                  List
ListLength   NewItem   p   q   [1]   [2]   [3]   [4]   [5]
4            25        -   -   18    21    42    53
[4]
A-Level only
(c) A list implemented using an array is a static data structure. The list could be implemented
using a linked list as a dynamic data structure instead.
Describe one difference between a static data structure and a dynamic data structure. [1]
2. (a) The birds Robin, Sparrow, Blackbird, are entered, in the order given, into a linked list so
that they may be processed alphabetically. Draw a diagram of this linked list. [2]
(b) Redraw the diagram after two additional items, Chaffinch and Goldfinch, are added. [2]
(c) Show the list implemented in an array of records, with each node holding a data item and a
pointer, after the addition of the new items. [4]
(d) Write a pseudocode algorithm to print out the birds in the list in alphabetical order. [4]
Chapter 36 – Stacks
Objectives
• Be familiar with the concept and uses of a stack
• Be able to describe the creation and maintenance of data within a stack
• Be able to describe and apply the following operations: push, pop, peek (or top), test for empty
stack, test for full stack
• Be able to explain how a stack frame is used with subroutine calls to store return addresses,
parameters and local variables
Concept of a stack
A stack is a Last In, First Out (LIFO) data structure. This means that, like a
stack of plates in a cafeteria, items are added to the top and removed from
the top.
Applications of stacks
A stack is an important data structure in Computing. Stacks are used in calculations, and to hold return
addresses when subroutines are called. When you use the Back button in your Web browser, you will be
taken back through the previous pages that you looked at, in reverse order as their URLs are removed
from the stack and reloaded. When you use the Undo button in a word processing package, the last
operation you carried out is popped from the stack and undone.
Implementation of a stack
A stack may be implemented as either a static or dynamic data structure.
A static data structure such as an array can be used with two additional variables, one being a pointer to
the top of the stack and the other holding the size of the array (the maximum size of the stack).
1 Red
0 Blue
Operations on a stack
The following operations are required to implement a stack:
A-Level only
The following pseudocode implements four of the stack operations using a fixed size array.
function isEmpty()
    if top == -1 then
        return True
    else
        return False
    endif
endfunction
function isFull()
    //the array is indexed from 0, so the last element is at maxSize - 1
    if top == maxSize - 1 then
        return True
    else
        return False
    endif
endfunction
procedure push(item)
    if isFull() then
        print("Stack is full")
    else
        top = top + 1
        s[top] = item
    endif
endprocedure
function pop()
    if isEmpty() then
        print("Stack is empty")
    else
        item = s[top]
        top = top - 1
        return item
    endif
endfunction
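The four operations can be collected into a Python class. This is a sketch under the same assumptions as the pseudocode (a fixed size array indexed from 0, with top equal to -1 when the stack is empty); the class name and the colours used are illustrative.

```python
class Stack:
    def __init__(self, max_size):
        self.s = [None] * max_size     # fixed size array
        self.max_size = max_size
        self.top = -1                  # -1 means the stack is empty

    def is_empty(self):
        return self.top == -1

    def is_full(self):
        return self.top == self.max_size - 1

    def push(self, item):
        if self.is_full():
            print("Stack is full")
        else:
            self.top += 1
            self.s[self.top] = item

    def pop(self):
        if self.is_empty():
            print("Stack is empty")
            return None
        item = self.s[self.top]
        self.top -= 1
        return item

stack = Stack(2)
stack.push("Green")
stack.push("Pink")
print(stack.is_full(), stack.pop(), stack.top)   # True Pink 0
```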
Q2: Show the state of the stack and stack pointer after the following operations have been
performed on the stack containing (‘Blue’, ‘Red’):
(i) Pop
(ii) Pop
(iii) Push(‘Yellow’)
Some languages, such as Python, make it very easy to implement a stack using the built-in dynamic
list data structure, with the top of the stack being the last element of the list.
The function len(s) can be used to determine whether the stack is empty, and if it is not, pop() will
remove and return the top element. The built-in method append(item) will append or push an item
onto the top of the stack (the last element of the list).
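A short sketch of these list operations is shown here. The variable names s, item and top_item are illustrative only.

```python
s = []                 # an empty list acts as an empty stack

s.append("Blue")       # push
s.append("Red")        # push

top_item = s[-1]       # peek: look at the top item without removing it

if len(s) > 0:         # test for an empty stack before popping
    item = s.pop()     # removes and returns the last element
    print(item)        # Red

print(s)               # ['Blue']
```

Because a Python list grows as required, there is no need for a test for a full stack in this version.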
pushed onto the stack each time the routine is called are popped one after the other, each time the end
of the subroutine is reached. If the programmer makes an error and the recursion never ends, sooner or
later memory will run out, the stack will overflow and the program will crash.
Holding parameters
Parameters required for a subroutine (such as, for example, the centre coordinates, line colour and
thickness for a circle subroutine) may be held on the call stack. Each call to a subroutine will be given
separate space on the call stack for these values.
Local variables
A subroutine frequently uses local variables which are known only within the subroutine. These may also
be held in the call stack. Each separate call to a subroutine gets its own space for its local variables.
Storing local variables on the call stack is much more efficient than using dynamic memory allocation,
which uses heap space.
[Figure: the call stack, with the stack pointer at the top of the stack. The stack frame for drawLine holds, from the top down, the local variables for drawLine, the return address and the parameters for drawLine]
Exercises
1. A Last In, First Out (LIFO) data structure has a pointer called top.
(b) Name and briefly describe one type of error that could occur when attempting to add
a data item or remove a data item from the data structure. [2]
(c) Describe briefly one use of this type of data structure in a computer system. [2]
(d) Write a pseudocode procedure for reversing the elements of a queue with the aid of a stack. [6]
Hashing
Large collections of data, for example customer records in a database, need to be accessible very
quickly without having to look through all the records. This can be done by holding an index of the
physical address on the file where the data is held. But how is the index created?
The answer is that a hashing algorithm is applied to the value in the key field of each record to
transform it into an address. Normally there are many more possible keys than actual records that need
to be stored. For example, if 300 records are to be stored, each having a unique 6-digit identifier or key,
1000 free spaces may be allocated to store the records.
One common hashing algorithm is to divide the key by the number of available addresses and take the
remainder as the address. Using the algorithm (address = key mod 1000):
453781 would be stored at address 781
447883 would be stored at address 883
134552 would be stored at address 552
What will happen when the record with key 631552 is to be stored? This will hash to the same address
as 134552 and is called a synonym. Synonyms are bound to occur with any hashing algorithm, and two
record keys hashing to the same address is referred to as a collision.
A simple way of dealing with collisions is to store the item in the next available free space. Thus 631552
would be stored at address 553, assuming this space is unoccupied.
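The division method can be sketched in a few lines of Python. The names TABLE_SIZE and hash_address are illustrative assumptions.

```python
TABLE_SIZE = 1000   # number of available addresses

def hash_address(key):
    """Division method: the remainder after dividing by the table size."""
    return key % TABLE_SIZE

for key in (453781, 447883, 134552, 631552):
    print(key, "->", hash_address(key))
```

The last two keys both produce 552, which is the collision described above.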
Hash table
A hash table is a collection of items stored in such a way that they can quickly be located. The hash table
could be implemented as an array or list of a given size with a number of empty spaces. An empty hash
table that can store a maximum of 11 items is shown below, with spaces labelled 0,1, 2,…10.
0 1 2 3 4 5 6 7 8 9 10
Empty Empty Empty Empty Empty Empty Empty Empty Empty Empty Empty
Now assume we wish to store items 78, 55, 34, 19 and 29 in the table using the method described
above, using division by 11 and taking the remainder. Collisions are stored in the next available free slot.
First of all, calculate the hash value of each item to be stored.
CHAPTER 37 – HASH TABLES
0 1 2 3 4 5 6 7 8 9 10
55 78 34 Empty Empty Empty Empty 29 19 Empty Empty
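The table above can be reproduced with a short Python sketch. The function names insert and find, and the use of None to mark an empty slot, are assumptions made for illustration.

```python
EMPTY = None

def insert(table, item):
    """Store item at item mod len(table), probing forward on a collision."""
    size = len(table)
    address = item % size
    for _ in range(size):
        if table[address] is EMPTY:
            table[address] = item
            return address
        address = (address + 1) % size   # wrap round to the first cell
    raise OverflowError("hash table is full")

def find(table, item):
    """Return the address holding item, or -1 if it is not in the table."""
    size = len(table)
    address = item % size
    for _ in range(size):
        if table[address] is EMPTY:
            return -1
        if table[address] == item:
            return address
        address = (address + 1) % size
    return -1

table = [EMPTY] * 11
for item in (78, 55, 34, 19, 29):
    insert(table, item)

print(table)            # 55, 78 and 34 occupy slots 0, 1 and 2
print(find(table, 34))  # 2
```

Item 34 hashes to address 1, which is already occupied by 78, so it is placed in the next free slot.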
Folding method
There are many other algorithms. The folding method divides the item into equal parts, and adds the
parts to give the hash value. For example, a phone number 01543 677896 can be divided into groups of
two, namely 01, 54, 36, 77, 89, 6. Adding these together, we get 263. If the table has fewer spaces than
the maximum possible sum generated by this method, say 100 cells, then the extra step of dividing by
100 needs to be applied.
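A possible Python sketch of the folding method follows. The function name fold_hash and its parameters are hypothetical, and the item is passed as a string of digits so that a leading zero is kept.

```python
def fold_hash(digits, table_size, group=2):
    """Split a digit string into groups, add them, then reduce mod table size."""
    parts = [int(digits[i:i + group]) for i in range(0, len(digits), group)]
    return sum(parts) % table_size

print(fold_hash("01543677896", 1000))  # 263
print(fold_hash("01543677896", 100))   # 63
print(fold_hash("123456", 100))        # 2
```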
Q2: Using the folding method and division by 100, complete the hash table below to show where
each number will be stored in a table of 100 spaces. (A sample 123456 is done for you.)
(i) 238464 (ii) 188947 (iii) 276084
Item “Folded” value Remainder Location in hash table
123456 12+34+56=102 2 2
238464
188947
276084
Hashing a string
A hash function can be created for alphanumeric strings by using the ASCII code for each character.
A portion of the ASCII table is shown below:
To hash the word CAB, we could add up the ASCII values for each letter and, if there are 11 spaces in
the hash table, for example, divide by 11 and take the remainder as its hash value.
67 + 65 + 66 = 198 Hash value = 198 mod 11 = 0
so CAB goes in location 0 (assuming that location is empty).
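This calculation can be sketched in Python using the built-in ord function, which returns the character code. The function name hash_string is an assumption.

```python
def hash_string(word, table_size=11):
    """Add the character codes of a string and take the remainder."""
    return sum(ord(ch) for ch in word) % table_size

print(hash_string("CAB"))   # (67 + 65 + 66) mod 11 = 0
```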
Q3: (i) Using the above hashing algorithm, find the hash values of the following: BAG, TEA, EAT,
GAB. (ASCII code for ‘T’ = 84)
(ii) What do you notice about the hash values associated with these words?
(iii) Can you suggest a modification of the hashing algorithm that may result in fewer
collisions?
Collision resolution
The fuller the hash table becomes, the more likely it is that there will be collisions, and this needs to be
taken into account when designing the hashing algorithm and deciding on the table size. For example,
the size of the table could be designed so that when all the items are stored, only 70% of the table’s cells
are occupied.
Rehashing is the name given to the process of finding an empty slot when a collision has occurred.
The rehashing algorithm used above simply looks for the next empty slot. It will loop round to the first
cell if the end of the table is reached. A variation on this would be to look at every third cell, for example
(the “plus 3” rehash). Alternatively, the hash value could be incremented by 1, 3, 5, 7… until a free space
is found.
Different hashing and rehashing methods will work more efficiently on different data sets – the aim is to
minimise collisions.
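The sequences of addresses tried by these rehashing methods can be compared with a sketch like the one below. Both function names are hypothetical, and the sketch assumes that "incremented by 1, 3, 5, 7" means each increment is added to the previous address tried.

```python
def plus_n_probes(start, table_size, step):
    """Addresses tried by a 'plus step' rehash after a collision at start."""
    return [(start + step * i) % table_size for i in range(1, table_size)]

def odd_increment_probes(start, table_size, count):
    """Addresses tried when the hash value is incremented by 1, 3, 5, 7..."""
    probes, address = [], start
    for increment in range(1, 2 * count, 2):
        address = (address + increment) % table_size
        probes.append(address)
    return probes

print(plus_n_probes(1, 11, 1)[:4])      # [2, 3, 4, 5]
print(plus_n_probes(1, 11, 3)[:4])      # [4, 7, 10, 2]
print(odd_increment_probes(1, 11, 4))   # [2, 5, 10, 6]
```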
Dictionaries
A dictionary is an abstract data type consisting of associated pairs of items, where each pair consists of
a key and a value. It is a built-in data structure in Python and Visual Basic, for example. When the user
supplies the key, the associated value is returned. Items can easily be amended, added to or removed
from the dictionary as required.
In Python, dictionaries are written as comma-delimited pairs in the format key:value and enclosed in curly
braces. For example:
IDs = {342:'Harry', 634:'Jasmine', 885:'Max', 571:'Sheila'}
Operations on dictionaries
It is possible to implement a dictionary using either a static or a dynamic data structure. The
implementation needs to include the following operations:
Note that the pairs are not held in any particular sequence. The key is hashed using a hashing algorithm
and placed at the resulting location in a hash table, so that a fast lookup is possible.
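Typical dictionary operations can be illustrated with Python's built-in dict type. The entries for keys 901 and 885 below are hypothetical values added purely for illustration.

```python
ids = {342: "Harry", 634: "Jasmine", 885: "Max", 571: "Sheila"}

print(ids[634])       # look up the value for a key: Jasmine
ids[901] = "Priya"    # add a new key:value pair (hypothetical entry)
ids[885] = "Maxine"   # amend the value stored for an existing key
del ids[342]          # remove a pair
print(571 in ids)     # check whether a key is present: True
print(len(ids))       # number of pairs: 4
```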
Exercises
1. Student records held by a school are stored in a database which organises the data in files
using hashing.
(a) In the context of storing data in a file, explain what a hash function is. [1]
(b) The system allows for a maximum of 1000 unique 6-digit integer student IDs in the file
holding current student records. Give an example of a hashing function that could be used
to find a particular record. Ignore collisions. [2]
2. A bank has a number of safety deposit boxes in which customers can store valuable documents
or possessions. The details of which box is rented by a customer with a particular account number
are held in a dictionary data structure. Sample entries in the dictionary are:
(a) What value will be returned by a lookup operation using the key 1178612? [1]
(b) The dictionary is implemented using a hash table, using the algorithm
What value is returned by the hashing function when it is applied to account number 0093421? [1]
(c) What is the maximum number of entries that can be made in the dictionary? [1]
(ii) Give an example of how a collision might occur in this scenario, using sample account
numbers.[2]
(iii) Describe one way of dealing with collisions in the hash table. [1]
Chapter 38 – Graphs
Objectives
• Be aware of a graph as a data structure used to represent complex relationships
• Be familiar with typical uses for graphs
• Be able to explain the terms: graph, weighted graph, vertex/node, edge/arc, undirected graph,
directed graph
• Know how an adjacency matrix and an adjacency list may be used to represent a graph
• Be able to compare the use of adjacency matrices and adjacency lists
Definition of a graph
A graph is a set of vertices or nodes connected by edges or arcs. The edges may be one-way or
two-way. In an undirected graph, all edges are bidirectional. If the edges in a graph are all one-way, the
graph is said to be a directed graph or digraph.
[Figure 38.1: weighted graph of the towns Bury St Edmunds, Stowmarket, Ipswich, Woodbridge, Wickham Market and Framlingham, with edge weights showing the distances between them]
The edges may be weighted to show there is a cost to go from one vertex to another as in Figure 38.1.
The weights in this example represent distances between towns. A human driver can find their way
from one town to another by following a map, but a computer needs to represent the information about
distances and connections in a structured, numerical representation.
Implementing a graph
Two possible implementations of a graph are the adjacency matrix and the adjacency list.
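[Figure 38.2: directed graph with vertices A to F and weighted edges A to B (5), A to C (4), B to C (6), B to D (3), C to F (8) and D to E (2)]
     A   B   C   D   E   F
A        5   4
B            6   3
C                        8
D                    2
E
F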
In the case of an undirected graph, the adjacency matrix will be symmetric, with the same entry in (0,1)
as in (1,0), for example.
An unweighted graph may be represented with 1s instead of weights, in the relevant cells.
Q1: Draw an adjacency matrix to represent the weighted graph shown in Figure 38.1.
A {B:5, C:4}
B {C:6, D:3}
C {F:8}
D {E:2}
E {}
F {}
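The adjacency list above maps naturally onto a Python dictionary of dictionaries. The sketch below also builds the equivalent adjacency matrix from it. The function name to_matrix and the use of 0 to mean "no edge" are assumptions made for illustration.

```python
graph = {
    "A": {"B": 5, "C": 4},
    "B": {"C": 6, "D": 3},
    "C": {"F": 8},
    "D": {"E": 2},
    "E": {},
    "F": {},
}

def to_matrix(adjacency):
    """Build an adjacency matrix (0 meaning no edge) from an adjacency list."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for node, neighbours in adjacency.items():
        for neighbour, weight in neighbours.items():
            matrix[index[node]][index[neighbour]] = weight
    return nodes, matrix

nodes, matrix = to_matrix(graph)
print(graph["B"]["D"])   # weight of the edge from B to D: 3
for node, row in zip(nodes, matrix):
    print(node, row)
```

The matrix always needs one cell for every pair of nodes, whereas the dictionary stores only the edges that exist.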
The unweighted graph in Figure 38.2 would be represented as shown below, with the adjacency list
containing lists of nodes adjacent to each node. A dictionary data structure is not required here as there
are no edge weights.
A [B,C]
B [C,D]
C [F]
D [E]
E []
F []
The advantage of this implementation is that it uses much less memory to represent a sparsely
connected graph.
Q2: Draw an adjacency list to represent the unweighted graph shown in Figure 38.2, but assuming
this time that it is undirected.
Traversing a graph
There are two ways to traverse a graph so that every node is visited.
Depth-first traversal
In this traversal, we go as far down one route as we can before backtracking and taking the next route.
Consider the following graph:
[Figure 38.3: graph with nodes A, B, C, D, E, F, G, H, J and K]
Starting at A, we can either go left or right. We will choose to go left whenever there is a choice of routes.
We visit C, F, J, H, D, G. We have already visited F so we have reached the end of this path. Back up to
D and visit E. Now we must retrace our steps via D, H, J, F, C, to A, and go down the alternative route to
B and K.
Nodes were visited in the order A C F J H D G E B K.
This sequence involved some choices, so is not unique. Another depth-first route would be
A C D E H J F G B K.
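A recursive depth-first traversal might be sketched in Python as follows. Because the edges of Figure 38.3 are not listed in the text, this sketch uses the unweighted directed graph of Figure 38.2 instead. The function name depth_first is an assumption, and neighbours are taken in the order in which they appear in each list.

```python
graph = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["F"],
    "D": ["E"],
    "E": [],
    "F": [],
}

def depth_first(graph, node, visited=None):
    """Visit node, then go as deep as possible along each unvisited neighbour."""
    if visited is None:
        visited = []
    visited.append(node)
    for neighbour in graph[node]:
        if neighbour not in visited:
            depth_first(graph, neighbour, visited)
    return visited

print(depth_first(graph, "A"))   # ['A', 'B', 'C', 'F', 'D', 'E']
```

The recursion uses the call stack to remember where to backtrack to, which is why depth-first traversal is associated with a stack.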
Q3: Find another route that visits all nodes in a depth-first search.
Breadth-first search
With a breadth-first traversal, we visit all the neighbours of a node, and then all the neighbours of the
first node visited, all the neighbours of the second node and so on, before moving further away from the
start node.
Consider the graph below:
[Figure 38.4: graph in which A is connected to B, C and D, C is connected to E, and D is connected to F]
Starting at A, we visit B, then C, then D (or we could have started by visiting C or D).
Then we move to B, which has no other neighbours, so we back up to A and go to C. From C, we visit E
before returning to A. Next, we go to D and visit F. All nodes have now been visited, in the order A B C D
E F.
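A breadth-first traversal is usually written with a queue. The sketch below encodes the graph of Figure 38.4 as described in the text. The function name breadth_first and the order of neighbours within each list are assumptions.

```python
from collections import deque

graph = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A", "E"],
    "D": ["A", "F"],
    "E": ["C"],
    "F": ["D"],
}

def breadth_first(graph, start):
    """Visit all neighbours of each node before moving further from start."""
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)
    return visited

print(breadth_first(graph, "A"))   # ['A', 'B', 'C', 'D', 'E', 'F']
```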
Q4: Find another route that visits all nodes in a breadth-first search.
Q5: Write down a possible route through the graph in Figure 38.3 using a breadth-first search.
Applications of graphs
Graphs may be used to represent, for example:
• computer networks, with nodes representing computers and weighted edges representing the
bandwidth between them
• roads between towns, with edge weights representing distances, rail fares or journey times
• tasks in a project, some of which have to be completed before others
• states in a finite state machine
• web pages and links (see Google’s PageRank algorithm in Section 5)
Exercises
1. The figure below shows an adjacency matrix representation of a directed graph (digraph).
                 To
           A   B   C   D   E
       A   0   5   3  10   0
       B   0   0   1   8   0
From   C   0   0   0   7   6
       D   0   0   0   0   4
       E   0   0   0   0   0
(a) Draw a diagram of the directed graph, showing edge weights. [3]
(c) Give one advantage of using an adjacency matrix to represent a graph, and one advantage
of using an adjacency list. Explain the circumstances in which each is more appropriate. [4]
[Figure: graph with nodes A, B, C, D, E, F, G and H]
(a) Complete the adjacency matrix below to represent this graph. [4]
A B C D E F G H
A
B
C
D
E
F
G
H
(b) List the nodes in the order in which they would be visited using
Chapter 39 – Trees
Objectives
• Define a binary tree as a rooted tree in which each node has at most two children
• Create and traverse a binary tree
• Create, search and traverse a binary search tree
Concept of a tree
Trees are a very common data structure in many areas of computer science and other contexts.
Like a tree in nature, a rooted tree has a root, branches and leaves, the difference being that a tree
in computer science has its root at the top and its leaves at the bottom.
Typical uses for rooted trees include:
The tree shown above has a root node, and is therefore defined as a rooted tree. Here are some terms
used in connection with rooted trees:
Node: The nodes contain the tree data
Edge: An edge connects two nodes. Every node except the root is connected by exactly one
edge from another node
Root: This is the only node that has no incoming edges
Child: The set of nodes that have incoming edges from the same node
Parent: A node is a parent of all the nodes it connects to with outgoing edges
Subtree: The set of nodes and edges comprised of a parent and all descendants of the parent.
A subtree may also be a leaf
Leaf node: A node that has no children
Q1: Identify the leftmost subtree, the parent of Frank and the children of Kate. How many parent
nodes are there in the tree? How many child nodes?
Note that a rooted tree is a special case of a connected graph. A node can only be connected to
one parent node, and to its children. It is described as having no cycles because there can be no
connection between children, or between branches, for example from Ben to Anna or Petra to Kate.
              17
            /    \
          8        22
         / \      /  \
        4   12  19    30
         \    \      /
          5    14  25
To search the tree for the number 19, for example, we follow the same steps.
19 is greater than 17, so branch right.
19 is less than 22, so branch left. There it is!
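Building and searching the tree can be sketched in Python. The class Node and the functions insert and search are illustrative names, and the insertion order used here is simply one order that produces the tree shown above.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def insert(root, data):
    """Insert data into the tree rooted at root and return the root."""
    if root is None:
        return Node(data)
    if data < root.data:
        root.left = insert(root.left, data)
    else:
        root.right = insert(root.right, data)
    return root

def search(root, target):
    """Return the list of values visited while looking for target."""
    path = []
    node = root
    while node is not None:
        path.append(node.data)
        if target == node.data:
            break
        node = node.left if target < node.data else node.right
    return path

root = None
for value in (17, 8, 22, 4, 12, 19, 30, 5, 14, 25):
    root = insert(root, value)

print(search(root, 19))   # [17, 22, 19]
```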
Q2: (a) Which nodes will be visited when searching for the number 14?
(b) Which nodes will be visited when searching for the number 21, which is not in the tree?
(c) Where will new nodes 10 and 20 be inserted?
• Pre-order traversal
• In-order traversal
• Post-order traversal
The names refer to whether the root of each sub-tree is visited before, between or after both branches
have been traversed.
Pre-order traversal
Draw an outline around the tree structure, starting to the left of the root. As you pass to the left of a node
(where the red dot is marked), output the data in that node.
[Figure: the binary search tree outlined, with a red dot marked to the left of each node]
The nodes will be visited in the sequence 17, 8, 4, 5, 12, 14, 22, 19, 30, 25
A pre-order traversal may be used to produce prefix notation, used in functional programming languages.
A simple illustration would be a function statement, x = sum a,b rather than x = a + b, in which
the operation comes before the operands rather than between them, as in infix notation.
In-order traversal
Draw an outline around the tree structure, starting to the left of the root. As you pass underneath a node
(where the red dot is marked), output the data in that node.
[Figure: the binary search tree outlined, with a red dot marked underneath each node]
The nodes will be visited in the sequence 4, 5, 8, 12, 14, 17, 19, 22, 25, 30.
The in-order traversal of a binary search tree visits the nodes in ascending order of their values.
Q3: Construct a binary search tree to hold the names Mark, Stephanie, Chigozie, Paul, Anne, Hanna,
Luke, David, Vincent, Tom. List the names, in the order they would be checked, to find David.
Q4: List the names in the order they would be output when an in-order traversal is performed.
Post-order traversal
Draw an outline around the tree structure, starting to the left of the root. As you pass to the right of a
node (where the red dot is marked), output the data in that node.
[Figure: the binary search tree outlined, with a red dot marked to the right of each node]
The nodes will be visited in the sequence 5, 4, 14, 12, 8, 19, 25, 30, 22, 17.
• left pointer
• data item
• right pointer
Alternatively, it could be held in a list of tuples, or three separate lists or arrays, one for each of the
pointers and one for the data items.
The numbers 17, 8, 4, 12, 22, 19, 14, 5, 30, 25 used to construct the tree above could be held as
follows:
For example, the left pointer in tree[0] points to tree[1] and the right pointer points to tree[4]. The value -1
is a ‘rogue value’ which indicates that there is no child on the relevant side (left or right).
Q5: Show how the search tree below could be implemented in an array with left and right pointers.
Names were inserted in the tree in the following order: Monkey, Topi, Ostrich, Giraffe, Hippo,
Zebra, Buffalo, Cheetah, Rhino, Baboon, Jackal
Monkey
Giraffe Topi
        +
      /   \
     *     /
    / \   / \
   a   b c   d
Figure 39.1
Suppose this data is held as shown below:
In pseudocode:
procedure inorderTraverse(p)
   if tree[p].left != -1 then
      inorderTraverse(tree[p].left)
   endif
   print(tree[p].data)
   if tree[p].right != -1 then
      inorderTraverse(tree[p].right)
   endif
endprocedure
The routine is called with a statement inorderTraverse(0)
Tracing through the algorithm, the nodes are output in the order a * b + c / d
In pseudocode:
procedure postorderTraverse(p)
   if tree[p].left != -1 then
      postorderTraverse(tree[p].left)
   endif
   if tree[p].right != -1 then
      postorderTraverse(tree[p].right)
   endif
   print(tree[p].data)
endprocedure
The nodes are output in the sequence a b * c d / +. This is the sequence in which algebraic expressions
are written using Reverse Polish Notation, which is used by compilers to evaluate expressions.
In pseudocode:
procedure preorderTraverse(p)
   print(tree[p].data)
   if tree[p].left != -1 then
      preorderTraverse(tree[p].left)
   endif
   if tree[p].right != -1 then
      preorderTraverse(tree[p].right)
   endif
endprocedure
A pre-order traversal may be used for producing a prefix expression from an expression tree such as
the one shown in Figure 39.1. Prefix is used in some compilers and calculators.
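The three traversal procedures can be tried out in Python on the expression tree of Figure 39.1. The array layout below is an assumption made for this sketch (root in tree[0], with -1 marking a missing child), as are the names Node and traverse. A single function handles all three orders so that the only difference, the point at which the node's data is output, is easy to see.

```python
from collections import namedtuple

Node = namedtuple("Node", ["left", "data", "right"])

# Assumed layout: tree[0] is the root; -1 means there is no child
tree = [
    Node(1, "+", 2),
    Node(3, "*", 4),
    Node(5, "/", 6),
    Node(-1, "a", -1),
    Node(-1, "b", -1),
    Node(-1, "c", -1),
    Node(-1, "d", -1),
]

def traverse(p, order, out):
    """Append node data to out in pre-, in- or post-order."""
    if p == -1:
        return out
    if order == "pre":
        out.append(tree[p].data)
    traverse(tree[p].left, order, out)
    if order == "in":
        out.append(tree[p].data)
    traverse(tree[p].right, order, out)
    if order == "post":
        out.append(tree[p].data)
    return out

print(" ".join(traverse(0, "in", [])))     # a * b + c / d
print(" ".join(traverse(0, "post", [])))   # a b * c d / +
print(" ".join(traverse(0, "pre", [])))    # + * a b / c d
```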
Exercises
1. Data may be stored as a binary tree.
(a) Show how the following data may be stored as a binary tree for subsequent processing in
alphabetic order by drawing the tree. Assume that the first item is the root of the tree and
the rest of the data items are inserted into the tree in the order given,
Data items: magpie, robin, chaffinch, linnet, thrush, blackbird, fieldfare, skylark, pigeon. [3]
(b) Show how the data could be represented using three one-dimensional arrays. [3]
(c) List the order that the nodes would be visited using
2. In what order should the following tree be traversed so that each section and subsection is
printed in the correct sequence? [1]
Project
Section 1.1 Section 1.2 Section 2.1 Section 3.1 Section 3.1
Section 8
Boolean algebra
In this section:
Chapter 40 Logic gates and truth tables
CHAPTER 40 – LOGIC GATES AND TRUTH TABLES
Binary logic
At the most elementary level, an electronic device can only recognise the presence or absence of current
or voltage. Either electricity is present or it isn’t. This is a switch – on or off, true or false, 1 or 0. With a
computer’s semiconductor, the voltage at the input and output terminals is measured and is either high
or low; 1 or 0. Computers comprise billions of these switches and manipulating these sequences of ONs
and OFFs can change individual bits.
Electronic logic gates can take one or more inputs and produce a single output. This output can become
the input to another gate and a complicated cascaded sequence of logic gates can be implemented to
form a circuit in, for example, the CPU.
Simple logic gates and truth tables
There are a number of different logic gates that are each designed to perform a different operation in
terms of output. We will look at NOT, AND, OR and XOR gates.
Each of these gates may be represented by a truth table showing the output for each possible input or
combination of inputs. The four gates are shown below. Inputs are usually given algebraic letters such as
A, B and C and output is usually represented by P or Q.
[Figure: NOT gate with input A and output Q]
Input A   Output Q
0         1
1         0
Q = NOT A
OR gate (disjunction)
[Figure: OR gate with inputs A and B and output Q]
Input A   Input B   Output Q
0         0         0
0         1         1
1         0         1
1         1         1
Q = A OR B
The Boolean expression for OR is written: Q = A ∨ B where ∨ represents OR.
where A and B are both true. XOR is referred to as exclusive OR, and OR is sometimes referred to as
inclusive OR.
[Figure: logic circuit with inputs A, B and C, intermediate outputs D and E, and output Q]
A  B  C  D  E  Q
0  0  0  1  0  1
0  0  1  1  0  1
0  1  0  1  0  1
0  1  1  1  1  1
1  0  0  0  0  0
1  0  1  0  0  0
1  1  0  0  0  0
1  1  1  0  1  1
[Figure: logic circuit with inputs A, B and C, intermediate outputs D and E, and output Q]
Q2: Show, by drawing a truth table for P = (A ∧ ¬B) ∨ (¬A ∧ B), that P = Q, where Q = A ⊻ B (A XOR B).
Q3: Write the Boolean expression Q = ¬((A ⊻ B) ∧ C) using AND, OR, NOT, XOR instead of symbols.
Draw the corresponding logic circuit.
Q4: Write the Boolean expression represented by the logic diagram below, using AND, OR and NOT
instead of symbols. Then write the same expression using symbols. What is the output if A, B
and C are all True?
[Figure: logic diagram with inputs A, B and C and output P]
Example 1
A boiler has two sensors, a pressure sensor and a temperature sensor. If either the temperature (T) or the
pressure (P) is too high, a valve (V) will close.
This can be expressed as V = T ∨ P or alternatively as V = T OR P
Example 2
A chemical process has a sensor to detect a dangerous situation, in which case it sounds an alarm (A).
The alarm is sounded if:
either temperature >= 100°C AND rotator is OFF
or pH > 6 AND temperature < 100°C
A table can be drawn to represent these conditions as Boolean values.
Input   Binary value   Condition
T       1              Temperature >= 100°C
        0              Temperature < 100°C
R       1              Rotator ON
        0              Rotator OFF
P       1              pH > 6
        0              pH <= 6
Now the logic circuit for this process can be drawn as follows:
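The same alarm condition can be written as the Boolean expression A = (T ∧ ¬R) ∨ (P ∧ ¬T), and checked with a short Python sketch. The function name alarm and its lower-case parameter names are assumptions.

```python
def alarm(t, r, p):
    """t: temperature >= 100°C, r: rotator ON, p: pH > 6."""
    return (t and not r) or (p and not t)

# Print the full truth table: T, R, P, then the alarm output A
for t in (False, True):
    for r in (False, True):
        for p in (False, True):
            print(int(t), int(r), int(p), int(alarm(t, r, p)))
```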
Exercises
1. (a) Complete the following truth table for the XOR logic gate.
[1]
(ii) Q = ¬A ∧ B ∨ C [3]
[Figure: logic circuit with inputs A, B, C, D and E, intermediate outputs F, G, H and K, and output Q]
(b) What are the values of F, G, H, K and Q if A, B, C, D and E are all equal to 1? [5]
3. Three sensors A, B and C are used to monitor a process. A signal X is output from the circuit.
de Morgan’s laws
Augustus de Morgan (1806-1871) was a Cambridge-educated professor of mathematics in London who formulated two
theorems or laws relating to logic. These laws can be used to manipulate and simplify Boolean
expressions. Although his theoretical work had little practical application in his lifetime, it became of
major significance in the next century in the field of digital electronics, in which TRUE and FALSE can be
replaced by ON and OFF or the binary numbers 0 and 1.
Using de Morgan’s laws, any Boolean function can be converted to an equivalent expression which uses
only NAND functions or only NOR functions.
Thus, any integrated circuit can be built from just one type of logic gate. This is an advantage in
manufacturing where costs can be kept down by using only one type of gate.
The truth of this is clear from the Venn diagram on the right. Suppose we
have a variable X defined by
X = ¬(A ∨ B)
[Figure: Venn diagram of A and B, with X labelled]
X = ¬A ∧ ¬B
A   B   ¬A   ¬B   A ∨ B   ¬(A ∨ B)   ¬A ∧ ¬B
0   0
0   1
1   0
1   1
CHAPTER 41 – SIMPLIFYING BOOLEAN EXPRESSIONS
A   B   ¬A   ¬B   A ∧ B   ¬(A ∧ B)   ¬A ∨ ¬B
0
0
1
1
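Both of de Morgan's laws, ¬(A ∨ B) = ¬A ∧ ¬B and ¬(A ∧ B) = ¬A ∨ ¬B, can be confirmed by testing every combination of inputs, which is what completing the truth tables does by hand. A minimal Python sketch follows. The function name de_morgan_holds is an assumption.

```python
from itertools import product

def de_morgan_holds():
    """Check both laws for every combination of A and B."""
    for a, b in product([False, True], repeat=2):
        first = (not (a or b)) == ((not a) and (not b))
        second = (not (a and b)) == ((not a) or (not b))
        print(int(a), int(b), first, second)
        if not (first and second):
            return False
    return True

print(de_morgan_holds())   # True
```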
General rules
1. X ∧ 0 = 0
2. X ∧ 1 = X
3. X ∧ X = X
4. X ∧ ¬X = 0
5. X ∨ 0 = X
6. X ∨ 1 = 1
7. X ∨ X = X
8. X ∨ ¬X = 1
Commutative rule
9. X ∧ Y = Y ∧ X
10. X ∨ Y = Y ∨ X
Associative rules
11. X ∧ (Y ∧ Z) = (X ∧ Y) ∧ Z
12. X ∨ (Y ∨ Z) = (X ∨ Y) ∨ Z
Distributive rules
13. X ∧ (Y ∨ Z) = (X ∧ Y) ∨ (X ∧ Z)
14. (X ∨ Y) ∧ (W ∨ Z) = (X ∧ W) ∨ (X ∧ Z) ∨ (Y ∧ W) ∨ (Y ∧ Z)
Absorption rules
15. X ∨ (X ∧ Y) = X
16. X ∧ (X ∨ Y) = X
Double negation
17. X = ¬¬X
Example 1
Use de Morgan’s laws and the laws of Boolean algebra to simplify the following Boolean expression:
Q = ¬(¬(X ∧ ¬Y) ∧ (¬Y ∨ ¬Z))
Example 2
Use de Morgan’s laws to simplify A ∧ B ∨ ¬A ∨ ¬B
Answer: Put brackets between the parts of the expression separated by ∨ (OR)
(A ∧ B) ∨ (¬A ∨ ¬B)
Example 3
Use Boolean algebra to show that (A ∨ B) ∧ (A ∨ C) = A ∨ B ∧ C
(A ∨ B) ∧ (A ∨ C)
= (A ∧ A) ∨ (A ∧ C) ∨ (B ∧ A) ∨ (B ∧ C) (distributive rule)
= A ∨ (B ∧ A) ∨ (A ∧ C) ∨ (B ∧ C) (since A ∧ A = A)
= A ∨ (A ∧ B) ∨ (A ∧ C) ∨ (B ∧ C) (commutative law)
= A ∨ (A ∧ C) ∨ (B ∧ C) (Absorption Law)
= A ∨ (B ∧ C) (Absorption Law)
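An algebraic result of this kind can be double-checked by comparing the two sides for every combination of inputs. The helper name equivalent in the sketch below is hypothetical. It is used here for Example 3 and for the two absorption rules.

```python
from itertools import product

def equivalent(f, g, variables):
    """True if f and g agree for every combination of Boolean inputs."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=variables))

# Example 3: (A OR B) AND (A OR C) = A OR (B AND C)
lhs = lambda a, b, c: (a or b) and (a or c)
rhs = lambda a, b, c: a or (b and c)
print(equivalent(lhs, rhs, 3))   # True

# Absorption rules 15 and 16
print(equivalent(lambda x, y: x or (x and y), lambda x, y: x, 2))   # True
print(equivalent(lambda x, y: x and (x or y), lambda x, y: x, 2))   # True
```

An exhaustive check like this confirms that two expressions are equal, but unlike the algebra it does not show how to find the simpler form.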
Example 4
A single output Q is produced from three inputs X, Y and Z. Q is 1 only if X and Y are 1, or Z is 1
and Y is 0.
Write the Boolean expression to represent this circuit.
Answer: There are two separate logic gates involved here: X ∧ Y, Z ∧ ¬Y.
The outputs from these two gates are input to an OR gate.
Q = (X ∧ Y) ∨ (Z ∧ ¬Y)
Represent this equation diagrammatically using a combination of AND, OR and NOT gates.
Answer:
[Figure: logic circuit with inputs X, Y and Z and output Q]
Example 5
Write the Boolean expression corresponding to the following logic circuit.
[Figure: logic circuit with inputs A, B and C and output Q]
Answer: A ∨ ¬(B ∧ C)
Exercises
1. (a) Write a Boolean expression for P in the logic circuit shown in Figure 1. [1]
(b) Write a Boolean expression for R. [1]
(c) Draw the truth table for the logic circuit. [3]
[Figure 1: logic circuit with inputs A, B and C, and outputs labelled P, Q and R]
3. (a) State the names of the logic gates represented by each of the truth tables below. [2]
(ii) A ∧ B ∨ B [1]
CHAPTER 42 – KARNAUGH MAPS
Introduction
A Karnaugh map provides an alternative way of simplifying Boolean expressions which is often easier
than using Boolean algebra for those involving up to three or four variables. It is similar to a truth table
and allows us to easily detect groupings of expressions with common factors.
Truth table:
A   B   P
0   0   a
0   1   b
1   0   c
1   1   d

Karnaugh map:
     B
A    0   1
0    a   b
1    c   d
The values inside the squares are copied from the output column of the truth table, so there is one
square in the Karnaugh map for every row in the truth table. Suppose we have the following truth table:
For example, when A = 0 and B = 0, the output is 0. When A = 1 and B = 1, the output is 1.
Example 1
Use a Karnaugh map to simplify the expression Q = ¬A ∧ ¬B ∨ A ∧ ¬B ∨ ¬A ∧ B
Draw a blank Karnaugh map and fill in a 1 for the first sub-expression ¬A ∧ ¬B. Then insert a 1 for the
second sub-expression A ∧ ¬B. Finally add a 1 for the sub-expression ¬A ∧ B.

(¬A ∧ ¬B)
     B
A    0   1
0    1
1

(¬A ∧ ¬B) ∨ (A ∧ ¬B)
     B
A    0   1
0    1
1    1

(¬A ∧ ¬B) ∨ (A ∧ ¬B) ∨ (¬A ∧ B)
     B
A    0   1
0    1   1
1    1

[Figure: the completed map with two overlapping groups outlined, one in pink and one in blue]
Now make groupings of 1, 2, or 4 ones, which can be overlapping. Each grouping should be as large as
possible – in this case, the two groupings each consist of two squares.
The pink group represents NOT A, and the blue group represents NOT B. Therefore the whole expression
represents Q = NOT A OR NOT B, or in alternative notation, ¬A ∨ ¬B.
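The result of Example 1 can be checked by generating the Karnaugh map from the original expression and comparing it with the simplified one. The function names q_original and q_simplified in this sketch are assumptions.

```python
from itertools import product

def q_original(a, b):
    return (not a and not b) or (a and not b) or (not a and b)

def q_simplified(a, b):
    return (not a) or (not b)

# One row per value of A, one column per value of B
print("A  B=0 B=1")
for a in (False, True):
    row = [int(q_original(a, b)) for b in (False, True)]
    print(int(a), " ", row[0], "  ", row[1])

same = all(q_original(a, b) == q_simplified(a, b)
           for a, b in product([False, True], repeat=2))
print(same)   # True
```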
Q1: Draw a Karnaugh map representing A ∧ B ∨ A ∧ ¬B, and hence simplify the expression.
Example 2
Represent the expression ¬A ∨ ¬B ∨ A ∧ B ∧ ¬C in a Karnaugh map, and hence simplify the expression.
BC
A 00 01 11 10
0
1
Note: The order of terms along the top is not random: they are arranged so that each subsequent term
reflects a change in only one variable. They are not in numerical sequence of 00, 01, 10, 11.
The choice of whether to put A on its own, and group B and C together, or choose a different pair, and
put for example C as the column heading and AB as the row heading, is not important, and will produce
the same groupings.
First, divide the expression into sub-expressions, bracketing between the ∨ (OR) symbols, giving
(¬A) ∨ (¬B) ∨ (A ∧ B ∧ ¬C)
As before, we can now start filling in the table one step at a time, representing each sub-expression in turn.

(¬A)
     BC
A    00  01  11  10
0     1   1   1   1
1

(¬A) ∨ (¬B)
     BC
A    00  01  11  10
0     1   1   1   1
1     1   1

(¬A) ∨ (¬B) ∨ (A ∧ B ∧ ¬C)
     BC
A    00  01  11  10
0     1   1   1   1
1     1   1       1

[Figure: the completed map with three groups outlined]
Notice that the green group has “wrapped around” and is counted as one group representing ¬C.
These three groups together represent ¬A ∨ ¬B ∨ ¬C.
Q2: Use a Karnaugh map to simplify the expression (A ∧ C) ∨ (¬A ∧ B) ∨ (B ∧ C)
Example 3
Use a Karnaugh map to simplify the expression (¬A ∧ B) ∨ (B ∧ ¬C) ∨ (B ∧ C) ∨ (A ∧ ¬B ∧ ¬C)

(¬A ∧ B)
     BC
A    00  01  11  10
0             1   1
1

(¬A ∧ B) ∨ (B ∧ ¬C)
     BC
A    00  01  11  10
0             1   1
1                 1

(¬A ∧ B) ∨ (B ∧ ¬C) ∨ (B ∧ C)
     BC
A    00  01  11  10
0             1   1
1             1   1

(¬A ∧ B) ∨ (B ∧ ¬C) ∨ (B ∧ C) ∨ (A ∧ ¬B ∧ ¬C)
     BC
A    00  01  11  10
0             1   1
1     1       1   1

[Figure: the completed map with two groups outlined]
Here, the group outlined in green “wraps around” but is still a single group. The expression simplifies to
B ∨ (A ∧ ¬C)
AB
C 00 01 11 10
0
1
Example 4
Represent the expression A ∨ (A ∧ ¬B ∧ C ∧ D) in a Karnaugh map, and hence simplify the expression.

A
     CD
AB   00  01  11  10
00
01
11    1   1   1   1
10    1   1   1   1

A ∨ (A ∧ ¬B ∧ C ∧ D)
     CD
AB   00  01  11  10
00
01
11    1   1   1   1
10    1   1   1   1
This simplifies to A.
Exercises
1. A Karnaugh map is shown below.
CD
AB 00 01 11 10
00 0 0 0 0
01 0 0 0 0
11 1 1 1 1
10 0 0 0 0
CD
AB 00 01 11 10
00 0 1 0 0
01 0 1 0 0
11 1 1 1 1
10 1 1 1 1
CD
AB 00 01 11 10
0
1
(A ^ B ^ ¬C ^ D) (A ^ B ^ ¬C ^ D) (A ^ B ^ C ^ ¬D) [4]
^ ^ ^
Half adders
A half adder can take an input of two bits and give a two-bit output as the correct result of an addition of
the two inputs.
[Figure: half adder circuit with inputs A and B and outputs S and C]
A   B   S   C
0 + 0 = 0   0
0 + 1 = 1   0
1 + 0 = 1   0
1 + 1 = 0   1
This is shown by the diagram above and represented by the truth table where S represents the sum and
C represents the carry bit. S can be given as S = A ⊻ B (A XOR B), and C as C = A ∧ B. Although a half
adder can output the value of a carry bit, it only has two inputs so it cannot use the carry from a previous
addition as a third input to a subsequent addition in order to add n-bit numbers.
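The behaviour of a half adder can be modelled with Python's bitwise operators. Note that in Python the ^ operator means XOR and & means AND. The function name half_adder is an assumption.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    s = a ^ b   # XOR gives the sum bit
    c = a & b   # AND gives the carry bit
    return s, c

for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```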
Full adders
A full adder combines two half adders to add three bits together including the two inputs A and B, and
a carry bit C. The logic gate circuit below illustrates how two half adders have been connected with an
additional OR gate to output the carry bit.
[Figure: full adder circuit built from two half adders and an OR gate, with inputs A, B and Cin and outputs S and Cout]
A   B   Cin   S Cout
0 + 0 + 0   = 0 0
0 + 0 + 1   = 1 0
0 + 1 + 0   = 1 0
0 + 1 + 1   = 0 1
1 + 0 + 0   = 1 0
1 + 0 + 1   = 0 1
1 + 1 + 0   = 0 1
1 + 1 + 1   = 1 1
Now the Boolean logic becomes S = A ⊻ B ⊻ Cin, and Cout = (A ∧ B) ∨ (Cin ∧ (A ⊻ B)).
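The construction of a full adder from two half adders and an OR gate can be sketched in Python as follows. The function names are illustrative. The ripple_add function assumes that several full adders are chained so that each Cout feeds the next adder's Cin, with bits listed least significant first.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(a, b, c_in):
    """Return (sum, carry out) for two input bits and a carry in."""
    s1, c1 = half_adder(a, b)       # first half adder adds A and B
    s, c2 = half_adder(s1, c_in)    # second half adder adds in the carry
    c_out = c1 | c2                 # OR gate combines the two carries
    return s, c_out


def ripple_add(a_bits, b_bits):
    """Add two equal-length lists of bits, least significant bit first."""
    carry = 0
    result = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    return result, carry


# 0101 (5) + 0011 (3), written least significant bit first
print(ripple_add([1, 0, 1, 0], [1, 1, 0, 0]))  # ([0, 0, 0, 1], 0), i.e. 1000 = 8
```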
CHAPTER 43 – ADDERS AND D-TYPE FLIP-FLOPS
[Figure: four connected adders with inputs A3 B3, A2 B2, A1 B1 and A0 B0, and outputs S3, S2, S1 and S0]
Q1: What would be the output S4 from a fifth adder connected to the diagram above if the inputs for
A4 and B4 were 0 and 1? What would be the output C5?
D-type flip-flops
A flip-flop is an elementary sequential logic circuit that can store one bit and flip between two states, 0
and 1. It has two inputs, a control input labelled D and a clock signal.
The clock or oscillator is another type of sequential circuit that changes state at regular time intervals.
Clocks are needed to synchronise the change of state of flip-flop circuits.
[Figure: clock signal alternating between 0 and 1, labelled with the clock period, clock width, rising edge and falling edge]
The D-type flip-flop (D stands for Data or Delay) is a positive edge-triggered flip-flop, meaning that it
can only change the output value from 1 to 0 or vice versa when the clock is at a rising or positive edge,
i.e. at the beginning of a clock period.
When the clock is not at a positive edge, the stored value is held and the output does not change. The flip-flop
circuit is important because it can be used as a memory cell to store the state of a bit.
Output Q only takes on a new value if the value at D has changed at the point of a clock pulse.
This means that the clock pulse will freeze or ‘store’ the input value at D until the next clock pulse.
If D remains the same on the next clock pulse, the flip-flop will hold the same value.
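The behaviour described above can be simulated in a few lines of Python. This is a sketch only: the class name DFlipFlop, the tick method and the initial state of 0 are assumptions made for illustration, and the two signals are sampled at discrete time steps.

```python
class DFlipFlop:
    """Positive edge-triggered D-type flip-flop (initial state assumed to be 0)."""

    def __init__(self):
        self.q = 0
        self._last_clock = 0

    def tick(self, d, clock):
        """Apply the current D and clock levels and return the output Q."""
        if self._last_clock == 0 and clock == 1:  # rising edge detected
            self.q = d                            # capture the value at D
        self._last_clock = clock                  # otherwise Q is held
        return self.q


ff = DFlipFlop()
d_signal = [1, 1, 0, 0, 1, 1, 0, 0]
clock_signal = [0, 1, 1, 0, 0, 1, 0, 1]
outputs = [ff.tick(d, clk) for d, clk in zip(d_signal, clock_signal)]
print(outputs)  # [0, 1, 1, 1, 1, 1, 1, 0]
```

Q changes only at the steps where the clock goes from 0 to 1; changes in D between rising edges are ignored.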
[Figure: timing diagram showing the input D, the clock signal and the resulting output Q]
Q2: Show the output Q for the input D in the figure below.
[Figure: timing diagram showing the input D and the clock signal, with a blank trace for Q]
Exercises
1. A half-adder is used to find the sum of the addition of two binary digits.
(a) Complete the diagram below to construct a half adder circuit. [3]
[Figure: incomplete half adder diagram with input A and output S labelled]
(b) Complete the following truth table for a half adder’s outputs S and C.
A B S C
[2]
(c) How does a full adder differ from a half adder in terms of its inputs? [2]
2. An edge-triggered D-type flip-flop can be used as a memory cell to store the value of a single bit.
The following graph shows the clock cycle and the input signals applied to D.
(a) Label each rising edge on the diagram below. [1]
(b) Draw the flip-flop’s output Q on the graph. [4]
[Figure: timing diagram showing the input D and the clock signal, with a blank trace for Q]
Section 9
Legal, moral, ethical and cultural issues
In this section:
Chapter 44 Computing related legislation
CHAPTER 44 – COMPUTING RELATED LEGISLATION
Introduction
The rapidly changing field of computing and worldwide communications poses particular challenges
to legislators.
Countries have different laws, and it is sometimes hard to prove in which country an offence was
committed, and equally hard to trace the offender or to prosecute.
New applications in computing are constantly being invented and with them, new ways of committing
offences for which there is no legislation. Legislators have to balance the rights of the individual with
the need for security and protection from terrorist or criminal activity. Many countries, for example, have
enacted legislation restricting or banning the use of strong cryptography.
• the Data Protection Act (1998) which is designed to ensure that personal data is kept accurate,
up-to-date, safe and secure and not used in ways which would harm individuals
• the Computer Misuse Act, which makes it an offence to access or modify computer material
without permission
• The Regulation of Investigatory Powers Act 2000
Other laws such as the Copyright, Designs and Patents Act (1988) have a more general application,
covering the intellectual property rights of many types of work including books, music, art, computer
programs and other original works.
SECTION 9 – LEGAL, MORAL, ETHICAL AND CULTURAL ISSUES
Q1: How concerned are you about misuse of your personal data? Are you aware of how your
social profile may be used by future employers?
Q2: Describe some behaviours which would be illegal under this Act. Find some examples of the
application of the Computer Misuse Act (e.g. www.computerevidence.co.uk/Cases/CMA.htm)
If you buy a music CD or pay to download a piece of music, software or a video, it is illegal to
• The user must enter a unique key before the software is installed
• Some software will only run if the CD is present in the drive
• Some applications will only run if a special piece of hardware called a ‘dongle’ is plugged into a USB
port on the computer
However, although a piece of software such as an applications package, game or operating system is
protected, algorithms are not eligible for protection. If you come up with a much better sorting algorithm
than anyone else, for example, you cannot stop others from using it.
• enables certain public bodies to demand that an ISP provide access to a customer's communications
in secret
• enables mass surveillance of communications in transit
• enables certain public bodies to demand ISPs fit equipment to facilitate surveillance
• enables certain public bodies to demand that someone hand over keys to protected information
• allows certain public bodies to monitor people's Internet activities
• prevents the existence of interception warrants and any data collected with them from being revealed
in court
To be consistent with data protection laws, we're asking you to take a moment
to review key points of Google's Privacy Policy. This isn't about a change that
we've made - it's just a chance to review some key points.
• When you search for a restaurant on Google Maps or watch a video on YouTube, for
example, we process information about the activity - including information like the video
you watched, device IDs, IP addresses, cookie data and location.
• We also process the kind of information described above when you use apps or sites
that use Google services like ads, Analytics and the YouTube video player.
Why we process it
We process this data for the purposes described in our policy, including to:
• Help our services deliver more useful, customised content such as more relevant
search results;
• Deliver ads based on your interests, including things like searches you've done or
videos you've watched on YouTube;
• Conduct analytics and measurement to understand how our services are used.
Q3: Why do some people object to this data being collected and stored? What are the arguments
for and against organisations collecting such data?
As journalist Glenn Greenwald painstakingly sifted through the mountain of information provided by
Snowden, he was shocked at the extent of the American surveillance operation. It included the NSA’s
tapping of Internet servers, satellites, underwater fibre-optic cables, local and foreign telephone systems
and personal computers. A list of individuals targeted for particularly invasive forms of spying included
terrorist and criminal suspects, democratically elected leaders of many countries in Europe including
France and Germany, and ordinary American citizens.
The documents leaked by Snowden revealed that the literal aim of the US Government was to collect,
store, monitor and analyse metadata about all electronic communications by everybody in the world.
Exercises
1. Do you think Edward Snowden was right to reveal the secret documents to which he had
access, being legally forbidden to do so under the US Espionage Act 1917? Justify your answer. [4]
2. The FBI and NSA have been protesting about losing surveillance capabilities—through greater
encryption of the Internet—since the 1990s. In China, the manufacture, use, sale, import, or
export of any item containing encryption without prior government approval may lead to
administrative fines, the seizure of equipment, confiscation of illegal gains, and even
criminal prosecution.
Give arguments for and against a policy of making it illegal for individuals and organisations to
use strong encryption in their online communications. [4]
3. The Data Protection Act 1998 sets out eight principles for the protection of privacy in data
collection, handling and distribution. Name two of these principles and explain how each
serves to protect privacy. [4]
4. What Act provides intellectual property protection for software? What actions are illegal
under this Act? [3]
References:
Andrew Keen, “The Internet is not the Answer”, Atlantic Books, London, 2015
Glenn Greenwald, “No Place to Hide: Edward Snowden, the NSA and the Surveillance State”, McClelland
and Stewart, 2014
Luke Harding, “The Snowden files: The Inside Story of the World’s Most Wanted Man”, Vintage Books,
2014
Websites:
http://www.theguardian.com/world/2013/jun/06/nsa-phone-records-verizon-court-order
https://www.youtube.com/watch?v=5yB3n9fu-rM
http://www.nybooks.com/articles/archives/2013/nov/21/snowden-leaks-and-public/
CHAPTER 45 – ETHICAL, MORAL AND CULTURAL ISSUES
Amazon
Amazon started as an online bookstore in 1994 but soon diversified into DVDs, software, video games,
toys, furniture, clothes and thousands of other products. In 2013 the company turned over $75 billion
in sales, and it now accounts for 65% of all digital purchases of book sales. As a consequence of its
domination, in 2015 there were fewer than 1,000 independent bookstores in Britain, one third fewer than
in 2005. Where a bookshop employs 47 people for every $10 million in sales, Amazon employs 14 to
generate the same revenue.
eBay
eBay, essentially an electronic platform bringing together buyers and sellers of goods, grew from a user
base of 41,000 trading goods worth $7.2 million in 1995, to 162 million users trading goods worth
$227.9 billion in 2014.
Google
In 1996, Larry Page and Sergey Brin, two Stanford University Computer Science postgraduate students,
created Google. There were already several successful search engines like Yahoo and AltaVista on the
market, but Page and Brin came up with a game-changing algorithm, which they called PageRank, for
determining the relevance of a Web page based on the number and quality of its incoming links. The
idea was that you could estimate the importance of a Web page by the number and status of other
web pages that link to it. Every time you make a search, the Google search engine becomes more
knowledgeable and thus more useful. Even more valuable to Google is the fact that Google learns more
about you every time you search.
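The idea of ranking a page by the number and status of the pages linking to it can be illustrated with a small iterative calculation. The sketch below is a simplified textbook-style version and not Google's actual implementation. The four-page web, the damping value of 0.85 and the number of iterations are all assumptions made for the example.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page; links maps a page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1 / n for p in pages}          # start with equal ranks
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # a page shares its rank equally among the pages it links to
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # a page with no outgoing links spreads its rank over all pages
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web of four pages
web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = page_rank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this example page C, which has three incoming links, ends up with the highest rank, and page D, which nothing links to, ends up with the lowest.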
By 1998, Google was getting 10,000 queries every day. By 1999, they were getting 70 million daily
requests. Their next step was to figure out how to make money out of their free technology, and they
came up with AdWords, which enabled advertisers to place keyword-associated ads down the right
hand side of the page. The image below shows what comes up when a user in Dorchester searches
for “Paintball”, with the nearest companies, sponsored advertisements and a map with their locations
appearing on the right of the screen.
By 2014, Google had joined Amazon as a winner-takes-all company, with 1.5 billion daily searches and
revenues of $50 billion.
A 2013 paper by Carl Benedikt Frey and Michael Osborne entitled “The future of Employment: how
susceptible are jobs to computerisation?” estimates that 47% of total US employment is at risk. They
examine the impact of future computerisation on more than 700 individual occupations, and note the
shifting of labour from middle-income manufacturing jobs to low-income service jobs which are less
susceptible to computerisation. At the same time, with falling prices of computing, problem-solving
skills are becoming relatively productive, explaining the substantial employment growth in occupations
involving cognitive tasks where skilled, well-educated labour has a comparative advantage.
Thus there is a polarization of labour, with growing employment in high-income cognitive jobs and low-
income manual labour, and the disappearance of middle-income occupations. Driverless cars developed
by Google are an example of how computerisation is no longer confined to routine manufacturing tasks.
The possibility of drones delivering your parcels is no longer in the realms of science fiction. In the 10 jobs
that have a 99% likelihood of being replaced by software and automation within the next 25 years, the
authors include tax preparers, library assistants, clothing factory workers, and photographic
process workers.
In fact, jobs in the photographic industry have already all but vanished. In 1989, when Tim Berners-Lee
invented the World Wide Web, Kodak employed 145,000 people in research labs, offices and factories
in Rochester, US and had a market value of $31 billion. In 2012 the company filed for bankruptcy and
Rochester became virtually a ghost town.
Meanwhile, in 2010, a young entrepreneur called Kevin Systrom started up Instagram, which enabled
users to create photos on their smartphones with filters to give them, for example, a warm, fuzzy glow.
An Instagram moment
Twenty-five thousand iPhone users downloaded the app when it launched on 6th October 2010. A
month later, Systrom’s Instagram had a million members. By early 2012, it had 14 million users and by
November, 100 million users, with the app hosting 5 billion photos. But when Systrom sold Instagram
to Facebook for a billion dollars in 2012 (less than two years after the startup), Instagram still only had
thirteen full-time employees working out of a small office in San Francisco. It is a good example of a
service that provides hardly any jobs in the winner-takes-all economics of the digital marketplace.
Driverless cars
The prospect of large numbers of self-driving cars on our roads raises ethical questions about the
morality of automated decision making and different algorithms which could be used in the face of
causing “unavoidable harm” - who gets harmed and who gets spared(v).
[Figure: three scenarios labelled (a), (b) and (c)]
(a) The car can stay on course and kill several pedestrians, or swerve and kill one passer-by
(b) The car can stay on course and kill one pedestrian, or swerve and kill its passenger
(c) The car can stay on course and kill several pedestrians, or swerve and kill its passenger
The MIT Technology Review asked: “Should different decisions be made when children are on board,
since they both have a longer time ahead of them than adults, and had less say in being in the car in the
first place? If a manufacturer offers different versions of its moral algorithm, and a buyer knowingly chose
one of them, is the buyer to blame for the harmful consequences of the algorithm’s decisions?”
One of the commonly held principles that form the pillars of moral life is the
obligation not to inflict harm intentionally; in medical ethics, the physician’s guiding principle is “Do no
harm”. Going further, the moral duties of all scientists, including computer scientists, should also include
trying to promote the common good.
Artificial intelligence
As digital technologies are used in more and more areas of our lives, spreading into our offline
environments through the so-called ‘Internet of things’, previously inert objects are expected to become
networked and start making decisions for us. Algorithms will allow the refrigerator to decide what food
needs replacing, a door will decide who to let in. Should your door call the police if the door is opened
by someone without a tracking device? Should your house report a child who screams excessively to the
Social Services?
• Does a computer system mean that people can work from home and therefore drive less?
• Has computer technology led to a “throw-away society”, with huge waste dumps of unwanted
products which are thrown away rather than repaired or upgraded?
• Is working at home more environmentally friendly than everyone working in a big office, in terms of
heating and lighting?
• Do computer-managed engines work more efficiently, creating less pollution and using less fuel?
Exercises
1. Some of the jobs likely to disappear over the next decade owing to computerisation include
manufacturing jobs, clerical jobs and even service jobs, where people will be replaced by robots.
Give examples of other jobs which may be lost owing to computerisation. What will be the social
effects of the job losses? [7]
2. Decisions are often made about us on the basis of algorithms of which we may be completely
unaware. Car insurance premiums are calculated based largely on your age, experience, address,
occupation and vehicle details. Health insurance premiums are affected by age, occupation, personal
and parental medical histories. Are the algorithms that calculate these premiums fair? Discuss how
the algorithms used embed moral and/or cultural values. State with reasons who benefits from the
decisions made by these algorithms and whether anyone is harmed. [4]
References
(i) Naughton, John, The Guardian (2015, December 6) “Algorithm writers need a code of conduct”
www.theguardian.com/commentisfree/2015/dec/06/algorithm-writers-should-have-code-of-conduct
(ii) Centre of Internet & Human Rights (2015) “The ethics of Algorithms: from radical content to self-driving cars”.
Retrieved from www.gccs2015.com/sites/default/files/documents/Ethics_Algorithms-final%20doc.pdf
(iii) Kramer, Adam et al (2014, March 25) “Experimental evidence of massive-scale emotional contagion
through social networks”. Retrieved from http://www.pnas.org/content/111/24/8788.full
(iv) Rogaway, Phillip (2015, December 12) “The Moral Character of Cryptographic Work”. Retrieved from
http://web.cs.ucdavis.edu/~rogaway/papers/moral-fn.pdf
(v) Owano, Nancy (2015 October 24) “When self-driving cars drive the ethical questions”. Retrieved
from http://techxplore.com/news/2015-10-self-driving-cars-ethical.html
CHAPTER 46 – PRIVACY AND CENSORSHIP
Q1: Do you agree that there should be some form of censorship on the Internet?
Q2: Do you think that net firms should do more to halt trolls? Would this be an unwanted prevention
of free speech?
Q3: The media scholar Clay Shirky encapsulated the problem of managing discussion forums by
saying “Comment systems can be good, big, cheap – pick two”. What did he mean by this?
Monitoring behaviour
We are all used to our movements and behaviour being caught on camera, in town
and on the roads. CCTV cameras are used for security purposes, crime prevention
and detection. They are used to record drivers speeding, turning or parking illegally or
driving the wrong way up a one-way street.
Employers may monitor employee behaviour on the Internet, recording what sites are
visited during working hours and how much time is spent on them.
And, of course, you can use wearable technology to monitor your own behaviour – how many
steps you have taken during the day, your heart rate during a run, the time you took to swim 100 metres.
Layout
Most websites are designed based on the US layout containing a linear structure of information with
multiple blocks of text that a western reader is likely to skim over. With Japanese websites, for example,
the preference is to include less information per page which, as a whole, is easier to absorb without fear
of missing something. In the West, where text is read from left to right, menus are commonly placed on
the left. In other countries, where Arabic script, for example, is read from right to left, menus and other
page features might more logically appear mirrored in comparison with western versions of the same page.
Maps are a good example of the use of cultural or nationalistic bias reflected in layout. A world map is
frequently shown with the country where it was created appearing in the centre.
Colour paradigms
Around the world, the way that different cultures see and describe colours varies dramatically. In general,
blue is considered the safest colour choice around the world, since it has many positive associations.
In North America and Europe, blue represents trust, security, and authority, and is considered to be
soothing and peaceful. However, it can also represent depression, loneliness, and sadness (hence having
“the blues”).
In Western cultures, green represents luck, nature, freshness, spring, environmental awareness, wealth,
inexperience, and jealousy (the “green-eyed monster”). In Indonesia, green has traditionally been
forbidden, whereas in Mexico, it’s a national colour that stands for independence. In the Middle East,
green represents fertility, luck, and wealth, and it’s considered the traditional colour of Islam. In Eastern
cultures, green symbolizes youth, fertility, and new life, but it can also mean infidelity. In fact, in China,
green hats for men are taboo because it signals that their wives have committed adultery!
In Western cultures, orange represents autumn, harvest, warmth, sunshine. In Hinduism, saffron (a
soft orange colour) is considered auspicious and sacred. In Eastern cultures, orange symbolizes love,
happiness, humility, and good health.
Look up http://www.shutterstock.com/blog/the-spectrum-of-symbolism-color-meanings-around-the-
world to see the symbolism of other colours in countries around the world.
Character sets
A character set is the mapping of a collection of characters to specific bit sequences or codes. The
collection can increase in number dependent on the maximum number of bits allocated to each
character. ASCII uses only seven bits, allowing for 128 characters, whereas Unicode (UTF-16) has been
developed to represent over a million characters, including those of most languages and symbols used in
mathematical, scientific and musical notation. Unicode and ASCII are covered in more detail in Chapter 29.
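The mapping between characters and codes can be explored directly in Python, whose built-in ord, chr and str.encode functions expose it. The helper fits_in_ascii below is a name invented for this sketch.

```python
def fits_in_ascii(text):
    """Return True if every character in text has a 7-bit ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# ord() gives the code of a character; chr() reverses the mapping
print(ord("A"))   # 65
print(chr(97))    # a

print(fits_in_ascii("Hello"))   # True
print(fits_in_ascii("café"))    # False: é is outside the 128 ASCII codes

# In UTF-16, most characters take 2 bytes; some need a pair of 2-byte units
print("é".encode("utf-16-be").hex())            # 00e9
print("\U0001F600".encode("utf-16-be").hex())   # d83dde00
```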
Exercises
1. Networking sites frequently feature angry, violent or inaccurate content.
Should Facebook, Twitter, Ask.com and others take responsibility for content posted on their
sites? What sort of content should be allowed? Would it be possible to develop software to
facilitate such a task? Discuss. [5]
2. “Honest and law-abiding citizens have nothing to fear from the distribution of their personal data.”
Do you agree with this statement? Give reasons for your view and a reason why someone else
might not agree with you. [5]
3. A University is debating whether to offer a course on writing malware such as viruses, worms
and Trojan horses. Discuss the ethical issues involved in this decision, and whether or not you
think they should run the course. [5]
References
Andrew Keen, “The Cult of the Amateur”, Nicholas Brealey Publishing, 2007, 2008
Andrew Keen, “The Internet is not the Answer”, Atlantic Books, London, 2015
Carl Benedikt Frey and Michael Osborne, “The Future of employment: How susceptible are jobs to
computerisation?” http://www.oxfordmartin.ox.ac.uk/downloads/academic/The_Future_of_Employment.pdf
Section 10
Computational thinking
In this section:
Chapter 47 Thinking abstractly
SECTION 10 – COMPUTATIONAL THINKING
Computational thinking
What is computational thinking? It is not about following an algorithm in one’s head to carry out a
mathematical task like adding ten numbers. Rather, it is about thinking how a problem can be solved.
This involves two basic steps:
• Formulate the problem as a computational problem – in other words, state it in such a way that it is
potentially solvable using an algorithm
• Try to construct an algorithm to solve the problem
A computational thinker will not be satisfied with any old algorithm, though; it must be a ‘good’ solution –
that is, a correct and efficient solution. A programmer needs to be able to show that a solution is correct
and efficient by using logical reasoning, test data and user feedback.
Clearly, then, computational thinking is a vital skill for a programmer, and in fact it is not possible to be
a programmer without it. It includes the ability to think logically and to apply the tools and techniques of
computing to thinking about, understanding, formulating and solving problems.
Computing has been called the automation of abstractions, so let’s move on to talk about abstraction.
Abstraction
Representational abstraction can be defined as a representation arrived at by removing
unnecessary details.
Here are some examples of abstraction.
• Any computer model, say of the environment, a new car or a flight simulator, is an abstraction.
• If you are planning to write a program for a game involving a bouncing ball, you will need to decide
what properties of the ball to take into account. If it’s bouncing vertically rather than, say, on a
snooker table, gravity needs to be taken into account. How elastic is the ball? How far and in what
direction will it bounce when it hits an edge? What you are required to do is build an abstract model
of a real-world situation, which you can simplify; remembering, however, that the more you simplify,
the less likely it becomes that the model will mimic reality.
• A builder who is planning to build 100 houses on a new estate may use a physical model of the new
estate, or in the first instance, a plan on paper or on a computer screen. In either case the model
will be greatly simplified. All the houses may appear identical in the model. They may lack windows,
doors or chimneys. All the trees in the model may be of identical size, colour and shape.
• The map of the London Underground is a simple model of the actual geography of the Tube stations.
The map tells you what line each station is on and which other lines each station is connected to. It is
very useful for a person travelling around London, but of very little use to an engineer who is planning
where to dig tunnels for a proposed new line.
All of these models contain different types of abstraction which are used in programming. In
programming, abstraction is concerned with the distinction between what a program unit does and how
it does it.
Abstraction by generalisation
There is a famous problem dating back more than 200 years to the old Prussian city of Königsberg.
This beautiful city had seven bridges, and the inhabitants liked to stroll around the city on a Sunday
afternoon, making sure to cross every bridge at least once. Nobody could figure out how to cross each
bridge once and once only, or alternatively prove that this was impossible, and eventually the Mayor
turned to the local mathematical genius Leonhard Euler.
Euler’s first step was to remove all irrelevant details from the map, and come up with an abstraction:
[Figure: simplified map of Königsberg showing the North bank, the South bank, the West island and the East island]
To really simplify it, Euler represented each piece of land as a circle and each bridge as a line between
them.
[Figure: graph with a circle for each piece of land, including the North bank and South bank, and a line for each bridge]
What he now had was a graph, with nodes representing land masses and edges (lines connecting the
nodes) representing the bridges. Now that Euler had his graph, how could he solve the problem?
He did not want to try every possible solution; he realised that this was just a particular instance of
a more general problem and he wanted to find a solution that was applicable to similar problems.
He noticed a critical feature of the puzzle: since each bridge could be crossed only once, each node had
to have an even number of connections, because you must enter and leave a node by a different edge.
The only exceptions are the start and end node, since you don’t have to enter a start node or leave
the end node.
All the nodes in this graph have an odd number of edges, so it is therefore impossible!
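Euler's argument can be checked by counting the edges at each node. The sketch below assumes the usual historical layout of the seven bridges, since the original figure is not reproduced here, and it assumes the graph is connected. The function names are illustrative.

```python
from collections import Counter

# Each pair is one bridge between two land masses (assumed historical layout)
bridges = [
    ("North bank", "West island"),
    ("North bank", "West island"),
    ("South bank", "West island"),
    ("South bank", "West island"),
    ("North bank", "East island"),
    ("South bank", "East island"),
    ("West island", "East island"),
]


def odd_nodes(edges):
    """Return the nodes that have an odd number of edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return [node for node, d in degree.items() if d % 2 == 1]


def walk_possible(edges):
    """A walk using every edge exactly once needs 0 or 2 odd nodes.

    Assumes the graph is connected.
    """
    return len(odd_nodes(edges)) in (0, 2)


print(len(odd_nodes(bridges)))   # 4
print(walk_possible(bridges))    # False
```

The same two functions work for any city and any number of bridges, which is the point of the generalisation.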
Euler had laid the foundation of graph theory, which you met in Chapter 38, with more in Chapter 63.
By abstracting the problem, Euler made possible the solution of innumerable related problems. Not only
does it apply to different cities with different numbers of bridges, it applies to many other problems with
similar requirements.
Abstraction by generalisation, as illustrated above, is a grouping by common characteristics to arrive at a
hierarchical relationship of the “is a kind of” type. Thus Euler’s problem is a particular instance of
graph theory.
This type of abstraction is very common in object-oriented programming. A class of object, say an
Animal, will be defined with its own attributes such as gender and whether it is carnivore or vegetarian,
and its own behaviours, methods or procedures such as move, sleep, eat, etc. Other objects such
as Dog, Cat, Mouse and so on may be defined as subclasses of Animal - they all share common
characteristics which are defined in the Animal class, but have their own attributes and behaviours as
well. In other words Dog “is a kind of” Animal, as are Cat and Mouse.
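The “is a kind of” relationship maps directly onto inheritance. The following Python sketch uses the Animal, Dog and Mouse classes named in the text. The particular attributes, method bodies and the extra bark method are assumptions made for illustration.

```python
class Animal:
    """General class holding what all animals have in common."""

    def __init__(self, name, gender, diet):
        self.name = name
        self.gender = gender
        self.diet = diet  # e.g. "carnivore" or "vegetarian"

    def move(self):
        return f"{self.name} moves"

    def eat(self):
        return f"{self.name} eats ({self.diet})"

    def sleep(self):
        return f"{self.name} sleeps"


class Dog(Animal):
    """A Dog is a kind of Animal with its own extra behaviour."""

    def __init__(self, name, gender):
        super().__init__(name, gender, diet="carnivore")

    def bark(self):
        return f"{self.name} barks"


class Mouse(Animal):
    """A Mouse is a kind of Animal that overrides how it moves."""

    def __init__(self, name, gender):
        super().__init__(name, gender, diet="vegetarian")

    def move(self):
        return f"{self.name} scurries"


rex = Dog("Rex", "male")
print(rex.eat())                 # Rex eats (carnivore)
print(rex.bark())                # Rex barks
print(isinstance(rex, Animal))   # True
```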
Q1: Use abstraction by generalisation to continue the sequence 1,4,9,16... What is the
50th number in the sequence?
Data abstraction
A similar idea is that of data abstraction.
The details of how data are actually represented are hidden. For example, when you use integers or real
numbers in a program, you are not interested in how these numbers are actually represented in
the computer.
In a higher level language, it is possible to create abstract data types such as queues, stacks and
trees. The abstract data type, for example a queue, is a logical description of how the data is viewed
and the operations that can be performed on it. For example, elements can be added to the rear of the
queue and removed from the front. The queue may have a maximum size that cannot be exceeded.
The programmer using this data structure, however, is concerned only with the operations such as
AddToQueue or RemoveFromQueue and does not need to know how the data structure is implemented
using, for example, an array and pointers to the front and rear of the queue.
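One possible implementation behind the queue operations is sketched below in Python. It uses a fixed-size list with front and rear pointers, as the text suggests. Letting the pointers wrap around (a circular queue) and the choice of exceptions are assumptions of this sketch. The methods add_to_queue and remove_from_queue correspond to AddToQueue and RemoveFromQueue.

```python
class Queue:
    """Fixed-size queue stored in a list, with front and rear pointers."""

    def __init__(self, max_size):
        self._items = [None] * max_size
        self._max_size = max_size
        self._front = 0    # index of the next item to remove
        self._rear = -1    # index of the most recently added item
        self._size = 0

    def is_empty(self):
        return self._size == 0

    def is_full(self):
        return self._size == self._max_size

    def add_to_queue(self, item):
        if self.is_full():
            raise OverflowError("queue is full")
        self._rear = (self._rear + 1) % self._max_size  # wrap around
        self._items[self._rear] = item
        self._size += 1

    def remove_from_queue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self._items[self._front]
        self._front = (self._front + 1) % self._max_size  # wrap around
        self._size -= 1
        return item


q = Queue(3)
q.add_to_queue("A")
q.add_to_queue("B")
print(q.remove_from_queue())  # A
```

Code that uses the queue calls only the public methods, so the list and pointers could later be replaced by a different representation without changing that code.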
Exercises
1. “Representational abstraction is a representation arrived at by removing unnecessary details.”
Describe what this means in relation to a computer program which allows the user to enter a
starting address A and a destination address B and returns a map of the route, the number of
miles and the estimated journey time it will take to travel by car from A to B. [5]
2. Explain how abstraction could be used in a game program in which the player has to collect
treasure in a cave and avoid being eaten by a monster. [5]
CHAPTER 48 – THINKING AHEAD
Computational problems
At its most abstract level, a computational problem can be represented by a simple diagram:
Input is the information relevant to the problem, which could for example be passed as parameters to a
subroutine.
Output is the solution to the problem, which could be passed back from a subroutine.
A clear statement of exactly what the inputs and outputs of a problem are is a necessary first step in
constructing a solution.
Example 1: Determine whether a given item is present in a list
On the face of it, this is a simple problem. But do we know exactly what the inputs are? For example, is
the list sorted? Are the items numeric or alphabetic? What about the output – are we expecting it to be
simply True or False, or should the output give the position in the list of the item if it is found?
The problem needs to be formally defined, stating the inputs and outputs. This can be done as follows:
Name: SearchList
Inputs: A list of strings S = (s1, s2, s3 …sn)
A target string t
Outputs: A Boolean variable b
Now we can write pseudocode for the function SearchList:
function SearchList(s, t)
    found = False
    n = 0
    while found == False AND n < len(s)
        if t == s[n] then
            found = True
        else
            n = n + 1
        endif
    endwhile
    return found
endfunction
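The SearchList specification translates almost line for line into a real language. The following is a minimal Python sketch, assuming zero-based list indexing; the name search_list and the sample data are illustrative and not part of the specification.

```python
def search_list(s, t):
    """Return True if the target string t is present in the list s, else False."""
    found = False
    n = 0
    # Stop as soon as the item is found or the end of the list is reached
    while not found and n < len(s):
        if t == s[n]:
            found = True
        else:
            n = n + 1
    return found


towns = ["Ipswich", "Woodbridge", "Stowmarket"]   # illustrative data
print(search_list(towns, "Woodbridge"))
print(search_list(towns, "Norwich"))
print(search_list([], "Ipswich"))
```

Because the output is defined as a single Boolean, the caller learns whether the item is present but not where it is; returning a position would need a different specification.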
Q1: Write a pseudocode algorithm which initialises a list of string items, asks the user to enter an
item to search for, calls the above function SearchList and prints an appropriate message
depending on whether the function returns True or False.
Specifying preconditions
Suppose that a pseudocode algorithm has been written to find the maximum of a list of numbers.
function maxInt(listInt)
    maxNumber = listInt[0]
    for i = 1 to len(listInt) - 1
        if listInt[i] > maxNumber then
            maxNumber = listInt[i]
        endif
    next i
    return maxNumber
endfunction
If the function is called with an empty list, it will crash on the statement
maxNumber = listInt[0]
In order to make sure the function never crashes, either the function must test for an empty list, or a
precondition must be specified with the documentation for the function.
Name: maxInt
Inputs: A list of integers listInt = (k1, k2, k3 … kn)
Outputs: An integer maxInt
Precondition: length of listInt > 0
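The first option, testing for an empty list inside the function, might look like the Python sketch below. The choice of raising a ValueError is an assumption made for illustration; the precondition is also recorded in the documentation string so that a caller can see it.

```python
def max_int(list_int):
    """Return the largest integer in list_int.

    Precondition: len(list_int) > 0
    """
    if len(list_int) == 0:
        # Precondition violated: fail with a clear message instead of an index error
        raise ValueError("max_int requires a non-empty list")
    max_number = list_int[0]
    for i in range(1, len(list_int)):
        if list_int[i] > max_number:
            max_number = list_int[i]
    return max_number


print(max_int([3, 17, 5]))
```

If the precondition is documented and the caller is trusted to respect it, the test at the top can be omitted, which saves a little work on every call.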
Q2: Specify the input, output and any preconditions for a function sqrt(n) which finds the square
root of an integer or floating point number.
In a large project, programmers may create their own libraries of reusable components. If, for example,
abstract data structures such as queues, stacks or trees are used, routines to traverse, add to or delete
from these data structures may be required in many different modules making up the whole project.
Clearly, having components which have already been written, debugged and thoroughly tested will save
time in completing the project.
A-Level only
Exercises
1. Explain the benefits of specifying inputs, outputs and preconditions in the documentation for
a subroutine which will be saved in a library of subroutines for importing into many programs. [6]
2. Give two examples of reusable program components in a programming language with which
you are familiar. [2]
A-Level only
3. Explain what is meant by caching and give an example of when it is used in a computer system. [2]
Procedural abstraction
Computer science is, in broad terms, the study of problem-solving, and as such is also the study of
abstraction. As we have seen, abstraction allows us to separate the physical reality of a problem from
the logical view. Thus, for example, you can send an email, play music or download an image without
knowing any of the detail of how these things are actually done. On the other hand, the computer
engineers, technicians and system administrators who enable these things to happen have a very
different view. They need to be able to control the low-level details that users are not even aware of.
Procedural abstraction means using a procedure to carry out a sequence of steps for achieving some
task such as calculating a student’s grade from her marks in three exam papers, buying groceries online
or drawing a house on a computer screen.
Consider, for example, how you could code a program to create the plan for an estate of 100 new
houses. You could use a procedure which will draw a triangle of certain dimensions and colour.
The colour and dimensions are passed as arguments to the procedure, for example:
procedure drawTriangle(colour, base, height)
The programmer does not need to know the details of how this procedure works. She simply needs to
know how the procedure is called and what arguments are required, what data type each one is and
what order they must be written in. This is called the procedure interface.
Similarly, there may be a procedure to draw a rectangle that is defined by parameters colour, height and
width, which are passed as arguments:
drawRectangle ("beige", 4.0, 5.0)
To draw a house at a given position on the screen, the programmer may write a procedure buildHouse()
which uses the drawTriangle() and drawRectangle() procedures, aligns them and positions the house at a
particular position on the screen. All these variables will be passed as arguments to the procedure.
Several houses could be combined to make a street. Several streets could be drawn to represent
the estate.
Then, if the builder of the new estate decides to make all the houses larger, the procedure for drawing the
house does not need to be changed - it is simply called with new arguments.
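The layering of procedures described above can be sketched in Python. This is an illustration only: the drawing procedures here simply return a description of the shape, standing in for real graphics code, and the parameter lists of build_house and draw_street are assumptions.

```python
def draw_triangle(colour, base, height):
    """Return a description of a triangle (a stand-in for real drawing code)."""
    return ("triangle", colour, base, height)


def draw_rectangle(colour, height, width):
    """Return a description of a rectangle."""
    return ("rectangle", colour, height, width)


def build_house(x, y, width, wall_height, roof_height, wall_colour, roof_colour):
    """Combine a rectangle (walls) and a triangle (roof) at position (x, y)."""
    walls = draw_rectangle(wall_colour, wall_height, width)
    roof = draw_triangle(roof_colour, width, roof_height)
    return {"position": (x, y), "parts": [walls, roof]}


def draw_street(number_of_houses, house_width, gap):
    """Place houses side by side; every house is built by the same procedure."""
    houses = []
    for i in range(number_of_houses):
        x = i * (house_width + gap)
        houses.append(build_house(x, 0, house_width, 4.0, 2.0, "beige", "red"))
    return houses


street = draw_street(3, 5.0, 1.0)
for house in street:
    print(house["position"], house["parts"])
```

Making every house larger means changing only the arguments passed by draw_street; build_house and the two shape procedures are untouched.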
CHAPTER 49 – THINKING PROCEDURALLY
[Figure: levels of procedural abstraction. The procedure for drawing a street calls the procedure for drawing a house, which calls the procedures for drawing a rectangle and a triangle.]
Problem decomposition
Most computational problems beyond the trivial need to be broken down into sub-problems before they
can be solved. Think of any system which starts off by presenting the user with a menu of choices. Each
choice will result in a different, self-contained module.
Q1: Give some other advantages of writing a program as a collection of independent modules.
Hierarchy charts
A hierarchy chart is a tool for representing the structure of a program, showing how the modules relate to
each other to form the complete solution. The chart is depicted as an upside-down tree structure, with
modules being broken down further into smaller modules until each module is only a few lines of code
(never more than a page).
Example 1
Draw a hierarchy chart for a program which calculates and prints a customer’s monthly gas bill.
This can be broken down into several steps.
[Figure: first-level hierarchy chart with “Calculate gas bill” as the top-level module]
‘Calculate units used’ and ‘Calculate total bill’ may now be further broken down.
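[Figure: expanded hierarchy chart with “Calculate gas bill” as the top-level module. The lower-level modules include “Get previous reading”, “Calculate units used”, “Calculate cost of this month’s units used plus standing charge” and “Add any outstanding amount owing”.]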
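Each box in a hierarchy chart typically becomes one subroutine. The Python sketch below shows one possible mapping for the gas bill example. The unit price, the standing charge and the decision to pass the meter readings in as parameters are all assumptions made for illustration.

```python
UNIT_PRICE = 0.05        # hypothetical price per unit, in pounds
STANDING_CHARGE = 8.00   # hypothetical monthly standing charge, in pounds


def calculate_units_used(previous_reading, current_reading):
    return current_reading - previous_reading


def calculate_cost_of_units(units_used):
    """Cost of this month's units used plus the standing charge."""
    return units_used * UNIT_PRICE + STANDING_CHARGE


def calculate_total_bill(units_used, amount_owing):
    """Add any outstanding amount owing to this month's cost."""
    return calculate_cost_of_units(units_used) + amount_owing


def calculate_gas_bill(previous_reading, current_reading, amount_owing):
    """Top-level module: calls the lower-level modules in turn."""
    units = calculate_units_used(previous_reading, current_reading)
    return calculate_total_bill(units, amount_owing)


print(round(calculate_gas_bill(1200, 1500, 10.00), 2))
```

Each function is only a line or two long, which is the point at which decomposition can stop.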
Q2: Draw a hierarchy chart for a program which asks the user which times table they would like to
be tested on, and then displays five questions, getting the user’s answer each time and telling
them whether they were right or wrong. If they are wrong, the correct answer is displayed.
Exercises
1. Using local rather than global variables in subroutines is one way of helping to make a program
easy to maintain.
(b) Describe briefly three other ways in which a program can be made easy to understand
and maintain. [6]
2. Draw a hierarchy chart for a quiz program which does the following:
• asks the user 10 random multiple-choice questions from a bank of 100 questions held in a file
• if the user gives the correct answer, gives feedback and adds 1 to the user’s score
• if they give the wrong answer, gives feedback and displays the correct answer
• at the end of the questions, gives the score out of 10 [6]
Chapter 50 – Thinking logically, thinking concurrently
Objectives
• Identify the points where a decision has to be taken
• Determine the logical conditions that affect the outcome of a decision
• Determine how decisions affect flow through a program
• Determine which parts of a program can be tackled at the same time (A-Level only)
Example 1
Consider the following algorithm. It is intended to print out the number of values between a lower and
upper bound entered by the user, that are divisible by either 3, 5 or both.
count = 0
first = input("Please enter lower bound: ")
last = input("Please enter upper bound: ")
n = first
while n <= last
    if n mod 5 == 0 then
        count = count + 1
    endif
    if n mod 3 == 0 then
        count = count + 1
    endif
    n = n + 1
endwhile
print("Values divisible by 3 or 5: ", count)
Q1: Suppose the user enters a lower bound of 0 and an upper bound of 15. What answer
would you expect?
What will be output by the program?
There are two problems with this algorithm. The first is that it counts the value 0 as divisible by both 3
and 5, whereas the user would probably not intend 0 to be included. We have not specified that the user
should enter positive integers, and this should be specified as a pre-condition to the routine.
The second problem is that any number divisible by both 3 and 5 will be counted twice. This is a logic
error which needs to be corrected.
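The double counting can be seen by running the two separate tests alongside a single combined test. This is a Python sketch for illustration; the function names are hypothetical, and the bounds are assumed to be positive integers.

```python
def count_as_printed(first, last):
    """Two separate tests: a multiple of both 3 and 5 is counted twice."""
    count = 0
    for n in range(first, last + 1):
        if n % 5 == 0:
            count = count + 1
        if n % 3 == 0:
            count = count + 1
    return count


def count_divisible(first, last):
    """One combined condition, so each value is counted at most once."""
    count = 0
    for n in range(first, last + 1):
        if n % 3 == 0 or n % 5 == 0:
            count = count + 1
    return count


print(count_as_printed(1, 15), count_divisible(1, 15))
```

The two results differ by exactly the number of multiples of 15 in the range.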
Q2: Suggest amendments to the algorithm so that it works correctly for any two positive
integers entered by the user.
Example 2
Competitors playing in a chess tournament are awarded 2 points for a win, 1 point for a draw and 0
points for a loss. Each player plays 12 games.
The results for a player are held in an array of characters, with “W” representing a win, “D” representing a
draw and “L” representing a loss.
Write a pseudocode algorithm for a function which returns the points score of a player. Show how the
function would be called and the result output.
function calculatePoints(score)
    points = 0
    for n = 0 to len(score) - 1
        if score[n] == "W" then
            points = points + 2
        else
            points = points + 1
        endif
    next n
    return points
endfunction
// main program
myscore = ["W", "W", "D", "W", "W", "W", "W", "L", "D", "D", "W", "L"]
result = calculatePoints(myscore)
print("Points scored: ", result)
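Note that an else branch which adds 1 point for every result other than a win also awards a point for a loss. The Python sketch below is one way of following the scoring rules as stated (2 for a win, 1 for a draw, 0 for a loss) by testing for a draw explicitly; the function name is illustrative.

```python
def calculate_points(score):
    """Return total points: 2 for each "W", 1 for each "D", 0 for each "L"."""
    points = 0
    for result in score:
        if result == "W":
            points = points + 2
        elif result == "D":
            points = points + 1
        # an "L" adds nothing to the total
    return points


my_score = ["W", "W", "D", "W", "W", "W", "W", "L", "D", "D", "W", "L"]
print("Points scored:", calculate_points(my_score))
```

With seven wins, three draws and two losses the total is 7 x 2 + 3 x 1 = 17 points.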
A second algorithm is written to provide the administrator of the tournament with further information
about the players’ performance. The array names holds the name of each player in the tournament, and
the array scores holds the corresponding points score for each player.
In the above algorithm, the function playerStats returns a list of tuples called lowerCount. Each
tuple consists of a player’s name and an integer count:
((names[0], count[0]), (names[1], count[1]), … , (names[10], count[10]))
Q4: What are the second and third lines output? What is the function playerStats calculating?
A-Level only
Thinking concurrently
The distinction between concurrent computing and parallel computing is debatable, and the two terms are often taken to
mean the same thing. For example, a house may have a burglar alarm system which continually monitors
the front door, back door, windows, rooms upstairs and downstairs.
Generally, concurrent computing is defined as being related to but distinct from parallel computing.
Parallel computing requires multiple processors each executing different instructions simultaneously,
with the goal of speeding up computations. It is impossible on a single processor.
Concurrent processing, on the other hand, takes place when several processes are running, with
each in turn being given a slice of processor time. This gives the appearance that several tasks are
being performed simultaneously, even though only one processor is being used. Processor scheduling
algorithms are covered in Section 2, Chapter 7.
• Increased program throughput – the number of tasks completed in a given time is increased
• Time that would be wasted by the processor waiting for the user to input data or look at output is
used on another task
• The drawback is that if a large number of users are all trying to run programs, and some of these
involve a lot of computation, these programs will take longer to complete
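The idea of giving each process a slice of processor time in turn can be imitated on a small scale. The Python sketch below is a toy model, not a real scheduler: each "process" is a generator, and every yield is assumed to mark the end of one time slice.

```python
from collections import deque


def process(name, steps):
    """A toy process: each yield marks the end of one time slice."""
    for step in range(1, steps + 1):
        yield f"{name} step {step}"


def round_robin(processes):
    """Give each unfinished process one time slice in turn."""
    ready_queue = deque(processes)
    trace = []
    while ready_queue:
        current = ready_queue.popleft()
        try:
            trace.append(next(current))   # run for one slice
            ready_queue.append(current)   # not finished: rejoin the queue
        except StopIteration:
            pass                          # finished: leave the queue
    return trace


trace = round_robin([process("A", 3), process("B", 1), process("C", 2)])
for line in trace:
    print(line)
```

The printed trace interleaves the three processes, which is what gives the appearance of simultaneous execution even though only one step runs at a time.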
Exercises
1. A plumber charges for parts and labour. Labour is charged at £20 per half hour or part of a half hour.
The time spent is recorded as a four-digit integer, so that for example
A variable called duration holds the four-digit integer representing time spent.
(a) Write a subroutine to calculate and return the labour charge. [4]
(c) Show how the subroutine will be called using a parameter. [2]
2. In a vote for which of three plays produced at a theatre was most enjoyable, the total votes cast
for each of plays “A”, “B” and “C” have been stored in an array totalVotes.
The following algorithm has been written to output the play with the most votes.
04 endif
05 else
06 if totalVotes[1] > totalVotes[2] then
07 print ("Play B")
08 else
09 print ("Play C")
10 endif
11 endif
(a) In the event that an equal number of votes is cast for each play,
(b) Write an algorithm so that the result is always printed correctly in the event of two or three
plays all receiving the same number of votes. [8]
A-Level only
(b) A school runs a local area network linking computers throughout the school. Describe how
concurrent processing can be achieved on the network. [2]
(c) When a class of students all try and download a piece of software at the beginning of a class,
performance is affected. Explain why. [2]
CHAPTER 51 – PROBLEM RECOGNITION
A-Level only
Computable problems
A problem is defined as being computable if there is an algorithm that can solve every instance of it in a
finite number of steps. Some problems may be theoretically computable, but if they take millions of years
to solve, they are, in a practical sense, insoluble.
An example of such a problem is the cracking of a secure password. If you choose a password of 10
characters or more, comprising a mixture of random letters, numbers and special symbols, it will be
practically impossible to crack by trying every combination. You can test the strength of your passwords on various websites.
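A rough calculation shows why. The Python sketch below assumes that the password is drawn from the 95 printable ASCII characters and that an attacker can test one billion passwords per second; both figures are assumptions chosen for illustration.

```python
ALPHABET_SIZE = 95             # printable ASCII characters (assumed character set)
PASSWORD_LENGTH = 10
GUESSES_PER_SECOND = 10 ** 9   # assumed attacker speed
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

combinations = ALPHABET_SIZE ** PASSWORD_LENGTH
years = combinations / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"{combinations:,} possible passwords")
print(f"about {years:,.0f} years to try them all")
```

Each extra character multiplies the number of possibilities, and therefore the time needed, by 95.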
Enumeration
Theoretically, many problems and algorithmic puzzles can be solved by exhaustive search – trying
all possible solutions until the correct one is found. Thousands of problems which were in the past
insoluble have, thanks to the power of modern computers, become soluble. For example, a database of
fingerprints or DNA can within a reasonable time find the identity of an individual, if his or her fingerprints
or DNA are on the database.
The most important limitation of the exhaustive search strategy is its inefficiency – in general, the number
of possible solutions increases exponentially as the size of the problem increases.
A-Level only
Consider, for example, the problem of constructing a magic square of order 3. The problem can be
stated as follows:
Fill the 3 x 3 square with the integers 1 to 9 in such a way that the sum of each row, column
and corner-to-corner diagonal is the same.
? ? ?
? ? ?
? ? ?
How many possibilities are there? There is a choice of 9 numbers for the first square, 8 for the second,
and so on, giving 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 362,880 ways of arranging the 9 numbers. This is 9!
(spoken “9 factorial”).
A magic square of 5 rows and columns has 25! possible arrangements, and it would take a computer making 10 trillion
operations per second about 49,000 years to try all the options.
There are in fact algorithms which will find solutions for magic squares of any size. This is the theoretical
approach, which will generally find results considerably faster than a “brute force” method of solution.
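For the order 3 square, exhaustive search is entirely practical. The Python sketch below tries every arrangement of the integers 1 to 9 and keeps those that are magic. It relies on the fact that the nine numbers total 45, so each of the three rows must sum to 15. The final lines repeat the estimate for a 5 x 5 square, assuming one arrangement is tested per operation.

```python
from itertools import permutations
from math import factorial


def is_magic(sq):
    """sq is a tuple of 9 numbers read row by row."""
    lines = [
        sq[0:3], sq[3:6], sq[6:9],                      # rows
        sq[0::3], sq[1::3], sq[2::3],                   # columns
        (sq[0], sq[4], sq[8]), (sq[2], sq[4], sq[6]),   # diagonals
    ]
    return all(sum(line) == 15 for line in lines)


tried = 0
solutions = []
for candidate in permutations(range(1, 10)):
    tried += 1
    if is_magic(candidate):
        solutions.append(candidate)

print(tried, "arrangements tried,", len(solutions), "magic squares found")

# Time to enumerate every arrangement of a 5 x 5 square at 10 trillion per second
seconds = factorial(25) / 10 ** 13
years_needed = seconds / (60 * 60 * 24 * 365)
print(round(years_needed), "years")
```

The solutions found are all rotations or reflections of one another.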
Simulation
Simulation is the process of designing a model of a real system in order to understand the behaviour of
the system, and to evaluate various strategies for its operation. Such problems include:
• financial risk analysis
• population predictions
• queueing problems
• climate change predictions
• engineering design problems
Simulating a system invariably makes use of abstraction to reduce the problem to its essentials,
removing all unnecessary details. Queueing problems, for example, include problems of finding out how
many checkouts are needed in a new supermarket or on a new toll road, or how many staff are needed in
a software support department to man the helplines, or in a tax office to process tax returns.
Q2: How would abstraction be applied to this type of problem? What factors would be relevant, and
what would be irrelevant?
Simulation can also involve building a physical model of, for example, a spacecraft, ship or wind turbine,
so that its behaviour can be studied. This is obviously useful when it would be too expensive, dangerous
or impractical to carry out tests on the real thing. A model can be used to evaluate performance or test
outcomes.
A-Level only
Problem abstraction
Problem abstraction involves removing details until the problem is represented in a way that it is possible
to solve because it reduces to one that has already been solved.
Consider the following problem: There are four knights on a 3x3 chessboard: the two white knights
are at the bottom two corners, and the two black knights are at the two upper corners. The goal is to
switch the knights in the minimum number of moves so that the white knights are in the upper corners
and the black knights are in the bottom corners. (A knight can only move in the following manner: one
or two squares horizontally or vertically, followed by two squares or one square at right angles, moving 3
squares in total.)
We can abstract this problem by first numbering the squares of the chessboard 1 to 9. Now we can
draw lines from 1 to 6 and 1 to 8 representing the two possible moves from square 1. Do the same for
each square in turn, and you end up with the graph shown in (b). (Square 5 can’t be reached with a
knight’s move so it is omitted from this graph.)
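[Figure: (a) the chessboard with its squares numbered 1 to 9, row by row; (b) the graph of knight’s moves drawn on the board, with square 5 omitted; (c) the same graph rearranged into a circle, with the vertices in the order 1, 8, 3, 4, 9, 2, 7, 6]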
A-Level only
Figure (b) is not much help in solving the problem. Now imagine that all the vertices are joined by a single
string, and rearrange the string so that the vertices form a circle – this gives us a much more revealing
picture. There are only two ways to solve the puzzle in the minimum number of moves; move the knights
along the edges in either a clockwise or a counter-clockwise direction until each of the knights reaches
the diagonally opposite corner for the first time.
This is the “graph unfolding” method of solution, equivalent to a general problem that has already been
solved in the same way, so is a reduction of the more general problem.
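Both the abstraction and the unfolding can be checked by program. The Python sketch below builds the graph of knight’s moves on the numbered board, walks round it to confirm that the vertices form a single cycle, and then uses a breadth-first search over board positions to find the minimum number of moves. The function names and the choice of breadth-first search are illustrative assumptions.

```python
from collections import deque


def knight_moves(square):
    """Squares reachable by a knight from the given square (numbered 1 to 9, row by row)."""
    row, col = divmod(square - 1, 3)
    targets = []
    for dr, dc in [(1, 2), (1, -2), (-1, 2), (-1, -2),
                   (2, 1), (2, -1), (-2, 1), (-2, -1)]:
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            targets.append(r * 3 + c + 1)
    return sorted(targets)


# Square 5 has no moves, so it is left out of the graph
graph = {s: knight_moves(s) for s in range(1, 10) if knight_moves(s)}

# "Unfold" the graph: walk from square 1, never stepping straight back
cycle, previous, current = [1], None, 1
while True:
    nxt = [s for s in graph[current] if s != previous][0]
    if nxt == 1:
        break
    cycle.append(nxt)
    previous, current = current, nxt
print("Unfolded cycle:", cycle)


def min_moves(start_white, start_black):
    """Breadth-first search for the fewest moves that swap the two colours."""
    start = (frozenset(start_white), frozenset(start_black))
    goal = (frozenset(start_black), frozenset(start_white))
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        (white, black), moves = queue.popleft()
        if (white, black) == goal:
            return moves
        for colour, pieces in (("w", white), ("b", black)):
            for src in pieces:
                for dst in graph[src]:
                    if dst in white or dst in black:
                        continue   # square already occupied
                    moved = (pieces - {src}) | {dst}
                    state = (moved, black) if colour == "w" else (white, moved)
                    if state not in seen:
                        seen.add(state)
                        queue.append((state, moves + 1))


# White knights start on the bottom corners (7 and 9), black on the top corners (1 and 3)
print("Minimum moves:", min_moves({7, 9}, {1, 3}))
```

The search has only a few hundred positions to examine, because the abstraction has discarded everything about the board except which squares are joined by a knight’s move.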
Q3: What is the total number of moves required to switch the knights to the opposite side of
the board?
Automation
Automation in computer science deals with building and putting into action models to solve problems.
For example, you could model the financial implications of running an ice-cream stand at a given venue
for a week or a longer period. You have to decide on what has to be included in the model and what
assumptions you are going to make. Then you have to create and implement the algorithms and execute
and test the results.
[Figure: a physical world scenario or phenomenon is used to create an abstraction, the mathematical model; the model is automated to give an automaton, which in turn tells us more about the physical world.]
Automating the abstraction may in fact tell us more about the reality that we are modelling.
A-Level only
Exercises
1. A computer game is being designed to simulate cars on a race track. Abstraction has been used in
the design.
Explain how abstraction may be applied in the creation of the game. [3]
2. The goal in this problem is to place as many coins as possible at points of the 8-pointed star
depicted below, according to the following rules:
• Each coin must first be placed on an unoccupied point and then moved along a line to an
unoccupied point
• Once a coin has been positioned, it cannot be moved again.
[Figure: an 8-pointed star with its points numbered 1 to 8 clockwise]
Tip: Use the “graph unfolding” method of solution explained on the previous page.
A-Level only
• visualisation
• backtracking
• data mining
• heuristics
• performance modelling
• pipelining
Visualisation
The manner in which a problem is presented is often a very important factor in finding a solution.
Computers work with binary numbers but humans often prefer a visual image. Consider this
representation of a binary tree:
CHAPTER 52 – PROBLEM SOLVING
A-Level only
[Figure: binary tree with root Mike, left child George and right child Tara]
Q1: Suggest some other applications where an image is more useful for solving a problem than a
written description or other method of presenting the information.
Backtracking
In some problems, in order to find a solution you have to make a series of decisions, but there may be
cases for which:
• you don’t have enough information to know which is the best choice
• each decision leads to a new set of choices
• one or more of the sequences may be a solution to the problem
Backtracking is a methodical way of trying out different sequences until you find one that leads to a
solution. Solving a maze is a typical problem of this kind, and it is the technique used in a depth-first
traversal of a graph, covered in Chapter 63.
[Figure: tree of decisions leading from Start, in which each “?” is a decision point and some branches end in a dead end]
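A small maze solver shows the technique. The Python sketch below is a minimal illustration using an invented maze; it tries each direction in turn and, on reaching a dead end, undoes the last decision and tries the next alternative.

```python
MAZE = [
    "S.#.",
    ".##.",
    "...E",
]   # illustrative maze: S = start, E = exit, # = wall, . = open square


def solve(maze, row, col, path, visited):
    """Depth-first search with backtracking; returns a path to E, or None."""
    if not (0 <= row < len(maze) and 0 <= col < len(maze[0])):
        return None                      # off the edge of the maze
    if maze[row][col] == "#" or (row, col) in visited:
        return None                      # wall, or a square already tried
    path.append((row, col))
    visited.add((row, col))
    if maze[row][col] == "E":
        return path
    for dr, dc in [(0, 1), (1, 0), (0, -1), (-1, 0)]:   # right, down, left, up
        result = solve(maze, row + dr, col + dc, path, visited)
        if result is not None:
            return result
    path.pop()   # dead end: undo this decision and let the caller try another
    return None


route = solve(MAZE, 0, 0, [], set())
print(route)
```

The first move tried (to the right of the start) leads to a dead end, so that square is removed from the path before the search continues downwards.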
A-Level only
Data mining
Data mining is the process of digging through big data sets to discover hidden connections and predict
future trends, typically involving the use of different kinds of software packages such as analytics tools.
Big data is the term used for large sets of data that cannot be easily handled in a traditional database.
Big Data analysis is quite probably going to be the most exciting, interesting and useful field of study in
the computing world over the next decade or two. We are just at the beginning of exploring its massive
benefits in healthcare and medicine, business, communication, speech recognition, banking, and many
other fields. Here are some questions it can answer:
• Does cellphone use increase the likelihood of cancer? With six billion cellphones in the world, there is
plenty of data to analyse. (The answer turned out to be “No”!)
• How can you improve voice-translation software? By scoring the probability that a given digitised
snippet of voice corresponds to a specific word. Google has made use of this data in its speech
recognition software.
• How does the Bank of England find out whether house prices are rising or falling? By analysing
search queries related to property.
• How can online education programmers use data collection to improve the courses offered?
By studying data on the percentage of thousands of students registered who rewatched a segment
of the course, suggesting it was not clear, or collecting data on wrong answers to assignments.
The term “Big Data” was first coined in the early 2000s by scientists working in fields such as astronomy
and human genome projects, where the amount of data they were collecting was so massive that
traditional methods of organising and analysing data, such as relational databases, could no longer
be used.
Intractable problems
Some problems are termed intractable because although an algorithm may exist for their solution, it
would take an unreasonably long time to find the solution. An example of such a problem is known as
the Travelling Salesman Problem (TSP), which poses the question “Given a list of towns and the
distances between each pair of towns, what is the shortest possible route that the salesman can use to
visit each town exactly once and return to the starting point?”
This is different from finding the shortest path from A to B. This problem has many applications in fields
such as planning, logistics, the manufacture of microchips and DNA sequencing.
[Figure: weighted graph linking Bury St Edmunds, Stowmarket, Ipswich, Woodbridge, Wickham Market and Framlingham; the edges are labelled with the distances 57, 10, 25, 45, 31, 56, 9, 21 and 15]
A-Level only
To solve the problem, we could look first at a brute-force method, testing out every combination of
routes.
With just five cities, the number of possible routes is: 4! = 4 x 3 x 2 x 1 = 24.
A computer could calculate the best route in a fraction of a second.
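The brute-force method can be sketched in a few lines of Python. The five towns A to E and the distances between them are invented for illustration; the starting town is fixed, so only the order of the remaining four towns is varied.

```python
from itertools import permutations

# Hypothetical symmetric distances between five towns A to E
DIST = {
    ("A", "B"): 10, ("B", "C"): 12, ("C", "D"): 9, ("D", "E"): 11, ("A", "E"): 8,
    ("A", "C"): 25, ("A", "D"): 30, ("B", "D"): 28, ("B", "E"): 27, ("C", "E"): 26,
}


def distance(a, b):
    """Look up a distance in either direction."""
    return DIST[(a, b)] if (a, b) in DIST else DIST[(b, a)]


def brute_force_tsp(towns):
    """Try every ordering of the towns after the first.

    Returns (best length, best route, number of routes tried).
    """
    start, rest = towns[0], towns[1:]
    best_length, best_route, tried = None, None, 0
    for order in permutations(rest):
        route = (start,) + order + (start,)
        length = sum(distance(route[i], route[i + 1]) for i in range(len(route) - 1))
        tried += 1
        if best_length is None or length < best_length:
            best_length, best_route = length, route
    return best_length, best_route, tried


best_length, best_route, tried = brute_force_tsp(["A", "B", "C", "D", "E"])
print(best_length, best_route, tried)
```

Every additional town multiplies the number of routes to be examined, which is why this approach stops being usable so quickly.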
Q2: How many different routes are there for (a) 10 cities? (b) 20 cities? (c) 50 cities?
The problem is said to be intractable because it will take a long time for a fast computer to find the
optimal solution for even a relatively small number of cities, and using the brute force algorithm, the
problem rapidly becomes impossible to solve within a reasonable time as the number of cities increases.
             n = 10        n = 50              n = 100              n = 1000
log2 n       3.3           5.64                6.64                 9.97
n^2          100           2,500               10,000               1 million
n^3          1,000         125,000             1 million            1 billion
2^n          1,024         A 16-digit number   A 31-digit number    A 302-digit number
n!           3.6 million   A 65-digit number   A 158-digit number   A very, very large number!
Q3: Algorithms for problems A, B and C have time complexities O(n^3), O(2^n), O(n!). Using the table
above, which of A, B, C are tractable and which are intractable?
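The growth rates in the table can be reproduced directly, since Python integers are not limited in size. The sketch below is illustrative only; the helper describe is a hypothetical name, and the cut-off of ten digits for printing a value in full is an arbitrary choice.

```python
from math import factorial, log2


def describe(value):
    """Show small values exactly and large ones by their number of digits."""
    digits = len(str(value))
    return f"{value:,}" if digits <= 10 else f"a {digits}-digit number"


for n in (10, 50, 100, 1000):
    print(f"n = {n}")
    print(f"  log2 n = {log2(n):.2f}")
    print(f"  n^2    = {describe(n ** 2)}")
    print(f"  n^3    = {describe(n ** 3)}")
    print(f"  2^n    = {describe(2 ** n)}")
    print(f"  n!     = {describe(factorial(n))}")
```

The polynomial rows stay within numbers that can be written out in full, while the exponential and factorial rows can soon only be described by their length.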
Intractable problems, which have no efficient algorithms to solve them, are in fact quite common; so how
can solutions to these problems be found?
Heuristic methods
Not all intractable problems are equally hard, and not all instances of a given intractable problem are
equally hard. Brute-force algorithms are not the only option for solving these problems. It may be
quite simple to get an approximate answer, or an answer that is good enough for a particular purpose.
One approach is to find a solution which has a high probability of being correct.
Another approach is to solve a simpler or restricted version of the problem, if that is possible. This may
give useful insights into possible solutions.
An approach to problem solving which employs an algorithm or methodology not guaranteed to be
optimal or perfect, but is sufficient for the purpose, is called a heuristic approach. An adequate solution
may be achieved by trading optimality, completeness, accuracy or precision for speed. The objective is to
find a good solution in a reasonable time frame.
A-Level only
We often apply heuristics in our everyday lives – if we want to travel from A to B we may use a route that
we already know, even if it is not the best one. An employer who interviews several people for a job may
see several suitable candidates, and make a decision based on two or three factors, ignoring others which
may be relevant to the decision. In psychology, a heuristic is a mental shortcut that allows people to make a
judgement and solve problems, while being aware that the solution may not necessarily be the optimal one.
Returning to the Travelling Salesman Problem (TSP), a large number of heuristic solutions has been
developed, the best of which (developed in 2006) can compute a solution within two or three percent of
an optimal tour for as many as 85,000 “cities” or nodes.
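One of the simplest heuristics for the TSP is "nearest neighbour": from each town, travel to the closest town not yet visited. The Python sketch below illustrates it with invented distances between five towns A to E. It is not one of the advanced methods referred to above, and the tour it finds is not guaranteed to be the shortest.

```python
# Hypothetical symmetric distances between five towns A to E
DIST = {
    ("A", "B"): 10, ("B", "C"): 12, ("C", "D"): 9, ("D", "E"): 11, ("A", "E"): 8,
    ("A", "C"): 25, ("A", "D"): 30, ("B", "D"): 28, ("B", "E"): 27, ("C", "E"): 26,
}


def distance(a, b):
    """Look up a distance in either direction."""
    return DIST[(a, b)] if (a, b) in DIST else DIST[(b, a)]


def nearest_neighbour_tour(towns, start):
    """Greedy heuristic: always travel to the closest town not yet visited."""
    unvisited = set(towns) - {start}
    tour, current, length = [start], start, 0
    while unvisited:
        # sorted() makes the choice repeatable if two towns are equally close
        nearest = min(sorted(unvisited), key=lambda t: distance(current, t))
        length += distance(current, nearest)
        tour.append(nearest)
        unvisited.remove(nearest)
        current = nearest
    length += distance(current, start)   # return to the starting point
    tour.append(start)
    return tour, length


tour, length = nearest_neighbour_tour(["A", "B", "C", "D", "E"], "A")
print(tour, length)
```

The work done grows roughly with the square of the number of towns rather than with its factorial, which is the trade of optimality for speed described above.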
Performance modelling
Performance modelling is the process of simulating different user and system loads on a computer using
a mathematical approximation, rather than doing actual performance testing which may be difficult and
expensive. For example, it could be used to test the performance of a network under different conditions.
The output from the performance model may then be used to help with planning a new system which is
suited to the requirements of an organisation.
Pipelining
Pipelining is the technique of splitting tasks into smaller parts and overlapping the processing of each
part of the task. It is commonly used in microprocessors used in personal computers so that for example
while one instruction is being fetched, another is being decoded and a third, executed. It basically works
much like an assembly line.
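The overlap can be shown as a timetable. The Python sketch below assumes an idealised three-stage pipeline in which every stage takes exactly one clock cycle and nothing ever stalls; the instruction names are placeholders.

```python
STAGES = ["fetch", "decode", "execute"]


def pipeline_schedule(instructions):
    """Return, for each clock cycle, which instruction occupies each stage."""
    total_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for stage_index, stage in enumerate(STAGES):
            i = cycle - stage_index   # index of the instruction now at this stage
            if 0 <= i < len(instructions):
                row[stage] = instructions[i]
        schedule.append(row)
    return schedule


program = ["I1", "I2", "I3", "I4"]
schedule = pipeline_schedule(program)
for cycle, row in enumerate(schedule, start=1):
    print(cycle, row)
```

Under these assumptions four instructions complete in 6 cycles, compared with 12 cycles if each instruction had to finish all three stages before the next one started.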
Exercises
1. (a) Describe what is meant by data mining. [2]
2. Describe the key features of a backtracking algorithm. Give an example of a problem which
can be solved using this technique. [3]
3. Explain what is meant by a heuristic solution to a problem. Give an example of when such
a solution could be applied and why it would be an appropriate method. [4]
Section 11
Programming techniques
In this section:
Chapter 53 Programming basics
SECTION 11 – PROGRAMMING TECHNIQUES
What is an algorithm?
An algorithm is a set of rules or a sequence of steps specifying how to solve a problem. A recipe for
chocolate cake, a knitting pattern for a sweater or a set of directions to get from A to B, are all algorithms
of a kind. Each of them has input, processing and output.
Q1: What are the inputs and outputs in a recipe, a knitting pattern and a set of directions?
Ingredients
100g plain flour
2 eggs
300ml milk
1tbsp oil
Pinch salt
Method
Put flour and salt into a large mixing bowl and make a well in the centre.
Crack the eggs into the middle.
Pour in about 50ml milk and the oil.
Start whisking from the centre, gradually drawing the flour into the eggs, milk and oil, etc.
In the context of programming, the series of steps has to be written in such a way that it can be
translated into program code which is then translated into machine code and executed by the computer.
Using pseudocode
Whatever programming language you are using in your practical work, as your programs get more
complicated you will need some way of working out what the steps are before you sit down at the
computer to type in the program code. A useful tool for developing algorithms is pseudocode, which is
a sort of halfway house between English and program statements. There are no concrete rules or syntax
for how pseudocode has to be written, and there are different ways of writing most statements. We will
use a standard way of writing pseudocode that translates easily into a programming language such as
Python, Visual Basic or whatever procedural language you are learning.
This section does not teach you how to program in any particular programming language – you will learn
how to write programs in your practical sessions – but it will help you to understand and develop your
own algorithms to solve problems.
CHAPTER 53 – PROGRAMMING BASICS
This program will ask the user to input their name, and then display “Hello, Jo” or whatever name the
user entered. Notice that in this pseudocode, text such as “Hello, ” will be wrapped in speech marks to
distinguish it from variables.
We will normally use the pseudocode
myname = input("What is your name?")
which combines the print and input statements to display the prompt “What is your name” and then
waits for the user to enter text and press the ENTER key.
Comments
Note also that anything following a // will be treated as a comment and will have no effect on the running
of the program. Comments are very important when you come to code your programs, to document the
code (specifying the name, author, date written and purpose of the program, for example) and to explain
how any tricky bits of the program work.
Data types
Different data types are held differently in the computer’s memory so you need to use the correct data type
for the task. In Section 6 the most common data types built into programming languages were listed as:
In pseudocode you can assume that billBetween3 will be automatically defined as a real
variable and will return a value such as 6.666666667, though this may not be the case in every
programming language.
Q2: How could you convert this answer to a string variable and assign the answer to billstring?
Exponentiation
If you want to find, for example, 2⁵, 5 is called the exponent and you need to use exponentiation.
You can write this operation in pseudocode as
x = 2**5
String-handling functions
Programming languages have a number of built-in string-handling methods or functions. Some of the
common ones in a typical language are:
len(string)       Returns the length of a string
string.find(str)  Determines if str occurs in string. Returns index (the position of the first
                  character in the string) if found, and -1 otherwise. In our pseudocode we will
                  assume that string(1) is the first element of the string, though in Python, for
                  example, the first element is string[0]
ord("a")          Returns the integer value of a character (97 in this example)
chr(97)           Returns the character represented by an integer (“a” in this example)
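In Python these functions behave as follows. The word used is arbitrary, and positions are counted from 0, which is Python’s convention and not the one assumed for the pseudocode.

```python
word = "computer"

print(len(word))          # number of characters in the string
print(word.find("put"))   # index of the first match, counting from 0 in Python
print(word.find("xyz"))   # -1 when the substring is absent
print(ord("a"))           # character to integer code
print(chr(97))            # integer code to character
```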
Example:
date1 = date(2015,1,18)
date2 = date(2014,12,30)
days = date1-date2
print(date1, date2, days)
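The same calculation can be run in Python using the standard datetime module. This is a sketch for comparison; in Python, subtracting two dates gives a timedelta object, whose days attribute holds the whole number of days.

```python
from datetime import date

date1 = date(2015, 1, 18)
date2 = date(2014, 12, 30)
days = date1 - date2          # a timedelta object

print(date1, date2, days.days)
```

From 30 December to 18 January is 19 days.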
Some programming languages also allow you to define constants, whose value never changes while the
program is being run. For example, if your program involved calculating the area of a circle, you could
define pi at the start of the program as a constant having the value 3.14159. Or, you might hold the
company phone number as a constant, declared at the start of the program as
const companyPhone = "01453 123456"
The advantage of using a constant is that in a long, complex program there is no chance that a
programmer will accidentally change its value by using the identifier for a different purpose.
Some languages such as Python do not require you to declare variables before using them, and have no built-in way of declaring a constant – you just assign a value to a name as and when required in the program.
Exercises
1. A school keeps data about each of its pupils. State the most suitable data type for each of the
following data items:
Pupil’s surname
A single letter indicating whether they are male or female
The amount owed for school trips
The number of school trips they have participated in
Whether or not the pupil is entitled to free school meals [5]
2. (a) Write pseudocode for a program which asks the user to enter the total bill for a restaurant
meal, and the total number of people who had a meal. The program should add 10% to the
bill as a tip, and then calculate and display to the nearest penny what each person owes,
assuming the bill is evenly split. [6]
(b) Complete the following table showing an additional two sets of test data, the reason for
each test and the expected result. [6]
3. (a) Name two ways in which you can help to make your programs understandable to another
programmer. [2]
(b) Imagine that you have had a stall at the Summer Fayre. At the end of the day you count up
the number of each 1p, 2p, 5p, 10p, 20p and 50p coins you have received.
Write a pseudocode algorithm to allow the user to input the number of coins of each value,
and to calculate and display the total takings.
Make use of two ways of making the program understandable given in your answer to part (a). [6]
4. Below is an algorithm that adds VAT to the net price of an item and outputs the total price.
(a) Write down one example of the following from the above algorithm:
(i) a constant; (ii) a variable; (iii) a comment [3]
(b) Suggest three standards for naming variables, and give two reasons why such standards
are useful. [5]
Chapter 54 – Selection
Objectives
• Use relational operators
• Use Boolean operations AND, OR, NOT, XOR
• Use nested selection statements
Program constructs
There are just three basic programming constructs: sequence, selection and iteration.
Sequence is just two or more statements following one after the other, such as
n = input("Please enter a number: ")
nsquared = n * n
print("The square is",nsquared)
Selection
Selection statements are used to select which statement will be executed next, depending on some
condition. Conditions are formulated using relational operators.
Relational operators
The following operators may be used in pseudocode for making comparisons:
>   greater than              <=  less than or equal
<   less than                 ==  equal
>=  greater than or equal     !=  not equal
Q1: Are these operators the same as the ones used in the programming language you are learning?
If not, how are they different?
If … then … else
Selection statements can take different forms, for example:
if (expression1) then
(do these statements)
endif
If expression1 does not evaluate to TRUE, control passes to the next statement after the if statement.
Alternatively, you can specify what should happen if the condition does not evaluate to TRUE:
if (expression1) then
(do these statements)
else
(do these statements)
endif
For example:
if mark >= 50 then
print("Pass")
else
print("Fail")
print("You will have to retake this test.")
endif
Example 1
A bank offers different interest rates according to how much is in the account. There are three thresholds
of £500, £3,000 and £10,000:
If the amount is less than £500, rate = 1%
If the amount is greater than or equal to £500 but less than £3,000, rate = 1.5%
If the amount is greater than or equal to £3,000 but less than £10,000, rate = 2%
If the amount is greater than or equal to £10,000, rate = 3.5%
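One way of expressing these thresholds is a chain of tests, each of which is reached only if the earlier ones were false. The Python sketch below is an illustration; the function name interest_rate is an assumption, not part of the original example.

```python
def interest_rate(amount):
    """Return the interest rate (as a percentage) for a given balance."""
    if amount < 500:
        return 1.0
    elif amount < 3000:      # reached only if amount >= 500
        return 1.5
    elif amount < 10000:     # reached only if amount >= 3000
        return 2.0
    else:                    # amount >= 10000
        return 3.5

print(interest_rate(2500))   # 1.5
```

Because the tests are made in ascending order, each branch needs only an upper limit; the lower limit is implied by the earlier tests having failed.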
Example 2
Perform different statements according to an option choice entered by the user.
switch choice:
1 :print("You have selected option 1")
(more statements here)
2 :print("You have selected option 2")
(more statements here)
3 :print("You have selected option 3")
(more statements here)
else
print("You must enter 1, 2 or 3")
endswitch
Example 3
A statement to calculate the number of days in the month between 2001 and 2009 may be written:
switch month:
"Jan","Mar","May","Jul","Aug","Oct","Dec": daysInMonth = 31
"Apr","Jun","Sep","Nov": daysInMonth = 30
"Feb" : if year MOD 4 == 0 then
daysInMonth = 29
else
daysInMonth = 28
endif
endswitch
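Not every language has a switch statement. As a sketch, the same selection can be written in Python with if … elif … else; the function name days_in_month and the error raised for an unrecognised month are assumptions added for illustration. (Python 3.10 and later also offer a match statement.)

```python
def days_in_month(month, year):
    """Return the number of days in a month given as a three-letter name."""
    if month in ("Jan", "Mar", "May", "Jul", "Aug", "Oct", "Dec"):
        return 31
    elif month in ("Apr", "Jun", "Sep", "Nov"):
        return 30
    elif month == "Feb":
        # Simple leap-year test, adequate for 2001-2009 as in the example
        return 29 if year % 4 == 0 else 28
    else:
        raise ValueError("Unknown month: " + month)

print(days_in_month("Feb", 2004))   # 29
print(days_in_month("Feb", 2005))   # 28
```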
Example 4
if (a >= b) AND (a >= c) then
max = a
else if (b >= a) AND (b >= c) then
max = b
else
max = c
endif
Example 5
Write pseudocode for a program to allow the user to input the day of the week and output “Weekday” or
“Weekend”.
day = input("Enter day of week: ")
if (day == "Saturday") OR (day == "Sunday") then
print("Weekend")
else
print("Weekday")
endif
Example 6
A tourist attraction has a daily charge for children of £5.00 on a weekday, or £7.50 on a weekend or bank
holiday. Adults are charged £8.00 on weekdays and £12.00 on weekends and bank holidays.
Write pseudocode to allow the user to calculate the charge for a visitor.
day = input("Enter W for weekend, B for bank holiday or D for weekday: ")
visitor = input("Enter A for adult, C for child: ")
if ((day == "W") OR (day == "B")) AND (visitor == "A") then
charge = 12.0
else if ((day == "W") OR (day == "B")) AND (visitor == "C") then
charge = 7.5
else if (visitor == "A") then
charge = 8.0
else
charge = 5.0
endif
Notes: It is important to use brackets and to get them in the correct place to avoid any confusion over
which operator is processed first. In standard Boolean logic the precedence rules make NOT
highest, then AND, then OR, so NOT takes precedence over AND, and AND takes precedence over OR.
Add extra brackets if you are in any doubt!
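The effect of leaving out brackets can be seen in a short Python sketch, since Python's not, and and or operators follow the same precedence. The values chosen for day and visitor are just one case in which the two versions of the condition from Example 6 disagree.

```python
day = "W"        # weekend
visitor = "A"    # adult

# Without brackets, AND is evaluated before OR, so this is read as
# (day == "W") or ((day == "B") and (visitor == "C"))
unbracketed = day == "W" or day == "B" and visitor == "C"

# With brackets, the OR is evaluated first, as intended in Example 6
bracketed = (day == "W" or day == "B") and visitor == "C"

print(unbracketed, bracketed)   # True False
```

The unbracketed version would wrongly treat an adult visiting at the weekend as matching the child condition.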
Exercises
1. Below is a segment of an algorithm.
swimTime = FALSE
if (Membership == "Premier") then
swimTime = TRUE
else if ((Membership == "Adult") AND (Day == "Weekday") AND
(Time < 1500)) OR
((Membership == "Adult") AND (Day == "Weekend")) then
swimTime = TRUE
else if (Membership == "Junior") AND (Day == "Weekend") then
swimTime = TRUE
endif
Write down the values of swimTime after the segment of the algorithm has executed for the
following data:
2. (a) Write a pseudocode algorithm for a program which calculates the cost of carpeting a room.
The carpet is supplied in a roll 4m wide. The cost of the carpet is £10 per square metre.
The program should ask the user to enter the longest dimension (length) and shortest dimension
(width) of the room, then calculate and display the length and width and cost of carpet that
will be supplied.
You can assume that the width of the room is not more than 4m. If a width of more than 4m is
entered, display an error message and quit the program.
(b) Calculate the expected results for the following room sizes:
Length = 5, width = 3
Length = 5, width = 4
Length = 3, width = 2
Length = 3.9, width = 2
Length = 6, width = 5 [5]
Chapter 55 – Iteration
Objectives
• Understand and use three different types of iterative statement WHILE, REPEAT and FOR
Performing a loop
In the last two chapters we looked at sequence and selection statements. The third basic programming
construct is iteration. Iteration means repetition, so iterative statements always involve performing a
loop in the program to repeat a number of statements. There are three different types of loop to be
considered, although some programming languages do not implement all three.
Test this algorithm with temperatures 8, 12 and -100. We can draw a trace table showing the value of the
variables as they change during execution of the program.
Q1: Complete the trace table. What is the average temperature calculated by this algorithm?
You should have ended up with 3 temperatures and an average temperature of -26.66667 instead of
10. The problem is that the expression controlling the loop is tested only once each time round, at the
beginning of the loop, and not after each statement within the loop as it is executed. Therefore, we have
to make sure that as soon as the number -100 is entered, the next thing that happens is that the Boolean
expression is tested.
temp = 0 // initialise temp
totalTemp = 0 // initialise total of Temperatures
numberOfTemps = 0 // initialise number of temperatures
temp = input("Enter first temperature") // input first temperature
while temp != -100
totalTemp = totalTemp + temp
numberOfTemps = numberOfTemps + 1
temp = input("Enter next temperature")
endwhile
averageTemp = totalTemp/numberOfTemps
print("Average temperature: ", averageTemp)
Note that with a while ... endwhile loop, if the Boolean expression is FALSE at the start, the loop will not
be executed at all and control will pass straight to the next statement after endwhile.
Q2: What will happen if the first temperature entered is -100? Alter the algorithm to ensure that the
program displays a suitable message.
Note: Python does not support a repeat … until statement, but the same output can be achieved
with a while … endwhile loop.
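A common way of imitating repeat … until in Python is an endless while loop that is left with break once the condition becomes true. In the sketch below the list answers stands in for successive user inputs, an assumption made so that the code can be run without a keyboard; the function name is also hypothetical.

```python
def ask_until_stop(answers):
    """Count how many goes are taken before "N" or "n" is answered."""
    goes = 0
    remaining = iter(answers)
    while True:                        # the loop body always runs at least once
        goes = goes + 1
        another_go = next(remaining)   # stands in for input("Another go? ")
        if another_go in ("N", "n"):   # the 'until' condition, tested at the end
            break
    return goes

print(ask_until_stop(["Y", "y", "N"]))   # 3
```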
Example 1
Write pseudocode for a program which tests someone on the squares of numbers up to 25.
// program to test a user on the squares of numbers
// random(a,b) generates a random integer between a and b
repeat
num = random(1,25)
numsquare = num * num
answer = input("What is the square of ", num, "? ")
if answer == numsquare then
print("correct, well done")
else
print("No, it is ", numsquare)
endif
anotherGo = input("Another go? Answer Y or N: ")
until (anotherGo == "N") OR (anotherGo == "n")
Q3: Rewrite this algorithm using a while … endwhile loop. Which loop do you think is preferable for
performing this task?
The value of count starts at 2 and is incremented each time round the loop. When it reaches 12, the loop
terminates and the next statement is executed.
Nested loops
Loops can be “nested” one inside another. Suppose we want to display all the multiplication tables
between 2 and 12. We can do this with two FOR loops, one inside the other.
Example 2
for table = 2 to 12
for count = 2 to 12
product = table * count
print(table, " x ", count, " = ", product)
next count
next table
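The nested loops can be sketched in Python as follows. Collecting the lines in a list (rather than printing each one) is a choice made here so that the number of lines produced can be checked; the function name is an assumption.

```python
def multiplication_tables(first, last):
    """Return every line of the tables from first to last inclusive."""
    lines = []
    for table in range(first, last + 1):        # outer loop: one pass per table
        for count in range(first, last + 1):    # inner loop: runs fully each time
            product = table * count
            lines.append(f"{table} x {count} = {product}")
    return lines

lines = multiplication_tables(2, 12)
print(lines[0])      # 2 x 2 = 4
print(lines[-1])     # 12 x 12 = 144
print(len(lines))    # 121 lines: 11 tables of 11 rows each
```

The inner loop runs through all of its values for every single value of the outer loop, which is why 11 x 11 lines are produced.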
Example 3
Use a random number generator to simulate throwing a die to find out how many throws it takes to
get a 6.
totalThrows = 0
answer = "y"
while (answer == "y") OR (answer == "Y")
numberOfThrows = 0
throw = 0
while throw != 6
throw = random(1,6)
numberOfThrows = numberOfThrows + 1
print("You threw a ", throw)
endwhile
print("That took ", numberOfThrows," throws")
answer = input("Another go? (Y or y): ")
endwhile
Q5: What will happen if the user answers “yes” in answer to the question "Another go? (Y or y)"?
Example 4
You can count backwards as well as forwards in a for … next loop. Here is a pseudocode program
which uses the ’sleep’ method to count down in seconds to blast-off. It starts by importing an external
module called time which contains a built-in method sleep:
import time // imports an external module
ReadyForCountdown = input("Press enter when you're ready to start")
for sec = 10 to 0 step -1
print(sec)
time.sleep(1) // suspends execution for 1 second
next sec
print("BLAST-OFF!")
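In Python, counting backwards is done by giving range a negative step. The sketch below wraps the countdown in a function with a delay parameter; both the function name and the parameter are assumptions, added so that the loop can be tried without waiting.

```python
import time

def countdown(start, delay=1):
    """Print start, start-1, ... 0 with a pause between numbers."""
    for sec in range(start, -1, -1):   # the stop value -1 is not included
        print(sec)
        time.sleep(delay)              # suspends execution for delay seconds
    print("BLAST-OFF!")

countdown(3, delay=0)   # prints 3, 2, 1, 0 and then BLAST-OFF!
```

Calling countdown(10) would reproduce the ten-second countdown of the pseudocode.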
Exercises
1. Write a pseudocode algorithm to allow the user to input two integers highestNumber and
multiplier. The program should output the results of multiplying integers 2, 3… highestNumber
by multiplier.
For example if the user enters 100 for highestNumber and 7 for multiplier the program should
output the numbers 14, 21 … 98. [5]
2. Write pseudocode for a program that asks the user which times table they would like to be
tested on, and then gives them 5 random questions on this table, telling them each time whether
they got the answer right or wrong. [5]
Chapter 56 – Subroutines and recursion
Types of subroutine
A subroutine is a named block of code which performs a specific task within a program.
Most high-level languages support two types of subroutine, functions and procedures, which are
called in a slightly different way. Some languages such as Python have only one type of subroutine,
namely functions.
All programming languages have ‘built-in’ functions which you will have already used if you have written
any programs. For example, in Python:
myName = input("What is your name? ")
print("Hello, ", myName)
A subroutine is called by writing its name in a program statement. Some functions return a result, like
the input function above, and some do not return any result, like the print function. Notice that the first
statement above combines the print and input functions; when the statement is executed, the computer
will display the question “What is your name?” and wait for the user to input an answer, which will be
assigned to the variable myName.
In languages which distinguish between functions and procedures, a function is called like the input
function above and always returns a value, which is normally assigned to a variable or used in an
expression. A procedure is called by writing its name without assigning any result to a variable, like the
print statement above. However, as we shall see later, a procedure can still pass values back to the
calling program if necessary.
In Chapter 53, we listed some string-handling functions, and we can write, for example, pseudocode
such as
x = int("567")
to call the int function, which will convert the string “567” into an integer.
Q1: List some other functions you have used in your programs or pseudocode algorithms.
Q2: In some languages, sqrt is a function which returns the square root of a number. What value
will be assigned to the variable z by the statement
z = sqrt(25)
User-written subroutines
You can write your own subroutines (functions and/or procedures) and call them from within the program
as many times as needed. The subroutine first needs to be defined, typically above the code in the main
program.
Example 1
Using pseudocode, write a subroutine which displays a menu of 3 options in a game.
procedure displayMenu // declare the subroutine
print("Option 1: Display rules")
print("Option 2: Start new game")
print("Option 3: Quit")
print("Enter 1, 2 or 3: ")
endprocedure
To call the subroutine from the main program, you simply write its name:
displayMenu
This subroutine always produces the same result whenever it is called; it simply displays this menu.
Example 2
Sometimes, you may want a subroutine to return a value to the main program:
function getChoice
print("Option 1: Display rules")
print("Option 2: Start new game")
print("Option 3: Quit")
print("Enter 1, 2 or 3: ")
choice = input()
return choice
endfunction
// main program starts here
option = getChoice
print("You have chosen ",option)
In this example, when the program is run, the first line to be executed is the first statement in the main
program, option = getChoice. The subroutine is called, it displays the menu, gets the user’s
choice in choice and returns this to the main program using the statement return choice.
Execution continues where it left off, at the statement print("You have chosen ", option).
The subroutine is called in a slightly different way from the subroutine displayMenu – compare this to
the two different ways in which built-in print and input subroutines are called.
print("What is your name?")
myName = input()
print("Hello, ",myName)
The print subroutine does not return a value, the input subroutine does.
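Python has only functions, but the same distinction can still be seen: a function with no return statement hands back the special value None. The two subroutines below are illustrative sketches, and the name square is an assumption.

```python
def display_menu():
    """Acts like a procedure: does its work and returns nothing."""
    print("Option 1: Display rules")
    print("Option 2: Start new game")
    print("Option 3: Quit")

def square(n):
    """Acts like a function: returns a value to the caller."""
    return n * n

result = display_menu()   # nothing useful is returned
print(result)             # None
print(square(7))          # 49
```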
In some programming languages, parameters may be passed in different ways. If a parameter is passed
by value, its actual value is passed to the subroutine, where it is treated as a local variable. Changing a
parameter inside the subroutine will not affect its value outside the subroutine.
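In Python, arguments are passed as references to objects: reassigning a parameter inside a subroutine
does not affect the variable in the calling program, so simple values such as numbers and strings behave
as if passed by value, although a mutable object such as a list can be changed in place.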
In Visual Basic or Pascal, parameters may be passed by value but they may also be passed by
reference. In this case, the address, and not the value, of the parameter is passed to the subroutine.
Therefore, if the value is multiplied by three, for example, its value in the main program will reflect that
change since it is referring to the same memory location.
To pass by reference in Pascal, the procedure header will specify that the relevant parameter is a variable.
For example:
procedure abc(x, y : integer; var z : integer);
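The practical difference can be observed in a short Python sketch. Python passes references to objects, so an integer, which cannot be altered in place, behaves as if passed by value, while changes made to the contents of a list are seen by the caller, which resembles passing by reference. The function names are hypothetical.

```python
def triple_number(n):
    n = n * 3                       # rebinds the local name only
    return n

def triple_items(values):
    for i in range(len(values)):
        values[i] = values[i] * 3   # changes the caller's list in place

x = 5
triple_number(x)
print(x)                 # 5: the caller's integer is unchanged

numbers = [1, 2, 3]
triple_items(numbers)
print(numbers)           # [3, 6, 9]: the same list object was modified
```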
Example 3
Consider a simple subroutine which calculates the volume of a cylinder. In the main program, the user
is asked to enter values for the radius and length of the cylinder. These variables are then passed as
parameters to the subroutine for use in the calculation.
The values of the parameters radius and length in line 9 are passed to the subroutine where they are
referred to using the identifiers r and len respectively. The order in which the parameters are written
when calling the subroutine is important: radius is passed to r, length is passed to len.
The return value vol is passed back to the main program, where it is assigned to volume in line 9.
1 function cylinderVolume(r,len)
2 pi = 3.142
3 vol = pi*r*r*len
4 return vol
5 endfunction
6 // main program
7 radius = input("Enter the radius of the cylinder: ")
8 length = input("Enter the length of the cylinder: ")
9 volume = cylinderVolume(radius,length)
10 print("The volume of the cylinder is ", volume)
Q3: Line numbers have been added to each statement of the above pseudocode for reference.
Write down the statement numbers in the order in which they are executed.
Q4: Write pseudocode for a program which calls a function addNumbers(n,m) to add all the
numbers between 5 and 10. The result should be returned to the main program and displayed.
Q5: Write pseudocode for a program which asks the user to enter a name and percentage mark. It
then calls a subroutine which assigns a grade to the mark which it passes back to the main program,
where it is printed together with the name. Grades are assigned as follows:
mark >= 80: Distinction
mark between 65 and 79: Merit
mark between 50 and 64: Pass
mark < 50: Fail
Q6: Can you find a local variable in function cylinderVolume(r,len) in Example 3 above?
What would happen if you tried to print its value in the main program?
The ability to declare local variables is very useful because it ensures that each subroutine is completely
self-contained and independent of any global variables that have been declared in the main program.
The principle of encapsulation of all the variables needed in a subroutine is very important in
programming. A subroutine written according to this principle can be tested independently, and used
many times in many different programs without the programmer needing to know what variables it uses.
Any variable in the calling program which coincidentally has the same name as a local variable declared in
the subroutine will not cause an unexpected side-effect.
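The point about coinciding names can be checked with a few lines of Python. In this sketch (the names total and add_up are chosen only for illustration) the subroutine's local variable has the same name as a global variable, yet the global keeps its value.

```python
total = 100               # global variable in the main program

def add_up(numbers):
    total = 0             # local variable: a separate name from the global
    for n in numbers:
        total = total + n
    return total

print(add_up([1, 2, 3]))  # 6
print(total)              # 100: the global is untouched
```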
Example 4
1 procedure printNumbers(x)
2 a = 1
3 b = 2
4 c = 3
5 print("In the subroutine, a,b,c and x have values ", a,b,c,x)
6 endprocedure
7 // main program
8 a = 4
9 b = 5
10 c = 6
11 x = 10
12 print("In the main program, a,b,c and x have values ", a,b,c,x)
13 printNumbers(x)
14 print("In the main program, a,b,c and x now have values ", a,b,c,x)
Q7: Write down the line numbers of the statements in the order in which they are executed.
Modular programming
When a program is short and simple, there is no need to break it up into subroutines. With a long,
complex program, however, a top-down approach, in which the problem is broken down into a number
of subtasks, is generally very helpful in designing the algorithm for a satisfactory solution.
Recursion
Definition of a recursive subroutine
A subroutine is recursive if it is defined in terms of itself. The process of executing the subroutine is
called recursion. A recursive routine has three essential characteristics:
• A stopping condition or base case must be included which when met means that the routine will not
call itself and will start to ‘unwind’
• For input values other than the stopping condition, the routine must call itself
• The stopping condition must be reached after a finite number of calls
A-Level only
Recursion is a useful technique for the programmer when the algorithm itself is essentially recursive.
Some algorithms can be written using either recursion or iteration. Recursive routines are often much
shorter, but more difficult to trace through. If a recursive routine is called a very large number of times
before the stopping condition is reached, the program may crash with a “Stack overflow” error (see
below, “Use of the call stack”). An iterative routine, on the other hand, has no limit to the number of
times it may be called.
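The limit on recursion can be demonstrated in Python, which raises a RecursionError rather than crashing when too many calls are nested (the default limit is around 1000 calls). The two routines below are sketches written for this illustration; they add up the integers from 1 to n.

```python
def sum_to_recursive(n):
    if n == 0:                       # stopping condition
        return 0
    return n + sum_to_recursive(n - 1)

def sum_to_iterative(n):
    total = 0
    for i in range(1, n + 1):
        total = total + i
    return total

print(sum_to_recursive(100))         # 5050
print(sum_to_iterative(100000))      # 5000050000

try:
    sum_to_recursive(100000)         # far more nested calls than the default limit
except RecursionError as error:
    print("Recursion failed:", error)
```

The iterative version copes with the larger value because it uses a single call with a loop, so nothing accumulates on the call stack.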
Example
A simple example of a recursive routine is the calculation of a factorial, where n! (read as n factorial or
factorial n) is defined as follows:
If n = 0 then n! = 1
otherwise n! = n x (n-1) x (n-2) … x 3 x 2 x 1
Thus for example 5! = 5 x 4 x 3 x 2 x 1
If we were calculating this manually, we would probably calculate 5 x 4 = 20, then multiply 20 by 3 and so on.
The calculation could be written as
5! = ((((5 x 4) x 3) x 2) x 1) = (((20 x 3) x 2) x 1) = ((60 x 2) x 1) = 120 x 1 = 120
This is essentially how recursion works. In pseudocode, it can be written like this:
function calcFactorial(n)
if n == 0 then
factorial = 1
else
factorial = n * calcFactorial(n-1)
print(factorial) //LINE A
endif
return factorial
endfunction
Nothing will be printed until the routine has stopped calling itself. As soon as the stopping condition is
reached, in this case n = 0, the variable factorial is set equal to 1, the return statement at the end
of the subroutine is reached and control is passed back (for the first time, but not the last) to the next
statement after the last call to calcFactorial, which is the print statement marked LINE A.
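A direct Python translation of the routine makes the order of printing easy to observe. This is a sketch for experimentation; the call with the value 4 is an arbitrary choice.

```python
def calc_factorial(n):
    if n == 0:
        factorial = 1
    else:
        factorial = n * calc_factorial(n - 1)
        print(factorial)          # LINE A: runs only as the calls unwind
    return factorial

result = calc_factorial(4)        # prints 1, 2, 6, 24 on separate lines
```

The smallest value is printed first because the print statement in each call is reached only after the call below it has returned.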
A-Level only
Return addresses, parameters and local variables (not used here) are put on the stack each time a
subroutine is called, and popped from the stack each time the end of a subroutine is reached. At Line 8,
for example, Line 9 (referred to here as Line A) is the first return address to be put on the stack, together
with the parameter 4, when printlist(x) is called from the main program.
Representations of the current state of the stack each time a recursive call is made, and the subsequent
“unwinding” are shown below.
              B2
       B3     B3     B3
A4     A4     A4     A4     A4
Q11: Write iterative and recursive routines to sum the integers held in a list numbers. Show how
each routine will be called.
Exercises
1. (a) A program may use global and local variables.
(i) Explain one difference between a global variable and a local variable. [2]
(ii) Describe what will happen if a programmer declares a global variable and a local variable
with the same name. [2]
(b) Jo has written a computer program to produce invoices for customers of her father’s plumbing
business.
To calculate the invoice total, the number of hours worked is rounded up to the next integer (e.g.
67 minutes would round up to 2 hours). This is then multiplied by the hourly rate. Finally, the cost
of parts is added.
01 REAL HourlyRate
...
40 PROCEDURE Initialise
41 HourlyRate = 15
42 END PROCEDURE
...
60 PROCEDURE CalculateTotal
61 INTEGER TimeInMinutes
62 INTEGER CostOfParts
63 INPUT TimeInMinutes
64 INPUT CostOfParts
65 OUTPUT TimeInMinutes DIV 60 + 1 * HourlyRate + CostOfParts
66 END PROCEDURE
State one global variable and one local variable in Jo’s code. [2]
(d) Evaluate the extract of Jo’s code. You should identify and explain the positive and negative
aspects of her coding style and the implications that this will have on the maintainability of
the program.
The quality of written communication will be assessed in your answer to this question. [8]
A-Level only
2. The words COW, BEEF, and FORTY have all their letters written in alphabetical order. Here is an
algorithm for a function which checks whether all the letters in a word are in alphabetical order.
01 FUNCTION IsInOrder(Word)
02 IF LENGTH(Word) = 1 THEN
03 RETURN TRUE
04 ELSE
05 FirstChar = First character in Word
06 RestOfWord = All characters in Word except the first
07 IF FirstChar > RestOfWord THEN
08 RETURN FALSE
09 ELSE
10 RETURN IsInOrder(RestOfWord)
11 END IF
12 END IF
13 END FUNCTION
(b) Explain the difference between the uses of the = sign in line 02 and in line 05, stating the
type of operation being carried out. [4]
(c) Line 07 compares the first character of the word with the rest of the word as shown below.
(d) State what is meant by recursion, using this algorithm as an example. [2]
(e) The algorithm is tested with the call IsInOrder("Z"). State the value which will be returned.
(f) Explain what happens if the algorithm is tested with a call IsInOrder("") where the value of
the argument is the empty string. [2]
(g) Explain what happens when the algorithm is tested with the call IsInOrder("APE").
You should show each call made, the lines of the algorithm executed and the return value
of each call. You may use a diagram. [6]
A-Level only
3. (a) Explain briefly the main features of a recursive procedure from the programmer’s point of
view. Explain what is required from the system in order to enable recursion to be used. [3]
procedure listProcess(numList)
if length(numList) > 0 then
Remove first element of numList and store in first
listProcess(numList)
append first to end of numList
endif
return numList
endprocedure
(i) Complete the following trace table if the list numbers is defined in the main program as
numbers = [3,5,10,2]
                      numList
length(numList)   [0]  [1]  [2]  [3]   first   new
4                 3    5    10   2     3
[6]
(ii) Explain what the subroutine does. [1]
Chapter 57 – Use of an IDE
Facilities of an IDE
When you create a program you will be using a software package that helps you write the code more
easily.
This is called an Integrated Development Environment or IDE.
The screenshot below shows the Komodo IDE being used.
The IDE provides many tools to help you enter, edit, compile, test and debug your programs.
Can you spot the logic error? Logic errors are usually much more difficult to find than syntax errors.
The IDE has various tools to help you.
• You can set a breakpoint in the program which will cause the program to stop on that line, so that
you can see whether it reaches that line.
• You can set a watch on a variable so that its value is displayed each time it changes
• You can step through a program a line at a time so that you can see what is happening
In this program, the variable max is the length of the list and it has been incorrectly set to 7. It should be
6. Once this has been corrected, the program returns the correct result:
Q1: Can you now deduce that the program is working correctly? If not, why not?
Q2: Suggest three more tests that could usefully be performed on the program to determine
whether it works correctly for any user input.
Figure 57.4
Test strategies
There are several different test strategies used by software development companies, some of which
are applicable to smaller software projects that you may have written. Test strategies are discussed in
Section 3, Chapter 11.
We need to choose test data that will test the outcome for any user input. To do this, we need to select
normal, boundary and erroneous data.
• normal data is data within the range that you would expect, and of the data type (real, integer, string,
etc.) that you would expect. For example, if you are expecting an input between 0 and 100, you
should test 1 and 99
• boundary data is data at the ends of the expected range – for example, test 0 and 100 to make sure
that these give the expected results if the valid range is between 0 and 100
• erroneous data is data that is either just outside an expected range, e.g. -1, 101 or is of the wrong
data type – for example, non-numeric characters when you are expecting a number to be input
For each test, you should specify the purpose of the test, the expected result and the actual result.
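The three kinds of test data can be tried out on a small validation routine. The Python sketch below assumes a hypothetical function is_valid_mark that accepts whole numbers from 0 to 100 typed in as text; each test records the data, the purpose and the expected result, and the actual result is printed alongside.

```python
def is_valid_mark(text):
    """Return True if text is a whole number between 0 and 100 inclusive."""
    try:
        mark = int(text)
    except ValueError:
        return False              # wrong data type, e.g. "abc"
    return 0 <= mark <= 100

# (test data, purpose, expected result)
tests = [
    ("50",  "normal data",                 True),
    ("0",   "boundary: lowest valid",      True),
    ("100", "boundary: highest valid",     True),
    ("-1",  "erroneous: just below range", False),
    ("101", "erroneous: just above range", False),
    ("abc", "erroneous: wrong data type",  False),
]

for data, purpose, expected in tests:
    actual = is_valid_mark(data)
    print(data, purpose, expected, actual)
```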
Example 1
The following algorithm is intended to calculate and print the average mark for each student in a class, for
all the tests they have attempted:
// average mark
students = input("How many students? ")
for n = 1 to students
name = input("Enter student name ")
totalMarks = input("Enter total marks for ", name)
numTests = input("How many tests has this student taken? ")
averageMark = round(totalMarks/numTests)
print ("Average mark = ",averageMark)
next n
Test   Test data                        Purpose             Expected result   Actual result
4      Amina: total marks 0, tests 0    No tests taken      0                 Program crashes
5      Number of students abc           Test invalid data                     Program terminates
You can probably think of some other input data that would make the program crash. For example,
what if the user enters 31.5 for the total marks? The program should validate all user input, so some
amendments will have to be made to the program before general release!
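One possible amendment is sketched below in Python. The function name average_mark and the decision to return None when no average can be calculated are assumptions made for illustration; a real program might instead ask the user to re-enter the data.

```python
def average_mark(total_marks, num_tests):
    """Return the rounded average, or None if it cannot be calculated.

    Both arguments are the raw strings typed by the user.
    """
    try:
        total = float(total_marks)    # accepts values such as "31.5"
        tests = int(num_tests)
    except ValueError:
        return None                   # non-numeric input
    if tests <= 0 or total < 0:
        return None                   # avoids dividing by zero
    return round(total / tests)

print(average_mark("120", "4"))   # 30
print(average_mark("0", "0"))     # None
print(average_mark("abc", "2"))   # None
```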
// main program
shift = 3
msg = input("Enter your message: ")
codedMessage = code(msg,shift)
print("The encoded message is: ", codedMessage)
Q4: What will be the output from the algorithm if the user inputs “Hi, Jo!”?
Explain briefly the purpose of the algorithm.
Dry-running a program
A useful technique to locate an error in a program is to perform a dry run, with the aid of a trace table.
As you follow through the logic of the program in the same sequence as the computer does, you note
down in the trace table when each variable changes and what its value is.
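A program can also be made to record its own trace, which is a useful check on a dry run done by hand. The Python sketch below is an illustration with made-up data: it keeps a running total and stores one row of values per pass through the loop.

```python
def trace_running_total(values):
    """Return one (value, total, count) row for each pass of the loop."""
    rows = []
    total = 0
    count = 0
    for value in values:
        total = total + value
        count = count + 1
        rows.append((value, total, count))   # one trace-table row per pass
    return rows

for row in trace_running_total([4, 6, 1]):
    print(row)
# (4, 4, 1)
# (6, 10, 2)
# (1, 11, 3)
```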
Exercises
1. A Python program is run in an IDE and gives an incorrect result in the output pane, as shown:
(c) Give the line number of the line that is causing the problem and write the correct statement. [2]
(d) Apart from debugging aids, identify three features of an IDE that you might use when
developing a program. [3]
2. Complete the trace table below to show how each variable changes when the algorithm is
performed on the test data given.
x = 0
y = 0
z = 0
w = input
repeat
x = x + w
y = y + 1
w = input
until w < 0
z = x/y
print z
Trace table (first two rows completed):
w    x    y    z
     0    0    0
5    5    1    0
Test data: 5 7 2 2 4 -1
[5]
Chapter 58 – Use of object-oriented techniques
Procedural programming
Programming languages have been evolving ever since the development of assembly languages.
High level languages such as Basic and Pascal are known as procedural languages, and a program
written in one of these languages is written using a series of step-by-step instructions on how to solve
the problem. This is usually broken down into a number of smaller modules, and the program then
consists of a series of calls to procedures or functions, each of which may in turn call other procedures
or functions.
In this method of programming, the data is held in separate primitive variables such as integer or char,
or in data structures such as array, list or string. The data may be accessible by all procedures in the
program (global variables) or local to a particular subroutine. Changes made to global data may affect
other parts of the program, either intentionally or unintentionally, and may mean other subroutines have to
be modified.
A-Level only
Object-oriented programming
In object-oriented programming, the world is viewed as a collection of objects. An object might be a
person, animal, place or event, for example. It could be something more abstract like a bank account or
a data structure such as a stack or queue that the programmer wishes to implement.
An object-oriented program is composed of a number of interacting objects, each of which is responsible
for its own data and the operations on that data. Program code in an object-oriented program creates
the objects and allows the objects to communicate with each other by sending messages and receiving
answers. All the processing that is carried out in the program is done by objects.
An object has behaviours. These are the actions that can be performed by an object; for example, a cat
can walk, pounce, catch mice, purr, miaow and so on.
Classes
A class is a blueprint or template for an object, and it defines the attributes and behaviours (known as
methods) of objects in that class. An attribute is data that is associated with the class, and a method is
a functionality of the class – something that it can do, or that can be done with it.
For example, a stock control system might be used by a bookshop for recording the items that it receives
into stock from suppliers and sells to customers. The only information that the stock class will hold in
this simplified system is the stock ID number, stock category (books, stationery, etc.), description, and
quantity in stock.
Part of a sample definition of a class named StockItem is given below. Program coding will vary
according to the language used.
* Stock class used to model a simple stock control system,
* allowing stock to be added and sold.
class StockItem
   * instance variables (properties/attributes)
   private stockID
   private category
   private description
   private qtyInStock
   * A procedure may take one or more parameters. It does not return a value.
   * A procedure with the name new is a constructor.
   public procedure new(aStockID, aCategory, aDescription, aQty)
      (instructions)
   endprocedure
   public procedure ReceiveStock(integer aQty)
      (instructions)
   endprocedure
   public procedure SellStock(integer aQty)
      (instructions)
   endprocedure
As a general rule, instance variables or attributes are declared private and most methods public, so that
other classes may use the methods of a class but may not see or change its attributes.
This principle of information hiding, where a class cannot directly access the attributes of another class
when they are declared private, is an important feature of object-oriented programming.
book1 is called a reference type variable, or simply a reference variable. Note that this is a different type
of variable from stockID or qtyInStock, which are string or integer variables.
Like primitive variables of type integer, double, char, string, reference variables are named
memory locations in which you can store information. However, a reference variable does not hold the
object – it holds a pointer or reference to where the object itself is stored.
A variable reference diagram shows in graphical form the new StockItem object referenced by the
variable book1. In the diagram, reference variables are shown as circles and primitive data types (and
string variables) are shown as rectangles.
[Figure: variable reference diagram. The reference variable book1 (StockItem) points to a StockItem
object with StockID = PT123, Category = Book, Description = Computer Science, QtyInStock = 35]
Sending messages
Messages can be categorised as either “getter” or “setter” messages. In some languages, “getter”
messages are written as functions which return an answer, and “setter” messages as procedures
which change the state of an object. This is reflected in the pseudocode used in this book.
The state of an object can be examined or changed by sending it a message, for example to get or
increase the quantity in stock. To get the quantity in stock of book1, for example, you could write:
quantity = book1.GetQtyInStock
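As a rough illustration, the StockItem class and a "getter" message might look like this in Python. This is only a sketch: the method names get_qty_in_stock, receive_stock and sell_stock and the quantities used are assumptions made for the example, and Python marks attributes as private by convention only (a leading underscore) rather than enforcing it.

```python
class StockItem:
    """Models one item in a simple stock control system."""

    def __init__(self, stock_id, category, description, qty):
        # The leading underscore marks each attribute as private by convention
        self._stock_id = stock_id
        self._category = category
        self._description = description
        self._qty_in_stock = qty

    def get_qty_in_stock(self):
        # "Getter" message: returns an answer without changing the object's state
        return self._qty_in_stock

    def receive_stock(self, qty):
        # "Setter" message: changes the state of the object
        self._qty_in_stock += qty

    def sell_stock(self, qty):
        # "Setter" message: reduces the quantity in stock
        self._qty_in_stock -= qty


# book1 is a reference variable pointing to a new StockItem object
book1 = StockItem("PT123", "Book", "Computer Science", 35)
book1.receive_stock(10)
book1.sell_stock(3)
quantity = book1.get_qty_in_stock()
print(quantity)  # 42
```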
Q2: In the class definition for Radio (shown in the figure below), add the missing instance variables,
and a procedure to set volume.
class Radio
* instance variables
private volume
* insert more instance variables here
Q3: Write pseudocode statements to instantiate two new radio objects named robertsRadio and
philipsRadio.
[Figure: a radio tuned to Radio Suffolk at volume 3, with controls to change the volume and switch on/off]
Each object belongs to a class, and all the objects in the same class have the same structure and methods
but they each have their own data. Objects belonging to a class are called instances of the class.
Encapsulation
An object encapsulates both its state (the values of its instance variables) and its behaviours or methods.
All the data and methods of each object are wrapped up into a single entity so that the attributes and
behaviours of one object cannot affect the way in which another object functions. For example, setting the
volume of the philipsRadio object to 5 has no effect on any other radio object.
Encapsulation is a fundamental principle of object-oriented programming and is very powerful.
It means, for example, that in a large project different programmers can work on different classes and
not have to worry about how other parts of the system may affect any code they write. They can also
use methods from other classes without having to know how they work.
Related to encapsulation is the concept of information hiding, whereby details of an object’s instance
variables are hidden so that other objects must use messages to interact with that object’s state.
To change the volume of the Roberts radio, for example, a programmer might write:
robertsRadio.setVolume(5)
A programmer using the method does not need to know how this is achieved. The documentation of
each method will specify the number and variable type of any arguments that need to be passed to the
method, and what value, if any, is returned by the method. The attribute volume cannot be seen or
changed directly; it can only be changed by sending a message (i.e. invoking the method).
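The idea can be sketched in Python as follows. The attribute and method names, and the second station name, are assumptions chosen for illustration; Python hides attributes by convention (a leading underscore) rather than by a private keyword.

```python
class Radio:
    """Each Radio object encapsulates its own station and volume."""

    def __init__(self, station):
        self._station = station   # hidden state, by convention
        self._volume = 0

    def set_volume(self, new_volume):
        # The only way other code is expected to change the volume
        self._volume = new_volume

    def get_volume(self):
        return self._volume


roberts_radio = Radio("Radio Suffolk")
philips_radio = Radio("Radio 4")   # hypothetical station for the example

roberts_radio.set_volume(5)        # send a message to one object
print(roberts_radio.get_volume())  # 5
print(philips_radio.get_volume())  # 0, the other object is unaffected
```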
Inheritance
Classes can inherit data and behaviour from a parent class in much the same way that children can
inherit characteristics from their parents. A “child” class in an object-oriented program is referred to as a
subclass, and a “parent” class as a superclass.
For example, we could draw an inheritance hierarchy for animals that feature in a computer game. Note
that the inheritance relationship in the corresponding inheritance diagram is shown by an unfilled arrow
at the “parent” end of the relationship.
[Figure: inheritance diagram. Cat and Rodent are subclasses of Animal; Mouse and Beaver are
subclasses of Rodent]
All the animals in the superclass Animal share common attributes such as name and position.
Animals may also have common procedures (methods), such as moveLeft, moveRight. A Cat may
have an extra attribute size, and an extra method pounce. A Rodent may have an extra method
gnaw. A Beaver may have an extra method, makeDam.
To code the class header for Cat, which is a subclass of Animal, in pseudocode we could write
something like
Class Cat inherits Animal
   private size
   public procedure new(aName, aSize)
      super.new(aName)
      size = aSize
   endprocedure
endclass
Polymorphism
Polymorphism refers to a programming language’s ability to process objects differently depending on
their class. For example, all objects in subclasses of Animal can execute the methods moveLeft,
moveRight, which will cause the animal to move one space left or right.
We might decide that a cat should move three spaces when a moveLeft or moveRight message is
received, and a Rodent should move two spaces. We can define different methods within each of the
classes to implement these moves, but keep the same method name for each class.
Defining a method with the same name and formal argument types as a method inherited
from a superclass is called overriding. In the example above, the moveLeft method in
each of the Cat and Rodent classes overrides the method in the superclass Animal.
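A short Python sketch of inheritance and overriding is given here, assuming a position attribute held as a single integer and hypothetical animal names. It is one possible way to express the idea, not the pseudocode used in this book.

```python
class Animal:
    def __init__(self, name, position=0):
        self.name = name
        self.position = position

    def move_left(self):
        self.position -= 1      # an Animal moves one space

    def move_right(self):
        self.position += 1


class Cat(Animal):              # Cat inherits from Animal
    def __init__(self, name, size):
        super().__init__(name)  # call the superclass constructor
        self.size = size

    def move_left(self):        # overrides Animal.move_left
        self.position -= 3

    def move_right(self):       # overrides Animal.move_right
        self.position += 3


class Rodent(Animal):
    def move_left(self):
        self.position -= 2

    def move_right(self):
        self.position += 2


# The same message produces different behaviour depending on the class
animals = [Animal("generic"), Cat("felix", "large"), Rodent("ratty")]
for animal in animals:
    animal.move_right()
    print(animal.name, animal.position)
```

Each object responds to the same move_right message, but the distance moved depends on the class of the object that receives it.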
[Figure: inheritance diagram for Animal and its subclasses, including Mouse and Beaver; an Animal
moves one space]
Q6: Suppose that tom is an instance of the Cat class, and jerry is an instance of the Mouse
class. What will happen when each of these statements is executed?
tom.moveRight()
jerry.moveRight()
Exercises
1. A sports club keeps details of its members. Each member has a unique membership number, first
name, surname and telephone number recorded. Three classes have been identified:
Member
JuniorMember
SeniorMember
The classes JuniorMember and SeniorMember are related, by single inheritance, to
the class Member.
(b) Programs that use objects of the class Member need to create a new member, edit a
member’s details, delete a member’s details, and show a member’s details. No other form
of access is to be allowed.
Complete the definition of the attributes and the procedure new for the Member class.
Class Member
private memberNumber
...
public procedure new(aMemberNumber, aFirstName, aSurname, aTel)
memberNumber = aMemberNumber
...
endprocedure
...
endclass [1]
(ii) Write a statement to create a new Member object with membership number A456,
first name John, surname Bell, telephone number 07981 345987. [2]
2. (a) In an object-oriented computer game there is a class called Crawlers. Two subclasses
of Crawlers are Spiders and Bugs. Draw an inheritance diagram for this. [2]
(b) For the subclass Spiders suggest:
(b) An object-oriented program stores details of a class Bird and a subclass Seagull,
defined as follows:
Class Bird
public procedure move
system.print("Birds can fly")
endprocedure
endclass
(i) What will be printed when the following lines are executed?
bird1.move
bird2.move [2]
Section 12
Algorithms
In this section:
Chapter 59 Analysis and design of algorithms
SECTION 12 – ALGORITHMS
A-Level only
Comparing algorithms
Algorithms may be compared on how much time they need to solve a particular problem. This is referred
to as the time complexity of the algorithm. The goal is to design algorithms which will run quickly while
taking up the minimal amount of resources such as memory.
In order to compare the efficiency of different algorithms in terms of execution time, we need to quantify
the number of basic operations or steps that the algorithm will need, in terms of the number of items to
be processed.
For example, consider these two algorithms, which both calculate the sum of the first n integers.
The second algorithm computes the same sum using a formula instead of a loop:
function sumIntegersMethod2(n)
sum = n * (n+1)/2
return sum
endfunction
The first algorithm performs one operation (sum = 0) outside the loop and n operations inside the for
loop, a total of n + 1 operations. As n increases, the extra operation to initialise sum becomes
insignificant, and the time taken grows in direct proportion to n. Its order of magnitude or time
complexity is basically n. The second algorithm, on the other hand, takes the same amount of time
whatever the value of n. Its time complexity is a constant.
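The two approaches can be compared in Python. The loop-based version here is a reconstruction from the description in the text (initialise sum, then add each integer in a for loop), so its exact form is an assumption; the second version follows the formula given in sumIntegersMethod2.

```python
def sum_integers_method1(n):
    # One assignment outside the loop, then n additions inside it
    total = 0
    for i in range(1, n + 1):
        total = total + i
    return total


def sum_integers_method2(n):
    # A single calculation, whatever the value of n
    return n * (n + 1) // 2


print(sum_integers_method1(100))  # 5050
print(sum_integers_method2(100))  # 5050
```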
We will return to this idea later in the chapter, but first, we need to look at some of the maths involved in
calculating the time complexity of different algorithms.
CHAPTER 59 – ANALYSIS AND DESIGN OF ALGORITHMS
Introduction to functions
The order of magnitude, or time complexity, of an algorithm can be expressed as a function of the size
of its input.
A function maps one set of values onto another.
A linear function
A linear function is expressed in general terms as f(x) = ax + c
Values of the function f(x) = 3x + 4 are shown below for x = 1, 10, 100, 10,000
x         3x        4    y = f(x)
1         3         4    7
10        30        4    34
100       300       4    304
10,000    30,000    4    30,004
Notice that the constant term has proportionally less and less effect on the value of the function as
the value of x increases. The only term that is significant is 3x, and f(x) increases in a straight line
as x increases.
A polynomial function
A polynomial function is expressed as $f(x) = ax^m + bx + c$
Values of the function $f(x) = 2x^2 + 10x + 50$ are shown below for x = 1, 10, 100, 10,000
The values of b and c have a smaller and smaller effect on the answer as x increases, compared with
the value of a. The only term that really matters is the term in $x^2$, if we are approximating the value of the
function for a large value of x.
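To see how the squared term comes to dominate, the values of this function can be computed with a few lines of Python. This is only an illustrative sketch; the column showing the share of the total contributed by the squared term is an addition made for the example.

```python
def f(x):
    # The polynomial 2x^2 + 10x + 50 from the text
    return 2 * x**2 + 10 * x + 50


for x in (1, 10, 100, 10_000):
    squared_term = 2 * x**2
    share = squared_term / f(x)   # fraction of f(x) due to the squared term
    print(x, squared_term, f(x), round(share, 4))
```

For x = 1 the squared term contributes only 2 of the total 62, whereas for x = 100 it contributes 20,000 of the total 21,050.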
An exponential function
An exponential function takes the form $f(x) = ab^x$. This function grows very large, very quickly!
Q2: What is the value of $f(x) = 2^x$ when x = 1? When x = 10? When x = 100?
A logarithmic function
A logarithmic function takes the form $f(x) = a \log_n x$
“The logarithm of a number is the power that the base must be raised to make it equal to the number.”
Values of the function $f(x) = \log_2 x$ are shown below for x = 1, 8, 1,024, 1,048,576.
x                       $y = \log_2 x$
1                       0
8 ($2^3$)               3
1,024 ($2^{10}$)        10
1,048,576 ($2^{20}$)    20
Permutations
A permutation of a set of objects is one way of arranging the objects in order, and the number of
permutations is the number of possible arrangements. For example, if you
have 3 objects A, B and C you can choose any of A, B or C to be the first object. You then have two
choices for the second object, making 3 x 2 = 6 different ways of arranging the first two objects, and
then just one way of placing the third object. The six permutations are ABC, ACB, BAC, BCA, CAB, CBA.
Q3: How many permutations are there of four objects? How many ways are there of arranging six
students in a line?
The formula for calculating the number of permutations of four objects is 4 x 3 x 2 x 1, written 4! and
spoken as “four factorial”. (Note that 10! is about 3.6 million… so don’t try getting 10 students to line up in all
possible ways!)
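The three-object example can be checked with Python's standard library, as a quick sketch: itertools.permutations generates the arrangements and math.factorial gives the count directly.

```python
from itertools import permutations
from math import factorial

# Generate every arrangement of the three objects A, B and C
arrangements = ["".join(p) for p in permutations("ABC")]
print(arrangements)       # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(len(arrangements))  # 6

# The count is n! for n objects
print(factorial(3))       # 6
print(factorial(10))      # 3628800
```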
Big-O notation
Now that we have got all the maths out of the way and hopefully understood, we can study the so-called
Big-O notation which is used to express the time complexity, or performance, of an algorithm. (‘O’
stands for ‘Order’.)
The best way to understand this notation is to look at some examples.
Q4: A hacker trying to discover a password starts by checking a dictionary containing 170,000
words. What is the maximum number of words he will need to try out?
This procedure fails to find the password. He now needs to try random combinations of the
letters in the password. He starts with 6-letter combinations of a-z, A-Z.
Explain why the second procedure will take so much longer than the first.
[Figure: graphs of $\log n$, $n$, $n^2$, $2^n$ and $n!$, plotting time against n]
To calculate the time complexity of the algorithm in Big-O notation, we need to count the number of
basic operations it performs. There is one initial statement, and n if statements, so the number of
operations is 1 + n. However, as we have already discussed, the 1 is insignificant compared to n and this algorithm
therefore executes in linear time and has time complexity O(n).
The second algorithm compares each value in the array to all the other values of the array, and if the
current value is less than or equal to all the other values in the array then it is the minimum.
for k = 0 to n-1
   isMinimum = True
   for j = 0 to n-1
      if arrayX[k] > arrayX[j] then
         isMinimum = False
      endif
   next j
   if (isMinimum) then
      minimum = arrayX[k]
   endif
next k
To calculate the time complexity of this algorithm, we count the number of basic operations it performs.
There are two basic operations in the outer loop (isMinimum = True and the final if statement),
which are each performed n times. The inner loop has one basic operation, performed $n^2$ times.
This gives us a total of $2n + n^2$ operations, but as discussed earlier, the only significant term is the one
in $n^2$. The time complexity is therefore $O(n^2)$.
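The difference between the two approaches can be made concrete by counting comparisons in Python. This is a sketch under assumptions: the linear version is written from the description in the text (one initial statement and n if statements), the function names are invented for the example, and the test list is arbitrary.

```python
def minimum_linear(values):
    # One initial statement, then one comparison per item: O(n)
    minimum = values[0]
    comparisons = 0
    for value in values:
        comparisons += 1
        if value < minimum:
            minimum = value
    return minimum, comparisons


def minimum_quadratic(values):
    # Compares every item with every item: O(n^2)
    minimum = None
    comparisons = 0
    for candidate in values:
        is_minimum = True
        for other in values:
            comparisons += 1
            if candidate > other:
                is_minimum = False
        if is_minimum:
            minimum = candidate
    return minimum, comparisons


data = [9, 5, 4, 15, 3, 8, 11]
print(minimum_linear(data))     # (3, 7)
print(minimum_quadratic(data))  # (3, 49)
```

For a list of 7 items the second version makes 49 comparisons against 7 for the first, and the gap widens rapidly as the list grows.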
Q5: What is the time complexity of each of the two subroutines sumIntegersMethod1 and
sumIntegersMethod2 discussed at the beginning of this chapter?
Exercises
1. Assuming a is an array of n elements, compute the time complexity of the following algorithm.
duplicate = False
for i = 0 to n - 2
   for j = i + 1 to n - 1
      if a[i] = a[j] then duplicate = True
   next j
next i                                                          [3]
n                      1    2    4    8    12               [4]
$f(n) = n^2$           1    4
$f(n) = 2^n$           2    4
$f(n) = \log_2 n$      0    1              3.585
$f(n) = n!$            1                   479,001,600
(b) Place the following algorithms in order of time complexity, with the most efficient
algorithm first. [2]
Algorithm A of time complexity O(n)
(c) Explain why algorithms with time complexity O(n!) are generally considered not to be helpful
in solving a problem. Under what circumstances would such an algorithm be considered? [3]
Linear search
Sometimes it is necessary to search for items in a file, or in an array in memory. If the items are not in any
particular sequence, the data items have to be searched one by one until the required one is found or the
end of the list is reached. This is called a linear search.
The following algorithm for a linear search of a list or array alist (indexed from 0) returns the index of
itemSought if it is found, -1 otherwise.
function linearSearch(alist, itemSought)
   index = -1
   i = 0
   found = False
   while i < length(alist) and found = False
      if alist[i] = itemSought then
         index = i
         found = True
      endif
      i = i + 1
   endwhile
   return index
endfunction
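A Python version of the same idea might look like this. It is a sketch rather than a line-by-line translation: it returns as soon as the item is found instead of using a found flag, and the sample list is an assumption.

```python
def linear_search(alist, item_sought):
    # Examine the items one by one from the start of the list
    for i, item in enumerate(alist):
        if item == item_sought:
            return i      # index of the first match
    return -1             # reached the end without finding the item


data = [9, 5, 4, 15, 3, 8, 11]
print(linear_search(data, 15))  # 3
print(linear_search(data, 7))   # -1
```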
Q1: What is the maximum number of items that would have to be examined to find a particular
item in a linear search of one million items? What is the average number that would have to be
searched?
A-Level only
Time complexity of linear search
We can determine the algorithm’s efficiency in terms of execution time, expressed in Big-O notation.
To do this, you need to compute the number of operations that the algorithm will require for n items.
The loop is performed n times for a list of length n, and there are two steps in the loop (an IF statement
and an assignment statement), giving a total of 3 + 2n steps (including 3 steps at the start). The constant
term and the coefficient of n become insignificant as n increases in size, and the time complexity of the
algorithm basically depends on how often the loop has to be performed in the worst-case scenario.
Therefore, the time complexity of the linear search is O(n).
CHAPTER 60 – SEARCHING ALGORITHMS
Binary search
The binary search is a much more efficient method of searching a list for an item than a linear search, but
crucially, the items in the list must be sorted. If they are not sorted, a linear search is the only option.
The algorithm works by repeatedly dividing in half the portion of the data list that could contain the
required data item. This is continued until the item is found or there are no items left to examine.
Consider the following ordered list where we wish to search for data item 50.
15 21 29 32 37 40 42 43 48 50 60 64 77 81 90 98
Stage 1: middle term is 43; we can therefore discard all data items less than or equal to 43. Note that
the middle item of an even number of items is obtained by rounding down; the middle item of 16 items is
item 8.
48 50 60 64 77 81 90 98
Stage 2: middle term is 64, so we can discard all data items greater than or equal to 64.
48 50 60
Which one of the following is the correct sequence of comparisons when used to locate the
data item 8?
(i) 12, 6, 8
(ii) 11, 5, 6, 8
Q3: Ask a friend to think of a number between 1 and 1000. Then use a binary search algorithm to
guess the number. How many different guesses will you need, at most?
Q4: Look at the following data list. Which items will you examine in (a) a linear search and
(b) a binary search to find the following data items?
(i) 27
(ii) 11
(iii) 60
9 11 19 22 27 30 32 33 40 42 50 54 57 61 70 78 85
Q5: An array contains 12 numbers 5, 13, 16, 19, 26, 35, 37, 57, 86, 90, 93, 98
Trace through the binary search algorithm to find how many items have to be examined before
the number 90 is found. The first row of the trace table is filled in below.
itemSought index found first upper midpoint aList(midpoint)
90 -1 false 0 11 5 35
Q6: What is the maximum number of items that would have to be examined to find a particular item
in a binary search of one million items?
A-Level only
A recursive algorithm
The basic concept of the binary search is in fact recursive, and a recursive algorithm is given below.
The procedure calls itself, eventually “unwinding” when the procedure ends. When recursion is used
there must always be a condition that if true, causes the program to terminate the recursive procedure,
or the recursion will continue forever.
Once again, first, last and midpoint are integer variables used to index elements of the array, with
first starting at 0 and last starting at the upper limit of the array index.
function binarySearch(aList, itemSought, first, last)
   if last < first then
      return -1
   else
      midpoint = integer part of (first + last) / 2
      if aList[midpoint] > itemSought then
         #itemSought is in first half of list
         return binarySearch(aList, itemSought, first, midpoint-1)
      else
         if aList[midpoint] < itemSought then
            #itemSought is in second half of list
            return binarySearch(aList, itemSought, midpoint+1, last)
         else
            #itemSought has been found
            return midpoint
         endif
      endif
   endif
endfunction
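The recursive algorithm translates almost directly into Python. The sketch here uses the ordered list from earlier in this chapter as test data; the function name is an assumption.

```python
def binary_search(a_list, item_sought, first, last):
    if last < first:
        return -1                      # base case: nothing left to search
    midpoint = (first + last) // 2     # integer part of the average
    if a_list[midpoint] > item_sought:
        # item_sought can only be in the first half
        return binary_search(a_list, item_sought, first, midpoint - 1)
    if a_list[midpoint] < item_sought:
        # item_sought can only be in the second half
        return binary_search(a_list, item_sought, midpoint + 1, last)
    return midpoint                    # base case: item found


data = [15, 21, 29, 32, 37, 40, 42, 43, 48, 50, 60, 64, 77, 81, 90, 98]
print(binary_search(data, 50, 0, len(data) - 1))  # 9
print(binary_search(data, 51, 0, len(data) - 1))  # -1
```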
Q7: What condition(s) will cause a value to be returned from the subroutine to the calling program?
                      50
          27                      62
    12          35          59          71
  9    14    28    41    52    60    68
function binarySearchTree(itemSought, currentNode)
   if currentNode = None then
      return False
   else
      if itemSought = item at currentNode then
         return True
      else
         if itemSought < item at currentNode then
            if left child exists then
               return binarySearchTree(itemSought, left child)
            else
               return False
            endif
         else
            if right child exists then
               return binarySearchTree(itemSought, right child)
            else
               return False
            endif
         endif
      endif
   endif
endfunction
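A possible Python sketch of the same search is given here. The Node class and the insert helper are assumptions added so that the example is self-contained; the values inserted are those of the tree shown above, added level by level so that the same shape is produced.

```python
class Node:
    def __init__(self, item):
        self.item = item
        self.left = None
        self.right = None


def insert(node, item):
    # Hypothetical helper used only to build the example tree
    if node is None:
        return Node(item)
    if item < node.item:
        node.left = insert(node.left, item)
    else:
        node.right = insert(node.right, item)
    return node


def binary_search_tree(item_sought, current_node):
    if current_node is None:
        return False                   # fell off the tree: not present
    if item_sought == current_node.item:
        return True
    if item_sought < current_node.item:
        return binary_search_tree(item_sought, current_node.left)
    return binary_search_tree(item_sought, current_node.right)


root = None
for value in [50, 27, 62, 12, 35, 59, 71, 9, 14, 28, 41, 52, 60, 68]:
    root = insert(root, value)

print(binary_search_tree(41, root))  # True
print(binary_search_tree(70, root))  # False
```

Because a missing child is represented by None, the separate "child exists" checks in the pseudocode are absorbed into the first test of the function.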
Exercises
1. (a) Data structures may be described as static or dynamic.
(ii) State one type of data structure that is always considered static.
Use the values given to show the first three stages for:
(iii) Explain the difference between binary searching and serial searching. [2]
(iv) State one advantage and one disadvantage of a binary search compared with
a serial search. [2]
What is the maximum number of names that would need to be accessed to determine if a
particular name appears in the list? [1]
A-Level only
(b) Which of the following is the order of time complexity of the binary search method?
Sorting algorithms
Sorting is a very common task in data processing, and frequently the number of items may be huge, so
using a good algorithm can considerably reduce the time spent on the task. There are many different
sorting algorithms and we will start by looking at a simple but inefficient example.
Bubble sort
The Bubble sort is one of the most basic sorting algorithms and the simplest to understand. The basic
idea is to bubble up the largest (or smallest) item to the end of the list, then the second largest, then the
third largest and so on until no more swaps are needed.
• Go through the array, comparing each item with the one next to it. If it is greater, swap them.
• The last element of the array will be in the correct place after the first pass.
• Repeat n-2 times, reducing by one on each pass the number of elements to be examined.
Q1: How do you swap two items in an array?
CHAPTER 61– BUBBLE SORT AND INSERTION SORT
Pass 1:   9 5 4 15 3 8 11
          5 9 4 15 3 8 11
          5 4 9 15 3 8 11
          5 4 9 15 3 8 11
          5 4 9 3 15 8 11
          5 4 9 3 8 15 11
          5 4 9 3 8 11 15
After the first pass, the largest item is in the correct place at the end of the list. On the second pass, only
the first six numbers are checked.
Pass 2:   4 5 3 8 9 11 15
11 and 15 are now in the correct place, so only the first five numbers are checked.
Pass 3:   4 3 5 8 9 11 15
9, 11 and 15 are now in the correct place, so only the first four numbers are checked.
Pass 4:   3 4 5 8 9 11 15
8, 9, 11 and 15 are now in the correct place, so only the first three numbers are checked.
Pass 5:   3 4 5 8 9 11 15
Pass 6:   3 4 5 8 9 11 15
Notice that in this case, no numbers were swapped on Pass 5. Therefore Pass 6 was not necessary.
In order to avoid performing unnecessary passes on a list that is already in sequence, a flag may be set
and tested on each pass so that if no swaps are made, no more unnecessary passes are made through
an already sorted list. This is shown in Example 2 on the next page.
Example 1
Write a pseudocode algorithm for a bubble sort to sort the numbers 9, 5, 4, 15, 3, 8, 11 into ascending
sequence. Print the numbers after each of the 6 passes through the list.
numbers = [9, 5, 4, 15, 3, 8, 11]
numItems = len(numbers)      #get number of items in the array
for i = 0 to numItems - 2
   for j = 0 to (numItems - i - 2)
      if numbers[j] > numbers[j + 1] then
         #swap the numbers in the array
         temp = numbers[j]
         numbers[j] = numbers[j + 1]
         numbers[j + 1] = temp
      endif
   next j
   print(numbers)
next i
[3, 4, 5, 8, 9, 11, 15]
[3, 4, 5, 8, 9, 11, 15]
[3, 4, 5, 8, 9, 11, 15]
The last pass through the list was not necessary.
Example 2
Amend the algorithm so that no unnecessary passes are made through the list.
numbers = [9, 5, 4, 15, 3, 8, 11]
numItems = len(numbers)      #get number of items in the array
i = 0                        #counts the passes completed
flag = True                  #indicates when a swap is made
while i < (numItems - 1) and (flag = True)
   flag = False
   for j = 0 to numItems - i - 2
      if numbers[j] > numbers[j + 1] then
         #swap the numbers in the array
         temp = numbers[j]
         numbers[j] = numbers[j + 1]
         numbers[j + 1] = temp
         flag = True
      endif
   next j
   i = i + 1
endwhile
print(numbers)
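The flagged version can be tried out in Python. In this sketch the function name and the returned pass count are assumptions added so that the effect of the flag can be observed.

```python
def bubble_sort(numbers):
    """Sorts the list in place and returns the number of passes made."""
    num_items = len(numbers)
    passes = 0
    swapped = True                      # plays the role of the flag
    while passes < num_items - 1 and swapped:
        swapped = False
        # Items already bubbled to the end are not examined again
        for j in range(num_items - passes - 1):
            if numbers[j] > numbers[j + 1]:
                numbers[j], numbers[j + 1] = numbers[j + 1], numbers[j]
                swapped = True
        passes += 1
    return passes


numbers = [9, 5, 4, 15, 3, 8, 11]
passes = bubble_sort(numbers)
print(numbers)  # [3, 4, 5, 8, 9, 11, 15]
print(passes)   # 5
```

The fifth pass makes no swaps, so the loop ends without a sixth pass.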
Insertion Sort
This is a sorting algorithm that sorts one data item at a time. It is rather similar
to how you might sort a hand of cards. The algorithm takes one data item
from the list and places it in the correct location in the list. This process is
repeated until there are no more unsorted data items in the list. Although more
efficient than the bubble sort, it is not as efficient as the merge sort or quick sort.
9, 5, 4, 15, 3, 8, 11

We leave the first item at the start of the list:              9 5 4 15 3 8 11
8 is now inserted into the sorted list (5th pass):             3 4 5 8 9 15 11
On each pass, the current data item is checked against those already in the sorted list (shaded in the
diagram). If the data item being compared in the sorted list is larger than the current data item, it is now
shifted to the right. This continues to happen until we reach a data item in the sorted list which is smaller
than the current data item.
For example, at the 5th pass 8 is compared with 15, and since it is smaller, 15 is shifted right.
8 is compared with 9, and 9 is shifted right.
8 is compared with 5, and as it is larger, it is inserted into the free space.
#main program
alist = [9,5,4,15,3,8,11]
insertionSort(alist)
print("sorted list ", alist)
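A minimal Python sketch of an insertion sort that matches the description above is given here; the variable names are assumptions. Each pass shifts larger items in the sorted part one place to the right and drops the current item into the free space.

```python
def insertion_sort(alist):
    # The first item is left in place; each later item is inserted in turn
    for index in range(1, len(alist)):
        current = alist[index]
        position = index
        # Shift larger items in the sorted part one place to the right
        while position > 0 and alist[position - 1] > current:
            alist[position] = alist[position - 1]
            position -= 1
        alist[position] = current      # insert into the free space


# main program
alist = [9, 5, 4, 15, 3, 8, 11]
insertion_sort(alist)
print("sorted list", alist)
```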
Q2: The following list of names is to be sorted into alphabetical sequence using an insertion sort.
George, Jane, Miranda, Ahmed, Sophie, Bernie, Keith.
(a) What is the first name to be moved? What will the list look like after this name is moved?
(b) What is the second name to be moved? What will the list look like after this name has
been moved?
Exercises
1. (a) A bubble sort is performed on the following list:
CHAPTER 62 – MERGE SORT AND QUICK SORT
A-Level only
Merge sort
The merge sort uses a divide and conquer approach. The list is successively divided in half, forming
two sublists, until each sublist is of length one. The sublists are then sorted and merged into larger
sublists until they are recombined into a single sorted list. The basic steps are:
• Divide the unsorted list into n sublists, each containing one element
• Repeatedly merge sublists to produce new sorted sublists until there is only one sublist remaining.
This is the sorted list.
The merge process is shown graphically below for a list in the initial sequence 5 3 2 7 9 1 3 8.
5 3 2 7 9 1 3 8
Merge Merge Merge Merge
3 5 2 7 1 9 3 8
Merge Merge
2 3 5 7 1 3 8 9
Merge
1 2 3 3 5 7 8 9
The list is first split into sublists each containing one element.
The merge process merges each pair of sublists into the correct sequence. Taking for example two
lists: leftlist = [2,3] and rightlist = [1,3], the merge process works like this:
1. Compare the first item in leftlist with the first item in rightlist
2. If item in leftlist < item in rightlist, add item from leftlist to mergedlist and read
the next item from leftlist
3. Otherwise, add item from rightlist to mergedlist and read the next item from
rightlist
4. Repeat from Step 2 until one of the lists is empty
5. Once one list is empty, copy any remaining items from the other list into mergedlist
The process is then repeated for each pair of sublists until the lists are merged into the final sorted list.
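The merge step on its own can be sketched in Python as follows, using the two example lists from the text. The function name merge is an assumption.

```python
def merge(leftlist, rightlist):
    mergedlist = []
    i = 0   # index of the next unread item in leftlist
    j = 0   # index of the next unread item in rightlist
    while i < len(leftlist) and j < len(rightlist):
        if leftlist[i] < rightlist[j]:
            mergedlist.append(leftlist[i])
            i += 1
        else:
            mergedlist.append(rightlist[j])
            j += 1
    # One list is now empty; copy whatever remains in the other
    mergedlist.extend(leftlist[i:])
    mergedlist.extend(rightlist[j:])
    return mergedlist


print(merge([2, 3], [1, 3]))  # [1, 2, 3, 3]
```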
An algorithm for the merge sort is given below.
procedure mergesort(mergelist)
   if len(mergelist) > 1 then
      mid = len(mergelist) div 2         #performs integer division
      lefthalf = mergelist[:mid]         #left half of mergelist into lefthalf
      righthalf = mergelist[mid:]        #right half of mergelist into righthalf
      mergesort(lefthalf)
      mergesort(righthalf)
      i = 0
      j = 0
      k = 0
      while i < len(lefthalf) and j < len(righthalf)
         if lefthalf[i] < righthalf[j] then
            mergelist[k] = lefthalf[i]
            i = i + 1
         else
            mergelist[k] = righthalf[j]
            j = j + 1
         endif
         k = k + 1
      endwhile
      while i < len(lefthalf)            #check if left half has elements not merged
         mergelist[k] = lefthalf[i]      #if so, add to mergelist
         i = i + 1
         k = k + 1
      endwhile
      while j < len(righthalf)           #check if right half has elements not merged
         mergelist[k] = righthalf[j]     #if so, add to mergelist
         j = j + 1
         k = k + 1
      endwhile
   endif
endprocedure
#****** main program *******
mergelist = [5, 3, 2, 7, 9, 1, 3, 8]
mergesort(mergelist)
print(mergelist)
Q2: Draw a graphical representation of how a list [5, 3, 9, 4, 2, 6, 1] is first split into halves until
each sublist contains zero or one items, and then the sublists are merged to become the
sorted list.
Space complexity
The amount of resources such as memory that an algorithm requires, known as the space complexity,
is also a consideration when comparing the efficiency of algorithms. The bubble sort, for example,
requires n memory locations for a list of size n. The merge sort, on the other hand, requires additional
memory to hold the left half and right half of the list, so takes twice the amount of memory space.
Quick sort
The quick sort algorithm, like the merge sort, uses a divide and conquer approach to quickly reduce
the size of the problem, but without using the additional storage required by the merge sort.
The steps in the quick sort are as follows:
1. Select a value called the pivot value. There are different ways to choose the pivot value but we will
choose the first item in the list. The actual position where the pivot value belongs in the final sorted
list, called the split point, will be used to divide the list for subsequent calls. In the list shown below,
9 is the first pivot value.
9 5 4 15 3 8 11
2. Divide the list into two partitions, subject to two conditions:
• all elements less than the pivot value must be in the first partition
• all elements greater than the pivot value must be in the second partition
(The order of the elements in each partition is not significant in this explanation. It will become clearer
in the explanation of the detailed procedure.)
3 5 4 8 9 15 11
3. 3 and 15 are now the pivots in the left and right partitions. Recursively repeat the process.
3 5 4 8 9 11 15
3 4 5 8 9 11 15
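The partition of the list around the pivot value 9 uses two markers. Leftmark starts at the item after
the pivot and moves right until it meets a value greater than the pivot, here 15. Rightmark starts at
the last item and moves left until it meets a value less than the pivot.
9 5 4 15 3 8 11 (leftmark at 15, rightmark at 11)
11 > 9 so move rightmark to left. 8 < 9 so stop.
9 5 4 15 3 8 11 (leftmark at 15, rightmark at 8)
Exchange 15 and 8, and continue moving leftmark and rightmark.
9 5 4 8 3 15 11 (leftmark at 8, rightmark at 15)
8 < 9 so move leftmark to right. 3 < 9 so move to right. 15 > 9 so stop.
15 > 9 so move rightmark to left. 3 < 9 so stop.
Rightmark and leftmark have now crossed over, so we stop. The position of rightmark is now the split
point. The pivot value is exchanged with the contents of the split point and the pivot value is now in
place.
3 5 4 8 9 15 11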
All the items to the left of the split point are less than the pivot value, and all the items to the right of the
split point are greater than the pivot value. The list can now be divided at the split point and the quick
sort invoked recursively on the two halves.
[3 5 4 8] 9 [15 11]
quicksort left half: 3 5 4 8. quicksort right half: 15 11.
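The steps described above can be sketched in Python as follows. This is a minimal illustration rather than the book's own listing: the function names partition and quicksort, and the parameters first and last, are assumptions chosen for this example. The pivot is the first item, as in the explanation.

```python
def partition(alist, first, last):
    """Move the pivot (the first item) to its split point and return that index."""
    pivot = alist[first]
    leftmark = first + 1
    rightmark = last
    while True:
        # Move leftmark right until it meets a value greater than the pivot
        while leftmark <= rightmark and alist[leftmark] <= pivot:
            leftmark += 1
        # Move rightmark left until it meets a value less than the pivot
        while rightmark >= leftmark and alist[rightmark] >= pivot:
            rightmark -= 1
        if rightmark < leftmark:
            break  # the two markers have crossed over
        # Both markers have stopped on out-of-place values, so exchange them
        alist[leftmark], alist[rightmark] = alist[rightmark], alist[leftmark]
    # rightmark is the split point: exchange it with the pivot value
    alist[first], alist[rightmark] = alist[rightmark], alist[first]
    return rightmark


def quicksort(alist, first=0, last=None):
    """Sort alist in place between indexes first and last (inclusive)."""
    if last is None:
        last = len(alist) - 1
    if first < last:
        splitpoint = partition(alist, first, last)
        quicksort(alist, first, splitpoint - 1)   # left half
        quicksort(alist, splitpoint + 1, last)    # right half


numbers = [9, 5, 4, 15, 3, 8, 11]
quicksort(numbers)
print(numbers)   # [3, 4, 5, 8, 9, 11, 15]
```

The sort works on the original list, exchanging items in place, which is why it needs no second copy of the data.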
Exercises
1. (a) There are many methods of sorting a set of records into ascending order of key.
What factors would you consider in deciding which of these methods is the most suitable
for a particular application? [2]
(b) The merge sort algorithm has time complexity O(n log n). For a list of 1,024 items in
random sequence, is this algorithm more or less efficient than a sort algorithm of time
complexity O(n²)? Explain your answer. [3]
CHAPTER 63 – GRAPH TRAVERSAL ALGORITHMS
Graph traversals
There are two ways to traverse a graph so that every node is visited. Each of them uses a supporting
data structure to keep track of which nodes have been visited, and which node to visit next.
Depth-first traversal
In this traversal, we go as far down one route as we can before backtracking and taking the next route.
The following recursive subroutine dfs is called initially from the main program, which passes it a graph,
defined here as an adjacency list (see Chapter 38) and implemented as a dictionary with nodes A, B,
C, … as keys, and neighbours of each node as data. Thus if "A" is the current vertex, graph["A"] will
return the list ["B","D","E"] with reference to the algorithm below and the graph overleaf.
The calling program also passes an empty list of visited nodes and a starting vertex.
Check the graph in Step 1 on the next page to verify that it corresponds to the nodes and their
neighbours. There are different ways of drawing the graph but logically they should all be equivalent!
GRAPH = { "A":["B","D","E"], "B":["A","C","D"], "C":["B","G"],
"D":["A","B","E","F"], "E":["A","D"] , "F":["D"], "G":["C"]}
visitedList = [] #an empty list of visited nodes
#main program
traversal = dfs(GRAPH, "A", visitedList)
print("Nodes visited in this order: ", traversal)
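A recursive dfs subroutine consistent with the call above might look like the sketch below. The body of dfs is an assumption written for illustration; only the call signature dfs(GRAPH, "A", visitedList) and the GRAPH dictionary come from the listing above.

```python
GRAPH = {"A": ["B", "D", "E"], "B": ["A", "C", "D"], "C": ["B", "G"],
         "D": ["A", "B", "E", "F"], "E": ["A", "D"], "F": ["D"], "G": ["C"]}


def dfs(graph, currentVertex, visited):
    """Visit currentVertex, then recursively visit each unvisited neighbour."""
    visited.append(currentVertex)
    for vertex in graph[currentVertex]:
        if vertex not in visited:
            # Go as far down this route as possible before trying the next one
            dfs(graph, vertex, visited)
    return visited


visitedList = []   # an empty list of visited nodes
traversal = dfs(GRAPH, "A", visitedList)
print("Nodes visited in this order: ", traversal)
```

Here the call stack plays the part of the supporting stack: each recursive call remembers the vertex to return to when backtracking.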
It is easiest to understand how this works by looking at the graphs below. This shows the state of the
stack (here it just shows the current node when a recursive call is made), and the contents of the visited
list. Each visited node is coloured dark blue.
[Figure: twelve-step depth-first traversal of the graph starting at A. Each frame shows the graph, the
stack and the visited list, which grows in the order A, AB, ABC, ABCG, ABCGD, ABCGDE, ABCGDEF.]
Steps 5, 6 and 9 to 12 are captioned as follows.
5. Push C onto the stack and from C, visit the next unvisited node, G. Add it to the visited list.
Colour it to show it has been visited.
6. At G, there are no unvisited nodes so we backtrack. Pop the previous node C off the stack and
return to C.
9. Push D onto the stack and visit E. Add it to the visited list. Colour it to show it has been visited.
10. From E, A and D have already been visited so pop D off the stack and return to D.
11. Push D back onto the stack and visit F. Add it to the visited list. Colour it to show it has been
visited.
12. At F, there are no unvisited nodes so we pop D, then B, then A, whose neighbours have all been
visited. The stack is now empty which means every node has been visited and the algorithm has
completed.
Breadth-first traversal
With a breadth-first traversal, starting at A we first visit all the nodes adjacent to A before moving to B and
repeating the process for each node at this ‘level’, before moving to the next level. Instead of a stack, a queue
is used to keep track of nodes that we still have to visit. Nodes are coloured pale blue when queued and dark
blue when dequeued and added to the list of nodes that have been visited.
[Figure: twelve-step breadth-first traversal of the graph starting at A. Each frame shows the graph, the
queue and the visited list.]
1. Append A to the empty queue at the start of the routine. This will be the first visited node.
(Queue: A)
2. Dequeue A and mark it by colouring it dark blue. Add it to the visited list. (Visited: A)
3. Queue each of A’s adjacent nodes B, D and E in turn. Colour each node pale blue to show it has been
queued. (Queue: BDE. Visited: A)
4. We’ve now finished with A, so dequeue the first item in the queue, which is B. Mark it by colouring it
dark blue and add it to the visited list. (Queue: DE. Visited: AB)
5. Queue B’s remaining neighbour C. Colour it pale blue to show it has been queued.
(Queue: DEC. Visited: AB)
6. B’s neighbours are all coloured, so dequeue the first item in the queue, which is D. Mark it by
colouring it dark blue and add it to the visited list. (Queue: EC. Visited: ABD)
7. D’s adjacent node E has already been queued and coloured. Add D’s adjacent node F to the queue.
Colour it pale blue to show it has been queued. (Queue: ECF. Visited: ABD)
8. Dequeue the first item, E. Mark it by colouring it dark blue and add it to the visited list.
(Queue: CF. Visited: ABDE)
9. (Queue: F. Visited: ABDEC)
10. (Queue: FG. Visited: ABDEC)
11. (Queue: G. Visited: ABDECF)
12. (Queue: empty. Visited: ABDECFG)
Note that we need to distinguish between a dequeued vertex that is added to the visited list and whose
neighbours we are examining, which we colour dark blue, and neighbours of the current vertex, which we
put in the queue and colour pale blue to show they have been queued but not visited.
GRAPH = {
"A": {"colour": "White", "neighbours": ["B", "D", "E"]},
"B": {"colour": "White", "neighbours": ["A", "D", "C"]},
"C": {"colour": "White", "neighbours": ["B", "G"]},
"D": {"colour": "White", "neighbours": ["A", "B", "E", "F"]},
"E": {"colour": "White", "neighbours": ["A", "D"]},
"F": {"colour": "White", "neighbours": ["D"]},
"G": {"colour": "White", "neighbours": ["C"]}
}
#main
visited = bfs(GRAPH, "A")
print ("List of nodes visited: ", visited)
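A bfs subroutine that matches the call above could be sketched as follows. The body of bfs is an assumption written for illustration, using collections.deque as the queue; the colour names "Pale blue" and "Dark blue" are also assumptions, chosen to mirror the colouring described in the text. Only the GRAPH dictionary and the call bfs(GRAPH, "A") come from the listing.

```python
from collections import deque

GRAPH = {
    "A": {"colour": "White", "neighbours": ["B", "D", "E"]},
    "B": {"colour": "White", "neighbours": ["A", "D", "C"]},
    "C": {"colour": "White", "neighbours": ["B", "G"]},
    "D": {"colour": "White", "neighbours": ["A", "B", "E", "F"]},
    "E": {"colour": "White", "neighbours": ["A", "D"]},
    "F": {"colour": "White", "neighbours": ["D"]},
    "G": {"colour": "White", "neighbours": ["C"]},
}


def bfs(graph, startVertex):
    """Return the list of vertices in the order a breadth-first traversal visits them."""
    for vertex in graph:
        graph[vertex]["colour"] = "White"       # reset so bfs can be called again
    visited = []
    queue = deque([startVertex])
    graph[startVertex]["colour"] = "Pale blue"  # queued but not yet visited
    while queue:
        current = queue.popleft()               # dequeue the first item
        graph[current]["colour"] = "Dark blue"  # now visited
        visited.append(current)
        for neighbour in graph[current]["neighbours"]:
            if graph[neighbour]["colour"] == "White":
                graph[neighbour]["colour"] = "Pale blue"
                queue.append(neighbour)
    return visited


visited = bfs(GRAPH, "A")
print("List of nodes visited: ", visited)
```

Colouring a vertex when it is queued, rather than when it is dequeued, is what stops the same vertex being added to the queue twice.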
• In scheduling jobs where a series of tasks is to be performed, and certain tasks must be completed
before the next one begins.
• In solving problems such as mazes, which can be represented as a graph
Finding a way through a maze
A depth-first search can be used to find a way out of a maze. Junctions where there is a choice of route
in the maze are represented as nodes on a graph.
[Figure: a maze with junctions labelled A, B, C, D and E and a point X, shown beside the
corresponding graph.]
Q1: (a) Redraw the graph without showing the dead ends.
(b) State the properties of this graph that make it a tree.
(c) Complete the table below to show how the graph would be represented using an
adjacency matrix.
A B C D E X
A
B
C
D
E
X
Q2: Draw a graph representing the following maze. Show the dead ends on your graph.
[Figure: a maze with points labelled A, B, C, D and X.]
• A major application of a breadth-first search is to find the shortest path between two points A and
B, and this will be explained in detail in the next chapter. Finding the shortest path is important in, for
example, GPS navigation systems and computer networks.
• Facebook. Each user profile is regarded as a node or vertex in the graph, and two nodes are connected
if they are each other’s friends. This example is considered in more depth in Chapter 72, Big Data.
• Web crawlers. A web crawler can analyse all the sites you can reach by following links randomly on a
particular website.
[Figure: a tree of animal names whose nodes include Giraffe, Topi, Baboon, Cheetah, Jackal and Rhino.]
Remember that a depth-first traversal of a graph (and therefore, of a tree) goes as far down one path as
possible, before backing up to the nearest node with an unexplored branch and following that path as
far as it goes. A depth-first traversal of this tree visits nodes in the order
Monkey, Giraffe, Buffalo, Baboon, Cheetah, Hippo, Jackal, Topi, Ostrich, Rhino, Zebra.
Q3: Write down the order of nodes visited in a pre-order traversal of the tree.
You should have discovered that the nodes are visited in the same order – in other words, a depth-first
tree traversal is equivalent to a pre-order traversal.
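The equivalence can be checked on a small example. The sketch below uses a hypothetical seven-node tree (TREE, with single-letter node names invented for this illustration, not the tree in the figure) and compares a pre-order traversal with a depth-first traversal of the same tree treated as an undirected graph. The helper as_undirected is also an assumption.

```python
# Hypothetical tree: each node maps to its children, listed left to right
TREE = {
    "M": ["G", "T"],
    "G": ["B", "J"],
    "T": ["R", "Z"],
    "B": [], "J": [], "R": [], "Z": [],
}


def preorder(tree, node, order):
    """Visit the root of the subtree first, then each child subtree in turn."""
    order.append(node)
    for child in tree[node]:
        preorder(tree, child, order)
    return order


def as_undirected(tree):
    """Build an adjacency list in which every edge also leads back to the parent."""
    graph = {node: list(children) for node, children in tree.items()}
    for node, children in tree.items():
        for child in children:
            graph[child].append(node)
    return graph


def dfs(graph, vertex, visited):
    """Recursive depth-first graph traversal."""
    visited.append(vertex)
    for neighbour in graph[vertex]:
        if neighbour not in visited:
            dfs(graph, neighbour, visited)
    return visited


print(preorder(TREE, "M", []))            # ['M', 'G', 'B', 'J', 'T', 'R', 'Z']
print(dfs(as_undirected(TREE), "M", []))  # ['M', 'G', 'B', 'J', 'T', 'R', 'Z']
```

The two orders agree provided the depth-first traversal starts at the root and examines each node's children from left to right.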
Although it would be quite possible to do a depth-first tree traversal using the algorithm given above
using the stack as a “helper” data structure, a much simpler algorithm is given in Chapter 39.
Q4: Write down the order of nodes visited in a post-order traversal of the tree.
They are not the same! The breadth-first traversal is best done using the algorithm for the breadth-first
graph traversal, using a queue as the “helper” data structure.
Exercises
1. (a) Name the supporting data structure which is commonly used when traversing a graph
(b) Show the order in which vertices in the following graph are visited, starting at A, using
[Figure: graph whose vertices include A, B, C, D, E, F, G and H.]
(c) (i) Explain why the graph above is not a tree. Which edges would need to be removed
for it to be a tree? [2]
(ii) Show, by traversing the tree below using a pre-order traversal and writing the nodes
in the order that they are visited, that a pre-order tree traversal is equivalent to a
depth-first graph traversal. [2]
[Figure: tree with nodes numbered 1 to 9, with node 1 at the root.]
2. List the order in which nodes in the tree below will be visited using
[Figure: tree whose nodes include J, P, C, L, R, T, A, S and Z.]
Optimisation problems
We increasingly rely on computers to find the optimum solution to a range of different problems.
For example:
• scheduling aeroplanes and staff so that air crews always have the correct minimum rest time
between flights
• finding the best move in a chess problem
• timetabling classes in schools and colleges
• finding the shortest path between two points – for building circuit boards, route planning,
communications networks and many other applications
Finding the shortest path from A to B has numerous applications in everyday life and in computer-related
problems. For example, if you visit a site like Google Maps to get directions from your current location to
a particular destination, you probably want to know the shortest route. The software that finds it for you
will use representations of street maps or roads as graphs, with estimated driving times or distances as
edge weights.
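[Figure: weighted graph with vertices A, B, C, D and E, with one edge labelled "Edge weight".
Edge weights: A-B 7, A-D 3, B-C 3, B-D 2, B-E 6, C-D 4, C-E 1, D-E 7.]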
CHAPTER 64 – OPTIMISATION ALGORITHMS
The algorithm
The algorithm works as follows:
Assign a temporary distance value to every node, starting with zero for
the initial node and infinity for every other node
Add all the vertices to a priority queue, sorted by current distance
(This puts the initial node at the front, the rest in random order.)
while the queue is not empty
    remove the vertex u from the front of the queue
    for each unvisited neighbour w of the current vertex u
        newDistance = distanceAtU + distanceFromUtoW
        if newDistance < distanceAtW then
            distanceAtW = newDistance
            change position of w in priority queue to reflect new
            distance to w
        endif
    next w
endwhile
Example
In the figure below, A is the start node. A temporary distance value has been assigned to every node,
starting with zero for the start node and infinity for every other node.
The priority queue is shown beside the graph, and it is kept in order of vertices with the shortest
known distance from A. To start with, A is at the front, and the other nodes are in random order, in this
case alphabetical.
The vertices are coloured.
• White vertices have not been visited and their distances remain at infinity.
• Pale blue vertices have been partially explored. A tentative distance to them has been found but all
possible paths to them have not yet been explored, so this distance cannot be guaranteed to be the
shortest one and they remain in the queue.
• Dark blue vertices have been removed from the queue and their minimum distance from A has been
found. These vertices are described as having been visited.
Start at A, remove it from the front of the queue and shade it dark blue to show it has been visited.
[Figure: A is shaded dark blue with distance 0; B, C, D and E have distance ∞.
Priority queue: B = ∞, C = ∞, D = ∞, E = ∞.]
Node A has two neighbours B and D. Shade each of these pale blue to show they have been partially
explored, and calculate new distance values for nodes B and D by taking the distance value at A (i.e.
zero) and adding it to the edge weight between A and B, A and D.
Since all these values are less than infinity, update the distances at B and D. Distance at D is less than
distance at B, so move D to the front of the priority queue.
[Figure: distances A = 0, B = 7, D = 3; C and E remain at ∞.
Priority queue: D = 3, B = 7, C = ∞, E = ∞.]
Remove D from the front of the queue. Shade it dark blue to show it has been visited. Shade D’s
neighbours C and E pale blue to show they have been partially explored.
Now calculate new values for the unvisited neighbours of D, namely B, C and E. The edge weight between
D and B is 2, and this is added to the distance value at D, which is 3. 3 + 2 = 5 is less than the current
value of 7, so the distance value at B is changed to the new lowest value, 5.
The current tentative distance ∞ at C is replaced with 3 + 4 = 7, at E is replaced with 3 + 7 = 10.
The order of nodes in the priority queue does not need to be changed since B, the node with the
smallest current distance from A, is already at the front.
[Figure: distances A = 0, B = 5, C = 7, D = 3, E = 10.
Priority queue: B = 5, C = 7, E = 10.]
Remove B from the priority queue. Shade B dark blue to show it has been visited.
At B, the values at C and E are calculated as 5 + 3 = 8 and 5 + 6 = 11 respectively, but these are both
greater than the tentative values already there, so these values are not changed.
[Figure: distances A = 0, B = 5, C = 7, D = 3, E = 10.
Priority queue: C = 7, E = 10.]
Remove C from the queue and shade it dark blue to show it has been visited. The distance to E via C will
be calculated as 7 + 1 = 8. This is less than current tentative distance to E (10) so will replace it.
[Figure: distances A = 0, B = 5, C = 7, D = 3, E = 8.
Priority queue: E = 8.]
Remove E from the queue. It has no unvisited neighbours, so there are no new distances to calculate.
Shade E dark blue.
[Figure: final graph with distances A = 0, B = 5, C = 7, D = 3, E = 8. The priority queue is empty.]
The queue is now empty and all the nodes have been visited, so the algorithm ends.
We have found the shortest distance from A to every other node, and the shortest distance from A is
marked in blue at each node.
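The worked example can be reproduced with a short Python sketch. Two assumptions are worth stating. First, the dictionary GRAPH and the function name dijkstra are invented for this illustration, with edge weights taken from the example above. Second, Python's heapq module has no operation to change the position of an item in the queue, so instead of repositioning w this sketch pushes a new entry and skips any out-of-date entries when they reach the front.

```python
import heapq

# Weighted graph from the worked example: GRAPH[u][w] is the edge weight u-w
GRAPH = {
    "A": {"B": 7, "D": 3},
    "B": {"A": 7, "C": 3, "D": 2, "E": 6},
    "C": {"B": 3, "D": 4, "E": 1},
    "D": {"A": 3, "B": 2, "C": 4, "E": 7},
    "E": {"B": 6, "C": 1, "D": 7},
}


def dijkstra(graph, start):
    """Return a dictionary of the shortest distance from start to every vertex."""
    distance = {vertex: float("inf") for vertex in graph}
    distance[start] = 0
    queue = [(0, start)]        # priority queue of (distance, vertex) pairs
    visited = set()
    while queue:
        distanceAtU, u = heapq.heappop(queue)
        if u in visited:
            continue            # out-of-date entry: u was already finished
        visited.add(u)
        for w, weight in graph[u].items():
            if w in visited:
                continue
            newDistance = distanceAtU + weight
            if newDistance < distance[w]:
                distance[w] = newDistance
                heapq.heappush(queue, (newDistance, w))
    return distance


print(dijkstra(GRAPH, "A"))   # {'A': 0, 'B': 5, 'C': 7, 'D': 3, 'E': 8}
```

The distances printed match those found in the trace: B = 5 via D, C = 7 via D and E = 8 via D and C.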
Q1: Copy the graph below and use the method above to trace the shortest path from A to all other
nodes. Write the shortest distance at each node.
[Figure: weighted graph with vertices A, B, C and D.]
Q2: Use a similar method to trace the shortest path from A to all other nodes. Write the shortest
distance at each node. What is the shortest distance from A to G?
[Figure: weighted graph with vertices A, B, C, D, E, F and G.]
The A* algorithm
Dijkstra’s algorithm is a special case of a more general path-finding algorithm called the A* algorithm.
Dijkstra’s algorithm has one cost function, which is the real cost value (e.g. distance) from the source
node to every other node.
The A* algorithm has two cost functions:
1. g(x) – as with Dijkstra’s algorithm, this is the real cost from the source to a given node.
2. h(x) – this is the approximate cost from node x to the goal node. It is a heuristic function, meaning
that it gives a good or adequate estimate, but not necessarily the exact value. The algorithm stipulates
that the heuristic function should never overestimate the cost, so the real cost should be
greater than or equal to h(x).
The total cost of each node is calculated as f(x) = g(x) + h(x).
The A* algorithm focusses only on reaching the goal node, unlike Dijkstra’s algorithm which finds the
lowest cost or shortest path to every node. It is used, for example, in video games to enable characters
to navigate the world.
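As an illustration of the two cost functions, the sketch below applies the A* algorithm to a small grid of the kind a game character might navigate. Everything here is an assumption made for the example: the grid layout (0 for open floor, 1 for wall), the names a_star and manhattan, and the choice of the Manhattan distance as h(x). With moves of cost 1 in four directions, the Manhattan distance never overestimates the real cost.

```python
import heapq


def manhattan(a, b):
    """h(x): estimated cost between two (row, column) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid, start, goal):
    """Return the cost of the cheapest route from start to goal, or None."""
    g = {start: 0}                                   # g(x): real cost so far
    queue = [(manhattan(start, goal), start)]        # ordered by f(x) = g(x) + h(x)
    finished = set()
    while queue:
        f, current = heapq.heappop(queue)
        if current == goal:
            return g[current]                        # stop as soon as the goal is reached
        if current in finished:
            continue                                 # out-of-date queue entry
        finished.add(current)
        row, col = current
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                newCost = g[current] + 1
                if newCost < g.get(nxt, float("inf")):
                    g[nxt] = newCost
                    heapq.heappush(queue, (newCost + manhattan(nxt, goal), nxt))
    return None                                      # the goal cannot be reached


GRID = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
print(a_star(GRID, (0, 0), (2, 0)))   # 6
```

If h(x) returned zero for every cell, the queue would be ordered by g(x) alone and the search would behave like Dijkstra's algorithm.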
Exercises
1. (a) What is the purpose of Dijkstra’s shortest path algorithm? [2]
(c) The weighted graph (Figure 1) shows the distances between each of the graph’s vertices.
Copy Figure 1 and show the tentative distance from the starting node A allocated to each node
after nodes B and D have been visited (dequeued and finished with) using Dijkstra’s algorithm.
[Figure 1: weighted graph with vertices A, B, C, D, E, F and G.]
(i) Describe in a similar way, the shortest path from A to C. What is its length? [2]
(ii) What is the shortest path from A to G? What is its length? [3]
2. The following graph shows distances between five cities. Dijkstra’s shortest path algorithm is used
to find the shortest distance between Liverpool and each of the other cities. The algorithm is given
below.
Assign a temporary distance value to every node, starting with zero
for the initial node and infinity for every other node
Add all the vertices to a priority queue, sorted by current
distance. (This puts the initial node at the front, the rest, which
all start with temporary distances of infinity, in random order.)
while the queue is not empty
    remove the vertex u from the front of the queue
    for each unvisited neighbour w of the current vertex u
        newDistance = distanceAtU + distanceFromUtoW
        if newDistance < distanceAtW then
            distanceAtW = newDistance
            change position of w in priority queue to reflect new
            distance to w
        endif
    next w
endwhile
[Figure: weighted graph of distances between York, Leeds, Manchester, Liverpool and Sheffield.]
The following table represents the distances after the first statement in the algorithm is executed.
(a) Complete the following table after one iteration of the WHILE loop in the above algorithm. [3]
(b) Complete the table after the second iteration of the WHILE loop. [2]
OCR AS and A Level
Computer
Science
The aim of this book is to provide detailed coverage of the topics in the new OCR AS and A Level
Computer Science specifications H046 and H446. It is presented in an accessible and interesting way,
with many in-text questions to test students’ understanding of the material and their ability to apply it.
The book is divided into twelve sections and within each section, each chapter covers material that can
comfortably be taught in one or two lessons. Material that is applicable only to the second year of the
full A Level is clearly marked. Sometimes this may include an entire chapter and at other times, just a
small part of a chapter.
About the authors
Pat Heathcote is a well-known and successful author of Computer Science textbooks. She has spent
many years as a teacher of A Level Computing courses with significant examining experience. She has
also worked as a programmer and systems analyst, and was Managing Director of Payne-Gallway
Publishers until 2005.
Rob Heathcote has many years of experience teaching Computer Science and is the author of several
popular textbooks on Computing. He is now Managing Director of PG Online, and writes and edits a
substantial number of the online teaching materials published by the company.
Cover picture:
‘Away Day’
Mixed media on canvas, 61x61cm
© Hilary Turnbull
www.hilaryturnbull.co.uk
This book has been endorsed by OCR.