Computer Science

System Architecture
A Computer is an Electronic Device that:

 Takes an input.
 Processes data
 Gives an Output

The purpose of the CPU is to fetch, decode and execute instructions.

It fetches instructions from the main memory (RAM) and brings them back to the CPU.

It then decodes the Instruction.

It then carries out the instruction. This can result in going back to the main memory
to get more data, performing a calculation, or storing data back in the main memory.

The CPU consists of the following components:

 Arithmetic Logic Unit: performs calculations and logical decisions.
 Control Unit: sends control signals that direct how data moves around the CPU.
 Cache: provides fast access to frequently used data and instructions.
 Registers: tiny, super-fast pieces of memory inside the CPU, each with a
specific purpose.

The Von Neumann architecture consists of a Control Unit, an Arithmetic and Logic Unit
(ALU), a Memory Unit and inputs/outputs.

It is based on the stored-program concept. Both instructions and data are stored in
the same memory in binary form. There is no way to know whether the binary held in
memory represents instructions or data simply by looking at it.

It also contains the following Registers:

 Program Counter: holds the address of the next instruction in memory.
 Memory Address Register: holds the address of where data is to be fetched or
stored.
 Memory Data Register: holds the data fetched from, or to be written to,
memory.
 Accumulator: holds the result of calculations
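
The fetch-decode-execute cycle and the registers above can be sketched as a toy simulator. This is only an illustration: the instruction names (LOAD, ADD, STORE, HALT), the memory layout and the function name `run` are invented here and do not belong to any real CPU.

```python
def run(memory):
    """Run a toy stored-program machine: instructions and data share one memory."""
    pc = 0    # Program Counter: address of the next instruction
    acc = 0   # Accumulator: holds the result of calculations
    while True:
        mar = pc               # Memory Address Register: address to fetch from
        mdr = memory[mar]      # Memory Data Register: the value that was fetched
        pc += 1                # point at the next instruction
        opcode, operand = mdr  # decode the instruction
        if opcode == "LOAD":   # execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


# Addresses 0-3 hold instructions, addresses 4-6 hold data.
program = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", 0),
    7,
    5,
    0,
]
result = run(program)
print(result[6])  # 12
```

Nothing in the list itself marks which entries are instructions and which are data; only the program counter decides what gets treated as an instruction.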

The performance of a CPU can be affected by several things. The 3 most important
are:

 Clock speed: measured in Hertz (Hz), the number of fetch-decode-execute cycles
per second.
 Cache size: cache is temporary storage for data and instructions being read from
and written to RAM. It stores copies of recently used data and instructions.
Getting items from cache is much quicker than getting them from main memory
(RAM), so a larger cache avoids costly trips to main memory.
 Number of cores: a core is a complete processing unit. A quad-core CPU
therefore has 4 separate processing units, each with its own registers, ALU etc.
Increasing the number of cores does not simply multiply the overall speed, as
cores have to interact with each other, which takes time, and many
programs are not designed to use multiple cores.

An embedded system is a computer system with a dedicated function within a larger
mechanical system.

Typical properties include: low power consumption, small size, rugged operating
ranges, low cost per unit.

Examples include: traffic lights, domestic appliances, factory equipment, engine
management systems and hospital equipment.

Memory and Storage


Primary storage consists of Random access memory (RAM), Read only memory
(ROM), Registers and Cache. It holds the data and instructions which the CPU
needs access to while a computer is running. The CPU can much more quickly
access data from primary storage than from secondary storage.

RAM holds the operating system, programs and data in use by the CPU when the
computer is running. It is Volatile, read and write, and large in comparison to ROM.

ROM holds the first instructions for when the computer is first turned on known as
the bootstrap. Programs may be stored in ROM in embedded systems. It is Non-
volatile (contents remain when power is turned off), read only, and small in
comparison to RAM.

Virtual memory is needed when there is not enough physical RAM to store the open
programs. Virtual memory is held on the hard disk.
Programs are transferred out to Virtual Memory from the RAM when they are not
currently being executed.

Programs are transferred back to RAM from Virtual Memory when they are needed.

Secondary storage is needed because ROM is read only and RAM is volatile (data
is lost when power is lost). It is needed for:

 Storage of programs and data when the power is turned off.
 Semi-permanent storage of data that can change.
 Backup of data files.
 Archive of data files.

There are 3 types of secondary storage:

 Optical (CD-R/RW, DVD-R/RW, Blu-ray)
    Low capacity compared to other types of storage.
    Slow to access data.
    Thin, lightweight and portable.
 Magnetic (hard disk drive, tape)
    High storage capacity.
    Quick to access data.
    Has moving parts, which eventually fail.
    Hard disks perform better if they are defragmented.
 Solid state (SSD, memory stick, flash memory cards)
    Medium storage capacity.
    Very quick to access data.
    No moving parts, very reliable.
    No noise.
    Low power.
    No need to defragment.
    Limited number of read/write cycles.
    Expensive compared to other types of storage.

The units of Data storage are : Bit, Nibble, Byte, Kilobyte, Megabyte, Gigabyte,
Terabyte, Petabyte.

A bit is a binary value of 1 or 0. A nibble is 4 bits. A byte is 8 bits. A Kilobyte is 1024
bytes. A Megabyte is 1024 Kilobytes. A Gigabyte is 1024 Megabytes. A Terabyte is
1024 Gigabytes. A Petabyte is 1024 Terabytes.

Computers process data in binary, which has 2 states: 1/0 or on/off. With only two
states, electronic components are easier to manufacture, cheaper and more reliable.

Text file size = bits per character x number of characters

Image file size = colour depth x image height (pixels) x image width (pixels)

Sound file size = sample rate x duration in seconds x bit depth
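
The three file size formulas above can be written as small functions. This is a sketch: the function names and the sample figures are chosen for illustration, and the results are in bits, so divide by 8 to get bytes.

```python
def text_file_bits(bits_per_character, number_of_characters):
    """Text file size in bits."""
    return bits_per_character * number_of_characters


def image_file_bits(colour_depth, height_px, width_px):
    """Bitmap image size in bits (metadata not included)."""
    return colour_depth * height_px * width_px


def sound_file_bits(sample_rate_hz, duration_s, bit_depth):
    """Sound file size in bits."""
    return sample_rate_hz * duration_s * bit_depth


print(text_file_bits(8, 100))                # 800 bits
print(image_file_bits(24, 100, 200) // 8)    # 60000 bytes
print(sound_file_bits(44100, 10, 16) // 8)   # 882000 bytes
```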

To convert binary into denary you can use a binary number line. Below is
an example of an 8-bit number line for the binary number 10010101.

Denary  128  64  32  16   8   4   2   1
Binary    1   0   0   1   0   1   0   1

Find the denary place value above each 1. In this binary sequence
they are 128, 16, 4 and 1. Then add them together: 128 + 16 + 4 + 1 = 149.
Therefore, 10010101 = 149.
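
The number-line method can be sketched in code as follows. The function name `binary_to_denary` is made up for this example; Python's built-in `int(bits, 2)` gives the same answer.

```python
def binary_to_denary(bits):
    """Add up the place value above every 1 in a binary string."""
    # For 8 bits the place values are 128, 64, 32, 16, 8, 4, 2, 1.
    place_values = [2 ** i for i in range(len(bits) - 1, -1, -1)]
    return sum(value for value, bit in zip(place_values, bits) if bit == "1")


print(binary_to_denary("10010101"))  # 149
```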

When adding two binary integers there are 5 rules to know.

 0 + 0 = 0
 1 + 0 = 1 and 0 + 1 = 1
 1 + 1 = 0 (carry the 1)
 1 + 1 + (carry 1) = 1 (carry the 1)
 An overflow error occurs when a 1 is carried over into a 9th bit even though
the integer is only 8 bits long.
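
The addition rules can be sketched as a column-by-column adder working from right to left. The function name `add_8bit` is invented for this example, and both inputs are assumed to be 8-character binary strings.

```python
def add_8bit(a, b):
    """Add two 8-bit binary strings; return (8-bit result, overflow flag)."""
    carry = 0
    result = ""
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result = str(total % 2) + result  # the digit written in this column
        carry = total // 2                # the 1 carried into the next column
    overflow = carry == 1                 # a carry into a 9th bit
    return result, overflow


print(add_8bit("10010101", "00000011"))  # ('10011000', False)
print(add_8bit("11111111", "00000001"))  # ('00000000', True)
```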

Hexadecimal is another way of representing data. Each hexadecimal digit represents a
4-bit value (a nibble), so an 8-bit byte is written as 2 hexadecimal digits, which may
be numbers, letters or both.

The symbols for hexadecimal are:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A (10), B (11), C (12), D (13), E (14), F (15).

If you have the 8-bit binary integer: 11010101. You would split that into: 1101, and
0101

1101 = 13 = D. 0101 = 5 = 5. Therefore 11010101 = D5.

Converting hexadecimal into denary is easily done by separating the hexadecimal
value into its 2 digits. For example, D5 can be split into D and 5.

D = 13. 13 x 16 = 208

5 = 5. 5x1=5

208 + 5 = 213. Therefore, D5 = 213 in Denary.
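
Both conversions above can be sketched in a few lines. The function names are made up for this example, and the inputs are assumed to be one 8-bit binary string and one 2-digit hexadecimal string.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Split 8 bits into two nibbles and look up the hex digit for each."""
    high, low = bits[:4], bits[4:]
    return HEX_DIGITS[int(high, 2)] + HEX_DIGITS[int(low, 2)]


def hex_to_denary(hex_pair):
    """Left digit is worth 16s, right digit is worth 1s."""
    high, low = hex_pair
    return HEX_DIGITS.index(high) * 16 + HEX_DIGITS.index(low) * 1


print(binary_to_hex("11010101"))  # D5
print(hex_to_denary("D5"))        # 213
```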

Binary shifts are used to double or halve binary numbers. Each shift to the left
doubles the binary value, whereas each shift to the right halves it. A 1 or 0 can fall
off the end of an 8-bit binary integer if it is shifted past the 8-bit boundary. A 0 is
automatically added at the opposite end of the new binary integer (on the right for a
left shift, on the left for a right shift).
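
A minimal sketch of shifting on fixed-width binary strings is shown below. The function names are invented for this example; on integers, Python's own `<<` and `>>` operators perform the same shifts.

```python
def shift_left(bits, places=1):
    """Add 0s on the right; bits pushed past the left edge fall off."""
    return (bits + "0" * places)[-len(bits):]


def shift_right(bits, places=1):
    """Add 0s on the left; bits pushed past the right edge fall off."""
    return ("0" * places + bits)[:len(bits)]


print(shift_left("00010100"))   # 00101000  (20 doubled to 40)
print(shift_right("00010100"))  # 00001010  (20 halved to 10)
print(shift_left("10010101"))   # 00101010  (the leading 1 is lost)
```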

A character set is a defined list of characters recognised by the computer. Each
character is represented by a unique binary number. The character sets you need to
know are:

 ASCII – 7-bit character set with 128 different characters.
 Extended ASCII – 8-bit character set with 256 different characters.
 Unicode – a much larger character set with room for over a million characters,
covering most of the world's languages. Each character is stored in one or more
bytes, depending on the encoding used (e.g. UTF-8 or UTF-16).
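
Character codes can be inspected with Python's built-in `ord` and `chr`. The helper `to_ascii_bits` is a made-up name for this sketch, and UTF-8 is used only as one example of a Unicode encoding.

```python
def to_ascii_bits(character):
    """Return the 7-bit binary code of an ASCII character."""
    code = ord(character)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return format(code, "07b")


print(ord("A"))                   # 65
print(to_ascii_bits("A"))         # 1000001
print(chr(ord("A") + 1))          # B
print(len("A".encode("utf-8")))   # 1 byte
print(len("€".encode("utf-8")))   # 3 bytes
```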

Images can be stored in binary as bitmaps or vectors. Bitmap pictures are
constructed from pixels, whereas vector pictures store the mathematics needed to
draw coloured shapes. Each pixel of a bitmap is stored in binary.

The number of bits required for each pixel depends on the number of colours
required. E.g. 1 bit has 2 possible values therefore can store up to 2 colours. 2 bits
have 4 different values therefore can store up to 4 colours. Number of colours can
be calculated by 2 ^ n where n is the number of bits for each pixel. The number of
bits for each pixel is also known as the colour depth.
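
The 2 ^ n relationship can be checked in both directions with a short sketch. The function names are chosen for this example.

```python
def colours_available(colour_depth):
    """Number of different colours that n bits per pixel can represent."""
    return 2 ** colour_depth


def bits_needed(number_of_colours):
    """Smallest colour depth that can represent the given number of colours."""
    return max(1, (number_of_colours - 1).bit_length())


print(colours_available(1))   # 2
print(colours_available(8))   # 256
print(bits_needed(4))         # 2
print(bits_needed(5))         # 3
```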

Metadata is additional data stored with the image to define the width, height, colour
depth and colour palette.

The greater the colour depth and resolution, the larger the file size of the image.

Sound file size is the total number of bits in a sound and is calculated as the number
of samples per second x number of bits per sample x length of sample in seconds

Bit Depth is the number of bits stored per sample. The higher the number of bits, the
greater the quality of sound and larger the file size.

Sample rate is the number of samples stored per second. The higher the number of
samples per second, the higher the quality of the sound and the larger the file size.

Compression is reducing the number of bits in a file. This makes the file take up less
storage space and allows it to be transferred more quickly. It is useful as more data
can be stored on a storage device and transferred in a smaller amount of time.

Lossy Compression:

 Some of the data is lost and cannot be recovered.
 Greatly reduces the file size.
 Reduces the quality of the image/sound.
 Suitable for images, sounds and video.
 Cannot be used on text and executable files.

Lossless compression:

 None of the data is lost; it is encoded differently.
 Can be turned back into the original format.
 Can be used on all types of data.
 Is usually less effective than lossy compression at reducing the file size.
 Most suitable for documents and executable files.
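
The lossless round trip can be demonstrated with Python's standard `zlib` module. The notes do not name a particular algorithm; zlib is used here only as a readily available lossless example, and the sample data is made up.

```python
import zlib

# Highly repetitive data compresses well.
original = b"AAAAAAAAAABBBBBBBBBB" * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original))           # 1000 bytes before compression
print(len(compressed) < len(original))  # True: fewer bytes to store or send
print(restored == original)    # True: nothing was lost
```

A lossy method could not pass the final check, because the discarded data cannot be recovered.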

Computer Networks, Connections and Protocols


The advantages of networking are that users can share files and peripherals, such as
printers, and connections to other networks, e.g. the Internet. Users can access their
files from any computer on the network. Servers can control security, software
updates and backup of data. Users can communicate with other people, e.g. by email
and social networking.

LAN (Local Area Network): covers a small geographical area located on a single site.
All the hardware for a LAN is owned by the organisation using it. LANs are connected
with UTP or fibre optic cables, or wirelessly using Wi-Fi.

WAN (Wide Area Network): covers a large geographical area, connecting LANs
together. Infrastructure between the LANs is leased from telecommunication
companies who own and manage it. WANs are connected with telephone lines, fibre
optic cables or satellite links.

Personal home devices for connecting to The Internet tend to be multi-function, all-
in-one devices acting as: Switch, Router, Wireless Access Point.

Disadvantages of networking: there are increased security risks to data; malware and
viruses spread very easily between computers; if a server fails, the computers
connected to it may not work; computers may run more slowly if there is a lot of data
travelling on the network.

Factors that affect the performance of networks:

 Bandwidth: The amount of data that can be sent and received successfully in a
given time. This is not a measure of how fast data travels but how much data can
be sent on the transmission media. Measured in bits per second, often called bit
rate.
 Number of Users: Too many users or devices on the same network can cause the
network to slow down if there is insufficient bandwidth for the data.
 Transmission Media: Wired connections have a higher bandwidth than wireless
connections. Fibre optic cables have a higher bandwidth than copper cables.
 The Error Rate: Less reliable connections increase the number of errors that
occur in data transmission. This means data has to be resent. The signal quality
of wireless connections is dependent on the distance of devices from the wireless
access point and other environmental factors. The signal quality of copper cables
is determined by the grade of material used, with higher grades reducing
interference. The length of the cable is also a factor.
 Latency: The delay from transmitting data to receiving it. Latency is caused by
bottlenecks in the infrastructure of the network, e.g. by not using switches to
appropriately segment traffic on a network. Hardware such as switches and
transmission media may not operate at the same speed.
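
As a rough illustration of how bandwidth and latency combine, the sketch below estimates an ideal transfer time. It assumes no errors, no resent data and no other users, and the function name and figures are invented for this example.

```python
def transfer_time_seconds(file_size_bytes, bandwidth_bits_per_second, latency_seconds=0.0):
    """Ideal time to deliver a file: delay before data arrives plus time to send the bits."""
    file_size_bits = file_size_bytes * 8
    return latency_seconds + file_size_bits / bandwidth_bits_per_second


# A 1,000,000 byte file over an 8 Mbit/s connection.
print(transfer_time_seconds(1_000_000, 8_000_000))  # 1.0
# Doubling the bandwidth halves the sending time but does not reduce the latency.
print(transfer_time_seconds(1_000_000, 16_000_000, latency_seconds=0.05))
```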

Client-server networks. A server controls access and security to one shared file
store. A server manages access to the Internet. A server manages printing jobs. A
server provides email services. A server runs a backup of data. A client makes
requests to the server for data and connections.

Advantages:

 Easier to manage the security of files.
 Easier to take backups of all shared data.
 Easier to install software updates to all computers.

Disadvantages:

 Can be expensive to set up and maintain.
 Requires IT specialists to maintain.
 The server is a single point of failure.
 Users will lose access if the server fails.

Peer-to-peer model. A peer is a computer on a network and is equal to all other
peers. Each peer serves its own files to the others. Each peer is responsible for its
own security. Each peer is responsible for its own backup. Peers usually have their
own printer. You can send print jobs to another peer to process, but that peer would
need to be switched on to be able to communicate with the connected printer.

Advantages:

 Very easy to maintain.
 Specialist staff are not needed.
 No dependency on a single computer.
 Cheaper to set up.
 No expensive hardware required.

Disadvantages:

 The network is less secure.
 Users will need to manage their own backup.
 Can be difficult to maintain a well-ordered file store.

Hardware needed for a LAN:

 A Network Interface Controller (NIC) connects a device to a wired or wireless
network.
 A NIC uses a protocol to ensure successful communication.
 Wireless access points allow wireless-enabled devices to access the network.
 Wireless connections are popular because they avoid the need to install cables.
 Bandwidth on a wireless connection is lower than on a wired connection.
 Security is more of a problem on a wireless network.
 A switch uses the MAC address of a device's NIC to direct traffic to that device.
 A router sends data between networks.
 A router is needed to connect a LAN to a WAN.
 A router uses the IP address of a device to route traffic to other routers.
 Connections between desktop computers and a switch are usually made with
UTP copper cables.
 Copper cable is cheap and flexible, which makes it easier to use.
 Longer-distance cables and wide area networks are usually connected with
fibre optic cables.
 Fibre optic has a higher bandwidth than copper and suffers from less
interference.

The Internet is a global collection of interconnected networks. Web addresses, which
are easier for humans to remember, are converted to IP addresses for routers by a
Domain Name System (DNS) resolver. In reality the Domain Name System is made up
of many Domain Name Servers working together.

Websites are stored on servers dedicated to this purpose. This is known as hosting.
Hosted solutions provide 24/7 access, support for multiple users and greater security.
Storing data and programs on remote servers so that they can be accessed and used
over the Internet is referred to as cloud storage. It provides access to anything from
anywhere, automatic backup and collaboration on documents.

Servers provide services (e.g. Web server -> Web Pages, File Server -> file
storage/retrieval). Clients request / use services from a server.

A network topology is a given arrangement of all the elements that you need for
networking.

Star network topology: The most popular type of wired network. It has a central
switch, and all devices are connected to it. The switch is intelligent and makes
sure traffic only goes where it is intended. If a single cable breaks, only that
computer is affected. The switch, however, is a single point of failure.

Full mesh network: Every device is connected to every other device. The advantage is
that if any connection breaks you can still send your traffic via another
route. The disadvantage is that a lot more cabling and switch hardware is required,
which adds to the cost for large networks.

Partial mesh network: Multiple routes exist between different devices, but not every
device is connected to every other device. A compromise solution
which lowers the amount of hardware needed compared to a full mesh network.

Ethernet: A standard for networking technologies, used for communicating on a
wired local area network. It includes a number of associated protocols (rules
governing communication). It provides reliable, error-free, fast communication
between 2 points. Originally used in old-style bus networks, it is still used today in
more modern star and mesh networks. Data is transmitted in frames which include:

 Preamble of bits used to synchronise transmission.
 Start frame delimiter to signify the start of the data part of the frame.
 Source and destination MAC address.
 The actual data.
 Error checking information (cyclic redundancy check, CRC).
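
The role of the CRC can be sketched with a simplified frame. This is not the real Ethernet frame layout: the preamble, start frame delimiter and other fields are left out, the function names and MAC addresses are made up, and `zlib.crc32` stands in for the frame check calculation.

```python
import zlib


def build_frame(dest_mac, src_mac, data):
    """Join the addresses and data, then append a 4-byte CRC of everything before it."""
    body = dest_mac + src_mac + data
    crc = zlib.crc32(body).to_bytes(4, "big")
    return body + crc


def frame_is_valid(frame):
    """Recalculate the CRC at the receiver and compare it with the one in the frame."""
    body, crc = frame[:-4], frame[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == crc


dest = bytes.fromhex("aabbccddeeff")   # hypothetical destination MAC
src = bytes.fromhex("112233445566")    # hypothetical source MAC
frame = build_frame(dest, src, b"hello")

# Flip one bit to imitate an error during transmission.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

print(frame_is_valid(frame))      # True
print(frame_is_valid(corrupted))  # False
```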

A user's location is limited by the need for a physical cable connection. A set-up
relying on Ethernet needs lots of cables, connections, ports and physical hardware,
which adds to the cost.

Wireless networks (Wi-Fi and Bluetooth)

Wi-Fi: A common standard for wireless networks. Advantages: users can move around
freely, it is less expensive and easier to set up than a wired network, it is convenient
to use, it can handle a large number of users and transferring information to social
media is much easier. Disadvantages: speeds are slower than wired networks, it relies
on signal strength to the wireless access point (WAP), the signal can be obstructed
and it is less secure than wired networks.

Bluetooth: Another standard for wireless networking. It is ideal for connecting
personal devices, it has a very short range and it has very low power consumption
compared to Wi-Fi.

Wireless Encryption. Wireless networks are identified by a unique Service Set
Identifier (SSID). The SSID has to be used by all devices which want to connect to
that network. It can be set to broadcast automatically to any wireless device within
range of a WAP.

The SSID is set automatically by default, but you are able to set it yourself. It can also
be hidden in order to make the network harder to detect. The network can be
protected with a password so that even if the SSID is found, devices won't be able to
gain access to the wireless network.

Encryption: Wireless networks broadcast data, so it must be encrypted to be secure.
This is done by scrambling the data into cipher text using a ‘master key’ created
from the SSID of the network and the password. Data is decrypted by the receiver
using the same master key, so this key is not transmitted. Protocols used for wireless
encryption include WEP, WPA and WPA2. A handshaking protocol is used to ensure
that the receiver has a valid master key before transmission to the device begins.
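
The notes do not say how the master key is created from the SSID and password. One real-world scheme, WPA2-Personal, derives a 256-bit key with PBKDF2-HMAC-SHA1 using the SSID as the salt and 4096 iterations; the sketch below assumes that scheme. The function name `derive_master_key` and the sample SSID and password are made up.

```python
import hashlib


def derive_master_key(ssid, password):
    """Derive a 32-byte key from the password, using the SSID as the salt."""
    return hashlib.pbkdf2_hmac(
        "sha1",
        password.encode("utf-8"),
        ssid.encode("utf-8"),
        4096,   # number of iterations
        32,     # key length in bytes (256 bits)
    )


router_key = derive_master_key("HomeNetwork", "correct horse battery")
laptop_key = derive_master_key("HomeNetwork", "correct horse battery")
guess_key = derive_master_key("HomeNetwork", "wrong password")

print(router_key == laptop_key)  # True: both sides compute the same key
print(router_key == guess_key)   # False: a wrong password gives a different key
```

Because both devices can calculate the key for themselves, the key never needs to be sent over the air.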

Although wired networks are naturally more secure, encryption is still available and
often a sensible precaution.

MAC addressing is used to route frames on a local area network. Each network
interface card has its own unique MAC address.

IP addressing is used to route packets on a wide area network. There are 2 types,
IPv4 and IPv6. IPv4 addresses are 32 bits in size, written as four numbers separated
by periods, each number in the range 0-255. IPv6 addresses are 128 bits in size,
written as eight groups separated by colons, each group made up of four hex digits
representing 16 bits.

A router has a unique WAN-facing IP address and a LAN-facing IP address. This
enables a LAN device to have the same IP address as a device on another
LAN.
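
The two address formats can be explored with Python's standard `ipaddress` module. The addresses used are examples only (192.168.x.x is a private LAN range and 2001:db8:: is reserved for documentation).

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.0.10")
v6 = ipaddress.ip_address("2001:db8::1")

print(v4.version, v4.max_prefixlen)  # 4 32   (a 32-bit address)
print(v6.version, v6.max_prefixlen)  # 6 128  (a 128-bit address)

# The full form shows the eight groups of four hex digits.
print(v6.exploded)    # 2001:0db8:0000:0000:0000:0000:0000:0001

# Private addresses can be reused on different LANs behind different routers.
print(v4.is_private)  # True
```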

Standards are a set of specifications for hardware and software. They make it possible
for manufacturers to create products and services which are compatible with each
other. Without standards most devices wouldn't be able to successfully interact or
communicate with each other.

A protocol is a set of rules that allows two devices to communicate.

 TCP: Provides error-free transmission between two devices.
 IP: Routes packets across a WAN. Along with TCP it makes up the TCP/IP protocol
stack.
 HTTP: A client-server method of requesting and delivering HTML web pages.
 HTTPS: Adds encryption and authentication to requesting and delivering HTML web
pages. Used when sensitive information needs to be transferred.
 FTP: Used for sending files between computers, usually on a WAN.
 POP: Retrieves emails from a mail server. It removes them from the server and
transfers them to your device.
 SMTP: Sends email to an email server.
 IMAP: Used by mail clients to manage remote mailboxes and retrieve email from
a mail server.

The concept of layering is to divide the complex task of networking into smaller,
simpler tasks that work with each other. Advantages of layering include: Reduces the
complexity of the problem into manageable sub-problems. Devices can be
manufactured to operate at a particular layer. Products from different vendors will
work together.

TCP/IP protocol and use of layers: This is one of the most important protocol
stacks in use today. It is a set of networking protocols consisting of four layers all
working together. All incoming and outgoing data packets pass up and down
through the layers when you communicate on a network.

 The Application layer uses an appropriate protocol relating to whatever
application is being used to transmit data (HTTP, HTTPS, FTP etc).
 The Transport layer uses the TCP part. It is responsible for establishing an
end-to-end connection. Once the connection is made it splits the data to be
transmitted into packets. It adds to each packet: its number, the total number of
packets and the port number the packet should use.
 The Network layer uses the IP part. It adds to each packet: the source IP address
and destination IP address. All routers operate at this layer. They use the IP
address to know the destination the packets are heading to.
 The Link layer represents the actual physical connection between network
nodes. It is responsible for adding the unique MAC address of the
source device and the MAC address of the destination device. The MAC
address is changed at each hop on the route.
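
The way each layer wraps the data from the layer above can be sketched with nested dictionaries. This is a teaching model only: the function names, field names and addresses are invented, and real headers are binary and contain many more fields.

```python
def send(data, src_ip, dest_ip, src_mac, next_hop_mac, port, packet_size=4):
    """Pass application data down through the transport, network and link layers."""
    # Transport layer: split the data and number each piece.
    chunks = [data[i:i + packet_size] for i in range(0, len(data), packet_size)]
    frames = []
    for number, chunk in enumerate(chunks, start=1):
        segment = {"number": number, "total": len(chunks), "port": port, "data": chunk}
        # Network layer: add source and destination IP addresses.
        packet = {"src_ip": src_ip, "dest_ip": dest_ip, "segment": segment}
        # Link layer: add MAC addresses for the next hop only.
        frame = {"src_mac": src_mac, "dest_mac": next_hop_mac, "packet": packet}
        frames.append(frame)
    return frames


def receive(frames):
    """Pass frames back up the layers and reassemble the data in order."""
    segments = [frame["packet"]["segment"] for frame in frames]
    segments.sort(key=lambda segment: segment["number"])
    return "".join(segment["data"] for segment in segments)


frames = send("GET /index.html", "192.168.0.10", "203.0.113.5",
              "11:22:33:44:55:66", "aa:bb:cc:dd:ee:ff", port=80)
print(len(frames))                     # 4
print(receive(list(reversed(frames))))  # GET /index.html
```

The packet numbers let the receiver rebuild the data even when packets arrive out of order, as in the reversed list above.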

Network Security
Malware is software which is specifically designed to disrupt, damage or gain
unauthorized access to a computer system. E.g. Viruses, Worms, Trojan horses,
ransomware etc.

Phishing is the fraudulent practice of sending emails pretending to be from a
reputable company in order to make individuals reveal their personal information.

Brute force attacks are a trial and error method of attempting passwords and PINs.
Automated software is used to generate a large number of consecutive
guesses.

A Denial of Service (DoS) attack is flooding a server with useless traffic, causing that
server to become overloaded and unavailable.

Data interception and theft is the unauthorised act of stealing computer-based
information from an unknowing victim with the intent of compromising privacy or
obtaining confidential information.

SQL injection is a technique used to view or change data in a database by inserting
additional code into a text input box, creating a different search string.

To prevent malware you could use strong security software (firewall, spam
filter, anti-virus, anti-spyware, anti-spam), enable OS and security software
updates, be cautious about opening email attachments and downloading software, and
back up files regularly onto removable media.

To prevent phishing you could use strong security software, learn to spot
fake emails and websites, avoid disclosing personal or corporate information, and
disable browser pop-ups.

To prevent brute force attacks, lock accounts after 3 failed password attempts,
use progressive delays, and use effective passwords with symbols, letters, numbers
and mixed case. Use challenge-response tests, e.g. "I am not a robot" and reCAPTCHA.
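
Locking an account after 3 failed attempts can be sketched as below. The class and attribute names are made up, and the password is kept as plain text only to keep the example short; a real system would store a salted hash instead.

```python
class Account:
    MAX_ATTEMPTS = 3

    def __init__(self, password):
        self._password = password   # plain text for illustration only
        self.failed_attempts = 0
        self.locked = False

    def login(self, attempt):
        """Return True on success; lock the account after 3 wrong guesses."""
        if self.locked:
            return False
        if attempt == self._password:
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        if self.failed_attempts >= self.MAX_ATTEMPTS:
            self.locked = True
        return False


account = Account("S3cure!Pass")
for guess in ["password", "123456", "letmein"]:
    account.login(guess)

print(account.locked)                # True
print(account.login("S3cure!Pass"))  # False: even the right password is refused
```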

To prevent DoS attacks, you should have a strong firewall, packet filters on
routers, a properly configured web server, and auditing, logging and monitoring of
systems.

To prevent data interception and theft, you should use encryption, virtual
networks, passwords, locking computers, logging off and careful use of
portable media. Investigate your own network vulnerabilities.

To prevent SQL injection you should validate input boxes, use parameterised
queries, set database permissions and carry out penetration testing.
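
The difference a parameterised query makes can be shown with Python's standard `sqlite3` module and an in-memory database. The table, column names and data are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(name):
    """Builds the query by joining strings, so input can change the query itself."""
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name):
    """Passes the input as a parameter, so it is only ever treated as data."""
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
print(len(find_user_unsafe(malicious)))  # 2: every row is returned
print(find_user_safe(malicious))         # []: no user has that name
```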

System Software
An operating system provides a platform on which the user can run programs.
Utility software is used to maintain the computer system.

A Graphical User Interface (GUI) is optimised for mouse and touch gesture inputs. It
is interactive, intuitive and visual. A command line interface is text based, less
resource heavy and has more commands compared to a GUI. It is efficient and suited
to advanced users. It is useful for automating processes with scripts. A menu
interface presents successive menus to the user, with a single option chosen at each
stage, often with buttons on a keypad. A natural language interface responds to
questions in spoken language; it is not always reliable but is getting better all the time.

Multi-tasking is running multiple applications at the same time by giving each
application a small time-slice of processor time. This allows more than one program
to be held in memory at a time, and data to be shared between them, such as with
copy and paste. It also enables you to listen to music on your PC at the same time as
word processing.

Memory management: when programs are loaded, the OS decides where they are
held in memory. Over time the memory becomes fragmented as programs are
loaded and closed, because they use different amounts of memory. The OS must
keep track of different program fragments. When the memory is full, the OS uses
virtual memory.

Device drivers translate OS instructions into commands that the hardware will
understand. Each peripheral needs a device driver. Many are already built into the
OS.

User management:

 Providing for different users to log into a computer.
 The OS will retain settings for each user, such as icons, desktop backgrounds
etc.
 Each user may have different access rights to files and programs.
 A client-server network may impose a fixed or roaming profile for a user and
manage login requests to the network.

File Management:

 Data is stored in files.
 An extension to the filename tells the OS which application to load the file in.
 The OS may present a logical structure of files in folders and allow the user
to rename, delete, copy and move files.

Utility software:

 Encryption
    Uses an algorithm to scramble plain text into cipher text.
    The text can only be decrypted and made readable again with a master key.
 Defragmentation
    Reorganises files on a hard disk, putting fragments of files back together and
     collecting free space together.
    This reduces the movement of the read/write head across the surface of the
     disk, which speeds up file access.
    Solid state drives should not be defragmented. It is unnecessary as they have
     no moving parts, and it also reduces their lifespan.
 Data Compression
    Reduces the size of a file so that it takes up less space and is quicker to
     download over the Internet.
    Compressed files must be extracted before they can be read.
    Depending on the algorithm used, data is either lost, reducing the quality of an
     image or sound, or represented in a different way, retaining the
     original data in a new compressed format.

Open Source:

Users can modify and distribute the software. Can be installed on any number of
computers. Support provided by the community. Users have access to the source
code. May not be fully tested.

Proprietary:

Users cannot modify the software. Protected by the Copyright, Designs and Patents Act.
Usually paid for and licensed per user or per computer. Supported by developers.
Users do not have access to the source code. Tested by developers prior to release,
although they may run beta programmes.

Algorithms
Abstraction is the process of removing unnecessary details and including only the
relevant details. It focuses on what is important in problem solving.

Decomposition is the breaking of a complex problem into smaller, more manageable
problems. Dealing with many different stages of a problem at once is much more
difficult than breaking a problem down into a number of smaller problems and solving
each, one at a time. Advantages: it makes problems easier to solve, different people
can work on different parts of a problem at the same time, reducing development
time, and program components developed in one program can easily be used in other
programs.

Algorithmic thinking is a way of getting to a solution by identifying the individual steps
needed. Creating a set of rules (an algorithm) that is followed precisely leads to an
answer.

Abstraction, decomposition and algorithmic thinking are all part of computational
thinking.

An input is anything which needs to be supplied to the program so it can meet its
goal. It is often input by the user. Consider an appropriate variable name and data
type for the input.

Processes: consider what calculations need to be performed while the program is
running. Does data need to change format or data type?

Outputs: consider what your program needs to output and what form this output
needs to take. Consider an appropriate variable name and data type for any output.

Structure Diagrams:

 A method of designing a solution to a problem.
 Illustrates problem decomposition.
 They can be used by developers to understand a problem to code and to
share with users during systems analysis.
 They are produced using a method known as stepwise refinement.
 We break the problem down using decomposition into smaller and smaller
components.
 Some areas of the program will need breaking down more than others.
 The lowest level nodes should achieve a single task.
 These can then be coded as a single module or sub-program.

A flowchart is a method of representing the sequence of steps in an algorithm in
the form of a diagram.

 Rectangle (Process).
 Rhombus (Decision).
 Rectangle with two extra vertical lines (Sub-routine).
 Oval (Start/Terminal).
 Parallelogram (Input/Output).
 Arrow (Program Direction).

Pseudocode is an alternative, text-based way of representing the sequence of steps
in an algorithm. It can be thought of as a simplified form of programming code.

Syntax errors: Errors which break the grammatical rules of the programming
language. They stop the program from being run/translated.

Logic errors: Errors which produce an unexpected output. On their own, they won't
stop the program running.

Trace Tables:

 A vital skill for understanding program flow and testing the logic of an
algorithm is called ‘tracing execution’.
 It involves examining a printed extract of program code and running through
the program by hand.
 Take each line in turn and write out in a trace table the current state of each
variable.
 Note down any output the program produces.
 Each variable present in the program should have its own column in the trace
table.
 A new row should be added whenever the state of a variable
changes.
 Trace tables are an excellent way to track down logic errors in a program.
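
A trace table is normally filled in by hand, but the idea can be sketched by recording the state of each variable every time it changes. The running-total algorithm and the function name `trace_total` are invented for this example.

```python
def trace_total(numbers):
    """Run a running-total algorithm and return one trace table row per change."""
    rows = []
    total = 0
    rows.append({"line": "total = 0", "number": "", "total": total})
    for number in numbers:
        total = total + number
        rows.append({"line": "total = total + number", "number": number, "total": total})
    return rows


print(f"{'line':<26}{'number':<8}{'total'}")
for row in trace_total([4, 7, 1]):
    print(f"{row['line']:<26}{str(row['number']):<8}{row['total']}")
```

Each variable has its own column, and the final row shows the value the algorithm ends with (12 for the input above).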

Binary Search: Calculate the mid-point of the data set and check if that is the item to
be found. If not, and the item to be found is lower than the mid-point, repeat on the
left half of the data set; if the item to be found is greater than the mid-point, repeat
on the right half of the data set. Repeat until the item is found or there are no items
left to check. Requires the data set to be in order of a key field. More efficient than a
linear search.
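
A minimal sketch of the binary search described above, assuming the list is already sorted in ascending order. Returning -1 when the item is missing is a choice made for this example.

```python
def binary_search(items, target):
    """Return the index of target in a sorted list, or -1 if it is not present."""
    low, high = 0, len(items) - 1
    while low <= high:                 # stop when there are no items left to check
        mid = (low + high) // 2        # calculate the mid-point
        if items[mid] == target:
            return mid
        if target < items[mid]:
            high = mid - 1             # repeat on the left half
        else:
            low = mid + 1              # repeat on the right half
    return -1


numbers = [3, 8, 15, 21, 34, 55, 89]
print(binary_search(numbers, 21))  # 3
print(binary_search(numbers, 4))   # -1
```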
Linear Search: Starting from the beginning of a data set, each item is checked in turn
to see if it is the one being searched for. Doesn't require the data set to be in order.
Will work on any type of storage device. Efficient for small data sets. Very inefficient
for large data sets.
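
A minimal Python sketch of a linear search is shown below. The function name `linear_search` and the sample data are assumptions made for this example.

```python
def linear_search(items, target):
    """Check each item in turn; return its index, or -1 if it is not present."""
    for index in range(len(items)):
        if items[index] == target:
            return index
    return -1


names = ["Maya", "Tom", "Ali", "Zoe"]   # the data does not need to be in order
found_at = linear_search(names, "Ali")
not_found = linear_search(names, "Sam")
```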
Bubble Sort: Sorts an unordered list of items. It compares each item with the next
one and swaps them if they are out of order. The algorithm finishes when no more
swaps need to be made. In effect it bubbles up the largest ( or smallest ) item to the
end of the list in successive passes. It is generally the least efficient of the sorting
algorithms described here but is very easy to implement. This makes it a popular
choice for very small data sets.
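
One way to sketch a bubble sort in Python is shown below. The function name `bubble_sort` is an assumption, and the function works on a copy of the list so the original data is left unchanged.

```python
def bubble_sort(items):
    """Return a sorted copy of items using repeated passes of adjacent swaps."""
    data = list(items)                  # work on a copy
    swapped = True
    while swapped:                      # finish when a pass makes no swaps
        swapped = False
        for i in range(len(data) - 1):
            if data[i] > data[i + 1]:   # out of order, so swap the pair
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
    return data


sorted_marks = bubble_sort([5, 1, 4, 2, 8])
```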

Merge Sort: A very efficient method of performing a sort. Uses a divide and conquer
method. Splits the larger problem into two or more smaller sub-problems of the same
kind and solves them individually. Combines their solutions to solve the bigger
problem. The data set is repeatedly split in half until each item is in its own list.
Adjacent lists are then merged back together. Works very well for large data sets.
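
A recursive merge sort could look something like the sketch below. The function names `merge_sort` and `merge` are assumptions made for this example.

```python
def merge_sort(items):
    """Return a sorted copy of items using divide and conquer."""
    if len(items) <= 1:
        return list(items)              # a list of one item is already sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid])      # split in half and sort each half
    right = merge_sort(items[mid:])
    return merge(left, right)


def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])             # copy whatever is left over
    merged.extend(right[j:])
    return merged


sorted_values = merge_sort([38, 27, 43, 3, 9, 82, 10])
```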
Insertion Sort: Inserts each item into its correct position in a data set one at a time. It
is a useful algorithm for small data sets. It is particularly useful for inserting items into
an already sorted list. It is usually replaced by more efficient sorting algorithms for
large data sets.
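
An insertion sort might be sketched in Python as follows. The function name `insertion_sort` is an assumption, and the function sorts a copy of the list rather than the original.

```python
def insertion_sort(items):
    """Return a sorted copy of items, inserting one item at a time."""
    data = list(items)
    for i in range(1, len(data)):
        current = data[i]               # the next item to insert
        j = i - 1
        while j >= 0 and data[j] > current:
            data[j + 1] = data[j]       # shift larger items one place right
            j -= 1
        data[j + 1] = current           # drop the item into its correct position
    return data


sorted_cards = insertion_sort([7, 3, 9, 1, 5])
```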

Programming Fundamentals
A Variable is a value stored in memory that can change while the program is running.

A Constant is a value that does not change while the program is running, and is
assigned when the program is designed.

Assignment is giving a variable or constant a value.

Casting is converting a variable from one data type to another. A Variable can be an
integer, character, string, real ( float ) or Boolean.
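
For illustration, casting in Python uses the built-in functions `int`, `float`, `str` and `bool`. The variable names and values below are assumptions made for this example.

```python
age_text = "16"              # a string, for example as typed in by a user
age = int(age_text)          # string to integer
price = float("4.50")        # string to real (float)
label = str(age)             # integer back to string
whole = int(7.9)             # real to integer: the decimal part is dropped
is_set = bool(0)             # integer to Boolean: 0 becomes False
```

Casting text that does not look like a number, such as `int("ten")`, raises a `ValueError` in Python.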

An Input is a value that is read from an input device.

An Output is the data generated by the computer and displayed to the user.

Constants make a program easier to read as they are usually declared and assigned
at the top of the program. A constant can easily be changed by the programmer in
one place in a program, instead of changing every instance of a value throughout the
program. This leads to less chance of errors. The compiler can also optimise the code,
which can make a program run more quickly if constants are used instead of variables.

The 3 basic programming constructs:

 Sequence: executing one instruction after another.
 Selection: the program branches depending on a condition. Some programming
languages support SWITCH/SELECT … CASE statements, which let a program
branch in more than one direction depending on the value of a variable.
 Iteration: sometimes called looping, is a repeating section of code.
o FOR loops are used when the number of iterations needed is known
ahead of the iteration executing.
o WHILE loops are used when the number of iterations is not known
because the variable used to determine when the iteration ends is
changing within the iteration itself.
o DO … UNTIL loops are an alternative to WHILE loops where the code
executes at least once before the condition is checked.
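
The three constructs can be sketched in Python as below. All names and values are assumptions made for this example. Python has no DO … UNTIL statement, so the last loop imitates one with `while True` and `break`.

```python
# Sequence: instructions run one after another.
width = 4
height = 3
area = width * height


# Selection: the program branches depending on a condition.
def grade(mark):
    if mark >= 70:
        return "merit"
    elif mark >= 40:
        return "pass"
    else:
        return "fail"


# FOR loop: the number of iterations is known in advance.
total = 0
for n in range(1, 6):        # n takes the values 1, 2, 3, 4, 5
    total += n

# WHILE loop: the variable that ends the loop changes inside the loop.
value = 100
halvings = 0
while value > 10:
    value = value // 2
    halvings += 1

# DO ... UNTIL (imitated): the body runs at least once before the check.
attempts = 0
while True:
    attempts += 1
    if attempts >= 3:        # the "until" condition
        break
```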

 Common Arithmetic Operators
 + ( addition )
 - ( subtraction )
 * ( multiplication )
 / ( division )
 ^ ( Exponentiation )
 DIV or // ( Floor Division )
 MOD or % ( Modulus / Remainder )
 Common Comparison Operators
 == ( equal to )
 != ( not equal to )
 < ( less than )
 <= ( less than or equal to )
 > ( greater than )
 >= ( greater than or equal to )
 Boolean Operators
 NOT
 AND
 OR
 Data Types
 Integer ( Whole Number )
 Real / Float ( Decimal Number )
 Character ( Single letter or symbol )
 String ( Combination of letters or symbols )
 Boolean ( True or False )
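
The operators above can be tried out in Python as in the sketch below. The values of `a` and `b` are assumptions. Note that Python writes exponentiation as `**` (in Python `^` means something different) and uses `//` and `%` rather than DIV and MOD.

```python
a = 17
b = 5

added = a + b            # 22
subtracted = a - b       # 12
multiplied = a * b       # 85
divided = a / b          # 3.4 (always a real number in Python 3)
squared = a ** 2         # 289, exponentiation
floor_div = a // b       # 3, how many whole times 5 goes into 17
remainder = a % b        # 2, what is left over

is_equal = (a == b)                     # False
is_bigger = (a > b)                     # True
both = (a > b) and (b > 0)              # True, both conditions hold
either = (a < b) or (b != 5)            # False, neither condition holds
opposite = not is_equal                 # True
```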

 String Manipulation Commands
 ( .length ) Returns the length of a string.
 ( .upper ) Returns the string in uppercase.
 ( .lower ) Returns the string in lowercase.
 ( .substring( x , y ) ) Returns part of a string, starting at the character given by
the first parameter (x) and counting up by the number in the second parameter (y).
 ( .left(i) ) Returns the leftmost characters from a string, where the parameter (i)
indicates how many to return.
 ( .right(i) ) Returns the rightmost characters from a string, where the parameter
(i) indicates how many to return.
 ( + ) Concatenation joins separate string values together.
 ( ASC(…) ) Returns the ASCII value of a character.
 ( CHR(…) ) Returns a character from its ASCII number.
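
The commands above are written in a pseudocode style. As a rough guide, the nearest Python equivalents are sketched below; Python uses `len`, slicing, `ord` and `chr` instead of `.length`, `.substring`, `.left`, `.right`, `ASC` and `CHR`. The sample string and variable names are assumptions.

```python
text = "Computer"

length = len(text)              # 8
upper = text.upper()            # "COMPUTER"
lower = text.lower()            # "computer"

x, y = 3, 4                     # start at index 3 and take 4 characters
part = text[x:x + y]            # "pute" (indexes start at 0)

left = text[:3]                 # "Com", the 3 leftmost characters
right = text[-3:]               # "ter", the 3 rightmost characters

joined = text + " Science"      # concatenation
code = ord("A")                 # 65, the ASCII value of "A"
letter = chr(66)                # "B", the character with ASCII value 66
```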

The Stages to write data to a file are:

Open the file for creating / overwriting or for appending. Write the data to the file.
Close the file.
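
The write stages might look like this in Python. The file name `scores_demo.txt`, the temporary folder and the sample data are assumptions made so the example can run anywhere.

```python
import os
import tempfile

# Hypothetical file in a fresh temporary folder.
path = os.path.join(tempfile.mkdtemp(), "scores_demo.txt")

# Open for creating / overwriting ("w"), write the data, close the file.
scores_file = open(path, "w")
scores_file.write("Alice,12\n")
scores_file.write("Bob,9\n")
scores_file.close()

# Open for appending ("a"): new data is added to the end of the file.
scores_file = open(path, "a")
scores_file.write("Cara,15\n")
scores_file.close()
```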

The Stages to read data from a file are:

Open the file for reading data. Assign a Boolean variable to ‘false’ to indicate the end
of file is not reached. While the end of file is false and the search item is not found:
Read the data from the file. If the data matches what is being searched for, assign
the data to variables or output. Check if the end of the file has been reached and, if
so, assign the Boolean variable to ‘True’. Close the file.

Serial text files provide us with a simple way to store data when a program is not
running. The contents of variables are effectively copied from volatile storage ( RAM
) and stored permanently in a text file, allowing the program to use the same data
when it is run again later. Storing data in text files is useful for small amounts of data
such as game configuration files.
Structured data and records can also be loaded and stored in arrays or lists. Access
to the data is extremely fast because it is being held in RAM. However, data stored in
this way cannot be accessed simultaneously by multiple users on different
computers. Arrays and lists are suitable for small data sets and typically used to hold
data read in from a file because arrays and lists are quicker to access and
manipulate.
A Record Structure is a collection of related fields. A Field is a variable. Each field in
a record can have a different data type.
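
One way to model a record structure in Python is with a data class, as sketched below. The record name `Student` and its fields are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """A record: a collection of related fields, each with its own data type."""
    name: str             # string field
    age: int              # integer field
    average_mark: float   # real field
    enrolled: bool        # Boolean field


pupil = Student("Alice", 15, 72.5, True)
pupil.age = 16            # a field is a variable, so its value can change

# Several records can be held together in a list.
class_list = [pupil, Student("Bob", 16, 58.0, False)]
```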

Databases are often stored on remote servers and are often used to store data
shared by multiple users. Data is stored in records and fields. Databases use advanced
data structures to store data efficiently, and data can be searched and sorted using
highly efficient algorithms. They are more secure than text files. The order of
database fields is independent of the code.

SQL is used to create, delete, modify and manipulate records in a database. Basic
commands include:

 SELECT which fields are to be returned. * can be used to indicate all fields.
 FROM which table. Databases can have more than one table, each with its
own unique name.
 WHERE records meet a condition. LIKE can be used with wildcards to match a
pattern.
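
To try these commands out, the sketch below runs them from Python using the built-in `sqlite3` module and a temporary in-memory database. The table name `students`, its fields and the sample records are assumptions made for this example. In SQLite the `%` wildcard stands for any run of characters.

```python
import sqlite3

# A temporary database held in memory, with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, form TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Alice", "10A", 72), ("Bob", "10B", 58), ("Alan", "10A", 64)],
)

# SELECT * returns all fields of every record in the table.
everyone = conn.execute("SELECT * FROM students").fetchall()

# WHERE keeps only the records that meet the condition.
high_marks = conn.execute(
    "SELECT name, mark FROM students WHERE mark > 60"
).fetchall()

# LIKE with a wildcard: names that start with "Al".
al_names = conn.execute(
    "SELECT name FROM students WHERE name LIKE 'Al%'"
).fetchall()

conn.close()
```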
An array can be thought of as a variable which can contain more than one data item.
This is done by allocating a contiguous part of memory to storing that data. A
static number of related data items are stored together in the same memory space.
Each data item has the same data type. A particular data item (element) is found
using its index. Indexes usually start at 0 for the first data item. Arrays may be single
or multi-dimensional. A single-dimensional array can be visualised as a column and a
two-dimensional array as a table.
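
The sketch below uses Python lists to stand in for arrays; this is an assumption, since Python lists can grow and can mix data types, unlike the fixed-size arrays described above. The variable names and values are made up for this example.

```python
# Single-dimensional array: one row of related items of the same data type.
scores = [12, 9, 15, 7]
first = scores[0]          # indexes start at 0, so this is 12
scores[1] = 10             # change the second element

# Two-dimensional array: visualised as a table of rows and columns.
grid = [
    [1, 2, 3],
    [4, 5, 6],
]
cell = grid[1][2]          # row 1, column 2 is 6

# Visit every element in turn using its indexes.
grid_total = 0
for row in range(len(grid)):
    for column in range(len(grid[row])):
        grid_total += grid[row][column]
```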

Sub Programs ( Procedures and Functions ) :

Larger programs are developed as a set of sub-programs called subroutines.
Structuring code into sub-programs makes the code easier to read and debug.
Functions return values and create reusable program components. Procedures
create a modular structure to a program making it easier to read. Each sub-program
can easily be tested. Sub-programs can be saved into libraries and reused in other
programs.
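
The difference between a function and a procedure can be sketched in Python as below. The names `area_of_rectangle` and `print_banner` are assumptions made for this example. Python defines both with `def`; a procedure is simply a sub-program that does not return a value.

```python
def area_of_rectangle(width, height):
    """Function: works out a value and returns it to the caller."""
    return width * height


def print_banner(title):
    """Procedure: carries out a task but returns no value."""
    print("=" * len(title))
    print(title)
    print("=" * len(title))


# The same sub-programs can be reused as many times as needed.
print_banner("Room sizes")
kitchen = area_of_rectangle(4, 3)
hall = area_of_rectangle(2, 6)
print("Kitchen:", kitchen, "Hall:", hall)
```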

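In Python, in order to use the feature of randomness in a program you will have to
import it with the code ‘ import random ‘. You can then use the feature
‘ random.randint ( x , y ) ’ to generate a random whole number between the parameter (x)
and the parameter (y), including both end values. A random letter can be produced by
converting a random number into a character.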
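
A short Python sketch is shown below. The variable names are assumptions, and the letter is produced by picking a random ASCII value and converting it with `chr`.

```python
import random

# A whole number from 1 to 6, with both ends included (like rolling a dice).
roll = random.randint(1, 6)

# A random capital letter: pick an ASCII value between "A" and "Z".
letter = chr(random.randint(ord("A"), ord("Z")))

print("Rolled:", roll, "Letter:", letter)
```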

Producing Robust Programs


Input Validation is checking that data input by the user meets specific criteria / rules
before processing.

 Type Check: The input is in the correct data type.
 Range Check: The input is within a correct range.
 Presence Check: Some data has been entered.
 Format Check: The input is in the correct format.
 Length Check: The input has the correct number of characters.
By using input validation techniques a programmer can make their program more
robust, user friendly and can prevent further errors occurring later in the algorithm.
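
Each of the checks above could be written as a small Python function, as in the sketch below. The function names and the rule used in the format check (two letters followed by four digits) are assumptions made for this example.

```python
def type_check(text):
    """True if the input can be read as a whole number."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def range_check(number, lowest, highest):
    """True if the number lies within the allowed range."""
    return lowest <= number <= highest


def presence_check(text):
    """True if something other than spaces has been entered."""
    return text.strip() != ""


def format_check(text):
    """True if the input is two letters followed by four digits."""
    return len(text) == 6 and text[:2].isalpha() and text[2:].isdigit()


def length_check(text, shortest, longest):
    """True if the input has an allowed number of characters."""
    return shortest <= len(text) <= longest
```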

Anticipated Misuse can be: Division by zero, Communication error, Printer and other
peripheral errors, disk errors.

Authentication: Data used by systems should be secure. This can be achieved with
usernames and passwords to access systems, password recovery that requires
clicking on a link within an email sent to the registered address, and encryption of
data files. Online bots can submit data automatically to online forms. This can be
protected against by using software such as reCAPTCHA that verifies the user is
human. Programmers should also be aware of the potential for SQL injection hacks
and other methods used by hackers.

To Write maintainable code:

 Use comments to:
 Explain the purpose of the program.
 Explain sections of code, typically selection, iteration and procedures.
 Explain unusual approaches that were necessary.
 Visually divide sections of a program.
 Use white space to make sections of a program easier to see.
 Use indentation for every selection and iteration branch.
 Use descriptive variable names and explain their purpose with a comment when
declared.
 Use procedures and/or functions to:
 Structure the code.
 Eliminate duplicated code.
 Use constants declared at the top of the program.
Syntax errors are when the rules of the language have been broken. In compiled
languages the program will not run at all. Syntax errors can happen because: Variables
are not declared or initialised before use. Variable types are incompatible.
Assignments are used incorrectly. Keywords are misspelt.

Logic errors are when the program runs but does not give the expected output. Logic
errors can happen because: Conditions and arithmetic operations are wrong.
Sequence of commands is wrong. Division by zero. Exceptions ( File not found ).

Reasons for Testing:

 To ensure there are no errors in the code.
 To ensure that unauthorised access is prevented.
 To check that the program has an acceptable performance and usability.
 To check the program meets the requirements.

Types of Testing:
 Iterative Testing
 Each new module is tested as it is written.
 Program branches are checked for functionality.
 Checking new modules do not introduce new errors in existing code.
 Tests to ensure the program handles erroneous data and exceptional
situations.
 Final / Terminal Testing
 Testing that all modules work together.
 Testing the program produces the required results with normal, boundary,
invalid and erroneous data.
 Checking the program meets the requirements with real data.
 A beta test may find more errors.
 Types of Test Data:
 Normal Inputs
 Data which should be accepted by a program without causing errors.
 Boundary Inputs
 Data of the correct type which is on the edge of accepted validation
boundaries.
 Invalid Inputs
 Data of the correct type but outside accepted validation checks.
 Erroneous Inputs
 Data of the incorrect type which should be rejected by a computer system.
This includes no input being given when one is expected.
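
The four types of test data can be put into a simple test plan, as in the Python sketch below. The validation rule being tested (a whole-number mark from 0 to 100), the function name `accept_mark` and the chosen test values are all assumptions made for this example.

```python
def accept_mark(text):
    """Validation under test: accept whole numbers from 0 to 100 only."""
    try:
        mark = int(text)
    except ValueError:
        return False            # wrong data type, or nothing entered
    return 0 <= mark <= 100


# Each test: (input, expected result, type of test data)
test_plan = [
    ("50", True, "normal"),
    ("0", True, "boundary"),
    ("100", True, "boundary"),
    ("101", False, "invalid"),
    ("-1", False, "invalid"),
    ("abc", False, "erroneous"),
    ("", False, "erroneous"),
]

failures = []
for value, expected, kind in test_plan:
    actual = accept_mark(value)
    if actual != expected:
        failures.append((value, kind))
    print(f"{kind:10} input {value!r:6} expected {expected} got {actual}")
```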

Refining Algorithms to make them more robust is about:

 Writing code which anticipates a range of possible inputs. Those inputs could be
invalid data or erroneous data.
 Making sure ‘bad’ data doesn’t crash the program.
 Making sure prompts to the user are descriptive and helpful.
 Making sure only data of the correct ‘data type’ is entered.
 Checking and handling missing or blank data.
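
A robust input routine along these lines is sketched below. The function name `read_whole_number`, its messages and the `read` parameter are assumptions made for this example. `read` defaults to Python's `input`, and is replaced here by a list of pretend answers so the example runs without a keyboard.

```python
def read_whole_number(prompt, lowest, highest, read=input):
    """Keep asking until a whole number in the allowed range is entered."""
    while True:
        # A descriptive prompt tells the user exactly what is expected.
        text = read(f"{prompt} ({lowest} to {highest}): ").strip()
        if text == "":
            print("Nothing was entered, please try again.")   # blank data
            continue
        try:
            number = int(text)
        except ValueError:
            print("Please enter a whole number, for example 16.")   # wrong type
            continue
        if lowest <= number <= highest:
            return number
        print(f"The number must be between {lowest} and {highest}.")  # invalid


# Pretend user input: blank, wrong type, out of range, then a good value.
answers = iter(["", "ten", "250", "42"])
age = read_whole_number("Enter your age", 0, 120, read=lambda prompt: next(answers))
```

None of the three bad inputs crashes the program; each one produces a helpful message and the question is asked again.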

Programming Languages and IDEs

Machine Code:

Binary representation of instructions in a format that the CPU can decode and
execute. Each instruction has an operation code (opcode) and an address or data to
use (operand).

Low Level Languages:

Written in Assembly language. Translated by an assembler into machine code. Used
for embedded systems and device drivers where instructing the hardware directly is
necessary. One instruction translates into one machine code instruction. The code
works on one type of processor only. The programmer works with memory directly.
Code is harder to write and understand. Memory efficient. Code is fast to execute.

High Level Languages:

Source code is written in languages such as Python, C++, Java etc. Translated by a
compiler or interpreter into machine code. Makes the writing of computer programs
easier by using commands that are like English. One source code instruction
translates to many machine code instructions. Code will run on different types of
processors. The programmer has lots of data structures to use. Code is quicker and
easier to understand and write. Less memory efficient. Code can be slower to
execute if it is not optimised.

Translators are used to turn high level source code into binary machine code for
execution.

 Compiler
 Translates source code from high level languages into object code and then
into machine code ready to be processed by the CPU.
 The Whole Program is translated into machine code before it is run.
 Advantages
 No need for translation software at run-time.
 Speed of execution is faster.
 Code is usually optimised.
 Original source code is kept secret.
 Disadvantages
 The program will not run at all if it contains syntax errors, which can make
it more difficult to write the code.
 Code needs to be recompiled when the code is changed.
 Designed for a specific type of processor.
 Interpreter
 Translates Source code from high level languages into machine code ready to
be processed by the CPU.
 The program is translated line by line as the program is running.
 Advantages
 Easy to write source code because the program will always run, stopping
when it finds a syntax error.
 Code does not need to be recompiled when the code is changed, and it is
easy to try out commands when the program has paused after finding an
error.
 This makes interpreted languages very easy for beginner programmers to
learn to write code.
 Disadvantages
 Translation software is needed at run-time.
 Speed of execution is slower.
 Code is not optimised.
 Source code is needed.
 IDE ( Integrated Development Environment )
 Debugging tools for finding logic errors
 Breakpoints – stopping a program at a line of code during execution.
 Stepping through lines of code one at a time to check which lines are
executing.
 Tracing Through a program to output the values of variables.
 Help with preventing and identifying syntax errors.
 Illustrating keyword syntax and auto-completing command entry.
 Error highlighting.
 The compiler produces an output of the error message to help identify it.
 Providing a run-time environment
 Output window.
 Simulating different devices the program can run on.
 Usability functions
 Navigation, showing/hiding sections of code.
 Formatting source code
 Find and replace
 Comment or indent regions.
