1.1 Data, Information and Knowledge
Direct data source: data that is collected for the purpose for which it will be used
Indirect data source: data that was collected for a different purpose (secondary
source)
Codec: a computer program that encodes and decodes a digital data stream
ASCII (American Standard Code for Information Interchange) is a common
method for encoding text
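As a minimal sketch of what encoding text with ASCII means, the Python snippet below turns a short word into its ASCII codes and back again. The sample word "Data" is only an illustration and is not taken from the notes.

```python
# Encode a short piece of text as ASCII and decode it again.
text = "Data"  # illustrative sample text (an assumption, not from the notes)

# str.encode returns a bytes object: one byte (code 0-127) per ASCII character.
encoded = text.encode("ascii")

# The decimal ASCII code of each character.
codes = list(encoded)

# The same codes written as 8 binary digits each, as they would be stored.
bits = [format(code, "08b") for code in codes]

# Decoding with the same scheme recovers the original text.
decoded = encoded.decode("ascii")

print(codes)    # [68, 97, 116, 97]
print(bits)     # ['01000100', '01100001', '01110100', '01100001']
print(decoded)  # Data
```

Decoding only works if the same scheme is used at both ends, which is why text must be opened with the correct encoding.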
Images are encoded as bitmaps through various parameters (such as
width/height, bit count, compression type, horizontal/vertical resolution, and
raster data).
Graphics on screen are made up of a grid of tiny dots called pixels. More pixels
means a higher resolution and better quality, but also more storage.
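The link between pixel count and storage can be sketched with a rough calculation. The function name bitmap_size_bytes and the sample image sizes are assumptions made for illustration, and the estimate ignores file headers and compression.

```python
def bitmap_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Estimate the uncompressed pixel data size in bytes (headers ignored)."""
    total_bits = width * height * bits_per_pixel
    # Round up to a whole number of bytes (8 bits per byte).
    return (total_bits + 7) // 8


# Two illustrative images with 24 bits per pixel (an assumed colour depth).
small = bitmap_size_bytes(640, 480, 24)
large = bitmap_size_bytes(1920, 1080, 24)

print(small)  # 921600
print(large)  # 6220800
```

The larger image has more pixels, so it needs more storage at the same number of bits per pixel.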
Bitmap images are widely used on smartphones, in cameras and online. A bitmap is
organised as a grid of coloured squares called pixels. When you zoom in on an
image, the pixels are stretched into larger blocks, which is why bitmaps appear
poor quality when enlarged.
In a monochrome image each pixel needs only one bit, so a byte, which consists
of eight bits, can represent eight pixels.
In a colour image, the colour of each pixel is stored as a binary number.
As the image gets bigger, it takes more storage.
Therefore, a method called run-length encoding
(RLE) can be used to reduce the amount of storage
space that is used. This is known as compression.
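A minimal sketch of run-length encoding is shown below. It stores each run of identical pixels as a (value, count) pair. This pair layout is an assumption chosen for clarity, since real file formats pack runs in their own ways, and the row of "W" and "B" pixels is made up for illustration.

```python
from itertools import groupby


def rle_encode(pixels):
    """Replace each run of identical values with a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# One illustrative row of a black-and-white image: W = white, B = black.
row = ["W"] * 5 + ["B"] * 3 + ["W"] * 4

encoded = rle_encode(row)
print(encoded)  # [('W', 5), ('B', 3), ('W', 4)]

# Decoding gives back exactly the original row, so no data has been lost.
print(rle_decode(encoded) == row)  # True
```

Twelve pixel values are reduced to three pairs here. RLE works best on images with long runs of the same colour and can make the data larger when neighbouring pixels rarely repeat.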
Sometimes when files are compressed, they use lossy compression, which means
some of the original data is removed and the quality is reduced.
• Images are often encoded into common image file types such as:
o JPEG/JPG (Joint Photographic Experts Group)
o GIF (Graphics Interchange Format)
o PNG (Portable Network Graphics)
o SVG (Scalable Vector Graphics), which is a vector format rather than a bitmap
Sound is encoded by storing the sample rate, bit depth and bit rate.
When sound is recorded, it is converted from its original analogue format to a
digital format, which is broken down into thousands of samples per second. Each
sound sample is stored as binary data.
The sample rate, or sampling frequency, is the number of audio samples per
second. It is measured in hertz (Hz).
The higher the sample rate, the higher the quality of the music, but also the more
storage that is required. Each sample is stored as a binary number.
The bit depth is the number of bits (1s and 0s) used for each sound sample. A
higher bit depth will give a higher quality sound.
The bit rate is the number of bits processed every second.
• bit rate= sample rate x bit depth x number of channels
• bit rate is measured in kilobits per second (kbps)
• Uncompressed encoding uses WAV (Waveform Audio File Format)
Lossy compression: reduces the file size by reducing the bit rate, causing some
loss in quality.
Lossless compression: reduces the file size without losing any quality, but can
only reduce the file size to about 50%.
For example, a CD sound file has a sample rate of 44.1 kHz (44100 Hz), a bit depth
of 16 bits and two channels (left and right for stereo).
bit rate = 44100 x 16 x 2 = 1411200 bps = 1.4 Mbps (megabits per second)
That means that 1.4 megabits are required to store every second of audio.
Multiply the bit rate by the number of seconds to find the file size. For a
recording lasting 210 seconds:
file size (in bits) = 1411200 x 210 = 296352000 (296 megabits)
There are eight bits in a byte and we use bytes to measure storage, so the
file size in bits is divided by eight:
file size (in bytes) = 296352000 ÷ 8 = 37044000 bytes = 37 MB (megabytes)
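The CD calculation above can be checked with a short sketch. The function names are assumptions made for illustration; the numbers are the ones used in the worked example.

```python
def bit_rate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """bit rate = sample rate x bit depth x number of channels (bits per second)."""
    return sample_rate_hz * bit_depth * channels


def file_size_bytes(bit_rate: int, seconds: int) -> int:
    """Uncompressed size: bits per second x seconds, divided by 8 bits per byte."""
    return bit_rate * seconds // 8


cd_bit_rate = bit_rate_bps(44100, 16, 2)
size = file_size_bytes(cd_bit_rate, 210)

print(cd_bit_rate)             # 1411200
print(size)                    # 37044000
print(round(size / 1_000_000)) # 37
```

The last line converts bytes to megabytes using 1 MB = 1,000,000 bytes, which matches the rounding in the worked example.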
Video
When video is encoded, it needs to store both images and sound. Images are
stored as frames. Standard quality video normally has 24 frames per second (fps);
high quality video uses 50 fps or 60 fps. The higher the frame rate, the higher
the quality, but the larger the storage needed.
Size: an HD video will have an image size of 1920 pixels wide and 1080 pixels high.
The bit rate of video includes both audio and frames. The bit rate is the number
of bits to be processed every second. A higher frame rate requires a higher bit
rate.
Example: one hour of eight-bit HD video at 24 fps would require 334 GB
(gigabytes) of storage, which is too much data to download. Therefore,
compression is required. Compression involves reducing the:
• resolution
• bit rate
• image size
These all result in lossy compression. (Digital video (DV)
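The scale of the one-hour HD figure quoted above can be sketched with a rough calculation. The notes do not say how that figure was derived, so the values here are assumptions: an average of 16 bits stored per pixel, 1 GB counted as 1024 x 1024 x 1024 bytes, and audio ignored. They were chosen because together they land close to the quoted value.

```python
def raw_video_bytes(width: int, height: int, bits_per_pixel: int,
                    fps: int, seconds: int) -> int:
    """Uncompressed size of the video frames only (audio ignored)."""
    total_bits = width * height * bits_per_pixel * fps * seconds
    return total_bits // 8


# Assumptions: 16 bits per pixel on average, one hour = 3600 seconds.
size = raw_video_bytes(1920, 1080, 16, 24, 3600)

# Convert to gigabytes, assuming 1 GB = 1024**3 bytes.
size_gb = size / 1024**3

print(size)            # 358318080000
print(round(size_gb))  # 334
```

Changing any one of the inputs (resolution, bits per pixel or frame rate) changes the total in proportion, which is why compression targets these values.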
Advantages of encoding:
• Reduced file size.
• Enables real-time streaming of video and music over restricted bandwidth.
• Reduces the time taken to download files.
• Enables different formats to be used (for example, GIF allows animated images).
• Makes it easy to download music, images or video from websites.
Disadvantages of encoding:
• The required codecs may not be installed, so a file cannot be saved in the
desired format.
• The necessary codecs need to be installed to open encoded files.
• Not all software is able to open different file types.
• Some hardware, such as music and video players, only plays files encoded in
certain formats (for example, a CD player that can play MP3 files but cannot
download them).
• The quality of images, sound and video is lost when files are compressed using
lossy compression.
• Text encoded using ASCII or Unicode needs to be decoded using the correct
format when it is opened.
Verification: the process of checking whether data entered into the system
matches the original source.
• Visual checking: the user visually checks that the data entered matches the
original source by reading and comparing the two.
Visual checking does not ensure that the data entered is correct. If the original
data is wrong, then the verification process may still pass.
For example, if the intended data is ABCD but ABC is on the source document,
then ABC will be entered into the computer and verified, even though it should
have been ABCD in the first place.
• Double data entry: Data is input into the system twice and checked for
consistency by comparing.
The two items of data are compared by the computer system and if they match,
then they are verified.
If there are any differences, then one of the inputs must have been incorrect.
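Double data entry can be sketched as a simple comparison of two inputs. The function name and the sample email addresses below are hypothetical and only illustrate the idea.

```python
def double_entry_verified(first_entry: str, second_entry: str) -> bool:
    """Return True only when both entries are exactly the same."""
    return first_entry == second_entry


# Hypothetical example: an email address typed twice on a sign-up form.
print(double_entry_verified("sam@example.com", "sam@example.com"))  # True
print(double_entry_verified("sam@example.com", "sam@exmaple.com"))  # False
```

A mismatch shows that at least one entry is wrong, but it does not show which one, so the user is normally asked to enter the data again.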
By using both validation and verification, the chances of entering incorrect data
are reduced.
If data that is incorrect passes a validation check, then the verification check is
likely to spot the error.
For example, the validation rule is that a person’s gender must be a single
letter. The correct letter is M, but N is entered. This passes the validation
check but is clearly incorrect. When verified using double entry, the user enters
N first followed by M the second time, so the verification process identifies the
error. However, it is still possible that the user could enter N twice, in which
case both the validation and verification processes would fail to detect the
error.
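The gender example can be sketched in code to show how validation and verification work together and where they still fall short. The function names are assumptions; the rule (a single letter) and the values N and M come from the example above.

```python
def passes_validation(gender: str) -> bool:
    """Validation rule from the example: the value must be a single letter."""
    return len(gender) == 1 and gender.isalpha()


def passes_verification(first_entry: str, second_entry: str) -> bool:
    """Double data entry: the two entries must match."""
    return first_entry == second_entry


def accepted(first_entry: str, second_entry: str) -> bool:
    """Data is accepted only if it passes both validation and verification."""
    return (passes_validation(first_entry)
            and passes_validation(second_entry)
            and passes_verification(first_entry, second_entry))


print(passes_validation("N"))  # True: N is a single letter, so it is valid
print(accepted("N", "M"))      # False: double entry catches the mismatch
print(accepted("N", "N"))      # True: the same mistake twice goes undetected
print(accepted("M", "M"))      # True: the correct value is accepted
```

The third case is the weakness described above: neither check can tell that N is the wrong letter when it is entered consistently.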
1.06 Summary
Information has context and meaning so a person knows what it means. The
quality of information can be affected by the accuracy, relevance, age, level of
detail and completeness of the information. Proofreading is the process of
checking information.
Data are raw numbers, letters, symbols, sounds or images without meaning.
Knowledge allows data to be interpreted and is based on rules and facts. Static
data does not normally change. Dynamic data updates as a result of the source
data changing. Data collected from a direct data source (primary source) must be
used for the same purpose for which it was collected. Data collected from an
indirect source (secondary source) already existed for another purpose.
Coding is the process of representing data by assigning a code to it for
classification or identification. Encoding is the process of storing data in a specific
format. Encryption is when data is scrambled so that it cannot be understood.
Validation ensures that data is sensible and allowed. Validation checks include a
presence check, range check, type check, length check, format check and check
digit. Verification is the process of checking data has been transferred correctly.
Verification can be done visually or by double data entry.
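Several of the validation checks named in the summary can be sketched as small functions. The function names, the sample values and the format pattern (two capital letters followed by four digits) are all assumptions made for illustration. A check digit is left out because its calculation depends on the scheme being used.

```python
import re


def presence_check(value: str) -> bool:
    """Presence check: something other than blank space has been entered."""
    return value.strip() != ""


def range_check(value: int, low: int, high: int) -> bool:
    """Range check: the value lies between the limits, inclusive."""
    return low <= value <= high


def type_check(value: str) -> bool:
    """Type check: the text can be read as a whole number."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def length_check(value: str, min_len: int, max_len: int) -> bool:
    """Length check: the number of characters is within the limits."""
    return min_len <= len(value) <= max_len


def format_check(value: str, pattern: str) -> bool:
    """Format check: the whole value matches the given pattern."""
    return re.fullmatch(pattern, value) is not None


print(presence_check("   "))                     # False
print(range_check(15, 1, 12))                    # False
print(type_check("42"))                          # True
print(length_check("abc", 1, 3))                 # True
print(format_check("AB1234", r"[A-Z]{2}\d{4}"))  # True
```

Each check only confirms that data is sensible and allowed. None of them can confirm that the data is actually correct, which is why verification is used as well.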