Caoi Baq Lesson2
Generally, there are six main steps in the data processing cycle:
Step 1: Collection
The collection of raw data is the first step of the data processing cycle. The type of raw data collected has a huge
impact on the output produced. Hence, raw data should be gathered from defined and accurate sources so that the
subsequent findings are valid and usable. Raw data can include monetary figures, website cookies, profit/loss statements
of a company, user behavior, etc.
Step 2: Preparation
Data preparation or data cleaning is the process of sorting and filtering the raw data to remove unnecessary and
inaccurate data. Raw data is checked for errors, duplication, miscalculations or missing data, and transformed into a
suitable form for further analysis and processing. This is done to ensure that only the highest quality data is fed into the
processing unit.
The purpose of this step is to remove bad data (redundant, incomplete, or incorrect data) so that high-quality
information can be assembled and put to the best possible use for business intelligence.
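The cleaning described in this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample records, the field names (customer, amount), and the helper names parse_amount and prepare are hypothetical and do not come from the lesson.

```python
from typing import Optional

# Hypothetical raw records, used only to illustrate the idea.
RAW_RECORDS = [
    {"customer": "Ana", "amount": "120.50"},
    {"customer": "Ben", "amount": ""},        # missing value
    {"customer": "Ana", "amount": "120.50"},  # exact duplicate
    {"customer": "Cy", "amount": "12x"},      # not a valid number
    {"customer": "Dee", "amount": "75"},
]


def parse_amount(text: str) -> Optional[float]:
    """Return the amount as a float, or None if it is missing or invalid."""
    try:
        return float(text)
    except ValueError:
        return None


def prepare(records):
    """Drop redundant, incomplete, and incorrect records."""
    seen = set()
    clean = []
    for record in records:
        key = (record["customer"], record["amount"])
        if key in seen:
            continue  # redundant data: already seen this exact record
        seen.add(key)
        amount = parse_amount(record["amount"])
        if amount is None:
            continue  # incomplete or incorrect data
        clean.append({"customer": record["customer"], "amount": amount})
    return clean


CLEAN_RECORDS = prepare(RAW_RECORDS)
```

Of the five sample records, only the first Ana record and the Dee record survive. The other three are dropped as missing, duplicated, or invalid.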
Step 3: Input
In this step, the raw data is converted into machine-readable form and fed into the processing unit. This can take the
form of data entry through a keyboard, a scanner, or any other input source.
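As a rough illustration of converting raw data into machine-readable form, the sketch below parses typed-in text into records with proper number types, using Python's standard csv module. The sample text, the column names, and the function name read_input are assumptions made for the example.

```python
import csv
import io

# Hypothetical raw input, as it might be typed at a keyboard or scanned.
RAW_TEXT = """item,quantity,unit_price
pencil,10,0.25
notebook,3,1.50
"""


def read_input(text):
    """Turn raw CSV text into typed records that a program can process."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "item": row["item"],
            "quantity": int(row["quantity"]),        # text -> whole number
            "unit_price": float(row["unit_price"]),  # text -> decimal number
        }
        for row in reader
    ]


records = read_input(RAW_TEXT)
```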
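Step 4: Processing
In this step, the raw data is subjected to various data processing methods using machine learning and artificial
intelligence algorithms to generate a desirable output.
Step 5: Output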
The data is finally transmitted and displayed to the user in a readable form like graphs, tables, vector files, audio,
video, documents, etc. This output can be stored and further processed in the next data processing cycle.
Step 6: Storage
The last step of the data processing cycle is storage, where data and metadata are stored for further use. This
allows for quick access and retrieval of information whenever needed, and also allows it to be used as input in the next
data processing cycle directly.
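One possible way to picture storage is saving the output together with simple metadata, then reading it back as input for the next cycle. The sketch below uses a JSON file for this. The file name, the metadata field, and the function names store, load, and round_trip are hypothetical choices for illustration.

```python
import json
import os
import tempfile


def store(result, path):
    """Save the output together with simple metadata about it."""
    payload = {"metadata": {"record_count": len(result)}, "data": result}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)


def load(path):
    """Read the stored data back so it can feed the next cycle."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)["data"]


def round_trip(result):
    """Store a result in a temporary folder and retrieve it again."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "cycle_output.json")
        store(result, path)
        return load(path)
```

Calling round_trip on a list of records returns an equal list, which is what allows stored output to be reused as input.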
Type                   Uses
Batch Processing       Data is collected and processed in batches. Used for large amounts of data. E.g., payroll system.
Real-time Processing   Data is processed within seconds when the input is given. Used for small amounts of data. E.g., withdrawing money from an ATM.
Online Processing      Data is automatically fed into the CPU as soon as it becomes available. Used for continuous processing of data. E.g., barcode scanning.
Multiprocessing        Data is broken down into frames and processed using two or more CPUs within a single computer system. Also known as parallel processing. E.g., weather forecasting.
Time-sharing           Allocates computer resources and data in time slots to several users simultaneously.
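The difference between batch and real-time processing in the table above can be sketched as follows. This is a simplified illustration, not a real payroll or ATM system: the sample data and the function names process_batch and process_as_it_arrives are assumptions.

```python
def process_batch(records, batch_size):
    """Batch style: group the collected records, then handle one group at a time."""
    totals = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        totals.append(sum(batch))
    return totals


def process_as_it_arrives(stream):
    """Real-time style: update the result as soon as each record is given."""
    running_total = 0
    for value in stream:
        running_total += value
        yield running_total


# Hypothetical hours worked over seven days.
hours_worked = [8, 8, 7, 9, 8, 6, 8]
batch_totals = process_batch(hours_worked, batch_size=3)
live_totals = list(process_as_it_arrives(iter(hours_worked)))
```

The batch version produces a result only after a whole group has been collected. The second version produces an updated result after every single input.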
FAQs
Q1. What is Manual Data Processing?
Manual Data Processing is when the entire process is done by humans without using any automation service or
electronic devices. It is a low-cost method of data processing, but it is time- and labor-intensive.
Q2. What is Mechanical Data Processing?
In Mechanical Data Processing, data is processed using mechanical devices and machines rather than by hand
alone. This includes using simple devices such as calculators, typewriters, etc. Compared with manual data
processing, there are fewer errors, and the processing is faster and less labor-intensive.
Q3. What is Electronic Data Processing?
Electronic Data Processing or EDP is the use of automated methods to process commercial data. This process uses
computers to process simple data in large volumes. Examples of this include stock inventory, banking transactions,
etc. This process does not include human intervention and is less prone to errors.
Q4. What is Batch Data Processing?
Batch Data Processing is when processing and analysis happen on data that has been collected and stored over a
period of time. This process is often applied to large datasets such as payroll, credit card or banking transactions, etc.
Q5. What is Real-time Data Processing?
Real-time Data Processing is when data is processed almost immediately after it is received. This system is used
when results are required in a short amount of time, for example in stock selling.
Q6. What is Automatic Data Processing?
Automatic Data Processing is when a tool or software is used to store, organize, filter and analyze the data. It is
also known as Automated Data Processing.
Name: Date:
Section: Score:
I. IDENTIFICATION: Identify the word/term that is defined or described by each of the following statements or
examples. (5 Points)
1. ____________________, the process of sorting and filtering the raw data to remove unnecessary and inaccurate
data.
2. ____________________, the raw data is converted into machine readable form and fed into the processing unit.
3. ____________________, the raw data is subjected to various data processing methods using machine learning
and artificial intelligence algorithms to generate a desirable output.
4. ____________________, the data is finally transmitted and displayed to the user in a readable form like graphs,
tables, vector files, audio, video, documents, etc. This can be stored and further processed in the next data
processing cycle.
5. ____________________, the last step of the data processing cycle, where data and metadata are kept for further
use.
II. TRUE or FALSE: Read each statement below carefully. Write TRUE in the space AFTER each number if you think the
statement is correct, or FALSE if you think the statement is incorrect. (5 Points)
1. ___________, Batch Processing means data is collected and processed in batches. Used for large amounts of data.
2. ___________, Real-time Processing means data is processed within seconds when the input is given.
3. ___________, Online Processing means data is automatically fed into the CPU as soon as it becomes available.
4. ___________, Multiprocessing is also known as parallel processing.
5. ___________, Time-sharing allocates computer resources and data in time slots to several users simultaneously.
III. ESSAY: Discuss briefly and give or enumerate examples. Please use the back of the paper. (10 Points)
What is the future of Data Processing?