1639 GCS210462 NguyenQuangHuyASM1 Brief
Assignment Brief 1
Submission Format
Format: The submission is in the form of an individual written report that shows
how you have managed the project. This should be written in a concise,
formal business style using single spacing and font size 12. You are required
to make use of headings, paragraphs and subsections as appropriate, and
all work must be supported with research and referenced using the Harvard
referencing system. Please also provide a bibliography using the Harvard
referencing system.
Submission: Students must submit the assignment by the due date and in the way
requested by the Tutors. The submission will be a soft copy in PDF
posted to the corresponding course on http://cms.greenwich.edu.vn/
Note: The Assignment must be your own work, and not copied by or from another
student or from books etc. If you use ideas, quotes or data (such as
diagrams) from books, journals or other sources, you must reference your
sources, using the Harvard style. Make sure that you know how to reference
properly, and that you understand the guidelines on plagiarism. If you do not,
you will definitely fail.
LO1: Examine appropriate research methodologies and approaches as part of the research
process.
LO2: Conduct and analyse research relevant for a computing research project
LO3: Communicate the outcomes of a research project to identified stakeholders
The assignment offers students the chance to explore various aspects of big data from the
perspective of computing professionals or data scientists. It also encourages investigations
into the applications, benefits, limitations, and responsibilities associated with big data and
provides solutions to the problems it aims to solve.
Vocational scenario
Introduction to theme
Big Data
Over the past decade, the term "big data" has gained increasing popularity. Initially, it
referred to data generated in massive volumes, such as internet search queries, weather
sensor data, and social media information. Nowadays, big data represents large amounts
of information from diverse sources that cannot be processed conventionally or without
computational intervention. Big data can be stored in structured, unstructured, or semi-
structured formats. Many systems and organizations generate massive quantities of big
data on a daily basis, some of which are publicly available for analysis. Consequently,
machine learning systems have been developed to sift through this data, rapidly identify
patterns, and solve problems. This has led to the emergence of data science analytics as a
discipline to design, build, and test machine learning and artificial intelligence systems.
Leveraging big data requires a broad range of knowledge and skills, creating new
opportunities for previously inaccessible organizations. It allows businesses to gain a
comprehensive understanding of global trends, enabling more accurate and up-to-date
decision-making. Big data can help identify potential business risks earlier and minimize
costs without compromising innovation. However, the rapid application of big data raises
concerns about security, the ethical storage of personal data from multiple sources, and
the sustainability of energy requirements in large data warehouses.
Task
Students are to choose their own research topic for this unit. Strong research projects are
those with clear, well focused and defined objectives. A central skill in selecting a research
objective is the ability to select a suitable and focused research objective. One of the best
ways to do this is to put it in the form of a question. Students should be encouraged by
tutors to discuss a variety of topics related to the theme to generate ideas for a good
research objective.
The range of topics discussed could cover the following:
• Storage models.
• Cyber security risks.
You have to set your own research question in the research proposal based on the previous
range of topics. The research question must be specific enough, for example by stating the
audience of the research (job, age, etc.) and the kind of devices (personal devices,
household appliances, or a combination of several kinds).
Recommended Resources
Report: Big Data & Investment Management: The Potential to Quantify Traditionally Qualitative
Factors https://tinyurl.com/yff4uenz
Video: Big Data In 5 Minutes|What Is Big Data?|Introduction To Big Data|Big Data Explained
https://www.youtube.com/watch?v=bAyrObl7TYE
Book: Principles and Practice of Big Data Preparing, Sharing, and Analysing Complex Information
https://www.sciencedirect.com/book/9780128156094/principles-and-practice-of-big-data
Book: Systems Simulation and Modelling for Cloud Computing and Big Data Applications
https://tinyurl.com/2s3wkehn
Journal: Big Data with Cloud Computing: Discussions and Challenges
https://www.sciopen.com/article/pdf/10.26599/BDMA.2021.9020016.pdf
Journal: The social implications, risks, challenges and opportunities of big data
https://tinyurl.com/yw593svk
Journal: Policy discussion – Challenges of big data and analytics driven demand-side management
https://tinyurl.com/kyb3j6x7
Journal: Towards felicitous decision making: An overview on challenges and trends of Big Data
https://www.sciencedirect.com/science/article/abs/pii/S0020025516304868
P1. Appropriate research question, aim, related documents in the research proposal
1. Introduction
P2. Examine appropriate research methods and approaches to primary and secondary research
2. Conclusion
P3. Conduct primary and secondary research using appropriate methods for a computing research
project that consider costs, access and ethical issues
P4. Apply appropriate analytical tools, analyze research findings and data
P5. Communicate research outcomes in an appropriate manner for the intended audience.
REFERENCES
1. Introduction
Our project is to research and develop software that helps users detect and remove malicious code from
websites based on big data. The software will detect malicious code on sites that hold large amounts of
data and prevent problems related to personal data, in order to protect user information. Problems may
also arise during project implementation; our task is to find reasonable solutions to them and to
develop the system further.
Example objectives:
Objective a: Collect data from thousands of popular websites, including those of large organizations,
online stores, and online forums.
Objective b: Develop a machine learning algorithm based on big data to detect the behavior of malicious
code, such as abnormal sequences in source code or malicious activity.
Objective c: Build a cross-platform user interface application, allowing users to view notifications about
malware threats and remove them easily.
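As an illustration of Objective b, the following is a minimal sketch of how page source could be turned into numeric features before any machine learning model is trained. It is not the project's implementation: the pattern list, the function names extract_features and shannon_entropy, and the sample snippet are all hypothetical, and a real detector would need a labelled dataset and a trained classifier on top of features like these.

```python
import math
import re
from collections import Counter

# Hypothetical indicator patterns, chosen for illustration only.
SUSPICIOUS_PATTERNS = {
    "eval_call": re.compile(r"\beval\s*\("),
    "document_write": re.compile(r"document\.write\s*\("),
    "unescape_call": re.compile(r"\bunescape\s*\("),
    "hidden_iframe": re.compile(
        r"<iframe[^>]*(?:width|height)\s*=\s*[\"']?0", re.IGNORECASE
    ),
}


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits (0.0 for empty text)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(source: str) -> dict:
    """Count each hypothetical pattern and add entropy and length features."""
    features = {
        name: len(pattern.findall(source))
        for name, pattern in SUSPICIOUS_PATTERNS.items()
    }
    features["entropy"] = shannon_entropy(source)
    features["length"] = len(source)
    return features


if __name__ == "__main__":
    # Made-up snippet, not taken from any real website.
    sample = (
        '<iframe src="http://example.invalid" width=0 height=0></iframe>'
        '<script>eval(unescape("%61"))</script>'
    )
    print(extract_features(sample))
```

Each feature dictionary could then serve as one input row for a classifier; which model to use and how to label the training data are left open here.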
Potential issues that arise throughout the project:
Big data processing complexity: Processing and storing big data from millions of websites requires
robust database systems and server infrastructure. This can put great pressure on infrastructure and
technical resources.
Detection of new and evolving malware: Online malware is constantly evolving, and detecting new
variants can be challenging. The software must be updated regularly to keep up with these
changes.
Security and privacy: Collecting and processing data from websites can cause privacy and security issues.
There must be measures in place to protect data from attack and misuse.
1. Literature review
Research Methodologies
Research methodology is a way of explaining how a researcher intends to carry out their research. It's a
logical, systematic plan to resolve a research problem. A methodology details a researcher's approach to
the research to ensure reliable, valid results that address their aims and objectives. It encompasses what
data they're going to collect and where from, as well as how it's being collected and analyzed. A
research methodology gives research legitimacy and provides scientifically sound findings. It also
provides a detailed plan that helps to keep researchers on track, making the process smooth, effective
and manageable. A researcher's methodology allows the reader to understand the approach and
methods used to reach conclusions. A clearly documented methodology also has the following benefits:
• Other researchers who want to replicate the research have enough information to do so.
• Researchers who receive criticism can refer to the methodology and explain their approach.
• It can help provide researchers with a specific plan to follow throughout their research.
• The methodology design process helps researchers select the correct methods for the objectives.
• It allows researchers to document what they intend to achieve with the research from the outset.
Primary research: Primary research refers to research that has involved the collection of original data
specific to a particular research project (Gratton & Jones, 2010). When doing primary research, the
researcher gathers information first-hand rather than relying on available information in databases and
other publications.
• Surveys
2. Conclusion
There are many research methods, but I chose primary research because it yields up-to-date
and relevant information, which allows current trends to be identified accurately. It is also
inexpensive to carry out.
P3. Conduct primary and secondary research using appropriate methods for a
computing research project that consider costs, access and ethical issues
2. Primary Research
Primary research was used to better understand how the software under research and development can
help users detect and remove malicious code from websites based on big data. The main primary
research methods used are:
Survey: Includes questions related to the software research and development, to gather opinions from
social media users.
Interview: Conducted with individuals to understand them better and gather detailed information.
Focus group: Hosts a group of people with diverse opinions and ideas.
Survey Method
We used Google Forms (a free tool for creating online surveys and questionnaires), together with
interviews and a focus group.
Age: Users range from 18 to 45 years old, covering young and middle-aged groups.
Customize and format visualization: After connecting the dataset, we choose suitable fields for the
rows and columns and display them as a bar chart.
Create dashboard and import into Tableau Public website: After completing all the sheets for our
dataset, we combined them into a single dashboard. We then published the dashboard to the Tableau
Public website to share it with our group and make it accessible to other users.
We must analyze the concerns of stakeholders, identify the relevant phenomena, and organize the
implementation of recommendations. The report will be vague and unusable if the implementing agency
and the interests of stakeholders are not identified. The stakeholders in this project are:
⚫ Teammates: My team supported me a great deal in choosing the approach and topic development
method for my research, and in designing a reasonable research process for developing the
project. We drew on our strengths and worked around our weaknesses to complete the research
project as effectively as possible.
⚫ The people who took part in the interviews and surveys: They are very important people who
contributed greatly to my research project.
⚫ Myself: Extensive research helped me create a website project and see its importance at every
stage of research and development.
Survey Analysis
This chart displays the density of survey respondents' agreement by role ('Student', 'Housewife',
'Self-employed' and 'Employee'), with levels of agreement ranging from 'Disagree' to 'Agree'.
The chart is built from 2 elements:
This chart shows the gender breakdown of survey participants by age; the majority were male.
This chart displays the density of survey respondents' agreement by gender, with levels of
agreement ranging from 'Strongly Disagree' to 'Agree'. The chart is built from 2 elements:
Column: “Gender”, “Do you think it is really important to prevent malicious code on websites?”
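The cross-tabulation behind a chart like this can also be sketched in code. The example below is an assumption-laden illustration, not the analysis actually performed in Tableau: the function name cross_tabulate is hypothetical, and the rows are placeholder values that only show the expected data shape. They are not the survey results.

```python
from collections import Counter

QUESTION = "Do you think it is really important to prevent malicious code on websites?"


def cross_tabulate(rows, group_key, answer_key):
    """Return {group: {answer: share}}; shares within each group sum to 1."""
    pair_counts = Counter((row[group_key], row[answer_key]) for row in rows)
    group_totals = Counter(row[group_key] for row in rows)
    table = {}
    for (group, answer), n in pair_counts.items():
        table.setdefault(group, {})[answer] = n / group_totals[group]
    return table


if __name__ == "__main__":
    # Placeholder rows only; these are NOT real survey responses.
    placeholder_rows = [
        {"Gender": "Male", QUESTION: "Agree"},
        {"Gender": "Male", QUESTION: "Agree"},
        {"Gender": "Male", QUESTION: "Disagree"},
        {"Gender": "Female", QUESTION: "Agree"},
    ]
    for gender, shares in cross_tabulate(placeholder_rows, "Gender", QUESTION).items():
        print(gender, shares)
```

If the Google Forms responses were exported as a CSV file, the rows could be loaded with Python's csv.DictReader and passed to the same function, provided the column headers match the keys used here.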