Assessment 3-Group Assignment


ICT505 Data Analytics

Assessment 3 – Group Assignment (30%: 20% Report and 10% Group Presentation)
Overview

Students are required to submit a report of approximately 1,500 words, along with exhibits
to support their findings, on the provided Airbnb dataset. The report should review data
quality issues, common techniques used for data cleaning, and data pre-processing.
Additionally, students are required to present their work in class in Week 8 in a
10-minute presentation.
- Introduction:
Airbnb has significantly disrupted the traditional hospitality industry, with more travelers opting
for Airbnb as their primary accommodation provider. In this assignment, students will preprocess
and clean the Barwon South West dataset obtained from Inside Airbnb to ensure data quality and
prepare it for further analysis.
- Dataset
Inside Airbnb - Barwon South West, Vic, Victoria, Australia

The dataset can be downloaded from the link below:

http://data.insideairbnb.com/australia/vic/barwon-south-west-vic/2023-12-26/visualisations/listings.csv

- Tasks for the Report:


Task 1: Identify Common Data Quality Issues (5 marks)
Explain common data quality issues present in the dataset that require cleaning. These include,
but are not limited to, missing values, inconsistent data formats, outliers, and duplicate records.
Explain how these data quality issues appear in the Airbnb dataset.
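
As an illustration of how such issues might be surfaced, the following is a minimal Python sketch that counts missing values per column and fully duplicated rows. The sample rows and column names (id, name, room_type, price, last_review, reviews_per_month) are made up for illustration and are assumptions, not content taken from the provided file. The sample also includes an inconsistent date format and an unusually high price of the kind a visual check might catch.

```python
import csv
import io

# Hypothetical sample rows; the column names are assumed for illustration only.
SAMPLE = """id,name,room_type,price,last_review,reviews_per_month
1,Beach house,Entire home/apt,250,2023-11-02,1.2
2,Studio,Private room,,,
2,Studio,Private room,,,
3,Farm stay,entire home/apt,9999,02/12/2023,0.4
"""

def profile(rows):
    """Count empty cells per column and rows that repeat an earlier row exactly."""
    missing = {}
    seen = set()
    duplicates = 0
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] = missing.get(column, 0) + 1
        key = tuple(row.items())
        if key in seen:
            duplicates += 1
        seen.add(key)
    return missing, duplicates

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
missing, duplicates = profile(rows)
print("Missing values per column:", missing)
print("Fully duplicated rows:", duplicates)
```

The same counts can be obtained in Excel, for example with filters or COUNTBLANK; the sketch only shows the idea in code.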

Task 2: Describe Data Cleaning Techniques (5 marks)


Explain common techniques used for data cleaning, including handling missing values, handling
outliers, dealing with inconsistencies, and removing duplicates. Provide examples of how these
techniques can be applied to the Airbnb dataset.
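
The four techniques named above could be sketched in Python as follows. This is one possible approach under stated assumptions, not a prescribed method: the records and field names are hypothetical, duplicates are assumed to share an id, outliers are flagged with the 1.5 × IQR rule, and missing prices are filled with the median of the non-outlier prices. Other choices (for example, removing rows or using the mean) may be equally defensible if explained.

```python
import statistics

# Hypothetical records: (id, room_type, price). Field names are assumptions.
raw = [
    ("1", "Entire home/apt", 250.0),
    ("2", "private room", None),
    ("2", "private room", None),       # duplicate record
    ("3", "ENTIRE HOME/APT", 180.0),   # inconsistent capitalisation
    ("4", "Private room", 95.0),
    ("5", "Entire home/apt", 9999.0),  # suspiciously high price
    ("6", "Private room", 120.0),
    ("7", "Entire home/apt", 150.0),
    ("8", "Entire home/apt", 200.0),
]
listings = [{"id": i, "room_type": t, "price": p} for i, t, p in raw]

def clean(records):
    # 1. Remove duplicates: keep the first record seen for each id.
    unique = {}
    for record in records:
        unique.setdefault(record["id"], dict(record))
    cleaned = list(unique.values())

    # 2. Deal with inconsistencies: normalise capitalisation of room_type.
    for record in cleaned:
        record["room_type"] = record["room_type"].strip().capitalize()

    # 3. Handle outliers: flag prices outside 1.5 * IQR of the quartiles.
    known = [r["price"] for r in cleaned if r["price"] is not None]
    q1, _, q3 = statistics.quantiles(known, n=4)
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    for record in cleaned:
        price = record["price"]
        record["price_outlier"] = price is not None and not (low <= price <= high)

    # 4. Handle missing values: impute with the median of non-outlier prices.
    typical = [r["price"] for r in cleaned
               if r["price"] is not None and not r["price_outlier"]]
    median_price = statistics.median(typical)
    for record in cleaned:
        if record["price"] is None:
            record["price"] = median_price
    return cleaned

cleaned = clean(listings)
for record in cleaned:
    print(record)
```

Flagging outliers rather than deleting them keeps the decision visible, which makes it easier to justify in the report.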
Task 3: Data Pre-processing (8 marks)
Perform data pre-processing to prepare the dataset for analysis. This includes selecting relevant
columns, cleaning the dataset by removing irrelevant information, handling missing values, and
transforming data formats. Students can do the pre-processing phase in Excel. They are required
to provide an explanation of the preprocessed data and upload it to Moodle.
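
For groups that prefer a script to Excel, the pre-processing steps could look roughly like the Python sketch below: select relevant columns, clean price strings, and convert dates to a single format. The sample extract, the column names, the list of columns kept, and the assumption that slash-separated dates are day-first are all hypothetical and should be checked against the actual file.

```python
import csv
import io
from datetime import datetime

# Hypothetical extract; the column names are assumptions for illustration.
RAW = """id,name,host_name,room_type,price,last_review,license
1,Beach house,Ann,Entire home/apt,"$1,250.00",2023-11-02,
2,Studio,Bo,Private room,$95.00,02/12/2023,
3,Farm stay,Cy,Entire home/apt,,,
"""

KEEP = ["id", "room_type", "price", "last_review"]  # assumed relevant columns

def parse_price(text):
    """Turn '$1,250.00' into 1250.0; return None when the cell is empty."""
    text = text.replace("$", "").replace(",", "").strip()
    return float(text) if text else None

def parse_date(text):
    """Normalise dates to YYYY-MM-DD; return None when empty or unrecognised."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):  # day-first is an assumption
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def preprocess(csv_text):
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        slim = {column: row[column] for column in KEEP}  # drop irrelevant columns
        slim["price"] = parse_price(slim["price"])
        slim["last_review"] = parse_date(slim["last_review"])
        rows.append(slim)
    return rows

processed = preprocess(RAW)
for row in processed:
    print(row)
```

Missing prices and dates are left as None here so that the choice of how to handle them (remove, impute, or keep) remains an explicit, explainable decision.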

Report Structure:

Cover Sheet
Title Page
Table of Contents
Common Data Quality Issues (5 marks)
Data Cleaning Techniques (5 marks)
Data Pre-processing (8 marks)
References (2 marks)

This unit requires you to use the APA system of referencing. See Sydney International’s quick
reference guide. It should be used in conjunction with the online tool Academic Writer:
https://extras.apa.org/apastyle/basics-7e/#/.

Report (20 marks): Expected word count is 1,500 words. Students are expected to submit their
assessments via Turnitin on Moodle. The group report is due on Sunday of Week 7, 5th of May.

Presentation (10 marks): 10-minute group presentation delivered in class in Week 8.

Group Formation: Students are responsible for self-assigning themselves to a group in Moodle.
Each group consists of 4 members.
Marking Rubric:

Task: Identify Common Data Quality Issues (5 marks)
Description: Identify and explain at least three common data quality issues that require cleaning, and describe how each issue can impact data analysis.
Poor (1): Incomplete or inaccurate identification of data quality issues and/or no explanation of impact.
Fair (2): Identification of at least three common data quality issues and/or incomplete or inaccurate explanation of impact.
Good (3): Accurate identification of at least three common data quality issues that require cleaning and clear explanation of how each issue can impact data analysis.
Very Good (4): Accurate identification of at least three common data quality issues that require cleaning and detailed explanation of how each issue can impact data analysis.
Excellent (5): Accurate identification of more than three common data quality issues that require cleaning and comprehensive explanation of how each issue can impact data analysis.

Task: Describe Data Cleaning Techniques (5 marks)
Description: Identify and describe at least three common techniques used for data cleaning.
Poor (1): Does not identify any data cleaning techniques and/or provides inaccurate descriptions lacking clarity and understanding.
Fair (2): Identifies one or two data cleaning techniques but with inaccurate descriptions or lacks clarity and understanding.
Good (3): Identifies three common data cleaning techniques with brief or vague descriptions lacking clarity.
Very Good (4): Identifies three common data cleaning techniques with adequate descriptions demonstrating understanding of their purpose and application.
Excellent (5): Correctly identifies and provides clear, thorough, and comprehensive explanations for each of the three common data cleaning techniques, demonstrating exceptional understanding of their purpose and application.

Task: Data Pre-processing (8 marks)
Description: Pre-process the data by selecting relevant columns, handling missing values, cleaning prices, formatting data, and explaining pre-processing decisions.
Poor (1): Significant errors in pre-processing or incomplete pre-processing.
Fair (2): Minor errors in pre-processing or incomplete pre-processing.
Good (3): Proper pre-processing with some errors or lack of explanation for some decisions.
Very Good (4): Proper pre-processing with minimal errors and clear explanation for most decisions.
Excellent (5): Proper pre-processing with no errors and comprehensive explanation for all decisions.

Task: Referencing (2 marks)
Description: Referencing
Poor (1): Inadequate or no referencing provided.
Fair (2): Referencing is present but incomplete or inconsistent.
Good (3): Referencing is mostly complete and consistent, with minor errors.
Very Good (4): Referencing is complete and consistent, with few errors.
Excellent (5): Referencing is thorough, accurate, and consistently follows the specified style guide without errors.

TEQSA: PRV14311 | CRICOS: 03836J
Australia Advance Education Group Pty Ltd. trading as Sydney International School of Technology and Commerce
ABN 74 613 055 440 |ACN 613 055 440
Level 14/233 Castlereagh Street, Sydney NSW 2000
