Assignment #1-Fall 2023


DS 5002 - Data Science: Tools & Techniques

Fall 2023

Assignment # 1
Max Marks: 100
Due Date: September 12, 2023

Q1. Perform experiments using the WEKA tool with three classification algorithms on three different datasets (small,
medium, and large). The datasets must be downloaded from any of the following sources:
1. https://www.kaggle.com/datasets
2. https://www.dataquest.io/blog/free-datasets-for-projects/
3. https://www.data.gov/
4. http://archive.ics.uci.edu/ml/datasets.html
5. https://www.springboard.com/blog/free-public-data-sets-data-science-project/
6. https://www.forbes.com/sites/bernardmarr/2016/02/12/big-data-35-brilliant-and-free-data-sources-for-2016/#68239d7cb54d
7. https://github.com/awesomedata/awesome-public-datasets

Present all your results in a compact form (tables and graphs). Comment intelligently on the results, relating them to the
evaluation approaches and algorithms. The Experimenter interface (rather than the Explorer interface) of WEKA may be a
more efficient way of performing experiments. Read about the WEKA interfaces and implementation in the web tutorial to
gain a better understanding. (30)
Plagiarized assignments will get zero credit and will result in an F grade.
The submission consists of a detailed report along with details of the algorithms used, the experimental setup, the datasets
used, and an analysis of the results (tables, graphs, etc.).
WEKA:
1. http://www.cs.waikato.ac.nz/ml/weka/
2. http://www.cs.waikato.ac.nz/ml/weka/downloading.html
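
To relate results to the evaluation approach in Q1, it can help to see what k-fold cross-validation does to a dataset. The sketch below is a plain Python illustration, not WEKA code; the assignment does not prescribe cross-validation, and the function names `k_fold_indices` and `zero_r_accuracy` are hypothetical. It scores a majority-class baseline (the idea behind WEKA's ZeroR classifier), which gives a floor against which the three chosen algorithms can be compared.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k folds of near-equal size."""
    if k < 2 or k > n:
        raise ValueError("k must satisfy 2 <= k <= n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Striding by k yields k disjoint folds that together cover every index.
    return [idx[i::k] for i in range(k)]


def zero_r_accuracy(labels, k=10, seed=0):
    """Mean k-fold accuracy of a majority-class baseline (ZeroR-style)."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        majority = Counter(labels[j] for j in train).most_common(1)[0][0]
        correct = sum(labels[j] == majority for j in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)
```

Any classifier whose cross-validated accuracy is not clearly above this baseline on a given dataset has learned little beyond the class distribution, which is worth noting when commenting on the results.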

Q2. Draw and describe the data science pipeline for efficient monitoring and management of a Smart City Crime
Management System. Which tools, algorithms, and techniques can be used from an AI, data mining, machine
learning, and data science perspective? Describe the Big Data aspects of “Variety”, “Volume”, and “Velocity”
with examples in this system.
You additionally need to monitor different roads and traffic round the clock, road conditions, location,
suspicious activity/behavior, accidents, theft, robbery, car snatching, ambulances, stampedes, processions, etc.
Explain the use of each module for specific applications with respect to the Lahore Safe City Project. (30)
Reference:
https://psca.gop.pk/
https://www.arup.com/projects/the-lahore-safe-city-project
https://punjabpolice.gov.pk/node/6403

Q3. Present a comparative analysis of open-source as well as commercial data analysis tools [data mining tools, machine learning
tools, statistical analysis tools]. Can we use ChatGPT and/or BARD for Data Science applications? (20)
Q4. How can Data Science be used to monitor, manage, and predict elections? How can we reduce the socio-economic
impact of elections in Pakistan by using ICT/IT? You need to discuss election-related issues, damages, and the impact on
governance, policies, etc. Propose a complete Smart Elections Management System for Pakistan. (20)
GOOD LUCK
