Assignment DMBI 2

This document contains questions about analyzing and summarizing statistical data. It asks the reader to calculate measures of central tendency (mean, median, mode) and dispersion (range, interquartile range) for several datasets. It also asks the reader to compute distances between data points and discuss similarity measures for different data types.

Uploaded by

IMMORTAL'S PLAYZ

Original Title

assignment DMBI 2.docx

Chapter 2

2.1 Give three additional commonly used statistical measures that are not already illustrated in this
chapter for the characterization of data dispersion. Discuss how they can be computed efficiently in
large databases.
2.2 Suppose that the data for analysis includes the attribute age. The age values for the data tuples
are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35,
36, 40, 45, 46, 52, 70.
(a) What is the mean of the data? What is the median?
(b) What is the mode of the data? Comment on the data’s modality (i.e., bimodal, trimodal, etc.).
(c) What is the midrange of the data?
(d) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
(e) Give the five-number summary of the data.
(f) Show a boxplot of the data.
(g) How is a quantile–quantile plot different from a quantile plot?
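The summary statistics asked for in 2.2 (a) to (e) can be checked with a short sketch like the one below. It uses only Python's standard `statistics` module; the variable names are illustrative, and the quartile method (`"inclusive"`) is an assumption, since textbooks and tools use several slightly different quartile conventions.

```python
import statistics

# Age values from exercise 2.2, already in increasing order.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

mean_age = statistics.mean(ages)          # arithmetic mean
median_age = statistics.median(ages)      # middle value (27 values, so the 14th)
modes = statistics.multimode(ages)        # every value with the highest count
midrange = (min(ages) + max(ages)) / 2    # average of smallest and largest value

# Quartiles under the "inclusive" convention (an assumption; other
# conventions can give slightly different Q1 and Q3).
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

# Five-number summary: minimum, Q1, median, Q3, maximum.
five_number = (min(ages), q1, median_age, q3, max(ages))

print(f"mean={mean_age:.2f} median={median_age} modes={modes}")
print(f"midrange={midrange} five-number summary={five_number}")
```

Two modes in `modes` indicate a bimodal data set. The five-number summary is also the input a boxplot needs for part (f).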
2.3 Suppose that the values for a given set of data are grouped into intervals. The intervals and
corresponding frequencies are as follows:
age       frequency
1–5       200
6–15      450
16–20     300
21–50     1500
51–80     700
81–110    44
Compute an approximate median value for the data.
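One common way to approximate the median of grouped data is linear interpolation inside the median interval: median ≈ L1 + ((N/2 − cumulative frequency below the interval) / frequency of the interval) × width. The sketch below applies that formula to the table in 2.3. The function name is hypothetical, and taking the lower boundary as `low - 1` (so 21–50 is treated as the interval from 20 to 50) is an assumption; other boundary conventions shift the result slightly.

```python
def grouped_median(bins):
    """Approximate the median of grouped data by linear interpolation.

    bins: list of (low, high, frequency) tuples with integer bounds,
    sorted by increasing value.
    """
    n = sum(freq for _, _, freq in bins)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0  # total frequency of all intervals below the current one
    for low, high, freq in bins:
        if cumulative + freq >= n / 2:
            # Assumption: the interval "low-high" covers (low - 1, high].
            lower_boundary = low - 1
            width = high - lower_boundary
            return lower_boundary + (n / 2 - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median interval not found")


age_bins = [(1, 5, 200), (6, 15, 450), (16, 20, 300),
            (21, 50, 1500), (51, 80, 700), (81, 110, 44)]

approx_median = grouped_median(age_bins)
print(f"approximate median age: {approx_median:.2f}")
```

Only the interval frequencies are needed, so the estimate can be produced in a single pass over the counts without access to the raw values.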
2.4 Suppose that a hospital tested the age and body fat data for 18 randomly selected adults with
the following results:
age   23    23    27    27    39    41    47    49    50    52    54    54    56    57    58    58    60    61
%fat  9.5   26.5  7.8   17.8  31.4  25.9  27.4  27.2  31.2  34.6  42.5  28.8  33.4  30.2  34.1  32.9  41.2  35.7
(a) Calculate the mean, median, and standard deviation of age and %fat.
(b) Draw the boxplots for age and %fat.
(c) Draw a scatter plot and a q-q plot based on these two variables.
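Part (a) of 2.4 can be worked through with the standard `statistics` module, as in the sketch below. The exercise does not say whether the population or the sample standard deviation is intended, so both are computed; which one to report is an assumption left to the reader. The names `age`, `fat`, and `summary` are illustrative.

```python
import statistics

# Paired observations from exercise 2.4 (18 adults).
age = [23, 23, 27, 27, 39, 41, 47, 49, 50, 52, 54, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 26.5, 7.8, 17.8, 31.4, 25.9, 27.4, 27.2, 31.2,
       34.6, 42.5, 28.8, 33.4, 30.2, 34.1, 32.9, 41.2, 35.7]

summary = {}
for name, values in (("age", age), ("%fat", fat)):
    summary[name] = {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "pstdev": statistics.pstdev(values),  # population form (divide by n)
        "stdev": statistics.stdev(values),    # sample form (divide by n - 1)
    }

for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The sample standard deviation is always slightly larger than the population one for the same data, which is a quick sanity check on the output.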
2.5 Briefly outline how to compute the dissimilarity between objects described by the following:
(a) Nominal attributes (b) Asymmetric binary attributes (c) Numeric attributes (d) Term-frequency
vectors
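As a rough illustration of 2.5, the sketch below implements one widely used measure for three of the attribute types: the mismatch ratio (p − m)/p for nominal attributes, the Jaccard-style ratio (r + s)/(q + r + s) for asymmetric binary attributes (matches on 0 are ignored), and cosine similarity for term-frequency vectors. Numeric attributes are usually handled with Euclidean, Manhattan, or Minkowski distance, as in exercise 2.6. All function names and sample values here are hypothetical.

```python
import math


def nominal_dissimilarity(a, b):
    """(p - m) / p: share of attributes on which the two objects differ."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return (len(a) - matches) / len(a)


def asymmetric_binary_dissimilarity(a, b):
    """(r + s) / (q + r + s): 0/0 matches carry no information and are dropped."""
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # both 1
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # only a is 1
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # only b is 1
    return (r + s) / (q + r + s) if (q + r + s) else 0.0


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up sample objects, for illustration only.
print(nominal_dissimilarity(["red", "A", "x"], ["red", "B", "x"]))
print(asymmetric_binary_dissimilarity([1, 0, 1, 0, 0], [1, 1, 0, 0, 0]))
print(cosine_similarity([5, 0, 3, 0], [3, 0, 2, 0]))
```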
2.6 Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):
(a) Compute the Euclidean distance between the two objects.
(b) Compute the Manhattan distance between the two objects.
(c) Compute the Minkowski distance between the two objects, using q = 3.
(d) Compute the supremum distance between the two objects.
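All four distances in 2.6 are members of the Minkowski family: q = 1 gives Manhattan, q = 2 gives Euclidean, and the supremum distance is the limit as q grows without bound (the largest per-attribute difference). A minimal sketch, with illustrative function names:

```python
def minkowski(a, b, q):
    """Minkowski distance of order q between two equal-length tuples."""
    return sum(abs(x - y) ** q for x, y in zip(a, b)) ** (1 / q)


def supremum(a, b):
    """Supremum (Chebyshev) distance: the largest per-attribute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


obj1 = (22, 1, 42, 10)
obj2 = (20, 0, 36, 8)

euclidean = minkowski(obj1, obj2, 2)   # square root of 45
manhattan = minkowski(obj1, obj2, 1)   # 2 + 1 + 6 + 2
minkowski3 = minkowski(obj1, obj2, 3)  # cube root of 233
sup = supremum(obj1, obj2)             # largest difference, from the third attribute

print(f"Euclidean={euclidean:.4f} Manhattan={manhattan:.0f} "
      f"Minkowski(q=3)={minkowski3:.4f} supremum={sup}")
```

The per-attribute differences are 2, 1, 6, and 2, so the third attribute dominates more and more as q increases.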
2.7 The median is one of the most important holistic measures in data analysis. Propose several
methods for median approximation. Analyze their respective complexity under different parameter
settings and decide to what extent the real value can be approximated. Moreover, suggest a
heuristic strategy to balance between accuracy and complexity and then apply it to all methods you
have given.
2.8 It is important to define or select similarity measures in data analysis. However, there is no
commonly accepted subjective similarity measure. Results can vary depending on the similarity
measures used. Nonetheless, seemingly different similarity measures may be equivalent after some
transformation. Suppose we have the following 2-D data set:
      A1    A2
x1    1.5   1.7
x2    2.0   1.9
x3    1.6   1.8
x4    1.2   1.5
x5    1.5   1.0
(a) Consider the data as 2-D data points. Given a new data point, x = (1.4, 1.6) as a query, rank the
database points based on similarity with the query using Euclidean distance, Manhattan distance,
supremum distance, and cosine similarity.
(b) Normalize the data set to make the norm of each data point equal to 1. Use Euclidean distance on the
transformed data to rank the data points.
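The rankings in 2.8 can be reproduced with the sketch below. The helper names (`rank`, `normalize`, and so on) are illustrative. Scores are rounded to nine decimals before sorting, which is an assumption made so that floating-point noise does not break exact ties (the supremum distance produces ties for this data set); tied points keep their table order.

```python
import math

points = {"x1": (1.5, 1.7), "x2": (2.0, 1.9), "x3": (1.6, 1.8),
          "x4": (1.2, 1.5), "x5": (1.5, 1.0)}
query = (1.4, 1.6)


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def supremum(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.hypot(*v)
    return tuple(c / norm for c in v)


def rank(score, descending=False):
    """Order point names by score; distances ascending, similarities descending."""
    return sorted(points, key=lambda name: round(score(points[name]), 9),
                  reverse=descending)


# Part (a): most similar first under each measure.
by_euclidean = rank(lambda p: math.dist(p, query))
by_manhattan = rank(lambda p: manhattan(p, query))
by_supremum = rank(lambda p: supremum(p, query))
by_cosine = rank(lambda p: cosine_similarity(p, query), descending=True)

# Part (b): Euclidean distance after scaling every vector to unit norm.
unit_query = normalize(query)
by_normalized = rank(lambda p: math.dist(normalize(p), unit_query))

for label, order in [("Euclidean", by_euclidean), ("Manhattan", by_manhattan),
                     ("supremum", by_supremum), ("cosine", by_cosine),
                     ("normalized Euclidean", by_normalized)]:
    print(f"{label:>20}: {order}")
```

For unit vectors, the squared Euclidean distance equals 2 − 2·cos(θ), so the ranking in part (b) matches the cosine ranking. This is one instance of the equivalence after transformation that the exercise mentions.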
