Vac QP
FOR WOMEN
(Autonomous)
PG & Research Department of Computer Science and
Applications
Value Added Course
DATA SCIENCE USING PYTHON (VA2K2364)
(2023-24)
UNIT – I to UNIT – V
1. What is the primary goal of data science?
a) Predicting the future
b) Analyzing historical data
c) Making decisions based on data
d) All of the above
3. Which of the following best describes data in the context of data science?
a) Information that is already processed and analyzed
b) Raw facts and figures that need interpretation
c) Predictive Models
d) None of the above
7. Which Python library is commonly used for data manipulation and analysis in data science?
a) Pygame
b) Matplotlib
c) Pandas
d) NumPy
8. Which library is used for creating data visualizations in Python?
a) Pandas
b) Matplotlib
c) NumPy
d) Scikit-learn
10. What is the term for a function that calls itself within its own code?
a) Recursive function
b) Reusable function
c) Looped function
d) Iterative function
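A minimal sketch of the idea behind Question 10: a function that calls itself on a smaller input until it reaches a base case. The name `factorial` and the input value are illustrative choices, not taken from the course material.

```python
def factorial(n: int) -> int:
    """Return n! by letting the function call itself on n - 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        # Base case: stops the chain of self-calls.
        return 1
    # Recursive case: the function calls itself with a smaller argument.
    return n * factorial(n - 1)


result = factorial(5)
print(result)
```

Without the base case the function would keep calling itself until Python raises a `RecursionError`.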
11. What is the main purpose of linear functions in data science?
a) To model non-linear relationships
b) To represent data using a straight line
c) To handle categorical data
d) To perform statistical tests
12. In a linear function, what does the coefficient of the independent variable represent?
a) Intercept
b) Slope
c) Correlation coefficient
d) Error term
13. Which of the following is NOT a common application of linear functions in data science?
a) Linear regression
b) Time series analysis
c) Principal component analysis
d) Forecasting
14. Which type of plot is most suitable for visualizing the distribution of a single continuous
variable in data science?
a) Bar plot
b) Scatter plot
c) Histogram
d) Pie chart
15. In data science, what type of plot is used to show the relationship between two continuous
variables?
a) Box plot
b) Line plot
c) Scatter plot
d) Stacked bar plot
16. What is the primary purpose of a box plot in data science?
a) To show the distribution of a single variable
b) To compare multiple categories
c) To display correlations
d) To show time series data
17. In the context of linear regression, what does the slope represent?
a) The point where the regression line intersects the y-axis
b) The change in the dependent variable for a one-unit change in the independent
variable
c) The correlation between variables
d) The standard error of the regression
18. Which term represents the point where the regression line intersects the y-axis in a linear
regression equation?
a) Slope
b) Intercept
c) Correlation coefficient
d) Residual
19. In a simple linear regression model, the equation is y = mx + b. What does 'b' represent in
this equation?
a) The slope of the regression line
b) The variance of the dependent variable
c) The intercept of the regression line
d) The residual error
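The equation y = mx + b in Questions 17 to 19 can be illustrated with a small least-squares fit. This is a sketch using only the standard library; the helper name `fit_line` and the sample points are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope m: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept b: the value of y where the fitted line crosses x = 0.
    b = mean_y - m * mean_x
    return m, b


# Hypothetical points lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
m, b = fit_line(xs, ys)
print(f"slope m = {m}, intercept b = {b}")
```

Here m is the change in y for a one-unit change in x, and b is the predicted y at x = 0.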
20. Which type of plot is best suited to visualize the distribution of a single numeric variable?
a) Scatter plot
b) Histogram
c) Box plot
d) Line plot
21. What is statistics?
a. The study of static objects
b. The science of data collection, analysis, and interpretation
c. The study of mathematical equations
d. The science of probability theory
22. Why is statistics important?
a. It helps in making predictions with 100% accuracy
b. It provides tools for dealing with uncertainty and variability in data
c. It is used to create artistic visualizations
d. It is only relevant in the field of mathematics
23. What does the 25th percentile represent in a dataset?
a. The value below which 75% of the data falls
b. The value below which 25% of the data falls
c. The average of the highest and lowest data points
d. The value above which 25% of the data falls
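Percentiles such as the one in Question 23 can be checked numerically. The sketch below uses `statistics.quantiles` from the Python standard library on made-up data (the integers 1 to 100). Different percentile conventions give slightly different cut points, so the exact values depend on the method used.

```python
import statistics

data = list(range(1, 101))  # hypothetical dataset: 1, 2, ..., 100

# n=4 splits the data into quartiles and returns three cut points:
# the 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Share of observations that fall below the 25th percentile.
below_q1 = sum(1 for x in data if x < q1) / len(data)

print("25th percentile:", q1)
print("share of data below it:", below_q1)
print("50th percentile equals the median:", q2 == statistics.median(data))
```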
24. If the median and the 50th percentile are the same, what can you say about the data?
a. The data is normally distributed
b. The data is negatively skewed
c. The data is positively skewed
d. Nothing specific can be concluded
25. What does the standard deviation measure?
a. The central tendency of a dataset
b. The spread or dispersion of data points in a dataset
c. The probability of an event occurring
d. The percentage of data below a certain value
26. A low standard deviation indicates
a. Data points are close to the mean
b. Data points are widely spread from the mean
c. A perfect normal distribution
d. An error in data collection
27. How is variance calculated?
a. It is the square root of the mean
b. It is the average of squared differences from the mean
c. It is the range of data values
d. It is the difference between the maximum and minimum values
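The variance and standard deviation in Questions 25 to 27 can be computed by hand and compared with the standard library. This is a sketch on made-up values, using the population form (divide by n); the sample form divides by n - 1 instead.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample values

mean = sum(data) / len(data)
# Population variance: the average of squared differences from the mean.
variance = sum((x - mean) ** 2 for x in data) / len(data)
# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

# The statistics module computes the same two quantities.
lib_variance = statistics.pvariance(data)
lib_std_dev = statistics.pstdev(data)

# When every value is identical, the variance is zero.
constant_variance = statistics.pvariance([3, 3, 3])

print(mean, variance, std_dev)
```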
28. If two datasets have the same mean but different variances, what can you infer?
a. The datasets have the same degree of variation
b. The datasets have different degrees of variation
c. The datasets have the same distribution
d. The datasets are identical
29. In a normally distributed dataset, where does the median (50th percentile) lie?
a. At the minimum value
b. At the maximum value
c. At the mean (average) value
d. Exactly in the middle of the data range
30. If the variance of a dataset is zero, what can you conclude about the data?
a. All data points are equal
b. The data is normally distributed
c. The data is highly variable
d. The data is a large dataset
31. What does the term "correlation" refer to in the context of data science?
a) A measure of the strength and direction of a linear relationship between two variables.
b) The process of gathering and organizing data for analysis.
c) The prediction of future data trends.
d) The removal of outliers from a dataset.
32. In correlation analysis, what does a correlation coefficient of -0.9 indicate?
a) A strong positive relationship between two variables.
b) A strong negative relationship between two variables.
c) No relationship between two variables.
d) Perfect causality between two variables.
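The correlation coefficient in Questions 31 and 32 can be computed directly from its definition. The sketch below implements the Pearson coefficient with the standard library only; the helper name `pearson_r` and both datasets are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
# y falls as x rises, but not along a perfectly straight line.
ys_noisy = [10, 9, 5, 4, 1]
# y falls along an exact straight line.
ys_exact = [10, 8, 6, 4, 2]

r_noisy = pearson_r(xs, ys_noisy)
r_exact = pearson_r(xs, ys_exact)
print(round(r_noisy, 3), round(r_exact, 3))
```

The sign of the coefficient gives the direction of the linear relationship and its magnitude gives the strength. It does not by itself establish causality.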