Big Data Analytics (BDAG 19-5) : Quiz: GMP - 2019 Term V


Quiz: GMP – 2019 Term V

Big Data Analytics (BDAG 19-5)


Marks: 20 Time: 20 min

For question nos. 1 - 10 identify the correct choice(s) for each question and write it on your
answer script. Each question carries 2 marks. Indicate the correct choice(s) on the question
paper itself and return the question paper after the quiz.

1. Suppose you fitted a logistic regression model on a given dataset and obtained a training
accuracy X and a testing accuracy Y. Now you want to add a few new features to the same data.
Select the option(s) which is/are correct in such a case. Assume all the other features remain
the same.
a. Training accuracy decreases
b. Training accuracy increases or remains the same
c. Testing accuracy decreases
d. Testing accuracy increases or remains the same

2. Which of the following statements is TRUE?
a. Linear regression error values have to be normally distributed but in case of logistic
regression it is not the case.
b. Logistic regression error values have to be normally distributed but in case of linear
regression it is not the case.
c. Both linear regression and logistic regression error values have to be normally
distributed.
d. Both for linear regression and logistic regression, the error values need not be
normally distributed.

3. The logit function is the natural log of the odds. What is the range of the logit function
over the domain x = [0, 1]?
a. (– ∞ , ∞)
b. (0, 1)
c. (0, ∞)
d. (– ∞, 0)
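
For reference, the definition in question 3 can be evaluated numerically. The following is a minimal sketch, assuming the usual form logit(p) = ln(p / (1 - p)); the function name and the sample probabilities are illustrative and not part of the quiz.

```python
import math


def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p); finite only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is finite only for 0 < p < 1")
    return math.log(p / (1.0 - p))


# Illustrative probabilities (assumed values, chosen to span the interval).
for p in (0.001, 0.25, 0.5, 0.75, 0.999):
    print(f"p = {p:<5}  logit(p) = {logit(p):+.4f}")
```

Trying values of p closer to 0 or 1 shows how the output behaves near the ends of the domain.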

4. In neural networks, nonlinear activation functions such as the SIGMOID function
a. Speed up the gradient calculation in backpropagation, as compared to linear units
b. Are applied only to output units of the networks
c. Help the model to learn nonlinear decision boundaries
d. Always produce output values in the range [0, 1]

5. Which of the following assumptions do we make while deriving linear regression
parameters? (Multiple options may be correct. Choose all correct options to get full credit.)
a. The true relationship between the response variable y and the predictor variable x is
linear
b. The model errors are statistically independent
c. The errors are normally distributed with 0 mean and constant standard deviation
d. The predictor x is non-stochastic and is measured error-free

6. For k-fold cross-validation, a smaller value of k implies less variance.
a. The statement is always TRUE
b. The statement is always FALSE
c. The statement is sometimes TRUE, but not always
d. There is no relation between the value of k and the variance in the model

7. A neural network with multiple hidden layers and multiple nodes in each hidden layer using
a suitable activation function can form non-linear boundaries in a classification problem. The
statement is:
a. Always TRUE
b. Always FALSE
c. Depends on the data
d. Depends on the activation function being used

8. You have collected a dataset containing 10,000 rows of tweet text and no other information.
You have created a document term matrix of the data, treating every tweet as a document.
Which of the following is correct with regard to the document-term matrix?
a. Removal of stopwords from the data will affect the dimensionality of the data
b. Normalization of words in the data will reduce dimensionality of the data
c. Both the statements a and b are correct
d. None of the statements a and b are correct
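
For reference, the number of columns in a document-term matrix equals the size of the vocabulary, which depends on how the text is preprocessed. The sketch below is a minimal illustration on three made-up tweets; the helper `vocabulary`, the stopword list, and the normalization rule (lowercasing and stripping punctuation) are assumptions for the example, not part of the quiz.

```python
def vocabulary(docs, stopwords=frozenset(), normalize=False):
    """Return the sorted set of distinct terms, i.e. the columns of the matrix."""
    vocab = set()
    for doc in docs:
        for token in doc.split():
            if normalize:
                # Assumed normalization: lowercase and strip simple punctuation.
                token = token.lower().strip(".,!?#")
            if token and token.lower() not in stopwords:
                vocab.add(token)
    return sorted(vocab)


# Hypothetical tweets and stopword list, for illustration only.
tweets = ["The model is great", "the Model overfits", "Great model !"]
stopwords = frozenset({"the", "is"})

raw_terms = vocabulary(tweets)
no_stop_terms = vocabulary(tweets, stopwords=stopwords)
normalized_terms = vocabulary(tweets, normalize=True)
both_terms = vocabulary(tweets, stopwords=stopwords, normalize=True)

print(len(raw_terms), len(no_stop_terms), len(normalized_terms), len(both_terms))
```

Comparing the four counts shows how each preprocessing step changes the number of columns.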

9. Imagine you are solving a classification problem with two highly imbalanced classes. The
majority class is observed in 99% of the records in the training dataset. Your model has 99%
accuracy on the test data class prediction. Which of the following is TRUE in such a case?
a. Classification accuracy, Precision, and Recall are all good metrics
b. None of Classification accuracy, Precision, and Recall are good metrics
c. Classification accuracy is not a good metric, while Precision and Recall are
d. Classification accuracy is a good metric, while Precision and Recall are not
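
For reference, the three metrics named in question 9 can be computed by hand on a small imbalanced example. The sketch below assumes a hypothetical test set of 1,000 records (990 majority, 10 minority) and a hypothetical model that always predicts the majority class; the function name and the convention of returning 0.0 when a ratio is undefined are also assumptions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) with `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical data: label 1 is the minority class, label 0 the majority class.
y_true = [0] * 990 + [1] * 10
y_pred = [0] * 1000  # a model that never predicts the minority class

accuracy, precision, recall = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```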

10. Which of the following is a characteristic of an overfitted model?
a. Both training accuracy and validation accuracy are very high but test accuracy is not
satisfactory
b. Training accuracy increases with the number of iterations used in building the model,
while validation accuracy starts falling after a certain number of iterations.
c. Training accuracy starts decreasing after a certain number of iterations used in
building the model, but the validation accuracy consistently increases
d. None of the above
