Assessment - 2

Question 1- A company manager says that the average balance on their credit cards is $500. Do
you think that this assertion is justified? Use a one-sample t-test to draw your conclusion.

Answer- To check if the manager's statement that the average balance on their credit cards is
$500 is true, we can do a one-sample t-test. This test helps us compare the average balance
from a sample to the claimed average of $500.
• Null Hypothesis: Average balance of credit card is $500
• Alternate Hypothesis: Average balance of credit card is not $500

t-Test: Two-Sample Assuming Unequal Variances

Based on the one-sample t-test results, we cannot reject the null hypothesis. The data do not
provide enough evidence to dispute the manager's claim that the average balance on the credit
cards is $500, so the manager's assertion is consistent with the sample data.
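
As a minimal sketch of the calculation behind this test, the Python snippet below computes a one-sample t-statistic using only the standard library. The sample balances, the function name one_sample_t, and the critical value for 7 degrees of freedom are illustrative assumptions, not values from the assessment dataset.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical balances, for illustration only (not the assessment data).
balances = [520, 480, 510, 490, 530, 470, 540, 500]
t_stat, df = one_sample_t(balances, mu0=500)

# Two-tailed 5% critical value for df = 7, taken from a t-table.
T_CRIT = 2.365
reject_null = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

With the real data, the same function would be called on the full balance column, and the critical value would be the one for that sample's degrees of freedom.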

Question 2- Is there a difference between men and women as far as average balance is
concerned? Use a two-sample t-test to draw your conclusion.

Answer- To determine if there is a significant difference between the average balances of men
and women, we can perform a two-sample t-test. This test will compare the means of two
independent samples (men and women) to see if they differ significantly.

• Null Hypothesis: There is no difference in the average balances between men and
women.
• Alternative Hypothesis: There is a difference in the average balances between men and
women.

• Two-Tailed Test: The absolute t-value (|-0.428| = 0.428) is less than the critical t-value
(1.966), and the p-value (0.669) is greater than 0.05. We cannot reject the null hypothesis.
• One-Tailed Test: The absolute t-value (0.428) is less than the critical t-value (1.649), and
the p-value (0.334) is greater than 0.05. We cannot reject the null hypothesis.

Therefore, the two-sample t-test results show no significant difference in average balances
between men and women.
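
The two-sample comparison can be sketched in Python as follows. This is a hedged illustration of a t-test assuming unequal variances (Welch's test) using only the standard library; the two groups of balances and the function name welch_t are hypothetical, and only the last three lines use numbers reported in the answer above.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for a two-sample t-test assuming unequal variances."""
    va = statistics.variance(a) / len(a)  # squared standard error, group a
    vb = statistics.variance(b) / len(b)  # squared standard error, group b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical balances, for illustration only (not the assessment data).
group_a = [100, 200, 300, 400, 500]
group_b = [200, 400, 600, 800, 1000]
t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")

# Two-tailed decision rule applied to the values reported in the answer above.
reported_t, reported_crit = -0.428, 1.966
reject_null = abs(reported_t) > reported_crit
print("reject H0:", reject_null)
```

The decision compares the absolute value of t with the critical value, which is why the sign of the reported t-value does not affect the two-tailed conclusion.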

Question 3- Is there a difference between students and non-students as far as average balance
is concerned? Use a two-sample t-test to draw your conclusion.

Answer- To determine if there is a significant difference between the average balance of
students and non-students, we can perform a two-sample t-test.

• Null Hypothesis: There is no difference in the average balance between students and
non-students.
• Alternative Hypothesis: There is a difference in the average balance between students
and non-students.

Since the t-statistic is much higher than the critical value and the p-value is well below 0.05, we
reject the null hypothesis. This shows that students have a significantly higher average balance
than non-students.

Question 4- It is generally assumed that if there are more credit cards then the balance on the
cards will be more. Based on this dataset, do you think this is true? Calculate a correlation
coefficient and show a scatter plot to support your answer.

Answer- No, the data do not support the assumption. There is almost no correlation between the
number of credit cards and the balance on the cards.
Correlation coefficient:

The correlation coefficient is very low, indicating almost no relationship between the number of
cards and the card balance.

Scatter Plot:

The scatter plot shows very little trend.
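
For reference, a correlation coefficient of this kind can be computed as in the sketch below. The data are hypothetical and chosen so that the correlation is exactly zero; the function name pearson_r is an assumption for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only (not the assessment data).
cards = [1, 2, 3, 4, 5]
balance = [500, 300, 600, 300, 500]
r = pearson_r(cards, balance)
print(f"r = {r:.3f}")
```

A value near zero means there is no linear trend, although it does not rule out a non-linear pattern, which is why the scatter plot is checked as well.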

Question 5- Examine whether the following demographic variables influence balance: (a) age,
(b) years of education, (c) marital status. For age and years of education, use scatter plots to
depict their relationship with balance and calculate the correlation coefficient. For the
relationship between marital status and balance, use a two-sample t-test to draw your
conclusion.

Answer 5a- The correlation coefficient is almost zero, which implies there is almost no linear
relationship between age and credit balance.

Scatter Plot:

Answer 5b- The correlation coefficient is almost zero, which implies there is almost no linear
relationship between years of education and credit balance.

Scatter Plot:

Answer 5c-
• Null Hypothesis: The average credit card balance is the same for single and married
cardholders.
• Alternate Hypothesis: The average credit card balance is different for single and married
cardholders.

The p-value is greater than 0.05, so the null hypothesis cannot be rejected. This means marital
status makes no significant difference to the balance.

Question 6- “Ethnicity of the cardholder does not matter as far as balance is concerned.” Carry
out an analysis of variance (ANOVA) and discuss whether this statement is supported by the
data or not.

Answer-
• Null Hypothesis: The ethnicity of the cardholder has no impact on the balance; it
remains the same regardless of ethnicity.
• Alternate Hypothesis: The balance varies based on the cardholder's ethnicity.

The ANOVA results show that the p-value is greater than 0.05, indicating that ethnicity has no
significant impact on balance.
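
A one-way ANOVA F-statistic can be sketched as below. The three groups are hypothetical stand-ins for ethnicity categories, the function name one_way_anova_f is an assumption, and the critical value is the 5% table value for (2, 6) degrees of freedom, which applies only to this toy example.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical balances for three groups, for illustration only.
groups = [[100, 200, 300], [200, 300, 400], [300, 400, 500]]
f_stat, df_b, df_w = one_way_anova_f(groups)

F_CRIT = 5.14  # 5% critical value for (2, 6) degrees of freedom, from an F-table
reject_null = f_stat > F_CRIT
print(f"F = {f_stat:.2f}, df = ({df_b}, {df_w}), reject H0: {reject_null}")
```

Comparing F with the critical value is equivalent to comparing the p-value with 0.05, which is the form of the decision used in the answer.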

Question 7- A general principle that credit card companies often follow is to assign a higher
credit limit to people with a higher credit rating. Does the data show that this principle is being
followed?

Answer- Yes, the principle is followed.

Correlation coefficient

The correlation coefficient indicates a strong positive relationship between credit rating and
credit limit.


Scatter plot:

Credit card companies often assign higher credit limits to people with higher credit ratings. This
practice is supported in our case based on the correlation.

Question 8- Run a simple linear regression of balance on the credit limit. (Here credit limit is the
X and the balance is the Y). Report the coefficients and the R-squared. Show a scatter plot.
State inference.

Answer- Simple linear regression


Scatter Plot:

The credit limit is a significant predictor of balance. The R-squared value is 0.74, meaning that
the credit limit explains about 74% of the variance in balance.
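
The coefficients and R-squared of a simple linear regression can be computed as in the following sketch. The limit and balance values are hypothetical, and the function name simple_linear_regression is an assumption for illustration.

```python
def simple_linear_regression(xs, ys):
    """Return (slope, intercept, r_squared) for the least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_resid / ss_total


# Hypothetical data, for illustration only (not the assessment data).
limit = [1000, 2000, 3000, 4000, 5000]
balance = [200, 400, 500, 400, 500]
slope, intercept, r_squared = simple_linear_regression(limit, balance)
print(f"slope = {slope:.3f}, intercept = {intercept:.1f}, R^2 = {r_squared:.2f}")
```

In a simple regression, R-squared equals the square of the correlation coefficient between X and Y, so the two measures always tell the same story.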

Question 9- Run a simple linear regression of balance (Y) on credit rating (X). Report the
coefficients and R-squared. Show a scatter plot. State inference.

Answer- Simple regression analysis:


Scatter Plot:

Credit rating is strongly associated with credit balance: the regression shows a significant
relationship between the two factors.

Question 10- Consider your findings in questions 8-9. Discuss business mechanisms to increase
or decrease the balance on credit cards. Try to quantify your answers. In this context, focus on
possible specific strategies using variables in Q8 and Q9 that the business could adopt to
increase the balance on credit cards.
Answer-
• The credit card rating and credit limit are clearly influential factors affecting the credit
card balance, demonstrating strong correlations and serving as significant predictors.
Higher ratings and limits tend to result in higher balances, underscoring their
importance in determining balance levels.
• According to this analysis, individuals with higher ratings and larger credit limits tend to
have increased balances, whereas those with lower ratings and smaller credit limits
typically experience decreased balances.

Question 11- The credit limit is provided as a consolidated amount for all the credit cards the
cardholder has. Run a multiple linear regression of Balance (Y) on Limit and Cards as two X
variables. Report the coefficients. Discuss the effect on the balance of (a) increasing the credit
limit on the same number of cards and (b) increasing the number of cards without altering the
total credit limit.

Answer- Multiple regression analysis:

The credit limit and number of cards are both highly influential predictors of credit balance,
with substantial impacts.

The multiple correlation coefficient (multiple R) is 0.865, and the R-squared value is 0.748,
indicating a strong relationship. Specifically, (a) with the number of cards held fixed, an
increase of one unit in credit limit (measured on a larger scale compared to cards) is associated
with a balance increase of approximately $0.17, with a standard error of 34.2.
(b) Similarly, with the total credit limit held fixed, adding one additional card is linked to an
average balance increase of $26.03, underscoring the positive effect of increasing the number of
cards on balance.
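
To make the two effects concrete, the sketch below applies the reported coefficients to two scenarios. The size of the limit increase (1000 units) and the function name predicted_balance_change are hypothetical choices for illustration.

```python
# Coefficients as reported in the answer above.
B_LIMIT = 0.17   # change in balance per one-unit increase in credit limit
B_CARDS = 26.03  # change in balance per additional card


def predicted_balance_change(delta_limit=0.0, delta_cards=0):
    """Predicted change in balance for given changes in limit and cards."""
    return B_LIMIT * delta_limit + B_CARDS * delta_cards


# (a) Hypothetical scenario: raise the limit by 1000 units, same number of cards.
change_a = predicted_balance_change(delta_limit=1000)
# (b) Add one card without altering the total credit limit.
change_b = predicted_balance_change(delta_cards=1)
print(f"(a) {change_a:.2f}  (b) {change_b:.2f}")
```

Each coefficient is read with the other variable held fixed, so scenario (a) changes only the limit and scenario (b) changes only the number of cards.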

Question 12- Run a simple linear regression equation with Income as X and Balance as Y. Report
the coefficients. Is the coefficient of Income significantly different from zero? What does this
say about the effect of income on balance?
Answer-

The correlation coefficient between the two variables is 0.46. According to the regression
analysis, the coefficient for income is 6.048, with a confidence interval from 4.90 to 7.18. This
suggests that increasing income by one unit is associated with a predicted balance increase of
approximately $6.05. The t-statistic is 10.4, meaning the coefficient lies 10.4 standard errors
away from zero, so the coefficient of income is significantly different from zero and income is a
significant predictor of balance.

Question 13- Based on the equation derived in question 12, what is the estimated balance for a
person with an income of USD 100k per year?
Answer-
Using the derived equation Y = 6.048X + 246.51, where X represents income and Y represents
balance:

For a person earning $100,000 annually (X = 100, which implies that income is measured in
thousands of dollars), the estimated balance is approximately $851.35.
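
The arithmetic can be checked with the short sketch below, which uses the income coefficient of 6.048 reported in Question 12 and the intercept of 246.51. It assumes income is entered in thousands of dollars, so USD 100k corresponds to X = 100; the function name estimated_balance is hypothetical.

```python
SLOPE = 6.048       # coefficient of income, as reported in Question 12
INTERCEPT = 246.51  # intercept of the fitted equation


def estimated_balance(income_thousands):
    """Estimated balance for an income given in thousands of dollars."""
    return SLOPE * income_thousands + INTERCEPT


estimate = estimated_balance(100)  # income of USD 100k, assuming X is in thousands
print(f"{estimate:.2f}")
```

With these rounded coefficients the result is about 851.31, a few cents away from the reported $851.35, which would be consistent with the reported figure having been computed from unrounded coefficients.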

Question 14- Based on the dataset, explore the relationship between credit card balance (Y)
and (a) Income, (b) Age, (c) Education, (d) Limit, and (e) Rating as X variables. Estimate a multiple
linear regression model and report the statistical significance of each of these variables.
Answer- Multiple regression model:
Explanation:

Income and rating are identified as the crucial predictors of credit card balance, supported by
their statistically significant p-values in the multiple regression analysis. The full model
explains 87.7% of the variance in balance. When the model is refined to include only income and
rating, the explained variance remains high at 87.5%, indicating their predominant influence.

Further analysis of residuals reveals distinctive patterns: income residuals predominantly skew
negative, especially among lower income groups, suggesting a non-linear relationship. In
contrast, rating residuals exhibit a more balanced distribution across different rating levels,
indicating a more consistent impact on balance.

In summary, income and rating emerge as the primary drivers influencing credit card balance,
while variables such as limit, age, and education show no significant contribution.
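
For readers who want to reproduce this kind of model outside a spreadsheet, the sketch below fits a multiple linear regression by solving the normal equations with the Python standard library. It is an illustration under stated assumptions: the function name ols and the tiny dataset are hypothetical, and it returns only the coefficients, not the p-values discussed above.

```python
def ols(rows, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # add intercept column
    p = len(X[0])
    # Build the augmented system (X'X | X'y).
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    # A perfectly collinear predictor would cause a division by zero here.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
predictors = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5)]
response = [6, 8, 13, 18, 26]
coefficients = ols(predictors, response)
print([round(c, 6) for c in coefficients])
```

Because the toy response was built from known coefficients, the fit recovers them, which is a quick way to sanity-check the routine before applying it to real columns such as income and rating.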
