Ba Capstone Final K
INSTITUTE OF MANAGEMENT
CA 17, 36th Cross, 26th Main, 4th “T” Block, Jayanagar, Bangalore-560041.
(Autonomous Institution Affiliated to BCU)
CAPSTONE PROJECT
Semester II Batch
2021-2023
1. Is there statistical evidence to support that the cost of treatment and body weight are
related? Support your answer with all necessary tests.
We have to develop a simple linear regression (SLR) model and validate it to check whether there
is a linear relationship between the cost of treatment and body weight. Let Y = cost of treatment
and X = body weight of the patient. The corresponding SLR model is given by
Y = β0 + β1 x Body weight
The dataset 'DAD Hospital Data.xls' has the cost of treatment and weight for 120 patients admitted
to the DAD Hospital. The regression output for the model, obtained using the software Jamovi, is
shown in the tables below.
That is, the relationship between the cost of treatment and the body weight is given by
Y= 129426 + 1847 x Body weight.
The p-value for the coefficient 'Body Weight' is 0.010, which is less than 0.05; thus, the
independent variable body weight is significant at the 0.05 significance level (95% confidence
level). From the model, we can interpret that the cost of treatment increases at the rate of
Rs. 1847 per 1 kg increase in body weight. However, before we accept the model, we must check the
important assumptions of normality and homoscedasticity of the residuals. The figure below is the
P-P plot, which shows the observed cumulative probability of the standardized residuals against the
expected cumulative probability of a normal distribution (diagonal line). The next figure is a plot
of the standardized residuals against the standardized response variable (Y). The plot of the
residuals against the independent variable values can also be used for determining the existence of
heteroscedasticity.
PLOTS
The model in the equation Y= 129426 + 1847 x Body weight may be used for predicting the cost
of treatment.
2. Comment on the value of R-square. Does a low R-square value indicate that the model is
not useful?
The R-square value for the model Y= 129426 + 1847 x Body weight is only 0.121. That is, the
model explains only 12.1% of the variation in the value of Y. Low R-square values do not
imply that the model is useless. The primary objective of regression is to find whether there is a
relationship between the response variable (cost of treatment) and the independent variable (body
weight of the patient). The regression model establishes this relationship since the p-value of the
weight coefficient is less than 0.05 and both the normality and homoscedasticity assumptions are
reasonably satisfied. A low R-square value may create problems when we use the model for
prediction since the error is likely to be higher.
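To make the meaning of R-square concrete, the sketch below computes it as 1 - SSE/SST, the share of the variation in Y that the fitted values account for. The function name `r_squared` and the toy numbers are illustrative assumptions only; they are not taken from 'DAD Hospital Data.xls'.

```python
def r_squared(observed, predicted):
    """Return 1 - SSE/SST for paired observed and predicted values."""
    mean_y = sum(observed) / len(observed)
    # SSE: variation left unexplained by the model (squared residuals).
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # SST: total variation of the observed values around their mean.
    sst = sum((y - mean_y) ** 2 for y in observed)
    return 1 - sse / sst


# Made-up toy values for illustration, NOT the hospital data.
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(toy_observed, toy_predicted), 4))
```

With the hospital data, the same calculation on the observed and fitted costs would be expected to give the 0.121 reported by Jamovi.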
3. Interpret the value of the coefficient of weight in the model developed in question 1. What
will be the average difference in cost of treatment for patients weighing 50 kg and patients
weighing 51 kg?
The regression model is given by
Y = 129426 + 1847 x Body weight
The coefficient for weight is 1847. That is, for every 1 kg increase in weight, the cost of
treatment increases by Rs. 1847 on average. The average costs of treatment for patients weighing
50 kg and 51 kg are given by
Y(50) = 129426 + 1847 x 50 = 2,21,776
Y(51) = 129426 + 1847 x 51 = 2,23,623
The difference in average cost of treatment for patients weighing 50 kg and 51 kg is Rs. 1847.
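The arithmetic behind this answer can be checked with a short sketch that evaluates the fitted model from question 1 at a few body weights. The names `INTERCEPT`, `SLOPE` and `predict_cost` are illustrative choices; the coefficients are the ones reported in the text.

```python
# Coefficients of the fitted model Y = 129426 + 1847 x Body weight.
INTERCEPT = 129426
SLOPE = 1847


def predict_cost(weight_kg):
    """Average treatment cost (Rs.) predicted for a given body weight in kg."""
    return INTERCEPT + SLOPE * weight_kg


for weight in (49, 50, 51):
    print(weight, predict_cost(weight))

# In a linear model, a 1 kg change in weight always shifts the prediction by the slope.
print(predict_cost(51) - predict_cost(50))
```

Because the model is linear, the difference between any two weights 1 kg apart equals the slope, whichever pair of weights is chosen.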
4. Is it possible to conclude that a patient weighing 50 kg is likely to spend at least INR 500
more than the one weighing 49 kg at 90% confidence level?
The treatment cost of a patient weighing 49 kg is 2,19,929.
Assume that the treatment cost of a patient weighing 50 kg is at least 500 more than that of a
patient weighing 49 kg. It should then be at least 2,20,429.
If the null hypothesis is true, then the difference between the average treatment costs for a patient
weighing 50 kg and a patient weighing 49 kg will be less than 500. The corresponding test statistic
is t = 1.406.
The t-critical value for alpha = 0.1 and df = 118 is 1.2888. Since the t-statistic is greater than
t-critical, we reject the null hypothesis. The p-value (in a one-tailed t-test) corresponding to the
t-value of 1.406 is 0.0811. Since the p-value is less than 0.1 (10% significance), we reject the
null hypothesis. Thus, we conclude that the value of the slope coefficient is greater than 0.003182
and so the difference between the costs of treatment for patients weighing 50 kg and 49 kg is at
least 500 at 10% significance (or 90% confidence).
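The critical value and p-value quoted above can be checked approximately without a statistics package. The sketch below is one possible approach, not the method used in the report: it integrates the Student's t density numerically with Simpson's rule and finds the critical value by bisection. The function names are illustrative, and df = 118 assumes 120 observations and two estimated coefficients.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """One-tailed p-value P(T >= t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so half of the probability lies above zero.
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Bisection search for the t value whose upper-tail probability is alpha."""
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.10, 118), 4))
print(round(t_upper_tail(1.406, 118), 4))
```

The two printed values should be close to the 1.2888 and 0.0811 reported in the answer; small differences in the last digit can come from rounding.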
5. DAD hospital is planning to introduce a package price for the treatment and they would like
to charge INR 3,00,000 for the patients weighing 50 kg. That is, the patient is charged INR
3,00,000 irrespective of the actual treatment cost. What is the probability that the treatment
cost is likely to exceed the package price?
Note that in the population, ln(Y) follows a normal distribution with mean 11.8040 + 0.0074x
(= 12.174 for x = 50) and standard deviation 0.3975. If the hospital is planning to charge 300,000
for patients weighing 50 kg, we have to find the probability that the actual cost of treatment would
be higher than 300,000. Note that ln(300,000) = 12.61154. So, the probability that the cost of
treatment will exceed ₹300,000 is given by P[ln(Y) ≥ 12.61154]. We know that the mean is 12.174 and
the corresponding standard deviation is 0.3975. Using the normal distribution, we get the
corresponding probability as 0.1355 [in Microsoft Excel, the probability
= 1 - NORM.DIST(12.61154, 12.174, 0.3975, TRUE)]. If DAD charges 300,000 for a patient weighing
50 kg, then in 13.55% of the cases the actual cost of treatment is likely to exceed 300,000.
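The same probability can be reproduced outside Excel. The sketch below uses `NormalDist` from Python's standard `statistics` module; the constant and function names are illustrative, and the mean and standard deviation of ln(Y) are the values stated in the answer.

```python
import math
from statistics import NormalDist

# Parameters of ln(Y) for a patient weighing 50 kg, as given in the text.
MEAN_LOG_COST = 11.8040 + 0.0074 * 50
SD_LOG_COST = 0.3975
PACKAGE_PRICE = 300_000


def prob_cost_exceeds(price, mean_log, sd_log):
    """P(Y > price) when ln(Y) is normal with the given mean and standard deviation."""
    return 1 - NormalDist(mean_log, sd_log).cdf(math.log(price))


print(round(math.log(PACKAGE_PRICE), 5))
print(round(prob_cost_exceeds(PACKAGE_PRICE, MEAN_LOG_COST, SD_LOG_COST), 4))
```

Working on the log scale keeps the calculation a plain normal tail probability, which is why the package price is converted with the natural logarithm before the comparison.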