R Programming for Categorical Data Analysis: Class 3

Class 3: Chapter 2
R section of EPH 705

Min Lu
Division of Biostatistics, University of Miami
Spring 2017

Overview

1. Objectives:
   - Odds ratio and relative risk
   - Pearson chi-squared test
   - Test of trend for ordinal data
2. R Example
3. Exercise

Pearson chi-squared test

Pearson chi-squared test of goodness-of-fit of a set of multinomial probabilities: We begin with a sample of $N$ items, each of which has been observed to fall into one of $k$ categories. We define $x = (x_1, x_2, \ldots, x_k)$ as the observed numbers of items in each cell, so that $\sum_{i=1}^{k} x_i = N$.

Pearson $\chi^2$ test of multinomial probabilities

Test $H_0: \pi = (\pi_1, \pi_2, \ldots, \pi_k)$, where $\sum_{i=1}^{k} \pi_i = 1$, through

$$\chi^2 = \sum_{i=1}^{k} \frac{(x_i - E_i)^2}{E_i}, \quad \text{where } E_i = N \pi_i .$$

R Code for confidence interval of odds ratio

a: the number of individuals who are exposed and have the disease.
b: the number of individuals who have the disease but are not exposed.
c: the number of individuals who are exposed but healthy.
d: the number of individuals who are neither exposed nor diseased.

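As a cross-check on the package call that follows, the odds ratio and a Wald interval on the log scale can be computed directly from the four cell counts. This is only a minimal sketch: the variable names `n_a` to `n_d` are illustrative, the counts are the ones used in the fmsb example, and the log-scale Wald interval is an assumption about the method (it appears to agree with the interval that example reports).

```r
# Cell counts, named to avoid masking base R's c() function
n_a <- 5   # exposed, diseased
n_b <- 10  # not exposed, diseased
n_c <- 85  # exposed, healthy
n_d <- 80  # not exposed, healthy

# Odds ratio: (a * d) / (b * c)
or_hat <- (n_a * n_d) / (n_b * n_c)

# Standard error of log(OR) and a 95% Wald interval, back-transformed
se_log_or <- sqrt(1 / n_a + 1 / n_b + 1 / n_c + 1 / n_d)
z <- qnorm(0.975)
ci <- exp(log(or_hat) + c(-1, 1) * z * se_log_or)

print(or_hat)
print(ci)
```
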
    library(fmsb)
    res <- oddsratio(a = 5, b = 10, c = 85, d = 80, conf.level = 0.95)

    ##            Disease Nondisease Total
    ## Exposed          5         85    90
    ## Nonexposed      10         80    90
    ## Total           15        165   180

    res

    ##
    ##  Odds ratio estimate and its significance probability
    ##
    ## data:  5 10 85 80
    ## p-value = 0.1787
    ## 95 percent confidence interval:
    ##  0.1541455 1.4366513
    ## sample estimates:
    ## [1] 0.4705882

R Code for confidence interval of relative risk (1)

X: the number of disease occurrences in the exposed cohort.
Y: the number of disease occurrences in the non-exposed cohort.
m1: the number of individuals in the exposed cohort.
m2: the number of individuals in the non-exposed cohort.

    library(fmsb)
    res <- riskratio(X = 5, Y = 10, m1 = 90, m2 = 90, conf.level = 0.95)

    ##            Disease Nondisease Total
    ## Exposed          5         85    90
    ## Nonexposed      10         80    90

    print(res)

    ##
    ##  Risk ratio estimate and its significance probability
    ##
    ## data:  5 10 90 90
    ## p-value = 0.1787
    ## 95 percent confidence interval:
    ##  0.1779702 1.4047292
    ## sample estimates:
    ## [1] 0.5

R Code for confidence interval of relative risk (2)

    # fmsb and epitools both define riskratio(), so unload fmsb first
    detach("package:fmsb", unload = TRUE)
    library(epitools)

    ## Warning: package 'epitools' was built under R version 3.6.3

    tapw <- c("Intermediate", "Highest")
    outc <- c("Case", "Control")
    dat <- matrix(c(2, 29, 35, 64), 2, 2, byrow = TRUE)
    dimnames(dat) <- list(`Tap water exposure` = tapw, Outcome = outc)
    # rev = "c" reverses the column order so that "Case" is the last column
    riskratio(dat, rev = "c", correction = TRUE)

    ## $data
    ##                   Outcome
    ## Tap water exposure Control Case Total
    ##       Intermediate      29    2    31
    ##       Highest           64   35    99
    ##       Total             93   37   130
    ##
    ## $measure
    ##                   risk ratio with 95% C.I.
    ## Tap water exposure estimate    lower    upper
    ##       Intermediate 1.000000       NA       NA
    ##       Highest      5.479798 1.397111 21.49306
    ##
    ## $p.value
    ##                   two-sided
    ## Tap water exposure  midp.exact fisher.exact chi.square
    ##       Intermediate          NA           NA         NA
    ##       Highest      0.001018658  0.001261178 0.00392597
    ##
    ## $correction
    ## [1] TRUE
    ##
    ## attr(,"method")
    ## [1] "Unconditional MLE & normal approximation (Wald) CI"

R Code for Pearson chi-squared test

The function chisq.test can perform the Pearson chi-squared test of goodness-of-fit of a set of multinomial probabilities. For example, with 3 categories, hypothesized probabilities (0.4, 0.3, 0.3), and observed counts (12, 8, 10):

    x <- c(12, 8, 10)
    p <- c(0.4, 0.3, 0.3)
    chisq.test(x, p = p)

    ##
    ##  Chi-squared test for given probabilities
    ##
    ## data:  x
    ## X-squared = 0.22222, df = 2, p-value = 0.8948

Test for trend in proportions

    # Arguments of prop.trend.test():
    #   x      number of events
    #   n      number of trials
    #   score  group score
    # Note: the variable x below holds the group scores, not the event counts.
    x <- c(0, 0.5, 1.5, 4, 7)
    no <- c(17066, 14464, 788, 126, 37)
    yes <- c(48, 38, 5, 1, 1)
    patients <- no + yes
    chiresult <- prop.trend.test(x = yes, n = patients, score = x)
    chiresult

    ##
    ##  Chi-squared Test for Trend in Proportions
    ##
    ## data:  yes out of patients ,
    ##  using scores: 0 0.5 1.5 4 7
    ## X-squared = 6.5701, df = 1, p-value = 0.01037

    # calculate r
    r <- sqrt(chiresult$statistic/(sum(patients) - 1))
    print(as.numeric(r))

    ## [1] 0.01420229

In class exercise

In Canadian Journal of Sociology 15 (1), 1990, page 47, Smith claimed that the sample showed a close match between the age distributions of women in the sample and all women in Toronto between the ages of 20 and 44.
This is especially true in the youngest and oldest age brackets.

Table 1: Sample and Census Age Distribution of Toronto Women.

    Age      Number in Sample   Percent in Census
    20-24                 103                  18
    25-34                 216                  50
    35-44                 171                  32
    Total                 490                 100

Using the data in Table 1, conduct a chi-square goodness-of-fit test to determine whether the sample does provide a good match to the known age distribution of Toronto women. Use the 0.05 level of significance.

Take home exercise

To test the hypothesis that a random sample of 100 students majoring in Public Health has been drawn from a population in which men and women are equal in frequency, the observed numbers of men and women would be compared to the theoretical frequencies of 50 men and 50 women. The sample contained 39 men and 61 women. Can we still conclude that men and women are equally frequent among the students?

Class over
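Returning to the goodness-of-fit example from the Pearson chi-squared slide, the statistic $\chi^2 = \sum_i (x_i - E_i)^2 / E_i$ with $E_i = N \pi_i$ can also be computed by hand. The sketch below reuses the observed counts and hypothesized probabilities from that example. The names `N`, `E`, `stat`, `df`, and `p_value` are illustrative, and it assumes the usual chi-squared reference distribution with $k - 1$ degrees of freedom. It should reproduce the X-squared = 0.22222 and p-value = 0.8948 reported by chisq.test.

```r
# Observed counts and hypothesized probabilities from the chisq.test example
x <- c(12, 8, 10)
p <- c(0.4, 0.3, 0.3)

N <- sum(x)                 # total sample size
E <- N * p                  # expected counts under H0
stat <- sum((x - E)^2 / E)  # Pearson chi-squared statistic
df <- length(x) - 1         # k - 1 degrees of freedom

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(stat, df = df, lower.tail = FALSE)

print(c(statistic = stat, df = df, p.value = p_value))
```
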