All Questions

2 votes
1 answer
30 views

Alternative to R-squared when calculating glm model using survey package

I'm writing my master's thesis using the European Social Survey. The data require weighting, so I have to use the survey package and its ...
Moritary's user avatar
1 vote
0 answers
45 views

Test (quantify) the association between categorical and numerical variable in survey data

I've just started working with survey data and I want to test independence between a numerical variable and a categorical one. I've heard of weighted ANOVA, but how can I test the normality and ...
Benco Myo's user avatar
0 votes
0 answers
15 views

How to Calculate Cumulative Incidence for Each Case ID in Competing Risks Analysis Using R?

I’m working with a dataset in R and trying to calculate the cumulative incidence for each case ID in the presence of competing risks. My dataset looks like this: ...
Ali Roghani's user avatar
0 votes
0 answers
12 views

Pooling survey data with panel households

I am analyzing some data from a national survey. I am working with two sets of data for the same variable (wealth), and some of those households are panel households (i.e. interviewed both times). Now I want to ...
Tom's user avatar
  • 1
1 vote
0 answers
18 views

Handling Missing Groups in Stratified Sampling for Weighted Mean Calculation

I conducted a statistical survey using a stratified sample to measure the knowledge of Italian students on a specific topic. The population was stratified according to the following categories: Area (...
Erik De Luca's user avatar
0 votes
1 answer
23 views

Defining a variable based on two different variables with different weighting schemes in NHANES

I want to define a derived variable (Let's call it $Z$) using two original variables ($X$ and $Y$) in NHANES. If any of $X$ or $Y$ meets the criteria, the value for $Z$ should be 1; otherwise, it ...
Abdullah Abdelaziz's user avatar
0 votes
0 answers
13 views

Whether to specify ddf in confint.svyglm()

I have a main question about whether to specify ddf in confint.svyglm(). Using the two specifications below generates slightly ...
Guoqiang Zhang's user avatar
0 votes
0 answers
29 views

Weighting Data Issue

I am looking at e-cig prevalence within a city. I used surveys to collect data from residents, and I have a query about weighting the data. I have made the assumption, due to over and underrepresentation ...
Aidan's user avatar
  • 1
2 votes
0 answers
25 views

Use calibration weights to correct for unit non-response bias?

I have a question about how calibration weights can be used to sufficiently correct for unit non-response bias. Suppose the sample is s and the response set is r. Calibration is applied to the ...
Guoqiang Zhang's user avatar
1 vote
1 answer
274 views

How do I use something like predict.glm (in R) with a svyglm model and why don't my predictions match my data?

I'd like to estimate "cost" using some covariates with a weighted gamma model using svyglm. The weights sum to 1, and there are about 10,000 rows in the dataframe df total, with columns ...
Mark's user avatar
  • 202
1 vote
0 answers
66 views

R survey package: compare weighted prevalence with literature result

I have a large dataset (>3000 cases) with results of serological tests. Based on these results I want to calculate the weighted prevalence of the disease. The data are categorical (0 = disease ...
Zoetzurechucky's user avatar
0 votes
1 answer
36 views

How to treat age-eligibility thresholds in household surveys (e.g. HRS)?

Most household surveys have age-eligibility thresholds. The HRS interviews individuals aged 51 and older, plus their spouses (if any), using PPS sampling. Do I need to drop individuals who are younger ...
cascom's user avatar
  • 41
1 vote
1 answer
219 views

When to use replicate weights in complex survey analysis

I am curious about when it is recommended to use replicate weights in survey analysis. I compared the usual survey analysis with using replicate weights, as illustrated below. Based on the paper "...
Guoqiang Zhang's user avatar
2 votes
1 answer
76 views

Different ways to define survey design object under MCAR assumption

I have a stratified random sample, and would like to conduct complete-case analysis, assuming Missing Completely At Random. However, I find that there seem to be two ways to define survey design ...
Guoqiang Zhang's user avatar
1 vote
2 answers
157 views

Is it reasonable to subset a survey design object by dependent (outcome) variable and fit a weighted logistic regression model?

I would like to study which factors are associated with an outcome which has more than two categories. After considering multinomial logistic regression model (which I find is very challenging to ...
Guoqiang Zhang's user avatar
4 votes
1 answer
106 views

Is it valid to compute survey weights using the distribution of a hypothetical/"imaginary" population?

I'm informally reviewing a study by a coworker, but as I'm not a specialist in survey weighting procedures, I need a second opinion about something on which I disagree with him. He collected a convenience ...
Daniela's user avatar
  • 57
0 votes
0 answers
63 views

Longitudinal survey data weights

I'm working with longitudinal survey data. Do we need to use survey sample weights when running panel data models (e.g., Fixed Effects, Random Effects)? In other words, would it be fine if I ...
Greenhill's user avatar
  • 233
1 vote
1 answer
89 views

How to interpret survey data with demographic information?

I have little statistical experience, but am helping to run a community needs survey for my organization. Many of the questions have a yes/no or multiple choice answer format. For example, "Do ...
April's user avatar
  • 11
1 vote
1 answer
96 views

Domain (subgroup) estimation in a stratified random sample

I want to ask a question about domain estimation (i.e., estimation of a parameter among subpopulations) in a stratified random sample. It seems to me that in a stratified random sample, domain ...
Guoqiang Zhang's user avatar
1 vote
1 answer
89 views

Variance test with survey weights

When not working with surveys, you do a variance test with var.test(). What do you do when you want to account for complex survey design?
Santiago Valdivieso's user avatar
0 votes
1 answer
55 views

Post stratification on a twophase object in the R survey package

I am working with NHANES survey data and I am trying to use a twophase survey design along with calibration to adjust for item non-response. For now, I am following an example from Lumley's Survey ...
tbuckley's user avatar
0 votes
1 answer
88 views

Can I make a proportional-to-size without replacement sample (PPS WOR) self weighted?

Let's say I have 100 schools and each has a different number of students. I want to estimate which % of students are in schools with electricity. Simulation and theory indicate it is more efficient to ...
Fernando Irarrázaval G's user avatar
2 votes
1 answer
92 views

Post-stratification with missing subpopulations in the survey

I have some survey data on a population described by age, gender, and weight. It’s quite skewed so I want to reweight it to a known target population (a larger ...
cgreen's user avatar
  • 1,002
5 votes
1 answer
198 views

svychisq with statistic="Chisq" vs. statistic="adjWald"

I'd like to run analyses using Rao-Scott Chi-squared test using svychisq to account for survey weights. It seems that statistic="Chisq" gives the Rao-Scott version, however, now that I'm ...
AskingSomeQuestions's user avatar
0 votes
0 answers
54 views

Confidence intervals for binary proportions in a subdivided population

I am trying to estimate a confidence interval for the population proportion level of a binary (yes / no) variable. The population comes in three segments - easy to sample (I have a sample); hard to ...
Amorphia's user avatar
  • 997
3 votes
1 answer
307 views

Design-based standard errors in svyglm but w/o weights or stratification

For inverse probability weighting (IPW) in R the use of survey::svyglm is well established. I want to compare the results of 10K+ (!) models with and without the ...
jay.sf's user avatar
  • 910
0 votes
0 answers
48 views

Is it appropriate to pre-stratify and post-stratify along different delineations of the same variables in a single survey?

I am working in the context of opt-in, web-based surveys. Often the desire is for accurate population estimates, and often at a country-wide level. The standard approach at this organization is to ...
spathartic's user avatar
3 votes
0 answers
294 views

How to perform exploratory factor analysis with complex survey data for ordinal and dichotomous variables?

I would like to perform an exploratory factor analysis in R with complex sample survey data involving ordinal and dichotomous variables. The svyfactanal() function ...
user156625's user avatar
1 vote
0 answers
54 views

Multiple questions about setting up survey weights when applying a combination of different types of weights - ESS

This file, published on the European Social Survey website (https://www.europeansocialsurvey.org/docs/methodology/ESS_weighting_data_1.pdf), states that: • when analysing data for one ...
An116's user avatar
  • 367
1 vote
0 answers
42 views

A question about a proper weighting method of a multi-stage survey design

Referring to the question linked below: How can I use Propensity Scores to adjust for survey non-response bias? I read the paper published by Lee and Valliant (2009) and the Pew Research method ...
An116's user avatar
  • 367
1 vote
0 answers
263 views

Comparing proportions from overlapping samples with a complex survey design

I am trying to test whether there has been a statistically significant increase in the prevalence of a condition between two years of a complex survey using R. The samples are partly overlapping ...
Mark O'Donovan's user avatar
2 votes
1 answer
112 views

How to update survey weights after fixing strata membership/assignments?

Imagine we conduct a survey using stratified sampling, and after the survey closes, we find out that we had misclassified some of the respondents. They actually belong to different strata than we had ...
jdcrossval's user avatar
3 votes
0 answers
111 views

How do I calculate a finite-population correction for a weighted sample?

I have a weighted sample from a population. These are probability weights (as samples are taken with unequal probability). What's the variance in the estimated total? All samples are exchangeable (...
Closed Limelike Curves's user avatar
1 vote
1 answer
493 views

Problem with weights in survey analysis of GSS cross-sectional data

I have a dataset built from https://gss.norc.org/get-the-data. The codebook describes how to use the weights: ...
myoth myoth's user avatar
1 vote
0 answers
24 views

Nonresponse weight adjustments in multi-stage household surveys

I have a question about nonresponse weighting in complex sample surveys with multi-stage designs, such as the US National Comorbidity Survey Replication (NCS-R), the Health and Retirement Study (HRS), ...
SurveyStatLearner's user avatar
1 vote
0 answers
142 views

What are some limitations of survey raking weights?

The title of this question says it all. I know that all methods have limitations, and while I know some of the strengths of raking weights (e.g., often, only marginal distributions for auxiliary ...
Andy's user avatar
  • 11
1 vote
0 answers
103 views

Comparison between weighted and unweighted surveys

I was hoping to get your take on comparing the results of a weighted survey with an unweighted survey. I have the results of two different surveys conducted in my state. Survey A was a multi-stage, ...
TPM's user avatar
  • 604
2 votes
0 answers
278 views

How should you use scaled weights with svydesign() in the survey package in R?

I am using the survey package in R to analyse the "Understanding Society" social survey. The main user guide for the dataset specifies (on page 45) that the weights have been scaled to have ...
datavisdev's user avatar
1 vote
0 answers
44 views

Weights for unknown category in the population

I have a dataset with survey responses where the variable gender has three categories: male, female and nonbinary, but the population dataset used to sample the individuals to send the survey has only ...
José's user avatar
  • 111
2 votes
1 answer
42 views

Do survey sample standard errors take into account/correct for anything else besides finite population?

Regular standard errors are biased when the data comes from a survey sampling design. The article "Wait Wait, Don't Tell Me... You're Using the Wrong Proc!" explains that the bias is due to ...
Michael Webb's user avatar
  • 2,280
1 vote
1 answer
1k views

Using R survey package to rake with missing data

I have some survey data and would like to address non-response bias by raking on 3 demographic variables for which I have decent population estimates from the American Community Survey. The problem I ...
derNincompoop's user avatar
0 votes
2 answers
49 views

Why do you only need to identify the first cluster level in svydesign(), even if you have multi-level clustering?

I'm going through a course on survey weights, and it says that even though a dataset may be sampled using 3 levels of clustering (like counties, city blocks, and households), you only need to specify the first level ...
Kevin's user avatar
  • 1
1 vote
1 answer
2k views

What goes wrong with post-stratification if a combination of values does not exist in the sample? How to fix it?

This question is both about the "survey" package in R and about the mechanics behind it. I'm more interested in the mechanical reasons why this package fails to create weights when ...
Tea Tree's user avatar
  • 280
2 votes
1 answer
54 views

How to decide whether to use weights when estimating some $\mu$ of a population that has sub-populations with different $\mu_i$?

Setting and Notation: Let's assume we have a population with (for example) two sub-populations, say, males and females. In the population they are split 50%-50%. We care about the population level ...
Tal Galili's user avatar
  • 21.9k
4 votes
1 answer
229 views

How to (statistically) test for the difference in the mean of an outcome variable when using weights, vs not using them?

Notations $y$ = outcome variable of interest. $w$ = weights to use on outcome $y$ (estimated with some method, be it post-stratification, IPW, or something else) $\bar y$ = summary statistic of ...
Tal Galili's user avatar
  • 21.9k
2 votes
1 answer
193 views

Why will two different surveys not give the exact same result?

I recently faced a challenge with our commercial team about our annual market survey result. As you can guess, our commercial team has no background in statistics or math. A brief introduction, ...
DanielG's user avatar
  • 171
2 votes
0 answers
87 views

Raking weights - How can they possibly recover the joint distribution?

So I've recently encountered raking weights as a way to obtain weights for population level estimates: https://www.pewresearch.org/methods/2018/01/26/how-different-weighting-methods-work/. My question ...
Andy's user avatar
  • 21
1 vote
0 answers
673 views

How to compute weights using logistic regression

I'm not interested in creating a logistic regression model which is weighted, I'm interested in using a logistic regression model to compute the weights of a survey. It seems that, given the weights ...
baxx's user avatar
  • 936
1 vote
0 answers
287 views

How should I weight the "other" gender response with survey data?

I have a survey with multiple demographics, including gender, age group, and state (USA). Because I want my sample to be more representative of the United States as a whole, I am using raking to ...
Max Candocia's user avatar
0 votes
1 answer
716 views

How to combine complex surveys from different populations?

I have 4 nationally representative surveys (DHS), and let us assume each survey belongs to one country (e.g. Lesotho, Namibia, South Africa, and Zimbabwe). The sampling method used in these surveys is ...
Esteban M. Correa's user avatar