Questions tagged [survey-weights]
Survey weights are used when data are collected according to a probability sampling design with unequal probabilities of selection and/or response.
222 questions
2
votes
1
answer
30
views
Alternative to R-squared when calculating glm model using survey package
I'm writing my master's thesis using the European Social Survey. The data require weighting, so I have to use the survey package and its ...
1
vote
0
answers
22
views
Incidence influenced by test frequency
I’m puzzled by a study design. Say I want to study the incidence of a disease, but this disease is mostly asymptomatic and its detection relies solely on regular testing. Some people test more frequently ...
1
vote
0
answers
45
views
Test (quantify) the association between categorical and numerical variable in survey data
I've just started working with survey data and I want to test independence between a numerical variable and a categorical one.
I've heard of weighted ANOVA, but how can I test the normality and ...
0
votes
0
answers
15
views
How to Calculate Cumulative Incidence for Each Case ID in Competing Risks Analysis Using R?
I’m working with a dataset in R and trying to calculate the cumulative incidence for each case ID in the presence of competing risks. My dataset looks like this:
...
0
votes
0
answers
11
views
Calculating change in distribution in two cross-sectional survey years - calculating weights
I am trying to compare two cross-sectional survey years (e.g., 2001 and 2011) on the same variable. These are different, independent populations (this is NOT a longitudinal study; rather, two cross-...
1
vote
0
answers
46
views
Quantile regression to compare two populations at different time points
I am trying to compare two different populations (e.g., 2001 and 2011) on the same variable. These are different, independent populations (this is NOT a longitudinal study; rather, two cross-sectional ...
5
votes
0
answers
32
views
How are effect sizes typically defined under complex sampling?
In some fields it is common to quote comparisons in terms of standardised effect sizes. For example, a difference in means gets scaled by the sample standard deviation to give an effect size.
This ...
0
votes
0
answers
12
views
Pooling survey data with panel households
I am analyzing some data from a national survey. I am working with two sets of data for the same variable (wealth) and some of those households are panel (i.e. interviewed both times). Now I want to ...
1
vote
0
answers
18
views
Handling Missing Groups in Stratified Sampling for Weighted Mean Calculation
I conducted a statistical survey using a stratified sample to measure the knowledge of Italian students on a specific topic. The population was stratified according to the following categories: Area (...
0
votes
1
answer
23
views
Defining a variable based on two different variables with different weighting schemes in NHANES
I want to define a derived variable (Let's call it $Z$) using two original variables ($X$ and $Y$) in NHANES. If any of $X$ or $Y$ meets the criteria, the value for $Z$ should be 1; otherwise, it ...
1
vote
0
answers
31
views
Combining Survey Weights
I have an annual survey that is somewhat complex in design. Sample frames are pulled quarterly and overlap. Each quarter's sample is removed from the subsequent quarter's frame. Samples are stratified ...
3
votes
1
answer
95
views
Is bootstrapping inherently Frequentist? If so, how do we do a Bayesian non-parametric two-sample test?
I normally use frequentist statistics but I now want to use Bayesian statistics as I want to carry out a two-sample (randomised control trial) test that includes prior information. I have an existing ...
0
votes
0
answers
19
views
Do replicate weights, when they are available, supersede all the design-based procedures and analysis?
If the agency that conducted a survey and releases microdata at the individual or household level also supplies replicate weights for that sample, do these weights completely supersede any need or ...
0
votes
0
answers
13
views
Whether to specify ddf in confint.svyglm()
I have a main question about whether to specify ddf in confint.svyglm(). Using the two specifications below generates slightly ...
0
votes
0
answers
29
views
Weighting Data Issue
I am looking at e-cig prevalence within a city. I used surveys to collect data from residents, and I have a query about weighting data.
I have made the assumption, due to over and underrepresentation ...
0
votes
0
answers
62
views
Post-hoc tests of interaction for a negative binomial regression with survey weights?
I ran a negative binomial regression with survey weights using the R package survey and the function svyglm() and observed a ...
2
votes
2
answers
81
views
Weighting Questions within a Survey
I work for a public child welfare agency. We have something called a level of care tool that categorizes children based on the amount of care they need. This helps us identify appropriate care options ...
2
votes
0
answers
25
views
Use calibration weights to correct for unit non-response bias?
I have a question about how calibration weights can be used to sufficiently correct for unit non-response bias. Suppose the sample is s and the response set is r.
Calibration is applied to the ...
1
vote
1
answer
274
views
How do I use something like predict.glm (in R) with a svyglm model and why don't my predictions match my data?
I'd like to estimate "cost" using some covariates with a weighted gamma model using svyglm. The weights sum to 1, and there are about 10,000 rows in the dataframe df total, with columns ...
4
votes
1
answer
149
views
Are there conventions on reporting weighted sample sizes?
Suppose I analyze survey data and calculate weighted means. Should I report the sample size if it differs from the unweighted sample size, as would be the case with non-normalized weights? It seems ...
1
vote
0
answers
66
views
R survey package: compare weighted prevalence with literature result
I have a large dataset (>3000 cases) with results of serological tests. Based on these results I want to calculate the weighted prevalence of the disease. The data are categorical (0 = disease ...
1
vote
0
answers
69
views
Multilevel Regression and Poststratification (MRP) weighting in the opposite direction to Poststratification alone
We are using MRP to derive test norms for an IQ test (Culture Fair Test, CFT) based on the TwinLife data. We adjust for age, sex, education, and migration background. Although a probability sample, ...
0
votes
0
answers
31
views
Can I add a variable to a complex sample, and run a regression?
In a survey, a complex sample was collected, and the sample was designed to provide estimates at national level. In other words, individuals from one state were more likely to be sampled due to ...
0
votes
0
answers
37
views
Complex survey design with multiple waves
The organization I work for has collected data from individuals in multiple waves. Their goal was to collect 333 individuals in 6 different groups (genderXgroup). If the first Wave did not reach 333 ...
0
votes
1
answer
36
views
How to treat age-eligibility thresholds in household surveys (e.g. HRS)?
Most household surveys have age-eligibility thresholds. The HRS interviews individuals aged 51 and older, plus their spouse (if any) using PPS sampling.
Do I need to drop individuals who are younger ...
1
vote
0
answers
47
views
Unbiased estimate of mean test score of pupils in a country (only a sampling frame of schools is available)
My primary goal is to get an unbiased estimate of the mean test score of all pupils in a country. I have no sampling frame of all pupils to randomly sample from. But I have a sampling frame for every school....
1
vote
1
answer
219
views
When to use replicate weights in complex survey analysis
I am curious about when it is recommended to use replicate weights in survey analysis. I compared the usual survey analysis with using replicate weights, as illustrated below.
Based on the paper "...
2
votes
1
answer
76
views
Different ways to define survey design object under MCAR assumption
I have a stratified random sample, and would like to conduct complete-case analysis, assuming Missing Completely At Random. However, I find that there seem to be two ways to define survey design ...
1
vote
2
answers
157
views
Is it reasonable to subset a survey design object by dependent (outcome) variable and fit a weighted logistic regression model?
I would like to study which factors are associated with an outcome which has more than two categories. After considering multinomial logistic regression model (which I find is very challenging to ...
4
votes
1
answer
106
views
Is it valid to compute survey weights using the distribution of a hypothetical/"imaginary" population?
I'm informally reviewing a study by a coworker, but as I'm not a specialist in survey weighting procedures, I need a second opinion on a point where I disagree with him.
He collected a convenience ...
2
votes
1
answer
76
views
Writing a simple linear (and logistic) regression equation that incorporates survey weights
I am running a basic linear (and then logistic) regression that incorporates basic survey weights. I would like to specify the general regression equation with the survey weights for a manuscript. But ...
0
votes
0
answers
63
views
Longitudinal survey data weights
I'm dealing with longitudinal survey data. Do we need to use survey sample weights when running panel data models (e.g., Fixed Effects, Random Effects)? In other words, would it be fine if I ...
1
vote
1
answer
71
views
Is it possible to use poststratification when some observations have missing values on the variables used as strata?
This is a theoretical question, so I don't have data to share.
Let's say I know the percentage of men and women in my population of interest, as well as the distribution of occupations and age ...
1
vote
1
answer
89
views
How to interpret survey data with demographic information?
I have little statistical experience, but am helping to run a community needs survey for my organization. Many of the questions have a yes/no or multiple choice answer format. For example, "Do ...
1
vote
1
answer
96
views
Domain (subgroup) estimation in a stratified random sample
I want to ask a question about domain estimation (i.e., estimation of a parameter among subpopulations) in a stratified random sample.
It seems to me that in a stratified random sample, domain ...
0
votes
0
answers
168
views
How can I reweight survey data
I have survey data from a complex survey with stratification, weights and clustering.
I'm using the survey package in R to run regressions:
...
1
vote
1
answer
89
views
Variance test with survey weights
When not working with surveys, you do a variance test with var.test(). What do you do when you want to account for complex survey design?
1
vote
0
answers
26
views
Is there a relationship between propensity score based causal inference and sampling weights?
Consider an observational study with a single outcome $Y$, a single covariate $X$ and a treatment assignment variable $W$. Under the unconfounded treatment assignment assumption, $E_{sp}[Y(1)]=E[\frac{Y_i^{obs}W_i}...
0
votes
1
answer
55
views
Post stratification on a twophase object in the R survey package
I am working with NHANES survey data and I am trying to use a twophase survey design along with calibration to adjust for item non-response. For now, I am following an example from Lumley's Survey ...
0
votes
1
answer
88
views
Can I make a probability-proportional-to-size without-replacement (PPS WOR) sample self-weighting?
Let's say I have 100 schools and each has a different number of students. I want to estimate which % of students are in schools with electricity. Simulation and theory indicate it is more efficient to ...
0
votes
0
answers
109
views
Weighted logistic regression for complex survey design
I want to run a logistic regression on my dataset in R. I want to test the probability of which direction a fish is facing depending on my variables. I am considering using a weighted logistic ...
1
vote
0
answers
241
views
Nb/Poisson regression with weighted survey data resulting in counts with decimals
I am analysing suicide counts and thus it seems appropriate to use Nb/Poisson regression. However, my counts come from survey data and are only whole numbers when unweighted. Once I realise the ...
2
votes
1
answer
195
views
Appropriate way to use post-stratification weights when running statistical tests SPSS
I have used Complex Samples in SPSS (and SUDAAN in SAS, Survey in R) when working with survey data that were collected using a sampling design that was not random. For example, when an oversample was ...
1
vote
1
answer
783
views
How to understand the weights in PSM?
When using propensity score matching or weighting, a column of weights is generated that is used to estimate the effect of interest.
According to a blog I read, there are three types of weights ...
2
votes
0
answers
38
views
Where do weight factors come into play with real data (in R)?
I understand the (in my opinion) simple concept of using weights to fit a sample to another (real) population. Calculating the weight factors is not hard. Regarding my MWE, there may be a more elegant R way ...
2
votes
1
answer
92
views
Post-stratification with missing subpopulations in the survey
I have some survey data on a population described by age, gender, and weight. It’s quite skewed so I want to reweight it to a known target population (a larger ...
0
votes
1
answer
469
views
Calculate risk ratio in weighted population
I have a propensity score weighted population (using IPTW) and I want to compute risk ratios on my weighted population. For this, I am using a weighted Poisson regression.
Let's suppose that "...
2
votes
0
answers
94
views
What is the (Ratio estimator for the) covariance of two weighted means? [closed]
In a previous question, "How to estimate the (approximate) variance of the weighted mean?", I asked how to prove the following formula:
$$
\widehat{\sigma_{\bar{y}_w}^2} = \frac{1}{(\sum{...
5
votes
1
answer
198
views
svychisq with statistic="Chisq" vs. statistic="adjWald"
I'd like to run analyses using Rao-Scott Chi-squared test using svychisq to account for survey weights. It seems that statistic="Chisq" gives the Rao-Scott version, however, now that I'm ...
5
votes
1
answer
225
views
Why does the survey package in R and SPSS complex samples add-on give different standard errors?
I was comparing results that I generated in R for complex survey analysis using the survey package with results from SPSS using the Complex Samples add-on. The sample size is large (N ≈ 5500).
...