Questions tagged [survey-sampling]
Creating samples from a well-specified population (human: all adults; registered voters; individuals with diabetes; students of a university; establishment: all firms; firms with employment of 200 or more in New York City; resource: all land of a country or a state/province) using a probabilistic method, with the purpose of inference to that specific population
250 questions
0
votes
0
answers
6
views
Comparing two identical surveys with different sample sizes and response levels
Silly question, but my mind has gone blank. I’m trying to undertake a simple comparison between two samples (organisational performance) using an online survey. To obtain a CI of 80% with a margin of ...
3
votes
1
answer
58
views
Would withholding marks until students respond to a survey bias the responses?
My university is running an anonymous survey, mostly to check if we understand how we are going to be assessed, if we are comfortable with the material, and if we find the material well organised. ...
0
votes
0
answers
27
views
Proper Difference-in-Difference Model for Time Variant Groups
Take the following example... I have two areas: Area A and Area B. Area A are individuals in a geographic area who are exposed to a health intervention. The health intervention is applied to the ...
1
vote
0
answers
31
views
Combining Survey Weights
I have an annual survey that is somewhat complex in design. Sample frames are pulled quarterly and overlap. Each quarter's sample is removed from the subsequent quarter's frame. Samples are stratified ...
0
votes
0
answers
8
views
Can you build up statistical validity with multiple months' worth of the same survey questions?
The company I work for conducts consulting where we analyse company survey responses for statistical validity against the general company population.
To prevent survey fatigue, the company sends out ...
0
votes
1
answer
35
views
Sample size for survey
My interest is to perform a statistically significant survey on a population of 1700 people, that can be described in different categories, so each person belongs to only one category.
I have two ...
1
vote
0
answers
54
views
Multi-level Model and Multi-level Data
I have a question about multi-level models with multi-level survey data. I am working with survey data that has a two-stage sampling design with primary sampling units defined as schools randomly ...
3
votes
1
answer
41
views
Statistical Non-Response and Drop Out
In statistical studies, it is possible that there might be biases:
Some groups of people are more likely to be represented than other groups (e.g. poorer people have difficult ...
0
votes
0
answers
6
views
Studying more than one sampling unit in a single randomization
So we have a list of organizations that dedicate themselves to a certain social service.
Our goal is to ask both laborers and customers a partially overlapping set of questions for each one of ...
0
votes
0
answers
45
views
Is the naive mean estimator uniformly worse than the HT (IPW) or Hajek estimators in survey sampling? If not, why is it less discussed in the literature?
Consider a toy example: we are interested in the average height of $n$ students $\bar{\tau}=\frac{1}{n}\sum_{i=1}^n\tau_i$, but for some reason, we can only access a random subset $S$ of it. Every ...
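For readers skimming this listing, the three estimators named in the question above can be written in a few lines. This is only a minimal sketch and is not taken from the question or its answer: the function names, the `n_pop` argument (the population size, which the question calls $n$), and the assumption that each sampled unit's inclusion probability `pi` is known are all illustrative.

```python
def naive_mean(y):
    """Unweighted mean of the sampled values; ignores inclusion probabilities."""
    return sum(y) / len(y)


def horvitz_thompson_mean(y, pi, n_pop):
    """HT (IPW) estimator of the population mean.

    Each sampled value is weighted by 1/pi and the total is divided by the
    known population size n_pop (hypothetical parameter name).
    """
    return sum(yi / p for yi, p in zip(y, pi)) / n_pop


def hajek_mean(y, pi):
    """Hajek estimator: the same weighted total, divided by the sum of weights."""
    weights = [1.0 / p for p in pi]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Illustrative sample: two units with unequal inclusion probabilities.
sample_y = [10.0, 20.0]
sample_pi = [0.5, 0.25]
estimates = {
    "naive": naive_mean(sample_y),
    "ht": horvitz_thompson_mean(sample_y, sample_pi, n_pop=8),
    "hajek": hajek_mean(sample_y, sample_pi),
}
```

When all inclusion probabilities are equal to the sample size divided by `n_pop`, the three functions return the same value; they differ only when the probabilities are unequal or the realized sample size is random.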
1
vote
1
answer
74
views
Sampling a random subgraph from an undirected, unweighted graph: what's the probability that every two nodes are at distance at least 3 in the subgraph?
This may be a problem in sampling theory or graph theory. I have done a lot of research but still haven't found a valid solution.
I know a simple random sample is representative of the population. Now I ...
2
votes
0
answers
25
views
Use calibration weights to correct for unit non-response bias?
I have a question about how calibration weights can be used to sufficiently correct for unit non-response bias. Suppose the sample is s and the response set is r.
Calibration is applied to the ...
2
votes
1
answer
107
views
In stratified sampling, why is the stratum population variance obtained by dividing by one less than the stratum size?
I am aware that flavors of this question get asked a lot, e.g., here. I am fine with the sample variance being divided by $n-1$ and that is what makes it an unbiased estimator of the population ...
0
votes
0
answers
24
views
What weight to use for using NAMCS and NHAMCS together?
I am interested in analyzing the total prescription of aspirin in NAMCS and NHAMCS (https://www.cdc.gov/nchs/ahcd/index.htm) during a given year for all visits. NAMCS and NHAMCS each had a weight and ...
0
votes
2
answers
168
views
Reliability of online surveys
I'm trying to get an idea about reliability of online surveys: I found some indication that "internet-based surveys produce data that is at least as reliable, valid, and of equal quality as data ...
1
vote
0
answers
30
views
Sampling inquiry for thesis [closed]
I have a mixed-method thesis ongoing and I plan to collect data at my own college (namely college X), specifically from students and faculty members in my department. Evidently, that would be ...
0
votes
1
answer
114
views
Comparing two samples drawn using two different sampling methods
This is a hypothetical question, so I don't have a lot of additional details to give. However my question is pretty straightforward:
Is it theoretically valid to conduct tests (e.g. for comparing ...
0
votes
0
answers
31
views
Can I add a variable to a complex sample, and run a regression?
In a survey, a complex sample was collected, and the sample was designed to provide estimates at national level. In other words, individuals from one state were more likely to be sampled due to ...
1
vote
0
answers
80
views
Measuring the reliability of survey data
I have survey data and I applied KR20 to it. The KR20 score is 0.63, which means the survey results are not consistent and reliable, at least not in a reliable range with the definition of a reliable ...
0
votes
0
answers
36
views
Sampling error for proportion with finite population - which correction to use?
I am trying to calculate sampling error for a questionnaire that was answered by some of the participants in a program (say about . I want to calculate the sampling error for the proportion of the ...
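As a point of reference for the question above, here is a hedged sketch of the two textbook finite population corrections for a proportion under simple random sampling without replacement. It is not drawn from the question's own data or its answer; the function and parameter names (`n_pop` for the population size, `n` for the sample size) are assumptions made for illustration.

```python
import math


def estimated_se_proportion(p_hat, n, n_pop):
    """Estimated standard error of a sample proportion under SRS without replacement.

    Uses the unbiased variance estimator (1 - n/N) * p_hat * (1 - p_hat) / (n - 1).
    """
    if not 1 < n <= n_pop:
        raise ValueError("need 1 < n <= n_pop")
    fpc = 1.0 - n / n_pop
    return math.sqrt(fpc * p_hat * (1.0 - p_hat) / (n - 1))


def true_se_proportion(p, n, n_pop):
    """Standard error when the population proportion p is known.

    Uses the design variance p * (1 - p) / n * (N - n) / (N - 1).
    """
    if not 1 <= n <= n_pop or n_pop < 2:
        raise ValueError("need 1 <= n <= n_pop and n_pop >= 2")
    return math.sqrt(p * (1.0 - p) / n * (n_pop - n) / (n_pop - 1))
```

The `(N - n) / (N - 1)` factor goes with the known population proportion, while the `(N - n) / N` factor pairs with the `n - 1` divisor when the proportion is estimated from the sample. Both shrink to zero when the whole population is surveyed.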
0
votes
1
answer
36
views
How to treat age-eligibility thresholds in household surveys (e.g. HRS)?
Most household surveys have age-eligibility thresholds. The HRS interviews individuals aged 51 and older, plus their spouse (if any) using PPS sampling.
Do I need to drop individuals who are younger ...
1
vote
0
answers
47
views
Unbiased estimate of mean test score of pupils in a country (only a sampling frame of schools is available)
My primary goal is to get an unbiased estimate of the mean test score of every pupil in a country. I have no sampling frame of all pupils to randomly sample from. But I have a sampling frame for every school....
1
vote
1
answer
219
views
When to use replicate weights in complex survey analysis
I am curious about when it is recommended to use replicate weights in survey analysis. I compared the usual survey analysis with using replicate weights, as illustrated below.
Based on the paper "...
2
votes
1
answer
76
views
Different ways to define survey design object under MCAR assumption
I have a stratified random sample, and would like to conduct complete-case analysis, assuming Missing Completely At Random. However, I find that there seem to be two ways to define survey design ...
1
vote
2
answers
157
views
Is it reasonable to subset a survey design object by dependent (outcome) variable and fit a weighted logistic regression model?
I would like to study which factors are associated with an outcome which has more than two categories. After considering a multinomial logistic regression model (which I find very challenging to ...
0
votes
0
answers
130
views
What is the difference between a repeated cross-sectional survey design and a trend survey design?
Most of the references I have checked for repeated cross-sectional design and trend design (a type of longitudinal design) have said that they are one and the same. However, my professor says that ...
1
vote
1
answer
96
views
Domain (subgroup) estimation in a stratified random sample
I want to ask a question about domain estimation (i.e., estimation of a parameter among subpopulations) in a stratified random sample.
It seems to me that in a stratified random sample, domain ...
0
votes
0
answers
168
views
How can I reweight survey data?
I have survey data from a complex survey with stratification, weights and clustering.
I'm using the survey package in R to run regressions:
...
2
votes
1
answer
41
views
Cluster sampling results in larger sample-to-sample variability
I'm reading Stata's Survey Data Reference Manual.
It states that:
Cluster sampling typically results in larger sample-to-sample variability than sampling individuals directly.
Do you have an ...
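One common way to put a number on the quoted statement is Kish's design effect approximation, $1 + (m - 1)\rho$, for equal-sized clusters of $m$ units with intraclass correlation $\rho$. The sketch below is an illustration under those assumptions only; it is not from the Stata manual or from the answer to this question, and the function names are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Kish's approximate design effect for equal-sized clusters.

    cluster_size: number of units sampled per cluster (m).
    icc: intraclass correlation (rho) of the outcome within clusters.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Size of a simple random sample with roughly the same variance."""
    return n_total / design_effect(cluster_size, icc)


# Illustrative values: 1000 respondents in clusters of 10, with rho = 0.05.
deff = design_effect(10, 0.05)
n_eff = effective_sample_size(1000, 10, 0.05)
```

With a positive intraclass correlation the design effect exceeds 1, so the clustered sample behaves like a smaller simple random sample, which is the extra sample-to-sample variability the manual refers to.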
0
votes
1
answer
88
views
Can I make a proportional-to-size without replacement sample (PPS WOR) self-weighted?
Let's say I have 100 schools and each has a different number of students. I want to estimate which % of students are in schools with electricity. Simulation and theory indicate it is more efficient to ...
2
votes
1
answer
195
views
Appropriate way to use post-stratification weights when running statistical tests SPSS
I have used Complex Samples in SPSS (and SUDAAN in SAS, Survey in R) when working with survey data that were collected using a sampling design that was not random. For example, when an oversample was ...
1
vote
0
answers
24
views
Small Area Estimation techniques when no micro information is available
Small area estimation (SAE) techniques combine information from household surveys with existing auxiliary information at population level to make inferences of certain indicators for population groups ...
7
votes
1
answer
229
views
Who created the "soup analogy" for sampling?
The soup analogy is,
You only need a single spoon to sample the soup, provided it is well stirred.
It has been used several times here Sampling distributions of sample means and What is your ...
4
votes
1
answer
99
views
What are the differences and common points, if any, between oversampling as a survey design method and oversampling in a machine learning context?
I've seen the term "oversampling" used in a survey design methodology context and in a machine learning context (e.g. methods like SMOTE). I'm intrigued by the differences between the two.
...
2
votes
0
answers
94
views
What is the (Ratio estimator for the) covariance of two weighted means? [closed]
In a previous question I've asked How to estimate the (approximate) variance of the weighted mean?, specifically, how to prove the following formula:
$$
\widehat{\sigma_{\bar{y}_w}^2} = \frac{1}{(\sum{...
5
votes
1
answer
225
views
Why do the survey package in R and the SPSS complex samples add-on give different standard errors?
I was comparing results that I generated in R for complex survey analysis using the survey package to results from SPSS using the complex samples analysis add-on. The sample size is large ~ N=5500
...
1
vote
0
answers
31
views
Conformal prediction for model-assisted survey estimation
In model assisted survey estimation, one typically uses the generalized difference estimator:
$$
\hat{t}_{ma} = \sum_{k \in U} \hat{m}(\mathbf{x}_k) + \sum_{k \in S} \frac{y_k - \hat{m}(\mathbf{x}_k)}{...
5
votes
0
answers
159
views
Question concerning svydesign and svyglm in R
I have a complicated data set which was made by a multistage stratified cluster design. I had originally analysed this using glm, however now realise that I have to use svyglm. I'm not quite sure ...
0
votes
0
answers
25
views
Definition of rotated panel sampling
I am doing exercises and I came across a question that asks me to describe sampling with a rotated panel.
What does rotated panel sampling mean?
2
votes
1
answer
46
views
Can I CUT a sample to become representative?
Suppose that, from a finite population, we estimated the minimum sample size as 1000 to reach our desired confidence level and error.
Data was collected using an online survey and the survey remained ...
0
votes
0
answers
48
views
Is it appropriate to pre-stratify and post-stratify along different delineations of the same variables in a single survey?
I am working in the context of opt-in, web-based surveys. Often the desire is for accurate population estimates, and often at a country-wide level. The standard approach at this organization is to ...
1
vote
0
answers
24
views
How can I show the change in probability of selection when adding stratification to a survey design?
I have a survey that uses a stratified sampling approach with optimal allocation.
The team conducting the survey has asked that we make two changes:
Subdivide one of the strata into smaller pieces. ...
2
votes
0
answers
196
views
Survey with two simple random samples without repetition
I have a particular exercise in survey sampling, or sampling theory, which I report below.
One is interested in knowing the price per gram of gold produced by 100 companies. A monthly survey of a ...
0
votes
1
answer
21
views
Population bias in survey leading to inaction
This isn’t exactly an academic statistics question, but it is a real problem that I’m trying to understand with regards to bias in survey statistics leading to issues in real-world decision making.
I’...
3
votes
1
answer
113
views
Is post-stratification inherently non-Bayesian?
It is increasingly common to employ regression with post-stratification. Since probability-weighting is incoherent in Bayesian inference (thus why sampling/survey weights and weighted pseudo-...
0
votes
0
answers
97
views
(i.i.d) Random sampling assumption in practical situations
This is a practical question.
Assume that there are two finite populations X and Y in the real world. For example, we want to compare $\bar{X}$ and $\bar{Y}$. We can use a probability sampling scheme ...
8
votes
5
answers
2k
views
When sampling a population for surveys we can often limit our sample size to hundreds, but when doing a Monte Carlo simulation we need way more. Why?
I’m a bit of a stats-noob, so I am not sure I will manage to formulate this question properly, but let me do my best.
I‘m trying to develop an intuition for sample sizes and when they are sufficient ...
0
votes
1
answer
50
views
How to interpret this Cumulative distribution function? [closed]
My colleague has asked me to read through the ETOS material for estimating confidence intervals (published by Statistics Sweden) and point estimates, knowing I have a bachelor's degree in statistics. ...
1
vote
0
answers
22
views
How to top-up a panel survey to also get population estimates
I'm running two sequential surveys on the same population, and I have a question about the proper sampling methodology in the second round of a panel that is also meant to get population estimates. I ...
2
votes
2
answers
69
views
Analyzing data from a non-randomized sampling design (ecological monitoring)
I have 2 questions about analyzing data that was not randomly sampled from a population.
I work with "ecological monitoring" data that involves repeatedly taking measurements from the same ...