Newest 'selection-bias' Questions

0 votes

0 answers

8 views

Selection Correction Method with the Variance of the Truncated Normal Distribution

Consider the following data generating process: $$Y=\beta_0+\beta_1X_1+\beta_2X_2+\beta_3u+\varepsilon,$$ $$D=1\left[\gamma_0+\gamma_1Z+\gamma_2X_1+\gamma_3X_2+u>0\right],$$ where $D=1$ if the unit ...

MinChul Park

441

asked Dec 13 at 16:23

0 votes

0 answers

10 views

In a Heckman model, can I use the dependent variable of the selection equation as an independent variable in the outcome equation? [closed]

I have included the above-mentioned variable as the dependent variable in the selection equation and as an independent variable in the outcome equation. But in the results, the chosen and most important variable got omitted. It is shown as omitted because ...

Tej

1

asked Dec 13 at 15:47

9 votes

3 answers

1k views

How to handle bias in 1-5 star ratings?

I was discussing with a friend my bad experience with a health insurance company, and as support for my impression, I pointed out to her that trustpilot gave it a very low score. There were 70 reviews,...

user6376297

761

asked Dec 2 at 19:29

2 votes

2 answers

86 views

Definition of selection bias vs confounding bias

I've been learning about causal inference, having read Pearl's Primer and Parts I and II of "What If?". I was under the impression that the definition of "There is confounding" was ...

ThighCrush

225

asked Oct 10 at 7:51

3 votes

2 answers

72 views

Limitations of propensity score matching

While studying propensity score matching, I was struck by the following thought: When we are running a logistic regression model to estimate $p(Z=1∣X)$ through some form of parametrization and we are ...

richardjoseph

31

asked Oct 9 at 9:10

3 votes

1 answer

71 views

How to Hyperparameter Tune without sample Bias?

While searching for ways to fine-tune the hyperparameters (HP) of my models, I found multiple references to cross-validation techniques (K-folds, LPO, OOB.632+) and ways to select the best ...

Linces games

31

asked Sep 28 at 23:14

0 votes

0 answers

42 views

Is this confounding bias or selection bias or both?

Can confounding and selection bias (biased sampling) be the same? In epidemiology, selection bias and confounding are often considered two different biases. I wonder if they can be the same in certain ...

Vincent

431

asked Sep 13 at 18:05

9 votes

3 answers

768 views

Is prescreening not detrimental for paid surveys?

Survey sites like Swagbucks often have a prescreening mode in which one is asked questions such as one's annual income or whether one owns a car. It is observed that most of the time if one selects ...

Splendid Digital Solutions

191

asked Jul 9 at 11:28

1 vote

0 answers

19 views

Inverse Probability of Weighting in Directed acyclic graph for a binary collider as a selection bias

For a confounder, like the following figure, it is commonly suggested that use of the Inverse Probability of Weighting can remove the path from confounder to exposure so that it removes the backdoor ...

Elong Chen

11

asked Jun 12 at 2:42

3 votes

1 answer

41 views

Statistical Non-Response and Drop Out

In statistical studies, biases are possible: some groups of people are more likely to be represented than other groups of people (e.g. poorer people have difficult ...

user412241

asked May 9 at 18:12

0 votes

0 answers

20 views

Positivity Assumption in Propensity Score Methods for Pre- and Post-Treatment [duplicate]

I am designing a research project and could use some guidance. My research question focuses on estimating the effect of a new co-responder policing program on use-of-force and arrests. I want to see ...

galaxy-friday1017

11

asked Apr 18 at 16:36

0 votes

1 answer

63 views

conditional-on-positives bias

I am reading the Bad COP section on https://matheusfacure.github.io/python-causality-handbook/07-Beyond-Confounders.html#bad-cop. I am confused if $$ E[Y|T = 1] - E[Y|T = 0] = \\ E[Y|Y > 0, T = 1]...

Anonny

143

asked Mar 30 at 22:03

1 vote

0 answers

121 views

Regression Discontinuity Design, staggered treatment allocation

I'm unsure if this complex allocation rule is appropriate for RDD. I will have data for a staggered rollout treatment where there will be about 10 rounds of selection over two years for services (...

dcoy

372

asked Mar 19 at 18:23

1 vote

1 answer

38 views

Bias introduced by removing early censors

Suppose we have right-censored survival data on some population, and want to compare individuals with "good outcome" (who have no event in the first X months) to individuals with "bad ...

Nuclear Hoagie

10.4k

asked Mar 12 at 16:43

0 votes

4 answers

129 views

Small-sample binary logit and linear models - response to referees [closed]

Background: This cross-sectional study collected 30 thrombosis samples. We evaluated the presence or absence of MP components (dependent variable), where 24 cases had MP (coded as 1) and 6 cases did ...

zhiheng yi

147

asked Feb 27 at 11:07

1 vote

1 answer

52 views

Average treatment effect (ATE) estimation via matching method while outcomes of control population are constant

I want to estimate the average effect of a treatment that was given with a selection bias. To do this, I'd like to use a matching method. Basically, this method involves finding, for each treated ...

HnbBarca

11

asked Feb 17 at 9:18

0 votes

0 answers

37 views

Find correlation from biased observations

I have a set of observations of a variable Z (shown as the colormap) as a function of two other variables A and B. I want to study how Z varies with respect to A, B, and both A and B (eg. if A ...

Euryproktos

1

asked Feb 12 at 14:43

2 votes

0 answers

37 views

How to estimate the age of players correctly?

I have the data of players active on a gaming console and the playtime hours corresponding to the games they have played and their age. I want to analyze the top (say 10) games that the people between ...

Ritik P. Nayak

313

asked Jan 18 at 7:12

3 votes

1 answer

68 views

Deriving conditional independence statements for causal graphs with selection nodes

In "basic" causal graphs / DAGs / probabilistic graphical models (PGMs), conditional independence statements can be derived using the d-separation criterion. How does this work if selection ...

Eike P.

3,098

asked Jan 8 at 18:59

1 vote

1 answer

22 views

Choice and endogeneity

An independent variable is endogenous if it is correlated with the error term (source). In the regression framework, this may happen (only?) in case of omitted variables, simultaneity, or measurement ...

robertspierre

2,433

asked Jan 3 at 15:43

0 votes

0 answers

38 views

Difference in Difference and Selection Into Treatment

Suppose I impose that the true model of some variable of interest is: $$ Y_{it} = \alpha_i + \beta_t+\tau_{it}D_{it}+\epsilon_{it} $$ Where $ D_{it} = 1\{E_i \geq t\} $. This is a kind of DID model ...

DarkenExcalibur

303

asked Dec 27, 2023 at 19:27

1 vote

0 answers

271 views

Calculating Inverse Mills Ratio after Probit

I need to compute the Inverse Mills Ratio after the probit command in Stata. From here, I found that predict IMR1, score, will calculate it and store it in IMR1. I ...

user917983

11

asked Oct 22, 2023 at 20:04
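
For reference, a minimal sketch of the quantity this question asks about, assuming the usual definition of the inverse Mills ratio for selected observations, $\lambda(z)=\phi(z)/\Phi(z)$, evaluated at the fitted probit linear index $z$. The function names and the input `z` are hypothetical and not taken from the question; this does not reproduce any Stata command.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """lambda(z) = phi(z) / Phi(z), where z is a (hypothetical) fitted probit index."""
    return normal_pdf(z) / normal_cdf(z)
```

Under this definition the ratio falls as the index rises, so units that are very likely to be selected receive a small correction term.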

5 votes

2 answers

243 views

Correcting for selection bias with standardisation/g-computation

Two sets of methods for correcting for selection bias are g-computation (standardisation) and inverse probability of censoring weighting (IPCW). I'm having a difficult time understanding how to apply ...

Lachlan

1,182

asked Oct 9, 2023 at 13:25

0 votes

0 answers

25 views

Right Way to Sample a Validation Set

I am working on a project that uses training data selection techniques; it involves sampling the training set in some smart way rather than sampling randomly. The goal is to compare different data ...

Mr.Robot

247

asked Sep 19, 2023 at 23:22

1 vote

1 answer

112 views

Understanding selection bias and endogeneity in marketing

Media mix modelling is concerned with estimating the causal impact of marketing investments, a goal which has several challenges. In general, multiple regression models are deployed mapping up total ...

kurt eriksson

13

asked Sep 2, 2023 at 19:04

0 votes

0 answers

36 views

Is "skewing the data" and "skewing the results" just selection bias?

I recall various conversations with biologists, ecologists, and foresters that I neglected to ask for clarification on at the time. It doesn't occur in any of my statistics references. Sometimes in ...

Galen

9,680

asked Aug 27, 2023 at 15:50

1 vote

0 answers

22 views

Can I use Shapley values with metadata (i.e. information about observations that I didn't train my model on)?

I'm training a set of models (random forest/XGBoost) for an ordinal regression task. I'm (tentatively) planning to use Shapley values to infer feature performance. I also have some metadata that my ...

Neil

66

asked Jul 6, 2023 at 22:37

1 vote

0 answers

34 views

How can aggregation be helpful in mitigating bias?

I am working on estimating the impact of exposure to infrastructure (mainly schooling) on the number of children. Since I do not have migration data, my colleague recommended that I ...

Yendao Su

61

asked Jun 23, 2023 at 15:39

1 vote

1 answer

32 views

Selection Bias in Conflict Studies

A common critique I have heard levied against conflict studies (research examining the causes, consequences, and solutions to violence such as civil war, terrorism, etc.) is the problem of selection ...

Brian Lookabaugh

825

asked Jun 9, 2023 at 13:51

2 votes

0 answers

853 views

How to address selection bias in a diff-in-diff study?

We know that selection bias occurs when the treatment and control groups are not comparable, leading to differences in the outcome that are not solely due to the treatment. First edit: By selection ...

funcard

61

asked Mar 27, 2023 at 4:30

0 votes

0 answers

70 views

Heckman correction for correlation estimates

Suppose I observe random $y_{i,1}, y_{i,2}$, and I wish to estimate the correlation between them. However, the $y_{i,j}$ are observed subject to some sample selection criterion. That is, there are ...

shabbychef

15k

asked Mar 10, 2023 at 18:18

1 vote

1 answer

39 views

Comparing a multi-dose drug to no drug exposure in a cohort study: Censoring events between doses

I am interested in assessing the association between the two doses of a dietary supplement on an event of interest. The primary exposure is 'two doses of the supplement', and the comparator is 'no ...

user3qpu

109

asked Feb 21, 2023 at 21:00

0 votes

1 answer

144 views

How to understand random assignment eliminates selection bias in the potential outcomes framework

In Angrist & Pischke's book Mostly Harmless Econometrics, they explain that if the treatment in an RCT $D_i$ is randomly assigned, then $D_i$ is independent of potential outcomes and the following ...

Tomas R

177

asked Jan 17, 2023 at 11:43
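
As a rough illustration of the idea in this question, the sketch below simulates the standard decomposition $E[Y_i\mid D_i=1]-E[Y_i\mid D_i=0]=E[Y_{1i}-Y_{0i}\mid D_i=1]+\left(E[Y_{0i}\mid D_i=1]-E[Y_{0i}\mid D_i=0]\right)$, where the last bracket is the selection-bias term. Everything here is an assumption made for illustration: the constant treatment effect of 2, the standard normal $Y_{0i}$, and the two assignment rules are not from the question.

```python
import random
import statistics


def naive_difference(assign, n=20000, effect=2.0, seed=0):
    """Difference in observed group means under a given assignment rule."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)   # potential outcome without treatment
        y1 = y0 + effect           # potential outcome with treatment
        if assign(y0, rng):
            treated.append(y1)     # only y1 is observed for treated units
        else:
            control.append(y0)     # only y0 is observed for control units
    return statistics.fmean(treated) - statistics.fmean(control)


def random_assignment(y0, rng):
    """Coin flip: assignment ignores potential outcomes."""
    return rng.random() < 0.5


def self_selection(y0, rng):
    """Units with a high untreated outcome select into treatment."""
    return y0 > 0.0


print("random assignment:", naive_difference(random_assignment))
print("self-selection:   ", naive_difference(self_selection))
```

With random assignment the naive difference should land close to the assumed effect of 2, while under self-selection it is inflated by the gap in $Y_{0i}$ between the two groups.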

1 vote

0 answers

18 views

Selection bias in postmortem data and creating an artificial earlier study endpoint

I want to analyze postmortem (neuropathology) data from dementia patients who are part of a larger ongoing observational study. At the time of the data freeze (i.e. the time at which I access the data)...

AnnaC

11

asked Dec 7, 2022 at 14:23

0 votes

1 answer

124 views

Questions Regarding Sampling Bias

I'm taking a course in R: "Data Analysis in R" on Coursera, and I came across this question during the lecture: A retail store considering updates to their credit card policies randomly ...

JackJackAttack0214

11

asked Nov 17, 2022 at 6:32

2 votes

1 answer

67 views

How to correct for sampling bias in one population when comparing against another

I have two populations that I'd like to compare across certain metrics. However, most members of population A did not respond to our request for data, and those respondents that did are not ...

mdrishan

207

asked Nov 4, 2022 at 21:08

1 vote

1 answer

51 views

Maximum likelihood of Normal density under selection

Consider the density function given by $$ \left[\dfrac{\gamma_{\leq0} \mathbb{1}(t \leq 0) + \gamma_{>0} \mathbb{1}(t > 0)}{\gamma_{\leq0}\Phi\left(- \mu / \sigma\right) + \gamma_{>0}\Phi\...

Student_718

83

asked Sep 7, 2022 at 22:50

1 vote

0 answers

49 views

Why does a normalizing difference score > 0.25 indicate selection bias which cannot be corrected by regression?

I am reading Propensity Score Analysis (2014) by Guo and Fraser, chapter 1, section 4. Denote by $\Delta_X$ the normalizing difference score of covariate $X$. "Following Imbens and Wooldridge, a $\Delta_X$ ...

user45765

1,465

asked Aug 1, 2022 at 13:52

1 vote

0 answers

148 views

Inverse Mills Ratio Interpretation [closed]

What is the interpretation of the inverse Mills ratio in the Heckman selection model? Why do we include it as an explanatory variable in the OLS estimator?

Shivam Saboo

11

asked Jul 30, 2022 at 3:53

1 vote

1 answer

24 views

Control Group Selection Bias

I found a study that compared minor physical anomalies (MPAs) between a certain group of patients and a control group to determine if MPAs occur more frequently among these patients compared to the ...

Kim

11

asked Jul 25, 2022 at 6:51

1 vote

1 answer

19 views

Estimating interactions from non-interacting features

Suppose I have a sample $\mathcal{D}=\{(\mathbf{x}^{i}, y^{i})\}_{i=1\dots M}$ of binary variables $\mathbf{X}$ ($N$ of them) and a continuous variable $Y$ that I want to predict based on a linear ...

Sergio

336

asked Jul 22, 2022 at 17:04

1 vote

0 answers

99 views

Nested Cross-Validation with Small dataset

I am currently working with a small dataset (only 175 samples, 45 features) and have been reading on the proper way to cross-validate my model. I had started with a basic cross-validation using a grid ...

Fritos121

11

asked Jul 1, 2022 at 2:57

2 votes

1 answer

315 views

Sampling weights in Cox proportional hazards models

I'd like to use sampling weights in a Cox proportional hazards regression model to address selection due to different response probabilities. I'm calculating the weights as Inverse Probability of ...

r_epi

31

asked Jun 6, 2022 at 15:10

4 votes

3 answers

178 views

When can we get unbiased estimate given biased data?

There was a recent "hot take" tweet by Andrej Karpathy (without any comment or clarification from the author): "real-world data distribution is ~N(0,1); good dataset is ~U(-2,2)". It provoked ...

Tim

141k

asked May 24, 2022 at 11:28

3 votes

1 answer

364 views

Selection Models of Publication Bias for Multilevel Meta-analyses?

Are there any suitable selection models of publication bias for multilevel meta-analyses? I am currently conducting a 3-level meta-analysis and trying to incorporate selection models to assess ...

makie

73

asked Apr 9, 2022 at 14:32

1 vote

1 answer

30 views

Sample Bias in Study

I have the following study statement: A council wishes to study the digital awareness of its resident senior population (over 65 years), so it questioned in person 50 residents randomly chosen from a ...

Snoke

23

asked Apr 5, 2022 at 18:24

1 vote

1 answer

89 views

How to show how biased a statistic is in a non-random sample, knowing the parameter in the general population?

I have a convenience sample, and want to show readers how biased it is relative to the population it's taken from. It's absolutely certain that the sample is biased, and I want to give readers as many ...

J-J-J

5,873

asked Mar 24, 2022 at 13:40

1 vote

1 answer

139 views

Is it true that a larger, representative dataset is always better to use than a smaller, representative dataset?

By "representative" I mean that the data in the dataset faithfully reflects the "underlying signal" a model is trying to tap into. Is it always true that, as long as increasing ...

sangstar

131

asked Mar 21, 2022 at 14:57

0 votes

1 answer

28 views

Can I ignore these individuals without introducing bias?

I have a population that falls under 10 classes. Each individual may or may not come with a location - 83% overall have locations and a breakdown by class is: Class # individuals # with location # ...

Chris Browne

113

asked Feb 18, 2022 at 12:50

2 votes

0 answers

68 views

What are the statistical fallacies of illusion of control?

Illusion of control* appears in gambling and events involving randomness. For example, choosing a lottery ticket that carries additional information the participant has control over choosing, such as ...

patagonicus

2,630

asked Feb 15, 2022 at 22:32

