Newest 'mathematical-statistics' Questions

0 votes

0 answers

23 views

Resampling for AB Test, to achieve normal distribution under the CLT

I have (finally) wrapped my head around the Central Limit Theorem. Very exciting. However, I am struggling with how to apply it, or whether it should be applied, to an AB Test. In this example, let's say I ...

plotmaster473

1

asked 3 hours ago

0 votes

0 answers

23 views

Correct usage of "sum" and "mean" for proportions vs continuous variables

What is the proper way to aggregate a measure of "proportions" vs "continuous" by "sum" vs "mean"? For example, let's say I have "time_on_site" and ...

plotmaster473

1

asked yesterday

0 votes

0 answers

9 views

Stein's method for exchangeable sequence

I have a random (not i.i.d.) sequence that I can show is an exchangeable pair. I am trying to apply relatively recent ideas in Berry-Esseen bounds to the problem (https://link.springer.com/chapter/10....

Pablitorun

133

asked yesterday

0 votes

1 answer

19 views

Interaction term negative when both its components are positive?

I examined the effect of labour cost, labour quality and their interaction (cost*quality) on FDI, but I got positive coefficients for both components and a negative coefficient for the interaction term. How could to ...

Lê Huy

3

asked yesterday

0 votes

1 answer

29 views

What is the relationship between OR, se, and CI

Apologies if this is a duplicate question, but I have been unable to find a clear answer. What is the relationship between Odds Ratio, standard error, and confidence intervals? I try to do a meta-...

user447683

3

asked 2 days ago

2 votes

0 answers

34 views

Marginal empirical distribution from joint sample

I have a quite 'simple' doubt that I would like to clear up. Suppose I have a hierarchical model where data is sampled in the following manner: sample $U_i$ from $P_U$, then sample $X_i$ from $P_{X|U_i}$. In ...

Stadium Arcadium

31

asked Dec 10 at 10:07

3 votes

1 answer

30 views

What is the difference between Tolerance Interval and Control Limits?

I understand that a tolerance interval indicates a range within which a specified percentage of a population (future value) is expected to fall. Control limits, on the other hand, represent the ...

user510

131

asked Dec 9 at 21:52

0 votes

0 answers

10 views

Fixed-Length Confidence Interval for Gaussian iid Mean [closed]

I have a question on an exercise where the goal is to find a bounded length confidence interval for the mean of Gaussian random variables. More precisely, my question is on part a) of the exercise ...

Ryan Sfeila

1

asked Dec 7 at 20:14

1 vote

0 answers

9 views

Cumulative distribution function of coherent system

The theorem is from here. Theorem 3.1. Let $X_1, \ldots, X_n$ be the i.i.d. component lifetimes of an $n$-component coherent system with signature $s$, and let $T$ be the system's lifetime. Then $$ \...

Unknown

63

asked Dec 5 at 12:54

0 votes

0 answers

16 views

Estimate SD[Z=X*Y]: which one is correct?

Given we have a sample of two variables X and Y with sample size n, I want to calculate the standard deviation of Z = X*Y. I don't know which of the two options below is correct. Option 1: Simply ...

PTQuoc

193

asked Dec 2 at 15:08

4 votes

1 answer

68 views

How do machine learning topics fit into a traditional undergraduate statistics course on estimation?

I have recently been teaching an undergraduate introduction to statistics course but, as required by the program director, need to add some machine learning material to it. I'm wondering what is the appropriate ...

ExcitedSnail

3,050

asked Nov 29 at 23:33

2 votes

1 answer

44 views

Deriving Scale Parameter in Exponential Family

The following is from p. 98 of Casella & Berger's Statistical Inference (2024 edition): Several of the families introduced in Section 3.3 either are scale families or have scale families as ...

MrAmbiguneDL

21

asked Nov 29 at 18:11

0 votes

0 answers

16 views

Compare two variances

I am reading this paper and have difficulty understanding Section 6: A Linear Time Statistic and Test. At the beginning, they claim that $\text{MMD}^2_l$ has higher variance than $\text{MMD}^2_u$ (we ...

Pipnap

131

asked Nov 26 at 16:30

2 votes

1 answer

59 views

What exactly is the definition of a UMP unbiased test?

I am solving exercises from section 8.3 of Hogg and McKean's "Introduction to Mathematical Statistics." I cannot proceed because the authors have not formally defined UMPU test even though ...

TryingHardToBecomeAGoodPrSlvr

357

asked Nov 26 at 12:44

1 vote

0 answers

41 views

What is the proof of the mean sampling distribution being approximated by Student's t-distribution? [closed]

In the case of non-normal distribution with unknown variance, what is the justification that the sampling distribution of the mean (with large samples) is approximately Student’s t-distribution? ======...

Sam

777

asked Nov 26 at 11:55

1 vote

1 answer

42 views

Rate of convergence in probability

I am reading the paper. In this paper, they prove Theorem 7, which is stated in the following way. Theorem 7: Let $p, q, X, Y$ be defined as in Problem 1, and assume $0 \leq k(x, y) \leq K$. Then: $$ \Pr_{...

Pipnap

131

asked Nov 25 at 21:18

2 votes

0 answers

43 views

Occurrence extrapolation/statistics in a set of incomplete collections

My statistics courses are a long way off now (I'm a biologist). My problem is probably trivial, but I don't know where to start. My goal is to calculate the occurrence of a gene X in a set of genomes. ...

fdecarpentier

21

asked Nov 20 at 21:07

2 votes

2 answers

44 views

Mathematical Reference for Metropolis-Within-Gibbs Algorithm

Is there a MATHEMATICAL reference for the Metropolis-Within-Gibbs algorithm which proves the algorithm mathematically? (Presumably, the reference shall use facts in Markov Chain Theory, the fact that ...

温泽海

639

asked Nov 20 at 21:05

1 vote

1 answer

60 views

Outcome Level vs. Treatment Level vs. Fixed-Effects Level in Difference-in-Differences

I am a bit confused about which controls I should include in my event-study (Callaway and Sant'Anna 2021) specifications. One of my models tries to understand the impact of the opening of a public ...

llb1706

43

asked Nov 20 at 11:59

1 vote

2 answers

25 views

Is a sequential binomial sample a multinomial sample?

Say I have N particles and I remove a fraction $f_1$ of these obtaining $k_1$ particles as $$ k_1 \sim \text{Binomial}(N,f_1) $$ and from these selected $k_1$ I have another Binomial draw of $k_2$ ...

RandomNameGenerator

33

asked Nov 20 at 4:10

0 votes

1 answer

27 views

Choose a good estimator in a candidate set

Recently, I've been interested in the following statistical problem: I have a set that consists of some estimators $\hat{A}_i$ of a matrix $A\in \mathbb{R}^{p\times p}$. Then I have some data ...

mathhahaha

209

asked Nov 19 at 12:48

6 votes

1 answer

226 views

MLE in stochastically increasing parametric family

Let $X$ have cumulative distribution function $F_{\theta}$, and suppose this family is stochastically increasing in $\theta$, that is, for $\theta_1<\theta_2$, $F(x;\theta_2) \le F(x;\theta_1)$. We have one ...

Noppawee Apichonpongpan

735

asked Nov 18 at 7:42

0 votes

0 answers

25 views

Variable selection for checking causal relationship of regression model: should or should not? [duplicate]

I am looking for documents and online sources to understand whether or not I should exclude variables from my model through model selection (variable selection). I also tried to use methods of Least ...

Student coding

23

asked Nov 17 at 13:31

1 vote

1 answer

44 views

Generalization error as U shape curve with respect to model complexity (bias variance tradeoff)

Is there any work that proves with mathematical rigor that the generalization error for certain learning problems exhibits a U-shaped curve with respect to model complexity (bias-variance tradeoff)? Any ...

Hao Yu

233

asked Nov 15 at 15:41

2 votes

2 answers

57 views

(More complete) proof that the Fisher information is additive

For independent, identically distributed variables it is well known that the Fisher information is additive, i.e. \begin{align} \mathcal{I}_n(\theta)&=\left<{\left({\frac{\partial}{\partial\...

AccidentalTaylorExpansion

135

asked Nov 15 at 10:04

2 votes

1 answer

41 views

Alternatives for RMSE to Evaluate Goodness of Fit for Stable Distribution Parameters

I am estimating the parameters (alpha, beta, gamma, delta) of a stable distribution from a list of numerical data. I used a package to generate data from one type of stable distribution, specifically ...

Danny Wen

121

asked Nov 15 at 0:34

1 vote

1 answer

91 views

Why could data bootstrapping modify the slope of a population coming from the same distribution?

I'm bootstrapping some samples (with replacement) to calculate slopes. Once that is done, the slopes, which should have the same distribution, do not have the same distribution. To be clear, I'm not asking ...

Chino Chiang

41

asked Nov 14 at 13:56

3 votes

1 answer

137 views

Different parametrizations of the exponential family

I found in this article https://arxiv.org/pdf/1607.06450 , formula 10, a parametrization of the exponential family that I think can be written like this: $$P(t|\eta,s)=e^{\eta t/s}e^{-g(\eta)/s}e^{c(t,...

Thomas

952

asked Nov 13 at 9:45

0 votes

1 answer

50 views

Connecting two different meanings of "degree of freedom"

I have heard at least 2 meanings of "degree of freedom": (1) the parameter in a t-distribution; (2) the number of values in the final calculation of a statistic that are free to vary (like ...

Iterator516

359

asked Nov 12 at 15:32

0 votes

0 answers

12 views

Non-Analytical Differentiable Hamiltonian Function in Neural Networks

I'm writing a study on this paper: https://arxiv.org/pdf/1906.01563v1 It's by Sam Greydanus et al. They discovered that by using neural networks to predict the Hamiltonian (total energy of a system) ...

Ole Askeland

11

asked Nov 11 at 20:45

0 votes

0 answers

28 views

Prove that a test is most powerful when $X_1,\cdots,X_n\sim U(0,\theta)$

Let $X_1,\cdots,X_n\sim U(0,\theta),\theta >0$ be independent random variables. I want to prove that $\phi :\mathbb{R}^n\to [0,1]$ given by $\phi (x):=\begin{cases}1,&\theta _0<x_{(n)}\vee ...

rfloc

163

asked Nov 10 at 23:12

2 votes

1 answer

43 views

A simple Hidden Markov Model

I was clarifying the exact formulas for the EM algorithm of a simple hidden Markov model. This problem came from problem 25, chapter 9, in which several practical examples of EM algorithms ...

ToBY

123

asked Nov 7 at 11:36

8 votes

3 answers

577 views

Proof of the statement "the best test is unbiased"

There is a corollary from Hogg and McKean's textbook titled "Introduction to Mathematical Statistics" and I have miserably failed to understand the proof. Unfortunately, my question requires ...

TryingHardToBecomeAGoodPrSlvr

357

asked Nov 6 at 12:41

0 votes

0 answers

9 views

Understanding the Implications of Similar Smooth Functions in Generalised Additive Models

I have a question about GAM models. If I fit two GAM models, one including all variables together and the other adding one variable at a time, and the resulting smooth functions are similar, what does ...

F3d33

1

asked Nov 6 at 9:48

3 votes

2 answers

83 views

Estimate a vector $\beta=\underbrace{\beta_1}_{\text{sparse}}+\underbrace{\beta_2}_{\text{dense}}$

In high-dimensional settings, we solve the linear regression using the lasso method which relies on the assumption of sparsity, $$ \hat{\beta}=\underset{\beta\in \mathbb{R}^{p}}{\arg \min}\|Y-X\beta\|...

mathhahaha

209

asked Nov 6 at 2:42

2 votes

1 answer

111 views

definition of regular estimators

In the book "Semiparametric Theory and Missing Data" by Tsiatis, superefficient estimators are defined as "they are unnatural and have undesirable local properties associated with them"...

kara890

311

asked Nov 5 at 18:22

2 votes

2 answers

30 views

Showing what the best line of fit is given the method of least squares - what do I do from here?

CrSb0001

123

asked Nov 5 at 1:48

0 votes

0 answers

25 views

Is there a form of regularized regression that's equivalent to maximum likelihood together with model selection by information criteria?

AIC We often use stuff like AIC for model selection: $$ \mathrm{AIC} = 2k - 2\ln(\hat{L}) $$ where $k$ is the number of parameters and $\hat{L}$ is the maximized likelihood function. https://en.wikipedia.org/wiki/...

Awkward Deduction

1

asked Nov 4 at 22:35

0 votes

0 answers

17 views

A practical way to understand subgaussian parameter

I am currently assuming that the random variable $X$ I am working with is subgaussian with parameter $\sigma^2$. I have simulated data, but I would like to know how to use the generated data to ...

Omega

113

asked Nov 4 at 18:24

0 votes

0 answers

25 views

Chi squared for samples of different sizes

I would like to test for independence of a categorical variable in three different samples, X, Y, and Z. Each sample can be either of category A or of category B. This seems like a straightforward ...

user443087

asked Nov 4 at 13:04

0 votes

0 answers

32 views

Square of the convergence rate $\|\hat{A}-A\|_F^2$

In a high-dimensional setting, if we have an estimator, $\hat{A}\in\mathbb{R}^{n\times n}$, we always try to get the convergence rate measured by matrix norms, such as $\|\hat{A}-A\|_F$. Now, if I ...

mathhahaha

209

asked Nov 4 at 8:21

1 vote

0 answers

21 views

Is $\mathbb{E}\left[\|\hat{\Sigma}\|_F\right]=\|{\Sigma}\|_F$?

In one paper I read, the authors write $$ \mathbb{E}\left[\|\tilde{\Sigma}^{-\frac{1}{2}}\left(\hat{\Theta}-\Omega\right)\|_F^2\right]=\mathbb{E}\left[\|{\Sigma}^{-\frac{1}{2}}\left(\hat{\Theta}-\...

mathhahaha

209

asked Nov 3 at 8:51

1 vote

1 answer

43 views

Unbiased Variance MLE Distribution

If you take $10000$ samples from a normal distribution, the unbiased variance MLE (with Bessel's correction) is $$\hat{\sigma}^2 = \frac{1}{9999}\sum_i (x_i - \hat{\mu})^2$$ Apparently the distribution ...

Trajan

503

asked Nov 2 at 21:02

4 votes

1 answer

128 views

Closed form solution for bayesian linear regression with 2 responses?

I am thinking about first principles from the point of view of a frequentist moving from regression with 1 response to regression with 2 responses. Reflecting on that I am trying to figure out how to ...

JCWong

1,662

asked Nov 2 at 0:26

1 vote

1 answer

48 views

The interpretation of the term "uncertainty" in statistics vs. information theory vs. machine learning

I have an ensemble model consisting of multiple classifiers and I wish to quantify the uncertainty of the predictions the ensemble model makes. From an information theory / machine learning ...

jjepsuomi

5,907

asked Nov 1 at 13:03

0 votes

0 answers

17 views

Deriving a multiple based on actuals and forecast values

For context, we are using the DeepAR model for demand planning forecasting. Currently the forecast often underrepresents actual demand. It was suggested that we use a higher quantile to overestimate ...

Wolfy

101

asked Oct 31 at 21:57

3 votes

1 answer

86 views

Probability expression in Multi-Task Logistic Regression

I'm trying to understand how the authors of this paper (Learning Patient-Specific Cancer Survival Distributions as a Sequence of Dependent Regressors) obtain the general formula on page for the ...

Dog_69

103

asked Oct 31 at 15:44

4 votes

2 answers

141 views

Variance-bias tradeoff formula for simple linear regression with both X fixed and X random

I am trying to understand the Variance-Bias tradeoff formula using simple linear regression. But there are some formulas I am not able to derive. I will explain what I mean by first doing it for a ...

user394334

164

asked Oct 30 at 11:26

0 votes

0 answers

23 views

Confidence interval for entropy from Basharin’s asymptotic normality result

Setup: Say we have i.i.d. observations $X_1, \dots, X_N$ from the distribution given by $$ 0 < p_i := \mathbb P(X_j = i) < 1, \quad i = 1, \dots, s, \quad \text{and} \quad \sum_{i = 1}^s p_i = 1....

zxmkn

223

asked Oct 29 at 10:25

2 votes

1 answer

47 views

Conditions for Pointwise convergence to imply uniform convergence

I have the following situation. Let $f:\mathbb{R}^p \times \Theta \to \mathbb{R}$ be a measurable function. Moreover, let $X_n$ be a sequence of real-valued random vectors. I know that the function ...

Treebeard

23

asked Oct 29 at 4:29
