
Questions tagged [mathematical-statistics]

Mathematical theory of statistics, concerned with formal definitions and general results.

0 votes
0 answers
23 views

Resampling for AB Test, to achieve normal distribution under the CLT

I have (finally) wrapped my head around the Central Limit Theorem. Very exciting. However, I am struggling with how to apply it, or whether it should be applied at all, to an AB test. In this example, let's say I ...
plotmaster473's user avatar
0 votes
0 answers
23 views

Correct usage of "sum" and "mean" for proportions vs continuous variables

What is the proper way to aggregate a measure of "proportions" vs "continuous" by "sum" vs "mean"? For example, let's say I have "time_on_site" and ...
plotmaster473's user avatar
0 votes
0 answers
9 views

Stein's method for exchangeable sequence

I have a random (not i.i.d.) sequence that I can show is an exchangeable pair. I am trying to apply relatively recent ideas in Berry-Esseen bounds to the problem (https://link.springer.com/chapter/10....
Pablitorun's user avatar
0 votes
1 answer
19 views

Interaction term negative when both its components are positive?

I examined the effect of labour cost, labour quality and their interaction (cost*quality) on FDI, but I got positive coefficients for both components and a negative coefficient for the interaction term. How could to ...
Lê Huy's user avatar
0 votes
1 answer
29 views

What is the relationship between OR, se, and CI

Apologies if this is a duplicate question, but I have been unable to find a clear answer. What is the relationship between Odds Ratio, standard error, and confidence intervals? I try to do a meta-...
user447683's user avatar
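
For the odds-ratio question above, the usual Wald-type relationship can be sketched as follows. This is a minimal illustration, assuming the standard error is reported on the log odds ratio scale and that a normal approximation is used for a 95% interval. The function names are hypothetical and do not come from the question.

```python
import math

# Standard normal quantile for a two-sided 95% interval (assumed level).
Z_95 = 1.959963984540054

def ci_from_or(odds_ratio, se_log_or, z=Z_95):
    """Return (lower, upper) for an odds ratio, given the SE of log(OR)."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)

def se_from_ci(lower, upper, z=Z_95):
    """Recover the SE of log(OR) from a reported interval.

    Assumes the interval is symmetric on the log scale (a Wald interval).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Illustrative values only: OR = 2.0 with SE(log OR) = 0.25.
lower, upper = ci_from_or(2.0, 0.25)
se_back = se_from_ci(lower, upper)
```

Under these assumptions the interval is symmetric around log(OR), so the odds ratio is the geometric mean of the two limits. That gives a quick check on whether a published interval was built this way.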
2 votes
0 answers
34 views

Marginal empirical distribution from joint sample

I have a fairly 'simple' doubt that I would like to clear up. Suppose I have a hierarchical model where data is sampled in the following manner: Sample $U_i$ from $P_U$ Sample $X_i$ from $P_{X|U_i}$ In ...
Stadium Arcadium's user avatar
3 votes
1 answer
30 views

What is the difference between Tolerance Interval and Control Limits?

I understand that a tolerance interval indicates a range within which a specified percentage of a population (future value) is expected to fall. Control limits, on the other hand, represent the ...
user510's user avatar
  • 131
0 votes
0 answers
10 views

Fixed-Length Confidence Interval for Gaussian iid Mean [closed]

I have a question on an exercise where the goal is to find a bounded length confidence interval for the mean of Gaussian random variables. More precisely, my question is on part a) of the exercise ...
Ryan Sfeila's user avatar
1 vote
0 answers
9 views

Cumulative distribution function of coherent system

The theorem is from here. Theorem 3.1. Let $X_1, \ldots, X_n$ be the i.i.d. component lifetimes of an $n$-component coherent system with signature $s$, and let $T$ be the system's lifetime. Then $$ \...
Unknown's user avatar
  • 63
0 votes
0 answers
16 views

Estimate SD[Z=X*Y] which one is correct?

Suppose we have a sample of two variables X and Y with sample size n. I want to calculate the standard deviation of Z = X*Y. I don't know which of the two options below is correct. Option 1: Simply ...
PTQuoc's user avatar
  • 193
4 votes
1 answer
68 views

How do machine learning topics fit into a traditional undergraduate statistics course on estimation?

I have recently been teaching an undergraduate introduction to statistics course but, as required by the program director, need to add some machine learning material to it. I'm wondering what is the appropriate ...
ExcitedSnail's user avatar
  • 3,050
2 votes
1 answer
44 views

Deriving Scale Parameter in Exponential Family

The following is from p. 98 of Casella & Berger's Statistical Inference (2024 edition): Several of the families introduced in Section 3.3 either are scale families or have scale families as ...
MrAmbiguneDL's user avatar
0 votes
0 answers
16 views

Compare two variances

I am reading this paper and have difficulty understanding Section 6: A Linear Time Statistic and Test. At the beginning, they claim that $\text{MMD}^2_l$ has higher variance than $\text{MMD}^2_u$ (we ...
Pipnap's user avatar
  • 131
2 votes
1 answer
59 views

What exactly is the definition of a UMP unbiased test?

I am solving exercises from section 8.3 of Hogg and McKean's "Introduction to Mathematical Statistics." I cannot proceed because the authors have not formally defined UMPU test even though ...
TryingHardToBecomeAGoodPrSlvr's user avatar
1 vote
0 answers
41 views

What is the proof of the mean sampling distribution being approximated by Student's t-distribution? [closed]

In the case of a non-normal distribution with unknown variance, what is the justification that the sampling distribution of the mean (with large samples) is approximately Student’s t-distribution? ...
Sam's user avatar
  • 777
1 vote
1 answer
42 views

Rate of convergence in probability

I am reading the paper. In it, they prove Theorem 7, which is stated as follows. Theorem 7: Let $p, q, X, Y$ be defined as in Problem 1, and assume $0 \leq k(x, y) \leq K$. Then: $$ \Pr_{...
Pipnap's user avatar
  • 131
2 votes
0 answers
43 views

Occurrence extrapolation/statistics in a set of incomplete collections

My statistics courses are a long way off now (I'm a biologist). My problem is probably trivial, but I don't know where to start. My goal is to calculate the occurrence of a gene X in a set of genomes. ...
fdecarpentier's user avatar
2 votes
2 answers
44 views

Mathematical Reference for Metropolis-Within-Gibbs Algorithm

Is there a MATHEMATICAL reference for the Metropolis-Within-Gibbs algorithm which proves the algorithm mathematically? (Presumably, the reference shall use facts in Markov Chain Theory, the fact that ...
温泽海's user avatar
  • 639
1 vote
1 answer
60 views

Outcome Level vs. Treatment Level vs. Fixed-Effects Level in Difference-in-Differences

I am a bit confused about what controls I should include in my event-study (Callaway and Sant'Anna 2021) specifications. One of my models tries to understand the impact of the opening of a public ...
llb1706's user avatar
  • 43
1 vote
2 answers
25 views

Is a sequential binomial sample a multinomial sample?

Say I have N particles and I remove a fraction $f_1$ of these obtaining $k_1$ particles as $$ k_1 \sim \text{Binomial}(N,f_1) $$ and from these selected $k_1$ I have another Binomial draw of $k_2$ ...
RandomNameGenerator's user avatar
0 votes
1 answer
27 views

Choose a good estimator in a candidate set

Recently, I've been interested in the following statistical problem: I have a set that consists of some estimators $\hat{A}_i$ of a matrix $A\in \mathbb{R}^{p\times p}$. Then I have some data ...
mathhahaha's user avatar
6 votes
1 answer
226 views

MLE in stochastically increasing parametric family

Let $X$ have cumulative distribution function $F_{\theta}$, and suppose this family is stochastically increasing in $\theta$, that is, for $\theta_1<\theta_2$, $F(x;\theta_2) \le F(x;\theta_1)$. We have one ...
Noppawee Apichonpongpan's user avatar
0 votes
0 answers
25 views

Variable selection for checking causal relationship of regression model: should or should not? [duplicate]

I am looking for documents and online sources to understand whether or not I should exclude variables from my model through model selection (variable selection). I also tried to use methods of Least ...
Student coding's user avatar
1 vote
1 answer
44 views

Generalization error as U-shaped curve with respect to model complexity (bias-variance tradeoff)

Is there any work that rigorously proves that the generalization error for certain learning problems exhibits a U-shaped curve with respect to model complexity (bias-variance tradeoff)? Any ...
Hao Yu's user avatar
  • 233
2 votes
2 answers
57 views

(More complete) proof that the Fisher information is additive

For independent, identically distributed variables it is well known that the Fisher information is additive, i.e. \begin{align} \mathcal{I}_n(\theta)&=\left<{\left({\frac{\partial}{\partial\...
AccidentalTaylorExpansion's user avatar
2 votes
1 answer
41 views

Alternatives for RMSE to Evaluate Goodness of Fit for Stable Distribution Parameters

I am estimating the parameters (alpha, beta, gamma, delta) of a stable distribution from a list of numerical data. I used a package to generate data from one type of stable distribution, specifically ...
Danny Wen's user avatar
  • 121
1 vote
1 answer
91 views

Why could data bootstrapping modify the slope of a population coming from the same distribution?

I'm bootstrapping some samples (with replacement) to calculate slopes. Once that is done, the slopes, which should have the same distribution, do not have the same distribution. To be clear, I'm not asking ...
Chino Chiang's user avatar
3 votes
1 answer
137 views

Different parametrizations of the exponential family

I found in this article https://arxiv.org/pdf/1607.06450, formula 10, a parametrization of the exponential family that I think can be written like this: $$P(t|\eta,s)=e^{\eta t/s}e^{-g(\eta)/s}e^{c(t,...
Thomas's user avatar
  • 952
0 votes
1 answer
50 views

Connecting two different meanings of "degree of freedom"

I have heard at least 2 meanings of "degree of freedom". The parameter in a t-distribution. The number of values in the final calculation of a statistic that are free to vary (like ...
Iterator516's user avatar
0 votes
0 answers
12 views

Non-Analytical Differentiable Hamiltonian Function in Neural Networks

I'm writing a study on this paper: https://arxiv.org/pdf/1906.01563v1. It's by Sam Greydanus et al., and they discovered that by using Neural Networks to predict the Hamiltonian (total energy of a system) ...
Ole Askeland's user avatar
0 votes
0 answers
28 views

Prove that a test is most powerful when $X_1,\cdots,X_n\sim U(0,\theta)$

Let $X_1,\cdots,X_n\sim U(0,\theta),\theta >0$ be independent random variables. I want to prove that $\phi :\mathbb{R}^n\to [0,1]$ given by $\phi (x):=\begin{cases}1,&\theta _0<x_{(n)}\vee ...
rfloc's user avatar
  • 163
2 votes
1 answer
43 views

A simple Hidden Markov Model

I was clarifying the exact formulas for the EM algorithm of a simple hidden Markov model. This problem came from problem 25, chapter 9, in which several practical examples of EM algorithms ...
ToBY's user avatar
  • 123
8 votes
3 answers
577 views

Proof of the statement "the best test is unbiased"

There is a corollary from Hogg and McKean's textbook titled "Introduction to Mathematical Statistics" and I have miserably failed to understand the proof. Unfortunately, my question requires ...
TryingHardToBecomeAGoodPrSlvr's user avatar
0 votes
0 answers
9 views

Understanding the Implications of Similar Smooth Functions in Generalised Additive Models

I have a question about GAM models. If I fit two GAM models, one including all variables together and the other adding one variable at a time, and the resulting smooth functions are similar, what does ...
F3d33's user avatar
  • 1
3 votes
2 answers
83 views

Estimate a vector $\beta=\underbrace{\beta_1}_{\text{sparse}}+\underbrace{\beta_2}_{\text{dense}}$

In high-dimensional settings, we solve the linear regression using the lasso method which relies on the assumption of sparsity, $$ \hat{\beta}=\underset{\beta\in \mathbb{R}^{p}}{\arg \min}\|Y-X\beta\|...
mathhahaha's user avatar
2 votes
1 answer
111 views

definition of regular estimators

In the book "Semiparametric Theory and Missing Data" by Tsiatis, superefficient estimators are defined as "they are unnatural and have undesirable local properties associated with them...
kara890's user avatar
  • 311
2 votes
2 answers
30 views

Showing what the best line of fit is given the method of least squares - what do I do from here?

I have been working through Multivariable Calculus 4th Edition by James Stewart (©1999) and am currently stuck on what seems to be a stats problem on problem 51 of Chapter 15.7: Suppose that a ...
CrSb0001's user avatar
  • 123
0 votes
0 answers
25 views

Is there a form of regularized regression that's equivalent to maximum likelihood together with model selection by information criteria?

AIC: We often use criteria like AIC for model selection: $$ \mathrm{AIC} = 2k - 2\ln(\hat{L}) $$ where $k$ is the number of parameters and $\hat{L}$ is the maximized value of the likelihood function. https://en.wikipedia.org/wiki/...
Awkward Deduction's user avatar
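
The AIC formula quoted in the excerpt above can be sketched numerically. This is an illustration only, assuming a Gaussian error model whose variance is set to its maximum likelihood estimate. The residual values, the parameter count `k=3`, and the function names are made up for the example.

```python
import math

def gaussian_max_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, with variance at its MLE (RSS / n)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    sigma2_hat = rss / n
    # -n/2 * log(2*pi*sigma2_hat) - RSS / (2*sigma2_hat), where the last term is n/2
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)

def aic(k, max_log_likelihood):
    """AIC = 2k - 2 ln(L_hat), where k is the number of estimated parameters."""
    return 2 * k - 2 * max_log_likelihood

# Hypothetical residuals from some fitted model with k = 3 parameters.
residuals = [0.5, -0.3, 0.1, -0.4, 0.2]
log_l_hat = gaussian_max_log_likelihood(residuals)
aic_value = aic(k=3, max_log_likelihood=log_l_hat)
```

At a fixed maximized likelihood, each extra parameter raises AIC by exactly 2, which is the sense in which the criterion acts as a penalty on model size.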
0 votes
0 answers
17 views

A practical way to understand subgaussian parameter

I am currently assuming that the random variable $X$ I am working with is subgaussian with parameter $\sigma^2$. I have simulated data, but I would like to know how to use the generated data to ...
Omega's user avatar
  • 113
0 votes
0 answers
25 views

Chi squared for samples of different sizes

I would like to test for independence of a categorical variable in three different samples, X, Y, and Z. Each sample can be either of category A or of category B. This seems like a straightforward ...
user avatar
0 votes
0 answers
32 views

Square of the convergence rate $\|\hat{A}-A\|_F^2$

In a high-dimensional setting, if we have an estimator, $\hat{A}\in\mathbb{R}^{n\times n}$, we always try to get the convergence rate measured by matrix norms, such as $\|\hat{A}-A\|_F$. Now, if I ...
mathhahaha's user avatar
1 vote
0 answers
21 views

Is $\mathbb{E}\left[\|\hat{\Sigma}\|_F\right]=\|{\Sigma}\|_F$?

In one paper I read, the authors write $$ \mathbb{E}\left[\|\tilde{\Sigma}^{-\frac{1}{2}}\left(\hat{\Theta}-\Omega\right)\|_F^2\right]=\mathbb{E}\left[\|{\Sigma}^{-\frac{1}{2}}\left(\hat{\Theta}-\...
mathhahaha's user avatar
1 vote
1 answer
43 views

Unbiased Variance MLE Distribution

If you take $10000$ samples from a normal distribution, the unbiased variance MLE (with Bessel's correction) is $$\hat{\sigma}^2 = \frac{1}{9999}\sum_i (x_i - \hat{\mu})^2$$ Apparently the distribution ...
Trajan's user avatar
  • 503
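
The Bessel-corrected estimator in the excerpt above can be sketched on simulated data. This is an illustration only, assuming standard normal draws and a fixed seed, neither of which comes from the question. It divides the sum of squared deviations from the sample mean by n - 1, which is also what Python's `statistics.variance` does.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable
n = 10_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]

# Sample mean, then sum of squared deviations divided by n - 1 (Bessel's correction).
mu_hat = sum(xs) / n
s2 = sum((x - mu_hat) ** 2 for x in xs) / (n - 1)

# The standard library's sample variance uses the same n - 1 divisor.
s2_stdlib = statistics.variance(xs)
```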
4 votes
1 answer
128 views

Closed form solution for bayesian linear regression with 2 responses?

I am thinking about first principles from the point of view of a frequentist moving from regression with 1 response to regression with 2 responses. Reflecting on that I am trying to figure out how to ...
JCWong's user avatar
  • 1,662
1 vote
1 answer
48 views

The interpretation of the term "uncertainty" in statistics vs. information theory vs. machine learning

I have an ensemble model consisting of multiple classifiers and I wish to quantify the uncertainty of the predictions the ensemble model makes. From an information theory / machine learning ...
jjepsuomi's user avatar
  • 5,907
0 votes
0 answers
17 views

Deriving a multiple based on actuals and forecast values

For context, we are using the DeepAR model for demand planning forecasting. Currently the forecast often underrepresents actual demand. It was suggested that we use a higher quantile to overestimate ...
Wolfy's user avatar
  • 101
3 votes
1 answer
86 views

Probability expression in Multi-Task Logistic Regression

I'm trying to understand how the authors of this paper (Learning Patient-Specific Cancer Survival Distributions as a Sequence of Dependent Regressors) obtain the general formula on page for the ...
Dog_69's user avatar
  • 103
4 votes
2 answers
141 views

Variance-bias tradeoff formula for simple linear regression with both X fixed and X random

I am trying to understand the Variance-Bias tradeoff formula using simple linear regression. But there are some formulas I am not able to derive. I will explain what I mean by first doing it for a ...
user394334's user avatar
0 votes
0 answers
23 views

Confidence interval for entropy from Basharin’s asymptotic normality result

Setup: Say we have i.i.d. observations $X_1, \dots, X_N$ from the distribution given by $$ 0 < p_i := \mathbb P(X_j = i) < 1, \quad i = 1, \dots, s, \quad \text{and} \quad \sum_{i = 1}^s p_i = 1....
zxmkn's user avatar
  • 223
2 votes
1 answer
47 views

Conditions for Pointwise convergence to imply uniform convergence

I have the following situation. Let $f:\mathbb{R}^p \times \Theta \to \mathbb{R}$ be a measurable function. Moreover, let $X_n$ be a sequence of real-valued random vectors. I know that the function ...
Treebeard's user avatar
