Newest 'computational-statistics' Questions

3 votes

1 answer

111 views

Wilcoxon signed-rank test always finding randomly generated ratios to be different from unity?

I need to test if a group of ratios from an experiment (calculated as condition1/condition2 from paired samples) is significantly different from one. All measures in the experiment are positive, real ...

user443937

33

asked Nov 15 at 18:14

0 votes

0 answers

32 views

Using inducing points for exact Gaussian process inference

I'm a bit muddled about inference for Gaussian processes using inducing points, in particular the conditions under which this should be exact inference and not an approximation. For a Gaussian process $f\...

Charlie

1

asked Sep 8 at 20:36

1 vote

1 answer

62 views

Computing or sampling from a posterior with samples observed through a dimensional reduction transformation

Let $\boldsymbol \theta$ be a vector of parameters, with a known prior $\pi(\boldsymbol \theta)$. Let $\boldsymbol x_1,...,\boldsymbol x_n$ be i.i.d. samples with $\boldsymbol x|\boldsymbol \theta$ ...

heckman

13

asked Aug 19 at 10:18

3 votes

0 answers

35 views

Is it more efficient to optimize precision than covariance matrix?

This might be a silly question, but I want to make sure I'm not missing something. Say that we want to fit a multivariate Gaussian distribution $\mathcal{N}(\mu, \Sigma)$ to some data by maximizing ...

dherrera

1,405

asked Aug 14 at 19:05

0 votes

0 answers

30 views

How do I maximize this specific loglikelihood function in R?

I am interested in determining the parameters mu and lambda that maximize the function: ...

learner123

11

asked Jul 27 at 0:53

2 votes

1 answer

164 views

Why does the `boot` R package require the `i` argument? When does it make the package easier to use instead of harder? [closed]

I want to use the boot package to calculate bootstrap confidence intervals for the mean. Sure, I could do this by inverting a t-test, but I want to see what happens ...

Dave

67.2k

asked Jul 25 at 10:46

1 vote

0 answers

22 views

Recycling MCMC samples for another data set from the same distribution

Suppose I'm given $\theta_0$ and I want to sample data from a density $f(Y|\theta_0)$ and then sample from the posterior of $\theta|Y$ (given, obviously, some prior). I want to do this lots of times, ...

Thomas Lumley

46.8k

asked Jul 3 at 6:07

1 vote

0 answers

43 views

How to approximate the eigendecomposition of a correlation matrix when the data have been standardized?

Context: I am working to develop a penalized regression framework that will scale up to analyzing high dimensional data with a certain correlation structure. Let $X$ represent an $n \times p$ matrix of ...

Tabitha Peter

11

asked Jun 24 at 14:44

0 votes

0 answers

17 views

Why does the forecast for some series degrade when using a VARMA model compared to independent ARMA models?

I am working with multiple time series that I suspect are correlated, and I have assumed that using a VARMA model would at least not degrade the forecasts of each series, if not improve them. However, ...

Rocio

1

asked Jun 7 at 16:10

1 vote

0 answers

30 views

Statistically determining a count of particles

I perform experiments to take measurements on various pharmaceuticals. One such measurement concerns the number of particles confined in a small volume. The raw data in my experiment ...

gokudegrees

21

asked May 30 at 20:31

1 vote

1 answer

91 views

Using Rao-Blackwell to improve the estimator of P(X/Y < t)

X and Y are independent N(0, 1) random variables, and we want to approximate P(X/Y ≤ t) for a fixed number t. The first part of the problem was to describe a naive Monte Carlo estimate. I described ...

stat_student123

13

asked Apr 29 at 16:22

3 votes

0 answers

57 views

XGBoost with time lagged predictors

I have a prediction problem that involves an outcome $Y_t$ and predictors $X_t$ that vary with time $t$. I want to fit a regression of $Y_t$ on $(t,X_t)$ including also lagged versions of $X$, i.e., $...

Iván Díaz

31

asked Apr 23 at 18:52

0 votes

0 answers

15 views

Estimating the alpha-support of a distribution [duplicate]

I want to estimate the $\alpha$-support ($S_{\alpha}$) of a distribution, which is the minimum-volume subset of the support $S$ of a probability distribution $P$ ($S = \mathrm{supp}(P)$) that supports a probability ...

SpaakC

1

asked Apr 13 at 20:42

1 vote

0 answers

24 views

How to compute the expected value of a function of a random variable given its log-density function? [closed]

Given a log-density function $\mathcal{L}f_{X}(x)$ of a 1d continuous random variable $X \in \mathcal{L}^{\infty}$ and a 1d polynomial function $h: \mathcal{I}(X) \to \mathbb{R}$, the expected value ...

Alice Springs

11

asked Mar 30 at 14:14

0 votes

1 answer

35 views

Wilcoxon signed-rank test with differing list lengths [closed]

I am running a Wilcoxon signed-rank test with two lists. One contains 5 elements and the other contains 6. They were taken from the same place but under different conditions. I am trying to compute the ...

Indefeasible

1

asked Mar 29 at 3:23

1 vote

0 answers

56 views

Sampling from a very high dimensional Gaussian

I would like to sample from a Gaussian $N(0,K)$ where $K$ is a kernel gram matrix, so that $K=[K_{ij}]$ with $K_{ij} = k(x_i,x_j)$ for some positive definite function $k$. The first issue is that ...

WeakLearner

1,531

asked Mar 8 at 18:59

0 votes

0 answers

13 views

False negative B coefficient following multiple linear regression? [duplicate]

I'm running a linear regression in SPSS to test for effects of a binary variable (X) on cost of hospital admission. The variable is correlated with a cost increase of around $3000. When the model has ...

James

1

asked Mar 2 at 16:56

3 votes

1 answer

71 views

Expectation and variance of bivariate skew normal distribution

I am fitting a bivariate skew normal distribution to 2D data through the sn package in R. I get a $2 \times 1$ vector of ...

Kasthuri

173

asked Feb 15 at 15:27

0 votes

0 answers

23 views

Metropolis-Hastings on domain $(2, \infty)$

I'm trying to understand the Metropolis Hastings algorithm in depth by solving some exercise problems. On one of them, I'm asked to use MH to generate samples from $$f(x) = c \frac{1}{\theta}e^{-\frac{...

Christina Kataki

1

asked Feb 1 at 22:46

0 votes

0 answers

25 views

Reference datasets for conditional density estimation

[In case you feel inclined to close this question because I'm asking for a dataset - I'm looking for solutions in the spirit of point 2 (on-topic) in the accepted answer to this question about asking ...

Scriddie

2,439

asked Jan 31 at 12:21

4 votes

1 answer

107 views

Check if a coin flips randomly, but it can have a different number of sides each toss

I would like to check if a coin flips randomly, based on observational data. The catch is, the coin can have two sides, but also three, four, up to nine. The number of sides differs in each ...

Nucular

453

asked Jan 9 at 22:53

0 votes

0 answers

77 views

Efficient way to encode a set of large covariance matrices

I have a computational model that involves having a set of $K$ covariance matrices, $\{\Sigma_1, ..., \Sigma_K\}$ with each $\Sigma_i \in R^{n \times n}$. Storing all these full covariance matrices is ...

dherrera

1,405

asked Dec 12, 2023 at 18:54

1 vote

0 answers

35 views

Mixed effect model question

Hi, I have a certain task I want to solve: For two months, participants played an app, in which they played 5 different therapeutic games (TGs). At the beginning of each session, they also completed a ...

nof

11

asked Dec 2, 2023 at 10:10

0 votes

0 answers

39 views

Resampling only $N$ particles out of $N(T+1)$ weighted particles

I have a bunch of weighted particles $(Z^{(i, k)}, W^{(i, k)})$ from a distribution $\mu(dz)$ where $i=1, \ldots, N$ and $k=0,\ldots, T$. These define the following empirical approximation $$ \hat{\...

Physics_Student

667

asked Oct 6, 2023 at 15:07

0 votes

1 answer

90 views

Guidance for statistical analysis on academic collaborations [closed]

I am currently engaged in a research project involving data analysis in the field of academic publications and author collaborations. The dataset I'm working with includes information such as ...

idkwhatiamdoing

11

asked Aug 21, 2023 at 9:13

2 votes

1 answer

254 views

Confusion on Chi-Squared test results

Basically, I have data with which I am trying to assess whether there is any correlation between the gender of the manager and the gender of people within their team. I decided to do the chi-squared ...

prettyPlease

21

asked Aug 20, 2023 at 22:37

2 votes

0 answers

75 views

Bias and Variance of an Honest Random Forest

I am trying to read the paper Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. In Section 3.1 (Theoretical Background), page 13, paragraph 2, the authors have ...

yo wa

137

asked Aug 9, 2023 at 10:36

1 vote

0 answers

67 views

Most efficient way of converting a Quarto document to a presentation [closed]

I currently have a large body of statistics lecture notes that I wrote in Rmarkdown/Quarto document format, and I am looking to convert these notes to Quarto presentations in the simplest way possible....

Rmarkdown_user

11

asked Aug 9, 2023 at 4:05

0 votes

0 answers

44 views

Inference of Beta-Bernoulli Distribution

Assume $x_1, x_2, \cdots, x_n$ follow a $Bern(\pi_0)$ distribution. Let $y_{ik}$ follow $Beta(\alpha,\beta)$, $i\in \{1,\cdots, n\}$, and $k\in \{1,\cdots, K\}$. Let $z_k$ follow a Bernoulli distribution with a ...

LAM_MN

1

asked Aug 7, 2023 at 1:21

0 votes

0 answers

252 views

Why do I get NaN p values in some variables when using mgcv to fit generalized additive mixed models?

I am currently trying to fit milk production data collected over three years using a generalized additive mixed model through the mgcv package. The problem is that I am getting NaN p-values for some variables. ...

Zainab Hassan

1

asked Jun 28, 2023 at 20:06

0 votes

0 answers

61 views

Is this algorithm for robust estimation of the covariance matrix sensible?

I have a high dimensional dataset $\bf{X} \subset \mathbb{R}^d$, which is multimodal and has outliers. I want to estimate a robust measure of association, something like the correlation between two ...

MachineEpsilon

3,086

asked Jun 8, 2023 at 3:17

1 vote

0 answers

19 views

How do I numerically compute $I(X;CX+Y)$?

Suppose that $X\sim\text{Bernoulli}(\nu)$ for some $\nu\in(0,1)$, and $Y\sim N(0,1)$ are independent random variables. I want to compute the mutual information $I(X;CX+Y)$, where $C$ is some known non-...

Resu

229

asked May 5, 2023 at 5:22

2 votes

1 answer

154 views

Generating MLE in Python - Problem with the function [closed]

After my previous question (here) I tried to improve my work with this distribution. I'm using the parametrization $$f_X(x) = \frac{\theta^2 x^{\theta-1}(\gamma-\log(x))}{1+\theta\gamma} \mathbb{I}(0&...

Lucas cantu

197

asked May 4, 2023 at 20:44

4 votes

2 answers

378 views

How to iteratively calculate weighted standard error to report alongside a weighted mean

I have a group of individuals for which I would like to report a mean and a weighted error. The data that I observe on a daily basis are two independent $iid$ random variables with unknown ...

bmasri

193

asked Apr 28, 2023 at 10:04

4 votes

1 answer

197 views

bootstrap confidence interval and p-value calculations for finite population sizes

I am comparing the difference of medians between two groups of sample sizes $n_1$ and $n_2$. I would like to confirm that my bootstrap approach for finite population size without pooling sample data ...

Docuemada

103

asked Apr 13, 2023 at 18:54

12 votes

6 answers

2k views

How to generate from this distribution without inverse in R/Python?

I am working with a distribution with the following density: $$f(x) = - \frac{(\alpha+1)^2 x^\alpha \log(\beta x)}{1-(\alpha + 1)\log(\beta)}$$ and CDF $$\mathbb{P} (X \leq x) = \int_0^x - \frac{(\...

Lucas cantu

197

asked Apr 9, 2023 at 15:16

0 votes

0 answers

82 views

Exact computation of Bayes factor for multivariate normal

Question: Is there a known, exact expression for the Bayes factor between two multivariate normal hypotheses? Let $H_1$ and $H_2$ be two subsets of $R^d$ with normal priors $\pi(\mu|H_j)$. The sets $...

tims

1

asked Apr 3, 2023 at 16:21

1 vote

1 answer

86 views

XGBoost: Why is the "approximate algorithm" faster?

I am reading T. Chen, C. Guestrin, "XGBoost: A Scalable Tree Boosting System", 2016 (arXiv), which is seemingly full of typos. They propose the so-called "approximate algorithm" (...

paperskilltrees

375

asked Apr 2, 2023 at 13:55

0 votes

0 answers

59 views

Can I use Kendall's correlation to determine the correlation between continuous and binary variables?

I have a dataset where the first 6 columns correspond to binary entries referring to sick/not sick and 2 additional columns with age and a specific score (dim of the dataset 60x8). I need to generate ...

mango

1

asked Mar 7, 2023 at 20:19

0 votes

0 answers

405 views

Testing whether a set of points on the unit sphere is uniformly distributed

The canonical way to do the test is to perform the spherical harmonic transform of the empirical distribution and then check that the power spectrum decays, but this is presumably fairly expensive. Is ...

Igor Rivin

337

asked Mar 3, 2023 at 22:34

1 vote

0 answers

323 views

Algorithm for Irwin Hall Distribution [closed]

I've been trying to create a function for the Irwin Hall distribution that doesn't face the same issue as the unifed package implementation. Because the function suffers from numerical issues, I ...

user1329307

113

asked Feb 27, 2023 at 14:42

0 votes

0 answers

166 views

Overflow when computing binomial distribution for large n [duplicate]

How do you compute a binomial probability distribution for large $n$? If I try the following, I get an integer overflow in any programming language: ...

at01

111

asked Feb 24, 2023 at 6:22

3 votes

0 answers

125 views

How does one create comparable metrics when the original metrics are not comparable?

The problem is that I have several groups (say 3, to make the discussion concrete) with observations from true but unknown distributions $p^*_1, p^*_2, p^*_3$ (or 3 populations). I can compute some ...

Charlie Parker

7,074

asked Feb 4, 2023 at 23:00

2 votes

0 answers

335 views

Rust or C++ for computational statistics? [closed]

I'll work on developing computer-intensive Bayesian sampling algorithms for spatio-temporal applications (e.g. MCMC, KF). So far, I'm thinking of coding my algorithms in C++. However, I've heard that ...

antarctica

75

asked Jan 30, 2023 at 12:39

1 vote

0 answers

125 views

Sampling From Four-Parameter Beta Distribution

Most statistical computing packages have functions to sample out of a two-parameter Beta distribution, but not many offer functions for sampling out of a four-parameter Beta distribution. If I want to ...

nguzman

123

asked Jan 25, 2023 at 0:13

1 vote

0 answers

32 views

How to filter out outliers from dataset? [duplicate]

I am trying to filter out outliers given a data set (maximum of 50 samples). For example: dataSet = (10.0, 10.0, 10.0, 10.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, ...

Java2Avaj

11

asked Jan 10, 2023 at 7:59

1 vote

0 answers

94 views

Introducing the third spatial dimension (Z-coordinate) in the generalized dissimilarity model in R

Generalized dissimilarity modeling (R package gdm) is a tool to study the relative effects of the user-defined environmental gradients and the spatial distance decay on the pair-wise dissimilarity ...

Kryštof Chytrý

133

asked Dec 28, 2022 at 21:56

0 votes

0 answers

28 views

How to simulate multivariate posterior distribution with a flat prior in general?

user45765

1,465

asked Dec 27, 2022 at 15:49

0 votes

0 answers

119 views

Numerical Stability when Inverse CDF Sampling from Truncated Density

Let $f(x)$ be the pdf of a random variable that we want to truncate to the interval $[a,b]$ and then sample from it. Let $F(x)$ denote the corresponding cdf. We can use inverse cdf sampling and ...

yrx1702

730

asked Dec 21, 2022 at 8:37

0 votes

0 answers

79 views

Conditional expectation of Uniform given sum of Bernoulli trials

Given: [] Find the conditional probability distribution of theta given Sn and compute the conditional expectation. I believe the distribution of Sn will be a binomial with mean ntheta and variance (...

Santori

1

asked Dec 12, 2022 at 18:17
