Questions tagged [mahalanobis]
61 questions
0
votes
0
answers
11
views
Mahalanobis distance as a bound on the Euclidean distance
Can someone please help me figure out how to prove (or disprove) the following statement?
Consider
A random point $\mathbf{M}$ in 2D space, normally distributed with mean $\mathbf{m}=(m_x, m_y)$ and ...
1
vote
0
answers
29
views
How can I identify the distribution of a series of Mahalanobis distances?
If my dataset follows a multivariate t-distribution, what is the CDF of the Mahalanobis distance of a data point outside the sample? In other words, if I want to calculate the probability that a ...
1
vote
1
answer
56
views
Mahalanobis distance calculation in MatchIt function
I've recently started using the MatchIt package and it's great! I'm also learning a lot by reading the documentation associated with the package. My question is about the calculation of Mahalanobis distance. ...
2
votes
0
answers
40
views
What makes a statistic valid for Monte Carlo simulation?
A while back I was reading Garland et al. (1993) about studying whether two groups of animals, say herbivores and carnivores, differ in their mean value for some trait, like the amount of territory ...
1
vote
0
answers
83
views
Rescaling for unit variance before finding Mahalanobis distance?
I am reading up on Mahalanobis distance and Principal Component Analysis. According to this description (Section 3), the data is transformed into uncorrelated variables before rescaling to unit ...
0
votes
0
answers
49
views
How to interpret the cut-off of a Mahalanobis distance in R
After applying the cut-off, I see two multivariate outliers in the plot and get 4 numbers printed in my console.
(43 and 2 in the first row and 2 and 23 in the second row)
What do they stand for? How can I ...
0
votes
0
answers
72
views
Expectation of Mahalanobis Distance and its logarithm
Suppose:
\begin{equation}
X \sim \mathcal{N}(X; \mu, \Sigma_x) \text{ s.t. } \Sigma_x \sim \mathcal{IW}(\Sigma_x; \Psi, v)
\end{equation}
Where $\mathcal{IW}$ is the Inverse-Wishart distribution. This ...
2
votes
1
answer
111
views
How to obtain the (weighted or unweighted) L1 imbalance measure for "raw" data
I have two questions regarding the JASA paper, Multivariate Matching Methods That Are Monotonic Imbalance Bounding, by Iacus et al. (2011). The authors produce Figure 2 (left panel) where they compare ...
10
votes
3
answers
2k
views
How can this counterintuitive result with the Mahalanobis distance be explained?
I encountered a strange issue when performing Mahalanobis distance matching. Let's say I have one treated unit with the following values on two variables: $T:(17, 4)$. I have two control units with ...
0
votes
1
answer
220
views
Evaluating the quality of matches after Mahalanobis distance matching in Stata
I am new to Mahalanobis distance matching and am looking for information on how I can assess whether the matches I got using Mahalanobis matching are good matches.
...
2
votes
1
answer
326
views
Would using a Mahalanobis transformation be the same as standardizing the variable?
I am trying to use a Mahalanobis transformation on a single variable that is autocorrelated in a time series that goes from week to week over several years. The Mahalanobis transformation ...
1
vote
0
answers
274
views
Mahalanobis Distance critical alpha cutoff [closed]
I have two predictors and five dependent variables. I'm trying to figure out what the critical alpha cutoff for Mahalanobis distance is for a model with 5 DVs to check the assumption of multivariate ...
1
vote
1
answer
261
views
All Weights Are "1" For Mahalanobis Nearest-Neighbor Matching
When generating weights using nearest-neighbor matching with Mahalanobis distance, all of my weights for each treated and non-treated unit are "1". Below is my syntax using the ...
2
votes
1
answer
234
views
Merits of different matching & weighting methods for multiple treatments
I have three land use classes (natural farming, chemical farming, and forest) and would like to compare the densities of different bird species between them. I would like to get the ATE (I think).
I ...
0
votes
0
answers
73
views
Is it possible to apply the kernel trick to a "Mahalanobis distance learner" such as GLS?
1. https://arxiv.org/pdf/0804.1441.pdf
2. https://www.sciencedirect.com/science/article/abs/pii/S0925231210001165
These papers describe kernelizing a Mahalanobis distance learner.
I am interested in ...
1
vote
1
answer
57
views
Built-in Mahalanobis distance function gives a different result
I want to calculate the Mahalanobis distance using the formula and the theoretical explanation given here
...
-1
votes
1
answer
134
views
How to (dis)prove Mahalanobis $ d^2 $ is non-decreasing?
I have been studying multivariate analysis and hypothesis testing, and came across this question in the end-of-chapter exercises
Define the Mahalanobis measure of distance squared between
two populations with a common ...
1
vote
0
answers
50
views
Running several hierarchical linear regressions, all have the same Mahalanobis Distance. Is this weird?
As the title says, I've run several hierarchical linear regressions in SPSS and I've found that all 6 of them have the same Mahalanobis Distance. I'm wondering if this is a normal thing to happen or ...
7
votes
1
answer
1k
views
Why does univariate Mahalanobis distance not match z-score?
I am using Mahalanobis distance for outlier detection. Sometimes my dataset only has 1 feature, sometimes many more. I believe the univariate Mahalanobis distance should be equal to the z-score of the ...
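The expectation stated in this question can be checked numerically. The following is a minimal sketch, assuming the same mean and the same sample variance (denominator n - 1) are used for both quantities; the function names and the data are hypothetical and not taken from the question. Under those assumptions the univariate Mahalanobis distance coincides with the absolute z-score.

```python
import math
import statistics


def univariate_mahalanobis(x, data):
    """Mahalanobis distance of x from a 1-D sample (square root of the quadratic form)."""
    mu = statistics.mean(data)
    var = statistics.variance(data)  # sample variance, denominator n - 1
    return math.sqrt((x - mu) ** 2 / var)


def abs_z_score(x, data):
    """Absolute z-score of x, using the sample standard deviation (n - 1)."""
    return abs(x - statistics.mean(data)) / statistics.stdev(data)


# Hypothetical data, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
d = univariate_mahalanobis(9.0, data)
z = abs_z_score(9.0, data)
print(d, z)
```

If the two numbers disagree in practice, possible causes worth checking are a different variance denominator (n versus n - 1) or a function that returns the squared distance, as R's `mahalanobis()` does.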
-1
votes
0
answers
43
views
Gaussian distribution, Mahalanobis distance, probability [closed]
Suppose we have a set of b variables with x and y coordinates (bx1 by1, bx2 by2, ..., bxN byN), each of which has a Gaussian distribution. We also have some points, say (Ux1 Uy1, Ux2 Uy2, ...
2
votes
2
answers
737
views
Confusion on calculating Mahalanobis distance between a point and a cluster
I am slightly confused as to how you calculate Mahalanobis distance given a set of data. I have tried asking my tutor for help but he does not seem interested in helping whatsoever and I am ...
1
vote
0
answers
175
views
Distance metric that is robust to collinearity
I'm trying to find a distance metric that takes into account the correlation between vectors. That is, suppose we have matrix $M$ of dimensions $n \times k$, and we take the pairwise distance between ...
4
votes
1
answer
261
views
Multivariate Chebyshev's inequality with Mahalanobis distance
In Chebyshev's inequality, we can generalize the 68-95-99.7 rule from normal distributions to bound how much density is within a certain number of standard deviations from the mean.
$$
P\big(
\big\...
0
votes
0
answers
98
views
Deducing $L^1$ Boundary from Mahalanobis Boundary
Assume maximum likelihood estimators $a,b$ of size $p$, with corresponding estimated covariance matrices $V^a,V^b$. In fact $a,b$ are two regression coefficient vectors.
Denote $q=a-b$ the vector of ...
2
votes
0
answers
192
views
About the calculation of the covariance matrix in Mahalanobis distance: How is $W^TW$ equal to the covariance matrix? [closed]
I was reading about deep metric learning (from here) and came across the Mahalanobis distance. I understood why we cannot use Euclidean distance if the distribution is not isotropic (the covariance ...
0
votes
0
answers
287
views
Mahalanobis vs centering / standard deviations
Is there a difference between using a Mahalanobis distance and transforming the data via centering (and normalization) when you are interested in calculating distances?
This means, if you are interested ...
1
vote
0
answers
56
views
Distance defined by second moment, akin to Mahalanobis distance?
In ordinary linear regression ($c=0$) and ridge regression ($c > 0$), for design matrix $X$ with dimensions $N$ observations by $D$ dimensions, the $N \times N$ hat matrix is given by:
$$H = X (X^T ...
3
votes
1
answer
646
views
What is the covariance matrix of the normal order statistics?
I would like to test if a sample comes from a standard normal distribution. I want to do that by sorting the sample values, and measuring the Mahalanobis distance to the expected order statistics from ...
0
votes
0
answers
20
views
Why is the covariance matrix inverted in the definition of the Mahalanobis distance? [duplicate]
I'm on my first course on data science, and I encountered the Mahalanobis distance for the first time. It was mentioned that intuitively, what it does is that it corrects for the fact that some ...
0
votes
0
answers
49
views
How to find 'influential points' in multivariate data with weak covariance
Part-1: I have used PCA and Mahalanobis distance to find outliers. But in both cases, only the highest or lowest values are detected as outliers. I am looking for a way that any data point that does ...
1
vote
1
answer
321
views
Mahalanobis Distance question in R
I have a question when calculating Mahalanobis Distance using R.
For the Mahalanobis distance function below:
mahalanobis(x, **center**, cov, inverted = FALSE, ...)
...
0
votes
0
answers
42
views
Is negative Mahalanobis distance proportional to log probability?
Is the following statement true? I was sure it was, but I was told by someone else that it is not.
$$
p(\mathbf{x}_i | y_i) = \frac{1}{2\pi^{\frac{D}{2}} |\mathbf{\Sigma}_c|^{\frac{1}{2}} } e^{ -\frac{...
0
votes
0
answers
25
views
Effect of correlation in matching
What is the effect of correlations among observed covariates or between observed and unobserved covariates on the quality of matching, when matching is done using
Propensity score
(Euclidean/...
2
votes
1
answer
334
views
Is there an intuition about the matrix operations in the exponent of the multivariate normal distribution?
In the exponent of the multivariate distribution, there are 2 vectors and a square matrix multiplied together to get a scalar result:
$$(\mathbf{x} - \mu)^{\text{T}}\Gamma^{-1}(\mathbf{x} - \mu)$$
...
2
votes
3
answers
398
views
How to find the variance(s) of a bivariate normal density such that 95% of the mass is within a certain radius from the mean defined by a point A?
I would like to find the variance of a bivariate normal density (BND), centered at the mean M, such that 95% of its mass is within a certain radius, which depends on the position of a point, A.
(Note: ...
15
votes
2
answers
14k
views
What are the pros and cons of using Mahalanobis distance instead of propensity scores in matching?
I learned about this option of using Mahalanobis distance instead of PS to do matching from the matchit() function in R. It seems a more nonparametric approach. Could you state its pros and cons and ...
0
votes
0
answers
444
views
Mahalanobis distance to detect multivariate outliers [duplicate]
I have to detect outliers on 3 variables. On the internet I found the Mahalanobis distance, but I understood I can use it only on multivariate normally distributed data, and my data isn't. So, do you ...
1
vote
0
answers
573
views
Covariance Matrix Decomposition - Data Decorrelation
So I recently found out about Mahalanobis distance. Given an r.v. $x$ in N-dimensional space, an associated metric is defined by
$$M(x) = \sqrt{(x-\mu)^T S^{-1}(x-\mu)}$$
where $\mu$ and $S$ are mean ...
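The formula $M(x)$ quoted in this question can be evaluated directly. The following is a minimal sketch for the two-dimensional case only, assuming $S$ is an invertible $2 \times 2$ covariance matrix given as nested lists; the function name `mahalanobis_2d` and the inputs are hypothetical, chosen for illustration.

```python
import math


def mahalanobis_2d(x, mu, S):
    """Evaluate sqrt((x - mu)^T S^{-1} (x - mu)) for a 2x2 matrix S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det == 0:
        raise ValueError("S must be invertible")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form dx^T inv dx.
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)


# With the identity matrix as S, the result reduces to the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(mahalanobis_2d((3.0, 4.0), (0.0, 0.0), identity))

# A larger variance along the first axis shrinks distances along that axis.
stretched = [[4.0, 0.0], [0.0, 1.0]]
print(mahalanobis_2d((2.0, 0.0), (0.0, 0.0), stretched))
```

For higher dimensions a linear-algebra library would normally be used to solve the linear system instead of forming the inverse by hand.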
0
votes
1
answer
132
views
When running a multiple regression, are both dependent and independent variables scanned for outliers?
I want to run a multiple regression analysis using SPSS. I have used the Mahalanobis d square method to find outliers. However my question is, do I add the dependent variable to the list of ...
2
votes
1
answer
283
views
How to determine an appropriate "closeness" threshold when matching for causal inference?
Say I have a [yes/no] treatment variable (e.g. the customer complained about their order) and I want to estimate the causal impact of this "treatment" on the average customer's future spend. ...
3
votes
2
answers
410
views
Direction of an outlier detected by the Mahalanobis distance
Mahalanobis distance provides a value that might be used for the detection of outliers. My question: how to calculate the direction of the outlier (as a vector)?
A simple answer would be to use the ...
1
vote
1
answer
910
views
Anomaly detection using Mahalanobis distance
I am using Mahalanobis distance to identify outliers. I am training using a kind of one-class classification, by training only on positive samples and trying to predict negative samples using distance ...
1
vote
1
answer
64
views
Detect outliers / detect classes
Currently I have a dataset that contains several products with different prices and quantities. My goal is to detect if the given product was sold as a package or as a unit and I have used the ...
0
votes
1
answer
847
views
Mahalanobis Distance for Continuous and Ordinal Covariates
My dataset of home sales includes covariates such as square_feet which are continuous and others like num_bedrooms which are in <...
2
votes
1
answer
368
views
Why does this formula produce $p_{2}$ probabilities for Mahalanobis distances?
At the IBM website it is written that
The p1 probabilities are standard probabilities of an observation from
a multivariate normal distribution being that far or further from the
centroid, ...
0
votes
0
answers
81
views
Equivalence between Mahalanobis distance and PCA (mathematical proof) [duplicate]
From this article and this post, a strong connection emerges between Mahalanobis distance and PCA. In particular, the first article I reference says:
"the squared Mahalanobis distance is ...
0
votes
1
answer
376
views
Discordance between various methods of multivariate outliers detection
Here is a small "toy example" dataset, with 15 individuals described by 6 variables (this is R language):
...
3
votes
2
answers
2k
views
Understanding the R stats mahalanobis() function's Output
An acquaintance recommended I use the Mahalanobis distance on my data instead of Euclidean, Manhattan, etc.
I tried using the mahalanobis() function in the R stats package on a data matrix with N ...
2
votes
0
answers
136
views
Determine outliers for robust Mahalanobis distance
I want to apply a robust Mahalanobis distance and found an implementation in scikit,
but there the number of outliers is already given in advance. For me, who wants to find out the number of outliers, this ...
5
votes
2
answers
686
views
Something like Mahalanobis distance when the copula is not Gaussian
Mahalanobis distance accounts for different variances of the marginal variables and correlations between the marginal variables. However, there is an implicit (maybe explicit) assumption that ...