Questions tagged [terminology]
Usage and meaning of specific technical words/concepts in statistics.
1,706 questions
1
vote
1
answer
30
views
Is there a commonly used name for the units of entropy measured with base three?
Is there a commonly used name for the units of entropy measured with base three (e.g., for states $-1, 0, 1$):
$$
H=-\sum_i p_i\log_3 p_i,
$$
similar to bits and nats for bases $2$ and $e$?
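As a rough illustration of how the choice of base only rescales the entropy, here is a minimal sketch that evaluates the formula above in bases $2$, $e$, and $3$. The probability vector `p` and the helper name `entropy` are hypothetical and chosen only for the example.

```python
import math

def entropy(probs, base):
    """Shannon entropy of a discrete distribution, using logarithms in `base`."""
    # Terms with p == 0 contribute nothing (the limit of p*log(p) is 0).
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical distribution over the three states -1, 0, 1.
p = [0.5, 0.25, 0.25]

h_base2 = entropy(p, 2)       # entropy in bits
h_base_e = entropy(p, math.e) # entropy in nats
h_base3 = entropy(p, 3)       # entropy in base-3 units

# Changing the base rescales the value by a constant factor:
# H_3 = H_2 / log2(3) = H_e / ln(3).
print(h_base2, h_base_e, h_base3)
```

A uniform distribution over three states has entropy exactly 1 in base three, just as a fair coin has entropy of 1 bit.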
1
vote
0
answers
37
views
Inverse probability weighting vs importance sampling
What is the difference between inverse probability weighting and importance sampling?
From their Wikipedia pages:
Inverse probability weighting is a statistical technique for estimating quantities ...
1
vote
1
answer
48
views
The interpretation of the term "uncertainty" in statistics vs. information theory vs. machine learning
I have an ensemble model consisting of multiple classifiers and I wish to quantify the uncertainty of the predictions the ensemble model makes. From an information theory / machine learning ...
8
votes
0
answers
94
views
Name for Generalized Generalized Linear Models
Consider the class of models given by $y\sim F(g^{-1}(\beta^\top\mathbf{x}))$ with $\mathbb{E}[Y]=g^{-1}(\beta^\top\mathbf{x})$.
Most authors I've come across call this a GLM only if F is in the ...
3
votes
1
answer
45
views
Stochastic and deterministic trends in "Forecasting: Principles and Practice"
Section 9.4 of Forecasting: Principles and Practice (2nd edition) discusses stochastic and deterministic trends in dynamic regression models (regression with ARIMA errors). The authors write:
There ...
0
votes
0
answers
23
views
Term for sampling in groups of a given size
Imagine that you want to inspect members of a population to see if they have a target feature. For example, test people to see if they have a certain disease.
One possible approach is to take a sample ...
0
votes
0
answers
59
views
Terminology for types of errors and uncertainty: intrinsic? fitting?
Say I have a sequence of given variables $x_i$, $i=1,\ldots,n-1$ and the response $y_i$ and we explore the model $y_i\sim \text{Normal}(\alpha x_i,\sigma^2)$ with $y_i$ independent of $y_j$ for $i\neq ...
3
votes
0
answers
36
views
"Centroid" induced by metric
A metric $d(x,y)$ on a vector space $V$ can be used to induce the definition of a centroid of a point set $\{x_1,\ldots,x_n\}$ with respect to this metric as a (not necessarily unique) point $c$ that ...
6
votes
1
answer
506
views
What does "regularity condition" mean
While reading a blog post about score matching, I came across the term 'regularity condition'. When I searched for it on Google, I found an answer:
The regularity condition refers to a restriction ...
1
vote
1
answer
37
views
What is bias and bias per group for simple linear regression and mixed models respectively?
I found this nice diagram on the Wikipedia mixed model page showing the difference between a simple linear regression and a linear mixed model, however I don't understand what they mean by bias and ...
0
votes
1
answer
25
views
I am not sure about the use of 'polytomous' in this context in a scientific paper. Can someone help me find the proper word for this context?
This is the text:
As shown in Table 2 and Figure 3, our subgroup analysis of studies midpoint was not statistically significant (p=0.92). This finding was further supported with univariate and ...
4
votes
1
answer
66
views
What should I call this design?
Can someone help with this question?
A study has 3 fixed effects A, B, and C and one random factor ID. Factor C (two levels) is determined by each ID's inherent characteristic (so basically two groups)...
5
votes
3
answers
217
views
Triangular correlations?
As used in my answer at https://stats.stackexchange.com/a/652022/11887, triangular correlation seems to be a useful concept/terminology that could see more use. But searching I cannot see much use, ...
3
votes
3
answers
56
views
The ability to engage in an unhealthy behavior (e.g. smoking) late in life may indicate strong overall health. Is there a name for this "bias"?
Is there a name for this phenomenon in epidemiology? I'd like to read about examples and approaches to identify and account for it.
The scenario:
Imagine you have an elderly cohort.
Most of these ...
8
votes
2
answers
599
views
Is there a name for the likelihood of the most likely outcome?
In pop statistics, it's common to state what the most likely outcome is as if that is a very meaningful attribute.
But without knowing how likely the most likely outcome is, it isn't really so highly ...
5
votes
3
answers
926
views
Rule of Thumb meaning in statistics
I was wondering what the term "rule of thumb" actually means in statistics. Why did they select this name, for example, for sample size calculation? Is it like an approximation based on ...
0
votes
0
answers
32
views
What do people in machine learning typically mean when they talk about something being ill conditioned?
Right now I am reading a paper which deals with some optimization methods for machine learning and the author explains how some of the methods "deal with ill conditioning".
I know the term ...
0
votes
0
answers
12
views
Term for non-structural-zero part of a zero-inflated model
One way to describe a zero-inflated random variable $Y$ is $Z \times Y'$ where $Y'$ is some discrete random variable and $Z \sim \text{Bernoulli}(\psi)$, for some $\psi \in [0,1].$ My question is: ...
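To make the construction $Y = Z \times Y'$ concrete, here is a minimal sketch of the resulting probability mass function. It assumes, purely for illustration, that $Y'$ is Poisson with rate `lam` and that $Z$ and $Y'$ are independent; neither assumption comes from the question, and the function name `zip_pmf` is hypothetical.

```python
import math

def zip_pmf(k, psi, lam):
    """P(Y = k) for Y = Z * Y', with Z ~ Bernoulli(psi) independent of Y' ~ Poisson(lam)."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # Y is zero either because Z = 0 (a structural zero)
        # or because Z = 1 and Y' itself happened to be zero.
        return (1 - psi) + psi * poisson
    # A nonzero value requires Z = 1.
    return psi * poisson

# Assumed parameter values for the example.
psi, lam = 0.7, 2.0
p_zero = zip_pmf(0, psi, lam)
total = sum(zip_pmf(k, psi, lam) for k in range(60))
print(p_zero, total)
```

The `k == 0` branch shows the two sources of zeros that the terminology has to distinguish.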
1
vote
1
answer
43
views
Outcome vs. elementary event, which of them is assigned a probability?
A previous question here asked about the difference between "outcomes" and "elementary events". The answer was that
An event is some subset of the sample space. One possible event ...
3
votes
0
answers
64
views
What are the "tricks" in machine learning? [closed]
I have come across a few different "tricks" in machine learning methodology, which I list below along with my rudimentary understanding of each.
The Kernel Trick:
This is used in Support Vector ...
1
vote
0
answers
77
views
What do people call such a chart with a strip of activity types per time?
What do people call a chart like that below which has days on Y axis and time of the day on X axis while color represents a kind of activity (data gathering, processing, etc.)?
It is very close to ...
9
votes
3
answers
800
views
What do people call a chart with a strip of peak values in time intervals?
What do people call a chart like the one below which has days on Y axis and time of the day on X axis while color represents the level of some value (for example, loading, usage count, etc.)?
The chart ...
2
votes
0
answers
80
views
When authors write "model of the data" is this a shorthand for "model of the data generating process" [duplicate]
Some authors, such as Barber in Bayesian Reasoning and Machine Learning and Rasmussen & Williams in Gaussian Processes for Machine Learning, write phrases such as "model of the data" and ...
6
votes
4
answers
1k
views
Is it incorrect terminology to say "confidence interval of a random variable"?
I have seen claims that "population parameter is not a random variable" when discussing confidence intervals.
e.g., here
Be sure to note that the population parameter is not a random variable.
...
8
votes
2
answers
623
views
What is it called when two variables causally affect one another?
Suppose two variables X1 and X2 are correlated and we know that X1 causes X2 and X2 causes X1.
For example, leg strength and an interest in cycling might be correlated. And (suppose) we know ...
6
votes
1
answer
191
views
Terminology clarification about sample moments
According to MathWorld (link): "The sample raw moments are unbiased estimators of the population raw moments".
While in Wikipedia (link) it is said:
...the $k$-th raw moment of a population ...
9
votes
6
answers
313
views
Why was the term "significance" ($\alpha$) chosen for the probability of Type I error?
I'm currently studying "Statistics 1" as part of my Computer Science degree, and I'm having trouble understanding the concept of "significance."
We were provided with the following ...
1
vote
1
answer
84
views
What is the name/terminology for this application of OLS regression
I don't come from a statistics background and was instructed to follow these steps to fill in missing data. I'm wondering if there is a name for this specific method so that I can learn more about it and ...
10
votes
4
answers
3k
views
What kind of chart is this and how to read it?
I came across this chart that is both weird and intriguing. It is about some literary works produced in the regions mentioned. The x axis is the time and y axis the percentage.
The preceding para has ...
0
votes
0
answers
14
views
Given variable A and B containing data of lemma sentiments, what is the correct term for the variable containing average of var A and var B?
I have a data visualization, showing the sentiment of two lemmas "гей" (var a) and "трансгендер" (var b) in a news corpus throughout the year.
Here is the dataframe sample of my ...
2
votes
3
answers
104
views
What is a null hypothesis in simple terms? [duplicate]
I am new here. I have read a lot about what a null hypothesis is but still do not have a clear picture. Could someone give me a simple explanation or example, please?
2
votes
0
answers
31
views
What is a limited estimator?
I'm reading this paper where, on pp 3 of the pdf, the authors write (emphasis added):
"As the fitted two-component model is a limited estimator to classify bound/unbound sites, we additionally ...
10
votes
1
answer
968
views
Deviation between Mean and Median
If I have a mean of 15 and median of 10, is there a term for the difference between these?
Can you use this difference of 5 for anything valuable statistically?
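As a small worked example of the quantities in this question, the sketch below builds a hypothetical data set whose mean is 15 and median is 10, then scales the difference by the standard deviation. That scaled quantity is sometimes called the nonparametric skew; it is shown here only as one possible use of the gap, not as a definitive answer, and the data values are invented for illustration.

```python
import statistics

# Hypothetical sample chosen so that mean = 15 and median = 10.
data = [5, 10, 30]

mean = statistics.mean(data)
median = statistics.median(data)
gap = mean - median  # the raw difference asked about

# Dividing by the (population) standard deviation gives a unit-free measure
# of asymmetry; it is positive when the mean lies above the median.
sd = statistics.pstdev(data)
scaled_gap = gap / sd

print(mean, median, gap, scaled_gap)
```

Because the absolute difference between mean and median never exceeds one standard deviation, the scaled value always lies between -1 and 1.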
2
votes
1
answer
135
views
What's the difference between Mediators, Co-Variates, Moderators and Confounders terms?
I was wondering if someone could shed some light on the difference between the above-mentioned terms since I see them used frequently in many research publications I've read.
If I have an outcome ...
0
votes
1
answer
106
views
Background reason for the terms ‘Isotropic’ and ‘Anisotropic’ in the context of GNN Message Passing
I’m reading a paper on Graph Neural Networks (GNNs) that uses the terms ‘isotropic’ and ‘anisotropic’ in the context of message passing. I understand that these terms originate from physics, chemistry,...
2
votes
1
answer
238
views
Error rate vs Empirical risk - What's the difference between practical and theoretical terms for performance of neural networks?
Motivation
I am currently reading the following book: Understanding machine learning by Shalev-Shwartz and Ben-David. The book uses statistics terminology in its machine learning theory, and it is not ...
0
votes
0
answers
23
views
What is this method name for comparing two financial time series by their difference (subtraction)?
I am trying to find a reference/name for what I am doing for explaining it in an academic work.
My scenario is that: I have financial time-series A and B. I want to answer if A outperforms B or vice-...
14
votes
5
answers
2k
views
What is the statistical term for an exact value that occurs in an otherwise continuous distribution?
For some continuous quantities (e.g. daily rainfall at a certain location), there is one exact value that occurs often (in the case of daily rainfall that's the value of zero: there are days on which ...
0
votes
0
answers
43
views
Terminology: multivariable when multiple levels of categorical variable?
Oftentimes, one sees people use terms such as univariate and multivariate logistic regression, where they clearly refer to number of predictors rather than number of response variables. I know it ...
1
vote
1
answer
47
views
Is a database either a population or sample population? [closed]
I am writing a paper, and I hit a singular obstacle. What are databases in statistics nomenclature? What do we call databases in statistics vernacular? Populations or sample population or databases or ...
5
votes
1
answer
52
views
Bias-variance trade-off for a specific fitted model vs. a class of models: terminology
Consider a data generating process
$$Y=f(X)+\varepsilon$$
where $\varepsilon$ is independent of $X$ with $\mathbb E(\varepsilon)=0$ and $\text{Var}(\varepsilon)=\sigma^2_\varepsilon$. According to ...
8
votes
4
answers
2k
views
Sample notation: When to use capital $N$ vs lowercase $n$?
In statistics and psychological research, what is denoted by capital $N$ vs lowercase $n$?
I work in psychological research and I've seen them used in two ways:
Capital $N$ represents the entirety of ...
0
votes
0
answers
17
views
Mean-parameterizable models that have invariant concentration functions, but that aren't translation-invariant?
Definitions: Sorry for the ad hoc terminology -- comments or answers that provide pointers to standard terminology would be much appreciated. For simplicity I'd like to restrict discussion to real-...
6
votes
1
answer
358
views
Hierarchy principle: who defined it first?
Different questions here deal with the problem of whether to include main effects in interaction models, for example here, here and here (for the opposite problem, omitting interaction coefficients ...
4
votes
0
answers
24
views
Apriori Algorithm confusion: difference between frequency and support
What is the correct way to calculate support?
I have seen two different ways and I'm confused as a result. One way (say first approach) is explained https://en.wikipedia.org/wiki/...
9
votes
2
answers
231
views
Origin of the term "spherical" in relation to covariance matrices?
I understand that a covariance matrix with all diagonal elements equal, and all off-diagonal elements also equal (but different to the diagonal elements) is called "spherical". I am curious ...
0
votes
0
answers
63
views
Residuals and "error terms" in time series
I'm self-studying and I see that “residuals” seem to be what is left after we take away the non-random components. So if we have an additive decomposition:
$$ Series = Constant + Trend\text{ }_t + Seasonality\...
3
votes
0
answers
60
views
What is a measure of hardness-of-approximation by samples?
Suppose there is a large vector $\mathbf{x}$ of real numbers, and I want to estimate a certain aggregate function $f(\mathbf{x})$ by taking a small sample of the population $\mathbf{x}$. I would like ...
6
votes
1
answer
149
views
Why are complete statistics named "complete"?
I get why sufficient statistics are named "sufficient", but what about "complete" statistics?
I have this definition from F.J. Samaniego, Stochastic Modelling and Statistical ...
2
votes
2
answers
183
views
Terminology: what is the name for sets of aggregated values over periods of time?
In statistics/analytics, let's say we have a time series of data points, i.e., pairs of timestamps and values:
t: 0
y: 42
t: 1
y: 32
t: 2
y: 29
t: 7
y: 0
As for ...