Questions tagged [descriptive-statistics]

Descriptive statistics summarize features of a sample, such as the mean and standard deviation, the median and quartiles, and the maximum and minimum. With multiple variables, they may include correlations and crosstabs. They can also include visual displays such as boxplots, histograms, and scatterplots.

1 vote
1 answer
39 views

Data categorization

I have categorized my education dataset for the analysis below. However, one respondent attended a missionary school whose level I do not know, and I am unsure where to ...
Amelia Nicodemus's user avatar
0 votes
0 answers
35 views

Employment status categories that include pensioners, learners, students and non schooling

I have collected a dataset on employment status. I created the following categories: Pensioners, Formally Employed, Informally Employed, Self-employed, and Unemployed. I also have Learners or Students ...
Amelia Nicodemus's user avatar
0 votes
0 answers
16 views

Estimating SD[Z = X*Y]: which option is correct?

We have a sample of two variables X and Y with sample size n. I want to calculate the standard deviation of Z = X*Y, but I don't know which of the two options below is correct. Option 1: Simply ...
PTQuoc's user avatar
  • 193
0 votes
0 answers
30 views

How to analyze differences in growth data with non-normal distributions?

I have data on fungal growth (growth rate and latency period) from different strains. I want to examine whether these values differ based on origin (2 types), growth medium (2 types), and incubation ...
450003's user avatar
  • 1
0 votes
1 answer
15 views

How can I filter outliers in data that is manually recorded?

Different people have to write down values of a certain parameter in order to fill out a table, and people inevitably make mistakes, sometimes by a factor of 1000. This creates a lot of ...
Huragok's user avatar
6 votes
3 answers
536 views

Median Absolute Deviation of Zero

I am using MAD as a measure of the spread of different distributions of numeric data, and some of these distributions have MAD of 0. I am curious, how is this possible? If I understand correctly, MAD ...
imky's user avatar
  • 85
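
A note on the question above: a MAD of 0 arises whenever more than half of the observations equal the median, because the median of the absolute deviations is then 0. A minimal sketch using only Python's standard library; the helper name `mad` and both data sets are made up for illustration and are not from the question:

```python
from statistics import median

def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    centre = median(values)
    return median([abs(v - centre) for v in values])

# Hypothetical data: five of the seven values equal the median (2),
# so more than half of the absolute deviations are exactly 0.
tied = [1, 2, 2, 2, 2, 2, 9]

# Hypothetical data without heavy ties, for contrast.
spread_out = [1, 2, 3, 4, 5]

print(mad(tied))        # MAD is 0 despite the outlying 9
print(mad(spread_out))  # MAD is positive
```

If heavy ties are expected, a spread measure that does not collapse under ties (for example the interquartile range or the mean absolute deviation) may be a more informative companion to the MAD.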
2 votes
4 answers
337 views

What descriptive statistics to report for a paired sample when using a nonparametric test?

I have two variables, A and B, (paired sample, before-after) and I need to conduct a paired sample nonparametric test because the distribution of the difference between A and B is not normal (...
marco's user avatar
  • 43
0 votes
0 answers
17 views

How can I devise a new metric to evaluate monthly loss prevention performance based on historical data?

I'm reporting on loss numbers for my company every month, and I’m looking to create a metric that can clearly indicate whether we’re doing better or worse in preventing losses. The goal is to reflect ...
Eric Saboori's user avatar
8 votes
1 answer
413 views

The Info measure of Hmisc::describe

The documentation for Hmisc::describe (page 77 of the PDF) says: For numeric variables, describe adds an item called Info which is a relative information measure using the relative efficiency of a ...
robertspierre's user avatar
1 vote
1 answer
56 views

How to assess the absolute strength of a correlation between two multifactorial variables when some of their factors are negatively correlated

I have two multifactorial constructs, one of which consists of 5 subfactors, and the other 4. I am looking to evaluate the 'absolute' strength of the association between the two constructs. The issue ...
bjovnyryk's user avatar
1 vote
1 answer
26 views

Is there a way to compute index values (base 100) from the year-over-year % change (YoY) of a variable?

Let's assume I have a time series like this:

Time period   YoY Change (%)
Y2024_Q1      7.00
Y2024_Q2      4.85
Y2024_Q3      5.77
Y2024_Q4      5.66
Y2025_Q1      6.54
Y2025_Q2      6.48
Y2025_Q3      6.36
Y2025 ...
Johannes Konrad's user avatar
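
A note on the question above: a YoY change links each quarter only to the same quarter one year earlier, so an index can be chained within each quarter but the four base-year quarters have to be fixed by assumption. The sketch below assumes, purely for illustration, that every quarter of 2023 equals 100; that choice is not in the question and it discards any seasonal pattern inside the base year. Only the values visible in the excerpt are used.

```python
# YoY changes (%) as shown in the excerpt; the truncated remainder is omitted.
yoy = {
    (2024, 1): 7.00, (2024, 2): 4.85, (2024, 3): 5.77, (2024, 4): 5.66,
    (2025, 1): 6.54, (2025, 2): 6.48, (2025, 3): 6.36,
}

# ASSUMPTION (not from the question): each quarter of base year 2023 is 100.
index = {(2023, quarter): 100.0 for quarter in range(1, 5)}

# Chain each quarter onto the same quarter of the previous year.
# Sorting guarantees 2024 is filled in before 2025 needs it.
for (year, quarter), pct in sorted(yoy.items()):
    index[(year, quarter)] = index[(year - 1, quarter)] * (1 + pct / 100)

for key in sorted(index):
    print(key, round(index[key], 2))
```

With real level data for any one year, those levels would replace the flat 100 base and the rest of the chain would stay the same.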
2 votes
1 answer
66 views

Is anomaly forecasting in time series analysis possible?

I am currently working on univariate time series data and wanted to know if anomaly forecasting is possible in time series. I previously worked on anomaly detection which detects the anomaly when ...
Rayapudi Gautam Kumar's user avatar
0 votes
0 answers
8 views

Comparing retrospective and prospective cohorts

I am a bit confused on comparing retrospective and prospective cohorts … Say I have two cohorts, one retrospective (used as the control) and one prospective cohort (with a program/intervention applied)...
LimitDNE's user avatar
1 vote
0 answers
14 views

How to characterize a dataset

I know we can compute a correlation matrix for continuous variables in a dataset. However, to summarize the degree of correlations between variables, is it possible to do a mean or something else ...
Phoebe's user avatar
  • 153
3 votes
1 answer
273 views

When trying to find the quartiles for discrete data, do we round to the nearest whole number?

I have 2 cases: First, I want to find the first quartile of the data set: 1,2,3,4,5. Normally we calculate the quartile as: Now $Q_1 = \frac{2+1}{2} = 1.5$. But they never state if it's continuous (i.e....
Reuben's user avatar
  • 131
5 votes
3 answers
218 views

Triangular correlations?

As used in my answer at https://stats.stackexchange.com/a/652022/11887, triangular correlation seems to be a useful concept/terminology that could see more use. But searching I cannot see much use, ...
kjetil b halvorsen's user avatar
0 votes
1 answer
37 views

How to adjust a variable by age, sex and BMI

I'm working on a database with almost 300 patients, of which I have their age, sex, BMI and their level of diabetes (which, for the purpose of the study, is stratified into mild, intermediate or ...
Alex Horrillo's user avatar
2 votes
1 answer
59 views

Looking for statistical function to represent data

Say I have a molecule that contains 9 atoms. For each atom, I have calculated the properties as prop_1, prop_2... prop_5 using a certain methodology. To check if the methodology was stable, I ran the ...
Pro's user avatar
  • 123
0 votes
1 answer
34 views

Can I do weighting on continuous variables?

I have two datasets collected in 2018 and 2023. I was going to check if there's any difference between the two datasets, but the ratio of sex and family size was different from each other (...
BEAU's user avatar
  • 1
3 votes
0 answers
20 views

How can you calculate Q1 and Q3 for even numbers? [duplicate]

I've searched this question everywhere, but I've found different answers and none of them gives the same answer NumPy gave me. I have the following data: [0, 1, 2, 3, 4, 4, 5, 5, 6, 8] When using ...
trder's user avatar
  • 700
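
A note on the question above: quartiles have several competing definitions, which is why different sources disagree on the same data. The sketch below uses only Python's standard library and the data from the excerpt. As far as I understand, the `"inclusive"` method interpolates at position (n - 1)p, the same rule as NumPy's default linear interpolation; treat that correspondence as an assumption and check it against the NumPy version in use.

```python
from statistics import quantiles

data = [0, 1, 2, 3, 4, 4, 5, 5, 6, 8]

# "inclusive": the sample minimum and maximum are treated as the
# 0th and 100th percentiles; interpolation position is (n - 1) * p.
q_inclusive = quantiles(data, n=4, method="inclusive")

# "exclusive" (the default): assumes the population may extend beyond
# the sample extremes; interpolation position is (n + 1) * p.
q_exclusive = quantiles(data, n=4, method="exclusive")

print(q_inclusive)  # [2.25, 4.0, 5.0]
print(q_exclusive)  # [1.75, 4.0, 5.25]
```

Both methods agree on the median (4.0) and differ only in Q1 and Q3, so a report should state which convention was used.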
1 vote
1 answer
54 views

Appropriate significance test for small sample size, unclear (non-normal) distribution

Given data:
- 30 datapoints in the form of usability scores (System Usability Scale)
- 2 groups within data (Group 1: 17, group 2: 13, independent, unequal sizes)
Objective: See if there is any ...
anjelomerte's user avatar
1 vote
1 answer
67 views

What analysis do I need for 2 IVs and 2 DVs?

I think I need a MANOVA, but I may need an ANCOVA and then also do an ANOVA within just the experimental group. I think it's a 2x2 factorial independent measures ANOVA within the intervention group. I ...
Worried Student's user avatar
1 vote
1 answer
21 views

Hausman Test Results Report

I've been working on a research project using a multi-level regression model, and I'm currently figuring out how to present the Hausman test results. I've seen some papers where authors mention doing ...
Sabrina's user avatar
  • 11
3 votes
3 answers
79 views

Is considering a specific distribution necessary before computing an average?

Many times, I compute averages of variables without considering distributions at all, and I use those computed averages to represent measures of variables in my data without mentioning specific ...
xabzakabecd's user avatar
  • 3,585
0 votes
0 answers
45 views

calculating percentage normalisation

Please don't laugh or close this post. I am confused as to how I can calculate the percentage normalization occurring post treatment. Here is the background of the problem: I have reading from ...
Angelo's user avatar
  • 4,555
3 votes
1 answer
76 views

Reference for Directional Statistics of Plane Orientation

I've got a project I'm working on where I've got the orientation (normal) vectors of planes. These vectors are all within a unit hemisphere where the $z$-coordinate is strictly positive. The ...
David G.'s user avatar
  • 169
2 votes
1 answer
30 views

Analyzing lists and variables of multiple answers

My current issue lies within EMR extracted data for medications. There are multiple variables named: Medication_1, Medication_2, Medication_3, etc... This data may overlap and analyzing each column ...
Abdallah Al-Ani's user avatar
4 votes
1 answer
68 views

How is this formula derived

I have carried out an experiment with several repeats and my program has returned the values of EC50 and its 95% confidence interval (CI), LogEC50 and the standard error of LogEC50. The logs are ...
dosh's user avatar
  • 41
0 votes
0 answers
23 views

Normalization of absorbance data by fresh weight of samples

A [protocol] says For each sample the absorbance (Abs) reading was divided by the fresh weight of the sample in grams. The results were normalized using an arbitrary value of 1 for the control ...
user414999's user avatar
0 votes
1 answer
27 views

Survival function for a certain population

If the survival function for a certain population is given by $$ s(x) = \left( \frac{1}{1+x} \right)^4 $$ for $x \ge 0$, how long would you expect a person who is $41$ years old to live? (a) $14$ ...
mattsteiner64's user avatar
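
A sketch of the standard calculation for the question above, assuming it asks for the expected remaining lifetime at age 41, i.e. the mean residual life. The notation $e(x)$ is introduced here and is not taken from the question:

$$
e(41) = \int_0^\infty \frac{s(41+t)}{s(41)}\,dt
      = 42^4 \int_0^\infty (42+t)^{-4}\,dt
      = 42^4 \cdot \frac{42^{-3}}{3}
      = \frac{42}{3}
      = 14 .
$$

Under that reading the expected remaining lifetime is 14 years, which matches option (a) in the excerpt. The same steps give $e(x) = (1+x)/3$ for any age $x \ge 0$.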
2 votes
0 answers
45 views

Standardization of summary statistic of group-linked values

Assume that you measure a summary statistic (e.g., arithmetic mean) in measurement windows of a fixed size along a long sequence of values and that these values are grouped into regions belonging to ...
Michael Gruenstaeudl's user avatar
0 votes
0 answers
23 views

OLS Model with Lags - logged coeff

I am building an OLS model using Python, where the dependent and independent variables are lagged. This is a form of econometric model where I want to figure out how much each independent variable ...
milo204's user avatar
  • 31
0 votes
0 answers
56 views

Comparing one value to mean or median of a group

I have very limited statistical knowledge or experience, so I am looking for some guidance on my approach for statistical analysis. I have a group of 60 values. I am looking to compare the last value ...
snalmznh's user avatar
3 votes
1 answer
61 views

Should we routinely conduct unsupervised learning when reporting descriptive statistics on data?

A standard approach prior to conducting a predictive or inferential analysis is to report some basic univariate descriptive statistics on the study variables: mean, median, minimum, maximum, variance, ...
RobertF's user avatar
  • 6,286
0 votes
1 answer
20 views

how to summarize moving average

This question is about how to summarize moving averages. Please assume the values in column Pct are the percentage of people who have a negative opinion about vaccines. Column MovgAvg is the two-year moving average ...
Ahir Bhairav Orai's user avatar
1 vote
1 answer
57 views

Which statistical model is suitable?

I have the results of a survey of $n=132$ patients with their socio-economic profile and their spending behavior on mobility-coins (my thesis topic). In the survey, we asked people how they would ...
yaseen's user avatar
  • 11
2 votes
2 answers
75 views

type I error in multiple comparisons summary statistics

I understand type I error can increase if we run 'multiple comparisons'. But does that refer to comparisons within a multi-category variable (or a group variable)? or multiple analyses using the same ...
aqen's user avatar
  • 191
1 vote
0 answers
99 views

Maximum Likelihood in High Dimensions [closed]

What are some examples of high-dimensional random variables for which MLEs are found using numerical methods because we are unable to solve the equations explicitly? The only example that comes ...
Nicolas Bourbaki's user avatar
0 votes
0 answers
11 views

Variability of patients in different Hospitals

Just to let you know, my statistical background is not very good. I made funnel graphs which show the effect of polypharmacy against practice size. In the funnel graphs, a dot represents a single ...
Usman YousafZai's user avatar
2 votes
1 answer
28 views

Should a Better User Engagement Model Keep Outperforming Old Models Across Time?

Context Let's say we are talking about a machine learning model that governs some user interaction (e.g. pricing model, recommendation model etc.) on an app. Let's say model v1 is champion (in ...
Della's user avatar
  • 553
3 votes
2 answers
355 views

Is the exponent of the MAD (Median Absolute Deviation) of log transformed Data measuring the relative distance from median in the untransformed data?

I want to confirm whether taking the Exponent of the MAD of Log Transformed Data gives me a measure of relative distance from median of the original untransformed data. So say I have a MAD of 0.2 for ...
Anon9001's user avatar
1 vote
0 answers
28 views

Degrees of freedom for estimation

In the context of estimators, why is it that in general dividing by the degrees of freedom (instead of the sample size) leads to unbiasedness? I see the value in substituting degrees of freedom for ...
secretrevaler's user avatar
1 vote
1 answer
61 views

Percentiles of a distribution of weighted summary statistics

Suppose I have a collection of different independent probability distributions, $\{ P_i(X)\}_{i=1}^N$, each with their own support $I_i$. I know that the $10^{th}$ percentile of a given distribution ...
David G.'s user avatar
  • 169
0 votes
0 answers
8 views

Looking for ideas to improve Weighted Moving Average result

I have two data sets: ...
Allan Xu's user avatar
  • 101
0 votes
0 answers
44 views

Comparing averages of two groups

Say I want to check if the averages of 2 groups (A and B) are the same. For each member in group A, there are some observations; same applies for group B. If I want to take the average of A and B, I ...
Andrei's user avatar
  • 1
3 votes
2 answers
75 views

Is it possible to calculate a standard deviation from the gini coefficient and mean?

I am looking to create an analysis showing how many people in a given country have more than X dollars in income. I know the average income, population count, and Gini coefficient of income ...
Andrew's user avatar
  • 31
13 votes
8 answers
2k views

Is descriptive statistics enough to compare test scores of students in a class?

I am reviewing the theory on hypothesis testing and the book I am reading ("Hypothesis Testing" by Jim Frost) stresses the fact that we do hypothesis testing and inferential statistic when ...
rusiano's user avatar
  • 566
1 vote
0 answers
21 views

How to Evaluate Interaction Effects in Propensity Score-Matched Samples:

Suppose I want to study the association between an exposure X and outcome Y, and I have used the propensity score to match each exposed subject with those unexposed but with similar characteristics ...
zjppdozen's user avatar
  • 347
1 vote
1 answer
47 views

How many samples should I test to be 95% sure that no error exists?

I have a population of a million products and will tolerate no errors in them. How many samples should I test to be 95% sure that no error exists? I am new to statistics. I know you might need some ...
Polime's user avatar
  • 11
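
A note on the question above: short of testing every item, a sample with zero failures cannot show that no error exists; it can only bound the defect rate. The sketch below assumes independent draws from a large population and a tolerated defect rate `max_defect_rate`, a hypothetical parameter that is not in the question. It returns the smallest n for which (1 - p)^n is at most 1 - confidence, so that an all-clean sample of that size would be unlikely if the true rate were p or higher.

```python
import math

def zero_failure_sample_size(max_defect_rate, confidence=0.95):
    """Smallest n with (1 - p)**n <= 1 - confidence, for p = max_defect_rate."""
    if not 0 < max_defect_rate < 1:
        raise ValueError("max_defect_rate must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_defect_rate))

# Hypothetical tolerance: rule out a defect rate of 1% or more.
n = zero_failure_sample_size(0.01)
print(n)  # 299
```

The required n grows roughly like 3 / p as the tolerated rate shrinks (the "rule of three"), so demanding p close to 0 in a population of a million approaches testing the whole population.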
0 votes
0 answers
89 views

Difference between Cross-Sectional and Nested Case-Control Samples in Cohort Studies

I have a question regarding the cross-sectional sample from the most recent visit and the nested case-control sample extracted from a cohort study, especially when the exposure of interest was ...
zjppdozen's user avatar
  • 347
