
I would like to know whether there is a statistical justification for grouping data in the following way:

  1. find the maximum ($\max$), minimum ($\min$), mean ($m$) and standard deviation ($sd$) of a sample;

  2. group the data into the following 4 subintervals: $(\min, m-sd)$, $(m-sd, m)$, $(m, m+sd)$, $(m+sd, \max)$.
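To make the procedure concrete, here is a minimal sketch in Python. The function name `group_by_mean_sd`, the use of the sample standard deviation, the rule that a value exactly on a cut point goes to the upper bin, and the `scores` values are all assumptions made for illustration; they are not my real index values.

```python
import statistics
from bisect import bisect_right

def group_by_mean_sd(values):
    """Assign each value to one of 4 bins cut at m - sd, m and m + sd."""
    m = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(values)
    cuts = [m - sd, m, m + sd]
    # bisect_right counts the cut points <= value, giving a label 0..3.
    # Assumption: a value exactly on a cut point falls in the upper bin.
    return [bisect_right(cuts, v) for v in values]

# Made-up illustrative scores, not the real performance index.
scores = [2.0, 4.0, 4.0, 5.0, 5.0, 6.0, 6.0, 8.0]
labels = group_by_mean_sd(scores)
print(labels)
```

Label 0 corresponds to $(\min, m-sd)$ and label 3 to $(m+sd, \max)$.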

There are no outliers in my dataset, and I assume that the data are normally distributed. The values in my dataset are a performance index computed for 30 countries. This index is based on economic, social and institutional aspects (GDP, employment rate, industry, services, national debt, government size, etc.).

I know that approximately 68% of normally distributed data falls within $(m-sd, m+sd)$. I also know that data can instead be grouped by quartiles, which likewise gives 4 subintervals: $(\min, Q_1)$, $(Q_1, Q_2)$, $(Q_2, Q_3)$, $(Q_3, \max)$, where $Q_1, Q_2, Q_3$ are the quartiles. So, is there any particular, statistically justifiable advantage to grouping data by the first method rather than by quartiles?
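For reference, the following sketch computes the share of observations each mean/sd group would hold if the data were exactly normal. It is only an approximation for my case: it treats the outer groups as unbounded tails rather than stopping at $\min$ and $\max$, and it uses the population mean and standard deviation in place of the sample estimates. Quartile groups, by construction, hold about 25% each.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability mass of each group under exact normality,
# with cut points at -1, 0 and +1 standard deviations from the mean.
shares = [
    z.cdf(-1),             # below m - sd
    z.cdf(0) - z.cdf(-1),  # between m - sd and m
    z.cdf(1) - z.cdf(0),   # between m and m + sd
    1 - z.cdf(1),          # above m + sd
]

n = 30  # number of countries in the dataset
expected_counts = [round(n * p, 1) for p in shares]

print([round(p, 3) for p in shares])  # [0.159, 0.341, 0.341, 0.159]
print(expected_counts)                # [4.8, 10.2, 10.2, 4.8]
```

So with 30 countries the first method would be expected to put roughly 5, 10, 10 and 5 countries in the four groups, whereas quartiles would put 7 or 8 in each.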

Statistical explanation for a way to group data
