Calculating the mean for every n values of a vector

Question

Let's say I have a vector

a <- rnorm(6000)

I want to calculate the mean of the 1st through 60th values, then the mean of the 61st through 120th values, and so forth. In other words, I want the mean of each consecutive block of 60 values, giving me 100 means from that vector. I know I can do this with a for loop, but is there a better way?

apply function is mostly good for data frames/matrices. I am asking about a vector here. — arezaie, Commented Apr 26, 2017 at 13:42

Zheyuan Li · Accepted Answer · 2018-09-23 17:09:21Z

I would use

colMeans(matrix(a, 60))
.colMeans(a, 60, length(a) / 60)  # more efficient (without reshaping to matrix)
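
This works because `matrix(a, 60)` fills the matrix column by column, so each column holds one consecutive block of 60 values. A minimal sketch that checks the result against an explicit loop (the names `block_means` and `loop_means` are illustrative and not part of the original answer):

```r
set.seed(1)
a <- rnorm(6000)

# matrix() fills column-wise: column j holds a[(60 * (j - 1) + 1):(60 * j)]
block_means <- colMeans(matrix(a, nrow = 60))

# Reference result from an explicit loop over the 100 blocks
loop_means <- numeric(100)
for (j in seq_len(100)) {
  loop_means[j] <- mean(a[(60 * (j - 1) + 1):(60 * j)])
}

# Compare the two results up to floating-point tolerance
all.equal(block_means, loop_means)
```

The comparison uses `all.equal` rather than `identical` because the two approaches may differ in the last few bits of floating-point precision.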

Enhancement at user adunaic's request

"This only works if there are 60x100 data points. If you have an incomplete 60 at the end then this errors. It would be good to have a general solution for others looking at this problem for ideas."

BinMean <- function (vec, every, na.rm = FALSE) {
  n <- length(vec)
  x <- .colMeans(vec, every, n %/% every, na.rm)
  r <- n %% every
  if (r) x <- c(x, mean.default(vec[(n - r + 1):n], na.rm = na.rm))
  x
}

a <- 1:103
BinMean(a, every = 10)
# [1]   5.5  15.5  25.5  35.5  45.5  55.5  65.5  75.5  85.5  95.5 102.0
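
The tail handling in `BinMean` rests on integer division and remainder. A small sketch of that arithmetic for the example above (103 values, bins of 10); the variable names `full_bins` and `tail_idx` are illustrative only:

```r
n <- 103L
every <- 10L

full_bins <- n %/% every   # number of complete bins passed to .colMeans
r <- n %% every            # number of leftover values after the last complete bin
tail_idx <- (n - r + 1):n  # indices of the leftover values

full_bins       # 10
r               # 3
tail_idx        # 101 102 103
mean(tail_idx)  # 102, the last output element (values equal indices for a <- 1:103)
```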

Alternative solution with group-by operation (less efficient)

BinMean2 <- function (vec, every, na.rm = FALSE) {
  grp <- as.integer(ceiling(seq_along(vec) / every))
  grp <- structure(grp, class = "factor",
                   levels = as.character(seq_len(grp[length(grp)])) )
  lst <- .Internal(split(vec, grp))
  unlist(lapply(lst, mean.default, na.rm = na.rm), use.names = FALSE)
}

Speed

library(microbenchmark)
a <- runif(1e+4)
microbenchmark(BinMean(a, 100), BinMean2(a, 100))
#Unit: microseconds
#             expr      min        lq       mean    median        uq       max
#  BinMean(a, 100)   40.400   42.1095   54.21286   48.3915   57.6555   205.702
# BinMean2(a, 100) 1216.823 1335.7920 1758.90267 1434.9090 1563.1535 21467.542

Jeff Bezos · Accepted Answer · 2022-02-19 00:37:12Z

I recommend sapply:

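a <- rnorm(6000)
# Start index of each block of 60; named `starts` to avoid masking base::seq
starts <- seq(1, length(a), by = 60)
# Cap the end index so a shorter final block does not run past the vector
a_mean <- sapply(starts, function(i) mean(a[i:min(i + 59, length(a))]))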

edited Feb 19, 2022 at 0:37

answered Jul 23, 2019 at 15:46

Ronak Shah · Accepted Answer · 2019-04-22 02:01:32Z

Another option is to use tapply with a grouping variable.

The grouping variable can be created in two ways:

1) Using rep

tapply(a, rep(seq_along(a), each = n, length.out = length(a)), mean)

2) Using gl

tapply(a, gl(length(a)/n, n), mean)
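
To see what the grouping variable looks like, here is a small sketch on toy vectors (the names `x`, `k`, `grp_rep`, and `grp_gl` are illustrative, not from the answer). Note that the `gl` form as written assumes `length(a)` is a multiple of `n`, while the `rep` form with `length.out` also covers a shorter final group:

```r
x <- 1:7   # toy vector whose length is not a multiple of k
k <- 3     # block size

# rep() with each = k repeats every index k times; length.out trims to length(x)
grp_rep <- rep(seq_along(x), each = k, length.out = length(x))
grp_rep    # 1 1 1 2 2 2 3

# Group means: blocks (1,2,3), (4,5,6) and the leftover (7)
means_rep <- tapply(x, grp_rep, mean)
means_rep  # 2 5 7 (named "1", "2", "3")

# gl() on an exact multiple: 2 groups of 3 for a vector of length 6
grp_gl <- gl(2, 3)
means_gl <- tapply(1:6, grp_gl, mean)
means_gl   # 2 5
```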

If we convert the vector to a data frame or tibble, we can use the same logic to calculate the mean:

aggregate(a~gl(length(a)/n, n), data.frame(a), mean)

Or with dplyr:

library(dplyr)

tibble::tibble(a) %>%
          group_by(group = gl(length(a)/n, n)) %>%
          summarise(mean_val = mean(a))

data

set.seed(1234)
a <- rnorm(6000)
n <- 60
