You can create a grouping variable based on the number of rows and number of elements to group. For your case, you want to group every 6 rows so your data frame should be divisible with 6. Using iris
to demonstrate (It has 150 rows, so 150 / 6 = 25)
rep(seq(nrow(iris)%/%6), each = 6)
#[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8 8 9 9 9 9 9 9 10 10 10 10
#[59] 10 10 11 11 11 11 11 11 12 12 12 12 12 12 13 13 13 13 13 13 14 14 14 14 14 14 15 15 15 15 15 15 16 16 16 16 16 16 17 17 17 17 17 17 18 18 18 18 18 18 19 19 19 19 19 19 20 20
#[117] 20 20 20 20 21 21 21 21 21 21 22 22 22 22 22 22 23 23 23 23 23 23 24 24 24 24 24 24 25 25 25 25 25 25
There are plenty of ways to handle how you want to call it. Here is a custom function that allows you to do that (i.e. create the grouping variable),
f1 <- function(x, df) {
v1 <- as.numeric(gsub('[0-9]{4}M(.*):[0-9]{4}M(.*)$', '\\1', x))
v2 <- as.numeric(gsub('[0-9]{4}M(.*):[0-9]{4}M(.*)$', '\\2', x))
i1 <- (v2 - v1) + 1
return(rep(seq(nrow(df)%/%i1), each = i1))
}
f1("2017M01:2017M06", iris)
#[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8 8 9 9 9 9 9 9 10 10 10 10
#[59] 10 10 11 11 11 11 11 11 12 12 12 12 12 12 13 13 13 13 13 13 14 14 14 14 14 14 15 15 15 15 15 15 16 16 16 16 16 16 17 17 17 17 17 17 18 18 18 18 18 18 19 19 19 19 19 19 20 20
#[117] 20 20 20 20 21 21 21 21 21 21 22 22 22 22 22 22 23 23 23 23 23 23 24 24 24 24 24 24 25 25 25 25 25 25
EDIT: We can easily make the function compatible with 'non-0-remainder' divisions by concatenating the final result with a repetition of the max+1
value of the final result of remainder times, i.e.
f1 <- function(x, df) {
v1 <- as.numeric(gsub('[0-9]{4}M(.*):[0-9]{4}M(.*)$', '\\1', x))
v2 <- as.numeric(gsub('[0-9]{4}M(.*):[0-9]{4}M(.*)$', '\\2', x))
i1 <- (v2 - v1) + 1
final_v <- rep(seq(nrow(df) %/% i1), each = i1)
if (nrow(df) %% i1 == 0) {
return(final_v)
} else {
remainder = nrow(df) %% i1
final_v1 <- c(final_v, rep((max(final_v) + 1), remainder))
return(final_v1)
}
}
So for a data frame with 20 rows, doing groups of 6, the above function will yield the result:
f1("2017M01:2017M06", df)
#[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4
rep(seq(nrow(df)%/%6), each = 6)
group_by
andsummarise
fromdplyr
package