I have the following data frame:

library(dplyr)

col1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
col2 <- c(1,2,3,44,1,1,2,3,44,44,1,2,44,1,44)
df <- data.frame(col1, col2)

I am trying to group by col1 entries, and find, for each grouping of col1, values of col2 that are equal to 44 and followed immediately by a smaller entry (<44), and FLAG such entries in a new column.

However, this code doesn't seem to work:

df %>% group_by(col1)  %>% mutate(FLAG=(col2==44 & lead(col2,1)<44))

    col1  col2  FLAG
   <dbl> <dbl> <lgl>
1      1     1 FALSE
2      1     2 FALSE
3      1     3 FALSE
4      1    44  TRUE
5      1     1 FALSE
6      2     1 FALSE
7      2     2 FALSE
8      2     3 FALSE
9      2    44 FALSE
10     2    44  TRUE
11     3     1 FALSE
12     3     2 FALSE
13     3    44  TRUE
14     3     1 FALSE
15     3    44    NA

Specifically, entry 10 should be FALSE, since it has no entry <44 in the same grouping directly following it. Any suggestions on how to write code that works more generally to do what I want?

  • I get NA in row 10 when I run your code (which is the expected behavior).
    – eipi10
    Commented Feb 8, 2017 at 18:31
  • I don't. Are you sure? I just double-checked
    – user85727
    Commented Feb 8, 2017 at 18:35
  • Not sure why my computer is giving different results.
    – user85727
    Commented Feb 8, 2017 at 18:44
  • I don't know why we're getting different results. I'm also wondering why you're getting NA in row 15 but not in row 10. What happens when you run your code in a clean session with just dplyr loaded?
    – eipi10
    Commented Feb 8, 2017 at 18:47
  • Other packages have lead and lag functions that behave differently than the dplyr versions. My guess is you had masked the dplyr versions with those from another package.
    Commented Feb 8, 2017 at 19:24
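
On the masking point raised in the comments: a quick way to check is to ask R where an unqualified name resolves, and to qualify the calls with dplyr:: so that masking cannot matter. The sketch below is only a minimal check, not a diagnosis of the asker's machine. Looking at mutate and group_by in addition to lead is my own assumption, added because the output shown in the question is exactly what the same expression returns when no grouping is applied.

```r
library(dplyr)

col1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
col2 <- c(1,2,3,44,1,1,2,3,44,44,1,2,44,1,44)
df <- data.frame(col1, col2)

# Where does each name resolve? For an unqualified call, the first entry wins.
for (fn in c("lead", "mutate", "group_by")) {
  cat(fn, ":", find(fn), "\n")
}

# Fully qualified calls are not affected by masking from other attached packages.
grouped <- df %>%
  dplyr::group_by(col1) %>%
  dplyr::mutate(FLAG = col2 == 44 & dplyr::lead(col2, 1) < 44) %>%
  dplyr::ungroup()

# The same expression with no grouping at all, for comparison.
ungrouped <- df %>%
  dplyr::mutate(FLAG = col2 == 44 & dplyr::lead(col2, 1) < 44)

which(is.na(grouped$FLAG))  # rows 10 and 15: last row of a group, col2 == 44
which(ungrouped$FLAG)       # rows 4, 10 and 13: the pattern shown in the question
```

If find() lists another package ahead of package:dplyr for any of these names, restarting R and attaching dplyr last, or keeping the dplyr:: prefix, would be the things to try first.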

2 Answers

Another way is to use the if_else function from the dplyr package, which lets you map missing results to FALSE:

df %>% group_by(col1)  %>% mutate(FLAG=if_else(col2==44 & lead(col2,1)<44,TRUE,FALSE,missing = FALSE))
# Source: local data frame [15 x 3]
# Groups: col1 [3]
# 
#     col1  col2  FLAG
#    <dbl> <dbl> <lgl>
# 1      1     1 FALSE
# 2      1     2 FALSE
# 3      1     3 FALSE
# 4      1    44  TRUE
# 5      1     1 FALSE
# 6      2     1 FALSE
# 7      2     2 FALSE
# 8      2     3 FALSE
# 9      2    44 FALSE
# 10     2    44 FALSE
# 11     3     1 FALSE
# 12     3     2 FALSE
# 13     3    44  TRUE
# 14     3     1 FALSE
# 15     3    44 FALSE
  • when I run this I still get entry 10 as TRUE. Not sure what is going on.
    – user85727
    Commented Feb 8, 2017 at 18:43

You can add the condition that lead(col2) must not be NA, so that the last row of each group evaluates to FALSE instead of NA:

df %>% 
  group_by(col1)  %>% 
  mutate(FLAG = (col2 == 44 & lead(col2, 1) < 44 & !is.na(lead(col2, 1))))

Source: local data frame [15 x 3]
Groups: col1 [3]

    col1  col2  FLAG
   <dbl> <dbl> <lgl>
1      1     1 FALSE
2      1     2 FALSE
3      1     3 FALSE
4      1    44  TRUE
5      1     1 FALSE
6      2     1 FALSE
7      2     2 FALSE
8      2     3 FALSE
9      2    44 FALSE
10     2    44 FALSE
11     3     1 FALSE
12     3     2 FALSE
13     3    44  TRUE
14     3     1 FALSE
15     3    44 FALSE
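
Why the extra condition matters: inside a group, lead() returns NA for the last row, and R's logical operators propagate NA unless the other operand already decides the result. The sketch below shows that rule and, as an alternative that is my own suggestion and not part of the answers above, uses the default argument of dplyr::lead so the comparison is never made against NA. Choosing Inf as the fill value is an assumption that works here because Inf < 44 is FALSE.

```r
library(dplyr)

# Three-valued logic in base R
TRUE & NA    # NA: the result depends on the unknown value
FALSE & NA   # FALSE: already decided, whatever the unknown value is
NA < 44      # NA

col1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
col2 <- c(1,2,3,44,1,1,2,3,44,44,1,2,44,1,44)
df <- data.frame(col1, col2)

# Pad the end of each group with Inf instead of NA, so `< 44` is FALSE there.
flagged <- df %>%
  group_by(col1) %>%
  mutate(FLAG = col2 == 44 & lead(col2, 1, default = Inf) < 44) %>%
  ungroup()

which(flagged$FLAG)   # rows 4 and 13
anyNA(flagged$FLAG)   # FALSE
```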
  • Not sure why both your solutions seem to work, but not on my computer....
    – user85727
    Commented Feb 8, 2017 at 18:44
  • @user85727 maybe try updating R/dplyr?
    – erc
    Commented Feb 8, 2017 at 18:46
  • I think this may be the issue. Weird, it said I had the latest version installed....
    – user85727
    Commented Feb 8, 2017 at 18:47
  • How can I change this code to flag all leading 44s prior to an entry less than 44, so that col2<-c(1,1,1,1,44,44,44,1,44,44,1,2,44,3,44); col1<-c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3); df<-data.frame(col1,col2) gives me TRUE for entries 5, 6, and 7 as well?
    – user85727
    Commented Feb 8, 2017 at 19:25
  • @user85727 Sorry, I don't understand how col1 in rows 5 and 6 fulfills the conditions such that "all leading 44s prior to an entry less than 44"
    – erc
    Commented Feb 8, 2017 at 19:44
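
For the follow-up about flagging every 44 in a run that ends just before a smaller value, here is one possible sketch using run-length encoding. The helper name flag_runs_before_smaller and its top argument are hypothetical, and the code assumes col2 contains no NA. Note that the request is ambiguous, as the last comment points out: row 5 is the final row of group 1, so it is only flagged if grouping is ignored. Both readings are shown.

```r
library(dplyr)

# Hypothetical helper: TRUE for every element of a run of `top` values
# whose run is immediately followed by a value smaller than `top`.
flag_runs_before_smaller <- function(x, top = 44) {
  runs <- rle(x == top)           # alternating runs of top / non-top values
  ends <- cumsum(runs$lengths)    # index of the last element of each run
  nxt  <- x[ends + 1]             # value right after each run; NA past the end
  keep <- runs$values & !is.na(nxt) & nxt < top
  rep(keep, runs$lengths)         # expand back to one flag per element
}

col2 <- c(1,1,1,1,44,44,44,1,44,44,1,2,44,3,44)
col1 <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
df <- data.frame(col1, col2)

# Reading 1: respect the groups, so a run at the end of a group is not flagged.
by_group <- df %>%
  group_by(col1) %>%
  mutate(FLAG = flag_runs_before_smaller(col2)) %>%
  ungroup()
which(by_group$FLAG)    # rows 6, 7 and 13

# Reading 2: ignore the groups, which matches "entries 5, 6, and 7".
no_group <- df %>% mutate(FLAG = flag_runs_before_smaller(col2))
which(no_group$FLAG)    # rows 5, 6, 7, 9, 10 and 13
```

The run-based approach is used here because a single lead() only looks one row ahead, while a run of 44s needs the first value after the whole run.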
