2

Data

I have 2 data frames as follows:

df <- data.frame(frames = 1:10,
                 X1 = c(0,0,0,10,10,10,10,15,15,15),
                 X2 = c(0,0,0,15,15,15,15,10,10,10),
                 X3 = rep(0,10),
                 X4 = rep(0,10),
                 X5 = rep(0,10))

Where,
frames = Time frame number
X1, X2, ..., X5 contain the unique identification number of vehicles. At any time frame, X1 is the nearest to the user (a driver in a driving simulator), X2 is the ** second-nearest** to the user and so on. For example, at frames 5, the vehicle # 10 is nearest so is inX1 and vehicle # 15 was the second nearest. There were no other vehicles around the user so columns X3 through X5 contain zero.

PLEASE NOTE: df doesn't have position; it has vehicle identification number instead. So 0 means no vehicle, 10 and 15 are the vehicle ID 10 and ID 15. They are not in a sequence and the numbers have no significance, so 10 is not 10 feet or the tenth vehicle, it is just an ID. I'd prefer A, B,... but I get the data in the format of ID numbers. And yes, I want to treat 0's as NAs.

The second data frame has similar structure but contains speeds of the vehicles instead of IDs:

df1 <- data.frame(frames = 1:10,
                 X1 = c(0.00,0.00,0.00,14.53,14.90,14.73,14.60,13.90,14.10,14.90),
                 X2 = c(0.00,0.00,0.00,12.57,12.80,13.10,13.60,14.65,14.70,14.79),
                 X3 = rep(0.00,10),
                 X4 = rep(0.00,10),
                 X5 = rep(0.00,10))

Example: At frames 5, the speed of vehicle # 10 was 14.90 m/s and that of vehicle # 15 was 12.80 m/s.

What I want to do?

I want to combine these 2 data frames to create a new one which looks like this:

> df.final
   ID frames speed
1  10      4 14.53
2  10      5 14.90
3  10      6 14.73
4  10      7 14.60
5  10      8 14.65
6  10      9 14.70
7  10     10 14.79
8  15      4 12.57
9  15      5 12.80
10 15      6 13.10
11 15      7 13.60
12 15      8 13.90
13 15      9 14.10
14 15     10 14.90

How can I achieve that? I have read the tutorial of tidyr package but still can't figure out how to do

2
  • Just to clarify, it seems that although your initial data frames df and df1 are set up to give some information about the position of each vehicle relative to the driver, this information isn't of interest to you in your final data frame? Additionally, you want to treat 0's in the vehicle ID or speed variables as NAs, correct?
    – tsurudak
    Commented Sep 30, 2015 at 21:22
  • @tsurudak Thanks for your response. df doesn't have position; it has vehicle identification number instead. So 0 means no vehicle, 10 and 15 are the vehicle ID 10 and ID 15. They are not in a sequence. And yes, I want to treat 0's as NAs. Commented Sep 30, 2015 at 21:34

1 Answer 1

5

Making use of tidyr and dplyr you can do

library(tidyr)
library(dplyr)

## 'melt' the dataframes into long format
## here, 'gather' is the tidyr equivalent of reshape2::melt
df <- df %>%
  gather(position, car_id, X1:X5)

df1 <- df1 %>%
  gather(position, speed, X1:X5)

## merge (join) by frames and position
df_final <- merge(df, df1, by=c("frames", "position"))

## Alternatively you can used dplyr::inner_join
## df_final <- inner_join(df, df1, by=c("frames","position"))
## although you don't need the 'by' argument here as inner_join
## detects the common/join columns

## filter and sort the results
df_final <- df_final %>%
  filter(car_id != 0) %>%
  arrange(car_id, frames)

which gives

df_final
   frames position car_id speed
1       4       X1     10 14.53
2       5       X1     10 14.90
3       6       X1     10 14.73
4       7       X1     10 14.60
....
0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.