I have a data frame that looks like this:
data <- data.frame(PROG_START = c("Tuesday, October 1, 2019", "1682294400"))
> print(data)
PROG_START
1 Tuesday, October 1, 2019
2 1682294400
As you can see, the PROG_START column has dates in 2 different formats (UTX timestamp and day of week, month, day, year). I would like to standardize the values in this column so that the format for all dates is day-month-year.
I've looked through Stack and haven't had much success in finding a solution. I did find one helpful post that led me to develop the following:
data1 <- data %>%
mutate(Dates = case_when(str_detect(PROG_START, '\\d{10}\\.\\d{3}') ~ PROG_START, TRUE ~ NA_character_ )) %>%
mutate(Dates = as.POSIXct(as.numeric(PROG_START), origin = '1970-01-01', tz = 'UTC')) %>%
mutate(PROG_START = anytime(PROG_START)) %>% mutate(PROG_START = coalesce(PROG_START, Dates)) %>% select(-Dates)
This successfully converted most of the UTX timestamps, however, a minority had the wrong year. For example, 1682294400 was ascribed the year 1682 instead of 2023. It also turned all other formatted dates (i.e., Tuesday, October 1, 2019) into NAs.
I then updated the above code to this:
data1 <- data %>%
mutate(PROG_START = case_when(
str_detect(PROG_START, '\\d{10}\\.\\d{3}') ~ as.POSIXct(as.numeric(PROG_START), origin = '1970-01-01', tz = 'UTC'),
TRUE ~ dmy(PROG_START)
)) %>%
mutate(PROG_START = format(PROG_START, "%d-%m-%Y"))
However, this turned everything into NAs. Not sure how to proceed at the moment. I appreciate the help! Thank you!