Describe your issue
Hello! I’m working through chapter 9 of the handbook. The instructions in part 9.7, on working with the date-time class, say that a “clean” time-of-admission column, with missing values filled in with the column median, should be created because lubridate won’t operate on missing values. However, when I run the given code, the missing values are not replaced with the median of time_admission:
# packages
pacman::p_load(tidyverse, lubridate, stringr)

# time_admission is a column in hours:minutes
linelist <- linelist %>%
  # when time of admission is not given, assign the median admission time
  mutate(
    time_admission_clean = ifelse(
      is.na(time_admission),   # if time is missing
      median(time_admission),  # assign the median
      time_admission           # if not missing keep as is
    )) %>%
  # use str_glue() to combine date and time columns to create one character column
  # and then use ymd_hm() to convert it to datetime
  mutate(
    date_time_of_admission = str_glue("{date_hospitalisation} {time_admission_clean}") %>%
      ymd_hm()
  )

# inspect the first 10 rows of the relevant columns
linelist %>%
  select(date_hospitalisation, time_admission_clean, date_time_of_admission) %>%
  head(10)
Output:
date_hospitalisation time_admission_clean date_time_of_admission
1 2014-05-15 <NA> <NA>
2 2014-05-14 09:36 2014-05-14 09:36:00
3 2014-05-18 16:48 2014-05-18 16:48:00
4 2014-05-20 11:22 2014-05-20 11:22:00
5 2014-05-22 12:60 <NA>
6 2014-05-23 14:13 2014-05-23 14:13:00
7 2014-05-29 14:33 2014-05-29 14:33:00
8 2014-06-03 09:25 2014-06-03 09:25:00
9 2014-06-06 11:16 2014-06-06 11:16:00
10 2014-06-07 10:55 2014-06-07 10:55:00
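A minimal check of why row 1 might stay missing, using a small made-up vector in place of linelist$time_admission (the values are hypothetical). It only shows how median() behaves when the input contains NA and na.rm is left at its default:

```r
# Toy stand-in for linelist$time_admission (hypothetical values, "HH:MM" as character)
time_admission <- c(NA, "09:36", "16:48", "11:22")

# With a missing value present and na.rm left at FALSE (the default),
# median() returns NA
med_default <- median(time_admission)
is.na(med_default)

# The "fill" value is therefore NA itself, so the missing row stays missing
filled <- ifelse(is.na(time_admission), med_default, time_admission)
filled
```

If this matches what happens in the real column, the <NA> in row 1 would come from the missing values themselves rather than only from the column class.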
I think it is because the time_admission column is of class character. I tried converting it to numeric with the following code, but I got all NAs instead:
linelist <- linelist %>%
  mutate(time_admission = as.numeric(time_admission)) %>%
  # when time of admission is not given, assign the median admission time
  mutate(
    time_admission_clean = ifelse(
      is.na(time_admission),   # if time is missing
      median(time_admission),  # assign the median
      time_admission           # if not missing keep as is
    )) %>%
  # use str_glue() to combine date and time columns to create one character column
  # and then use ymd_hm() to convert it to datetime
  mutate(
    date_time_of_admission = str_glue("{date_hospitalisation} {time_admission_clean}") %>%
      ymd_hm()
  )
Output:
date_hospitalisation time_admission time_admission_clean date_time_of_admission
1 2014-05-15 NA NA <NA>
2 2014-05-14 NA NA <NA>
3 2014-05-18 NA NA <NA>
4 2014-05-20 NA NA <NA>
5 2014-05-22 NA NA <NA>
6 2014-05-23 NA NA <NA>
7 2014-05-29 NA NA <NA>
8 2014-06-03 NA NA <NA>
9 2014-06-06 NA NA <NA>
10 2014-06-07 NA NA <NA>
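The all-NA result above is consistent with as.numeric() not being able to parse strings such as "09:36". One possible workaround, sketched here in base R and not taken from the handbook, is to convert the times to minutes since midnight, take the median with na.rm = TRUE, and format the result back to "HH:MM". The toy vector and the helper to_minutes() are hypothetical names used only for illustration:

```r
# Toy stand-in for linelist$time_admission (hypothetical values, "HH:MM" as character)
time_admission <- c(NA, "09:36", "16:48", "11:22")

# as.numeric() cannot parse "HH:MM", so every value becomes NA (with a warning)
as_num <- suppressWarnings(as.numeric(time_admission))

# Hypothetical helper: convert "HH:MM" strings to minutes since midnight
to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 2) return(NA_real_)      # NA or malformed input
    as.numeric(p[1]) * 60 + as.numeric(p[2])
  }, numeric(1))
}

# Median in minutes, ignoring missing values, rounded to a whole minute
med_min <- as.integer(round(median(to_minutes(time_admission), na.rm = TRUE)))

# Format the median back to "HH:MM"
median_time <- sprintf("%02d:%02d", med_min %/% 60L, med_min %% 60L)

# Fill missing times with the formatted median, keep the rest as is
time_admission_clean <- ifelse(is.na(time_admission), median_time, time_admission)
time_admission_clean
```

Inside the pipeline, the same idea would go in the mutate() step that creates time_admission_clean. Note that this sketch does not validate the times, so a value such as 12:60 (row 5 in the first output) would presumably still fail to parse in ymd_hm().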