Help for module 10

Describe your issue

  • I need help cleaning the dates: when I convert the column to class Date, every value becomes NA.

pacman::p_load(rio, janitor, tidyverse, datapasta, reprex)

# import data
# (not run in this reprex: `here` is not loaded above, and the sample data below stands in for the file)
# surv_raw <- import(here("data", "raw", "surveillance_linelist_20141201.csv"))

# sample data: the first five rows, including the raw onset_date strings
surv_raw <- data.frame(
  stringsAsFactors = FALSE,
  case_id = c("694928","86340d","92d002","544bd1","6056ba"),
  sex = c("m", "f", "f", "f", "f"),
  onset_date = c("11/9/2014","10/30/2014","8/16/2014","8/29/2014","10/20/2014")
)
# try to convert column to class "Date"
surv_clean <- surv_raw %>% 
  clean_names() %>% 
  mutate(onset_date = ymd(onset_date))
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `onset_date = ymd(onset_date)`.
#> Caused by warning:
#> ! All formats failed to parse. No formats found.

# check the CLEANED date column class and date range
class(surv_clean$onset_date)
#> [1] "Date"
range(surv_clean$onset_date)
#> [1] NA NA

Created on 2024-03-08 with reprex v2.0.2

Session info
sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=English_United Kingdom.utf8 
#> [2] LC_CTYPE=English_United Kingdom.utf8   
#> [3] LC_MONETARY=English_United Kingdom.utf8
#> [4] LC_NUMERIC=C                           
#> [5] LC_TIME=English_United Kingdom.utf8    
#> 
#> time zone: Europe/London
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] reprex_2.0.2    datapasta_3.1.0 lubridate_1.9.3 forcats_1.0.0  
#>  [5] stringr_1.5.0   dplyr_1.1.2     purrr_1.0.2     readr_2.1.4    
#>  [9] tidyr_1.3.0     tibble_3.2.1    ggplot2_3.4.4   tidyverse_2.0.0
#> [13] janitor_2.2.0   rio_1.0.1      
#> 
#> loaded via a namespace (and not attached):
#>  [1] utf8_1.2.3        generics_0.1.3    stringi_1.7.12    hms_1.1.3        
#>  [5] digest_0.6.33     magrittr_2.0.3    evaluate_0.21     grid_4.3.1       
#>  [9] timechange_0.2.0  fastmap_1.1.1     fansi_1.0.4       scales_1.2.1     
#> [13] cli_3.6.1         rlang_1.1.1       munsell_0.5.0     withr_2.5.0      
#> [17] yaml_2.3.7        tools_4.3.1       tzdb_0.4.0        colorspace_2.1-0 
#> [21] pacman_0.5.1      vctrs_0.6.3       R6_2.5.1          lifecycle_1.0.3  
#> [25] snakecase_0.11.1  fs_1.6.3          pkgconfig_2.0.3   pillar_1.9.0     
#> [29] gtable_0.3.4      glue_1.6.2        xfun_0.40         tidyselect_1.2.0 
#> [33] rstudioapi_0.15.0 knitr_1.43        htmltools_0.5.7   rmarkdown_2.25   
#> [37] compiler_4.3.1

Hi,

Please see the following thread for a solution to this problem: "Exercise for R Training course: Unexpected NA value in date column" (reply #2, by machupovirus).
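
In short, and judging only from the reprex above: the onset_date strings are written month/day/year (for example "10/30/2014"), while ymd() expects year-month-day, so every value fails to parse and becomes NA. Assuming the whole column follows the month/day/year layout, lubridate's mdy() should parse it. Here is a minimal sketch; the vector name onset_raw is only for illustration.

```r
library(lubridate)

# the same five strings as in the reprex (month/day/year)
onset_raw <- c("11/9/2014", "10/30/2014", "8/16/2014", "8/29/2014", "10/20/2014")

# mdy() matches the month/day/year layout of these strings
onset_date <- mdy(onset_raw)

class(onset_date)
#> [1] "Date"
range(onset_date)
#> [1] "2014-08-16" "2014-11-09"
```

In the original pipeline, that would mean replacing ymd(onset_date) with mdy(onset_date) inside mutate().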

All the best,

Tim