Epicurve is not showing any data

I am trying to make an epicurve using date of onset. I was taught to use ymd() to convert to date class, but when I run my ggplot command, no data appears. I’ve looked at the Epi R Handbook but cannot find a solution. Please help me!!!

I am putting a reproducible example below:

# Install and Load packages -----------------------------------------------
pacman::p_load(
  rio,           # importing data
  here,          # finding files in your R project
  janitor,       # data cleaning & fast simple tables with tabyl()
  tidyverse,      # mega-package of data management & visualization
  datapasta,
  reprex
)



# Import data -------------------------------------------------------------
linelist_raw <- data.frame(
  stringsAsFactors = FALSE,
  dayonset = c("12nov2006", NA, NA, NA, "12nov2006"),
  age = c(18L, 18L, 17L, 17L, 18L),
  sex = c("male", "female", "female", "male", "female")
)

# Data cleaning -----------------------------------------------------------
linelist_clean <- linelist_raw %>%   # begin with raw data
  clean_names() %>%                  # standardise column names
  mutate(dayonset = ymd(dayonset))   # convert to date class
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `dayonset = ymd(dayonset)`.
#> Caused by warning:
#> ! All formats failed to parse. No formats found.


# Make plots --------------------------------------------------------------
# epicurve
ggplot(data = linelist_clean,
       mapping = aes(x = dayonset))+
  geom_histogram()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> Warning: Removed 5 rows containing non-finite values (`stat_bin()`).

Created on 2023-10-12 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23 ucrt)
#>  os       Windows 10 x64 (build 22621)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.utf8
#>  ctype    English_United States.utf8
#>  tz       Europe/Berlin
#>  date     2023-10-12
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.4.1   2022-09-23 [1] CRAN (R 4.2.1)
#>  colorspace    2.0-3   2022-02-21 [1] CRAN (R 4.2.1)
#>  curl          4.3.2   2021-06-23 [1] CRAN (R 4.2.1)
#>  datapasta   * 3.1.0   2020-01-17 [1] CRAN (R 4.2.1)
#>  digest        0.6.29  2021-12-01 [1] CRAN (R 4.2.1)
#>  dplyr       * 1.1.3   2023-09-03 [1] CRAN (R 4.2.3)
#>  evaluate      0.22    2023-09-29 [1] CRAN (R 4.2.3)
#>  fansi         1.0.3   2022-03-24 [1] CRAN (R 4.2.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.2.3)
#>  forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.2.3)
#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.2.1)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.2.1)
#>  ggplot2     * 3.4.3   2023-08-14 [1] CRAN (R 4.2.3)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.1)
#>  gtable        0.3.4   2023-08-21 [1] CRAN (R 4.2.3)
#>  here        * 1.0.1   2020-12-13 [1] CRAN (R 4.2.1)
#>  hms           1.1.3   2023-03-21 [1] CRAN (R 4.2.3)
#>  htmltools     0.5.6   2023-08-10 [1] CRAN (R 4.2.3)
#>  janitor     * 2.2.0   2023-02-02 [1] CRAN (R 4.2.3)
#>  knitr         1.44    2023-09-11 [1] CRAN (R 4.2.3)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.2)
#>  lubridate   * 1.8.0   2021-10-07 [1] CRAN (R 4.2.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.1)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.2.1)
#>  pacman        0.5.1   2019-03-11 [1] CRAN (R 4.2.1)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.2.3)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.1)
#>  purrr       * 1.0.1   2023-01-10 [1] CRAN (R 4.2.3)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.1)
#>  readr       * 2.1.4   2023-02-10 [1] CRAN (R 4.2.3)
#>  reprex      * 2.0.2   2022-08-17 [1] CRAN (R 4.2.3)
#>  rio         * 1.0.1   2023-09-19 [1] CRAN (R 4.2.3)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.2.3)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.2.3)
#>  rprojroot     2.0.3   2022-04-02 [1] CRAN (R 4.2.1)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.2.3)
#>  scales        1.2.1   2022-08-20 [1] CRAN (R 4.2.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.1)
#>  snakecase     0.11.1  2023-08-27 [1] CRAN (R 4.2.3)
#>  stringi       1.7.8   2022-07-11 [1] CRAN (R 4.2.1)
#>  stringr     * 1.5.0   2022-12-02 [1] CRAN (R 4.2.3)
#>  tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.2.3)
#>  tidyr       * 1.2.1   2022-09-08 [1] CRAN (R 4.2.1)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.2.3)
#>  tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.2.3)
#>  tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.2.3)
#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.2.1)
#>  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.2.3)
#>  withr         2.5.1   2023-09-26 [1] CRAN (R 4.2.3)
#>  xfun          0.40    2023-08-09 [1] CRAN (R 4.2.3)
#>  xml2          1.3.3   2021-11-30 [1] CRAN (R 4.2.1)
#>  yaml          2.3.5   2022-02-21 [1] CRAN (R 4.2.1)
#> 
#>  [1] C:/Users/neale/AppData/Local/R/win-library/4.2
#>  [2] C:/Program Files/R/R-4.2.1/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Hello @epi_dude.

The issue with your code is that you’re not using the correct lubridate function to convert your dayonset column to the Date class. The ymd function expects a date format with the year information first, followed by the month and day.

In your dayonset column, you have the day first, followed by the month and year. So, the most appropriate function to convert this to Date would be the dmy function.

  • ymd stands for “year month day”
  • dmy stands for “day month year”
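To see the difference directly, here is a minimal sketch that parses one of your values with both functions. It assumes lubridate is installed and that your session uses an English locale, since month abbreviations such as "nov" are matched against the locale's month names. The object names onset_text, parsed_dmy and parsed_ymd are only illustrative.

```r
library(lubridate)

# One value in the same layout as the dayonset column (day, month, year)
onset_text <- "12nov2006"

# dmy() reads the components in day-month-year order
parsed_dmy <- dmy(onset_text)
format(parsed_dmy)   # "2006-11-12"

# ymd() looks for year-month-day, finds no match, and returns NA with a warning
parsed_ymd <- suppressWarnings(ymd(onset_text))
is.na(parsed_ymd)    # TRUE
```

The NA from ymd() is what left your histogram empty: every row was dropped as a non-finite value.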

Remember to verify your current date format before attempting the conversion to the Date class. The corrected code should look like this:

linelist_clean <- linelist_raw %>%   # begin with raw data
  clean_names() %>%                  # standardise column names
  mutate(dayonset = dmy(dayonset))   # convert to date class (day-month-year)
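It can also help to check the result of the conversion before plotting. The sketch below rebuilds your example data, counts how many dates are missing after parsing, and then builds the epicurve. It is only one way to do this check: it skips clean_names() because the column names are already clean, and binwidth = 1 (one day per bar) and the object name epicurve are my own choices rather than something from your code.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Same example data as in the question
linelist_raw <- data.frame(
  stringsAsFactors = FALSE,
  dayonset = c("12nov2006", NA, NA, NA, "12nov2006"),
  age = c(18L, 18L, 17L, 17L, 18L),
  sex = c("male", "female", "female", "male", "female")
)

linelist_clean <- linelist_raw %>%
  mutate(dayonset = dmy(dayonset))   # day-month-year order

# Check the class and the number of missing dates after conversion
class(linelist_clean$dayonset)       # "Date"
n_missing <- sum(is.na(linelist_clean$dayonset))
n_missing                            # 3, the rows that were already NA in the raw data

# Epicurve with one bar per day; run print(epicurve) to draw it
epicurve <- ggplot(data = linelist_clean,
                   mapping = aes(x = dayonset)) +
  geom_histogram(binwidth = 1)
```

If the number of missing dates after conversion is larger than in the raw data, some values did not match the expected order and need a closer look.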

Thank you @lnielsen for this fast reply! I think I understand better now how those functions work - I should use the lubridate function (either dmy(), myd(), or ymd()) that corresponds to how the dates are structured in the raw/unclean data.

And my code is working now! I have marked your reply as the Solution.
