I am an epidemiologist who has just completed an introductory R course. As part of the reprex exercise, I created an error in my code, and I am copying it below to ask for help troubleshooting it. Any help will be greatly appreciated. The reprex is below.

pacman::p_load(
  rio,          # for importing data
  here,         # for locating files
  skimr,        # for reviewing the data
  janitor,      # for data cleaning
  epikit,       # creating age categories
  gtsummary,    # creating tables
  RColorBrewer, # for colour palettes
  viridis,      # for more colour palettes
  scales,       # percents in tables
  flextable,    # for making pretty tables
  gghighlight,  # highlighting plot parts
  ggExtra,      # special plotting functions
  naniar,       # replace values with NA
  tidyverse     # for data management and visualization
)

# Import data -------------------------------------------------------------

# importing the file not from a project folder using here() as coded below, followed by using read.csv()

file_path <- here ("C:/Users/vxe9/Desktop/intro_course/learning_materials/extra_datasets/H7N9_china_2013_EN.csv")

h1n1 <- data.frame(
  stringsAsFactors = FALSE,
  case_id = c(1L, 2L, 3L, 4L, 5L),
  date_of_symptoms = c("2/19/2013",
  date_of_hospitalisation = c(NA,"3/3/2013",
  date_of_result = c("3/4/2013",
  sex = c("m", "m", "f", "f", "f"),
  age = c("87", "27", "35", "45", "48"),
  province = c("Shanghai",

# clean the imported data

h1n1_cl <- h1n1 %>% 
  clean_names() %>% 
  distinct () %>% 
  # rename variables
  # new name = old name
  rename(
    symp_date   = date_of_symptoms,
    hosp_date   = date_of_hospitalisation,
    result_date = date_of_result
  ) %>% 
  # transform variables 
  mutate (
    symp_date   = mdy (symp_date),
    hosp_date   = mdy (hosp_date),
    result_date = ymd (result_date),
    sex    = recode (sex,
                     "m" = "male",
                     "f" = "female"),
    age_cat   = age_categories (
      breakers = c(0, 10, 20, 30, 40, 50, 60, 70))) %>% 
  replace_with_na(replace = list(result = c("", "D"))) %>%        #replace_with_na() of naniar package replaces specific values with na
  filter (province != "Anhui") %>% 
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `result_date = ymd(result_date)`.
#> Caused by warning:
#> ! All formats failed to parse. No formats found.
#> Warning: Missing from data: `result`

Created on 2024-03-09 with reprex v2.0.2

There are two issues with your code. First, you use the ymd() function for result_date even though it is formatted as month/day/year (e.g. "3/4/2013"), so you should keep using mdy(), as you did for the other two date columns.
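To make the date-order point concrete, here is a minimal sketch, assuming only that lubridate is installed. The vector `result_raw` is a hypothetical stand-in that reuses two result dates from this thread, not the full dataset.

```r
library(lubridate)

# Hypothetical stand-in for the result dates, written as month/day/year
result_raw <- c("3/4/2013", "3/10/2013", NA)

# mdy() reads month, then day, then year, so these values parse
parsed_mdy <- mdy(result_raw)

# ymd() expects the year first; nothing matches, so every value becomes NA
# (lubridate also warns that all formats failed to parse, suppressed here)
parsed_ymd <- suppressWarnings(ymd(result_raw))

parsed_mdy
parsed_ymd
```

Comparing the two printed vectors side by side is a quick way to confirm which parser matches the way a column is actually written.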

Second, in replace_with_na() you specified a variable called result, which does not exist in your data. Perhaps you intended to include this variable, or you meant result_date, which is what I assumed.
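As a rough illustration of why the column name matters, the sketch below uses a hypothetical toy data frame (`toy`, with a made-up `result` column) rather than the data from the reprex. It assumes naniar is installed. Checking `names()` first shows whether the column named in `replace` is really there.

```r
library(naniar)

# Hypothetical toy data: "D" and "" stand in for placeholder values
toy <- data.frame(
  case_id = 1:3,
  result  = c("positive", "D", ""),
  stringsAsFactors = FALSE
)

# Check that the column exists before asking naniar to modify it
has_result      <- "result" %in% names(toy)       # this column is present
has_result_date <- "result_date" %in% names(toy)  # this one is not

# The name inside `replace` matches a real column, so the placeholders become NA
cleaned <- replace_with_na(toy, replace = list(result = c("", "D")))

cleaned
```

When the name inside `replace` does not match any column, naniar cannot find anything to replace, which is what the "Missing from data" warning in the reprex is reporting.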

Please see the code below:

# loading packages
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#>     chisq.test, fisher.test

# creating fake data
h1n1 <- data.frame(
  stringsAsFactors = FALSE,
  case_id = c(1L, 2L, 3L, 4L, 5L),
  # first value of each shortened vector is taken from the question's reprex
  date_of_symptoms = c(
    "2/19/2013", "2/27/2013", "3/9/2013", "3/19/2013", "3/19/2013"
  ),
  date_of_hospitalisation = c(
    NA, "3/3/2013",
    "3/19/2013", "3/27/2013", "3/30/2013"
  ),
  date_of_result = c(
    "3/4/2013", "3/10/2013", "4/9/2013", NA, "5/15/2013"
  ),
  sex = c("m", "m", "f", "f", "f"),
  age = c("87", "27", "35", "45", "48"),
  province = c(
    "Shanghai", "Shanghai", "Anhui", "Jiangsu", "Jiangsu"
  )
)

# cleaning the data
h1n1_clean <- h1n1 |>
    clean_names() |>
    # new name = old name
    rename(
        symp_date = date_of_symptoms,
        hosp_date = date_of_hospitalisation,
        result_date = date_of_result
    ) |>
    mutate(
        symp_date = mdy(symp_date),
        hosp_date = mdy(hosp_date),
        # result_date is month/day/year as well, so mdy() rather than ymd()
        result_date = mdy(result_date),
        sex = recode(sex,
                     "m" = "male",
                     "f" = "female"),
        # age is stored as text in the fake data, so it is converted
        # before binning (an assumed fix)
        age_cat = age_categories(
            as.numeric(age),
            breakers = c(0, 10, 20, 30, 40, 50, 60, 70))) |>
    replace_with_na(replace = list(result_date = c("", "D"))) |>
    dplyr::filter(province != "Anhui")

Created on 2024-03-10 with reprex v2.1.0

All the best,
