Error when trying to convert age variable to numeric

Hi,

This is an exercise on posting a reproducible example for an R training course.
I am cleaning the dataset and tried to convert age to a numeric variable, but I get an unexpected error message.

Here is the reprex:

``` r
# EXAMPLE R SCRIPT FOR REPREX

# install and load packages
pacman::p_load(
rio,        # Importing data  
here,       # Relative file pathways 
janitor,    # Data cleaning and tables
lubridate,  # Working with dates
tidyverse,  # Data management and visualization
datapasta,
reprex
)


# import data
H7N9_raw <- data.frame(
stringsAsFactors = FALSE,
date_of_symptoms = c("2/19/2013",
                     "2/27/2013","3/9/2013","3/19/2013","3/19/2013"),
date_of_hospitalisation = c(NA,"3/3/2013",
                            "3/19/2013","3/27/2013","3/30/2013"),
date_of_result = c("3/4/2013",
                   "3/10/2013","4/9/2013",NA,"5/15/2013"),
sex = c("m", "m", "f", "f", "f"),
age = c("87", "27", "35", "45", "48"),
province = c("Shanghai",
             "Shanghai","Anhui","Jiangsu","Jiangsu"))

# clean data

H7N9_clean <- H7N9_raw %>%
clean_names() %>%
filter(province != "Anhui") %>%
mutate( 
  date_of_symptoms = mdy(date_of_symptoms),              # Convert to date format
  date_of_hospitalisation = mdy(date_of_hospitalisation),
  date_of_result = mdy(date_of_result)) %>%
mutate(sex = recode(sex, "m"  = "male",                  # Recode categories for sex
                    "f" = "female")) %>%
mutate(sex = na_if(sex, "")) %>%                         # Replace empty cells by NA
mutate(age = na_if(age, "?")) %>%                        # Replace ? by NA
mutate(age = as.numeric(age)) %>%                           # Convert to numeric

#check if age is numeric
class(H7N9_clean$age)
#> Error: object 'H7N9_clean' not found
```

Created on 2024-03-04 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31 ucrt)
#>  os       Windows 10 x64 (build 19045)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United Kingdom.utf8
#>  ctype    English_United Kingdom.utf8
#>  tz       Europe/London
#>  date     2024-03-04
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.2)
#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.2)
#>  datapasta   * 3.1.0   2020-01-17 [1] CRAN (R 4.3.3)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.2)
#>  dplyr       * 1.1.3   2023-09-03 [1] CRAN (R 4.3.2)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.2)
#>  fansi         1.0.5   2023-10-08 [1] CRAN (R 4.3.2)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.2)
#>  forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.3.2)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.2)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.2)
#>  ggplot2     * 3.4.4   2023-10-12 [1] CRAN (R 4.3.2)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.2)
#>  gtable        0.3.4   2023-08-21 [1] CRAN (R 4.3.2)
#>  here        * 1.0.1   2020-12-13 [1] CRAN (R 4.3.2)
#>  hms           1.1.3   2023-03-21 [1] CRAN (R 4.3.2)
#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
#>  janitor     * 2.2.0   2023-02-02 [1] CRAN (R 4.3.2)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.2)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.2)
#>  lubridate   * 1.9.3   2023-09-27 [1] CRAN (R 4.3.2)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.2)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.2)
#>  pacman        0.5.1   2019-03-11 [1] CRAN (R 4.3.2)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.2)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.2)
#>  purrr       * 1.0.2   2023-08-10 [1] CRAN (R 4.3.2)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.2)
#>  readr       * 2.1.4   2023-02-10 [1] CRAN (R 4.3.2)
#>  reprex      * 2.0.2   2022-08-17 [1] CRAN (R 4.3.2)
#>  rio         * 1.0.1   2023-09-19 [1] CRAN (R 4.3.2)
#>  rlang         1.1.2   2023-11-04 [1] CRAN (R 4.3.2)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.2)
#>  rprojroot     2.0.4   2023-11-05 [1] CRAN (R 4.3.2)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.2)
#>  scales        1.2.1   2022-08-20 [1] CRAN (R 4.3.2)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
#>  snakecase     0.11.1  2023-08-27 [1] CRAN (R 4.3.2)
#>  stringi       1.8.1   2023-11-13 [1] CRAN (R 4.3.2)
#>  stringr     * 1.5.1   2023-11-14 [1] CRAN (R 4.3.2)
#>  tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.3.2)
#>  tidyr       * 1.3.0   2023-01-24 [1] CRAN (R 4.3.2)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.2)
#>  tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.3.2)
#>  timechange    0.2.0   2023-01-11 [1] CRAN (R 4.3.2)
#>  tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.3.2)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.2)
#>  vctrs         0.6.4   2023-10-12 [1] CRAN (R 4.3.2)
#>  withr         2.5.2   2023-10-30 [1] CRAN (R 4.3.2)
#>  xfun          0.41    2023-11-01 [1] CRAN (R 4.3.2)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.2)
#> 
#>  [1] C:/Users/Magali.Collonnaz/AppData/Local/R/win-library/4.3
#>  [2] C:/Program Files/R/R-4.3.2/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Hi @magali.collonnaz and welcome to the forum!

I think removing the %>% pipe operator at the end of the last line of your cleaning command will fix the error. With the trailing pipe, R treats the next line of code, class(H7N9_clean$age), as one more step of the same pipeline, so it is evaluated before the cleaned data has been assigned to H7N9_clean. That is why R reports that the object is not found.

Below is the fixed code.

All the best,


``` r
# install and load packages
pacman::p_load(
  rio,        # Importing data  
  here,       # Relative file pathways 
  janitor,    # Data cleaning and tables
  lubridate,  # Working with dates
  tidyverse,  # Data management and visualization
  datapasta,
  reprex
)


# import data
H7N9_raw <- data.frame(
  stringsAsFactors = FALSE,
  date_of_symptoms = c("2/19/2013",
                       "2/27/2013","3/9/2013","3/19/2013","3/19/2013"),
  date_of_hospitalisation = c(NA,"3/3/2013",
                              "3/19/2013","3/27/2013","3/30/2013"),
  date_of_result = c("3/4/2013",
                     "3/10/2013","4/9/2013",NA,"5/15/2013"),
  sex = c("m", "m", "f", "f", "f"),
  age = c("87", "27", "35", "45", "48"),
  province = c("Shanghai",
               "Shanghai","Anhui","Jiangsu","Jiangsu"))

# clean data

H7N9_clean <- H7N9_raw %>%
  clean_names() %>%
  filter(province != "Anhui") %>%
  mutate(
    date_of_symptoms = mdy(date_of_symptoms),              # Convert to date format
    date_of_hospitalisation = mdy(date_of_hospitalisation),
    date_of_result = mdy(date_of_result)) %>%
  mutate(sex = recode(sex, "m"  = "male",                  # Recode categories for sex
                      "f" = "female")) %>%
  mutate(sex = na_if(sex, "")) %>%                         # Replace empty cells by NA
  mutate(age = na_if(age, "?")) %>%                        # Replace ? by NA
  mutate(age = as.numeric(age))                            # Convert to numeric

# check if age is numeric
class(H7N9_clean$age)
```
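
To see the effect of a trailing pipe in isolation, here is a minimal sketch. It is only an illustration: the names `ages`, `broken_clean`, `fixed_clean` and `broken` are made up for this example, and it uses base R's native pipe `|>` (R 4.1 or later) instead of `%>%` so that it runs without loading any packages. As far as the trailing-pipe problem goes, both pipes should behave the same way.

``` r
# Hypothetical toy data standing in for the age column
ages <- c("87", "27", "45")

# Broken version: the pipe at the end of the as.numeric() line tells R the
# expression is not finished, so the comment is skipped and class(broken_clean)
# becomes the last step of the pipeline. It runs before the assignment,
# when broken_clean does not exist yet. tryCatch() captures the error message.
broken <- tryCatch({
  broken_clean <- ages |>
    as.numeric() |>
    # check if numeric
    class(broken_clean)
  "no error"
}, error = function(e) conditionMessage(e))

print(broken)

# Fixed version: no trailing pipe, so the assignment completes first
fixed_clean <- ages |>
  as.numeric()

# Now the check runs as a separate statement on an existing object
class(fixed_clean)
```

Keeping the check on its own line, outside the pipeline and without indentation, makes it easier to spot a pipe that has been left dangling.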
