Hi! I’m trying to write a script for cleaning messy dates. I have a linelist of deaths with a column called “Date of Death”. The dates in this column are not uniformly formatted, so I’d like to parse them into a single, uniform date format. However, when I run the whole script I get the error shown below (object 'cleanlist' not found), even though the cleanlist code runs without any issues when I run it manually. I’d also like to know how to properly use the mutate function to update the date_of_death column in the cleanlist table/object. Thank you!
# Loading packages --------------------------------------------------------
# Checks if package is installed, installs if necessary, and loads package for current session
pacman::p_load(
lubridate, # general package for handling and converting dates
parsedate, # has function to "guess" messy dates
here, # file management
rio, # data import/export
janitor, # data cleaning
epikit, # for creating age categories
tidyverse, # data management and visualization
magrittr,
dplyr,
reprex, # minimal repr example
datapasta # sample data
)
datecolumn <- "date_of_death"
# Data Import -------------------------------------------------------------
linelist <- data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
`Date of Death` = c("45236","45212","45152",
"JANUARY 19, 2023","June 25, 2023","45200","45164",
"5/16/2023","45277","44930"))
# Data Cleaning -----------------------------------------------------------
cleanlist <- linelist %>% # the raw dataset
clean_names() %>% # automatically clean column names
# Format Dates ------------------------------------------------------------
# parse the date values
mutate(parsedate::parse_date(cleanlist[[datecolumn]]))
#> Error in `mutate()`:
#> ℹ In argument: `parsedate::parse_date(cleanlist[[datecolumn]])`.
#> Caused by error:
#> ! object 'cleanlist' not found
Just to be clear, I only used the slice_head function to print a sample of the data to the console; you wouldn’t use it in the actual process of formatting your dates. Let me know if you have any questions!
Yeah, I tried the code without the slice_head function and it worked fine. However, there seems to be a problem with parsing the Excel dates. For example, the “45236” value was converted to “2024-08-12”, but the correct date should be “2023-11-06”.
I think that is a common issue with Excel numeric dates. The janitor package has the function excel_numeric_to_date(), which might be useful for you.
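For anyone landing here later, here is a minimal sketch of how the two pieces could fit together. It is not the exact code from the replies above. It assumes that every all-digit value is an Excel serial date and that every other value is written month-first (as in the sample data), so lubridate::mdy() is swapped in for parsedate::parse_date() on the text values. The helper column is_serial is just an illustrative name. Note that inside mutate() the column is referred to by its name rather than through cleanlist[[datecolumn]]: the pipeline is evaluated before the result is assigned, so cleanlist does not exist yet when the script is run from a fresh session. If the column name has to come from the datecolumn variable, .data[[datecolumn]] is the usual way to look it up inside mutate().

```r
library(dplyr)
library(janitor)
library(lubridate)

# Sample data from the original post
linelist <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  `Date of Death` = c("45236", "45212", "45152",
                      "JANUARY 19, 2023", "June 25, 2023", "45200", "45164",
                      "5/16/2023", "45277", "44930")
)

cleanlist <- linelist %>%
  clean_names() %>%  # "Date of Death" becomes date_of_death
  mutate(
    # Assumption: a value made up only of digits is an Excel serial date
    is_serial = grepl("^[0-9]+$", date_of_death),
    date_of_death = if_else(
      is_serial,
      # Excel serials: convert to numeric, then to Date (text values become NA here)
      excel_numeric_to_date(suppressWarnings(as.numeric(date_of_death))),
      # Text dates: assumed month-day-year (serials become NA here)
      mdy(date_of_death, quiet = TRUE)
    )
  ) %>%
  select(-is_serial)  # drop the helper column

cleanlist
```

With this approach "45236" should come out as 2023-11-06, matching the expected date mentioned above. If the real data also contains day-first or year-first text dates, the mdy() branch would need to be adjusted accordingly.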