Hi Ian,
You are correct about how to handle missing values in the max function, and the same approach applies to many other functions in R. However, you will need to decide whether dropping missing values is how you want to handle the situation, or whether some other summary would be better.
With regard to why -Inf is returned for case 103: the only unique value of age_years_mother for that case is NA. Because you've told R to remove NAs, max() is left with no values at all, and the maximum of an empty set is defined as -Inf, so that is what is returned (along with a warning). You could write additional code that first checks whether all of the values in these variables are NA and returns NA if so. Again, this ultimately requires you to decide how you want to handle the data; I don't think there's any one answer that is the absolute best.
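To make the "check for all-NA first" idea concrete, here is a minimal sketch in base R. The helper name max_or_na is hypothetical (it is not part of base R or the tidyverse), and the two age vectors are made up to mirror cases 103 and 104.

```r
# Hypothetical helper (an assumption, not an existing function):
# return NA when every value is missing, otherwise the usual maximum.
max_or_na <- function(x) {
  if (all(is.na(x))) {
    return(NA)
  }
  max(x, na.rm = TRUE)
}

ages_103 <- c(NA_integer_, NA_integer_) # like case 103: only missing ages
ages_104 <- c(26L, 27L, NA_integer_)    # like case 104: two known ages, one missing

# Plain max() with na.rm = TRUE has nothing left to compare for case 103
suppressWarnings(max(ages_103, na.rm = TRUE)) # -Inf

max_or_na(ages_103) # NA
max_or_na(ages_104) # 27
```

Whether NA is the right answer for an all-missing case is still a judgment call; the helper only makes that choice explicit instead of leaving it to max().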
Finally, I think the following code is a better solution for the problem:
library(tidyverse)

# create demo dataset
demo_raw <- data.frame(
  stringsAsFactors = FALSE,
  case_number = c(101, 101, 102, 102, 103, 103, 104, 104, 104),
  date_abstracted = c(
    "2023-04-01",
    "2023-04-03",
    "2023-04-02",
    "2023-04-03",
    "2023-04-03",
    "2023-04-10",
    "2023-04-10",
    "2023-04-20",
    "2023-05-01"
  ),
  ssx_cataract_a = c(0, 1, 0, 0, 0, 0, 1, 1, 1),
  ssx_hearing_a = c(0, 0, 0, 0, 1, 1, 0, 0, 0),
  ssx_chd_a = c(1, 0, 1, 1, 1, 1, 0, 0, 0),
  ssx_microcephaly_b = c(0, 0, 0, 1, 0, 0, 0, 0, 0),
  ssx_delay_b = c(0, 0, 0, 0, 0, 0, 0, 0, 0),
  ssx_jaundice_b = c(0, 0, 0, 0, 0, 0, 0, 0, 0),
  ig_m = c("NA", "NA", "NA", "NA", "NA", "1", "NA", "NA", "NA"),
  ig_g = c("NA", "NA", "NA", "NA", "NA", "1", "NA", "NA", "NA"),
  case_classification = c(
    "susp",
    "susp",
    "susp",
    "susp",
    "prob",
    "conf",
    "susp",
    "susp",
    "susp"
  ),
  age_years_mother = c("26", "26", "35", "35", "NA", "NA", "26", "27", "NA")
)

# roll-up and overwrite with hierarchy
demo_roll <- demo_raw |>
  # Roll up values into one row per case, keeping only unique values
  group_by(case_number) |>
  summarise(across(everything(), # apply to all non-grouping columns
                   # combine the unique values into one string; note that the
                   # literal "NA" strings are kept, they are not dropped
                   ~ paste0(unique(.x), collapse = "; ")))

demo_roll |>
  # split the combined strings back apart and convert each piece
  mutate(
    max_date = map(str_split(date_abstracted, ";"), ymd),
    max_age = map(str_split(age_years_mother, ";"), as.integer)
  ) |>
  # take the maximum within each row, ignoring missing values
  rowwise() |>
  mutate(max_date = max(max_date, na.rm = TRUE),
         max_age = max(max_age, na.rm = TRUE)) |>
  ungroup()
#> Warning: There were 2 warnings in `mutate()`.
#> The first warning was:
#> ℹ In argument: `max_age = map(str_split(age_years_mother, ";"), as.integer)`.
#> Caused by warning:
#> ! NAs introduced by coercion
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `max_age = max(max_age, na.rm = TRUE)`.
#> ℹ In row 3.
#> Caused by warning in `max()`:
#> ! no non-missing arguments to max; returning -Inf
#> # A tibble: 4 × 14
#>   case_number date_abstracted             ssx_cataract_a ssx_hearing_a ssx_chd_a
#>         <dbl> <chr>                       <chr>          <chr>         <chr>
#> 1         101 2023-04-01; 2023-04-03      0; 1           0             1; 0
#> 2         102 2023-04-02; 2023-04-03      0              0             1
#> 3         103 2023-04-03; 2023-04-10      0              1             1
#> 4         104 2023-04-10; 2023-04-20; 20… 1              0             0
#> # ℹ 9 more variables: ssx_microcephaly_b <chr>, ssx_delay_b <chr>,
#> #   ssx_jaundice_b <chr>, ig_m <chr>, ig_g <chr>, case_classification <chr>,
#> #   age_years_mother <chr>, max_date <date>, max_age <dbl>
Created on 2023-05-17 with reprex v2.0.2
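If you would rather get NA than -Inf for a case like 103, one option is to compute the maxima from the raw rows before any pasting happens. The following is only a sketch under a few assumptions: demo_small is a cut-down stand-in for demo_raw, max_or_na is a hypothetical helper rather than an existing function, and the literal "NA" strings are assumed to mean "missing".

```r
library(dplyr)
library(lubridate)

# Cut-down stand-in for demo_raw (cases 103 and 104 only)
demo_small <- data.frame(
  case_number = c(103, 103, 104, 104, 104),
  date_abstracted = c("2023-04-03", "2023-04-10",
                      "2023-04-10", "2023-04-20", "2023-05-01"),
  age_years_mother = c("NA", "NA", "26", "27", "NA")
)

# Hypothetical helper: NA when all values are missing, else the maximum
max_or_na <- function(x) {
  if (all(is.na(x))) NA else max(x, na.rm = TRUE)
}

demo_max <- demo_small |>
  mutate(
    date_abstracted = ymd(date_abstracted),
    # turn the literal string "NA" into a real missing value before converting
    age_years_mother = as.integer(na_if(age_years_mother, "NA"))
  ) |>
  group_by(case_number) |>
  summarise(
    max_date = max_or_na(date_abstracted),
    max_age = max_or_na(age_years_mother)
  )

demo_max
```

Here case 103 should end up with a max_age of NA and case 104 with 27, and no coercion warning is raised because the "NA" strings are converted to real missing values first. The result can be joined back onto a rolled-up table by case_number if you still want the pasted columns.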
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.3 (2023-03-15)
#> os macOS Big Sur ... 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Toronto
#> date 2023-05-17
#> pandoc 2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cli 3.6.1 2023-03-23 [1] CRAN (R 4.2.0)
#> colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.2.0)
#> digest 0.6.31 2022-12-11 [1] CRAN (R 4.2.0)
#> dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.2.0)
#> evaluate 0.21 2023-05-05 [1] CRAN (R 4.2.0)
#> fansi 1.0.4 2023-01-22 [1] CRAN (R 4.2.0)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.2.0)
#> forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.2.2)
#> fs 1.6.2 2023-04-25 [1] CRAN (R 4.2.0)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
#> ggplot2 * 3.4.2 2023-04-03 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> gtable 0.3.3 2023-03-21 [1] CRAN (R 4.2.0)
#> hms 1.1.3 2023-03-21 [1] CRAN (R 4.2.0)
#> htmltools 0.5.5 2023-03-23 [1] CRAN (R 4.2.0)
#> knitr 1.42 2023-01-25 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.1)
#> lubridate * 1.9.2 2023-02-10 [1] CRAN (R 4.2.2)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
#> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.2.3)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> purrr * 1.0.1 2023-01-10 [1] CRAN (R 4.2.2)
#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.2.0)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
#> readr * 2.1.4 2023-02-10 [1] CRAN (R 4.2.2)
#> reprex 2.0.2 2022-08-17 [1] RSPM (R 4.2.1)
#> rlang 1.1.1 2023-04-28 [1] CRAN (R 4.2.0)
#> rmarkdown 2.21 2023-03-26 [1] CRAN (R 4.2.0)
#> rstudioapi 0.14 2022-08-22 [1] RSPM (R 4.2.1)
#> scales 1.2.1 2022-08-20 [1] RSPM (R 4.2.1)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> stringi 1.7.12 2023-01-11 [1] CRAN (R 4.2.2)
#> stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.2.0)
#> styler 1.9.1 2023-03-04 [1] CRAN (R 4.2.0)
#> tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.2.3)
#> tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.2.0)
#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.0)
#> tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.2.2)
#> timechange 0.2.0 2023-01-11 [1] CRAN (R 4.2.2)
#> tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
#> utf8 1.2.3 2023-01-31 [1] CRAN (R 4.2.2)
#> vctrs 0.6.2 2023-04-19 [1] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
#> xfun 0.39 2023-04-20 [1] CRAN (R 4.2.0)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.2.0)
#>
#> [1] /Users/timothychisamore/Library/R/x86_64/4.2/library
#> [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
All the best,
Tim