Describe your issue
Hello! I’d like to apply na_if() across multiple specific columns in one go.
What steps have you already taken to find an answer?
At present I apply na_if() in sequential mutate() steps, one variable at a time. Is there a more elegant solution?
Provide an example of your R code
library(tidyverse)

linelist_raw <- data.frame(
  stringsAsFactors = FALSE,
  sex = c("m", "m", "m", "m", "m", "m", "m", "m", "f", "m"),
  hx_vax = c("1", "0", "0", "1", "uk", "uk", "1", "uk", "0", "0"),
  pcr = c("1", "nd", "0", "1", "nd", "0", "1", "uk", "0", "1")
)

linelist <- linelist_raw %>%
  mutate(
    hx_vax = na_if(hx_vax, "uk"),
    pcr = na_if(pcr, "uk"),
    pcr = na_if(pcr, "nd")
  )
In this example dataset, hx_vax has 3 options: 1 = yes, 0 = no, and uk = unknown. pcr has 4 options: 1 = positive, 0 = negative, nd = not done, uk = tested but unknown result.
Based on an example in the na_if() help file, one option would be the following:
linelist <- linelist_raw %>%
  mutate(across(where(is.character), ~na_if(., "uk")))
However, the full dataset has many other character columns that I don’t want to mutate at the moment, so I would rather specify which columns to change.
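One direction that might work is to name the columns directly in across() instead of selecting them with where(). The following is only a sketch of that idea, assuming dplyr 1.0.0 or later (for across()) and the same toy data as above. The "nd" code is still handled in a separate step because it only applies to pcr.

```r
library(dplyr)

# Same toy data as in the example above
linelist_raw <- data.frame(
  stringsAsFactors = FALSE,
  sex = c("m", "m", "m", "m", "m", "m", "m", "m", "f", "m"),
  hx_vax = c("1", "0", "0", "1", "uk", "uk", "1", "uk", "0", "0"),
  pcr = c("1", "nd", "0", "1", "nd", "0", "1", "uk", "0", "1")
)

linelist <- linelist_raw %>%
  mutate(
    # Apply na_if() only to the named columns; sex is left untouched
    across(c(hx_vax, pcr), ~ na_if(.x, "uk")),
    # "nd" only occurs in pcr, so it gets its own step
    pcr = na_if(pcr, "nd")
  )
```

If this is a reasonable approach, the column list inside c() could presumably be swapped for other tidyselect helpers such as all_of() with a character vector of column names.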