Best practice is: don’t translate the data frame. Keep the data as-is, and localize the “presentation layer” (titles, captions, axis labels, legend labels, table headers, factor labels) via a translation dictionary + a language parameter.
That gives you repeatable, auditable reports in multiple languages without duplicating pipelines.
Recommended patterns
1) Parameterize the report by language
In your YAML:
---
title: "Surveillance report"
params:
  lang: "en" # or "ru"
---
Then run twice (or via a script): once with lang="en", once with lang="ru".
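The "via a script" option could look roughly like the sketch below. The helper name render_all(), the file name, and the output naming scheme are made up for illustration; the only real API used is rmarkdown::render() with its output_file and params arguments. The render function is passed in as an argument so the loop can be tried out without actually knitting anything.

```r
# Sketch: render one source file once per language.
# render_all() and the "<stem>_<lang>" naming are hypothetical choices.
render_all <- function(input, langs = c("en", "ru"),
                       render_fun = rmarkdown::render) {
  stem <- tools::file_path_sans_ext(basename(input))
  outputs <- character(0)
  for (lang in langs) {
    # No extension here: rmarkdown appends one that matches the output format
    out <- paste0(stem, "_", lang)
    render_fun(input, output_file = out, params = list(lang = lang))
    outputs <- c(outputs, out)
  }
  invisible(outputs)
}

# Usage (assumes the file exists and rmarkdown is installed):
# render_all("surveillance_report.Rmd")
```

Each call overrides the lang default from the YAML header, so the source file itself never changes between languages.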
2) Use a translation dictionary (not recode() scattered everywhere)
Create a small table (CSV/YAML/R list) mapping keys → en/ru strings.
Example in R:
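i18n <- list(
  en = list(title_cases = "Cases over time", x_date = "Date", y_cases = "Cases", status = "Status"),
  ru = list(title_cases = "Случаи по времени", x_date = "Дата", y_cases = "Случаи", status = "Статус")
)
# Look up a display string; the default language comes from the report parameter
tr <- function(key, lang = params$lang) i18n[[lang]][[key]]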
Then in ggplot:
library(ggplot2)
ggplot(df, aes(date, cases)) +
  geom_col() +
  labs(
    title = tr("title_cases"),
    x = tr("x_date"),
    y = tr("y_cases")
  )
More on i18n here: Appsilon/shiny.i18n on GitHub ("Shiny applications internationalization made easy").
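If the dictionary lives in a CSV rather than an R list, a lookup along these lines might work. This is only a sketch: the column layout (key, en, ru), the helper name tr_csv(), and the fall-back-to-key behaviour are assumptions, not something a particular package prescribes. The CSV is inlined as text so the example runs on its own.

```r
# Hypothetical dictionary contents: one row per key, one column per language.
# For a real file, something like read.csv("i18n.csv", fileEncoding = "UTF-8").
csv_text <- "key,en,ru
title_cases,Cases over time,Случаи по времени
x_date,Date,Дата
y_cases,Cases,Случаи"

dict <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Look up a key; warn and fall back to the key itself when it is missing,
# so an untranslated label is visible in the output instead of silently dropped.
tr_csv <- function(key, lang, dictionary = dict) {
  if (!lang %in% names(dictionary)) {
    stop("Unknown language: ", lang, call. = FALSE)
  }
  hit <- dictionary[[lang]][match(key, dictionary$key)]
  if (is.na(hit)) {
    warning("No '", lang, "' translation for key '", key, "'", call. = FALSE)
    return(key)
  }
  hit
}

tr_csv("x_date", "en")
```

The warning on a missing key is a design choice: the plain list lookup returns NULL for an unknown key, and labs() treats NULL as "no label", which is easy to miss in review.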
3) Translate categorical values only when plotting/reporting
For legend labels and tables, translate factor labels at render time:
library(dplyr)
library(ggplot2)
labels_case_status <- list(
  en = c("confirmed" = "Confirmed", "probable" = "Probable"),
  ru = c("confirmed" = "Подтверждено", "probable" = "Вероятно")
)
# Add a display-only column; the original status column is left untouched
df_plot <- df %>%
  mutate(status_lab = dplyr::recode(status, !!!labels_case_status[[params$lang]]))
ggplot(df_plot, aes(date, fill = status_lab)) +
  geom_bar() +
  labs(fill = tr("status"))
This keeps your “analysis variables” stable (status) and only changes the display labels.
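One thing to watch: when the translated labels are plain character values, ggplot2 orders the legend alphabetically, and alphabetical order can differ between languages. A possible variant, sketched here with base factor() instead of dplyr::recode(), fixes the level order from the dictionary so every language gets the same legend order. The helper name label_status() is hypothetical.

```r
labels_case_status <- list(
  en = c(confirmed = "Confirmed", probable = "Probable"),
  ru = c(confirmed = "Подтверждено", probable = "Вероятно")
)

# Hypothetical helper: translate codes to a factor whose level order follows
# the order of the dictionary entries, not the alphabet of the target language.
label_status <- function(x, lang) {
  labs <- labels_case_status[[lang]]
  factor(x, levels = names(labs), labels = unname(labs))
}

status <- c("probable", "confirmed", "confirmed")
levels(label_status(status, "en"))
levels(label_status(status, "ru"))
```

Codes that are not in the dictionary come out as NA with this approach, which makes missing translations easy to spot.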
Avoid “translate the final PDF/Word”
Translating the rendered output is usually the worst option:
- plots become images (hard to translate axis labels cleanly),
- tables and numbers can be mangled,
- no reproducible audit trail.
If you must, do it only for narrative text, not for charts/tables.
What I’d do in practice
- One Rmd/Quarto source
- params$lang
- A translation dictionary file (CSV/YAML) committed to the repo
- A helper tr() function using i18n
- All labels/titles/headers come from tr()
- Factor/value label translation happens in a prep_for_reporting(lang) step
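As a rough idea of what the prep_for_reporting() step could look like, here is a minimal base-R sketch. The registry structure (value_labels), the extra df argument, and the "_lab" suffix are all assumptions made for the example.

```r
# Hypothetical registry: variable name -> language -> code/label pairs
value_labels <- list(
  status = list(
    en = c(confirmed = "Confirmed", probable = "Probable"),
    ru = c(confirmed = "Подтверждено", probable = "Вероятно")
  )
)

# Add a translated "<var>_lab" column for every registered variable in df.
# The original columns are kept as they are, so analysis code is unaffected.
prep_for_reporting <- function(df, lang, registry = value_labels) {
  for (var in intersect(names(registry), names(df))) {
    labs <- registry[[var]][[lang]]
    if (is.null(labs)) {
      stop("No '", lang, "' labels for variable '", var, "'", call. = FALSE)
    }
    df[[paste0(var, "_lab")]] <- factor(
      df[[var]], levels = names(labs), labels = unname(labs)
    )
  }
  df
}

df <- data.frame(status = c("confirmed", "probable"), cases = c(10, 3))
df_en <- prep_for_reporting(df, "en")
```

Plots and tables then use status_lab, while filters, joins, and models keep using status.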
Best,
Luis