Hi Tim, hi Elham,
thank you so much for your quick and super helpful answers. Much appreciated!
I managed to create a small version of the data set (a reprex) so that you can reproduce it, but the thing is: with this small version everything works perfectly. There is no error. Now the challenge is to find the variable that causes the problem (I randomly chose 5 variables for the small example, but those did not trigger it). I have an idea, but I cannot include it in my reprex. There are some variables with ä/ü/ö, and after loading the data R shows these letters as a question mark (?). I do not know yet how to change that (but I will find out), and I do not know how to put these variables into the small data set for you, because R accepts neither “RA_Gesch?ftsreise” nor “RA_Geschäftsreise” in the test_data code. Could it be that those special letters (ä/ü/ö) cause the error when I try to skim?
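To make the idea more concrete, here is a minimal sketch of how the suspicious variables could be narrowed down, assuming the problem really is an encoding issue. The helper `check_encoding()` and the `demo` data are made up for illustration and are not part of skimr; only base R functions (`charToRaw()`, `validUTF8()`) are used. The umlauts are written as `\u00e4` escapes so that the code itself stays plain ASCII.

```r
# Hypothetical helper (not from skimr): for each column, report whether the
# column name or any value contains non-ASCII bytes, and whether all values
# are valid UTF-8.
check_encoding <- function(df) {
  has_non_ascii <- function(x) {
    x <- as.character(x)
    x <- x[!is.na(x)]
    # A byte above 127 means the string is not pure ASCII
    any(vapply(x, function(s) any(as.integer(charToRaw(s)) > 127L), logical(1)))
  }
  data.frame(
    column            = names(df),
    name_non_ascii    = unname(vapply(names(df), has_non_ascii, logical(1))),
    value_non_ascii   = unname(vapply(df, has_non_ascii, logical(1))),
    values_valid_utf8 = unname(vapply(
      df, function(col) all(validUTF8(as.character(col))), logical(1)
    )),
    stringsAsFactors  = FALSE
  )
}

# Made-up example data (assumption, not the real data set)
demo <- data.frame(
  Geschlecht = c("weiblich", "m\u00e4nnlich"),
  stringsAsFactors = FALSE
)
# A column whose name contains an umlaut
demo[["RA_Gesch\u00e4ftsreise"]] <- c("ja", "nein")
# A column with Latin-1 bytes that are not valid UTF-8
demo$roh <- c("ok", rawToChar(as.raw(c(0x6d, 0xe4, 0x6e))))

res <- check_encoding(demo)
print(res)
```

Columns where `values_valid_utf8` is `FALSE` would be my first suspects. A name such as “RA_Gesch?ftsreise” is not a syntactic R name, so it would probably have to be assigned with `[[ ]]` as above, or written in backticks, to end up in a small test data set.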
I hope I have expressed my thoughts clearly enough for you to understand what I am trying to say… It might sound a bit confusing. I hope I don’t have to ask strange questions forever!
Best
Navina
# Load packages -----------------------------------------------------------
pacman::p_load(
  rio,
  here,
  tidyverse,
  skimr,
  plyr,
  janitor,
  lubridate,
  gtsummary,
  flextable,
  officer,
  epikit,
  apyramid,
  scales,
  datapasta,
  reprex
)
# create a small version of your data set
test_data <- data.frame(
  stringsAsFactors = FALSE,
  row.names = c("2", "3", "4", "5", "6", "7", "8", "9", "10", "11"),
  GeburtsJahr = c("1981", "1981", "1996", "1985",
                  "1979", "1997", "2004", "1950", "1985", "2018"),
  GeburtsMonat = c("11", "3", "4", "9", "3", "2", "12", "10", "10", "11"),
  Geschlecht = c("weiblich", "männlich",
                 "weiblich", "weiblich", "männlich", "männlich", "weiblich",
                 "männlich", "männlich", "männlich"),
  Spezies = c("Plasmodium falciparum (M. tropica)", "Plasmodium falciparum (M. tropica)",
              "Plasmodium falciparum (M. tropica)", "-nicht ermittelbar-",
              "Plasmodium falciparum (M. tropica)",
              "Plasmodium ovale (M. tertiana)", "Plasmodium falciparum (M. tropica)",
              "Plasmodium falciparum (M. tropica)",
              "Plasmodium falciparum (M. tropica)", "Plasmodium falciparum (M. tropica)"),
  Infektionsland = c("-nicht erhoben-",
                     "-nicht erhoben-", "Kamerun", "Nigeria", "Togo", "Ostafrika",
                     "-nicht erhoben-", "-nicht erhoben-", "-nicht erhoben-", "Ghana")
)
# Convert Geschlecht from ISO-8859-1 to UTF-8 (intended to fix ä, ü, ...)
test_data$Geschlecht <- iconv(test_data$Geschlecht, from = "ISO-8859-1", to = "UTF-8")
# Look at the data
skimr::skim(test_data)
Data summary

|  |  |
|:--|:--|
| Name | test_data |
| Number of rows | 10 |
| Number of columns | 5 |
| _______________________ |  |
| Column type frequency: |  |
| character | 5 |
| ________________________ |  |
| Group variables | None |

Variable type: character

| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|:--|--:|--:|--:|--:|--:|--:|--:|
| GeburtsJahr | 0 | 1 | 4 | 4 | 0 | 8 | 0 |
| GeburtsMonat | 0 | 1 | 1 | 2 | 0 | 7 | 0 |
| Geschlecht | 0 | 1 | 8 | 11 | 0 | 2 | 0 |
| Spezies | 0 | 1 | 19 | 34 | 0 | 3 | 0 |
| Infektionsland | 0 | 1 | 4 | 15 | 0 | 6 | 0 |
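Since the small example skims without any error, one way to find the culprit in the full data set might be to run the summary on each column separately and collect the columns that fail. This is only a sketch: `find_failing_columns()` and `strict_summary()` are hypothetical names, and `strict_summary()` is a stand-in for `skimr::skim()` so that the example runs with base R alone.

```r
# Hypothetical helper: apply a function `f` to each column on its own and
# return the names of the columns for which `f` throws an error.
find_failing_columns <- function(df, f) {
  failed <- vapply(names(df), function(nm) {
    tryCatch({
      f(df[nm])  # df[nm] keeps the column as a one-column data frame
      FALSE
    }, error = function(e) TRUE)
  }, logical(1))
  names(df)[failed]
}

# Stand-in for skimr::skim() (assumption): fails on any column that
# contains values which are not valid UTF-8.
strict_summary <- function(d) {
  stopifnot(all(validUTF8(as.character(d[[1]]))))
  nrow(d)
}

# Made-up example data (not the real data set)
demo <- data.frame(
  Infektionsland = c("Togo", "Ghana"),
  stringsAsFactors = FALSE
)
# Latin-1 bytes that are not valid UTF-8
demo$Reiseart <- c("Urlaub", rawToChar(as.raw(c(0x47, 0xe4, 0x73, 0x74))))

failing <- find_failing_columns(demo, strict_summary)
print(failing)
```

With the real data the call would presumably look like `find_failing_columns(my_data, skimr::skim)`, where `my_data` is a placeholder for the imported data set. If the error only shows up when the result is printed, wrapping the call as `function(d) print(skimr::skim(d))` might be needed.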
Created on 2024-02-01 with reprex v2.0.2
Session info
sessionInfo()
#> R version 4.3.0 (2023-04-21 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=German_Germany.utf8 LC_CTYPE=German_Germany.utf8
#> [3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C
#> [5] LC_TIME=German_Germany.utf8
#>
#> time zone: Europe/Berlin
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] reprex_2.0.2 datapasta_3.1.0 scales_1.2.1 apyramid_0.1.3
#> [5] epikit_0.1.5 officer_0.6.2 flextable_0.9.2 gtsummary_1.7.2
#> [9] janitor_2.2.0 plyr_1.8.9 skimr_2.1.5 lubridate_1.9.2
#> [13] forcats_1.0.0 stringr_1.5.0 dplyr_1.1.3 purrr_1.0.2
#> [17] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.3
#> [21] tidyverse_2.0.0 here_1.0.1 rio_0.5.30
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.2.0 fastmap_1.1.1 fontquiver_0.2.1
#> [4] pacman_0.5.1 promises_1.2.1 broom.helpers_1.14.0
#> [7] digest_0.6.33 timechange_0.2.0 mime_0.12
#> [10] lifecycle_1.0.3 sf_1.0-14 gfonts_0.2.0
#> [13] ellipsis_0.3.2 magrittr_2.0.3 compiler_4.3.0
#> [16] rlang_1.1.1 tools_4.3.0 utf8_1.2.3
#> [19] yaml_2.3.7 gt_0.9.0 data.table_1.14.8
#> [22] knitr_1.43 askpass_1.2.0 classInt_0.4-9
#> [25] curl_5.0.2 xml2_1.3.5 repr_1.1.6
#> [28] KernSmooth_2.23-20 httpcode_0.3.0 withr_2.5.0
#> [31] foreign_0.8-84 grid_4.3.0 fansi_1.0.4
#> [34] gdtools_0.3.3 e1071_1.7-13 xtable_1.8-4
#> [37] colorspace_2.1-0 crul_1.4.0 cli_3.6.1
#> [40] rmarkdown_2.24 crayon_1.5.2 ragg_1.2.5
#> [43] generics_0.1.3 rstudioapi_0.15.0 tzdb_0.4.0
#> [46] readxl_1.4.3 proxy_0.4-27 DBI_1.1.3
#> [49] cellranger_1.1.0 base64enc_0.1-3 vctrs_0.6.3
#> [52] jsonlite_1.8.7 fontBitstreamVera_0.1.1 hms_1.1.3
#> [55] systemfonts_1.0.4 units_0.8-3 glue_1.6.2
#> [58] stringi_1.7.12 gtable_0.3.4 later_1.3.1
#> [61] munsell_0.5.0 pillar_1.9.0 htmltools_0.5.6
#> [64] openssl_2.1.0 R6_2.5.1 textshaping_0.3.6
#> [67] rprojroot_2.0.3 evaluate_0.21 shiny_1.7.5
#> [70] haven_2.5.3 openxlsx_4.2.5.2 snakecase_0.11.1
#> [73] fontLiberation_0.1.0 httpuv_1.6.11 class_7.3-21
#> [76] uuid_1.1-1 Rcpp_1.0.11 zip_2.3.0
#> [79] xfun_0.40 fs_1.6.3 pkgconfig_2.0.3