hi im new with using Rstudio i’m having trouble with my script. i want create new data set from a source data set with variable from which province/region they are and to export them again as .csv file. here are my initial scripts (kinda use chat gpt on this).
library(dplyr)
library(rio)
library(here)
Import the dataset
lepto ← import(here(“Data”, “lepto_view.csv”))
Column to split by
column_to_split ← “(Current Address)Province”
Ensure the column exists in the dataset
if (!column_to_split %in% colnames(lepto)) {
stop(paste(“Column”, column_to_split, “not found in the dataset”))
}
Split the dataset into a list of data frames by the column
list_of_datasets ← split(lepto, lepto[[column_to_split]])
Export each dataset to a separate CSV file
lapply(names(list_of_datasets), function(province) {
Define the output file path
output_file ← here(“Data”, paste0(“lepto_”, province, “.csv”))
Export the subset
export(list_of_datasets[[province]], output_file)
Hello,
It would be helpful if you could provide additional details and a reproducible example (reprex). From what I understand, you are trying to split data into a list of data by a given variable and then save each of these files and this is how I would approach it:
# loading packages
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(purrr)
library(readr)
# loading data
raw_data <- read_csv(readr_example("chickens.csv"))
#> Rows: 5 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): chicken, sex, motto
#> dbl (1): eggs_laid
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# splitting the data by sex
list_of_data <- raw_data |>
group_split(sex)
# saving the data
walk(.x = list_of_data, .f = \(x)
write_csv(
x = x,
file = tempfile(
pattern = "file",
tmpdir = tempdir(),
fileext = ".csv"
)
))
Created on 2024-11-21 with reprex v2.1.1
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.4.1 (2024-06-14)
#> os macOS 15.1
#> system x86_64, darwin20
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Toronto
#> date 2024-11-21
#> pandoc 3.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/x86_64/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> bit 4.5.0 2024-09-20 [1] RSPM (R 4.4.0)
#> bit64 4.5.2 2024-09-22 [1] RSPM (R 4.4.0)
#> cli 3.6.3 2024-06-21 [1] RSPM (R 4.4.0)
#> crayon 1.5.3 2024-06-20 [1] RSPM (R 4.4.0)
#> digest 0.6.37 2024-08-19 [1] RSPM (R 4.4.0)
#> dplyr * 1.1.4 2023-11-17 [1] RSPM (R 4.4.0)
#> evaluate 1.0.1 2024-10-10 [1] RSPM (R 4.4.0)
#> fansi 1.0.6 2023-12-08 [1] RSPM (R 4.4.0)
#> fastmap 1.2.0 2024-05-15 [1] RSPM (R 4.4.0)
#> fs 1.6.5 2024-10-30 [1] RSPM (R 4.4.1)
#> generics 0.1.3 2022-07-05 [1] RSPM (R 4.4.0)
#> glue 1.8.0 2024-09-30 [1] RSPM (R 4.4.0)
#> hms 1.1.3 2023-03-21 [1] RSPM (R 4.4.0)
#> htmltools 0.5.8.1 2024-04-04 [1] RSPM (R 4.4.0)
#> knitr 1.48 2024-07-07 [1] RSPM (R 4.4.0)
#> lifecycle 1.0.4 2023-11-07 [1] RSPM (R 4.4.0)
#> magrittr 2.0.3 2022-03-30 [1] RSPM (R 4.4.0)
#> pillar 1.9.0 2023-03-22 [1] RSPM (R 4.4.0)
#> pkgconfig 2.0.3 2019-09-22 [1] RSPM (R 4.4.0)
#> purrr * 1.0.2 2023-08-10 [1] RSPM (R 4.4.0)
#> R6 2.5.1 2021-08-19 [1] RSPM (R 4.4.0)
#> readr * 2.1.5 2024-01-10 [1] RSPM (R 4.4.0)
#> reprex 2.1.1 2024-07-06 [1] RSPM (R 4.4.0)
#> rlang 1.1.4 2024-06-04 [1] RSPM (R 4.4.0)
#> rmarkdown 2.29 2024-11-04 [1] RSPM (R 4.4.1)
#> rstudioapi 0.17.1 2024-10-22 [1] RSPM (R 4.4.0)
#> sessioninfo 1.2.2 2021-12-06 [1] RSPM (R 4.4.0)
#> tibble 3.2.1 2023-03-20 [1] RSPM (R 4.4.0)
#> tidyselect 1.2.1 2024-03-11 [1] RSPM (R 4.4.0)
#> tzdb 0.4.0 2023-05-12 [1] RSPM (R 4.4.0)
#> utf8 1.2.4 2023-10-22 [1] RSPM (R 4.4.0)
#> vctrs 0.6.5 2023-12-01 [1] RSPM (R 4.4.0)
#> vroom 1.6.5 2023-12-05 [1] RSPM (R 4.4.0)
#> withr 3.0.2 2024-10-28 [1] RSPM (R 4.4.0)
#> xfun 0.49 2024-10-31 [1] RSPM (R 4.4.0)
#> yaml 2.3.10 2024-07-26 [1] RSPM (R 4.4.0)
#>
#> [1] /Users/timothychisamore/Library/R/x86_64/4.4/library
#> [2] /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
All the best,
Tim
2 Likes