Applied Epi Community

Practical reference for survey sampling and analysis

Hello everyone! Can anyone recommend a reference for survey sampling, such as multi-stage cluster sampling? I understand the theories of the different sampling designs from general biostatistics textbooks. What I’m looking for is something more practical in applying the techniques such as using Excel or R in selecting clusters, PPS, applying sampling weights, and analysis after data collection. Thank you!

Hey Ian, great question … We are working on a new handbook chapter… but it’s currently a work in progress!

You can check out the draft on GitHub: click the “Files changed” tab, then scroll down to view the diff on the .Rmd file (what shows up in green is the latest version):

There are also these conversations with example code

There’s also already a section in the handbook for doing survey analysis … Doesn’t include all designs yet but might be a good start?
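In the meantime, here is a rough sketch of systematic PPS cluster selection in base R. It is not taken from the handbook draft: the sampling frame, the cluster sizes, and the number of clusters are all made up for illustration, and it only covers the first sampling stage.

```r
set.seed(42)  # reproducible random start

# Hypothetical sampling frame: one row per cluster with its population size
frame <- data.frame(
  cluster_id = paste0("C", 1:10),
  population = c(1200, 300, 450, 2000, 800, 150, 950, 600, 1750, 400)
)
n_clusters <- 4  # assumed number of clusters to select

# Sampling interval, random start within the first interval, then equal steps
interval <- sum(frame$population) / n_clusters
start    <- runif(1, min = 0, max = interval)
targets  <- start + interval * (0:(n_clusters - 1))

# A cluster is selected when a target falls inside its cumulative population range
frame$cum_pop <- cumsum(frame$population)
picked   <- findInterval(targets, c(0, frame$cum_pop), left.open = TRUE)
selected <- frame[picked, ]

# First-stage selection probability and the matching design weight
selected$prob   <- pmin(1, n_clusters * selected$population / sum(frame$population))
selected$weight <- 1 / selected$prob
selected
```

A cluster larger than the sampling interval could be picked more than once; that does not happen with these made-up sizes, but a real frame would need a rule for it. The final analysis weight would also need the selection probabilities from the later stages (households, individuals) multiplied in.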

Hope that helps!

Thank you for this, Neale! I will look into the Survey Analysis chapter.

Hi @neale ! I’m working my way through the survey analysis guide, and in the cleaning data section, the comment line in the last part of the code states “change to dates” but it looks like it converts the yes/no character variables to TRUE/FALSE logical variables instead?

# change to dates
survey_data <- survey_data %>%
  mutate(across(all_of(YNVARS),
                str_detect,
                pattern = "yes"))
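To check what that step actually returns, here is a minimal reproducible sketch on a toy data frame. The column names in YNVARS are made up for illustration (the real list comes from the handbook's data dictionary), and the lambda form is used because recent dplyr versions deprecate passing extra arguments such as `pattern` through `across()`.

```r
library(dplyr)
library(stringr)

# Toy data standing in for the survey; column names are hypothetical
survey_data <- tibble(
  consent  = c("yes", "no", "yes"),
  has_card = c("no", "no", "yes")
)
YNVARS <- c("consent", "has_card")

# Convert yes/no text columns to logical: TRUE where the value contains "yes"
survey_data <- survey_data %>%
  mutate(across(all_of(YNVARS), ~ str_detect(.x, pattern = "yes")))

# Inspect the resulting column types
str(survey_data)
```

The columns come back as logical, not Date, so the comment in the guide would read more accurately as something like “change yes/no to logical”.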

Also, part 23.6 starts by joining Kobo datasets at the household and individual levels, but the datasets referred to in the example (survey_data_hh and survey_data_indiv) are not available for download in chapter 2. It looks like the joins have already been done in the provided survey_data file?

Hi Ian,
This chapter is still a work in progress and the datasets are not yet available from the Handbook. You are right that the mutate(across()) call appears to convert a vector of yes/no columns to TRUE/FALSE rather than to dates.
Hopefully we can address these concerns soon.
Neale

Thanks again, @neale !