Code from the Africa CDC Data Science training

Dear All

In October 2025, Africa CDC convened a training workshop in Cape Town for participants from National Public Health Institutes in several African countries. In addition to an introduction to R (presented by Djibril from HISP-SA and myself), tools such as data-flo, Microreact and Epicollect5 were introduced.

On the R side of things, we used the tutorials from this site, with a particular focus on Tutorial 4 on data cleaning and Tutorial 6 on ggplot.

Data Cleaning scripts

The first script is data_cleaning_example.R, which works through almost the entire 4th introductory tutorial (the raw code, which you can easily copy and paste, is here). A simpler version of a final data cleaning pipe, with comments explaining each step, is also available as data_cleaning.R.

Data Visualisation and Feedback

I will soon be posting a script that works through the use of ggplot in the 6th tutorial. In the meantime I would encourage all participants from the workshop to post in this topic with their experiences using R for data cleaning. And to the authors of the Applied Epi tutorial and the Epi R Handbook - thank you! These are invaluable resources!

Peter

That sounds like a great workshop, thanks for sharing Peter!

Those tutorials look very practical for hands-on learning.

For anyone continuing beyond the workshop, I’d also recommend exploring:

Looking forward to seeing everyone’s experiences with these resources!

Luis

I would like to be enrolled in that program to advance my skills.

Hi Jean Bosco - there isn’t a “programme” as such at this point. It was simply a workshop, but I felt a need to keep everyone together afterwards, so I posted here.