Package for capture-recapture study to estimate the sensitivity of a surveillance system

Hi all,

I aim to do a capture-recapture study to estimate the sensitivity of a COVID-19 surveillance system.

The dataset I have contains weekly data on cases captured in a population-specific surveillance system (schools) and weekly data from the nation-wide surveillance system for the same period.

I’ve found the following packages: “Rcapture”, “multimark”, “openCR”, and “RMark”. I would like to hear which one is most suitable and what their pros and cons are.

Any help is appreciated. Thank you!

Thanks for your post! I am afraid I do not have expertise in this, but will tag a few others who might know.

If you can describe in more detail the exact comparisons you want to do, that may help. I imagine that joining functions, particularly “fuzzy” or probabilistic matching, may be useful for comparing cases detected by the different surveillance systems.
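
To give an idea of what the joining step might feed into, here is a minimal sketch in base R. It assumes each system provides an individual-level line list with a shared identifier; the `case_id` column and both toy data frames are made up for illustration (weekly aggregate counts alone would not be enough to match cases). It joins the two lists, counts the overlap, and applies Chapman's version of the two-source Lincoln-Petersen estimator.

```r
# Toy line lists; the column name and values are illustrative assumptions.
school   <- data.frame(case_id = c("a", "b", "c", "d", "e", "f"))
national <- data.frame(case_id = c("c", "d", "e", "f", "g", "h", "i", "j"))

# Flag each source, then a full outer join keeps cases seen by either system.
school$in_school     <- TRUE
national$in_national <- TRUE
linked <- merge(school, national, by = "case_id", all = TRUE)
linked$in_school[is.na(linked$in_school)]     <- FALSE
linked$in_national[is.na(linked$in_national)] <- FALSE

n_school   <- sum(linked$in_school)                       # cases in source 1
n_national <- sum(linked$in_national)                     # cases in source 2
n_both     <- sum(linked$in_school & linked$in_national)  # cases in both

# Chapman estimator of the total number of cases
n_hat <- (n_school + 1) * (n_national + 1) / (n_both + 1) - 1

# Estimated sensitivity of each system
sens_school   <- n_school / n_hat
sens_national <- n_national / n_hat
```

This two-source estimate relies on the usual assumptions (independent sources, a closed population, correct matching), so it is only a starting point before moving to a dedicated package.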

@chris @cmaronga @isaacflorence @aspina @sophiemeakin @temuulen

Hey - I've never actually had to do a capture-recapture study with R.
But I agree that joining will be a big part of what you need to deal with. A newer package that extends the regular tidyverse joins is powerjoin.
Otherwise, the epiR package is quite outdated, but it has some useful helper functions with appropriate methodologies that other packages don't necessarily have.
The yardstick package (part of tidymodels) also has functions for specificity and sensitivity.

Also, this case study is outdated but might be helpful to you.

Thank you all for your comments!

@aspina: I think this case study is very helpful! This might be a silly question, but do you know where I could download the example dataset (e.g. salmonella.xlsx)? I have trouble picturing the dataset structure :frowning:

Good question - I wonder if @amy.mikhail knows whether this is available?

Hi @vuthutrang.hmu ,

This GitHub repository is a bit out of date, as Alex said, and so was missing some of the required materials. I have now added a folder called data to the repository, which contains the data sets that you need to run the capture-recapture code in exercise 9.

The link to download the data is here:

There are three data sets (contained in two files) that you need:

  1. salmonella.xlsx contains two data sets on separate worksheets (NSSS and NRLS);
  2. threesources.dta is the third data set.

It has been a while since I looked at the code, but hopefully it still works - it uses the Rcapture package for the last section.

@amy.mikhail Thank you so much! I downloaded the data successfully and was able to practice with the case study :grinning:

However, I saw that the case study only gives guidance on calculating the sensitivity. I need to calculate the 95% CI of the sensitivity too. Do you know which command is suitable for calculating the 95% CI for sensitivity?

Hi @vuthutrang.hmu - I'm not sure those functions have a way of producing a 95% CI. However, you can get the CI for any proportion using the binom.wilson() function from the binom package.
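
As a rough sketch of what that calculation does, the Wilson score interval can also be computed in base R, and `prop.test()` without continuity correction gives the same limits. The helper name `wilson_ci` and the counts below are invented for illustration. Note that this treats the denominator as a fixed number, which is an assumption: in a capture-recapture study the total is itself an estimate, so the real uncertainty is likely to be wider.

```r
# Wilson score interval for a proportion x / n (x and n below are made-up numbers).
wilson_ci <- function(x, n, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(estimate = p, lower = centre - half, upper = centre + half)
}

x <- 60    # cases detected by the surveillance system (hypothetical)
n <- 100   # estimated total number of cases, rounded (hypothetical)
ci <- wilson_ci(x, n)
print(round(ci, 3))

# Cross-check against base R: prop.test() without continuity correction
# returns the Wilson score interval.
check <- prop.test(x, n, correct = FALSE)$conf.int
```

With the binom package installed, `binom.wilson(x, n)` should give the same lower and upper limits.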