How to calculate Risk ratios from aggregated data

kostas.danis · April 6, 2024, 9:05am

I tried to calculate Risk Ratios from aggregated data, using the epitools package and the riskratio function, but I am getting the following error:

install.packages("epitools")

# Load required libraries
library(epitools)

# Create a data frame with aggregated data
data <- data.frame(
  group = c("Exposed", "Unexposed"),
  cases = c(4602, 13), # Number of cases
  total = c(140681, 547) # Total population in each group
)

# Calculate risk ratios
risk_ratios <- riskratio(data$cases, data$total, data$group)
Error in match.arg(method) : ‘arg’ must be of length 1

# Print the results
print(risk_ratios)

Why do I get this error? Is there another way to calculate this?

aspina · April 6, 2024, 4:37pm

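Hey @kostas.danis
I have not used {epitools} in a while, but it is a bit unwieldy: it doesn't like data frames, and the 2x2 table needs to be in a specific format (explained in the Details section of the help file).
The error comes from the third positional argument: data$group is matched to the method argument, which must be a single string, so match.arg() complains about its length.
You need to provide the counts as a matrix of case and control numbers; it calculates the totals itself.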

The alternative option, as in Example 2 below (and described in this post), is to recreate a linelist from the aggregated data. Using that linelist, you could then use {gtsummary} to get risk ratios, as described here.

Example 1: {epitools}

library(epitools)

## as a data frame just to look at (and ensure matrix below is correct)
data <- data.frame(
  group = c("Exposed", "Unexposed"),
  cases = c(4602, 13), # Number of cases
  control = c(136079, 534) # number of controls
)

## create 2x2 table as a matrix 
dat <- matrix(
  c(4602, 136079, 13, 534), ## input numbers of cases and controls
  2,2, ## define dimensions of table
  byrow=TRUE) ## input numbers above row-wise 

## produce risk ratio 
riskratio(dat, ## input matrix 
          rev = "both") ## flip rows and columns to input correctly 
#> $data
#>           Outcome
#> Predictor  Disease2 Disease1  Total
#>   Exposed2      534       13    547
#>   Exposed1   136079     4602 140681
#>   Total      136613     4615 141228
#> 
#> $measure
#>           risk ratio with 95% C.I.
#> Predictor  estimate     lower    upper
#>   Exposed2 1.000000        NA       NA
#>   Exposed1 1.376433 0.8038413 2.356894
#> 
#> $p.value
#>           two-sided
#> Predictor  midp.exact fisher.exact chi.square
#>   Exposed2         NA           NA         NA
#>   Exposed1  0.2365873    0.2781665  0.2401614
#> 
#> $correction
#> [1] FALSE
#> 
#> attr(,"method")
#> [1] "Unconditional MLE & normal approximation (Wald) CI"

Created on 2024-04-06 with reprex v2.0.2
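
As a sanity check on the output above, the same estimate can be worked out by hand in base R. This is only a sketch: it assumes the standard Wald interval on the log risk ratio (which is what the "normal approximation (Wald) CI" label in the output suggests), and the variable names are made up for illustration.

```r
## aggregated counts from the question (variable names are illustrative)
cases_exposed   <- 4602
total_exposed   <- 140681
cases_unexposed <- 13
total_unexposed <- 547

## risk in each group and their ratio
risk_exposed   <- cases_exposed / total_exposed
risk_unexposed <- cases_unexposed / total_unexposed
rr <- risk_exposed / risk_unexposed

## standard error of log(RR), assuming the usual Wald formula
se_log_rr <- sqrt(1 / cases_exposed - 1 / total_exposed +
                  1 / cases_unexposed - 1 / total_unexposed)

## 95% confidence interval, back-transformed from the log scale
z  <- qnorm(0.975)
ci <- exp(log(rr) + c(-1, 1) * z * se_log_rr)

round(c(estimate = rr, lower = ci[1], upper = ci[2]), 4)
```

The rounded values should agree with the Exposed1 row of $measure above, which is a quick way to confirm that rev = "both" put the rows and columns the right way round.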

Example 2: recreate a linelist

library(tidyr) 

stacked <- data.frame(
  exposure = c("Exposed", "Exposed", 
               "Unexposed", "Unexposed"),
  disease  = c("Case", "Control", "Case", "Control"), 
  count = c(4602, 136079, 13, 534)
) %>% 
  tidyr::uncount(weights = count)

kostas.danis · April 7, 2024, 9:08am

Great. That works very well. Thank you.

For option 2 with the disaggregated data, you can also use:
epitools::riskratio(exposure_variable, outcome_variable)
to calculate the risk ratios.
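
For anyone landing here later, this is roughly what that looks like end to end. It is a minimal sketch, not the only way to do it: the linelist is rebuilt with base R rep() (which should be equivalent to the tidyr::uncount() step in Example 2), and the factor levels are set by hand so the reference groups ("Unexposed" and "Control") come first and no rev argument is needed. The object names linelist and rr are just illustrative.

```r
library(epitools)

## aggregated counts: exposed cases, exposed controls,
## unexposed cases, unexposed controls
counts <- c(4602, 136079, 13, 534)

## one row per person (base R stand-in for tidyr::uncount)
linelist <- data.frame(
  exposure = rep(c("Exposed", "Exposed", "Unexposed", "Unexposed"), times = counts),
  disease  = rep(c("Case", "Control", "Case", "Control"), times = counts)
)

## put the reference level first; otherwise the alphabetical order
## ("Exposed" before "Unexposed", "Case" before "Control") flips the table
linelist$exposure <- factor(linelist$exposure, levels = c("Unexposed", "Exposed"))
linelist$disease  <- factor(linelist$disease,  levels = c("Control", "Case"))

## exposure (predictor) first, outcome second
rr <- riskratio(linelist$exposure, linelist$disease)
rr$measure
```

The "Exposed" row of rr$measure should give the same estimate as Example 1. If the factor levels are left in their default alphabetical order, rev = "both" is needed again.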