How to calculate risk ratios from aggregated data

I tried to calculate risk ratios from aggregated data using the riskratio function from the epitools package, but I get the following error:

install.packages("epitools")

library(epitools)

# Create a data frame with aggregated data

data <- data.frame(
  group = c("Exposed", "Unexposed"),
  cases = c(4602, 13), # Number of cases
  total = c(140681, 547) # Total population in each group
)

# Calculate risk ratios

risk_ratios <- riskratio(data$cases, data$total, data$group)
Error in match.arg(method) : ‘arg’ must be of length 1

# Print the results

print(risk_ratios)

Why do I get this error? Is there another way to calculate this?


hey @kostas.danis
I haven't used {epitools} in a while, but it is a bit unwieldy: it doesn't like data frames, and the 2x2 table needs to be in a specific format (explained in the Details section of the help file).
You need to provide the counts as a matrix of case and control numbers; it calculates the totals by itself.
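
On the "why" part of the question: as far as I can tell from the help page, the third argument of riskratio() is method, so passing three vectors positionally sends data$group (length 2) into method, and match.arg() rejects it. Here is a minimal sketch of the same failure in base R. Note that toy_riskratio() is a made-up stand-in that only mimics the assumed argument order; it is not the {epitools} code.

```r
# Hypothetical stand-in that only mimics the assumed argument order
# riskratio(x, y = NULL, method = c("wald", "small", "boot"), ...)
toy_riskratio <- function(x, y = NULL, method = c("wald", "small", "boot")) {
  method <- match.arg(method)  # expects a single string from the choices
  method
}

# Three positional vectors: the third one lands in `method`
msg <- tryCatch(
  toy_riskratio(c(4602, 13), c(140681, 547), c("Exposed", "Unexposed")),
  error = function(e) conditionMessage(e)
)
print(msg)

# Leaving `method` at its default avoids the error
print(toy_riskratio(c(4602, 13), c(140681, 547)))
```

So even with the error out of the way, the call would still need the counts in the 2x2 layout rather than as separate columns.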

The alternative option, as in Example 2 below (and described in this post), is to recreate a linelist from the aggregated data. Using that linelist, you could then use {gtsummary} to get risk ratios, as described here.

Example 1: {epitools}

``````
library(epitools)

## as a data frame just to look at (and ensure matrix below is correct)
data <- data.frame(
  group = c("Exposed", "Unexposed"),
  cases = c(4602, 13), # Number of cases
  control = c(136079, 534) # number of controls
)

## create 2x2 table as a matrix
dat <- matrix(
  c(4602, 136079, 13, 534), ## input numbers of cases and controls
  2, 2, ## define dimensions of table
  byrow = TRUE) ## input numbers above row-wise

## produce risk ratio
riskratio(dat, ## input matrix
          rev = "both") ## flip rows and columns to input correctly
#> $data
#>           Outcome
#> Predictor  Disease2 Disease1  Total
#>   Exposed2      534       13    547
#>   Exposed1   136079     4602 140681
#>   Total      136613     4615 141228
#>
#> $measure
#>           risk ratio with 95% C.I.
#> Predictor  estimate     lower    upper
#>   Exposed2 1.000000        NA       NA
#>   Exposed1 1.376433 0.8038413 2.356894
#>
#> $p.value
#>           two-sided
#> Predictor  midp.exact fisher.exact chi.square
#>   Exposed2         NA           NA         NA
#>   Exposed1  0.2365873    0.2781665  0.2401614
#>
#> $correction
#> [1] FALSE
#>
#> attr(,"method")
#> [1] "Unconditional MLE & normal approximation (Wald) CI"

``````

Created on 2024-04-06 with reprex v2.0.2
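
As a sanity check on Example 1, the estimate and interval can also be worked out by hand in base R. This is a sketch that assumes the Wald method named in the output is the usual normal approximation on the log scale; the variable names are my own. It should land very close to the 1.376 (0.804 to 2.357) shown above.

```r
# Counts from the 2x2 table in the thread
cases_exp   <- 4602; total_exp   <- 140681  # exposed
cases_unexp <- 13;   total_unexp <- 547     # unexposed

# Risk in each group and their ratio
risk_exp   <- cases_exp / total_exp
risk_unexp <- cases_unexp / total_unexp
rr <- risk_exp / risk_unexp

# Standard error of log(RR), then a 95% Wald interval on the log scale
se_log_rr <- sqrt(1 / cases_exp - 1 / total_exp +
                  1 / cases_unexp - 1 / total_unexp)
z  <- qnorm(0.975)
ci <- exp(log(rr) + c(-1, 1) * z * se_log_rr)

print(round(c(estimate = rr, lower = ci[1], upper = ci[2]), 4))
```

The interval is wide because only 13 cases fall in the unexposed group, which dominates the standard error.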

Example 2: recreate a linelist

``````
library(tidyr)

stacked <- data.frame(
  exposure = c("Exposed", "Exposed",
               "Unexposed", "Unexposed"),
  disease = c("Case", "Control", "Case", "Control"),
  count = c(4602, 136079, 13, 534)
) %>%
  tidyr::uncount(weights = count)
``````

Great. That works very well. Thank you.

For option 2 with the disaggregated data, you can also use:
epitools::riskratio(exposure_variable, outcome_variable)
to calculate the risk ratios.
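
A rough sketch of that route, for anyone who finds this later. It rebuilds the linelist in base R (rep() in place of tidyr::uncount()) and sets the factor levels so that the reference groups come first, on the assumption that riskratio() tabulates its first argument as rows (exposure) and its second as columns (outcome), in level order. The epitools call only runs if the package is installed.

```r
# Aggregated counts, as in Example 2
stacked <- data.frame(
  exposure = c("Exposed", "Exposed", "Unexposed", "Unexposed"),
  disease  = c("Case", "Control", "Case", "Control"),
  count    = c(4602, 136079, 13, 534)
)

# Base-R equivalent of tidyr::uncount(): repeat each row `count` times
linelist <- stacked[rep(seq_len(nrow(stacked)), stacked$count),
                    c("exposure", "disease")]

# Reference levels first, so no rev = argument should be needed (assumption)
linelist$exposure <- factor(linelist$exposure, levels = c("Unexposed", "Exposed"))
linelist$disease  <- factor(linelist$disease,  levels = c("Control", "Case"))

# Check that the 2x2 table matches the aggregated counts
tab <- table(linelist$exposure, linelist$disease)
print(tab)

# Exposure first, outcome second
if (requireNamespace("epitools", quietly = TRUE)) {
  res <- epitools::riskratio(linelist$exposure, linelist$disease)
  print(res$measure)
}
```

With plain character vectors the levels sort alphabetically ("Exposed" and "Case" first), so without the factor() step you would probably need rev = "both" again, as in Example 1.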
