I have been stuck on a particular topic for a while. I did try searching for it in the handbook but could not find it. I searched on Stack Overflow and got no response either.
In the past I have typically used the epitools package to calculate Odds Ratio / Relative Risk. However, I have found myself increasingly using gtsummary to create tables. In this scenario, is there a way I can directly calculate OR/RR with a 95% CI in a gtsummary table? Could I also add an adjusted OR (via regression) to the same table?
I did read up, and gtsummary does have the tbl_regression() function to create a table with the adjusted OR after regression. But what if I want the normal "unadjusted" cross product OR as well? Also, what about Relative Risk?
I would love to have posted reprex code as well, but this is more of a "what is a good resource to read" question rather than me getting stuck while coding, so I don't have any code to share.
Thank you Alex! This was precisely what I was looking for.
I had seen that text in the handbook. I guess the problem was that I was not familiar with the term "univariate regression".
Would I be right in assuming that the OR generated from this would be the same as the cross product OR (ad/bc) if we drew a simple 2x2 table between exposure and outcome? Also, in this case, would Relative Risk not need a different calculation: incidence among exposed, a/(a+b), divided by incidence among non-exposed, c/(c+d)?
There is an oddsratio.wald() / riskratio.wald() function in the epitools library but it is really cumbersome to make a good looking table with that.
No worries deepak. Yep, you are right: univariate regression produces the same ORs and RRs as the cross product calculation.
You are also right about the calculation for RR. gtsummary does the calculating for you though!
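To make the equivalence concrete, here is a minimal base R sketch. The 2x2 counts are invented for illustration, and the object names (tab, dat, fit_logit) are assumptions rather than anything from the handbook. It computes the OR and RR by hand from the table and compares the OR with the one from a univariate logistic regression.

```r
# Hypothetical 2x2 counts, invented for illustration only:
#              outcome yes   outcome no
# exposed           40           60
# unexposed         20           80
tab <- matrix(c(40, 60,
                20, 80),
              nrow = 2, byrow = TRUE,
              dimnames = list(exposure = c("exposed", "unexposed"),
                              outcome  = c("yes", "no")))

# Odds ratio: odds of the outcome among exposed / odds among unexposed
or_table <- (tab["exposed", "yes"] / tab["exposed", "no"]) /
            (tab["unexposed", "yes"] / tab["unexposed", "no"])

# Risk ratio: incidence among exposed / incidence among unexposed
risk_exposed   <- tab["exposed", "yes"] / sum(tab["exposed", ])
risk_unexposed <- tab["unexposed", "yes"] / sum(tab["unexposed", ])
rr_table <- risk_exposed / risk_unexposed

# The same data with one row per person, as a regression expects it
dat <- data.frame(
  exposed = rep(c(1, 0), times = c(100, 100)),
  outcome = c(rep(c(1, 0), times = c(40, 60)),
              rep(c(1, 0), times = c(20, 80)))
)

# Univariate logistic regression; exponentiating the coefficient gives the OR
fit_logit <- glm(outcome ~ exposed, family = binomial(link = "logit"), data = dat)
or_glm <- unname(exp(coef(fit_logit)["exposed"]))

# TRUE if the hand calculation and the regression agree
isTRUE(all.equal(or_table, or_glm, tolerance = 1e-6))
```

With a single binary exposure the model is saturated, which is why the regression OR matches the table OR. Once other covariates are added, the regression OR becomes an adjusted estimate and will generally differ.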
Thank You @aspina . I got my tables exactly how I wanted them. Much appreciated.
PS: I must actively point out that I dislike the use of (abcd) while describing OR/ RR and make it a point to tell my Uni students to not try to remember it like that. I used it just to make sure we were talking about the same thing. It bugged me so much I NEED to clarify, LOL.
Haha, glad you could scratch that itch. Out of interest, how do you explain it to your students? It's the only way I've ever seen it explained, so I'm keen to hear other ways!
Ah. It's not really different. It's just that I would like them to think of Relative Risk as "incidence in one group (exposed) / incidence in the other group (unexposed)" rather than simply memorize a whole bunch of abcd formulae that make no sense to them. Similarly, when learning about sensitivity, I would rather they think of True Positives / (True Positives + False Negatives) while visualizing the 2x2 table than rote memorize a/(a+c). I guess the concept is the most important thing?
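As a small illustration of naming the cells instead of memorizing letters, here is a sketch in base R. The counts are hypothetical and chosen only for the example.

```r
# Hypothetical diagnostic test results, invented for illustration
tp <- 45   # diseased, test positive (true positives)
fn <- 5    # diseased, test negative (false negatives)
fp <- 10   # not diseased, test positive (false positives)
tn <- 140  # not diseased, test negative (true negatives)

# Sensitivity: of everyone who has the disease, what share tests positive?
sensitivity <- tp / (tp + fn)

# Specificity: of everyone without the disease, what share tests negative?
specificity <- tn / (tn + fp)

# Positive predictive value: of everyone who tests positive, what share is diseased?
# Note the denominator differs from sensitivity (false positives, not false negatives).
ppv <- tp / (tp + fp)
```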
@deepakvarughese did you manage to find out how to calculate risk ratios using tbl_uvregression() by any chance? Or for that matter, how to calculate relative risk using glm()?
My understanding is that:
family = poisson(link = "log") gives incidence rate ratios
family = binomial(link = "logit") gives odds ratios
family = binomial(link = "log") should give risk ratios but throws an error about start values
family = poisson(link = "log"), vcov = ~ hetero should give risk ratios and can be calculated with the fixest::feglm() function, but this function doesn't work with gtsummary, because its formula argument is called fml and so doesn't match the formula argument that gtsummary passes (even though the vignette lists this function and implies that it should work).
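For what it is worth, here is a minimal base R sketch of the glm() options listed above, using invented data (the counts and the names dat, fit_pois and fit_logbin are assumptions for illustration). With a binary outcome, the Poisson log-link model gives the risk ratio as its point estimate. The log-binomial model is fitted with explicit start values, since supplying them is the usual workaround for the start-value error.

```r
# Hypothetical data: 40/100 exposed and 20/100 unexposed have the outcome
dat <- data.frame(
  exposed = rep(c(1, 0), times = c(100, 100)),
  outcome = c(rep(c(1, 0), times = c(40, 60)),
              rep(c(1, 0), times = c(20, 80)))
)

# Poisson with log link on a 0/1 outcome: exp(coef) is the risk ratio.
# The model-based standard errors are too wide for binary data, so a robust
# (sandwich) variance is usually paired with this model; that step is not shown.
fit_pois <- glm(outcome ~ exposed, family = poisson(link = "log"), data = dat)
rr_pois <- unname(exp(coef(fit_pois)["exposed"]))

# Log-binomial: also estimates the risk ratio directly.
# Start at the overall log risk with no exposure effect, so that every
# starting fitted probability is valid (between 0 and 1).
fit_logbin <- glm(outcome ~ exposed, family = binomial(link = "log"), data = dat,
                  start = c(log(mean(dat$outcome)), 0))
rr_logbin <- unname(exp(coef(fit_logbin)["exposed"]))

# Risk ratio by hand for comparison: (40/100) / (20/100)
rr_by_hand <- (40 / 100) / (20 / 100)
```

In this saturated example both models reproduce the hand-calculated risk ratio. Their confidence intervals will differ, which is where the choice of variance estimator matters.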
Hello! Sorry for reopening a solved discussion, but it's still not clear to me how to use tbl_uvregression to output an RR instead of an OR. What option should I use?
You can get RR using either Poisson or negative binomial regression as below (sorry, not a proper reprex). Both will produce the same estimate, but likely slightly different confidence intervals. The choice between the two has to do with overdispersion. In practice, if you run the negative binomial model and there is no overdispersion, the {MASS} package automatically runs a Poisson (so it saves you having to make decisions).
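The code from this reply is not reproduced in the thread. As a rough sketch of the Poisson variant only, and not the original poster's code, something like the following should work, assuming gtsummary and its dependencies are installed. The data frame dat and its columns are invented for illustration.

```r
# Hypothetical data: 40/100 exposed and 20/100 unexposed have the outcome
dat <- data.frame(
  exposed = rep(c(1, 0), times = c(100, 100)),
  outcome = c(rep(c(1, 0), times = c(40, 60)),
              rep(c(1, 0), times = c(20, 80)))
)

# Only run the table code if gtsummary is available
if (requireNamespace("gtsummary", quietly = TRUE)) {
  tbl_rr <- gtsummary::tbl_uvregression(
    data = dat,
    method = glm,                                        # model-fitting function
    y = outcome,                                         # outcome column
    method.args = list(family = poisson(link = "log")),  # Poisson with log link
    exponentiate = TRUE                                  # report exp(coef), i.e. the ratio
  )
  tbl_rr
}

# Assumption: for the negative binomial variant mentioned above, the idea would be
# to pass method = MASS::glm.nb and drop the family argument (not run here).
```

gtsummary may head the estimate column "IRR" for a Poisson model. With a binary outcome and no person-time offset, that estimate can be read as a risk ratio.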
Hello, Alex!
Could you please clarify which test is used in tbl_uvregression to calculate the p-value for categorical variables? Pearson's chi-square or Fisher's exact test?
In tbl_summary, when we add a p-value, the default statistical test is the chi-squared test, but if any expected cell count is below 5 then Fisher's exact test is used. Is it the same for tbl_uvregression?
Hi @ulyana.9355, it depends on the method you choose. tbl_uvregression just uses other packages to do the calculation and then presents the results nicely. For the examples above, the regression is fitted with glm() (from {stats}) or the {MASS} package, and the p-values are then extracted with the {broom} package (tidy function). The {broom} tidy function itself takes its p-values from the base R summary function, which for these models reports a Wald test on each coefficient rather than a Pearson chi-square or Fisher's exact test.
I imagine that when you have cell counts below 5, the glm will struggle to converge and you probably need to start looking at exact regression methods (e.g. exact logistic).
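To see how these p-values relate, here is a minimal base R sketch on invented counts (the table and object names are assumptions for illustration). It pulls the regression p-value from summary(), shows that it is a two-sided Wald z-test, and computes the Pearson chi-square and Fisher's exact p-values on the same 2x2 table for comparison.

```r
# Hypothetical 2x2 counts, invented for illustration
tab <- matrix(c(40, 60,
                20, 80),
              nrow = 2, byrow = TRUE,
              dimnames = list(exposure = c("exposed", "unexposed"),
                              outcome  = c("yes", "no")))

# Same data, one row per person
dat <- data.frame(
  exposed = rep(c(1, 0), times = c(100, 100)),
  outcome = c(rep(c(1, 0), times = c(40, 60)),
              rep(c(1, 0), times = c(20, 80)))
)

fit <- glm(outcome ~ exposed, family = binomial(link = "logit"), data = dat)
coefs <- summary(fit)$coefficients

# p-value reported by summary() for the exposure coefficient
p_wald <- coefs["exposed", "Pr(>|z|)"]

# Rebuild it by hand: two-sided p-value from the Wald z statistic
z <- coefs["exposed", "z value"]
p_wald_by_hand <- 2 * pnorm(-abs(z))

# Table-based tests on the same data, for comparison
p_chisq  <- chisq.test(tab, correct = FALSE)$p.value  # Pearson chi-square
p_fisher <- fisher.test(tab)$p.value                  # Fisher's exact test

c(wald = p_wald, chisq = p_chisq, fisher = p_fisher)
```

The three p-values are usually close when counts are large and drift apart as cells get sparse, which is when exact methods become more attractive.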
hope this helps