Hello! I’m following along with the epiR handbook chapter on regression. I’m at the gtsummary section and I was wondering whether the output could have two columns rather than just the total N column: one column for the count (percent) of controls and another for cases. This would be similar to the output in the previous section (with columns for 1 and 0).
I tried piping the table to group_by(outcome) as follows, but I get this error: Error in UseMethod(“group_by”) : no applicable method for ‘group_by’ applied to an object of class “c(‘tbl_uvregression’, ‘gtsummary’)”
tbl_uvregression(                           ## produce univariate table
  method = glm,                             ## define regression want to run (generalised linear model)
  y = outcome,                              ## define outcome variable
  method.args = list(family = binomial),    ## define what type of glm want to run (logistic)
  exponentiate = TRUE                       ## exponentiate to produce odds ratios (rather than log odds)
) %>%
  group_by(outcome)
library(dplyr)      ## provides select() and the %>% pipe
library(gtsummary)  ## provides tbl_uvregression(), tbl_summary() and tbl_merge()

output <- linelist %>%
  select(died_covid, gender, age_group) %>%   ## keep variables of interest
  tbl_uvregression(                           ## produce univariate table
    method = glm,                             ## define regression want to run (generalised linear model)
    y = died_covid,                           ## define outcome variable
    method.args = list(family = binomial),    ## define what type of glm want to run (logistic)
    exponentiate = TRUE,                      ## exponentiate to produce odds ratios (rather than log odds)
    hide_n = TRUE                             ## don't include overall counts in regression table
  )

## produce counts for each of the variables of interest, split by outcome
cross_tab <- output$inputs$data %>%
  tbl_summary(by = died_covid)

## combine for a full table
tbl_merge(list(cross_tab, output))
If you type ?gtsummary::tbl_uvregression, you can see on the help page that the value returned by this function is described as a tbl_uvregression object, not a data frame. That is why group_by(), which expects a data frame, fails on it. If you inspect your output object, you will see that it is actually a list, and some of its elements have further elements nested within them. You can access any named element of a list, including a data.frame stored inside it, by typing the dollar sign followed by the element's name after the name of the list.
So:
output$inputs$data
accesses the object you just created with gtsummary::tbl_uvregression(), which contains multiple elements. One of those elements is called inputs, and it contains all the information that you fed into the function as arguments. One of those inputs was the data, i.e. the columns to use in the regression, which is stored within inputs as a data.frame called data.
If you type output$inputs$data in your console, you will see that data.frame, which should contain your three input columns: died_covid, gender and age_group. The function tbl_summary() then summarises that data, producing counts for each variable split by the died_covid column.
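To make the `$` chain concrete, here is a minimal base R sketch. Note that toy_output is a made-up stand-in with invented values, not a real gtsummary object; it only mimics the "list inside a list holding a data.frame" shape described above.

```r
# Hypothetical stand-in for a gtsummary object (values are invented):
# a list that holds another list, which in turn holds a data.frame.
toy_output <- list(
  table_body = data.frame(variable = c("gender", "age_group")),
  inputs = list(
    data = data.frame(
      died_covid = c(1, 0, 0, 1),
      gender     = c("f", "m", "f", "m"),
      age_group  = c("0-17", "18-49", "18-49", "50+")
    ),
    exponentiate = TRUE
  )
)

# Each `$` steps one level down: toy_output -> inputs -> data
inner_data <- toy_output$inputs$data

class(inner_data)      # "data.frame"
colnames(inner_data)   # "died_covid" "gender" "age_group"

# `$` is shorthand for [[ ]] with a name, so this returns the same object
identical(inner_data, toy_output[["inputs"]][["data"]])   # TRUE
```

With the real output object the idea is the same: once you have pulled out the data.frame, you can pass it to any function that expects one.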
The last line of Alex’s solution merges these summary counts with the table created by tbl_uvregression() for the final output.
Tip:
If you want to see what elements a list object contains, type the name of the object followed by the dollar sign and have a look at the autocomplete list that pops up in RStudio. It will show you the elements contained within that object, which you can scroll through and select for further manipulation or inspection.
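If you are not in RStudio, or you want the same information printed in the console, base R's names() and str() do a similar job. A small sketch, again using a made-up toy_output list rather than a real gtsummary object:

```r
# Hypothetical stand-in list (invented values), shaped like a nested object
toy_output <- list(
  table_body = data.frame(variable = c("gender", "age_group")),
  inputs = list(
    data = data.frame(died_covid = c(1, 0)),
    exponentiate = TRUE
  )
)

# Names of the top-level elements
top_level <- names(toy_output)
top_level   # "table_body" "inputs"

# Names one level further down
nested <- names(toy_output$inputs)
nested      # "data" "exponentiate"

# Compact overview of the structure, limited to the first level
str(toy_output, max.level = 1)
```

On a real gtsummary object the element names will differ, so treat the names above as placeholders.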