Challenges with Network Visualization in R: Coloring Nodes Based on Contact Attributes (epicontacts package)

I’m working on a project involving epidemiological contact data, where I aim to visualise a network of cases and their contacts. The primary challenge I’m facing is how to use contact-specific attributes to colour nodes in the network, especially as the network visualization tools seem more oriented towards using case attributes.

Data Structure:

My dataset comprises two main parts:

  1. A cases dataframe that lists each case with a unique identifier and basic information (e.g., case_id, dob, sex).
  2. A contacts dataframe that tracks the interactions between cases and their contacts, including contact-specific attributes like DX (diagnosis), sex, dob, and contact_type. The structure is something like this:
  • case_id: Identifier for the case
  • contact_id: Unique ID for the contact
  • Additional columns: sex, dob, contact_type, igra, DX
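
For illustration, a toy version of the two dataframes could look like the sketch below. All IDs, dates, and DX categories are invented for this example and are not taken from my real data.

```r
# Hypothetical toy data mirroring the structure described above;
# every value here is invented for illustration.
cases_df <- data.frame(
  case_id = c("C1", "C2"),
  dob     = as.Date(c("1980-03-14", "1975-11-02")),
  sex     = c("F", "M"),
  stringsAsFactors = FALSE
)

# One row per case-contact pair, carrying the contact's own attributes
contacts_df <- data.frame(
  case_id      = c("C1", "C1", "C2"),
  contact_id   = c("K1", "K2", "K3"),
  sex          = c("M", "F", "F"),
  dob          = as.Date(c("2001-06-30", "1999-01-21", "1988-09-09")),
  contact_type = c("household", "work", "household"),
  igra         = c("positive", "negative", "positive"),
  DX           = c("LTBI", "No infection", "LTBI"),
  stringsAsFactors = FALSE
)

str(contacts_df)
```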

Objective:

My main goal is to create a directed network where nodes represent both cases and contacts, and edges depict the interactions from cases to their contacts. Most importantly, I want to colour the nodes based on contact attributes, particularly DX, which represents the screening diagnosis outcome.

Attempts and Issues:

  • Using epicontacts: Initially, I attempted to use the epicontacts package, which is designed for such epidemiological data. However, I ran into limitations regarding visualizing nodes colored by contact attributes, as epicontacts (and its visualization function vis_epicontacts) primarily focuses on case attributes.

Specific Questions:

  1. For epicontacts Users: Has anyone successfully used contact attributes (like DX) to color nodes in a network visualization? If so, how did you integrate contact data into the visualization?

  2. General Advice: Are there alternative R packages or methods that might be better suited for this type of network visualization where contact attributes are central to the analysis?

Hello,

I haven’t used epicontacts myself (I have used tidygraph); however, it appears that the node_color argument to the vis_epicontacts() function should work to colour nodes based on a variable in your linelist data. Something like vis_epicontacts(x = data, node_color = "DX") should work.
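
A rough sketch of how that could look, not a tested recipe. It assumes a toy dataset (all IDs and DX values are invented) and that contacts need their own rows in the linelist before node_color can find their DX; labelling the cases "Index case" is my own placeholder. make_epicontacts() and vis_epicontacts() come from epicontacts, and the call is guarded so the data preparation still runs if the package is not installed.

```r
# Toy data: all values invented for illustration
cases_df <- data.frame(
  case_id = c("C1", "C2"),
  sex     = c("F", "M"),
  stringsAsFactors = FALSE
)
contacts_df <- data.frame(
  case_id    = c("C1", "C1", "C2"),
  contact_id = c("K1", "K2", "K3"),
  sex        = c("M", "F", "F"),
  DX         = c("LTBI", "No infection", "LTBI"),
  stringsAsFactors = FALSE
)

# One row per node: cases first, then contacts with their own attributes.
# Giving cases the DX value "Index case" is an assumption of this sketch.
linelist <- rbind(
  data.frame(id = cases_df$case_id, sex = cases_df$sex,
             DX = "Index case", stringsAsFactors = FALSE),
  data.frame(id = contacts_df$contact_id, sex = contacts_df$sex,
             DX = contacts_df$DX, stringsAsFactors = FALSE)
)

# Edge list: one directed edge from each case to each of its contacts
edges <- contacts_df[, c("case_id", "contact_id")]

if (requireNamespace("epicontacts", quietly = TRUE)) {
  net <- epicontacts::make_epicontacts(
    linelist = linelist,
    contacts = edges,
    id = "id", from = "case_id", to = "contact_id",
    directed = TRUE
  )
  # Colour every node (case or contact) by its DX value
  vis_output <- epicontacts::vis_epicontacts(net, node_color = "DX")
}
```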

All the best,

Tim

Hi Tim,

Thank you for responding. I am now able to set the node color based on the attribute, but only if the variable is stored in cases_df. That isn’t too much of a problem, as I can include both cases and contacts in the cases linelist and differentiate them based on the outcome DX. I now have another issue: I cannot change the colors of the nodes to my own color palette or a color of my choice. I have defined my color palette as cases_pal, but this does not work. The usual syntax that works with ggplot does not seem to apply to epicontacts. I have read the documentation and cannot see a way to change the colours from the standard palette that is part of the package.

Do you have any suggestions?

# Visualization with custom color palette
vis_output <- vis_epicontacts(my_epicontacts,
                              node_color = "DX",
                              col_pal = cases_pal)

Thank you,

Abbie

Hi Abbie,

I’m glad you got the node colour working!

In terms of selecting a palette, it looks like you’d need to do something like:


vis_output <- vis_epicontacts(my_epicontacts, node_color = "DX", col_pal = c("red", "blue"))

Essentially, you need to provide a vector with one colour for each possible value of DX.
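
In case a plain vector is rejected: as far as I can tell, the default for col_pal (cases_pal) is a function that takes a number n and returns n colours, so your installed version may expect a function rather than a vector. Below is a sketch under that assumption. The DX levels and colours are invented, dx_pal is a hypothetical helper of my own, and I am assuming colours are matched to the DX levels in sorted (factor level) order.

```r
# Invented DX levels and colours, listed in alphabetical (factor level) order;
# replace them with the values that occur in your own data.
dx_colours <- c(
  "Active TB"    = "firebrick",
  "Index case"   = "black",
  "LTBI"         = "orange",
  "No infection" = "steelblue"
)

# Hypothetical helper: a palette function that returns the first n
# hand-picked colours instead of generating them.
dx_pal <- function(n) {
  stopifnot(n <= length(dx_colours))
  unname(dx_colours)[seq_len(n)]
}

dx_pal(4)

# Usage (assumes `my_epicontacts` has a DX column in its linelist):
# vis_output <- vis_epicontacts(my_epicontacts, node_color = "DX", col_pal = dx_pal)
```

It would be worth checking the legend against a few known nodes to confirm each colour landed on the intended DX value.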

All the best,

Tim