I have read through all of the ggalluvial posts that I can find about ordering the axes in an alluvial diagram. Each of my axes is a factor variable and I have set the levels to be in the order I wish for them to appear in the plot. The first of my two axes is correctly ordered per the levels of its factor variable and the order changes as it should when I change the levels. However, the second axis does not reflect the levels of its underlying factor variable. For some reason the second axis appears to mirror the levels of the first axis for all overlapping drug classes and is not responsive to any changes I make to the levels.

Does anyone have any suggestions for addressing this problem?

For reference, here is my data:

top_transitions <- data.frame(
  last_pre_tx_drug = c("DPP4 inhibitors", "DPP4 inhibitors",
                       "Insulin", "Insulin", "Insulin", 
                       "None", "None", "None", "None", "None",
                       "Sulfonylureas", "Sulfonylureas", 
                       "Sulfonylureas", "Sulfonylureas"),

  post_transplant_drug = c("Insulin", "Insulin + DPP4 inhibitors", 
                           "Insulin", "None", "Insulin + GLP-1 RAs", 
                           "DPP4 inhibitors", "Insulin", "None", 
                           "Biguanides", "Sulfonylureas", 
                           "Insulin + Sulfonylureas", "Insulin", 
                           "None", "Sulfonylureas"),

  N = c(1809, 1005, 35053, 4506, 808, 2200, 12087, 36052, 
        1304, 3303, 1301, 5017, 1302, 1207)
)

top_transitions$last_pre_tx_drug <- factor(top_transitions$last_pre_tx_drug, 
      levels = c("DPP4 inhibitors", "Insulin", "None", "Sulfonylureas"))

top_transitions$post_transplant_drug <- factor(top_transitions$post_transplant_drug, 
       levels = c("DPP4 inhibitors", "Insulin + Sulfonylureas",
                  "Insulin", "Insulin + DPP4 inhibitors", "None", 
                  "Insulin + GLP-1 RAs", "Biguanides", "Sulfonylureas"))


And here is the code I'm using for my plot:

# Color palette based on the levels of post_transplant_drug
color_palette <- c(
  "DPP4 inhibitors" = "#4393C3",
  "Insulin + Sulfonylureas" = "#B2182B",
  "Insulin" = "#2166AC",
  "Insulin + DPP4 inhibitors" = "#A6A6A6",
  "None" = "#92C5DE",
  "Insulin + GLP-1 RAs" = "#1B7837",
  "Biguanides" = "#F4A582",
  "Sulfonylureas" = "#D6604D"
)

library(ggalluvial)

ggplot(top_transitions, aes(axis1 = last_pre_tx_drug, 
                            axis2 = post_transplant_drug, y = N)) +
  geom_alluvium(aes(fill = post_transplant_drug), 
                width = 1/25, knot.pos = 0.1, curve_type = "sigmoid") +
  geom_stratum(width = 1/5, fill = "#F0F0F0", color = "#808080") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), 
                                  size = 3.5, color = "#333333",
            family = "Helvetica Neue", fontface = "bold", 
            hjust = 0.5, vjust = 0.5) +
  scale_x_discrete(limits = c("last_pre_tx_drug", "post_transplant_drug"),
                   labels = c("Last Regimen Used\nPre-Transplant",
                              "First Regimen Used\nPost-Transplant"),
                   expand = c(.02, .02)) +
  scale_fill_manual(values = color_palette, name = "Post-Transplant\nDrug Class") +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 16, hjust = 0.5, 
                                  margin = margin(b = 10), 
                                  family = "Helvetica Neue"),
        plot.subtitle = element_text(size = 12, hjust = 0.5, 
                                     margin = margin(b = 20), 
                                     family = "Helvetica Neue"),
        axis.title = element_blank(),
        axis.text = element_text(size = 11, color = "#333333", family = "Helvetica Neue"),
        axis.text.y = element_blank(),
        legend.position = "bottom",
        legend.title = element_text(size = 11, face = "bold", family = "Helvetica Neue"),
        legend.text = element_text(size = 10, family = "Helvetica Neue"),
        panel.grid = element_blank(),
        panel.border = element_blank(),
        plot.margin = margin(20, 20, 20, 20))

No matter what I try, the rightmost axis always appears as follows: [image]

2 Answers

We can reshape the data to long format, put both axes in the same column, and apply the desired factor order to that single column.

library(ggalluvial)
library(dplyr)

melted_top_transitions <- top_transitions %>% 
  tibble::rowid_to_column("ID") %>% 
  tidyr::pivot_longer(-c(ID, N)) %>% 
  mutate(value = factor(value, 
          levels = unique(c(levels(top_transitions$post_transplant_drug),
                            levels(top_transitions$last_pre_tx_drug))))) %>% 
  mutate(fclr = last(value), .by = ID)
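To see what the reshaping does to the ordering, here is a minimal base-R sketch on a toy data frame. The names `toy` and `long` and all values are made up for illustration. Both axis columns are stacked into one `value` column, and a single factor then carries the order for both axes.

```r
# Toy data (hypothetical values) with two axis columns sharing some levels
toy <- data.frame(
  pre  = c("Insulin", "None", "None"),
  post = c("None", "Insulin", "Biguanides"),
  N    = c(10, 20, 30)
)

# Desired order for each axis
pre_levels  <- c("Insulin", "None")
post_levels <- c("None", "Biguanides", "Insulin")

# Stack both axes into one column, keeping a row ID to link the two ends
long <- data.frame(
  ID    = rep(seq_len(nrow(toy)), times = 2),
  name  = rep(c("pre", "post"), each = nrow(toy)),
  value = c(toy$pre, toy$post),
  N     = rep(toy$N, times = 2)
)

# One shared factor: post levels first, then any pre-only levels
long$value <- factor(long$value, levels = unique(c(post_levels, pre_levels)))

levels(long$value)
```

Because one factor now serves both axes, levels that occur on both axes should keep the same relative order on each, so this approach fits best when the two desired orders agree on the shared levels, as they do in the question.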

Then you also need to change your aes() and provide the following:

Parameters `x`, `stratum`, and `alluvium` are required for data in lodes form

I also used geom_flow() instead of geom_alluvium() (for some reason the latter didn't work here). To make the flow thickness correspond to N, we need to map N to y in aes() (this was previously done with weight).
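A quick way to sanity-check that the thickness follows N is to compute the totals each stratum should represent and compare them with the plot by eye. This is a minimal base-R sketch, assuming the data from the question; it is rebuilt here under the hypothetical name `tt_check`, with plain character columns, so that the block runs on its own and does not overwrite `top_transitions`.

```r
# Copy of the question's data under a hypothetical name (no factor ordering needed)
tt_check <- data.frame(
  last_pre_tx_drug = c("DPP4 inhibitors", "DPP4 inhibitors",
                       "Insulin", "Insulin", "Insulin",
                       "None", "None", "None", "None", "None",
                       "Sulfonylureas", "Sulfonylureas",
                       "Sulfonylureas", "Sulfonylureas"),
  post_transplant_drug = c("Insulin", "Insulin + DPP4 inhibitors",
                           "Insulin", "None", "Insulin + GLP-1 RAs",
                           "DPP4 inhibitors", "Insulin", "None",
                           "Biguanides", "Sulfonylureas",
                           "Insulin + Sulfonylureas", "Insulin",
                           "None", "Sulfonylureas"),
  N = c(1809, 1005, 35053, 4506, 808, 2200, 12087, 36052,
        1304, 3303, 1301, 5017, 1302, 1207)
)

# Total N leaving each pre-transplant stratum
pre_totals <- tapply(tt_check$N, tt_check$last_pre_tx_drug, sum)

# Total N arriving at each post-transplant stratum
post_totals <- tapply(tt_check$N, tt_check$post_transplant_drug, sum)

# Both axes must account for the same overall total
stopifnot(sum(pre_totals) == sum(post_totals))

# Largest strata first; their heights in the plot should rank the same way
sort(post_totals, decreasing = TRUE)
```

If the strata in the plot all look equally tall, or their ranking disagrees with these totals, N is probably not reaching the y aesthetic.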

ggplot(melted_top_transitions, 
       aes(x = name, y = N,
           stratum = value, alluvium = ID,
           label = value)) +
  geom_flow(aes(fill = fclr), 
                width = 1/25, knot.pos = 0.1, curve_type = "sigmoid") +
  geom_stratum(width = 1/5, fill = "#F0F0F0", color = "#808080") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), 
            size = 3, color = "#333333",
            family = "Helvetica Neue", fontface = "bold", 
            hjust = 0.5, vjust = 0.5) +
  scale_x_discrete(limits = c("last_pre_tx_drug", "post_transplant_drug"),
                   labels = c("Last Regimen Used\nPre-Transplant",
                              "First Regimen Used\nPost-Transplant"),
                   expand = c(.02, .02)) +
  scale_fill_manual(values = color_palette, name = "Post-Transplant\nDrug Class") +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 16, hjust = 0.5, 
                                  margin = margin(b = 10), 
                                  family = "Helvetica Neue"),
        plot.subtitle = element_text(size = 12, hjust = 0.5, 
                                     margin = margin(b = 20), 
                                     family = "Helvetica Neue"),
        axis.title = element_blank(),
        axis.text = element_text(size = 11, color = "#333333", family = "Helvetica Neue"),
        axis.text.y = element_blank(),
        legend.position = "bottom",
        legend.title = element_text(size = 11, face = "bold", family = "Helvetica Neue"),
        legend.text = element_text(size = 10, family = "Helvetica Neue"),
        panel.grid = element_blank(),
        panel.border = element_blank(),
        plot.margin = margin(20, 20, 20, 20)) +
  scale_y_discrete()

  • This is perfect. I didn't realize I could melt the table like this or use this table structure for the plot. Thank you so much!
    – jos0909
    Commented Jun 6 at 22:21
  • After looking at this more closely, I realize that the thickness of the flows doesn't appear to correspond to N as it did before. Could you kindly clarify how I can achieve this?
    – jos0909
    Commented Jun 6 at 23:01
  • 1
    @jos0909 Huh!!! You are right, but I am sure this used to work (or at least I have a graph that I previously created for one of my reports with geom_flow instead of geom_alluvium which has the right weights). Seems to be a bug or something that changed recently. Let me read through some of the changes and documentation.
    – M--
    Commented Jun 6 at 23:35
  • 1
    @jos0909 They changed some stuff around. I should've used y instead of weight to get the thickness to work. And geom_alluvium had a weird behavior (it worked when I didn't have theme), so I changed to geom_flow. See the edits above.
    – M--
    Commented Jun 7 at 0:00
One option to achieve your desired result is to get rid of the levels shared by the two factors, which I do by simply adding a suffix "_2" to post_transplant_drug. Afterwards we can strip the suffix from the labels using e.g. gsub():

library(ggalluvial)
#> Loading required package: ggplot2

top_transitions$post_transplant_drug <- factor(
  paste0(top_transitions$post_transplant_drug, "_2"),
  levels = paste0(
    c(
      "DPP4 inhibitors", "Insulin + Sulfonylureas",
      "Insulin", "Insulin + DPP4 inhibitors", "None",
      "Insulin + GLP-1 RAs", "Biguanides", "Sulfonylureas"
    ), "_2"
  )
)

names(color_palette) <- paste0(names(color_palette), "_2")

ggplot(top_transitions, aes(
  axis1 = last_pre_tx_drug,
  axis2 = post_transplant_drug, y = N
)) +
  geom_alluvium(aes(fill = post_transplant_drug),
    width = 1 / 25, knot.pos = 0.1, curve_type = "sigmoid"
  ) +
  geom_stratum(width = 1 / 5, fill = "#F0F0F0", color = "#808080") +
  geom_text(
    stat = "stratum", aes(label = after_stat(
      gsub("_2", "", stratum)
    )),
    size = 3.5, color = "#333333",
    family = "Helvetica Neue", fontface = "bold",
    hjust = 0.5, vjust = 0.5
  ) +
  scale_x_discrete(
    limits = c("last_pre_tx_drug", "post_transplant_drug"),
    labels = c(
      "Last Regimen Used\nPre-Transplant",
      "First Regimen Used\nPost-Transplant"
    ),
    expand = c(.02, .02)
  ) +
  scale_fill_manual(
    labels = \(x) gsub("_2", "", x),
    values = color_palette, name = "Post-Transplant\nDrug Class"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(
      face = "bold", size = 16, hjust = 0.5,
      margin = margin(b = 10),
      family = "Helvetica Neue"
    ),
    plot.subtitle = element_text(
      size = 12, hjust = 0.5,
      margin = margin(b = 20),
      family = "Helvetica Neue"
    ),
    axis.title = element_blank(),
    axis.text = element_text(size = 11, color = "#333333", family = "Helvetica Neue"),
    axis.text.y = element_blank(),
    legend.position = "bottom",
    legend.title = element_text(size = 11, face = "bold", family = "Helvetica Neue"),
    legend.text = element_text(size = 10, family = "Helvetica Neue"),
    panel.grid = element_blank(),
    panel.border = element_blank(),
    plot.margin = margin(20, 20, 20, 20)
  )
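Why the suffix helps: as far as I understand it, when the axes share level names, ggalluvial treats them as the same strata and orders them through one pooled set of levels, so the second axis cannot be ordered independently. The base-R sketch below only mimics that pooling to show the effect. `pool_levels()` is a hypothetical helper, not a ggalluvial function, and the real internals may differ.

```r
pre_levels  <- c("DPP4 inhibitors", "Insulin", "None", "Sulfonylureas")
post_levels <- c("DPP4 inhibitors", "Insulin + Sulfonylureas",
                 "Insulin", "Insulin + DPP4 inhibitors", "None",
                 "Insulin + GLP-1 RAs", "Biguanides", "Sulfonylureas")

# Hypothetical helper: pool levels axis by axis, dropping repeats
pool_levels <- function(...) unique(c(...))

# Shared names: levels already seen on axis 1 keep their axis-1 position,
# so the order requested for axis 2 is lost for the overlapping classes
pooled_shared <- pool_levels(pre_levels, post_levels)

# Suffixed names: nothing is shared, so the axis-2 order survives intact
post_suffixed <- paste0(post_levels, "_2")
pooled_unique <- pool_levels(pre_levels, post_suffixed)

# Recover the display labels; anchoring with "$" strips only a trailing "_2"
axis2_labels <- sub("_2$", "", pooled_unique[-seq_along(pre_levels)])
axis2_labels
```

The anchored pattern `"_2$"` is a small safety choice in this sketch: it removes the suffix only at the end of a label, whereas `gsub("_2", "", x)` would also remove a "_2" occurring elsewhere in a level name.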

  • Valid idea +1, but I don't like it particularly for this example. In my mind, axis1, axis2, axis3, .... should be used where your levels are fundamentally different like let's say gender, job sector, experience for instance. For this one I prefer reshaping the data. Cheers. :)
    – M--
    Commented Jun 6 at 20:01
  • 1
    @M-- Absolutley agree with you. But sometimes it's also good to know how to achieve the desired result with a little hack. Resembles the approach taken by Arrow and Debreu to extend general equilibrium theory (in economics) to the case of uncertainty: Simply make goods different goods by treating the state as a characteristic.
    – stefan
    Commented Jun 6 at 20:29
  • 1
    Agreed. You had me with the "a little hack" part. "economics" went right over my head ;)
    – M--
    Commented Jun 6 at 20:36
