Advanced Venn diagrams with upset plots: Part 4




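Medical and Bioinformatics Notes: focused on using R in clinical medicine, including data analysis and visualization in R. Topics covered include medical statistics, meta-analysis, network pharmacology, clinical prediction models, machine learning, and bioinformatics.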

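Picking up where the last post left off!

The previous posts covered combined plots, and today we keep combining.

The combined plots so far were all about intersections, or about summary statistics of the whole dataset. Today's topic, metadata, is about the attributes of each individual set.

We again use the movies dataset as the example.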

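The built-in movies dataset describes movie genres and has 3883 rows (3881 movies) and 21 columns. The first column is the movie name; there are also columns for release year (ReleaseDate), average rating (AvgRating), and number of views (Watches). The remaining columns encode the movie genres as a 0-1 matrix.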

library(UpSetR)

movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), 
    header = TRUE, sep = ";")

str(movies)
## 'data.frame':	3883 obs. of  21 variables:
##  $ Name       : chr  "Toy Story (1995)" "Jumanji (1995)" "Grumpier Old Men (1995)" "Waiting to Exhale (1995)" ...
##  $ ReleaseDate: int  1995 1995 1995 1995 1995 1995 1995 1995 1995 1995 ...
##  $ Action     : int  0 0 0 0 0 1 0 0 1 1 ...
##  $ Adventure  : int  0 1 0 0 0 0 0 1 0 1 ...
##  $ Children   : int  1 1 0 0 0 0 0 1 0 0 ...
##  $ Comedy     : int  1 0 1 1 1 0 1 0 0 0 ...
##  $ Crime      : int  0 0 0 0 0 1 0 0 0 0 ...
##  $ Documentary: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Drama      : int  0 0 0 1 0 0 0 0 0 0 ...
##  $ Fantasy    : int  0 1 0 0 0 0 0 0 0 0 ...
##  $ Noir       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Horror     : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Musical    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Mystery    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Romance    : int  0 0 1 0 0 0 1 0 0 0 ...
##  $ SciFi      : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Thriller   : int  0 0 0 0 0 1 0 0 0 1 ...
##  $ War        : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Western    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ AvgRating  : num  4.15 3.2 3.02 2.73 3.01 3.88 3.41 3.01 2.66 3.54 ...
##  $ Watches    : int  2077 701 478 170 296 940 458 68 102 888 ...

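The data has 21 columns in total: 4 are attributes (Name, ReleaseDate, AvgRating, Watches) and the remaining 17 are movie genres, such as action, drama, horror, and thriller.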
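As a quick sanity check on that column split, the genre columns can be picked out programmatically instead of by position. The sketch below uses a small made-up data frame (`toy`, with invented values) that mirrors the layout of `movies`; the helper `is_binary` is a hypothetical name introduced only for illustration.

```r
# Toy data mirroring the layout of `movies` (all values invented for illustration)
toy <- data.frame(
  Name        = c("A", "B", "C", "D"),
  ReleaseDate = c(1995L, 1996L, 1997L, 1998L),
  Action      = c(1L, 0L, 1L, 0L),
  Comedy      = c(0L, 1L, 1L, 0L),
  Drama       = c(0L, 0L, 1L, 1L),
  AvgRating   = c(3.5, 2.8, 4.1, 3.0),
  Watches     = c(120L, 45L, 300L, 80L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when a column holds only 0/1 values
is_binary <- function(x) is.numeric(x) && all(x %in% c(0, 1))

# Names of the 0-1 (set membership) columns
genre_cols <- names(toy)[vapply(toy, is_binary, logical(1))]

# Size of each set = number of 1s in its column
set_sizes <- colSums(toy[genre_cols])

genre_cols
set_sizes
```

On the real data, `names(movies)[vapply(movies, is_binary, logical(1))]` should give the same 17 names as `names(movies[3:19])`, assuming every genre column contains only 0 and 1.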


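Next we create a new data frame that holds a Rotten Tomatoes score for each movie genre. The scores here are simulated with random numbers, purely for demonstration.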

metadata <- data.frame(
  sets = names(movies[3:19]),
  RottenTomato = round(runif(17, min = 0, max = 90))
)
upset(movies, 
      set.metadata = list(
        data = metadata, 
        plots = list(list(type = "hist", column = "RottenTomato", assign = 20))
        )
      )


OK, with very little code the plot now shows the Rotten Tomatoes score of each genre, drawn as a bar chart at the far left.
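Because the scores above are random, the bars carry no real meaning. In practice the metadata would hold a genuine per-set attribute. A minimal sketch, assuming the same layout (a first column `sets` with the set names, followed by one column per attribute), computes the mean `AvgRating` of each genre from a made-up data frame `toy`; the names `toy`, `toy_metadata`, and `MeanRating` are invented for this example.

```r
# Toy data: three 0-1 genre columns plus a rating (values invented)
toy <- data.frame(
  Action    = c(1L, 0L, 1L, 0L),
  Comedy    = c(0L, 1L, 1L, 0L),
  Drama     = c(0L, 0L, 1L, 1L),
  AvgRating = c(3.5, 2.5, 4.0, 3.0)
)
genres <- c("Action", "Comedy", "Drama")

# Mean rating of the movies that belong to each genre
mean_rating <- vapply(
  genres,
  function(g) mean(toy$AvgRating[toy[[g]] == 1]),
  numeric(1)
)

# First column: set names; remaining columns: attributes of each set
toy_metadata <- data.frame(
  sets       = genres,
  MeanRating = round(mean_rating, 2),
  stringsAsFactors = FALSE
)
toy_metadata
```

With the real data, the same pattern with `names(movies[3:19])` as `genres` would produce a data frame that could be passed as `data` inside `set.metadata`, with `column = "MeanRating"` in the plot specification.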

Other plot types are available too, for example a heatmap.

# Add more columns to use in the heatmap
metadata$Cities <- sample(c("Boston", "NYC", "LA"), 17, replace = TRUE)
metadata$accepted <- round(runif(17, min = 0, max = 1))
str(metadata)
## 'data.frame':	17 obs. of  4 variables:
##  $ sets        : chr  "Action" "Adventure" "Children" "Comedy" ...
##  $ RottenTomato: num  84 79 66 43 13 26 74 54 70 25 ...
##  $ Cities      : chr  "Boston" "Boston" "NYC" "Boston" ...
##  $ accepted    : num  0 0 0 1 0 0 1 1 0 0 ...
upset(movies, 
      set.metadata = list(
        data = metadata, 
        plots = list(
          list(type = "heat", column = "Cities", assign = 10, colors = c(Boston = "green", NYC = "navy", LA = "purple")),
          list(type = "heat", column = "RottenTomato", assign = 10),
          list(type = "bool", column = "accepted", assign = 5, colors = c("#FF3333", "#006400"))
          )
        )
      )

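The named `colors` vector pairs each category with a color, so a category left without an entry is an easy mistake to make. As a small pre-flight check (plain base R, not part of UpSetR), the names can be compared against the values actually present in the column. The vectors `cities` and `city_cols` below are hypothetical stand-ins for `metadata$Cities` and the `colors` argument.

```r
# Hypothetical stand-ins for metadata$Cities and the colors argument
cities    <- c("Boston", "NYC", "LA", "Boston", "NYC")
city_cols <- c(Boston = "green", NYC = "navy", LA = "purple")

# Categories present in the data that have no color assigned (ideally none)
missing_cols <- setdiff(unique(cities), names(city_cols))

# Check that every color is one of R's built-in color names
valid_names <- all(city_cols %in% grDevices::colors())

missing_cols
valid_names
```

Note that the second check only covers named colors; hex codes such as "#FF3333" are valid in R but do not appear in `grDevices::colors()`.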

Text is also an option, and you can even recolor the rows of the matrix below:

upset(movies, 
      set.metadata = list(
        data = metadata, 
        plots = list(
          list(type = "text", column = "Cities", assign = 10, colors = c(Boston = "tomato", NYC = "grey70", LA = "red")),
          list(type = "matrix_rows", column = "Cities", colors = c(Boston = "green", NYC = "navy", LA = "purple"), alpha = 0.5)
          )
        )
      )


OK, that is everything on the upset plot! It took four posts to cover, and this package really is great!

To finish, here is a complete upset plot:

upset(movies, main.bar.color = "skyblue",
      
      set.metadata = list(
        data = metadata, 
        plots = list(
          list(type = "hist", column = "RottenTomato", assign = 20), 
          list(type = "bool", column = "accepted", assign = 5, colors = c("#FF3333", "#006400")), 
          list(type = "text", column = "Cities", assign = 5, colors = c(Boston = "green", NYC = "navy", LA = "purple")), 
          list(type = "matrix_rows", column = "Cities", colors = c(Boston = "green", NYC = "navy", LA = "purple"), alpha = 0.5)
          )
        ), 
      
      queries = list(
        list(query = intersects, params = list("Drama"), color = "red", active = FALSE),
        list(query = intersects, params = list("Action", "Drama"), active = TRUE), 
        list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE)
        ), 
      
      attribute.plots = list(
        gridrows = 45, 
        plots = list(
          list(plot = scatter_plot, x = "ReleaseDate", y = "AvgRating", queries = TRUE),
          list(plot = scatter_plot, x = "AvgRating", y = "Watches", queries = FALSE)
          ), 
        ncols = 2),
      
      query.legend = "bottom")


I hope you find this helpful!



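Copyright notice: this is an original article by Ayue0616, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.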