This R Notebook is the complement to my blog post How to Create an Interactive WebGL Network Graph Using R.
This notebook is licensed under the MIT License. If you use the code or data visualization designs contained within this notebook, it would be greatly appreciated if proper attribution is given back to this notebook and/or myself. Thanks! :)
Setup the R packages.
# must install ggnetwork using from source to avoid ggplot2 2.2.0 issue
# install.packages("ggnetwork", type="source")
library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
library(nycflights13)
library(igraph)
Attaching package: ‘igraph’
The following objects are masked from ‘package:dplyr’:
%>%, as_data_frame, groups, union
The following objects are masked from ‘package:stats’:
decompose, spectrum
The following object is masked from ‘package:base’:
union
library(intergraph)
library(sna)
Loading required package: statnet.common
Loading required package: network
network: Classes for Relational Data
Version 1.13.0 created on 2015-08-31.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
Mark S. Handcock, University of California -- Los Angeles
David R. Hunter, Penn State University
Martina Morris, University of Washington
Skye Bender-deMoll, University of Washington
For citation information, type citation("network").
Type help("network-package") to get started.
Attaching package: ‘network’
The following objects are masked from ‘package:igraph’:
%c%, %s%, add.edges, add.vertices, delete.edges, delete.vertices,
get.edge.attribute, get.edges, get.vertex.attribute,
is.bipartite, is.directed, list.edge.attributes,
list.vertex.attributes, set.edge.attribute, set.vertex.attribute
sna: Tools for Social Network Analysis
Version 2.4 created on 2016-07-23.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
For citation information, type citation("sna").
Type help(package="sna") to get started.
Attaching package: ‘sna’
The following objects are masked from ‘package:igraph’:
betweenness, bonpow, closeness, components, degree, dyad.census,
evcent, hierarchy, is.connected, neighborhood, triad.census
library(ggplot2)
library(ggnetwork)
library(plotly)
Attaching package: ‘plotly’
The following object is masked from ‘package:ggplot2’:
last_plot
The following objects are masked from ‘package:igraph’:
%>%, groups
The following object is masked from ‘package:stats’:
filter
The following object is masked from ‘package:graphics’:
layout
library(htmlwidgets)
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.2
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] htmlwidgets_0.8 plotly_4.5.6 ggnetwork_0.5.1
[4] ggplot2_2.2.0 sna_2.4 network_1.13.0
[7] statnet.common_3.3.0 intergraph_2.0-2 igraph_1.0.1
[10] nycflights13_0.2.0 dplyr_0.5.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.8 knitr_1.15.1 magrittr_1.5 munsell_0.4.3
[5] viridisLite_0.1.3 colorspace_1.3-2 R6_2.2.0 httr_1.2.1
[9] plyr_1.8.4 stringr_1.1.0 tools_3.3.2 grid_3.3.2
[13] gtable_0.2.0 DBI_0.5-1 htmltools_0.3.5 lazyeval_0.2.0
[17] assertthat_0.1 rprojroot_1.1 digest_0.6.10 tibble_1.2
[21] tidyr_0.6.0 purrr_0.2.2 base64enc_0.1-3 ggrepel_0.6.5
[25] evaluate_0.10 rmarkdown_1.3 stringi_1.1.2 scales_0.4.1
[29] backports_1.0.4 jsonlite_1.1
The nycflights13
package contains a flights
dataset.
flights %>% head()
There are 336,776 flights in the dataset.
Getting the edge weights is a dplyr
aggregation.
df_edges <- flights %>% group_by(origin, dest) %>% summarize(weight = n())
df_edges %>% arrange(desc(weight)) %>% head()
There are 224 total edges.
Add a colors column to edge for each origin
which will eventually be used for final ggplot.
# blue, red, green
colors = c("#3498db", "#e74c3c", "#2ecc71")
# seting alphabetical order; allows for predictable ordering later
origins = c("EWR", "JFK", "LGA")
df_colors = tbl_df(data.frame(origin=origins, color=origins))
df_edges <- df_edges %>% left_join(df_colors)
Joining, by = "origin"
joining factor and character vector, coercing into character vector
df_edges %>% arrange(desc(weight)) %>% head()
net <- graph.data.frame(df_edges, directed = T)
net
IGRAPH DNW- 107 224 --
+ attr: name (v/c), weight (e/n), color (e/c)
+ edges (vertex names):
[1] EWR->ALB EWR->ANC EWR->ATL EWR->AUS EWR->AVL EWR->BDL EWR->BNA EWR->BOS
[9] EWR->BQN EWR->BTV EWR->BUF EWR->BWI EWR->BZN EWR->CAE EWR->CHS EWR->CLE
[17] EWR->CLT EWR->CMH EWR->CVG EWR->DAY EWR->DCA EWR->DEN EWR->DFW EWR->DSM
[25] EWR->DTW EWR->EGE EWR->FLL EWR->GRR EWR->GSO EWR->GSP EWR->HDN EWR->HNL
[33] EWR->HOU EWR->IAD EWR->IAH EWR->IND EWR->JAC EWR->JAX EWR->LAS EWR->LAX
[41] EWR->LGA EWR->MCI EWR->MCO EWR->MDW EWR->MEM EWR->MHT EWR->MIA EWR->MKE
[49] EWR->MSN EWR->MSP EWR->MSY EWR->MTJ EWR->MYR EWR->OKC EWR->OMA EWR->ORD
[57] EWR->ORF EWR->PBI EWR->PDX EWR->PHL EWR->PHX EWR->PIT EWR->PVD EWR->PWM
+ ... omitted several edges
V(net)$degree <- centralization.degree(net)$res
V(net)$weighted_degree <- graph.strength(net)
V(net)$color_v <- c(origins, rep("Others", gorder(net) - length(colors)))
Write specialized hovertext for each vertex. Note that airport attributes must be mapped to same order as vertices.
df_airports <- data.frame(vname=V(net)$name) %>% left_join(airports, by=c("vname" = "faa"))
joining character vector and factor, coercing into character vector
V(net)$text <- paste(V(net)$name,
df_airports$name,
paste(format(V(net)$weighted_degree, big.mark=",", trim=T), "Flights"),
sep = "<br>")
V(net)$text %>% head()
[1] "EWR<br>Newark Liberty Intl<br>120,835 Flights"
[2] "JFK<br>John F Kennedy Intl<br>111,279 Flights"
[3] "LGA<br>La Guardia<br>104,663 Flights"
[4] "ALB<br>Albany Intl<br>439 Flights"
[5] "ANC<br>Ted Stevens Anchorage Intl<br>8 Flights"
[6] "ATL<br>Hartsfield Jackson Atlanta Intl<br>17,215 Flights"
Add latitudes/longitudes to both vertices and edges for spatial map;
V(net)$lat <- df_airports$lat
V(net)$lon <- df_airports$lon
# gives to/from locations; map to corresponding ending lat/long
end_loc <- data.frame(ename=get.edgelist(net)[,2]) %>% left_join(airports, by=c("ename" = "faa"))
joining character vector and factor, coercing into character vector
E(net)$endlat <- end_loc$lat
E(net)$endlon <- end_loc$lon
Use ggnetwork
to transform the network to a ggplot
friendly format.
# ggnetwork sets default nodes randomly; set seed for reproducibility
set.seed(123)
df_net <- ggnetwork(net, layout = "fruchtermanreingold", weights="weight", niter=50000, arrow.gap=0)
df_net %>% head()
plot <- ggplot(df_net, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(aes(color = color), size=0.4, alpha=0.25) +
geom_nodes(aes(color = color_v, size = degree, text=text)) +
ggtitle("Network Graph of U.S. Flights Outbound from NYC in 2013") +
scale_color_manual(labels=c("EWR", "JFK", "LGA", "Others"),
values=c(colors, "#1a1a1a"), name="Airports") +
guides(size=FALSE) +
theme_blank() +
theme(plot.title = element_text(family="Source Sans Pro"),
legend.title = element_text(family="Source Sans Pro"),
legend.text = element_text(family="Source Sans Pro"))
Ignoring unknown aesthetics: text
plot
plot %>% ggplotly(tooltip="text") %>% toWebGL()
Save interactive plot locally to disk using htmlwidgets
for uploading to my website. (this only saves the data/layout; you will need to provide the relevant plot.ly javascript on your own website)
plot %>% ggplotly(tooltip="text", height=400) %>%
toWebGL() %>%
saveWidget("ggplot-graph-1.html", selfcontained=F, libdir="plotly")
I make a few manual changes to the output for the output used on the website:
plot <- ggplot(df_net, aes(x = lon, y = lat, xend = endlon, yend = endlat)) +
geom_edges(aes(color = color), size=0.4, alpha=0.25) +
geom_nodes(aes(color = color_v, size = degree, text=text)) +
ggtitle("Locations of U.S. Flights Outbound from NYC in 2013") +
scale_color_manual(labels=c("EWR", "JFK", "LGA", "Others"),
values=c(colors, "#1a1a1a"), name="Airports") +
guides(size=FALSE) +
theme_blank() +
theme(plot.title = element_text(family="Source Sans Pro"),
legend.title = element_text(family="Source Sans Pro"),
legend.text = element_text(family="Source Sans Pro"))
Ignoring unknown aesthetics: text
plot
plot %>% ggplotly(tooltip="text") %>% toWebGL()
plot %>% ggplotly(tooltip="text", height=400) %>%
toWebGL() %>%
saveWidget("ggplot-graph-2.html", selfcontained=F, libdir="plotly")
Minimum amount of code needed to demonstate proof-of-concept for article.
df_edges <- flights %>% group_by(origin, dest) %>% summarize(weight = n())
net <- graph.data.frame(df_edges, directed = T)
V(net)$degree <- centralization.degree(net)$res
df_net <- ggnetwork(net, layout = "fruchtermanreingold", weights="weight", niter=5000)
plot <- ggplot(df_net, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(size=0.4, alpha=0.25) +
geom_nodes(aes(size = degree, text=vertex.names)) +
ggtitle("Network Graph of U.S. Flights Outbound from NYC in 2013") +
theme_blank()
Ignoring unknown aesthetics: text
plot
plot %>% ggplotly(tooltip="text") %>% toWebGL()
plot %>% ggplotly(tooltip="text", height=400) %>%
toWebGL() %>%
saveWidget("ggplot-graph-3.html", selfcontained=F, libdir="plotly")
The MIT License (MIT)
Copyright (c) 2016 Max Woolf
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.