This notebook is licensed under the MIT License. If you use the code or data visualization designs in this notebook, attribution back to this notebook and/or to me would be greatly appreciated. Thanks! :)

Via https://twitter.com/felipehoffa/status/1111050585120206848

0.1 Setup

Registered S3 method overwritten by 'dplyr':
  method           from
  print.rowwise_df     
── Attaching packages ──────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.2.1     ✔ purrr   0.3.2
✔ tibble  2.1.3     ✔ dplyr   0.8.3
✔ tidyr   1.0.0     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.4.0
── Conflicts ─────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Attaching package: ‘scales’

The following object is masked from ‘package:purrr’:

    discard

The following object is masked from ‘package:readr’:

    col_factor

Attaching package: ‘lubridate’

The following object is masked from ‘package:base’:

    date
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bigrquery_1.2.0 lubridate_1.7.4 scales_1.0.0    forcats_0.4.0  
 [5] stringr_1.4.0   dplyr_0.8.3     purrr_0.3.2     readr_1.3.1    
 [9] tidyr_1.0.0     tibble_2.1.3    ggplot2_3.2.1   tidyverse_1.2.1

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5 xfun_0.10        haven_2.1.1      lattice_0.20-38 
 [5] colorspace_1.4-1 vctrs_0.2.0      generics_0.0.2   htmltools_0.4.0 
 [9] yaml_2.2.0       base64enc_0.1-3  rlang_0.4.0      pillar_1.4.2    
[13] glue_1.3.1       withr_2.1.2      DBI_1.0.0        bit64_0.9-7     
[17] modelr_0.1.5     readxl_1.3.1     lifecycle_0.1.0  munsell_0.5.0   
[21] gtable_0.3.0     cellranger_1.1.0 rvest_0.3.4      evaluate_0.14   
[25] knitr_1.25       broom_0.5.2      Rcpp_1.0.2       backports_1.1.5 
[29] jsonlite_1.6     bit_1.1-14       hms_0.5.1        digest_0.6.21   
[33] stringi_1.4.3    grid_3.6.1       cli_1.1.0        tools_3.6.1     
[37] magrittr_1.5     lazyeval_0.2.2   crayon_1.3.4     pkgconfig_2.0.3 
[41] zeallot_0.1.0    xml2_1.2.2       assertthat_0.2.1 rmarkdown_1.16  
[45] httr_1.4.1       rstudioapi_0.10  R6_2.4.0         nlme_3.1-141    
[49] compiler_3.6.1  

1 EDA

1.1 SFO → SEA Flight Duration

Waiting for authentication in browser...
Press Esc/Ctrl + C to abort
Authentication complete.
Complete
Billed: 0 B
Downloading 102 rows in 1 pages.


NB: As of ggplot2 3.2.0, boxplots whose bounds are specified manually (rather than computed by ggplot2) require an explicit group aesthetic. (https://stackoverflow.com/q/57192727)
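To illustrate the note above, here is a minimal sketch of a boxplot drawn from precomputed bounds. It is not the notebook's original chunk: the data frame `df_summary`, its column names (`q_min`, `q_25`, `q_50`, `q_75`, `q_max`), and every number in it are made up for illustration. The relevant parts are `stat = "identity"` and the `group` aesthetic.

```r
library(ggplot2)

# Hypothetical precomputed summary: one row per month, with the five
# boxplot statistics (in minutes) already calculated upstream.
# These values are illustrative only, not real flight data.
df_summary <- data.frame(
  month = as.Date(c("2019-01-01", "2019-02-01", "2019-03-01")),
  q_min = c(100, 102, 101),
  q_25  = c(110, 111, 112),
  q_50  = c(118, 119, 120),
  q_75  = c(126, 127, 129),
  q_max = c(140, 142, 145)
)

# group = month gives each row its own box; with a continuous x such as a
# date, leaving it out is what ggplot2 3.2.0+ complains about.
plot_box <- ggplot(df_summary, aes(x = month, group = month)) +
  geom_boxplot(
    aes(ymin = q_min, lower = q_25, middle = q_50, upper = q_75, ymax = q_max),
    stat = "identity"  # use the supplied bounds instead of computing them
  ) +
  labs(x = "Month", y = "Flight duration (minutes)")
```

Printing `plot_box` renders one box per row of `df_summary`.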

Alternate approach using ribbons (not used in the final blog post because it is harder to parse visually, but kept here for posterity):
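A ribbon version could look roughly like the sketch below. This is an assumption about the general shape of such a plot, not the notebook's original code: `df_summary`, its column names, and its values are hypothetical, and the choice of two stacked ribbons (full range and interquartile range) plus a median line is just one plausible layout.

```r
library(ggplot2)

# Hypothetical precomputed summary (illustrative values only).
df_summary <- data.frame(
  month = as.Date(c("2019-01-01", "2019-02-01", "2019-03-01")),
  q_min = c(100, 102, 101),
  q_25  = c(110, 111, 112),
  q_50  = c(118, 119, 120),
  q_75  = c(126, 127, 129),
  q_max = c(140, 142, 145)
)

plot_ribbon <- ggplot(df_summary, aes(x = month)) +
  # Outer band: minimum to maximum.
  geom_ribbon(aes(ymin = q_min, ymax = q_max), alpha = 0.2) +
  # Inner band: 25th to 75th percentile.
  geom_ribbon(aes(ymin = q_25, ymax = q_75), alpha = 0.4) +
  # Median line on top.
  geom_line(aes(y = q_50)) +
  labs(x = "Month", y = "Flight duration (minutes)")
```

Unlike the boxplot, no `group` aesthetic is needed here, since each ribbon is drawn as a single continuous shape across all months.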