Data

A Visual Overview of Stack Overflow's Question Tags

R Notebook on how to process and analyze Stack Overflow data.

Analyzing IMDb Data The Intended Way, with R and ggplot2

R Notebook on how to process and visualize the official IMDb datasets.

Playing with 80 Million Amazon Product Review Ratings

R Notebook for analyzing millions of Amazon reviews using Apache Spark.

Predicting And Mapping Arrest Types in San Francisco

R Notebook for predicting arrest types in San Francisco.

Pretrained Character Embeddings for Deep Learning and Automatic Text Generation

R Notebook with visualizations of the derived 300D character embeddings.

Problems with Predicting Post Performance on Reddit and Other Link Aggregators

R Notebook on how to process and visualize both Reddit and Hacker News data.

The Decline of Imgur on Reddit and the Rise of Reddit's Native Image Hosting

R Notebook for analyzing the decline of Imgur on Reddit.

Visualizing One Million NCAA Basketball Shots

R Notebook on how to process and visualize NCAA basketball data.

What Percent of the Top-Voted Comments in Reddit Threads Were Also 1st Comment?

R Notebook for querying, analyzing, and visualizing Reddit data to determine the impact of the first comment in a thread.