A list of Max Woolf’s top data analysis projects, in the form of Jupyter/IPython Notebooks and R Notebooks. Most notebooks are fully open-sourced on GitHub under an MIT license.

If you want to learn more about Max’s technical projects, see his coding portfolio.

Pretrained Character Embeddings for Deep Learning and Automatic Text Generation

R Notebook with visualizations of character embeddings from derived 300D character vectors.

R, ggplot2

Predicting And Mapping Arrest Types in San Francisco

R Notebook for predicting arrest types in San Francisco.

R, ggplot2, LightGBM

Playing with 80 Million Amazon Product Review Ratings

R Notebook for analyzing millions of Amazon reviews using Apache Spark.

R, ggplot2, Spark

How to Create an Interactive WebGL Network Graph

R Notebook with a tutorial on how to create an interactive network graph using R and Plotly.

R, ggplot2, plotly

What Percent of the Top-Voted Comments in Reddit Threads Were Also 1st Comment?

R Notebook for querying, analyzing, and visualizing Reddit data to determine the impact of the first comment in a Reddit thread.

R, ggplot2

Processing Clusters of Clickbait Headlines

Jupyter Notebook for processing Facebook headline data in preparation for plotting word embeddings.

Python, Spark, word2vec

Visualizing Clusters of Clickbait Headlines

Jupyter Notebook for visualizing Facebook data interactively.

R, plotly

Processing Pokémon Data With Apache Spark

Jupyter Notebook for processing Pokémon data in preparation for visualization.

Python, Spark

Interactive 3D Clusters of all 721 Pokémon

Jupyter Notebook for visualizing Pokémon data in 3D.

R, ggplot2, plotly

Visualizing How Developers Rate Their Own Programming Skills

Jupyter Notebook for visualizing Stack Overflow 2016 Survey data.

R, ggplot2

Classifying the Emotions of Facebook Posts Using Reactions Data

Jupyter Notebook for visualizing different sentiments of Facebook headline data.

R, ggplot2, plotly

Blockbuster Movies with Male Leads Earn More Than Those with Female Leads

Jupyter Notebook for visualizing box office grosses of male-led versus female-led movies.

R, ggplot2

How to Analyze Every Reddit Submission and Comment, in Seconds, for Free

Jupyter Notebook for analyzing and visualizing Reddit data quickly and easily.

R, ggplot2

Analyzing When and Where San Francisco Criminal Arrests Occur

Jupyter Notebook for replicating analysis of when and where arrests in San Francisco occur.

R, ggplot2

How to Visualize New York City Using Taxi Location Data

Jupyter Notebook for analyzing and visualizing NYC Yellow Taxi data.

R, ggplot2
