Max Woolf's Blog
Posts
Playing with 80 Million Amazon Product Review Ratings Using Apache Spark
Manipulating actually-big-data is just as easy as performing an analysis on a dataset with only a few records.
January 2, 2017
7 min read
Data Science, Big Data
What Percent of the Top-Voted Comments in Reddit Threads Were Also 1st Comment?
Are commenters ‘late to this thread’ indeed late?
November 7, 2016
7 min read
Data Science, Big Data
Visualizing How Developers Rate Their Own Programming Skills
As it turns out, there is no correlation between programming ability and the frequency of Stack Overflow visits.
July 21, 2016
6 min read
Data Science
Methods for Finding Related Reddit Subreddits with Simple Set Theory
Fancy machine learning approaches may not be required to help Redditors discover new things.
June 20, 2016
5 min read
Data Science, Big Data
How to Create a Network Graph Visualization of Reddit Subreddits
There is very little discussion on how to gather the data for large-scale network graph visualizations, and how to make them. It is …
May 27, 2016
7 min read
Data Science, Big Data
Creating Stylish, High-Quality Word Clouds Using Python and Font Awesome Icons
Why not make a word cloud which looks like a line chart?
May 9, 2016
7 min read
Data Visualization
Blockbuster Movies with Male Leads Earn More Than Those with Female Leads
On average, blockbuster movies with male leads generate 22% more domestic box office revenue, and this difference is statistically …
April 13, 2016
8 min read
Data Science
The Importance of Sanity-Checking Datasets Before Analysis
The 1972 TV Special ‘The Lorax’ is the best movie ever, earning $1.2 billion?
April 6, 2016
6 min read
Thought Piece
Unlimited Data Storage Using Image Steganography and Cat GIFs
tl;dr I was bored and decided to create infinite data in a way that makes people feel fuzzy inside.
March 29, 2016
5 min read
Idea, Comedy, Genius
Facebook Reactions and the Problems With Quantifying Likes Differently
Apparently, there is little statistical relationship between things that are cute and things that make you go YAAASS.
February 29, 2016
6 min read
Data Science, Thought Piece