Locating All the Christmas Trees on Instagram
I downloaded *hundreds of thousands* of #tree images and found 25,432 that were taken on Christmas, are tagged #tree, and, most importantly, contain location data showing where the photo was taken.
Jan 1, 2015
A Statistical Analysis of 142 Million Reddit Submissions
I constructed a database to store all Reddit submissions from November 2007 to the end of October 2014: 142,159,793 submissions in total. And this data is very curious and very, *very* memetic.
Dec 16, 2014
The Quality, Popularity, and Negativity of 5.6 Million Hacker News Comments
Hopefully, these comments will answer whether Hacker News is experiencing a rise in quality, or whether the complaints leveled against HN are valid.
Oct 6, 2014
The Least Effective Method For Blocking Web Scraping of a Website
What was on Page #10 shocked me. OMG. I could not believe my eyes.
Sep 26, 2014
The Statistical Difference Between 1-Star and 5-Star Reviews on Yelp
It can be proven that language has a strong statistical effect on review ratings, but that is intuitive enough. How have review ratings changed?
Sep 23, 2014
The Data From Our Comments to the FCC About Net Neutrality
The FCC released a dataset of about 450,000 comments about net neutrality. Looking at the data behind these comments, it is clear that the entire country is passionately opposed to the rule changes to net neutrality.
Aug 8, 2014