Data Science

Quantifying the Clickbait and Linkbait in BuzzFeed Article Titles

You probably do not know that the 3 most interesting things I found will blow your mind.

Locating All the Christmas Trees on Instagram

I downloaded *hundreds of thousands* of #tree images and found 25,432 images which were taken on Christmas, have a #tree, and, most importantly, contain location data where the photo was taken.

A Statistical Analysis of 142 Million Reddit Submissions

I constructed a database to store all Reddit submissions from November 2007 to the end of October 2014: 142,159,793 submissions in total. And this data is very curious and very, *very* memetic.

The Quality, Popularity, and Negativity of 5.6 Million Hacker News Comments

Hopefully, these comments will answer whether Hacker News is experiencing a rise in quality, or if the complaints leveled against HN are valid.

The Least Effective Method For Blocking Web Scraping of a Website

What was on Page #10 shocked me. OMG. I could not believe my eyes.

The Statistical Difference Between 1-Star and 5-Star Reviews on Yelp

It can be proven that language has a strong statistical effect on review ratings, but that is intuitive enough. How have review ratings changed?

The Data From Our Comments to the FCC About Net Neutrality

The FCC released a dataset of about 450,000 comments against net neutrality. Looking at the data behind these comments, it is clear that the entire country is passionately opposed to the rule changes to net neutrality.

The Wikipedia Entries Which Are Most-Edited by Members of the U.S. Congress

Saying that the results were surprising would be the understatement of the century.

Impact of the New Show HN Section on Show HN Submissions

Did this feature help or harm Show HN submissions as a whole?

Who Performs the Best in Online Classes?

Which student characteristics lead to the best performance in online classes? That depends on how you define "performance."