Quantifying and Visualizing the Reddit Hivemind
If we can find out which topics Reddit users tend to upvote, we can identify what keywords are most attractive to the Reddit hivemind.
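As a rough illustration of the idea, here is a minimal sketch that averages submission scores per title keyword. The function name `mean_score_by_keyword`, the `min_count` threshold, and the sample rows are all assumptions made for this example; they are not taken from the original analysis, which ran against the full Reddit data.

```python
from collections import defaultdict
import re


def mean_score_by_keyword(submissions, min_count=2):
    """Average score per title keyword (hypothetical helper, not the original method).

    submissions: iterable of (title, score) pairs.
    min_count: keep only keywords seen in at least this many titles.
    """
    totals = defaultdict(lambda: [0, 0])  # word -> [score_sum, title_count]
    for title, score in submissions:
        # set() so a word repeated inside one title is counted once
        for word in set(re.findall(r"[a-z']+", title.lower())):
            totals[word][0] += score
            totals[word][1] += 1
    return {w: s / n for w, (s, n) in totals.items() if n >= min_count}


# Made-up sample rows, for illustration only
sample = [
    ("My cat learned a new trick", 120),
    ("New study on cat behavior", 80),
    ("Traffic update", 3),
]
result = mean_score_by_keyword(sample)
```

In this toy sample only "cat" and "new" appear in two titles, so they are the only keywords that survive the `min_count` filter.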
With Reddit data available in BigQuery, running aggregate queries over hundreds of millions of Reddit submissions and comments is trivial.
I reverse-engineered the data and code using R and ggplot2 to create a detailed implementation of bootstrapping and to add a few visual improvements.
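For readers unfamiliar with bootstrapping, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is written in Python with only the standard library rather than R and ggplot2, and the function name `bootstrap_ci`, its parameters, and the sample values are assumptions for illustration, not the original code.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement, same size as the data, and record each mean
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up scores, for illustration only
scores = [3, 7, 8, 12, 15, 21, 22, 30, 41, 55]
lower, upper = bootstrap_ci(scores)
```

The interval endpoints are themselves means of resampled data, so they always fall between the smallest and largest observed values.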
In theory, plotting a million little points in close proximity should simulate the lines of the streets of New York City.
It is pretty easy to scrape Facebook Posts data and turn it into a spreadsheet for easy analysis, although there are a large number of gotchas.