Data Science

Mapping Where Arrests Frequently Occur in San Francisco Using Crime Data

Let's plot 587,499 arrests on top of a map of San Francisco for fun and see what happens.

Analyzing San Francisco Crime Data to Determine When Arrests Frequently Occur

Spoiler: most arrests in San Francisco happen on Wednesdays between 4 and 5 PM. For some reason.

How to Visualize New York City Using Taxi Location Data and ggplot2

I previously posted a visualization of NYC taxis made with ggplot2. Due to popular demand, I've cleaned up the code and released it as open source, with a few improvements.

Quantifying and Visualizing the Reddit Hivemind

If we can find out which topics Reddit users tend to upvote, we can identify what keywords are most attractive to the Reddit hivemind.

How to Analyze Every Reddit Submission and Comment, in Seconds, for Free

With Reddit data in BigQuery, querying all of the hundreds of millions of Reddit submissions and comments is trivial.

Coding, Visualizing, and Animating Bootstrap Resampling

I reverse-engineered data and code in R and ggplot2 to create detailed implementations of bootstrapping, and added a few visual improvements along the way.
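
For readers unfamiliar with the technique, here is a minimal sketch of bootstrap resampling. It is written in Python with only the standard library, not the R and ggplot2 code the post describes. The sample values, the function names `bootstrap_means` and `percentile_interval`, and the choice of the mean as the statistic are all illustrative assumptions.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample drawn from `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values from the sample *with replacement*.
        resample = rng.choices(sample, k=n)
        means.append(statistics.fmean(resample))
    return means


def percentile_interval(values, lower=2.5, upper=97.5):
    """Approximate percentile interval using nearest-rank lookup."""
    ordered = sorted(values)

    def pick(percent):
        index = round(percent / 100 * (len(ordered) - 1))
        return ordered[index]

    return pick(lower), pick(upper)


# Hypothetical observations, made up for illustration only.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

means = bootstrap_means(sample)
low, high = percentile_interval(means)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"approximate 95% bootstrap interval for the mean: [{low:.2f}, {high:.2f}]")
```

The spread of the resampled means stands in for the sampling variability of the statistic, which is what a histogram or animation of the resamples would show.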

Plotting a Map of New York City Using Only Taxi Location Data

In theory, plotting a million little points in close proximity should simulate the lines of the streets of New York City.
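
The idea can be sketched in a few lines: bin each (longitude, latitude) point into a grid cell, then mark every cell that received at least one point. The example below is Python rather than the ggplot2 used in the post. The two synthetic "streets", the unit-square bounds, and the helper names `bin_points` and `render` are hypothetical stand-ins for real taxi coordinates.

```python
from collections import Counter


def bin_points(points, lon_min, lon_max, lat_min, lat_max, cols, rows):
    """Count how many points fall into each (row, col) cell of a grid."""
    counts = Counter()
    for lon, lat in points:
        # Skip points outside the bounding box.
        if not (lon_min <= lon < lon_max and lat_min <= lat < lat_max):
            continue
        col = int((lon - lon_min) / (lon_max - lon_min) * cols)
        row = int((lat - lat_min) / (lat_max - lat_min) * rows)
        counts[(row, col)] += 1
    return counts


def render(counts, cols, rows):
    """Draw the grid as text, with north (the highest row) at the top."""
    lines = []
    for row in reversed(range(rows)):
        lines.append(
            "".join("#" if counts[(row, col)] else "." for col in range(cols))
        )
    return "\n".join(lines)


# Synthetic data: one east-west "street" and one north-south "street".
points = [(x / 100, 0.5) for x in range(100)]
points += [(0.25, y / 100) for y in range(100)]

counts = bin_points(points, 0.0, 1.0, 0.0, 1.0, cols=8, rows=4)
print(render(counts, cols=8, rows=4))
```

With enough points and a fine enough grid, the occupied cells trace out the paths the points were sampled along, with no street data required.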

Why is the Most-Viewed Gaming Video on YouTube About Cars 2?

No, this is not an error. You can watch the video yourself on YouTube and verify the view count.

Analyzing the Patterns of Numbers in 10 Million Passwords

Numbers in passwords follow many patterns, and those patterns involve surprising yet intuitive logic.

An Introduction on How to Make Beautiful Charts With R and ggplot2

Adding a touch of color and design can help make more compelling visualizations, thanks to ggplot2's syntax and chaining capabilities.