<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Statistics on Max Woolf&#39;s Blog</title>
    <link>https://minimaxir.com/tag/statistics/</link>
    <description>Recent content in Statistics on Max Woolf&#39;s Blog</description>
    <image>
      <title>Max Woolf&#39;s Blog</title>
      <url>https://minimaxir.com/android-chrome-512x512.png</url>
      <link>https://minimaxir.com/android-chrome-512x512.png</link>
    </image>
    <generator>Hugo</generator>
    <language>en</language>
    <copyright>Copyright Max Woolf © 2026</copyright>
    <lastBuildDate>Fri, 29 Nov 2013 10:30:00 -0700</lastBuildDate>
    <atom:link href="https://minimaxir.com/tag/statistics/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Probabilistically Generating GitHub Projects</title>
      <link>https://minimaxir.com/2013/11/innovation-rng/</link>
      <pubDate>Fri, 29 Nov 2013 10:30:00 -0700</pubDate>
      <guid>https://minimaxir.com/2013/11/innovation-rng/</guid>
      <description>Perl interface to Git repositories via Ruby. Brute force your OpenERP data integration with flatfiles.</description>
      <content:encoded><![CDATA[<p>Grant Slatton made an amusing post on Hacker News yesterday titled &ldquo;<a href="https://news.ycombinator.com/item?id=6815282">Show HN: Probabilistically Generating HN Post Titles</a>&rdquo;. Using <a href="http://en.wikipedia.org/wiki/Markov_chain">Markov chains</a>, a statistical model in which each word is chosen based only on the few words before it, Slatton was able to <a href="http://grantslatton.com/hngen/">generate eerily realistic Hacker News headlines</a> such as &ldquo;Facebook detects if you are not a pilot&rdquo; and &ldquo;The No. 1 Habit of Highly Effective Mediocre Entrepreneurs.&rdquo;</p>
<p>Could Markov chains be applied to other data sets for hilarious effect? By using Slatton&rsquo;s <a href="https://gist.github.com/grantslatton/7694811">Python implementation of Markov chains</a> plus 300,000 descriptions of public GitHub repositories retrieved from the GitHub API, I discovered that statistical randomness can indeed create funny innovation.</p>
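<p>For readers who want a feel for the technique, here is a minimal sketch of a word-level Markov chain generator. It is not Slatton&rsquo;s code: the function names <code>build_chain</code> and <code>generate</code>, the <code>order</code> parameter, the sentinel tokens, and the three sample descriptions are all assumptions made up for illustration. The idea is to record which word follows each pair of consecutive words in the input, then walk those records at random.</p>

```python
import random
from collections import defaultdict

# Hypothetical sentinel tokens marking the start and end of a description.
START = "__start__"
END = "__end__"


def build_chain(descriptions, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    chain = defaultdict(list)
    for text in descriptions:
        # Pad so the first real word has a state, and the last word leads to END.
        words = [START] * order + text.split() + [END]
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return chain


def generate(chain, order=2, max_words=30, rng=random):
    """Walk the chain from the start state until END or `max_words` words."""
    state = (START,) * order
    output = []
    for _ in range(max_words):
        # Duplicates in the list make frequent successors proportionally likelier.
        next_word = rng.choice(chain[state])
        if next_word == END:
            break
        output.append(next_word)
        state = state[1:] + (next_word,)
    return " ".join(output)


if __name__ == "__main__":
    # Made-up sample descriptions, standing in for the real ~300k data set.
    sample = [
        "A simple parser written in Python",
        "A simple web framework written in Ruby",
        "A web framework for Python",
    ]
    chain = build_chain(sample)
    for _ in range(3):
        print(generate(chain))
```

<p>With only a handful of inputs, the output mostly repeats or splices the originals. The amusing mashups need a large corpus, where many descriptions share the same word pairs and the walk can jump from one description to another.</p>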
<figure>

    <img loading="lazy" srcset="/2013/11/innovation-rng/github-wordcloud-mac_hu_6b5a252d5bd0d887.webp 320w,/2013/11/innovation-rng/github-wordcloud-mac_hu_4aef18ab7660ca38.webp 768w,/2013/11/innovation-rng/github-wordcloud-mac.png 900w" src="github-wordcloud-mac.png"/> 
</figure>

<p>You can download a list of 1,000 Markov chain-generated projects <a href="https://dl.dropboxusercontent.com/u/2017402/github_markov.txt">here</a>. Here are a few interesting ones:</p>
<ul>
<li>MaNGOS is a free, Open Source implementation of a tag at relatively random intervals.</li>
<li>A Warhammer 40k simulator to teach myself both OpenGL and Clojure</li>
<li>Perl interface to Git repositories via Ruby.</li>
<li>A windows live messenger network client written in Erlang</li>
<li>Rails plugin which allows to talk anonymously and use tripcodes if you want.</li>
<li>A Firebug extension for displaying the latest from Hacker News</li>
<li>Sinatra-inspired JavaScript node.js web development framework for lua. Inspired by rspec</li>
<li>Inverted Index on top of Tornado</li>
<li>Android LED interface library for various wave propagation techniques.</li>
<li>CatchAPI is a Java API to remove the need for boring project setup.</li>
<li>Adds basic social networking capabilities to your lighting system based on the concept of the Working with Rails</li>
<li>Brute force your OpenERP data integration with flatfiles</li>
<li>Culerity integrates Cucumber and Celerity in order to shutdown the computer.</li>
<li>Parses ANSI color codes and converts them to iphone compatible mp4s using HandBrake</li>
<li>A simple OFX (Open Financial Exchange) parser built on top of WordPress. Rolopress core theme</li>
</ul>
<hr>
<p><em>The code used to get the project descriptions from the GitHub API is available in <a href="https://github.com/minimaxir/get-github-repo-descriptions">this GitHub repository</a>, and you can download the ~300k repo descriptions <a href="https://dl.dropboxusercontent.com/u/2017402/github_repo_desc.zip">here</a>. [5MB .zip]</em></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
