Portfolio

A list of my open-sourced projects on GitHub

GPT-2 Small

Webapp to generate text from the default GPT-2 117M model.

GPT-2 Reddit

Webapp to generate Reddit submission titles conditioned on a subreddit/keywords from a finetuned GPT-2 model.

GPT-2 Magic: The Gathering

Webapp to generate text + a Magic card image from a finetuned GPT-2 model.

A Visual Overview of Stack Overflow’s Question Tags

R Notebook on how to process and analyze Stack Overflow data.

Analyzing IMDb Data The Intended Way, with R and ggplot2

R Notebook on how to process and visualize the official IMDb datasets.

Copy Syntax Highlight for macOS

macOS service which copies the selected text to the clipboard, with proper syntax highlighting for the given language.

Get all Hacker News Submissions/Comments

Simple Python scripts to download all Hacker News submissions and comments and store them in a PostgreSQL database.

GitHub Repo Stargazers

A script used to get the GitHub profile information of all the people who’ve starred a given GitHub repository.

gpt-2-cloud-run

Text-generation API via GPT-2 for Cloud Run

gpt-2-simple

Python package to easily retrain OpenAI’s GPT-2 text-generating model on new texts

Hacker News Undocumented

Some of the hidden norms about Hacker News not otherwise covered in the Guidelines and the FAQ.

Is it a Duck or a Bird?

Python code to submit rotated images to the Cloud Vision API + R code for visualizing it

Magic: The GIFening

A Twitter bot which tweets Magic: The Gathering cards with appropriate GIFs superimposed onto them.

ML Data Generator

Python script to generate fake datasets optimized for testing machine learning/deep learning workflows

Person Blocker

Automatically “block” people in images (like Black Mirror) using a pretrained neural network.

Playing with 80 Million Amazon Product Review Ratings

R Notebook for analyzing millions of Amazon reviews using Apache Spark.

Predicting And Mapping Arrest Types in San Francisco

R Notebook for predicting arrest types in San Francisco.

Pretrained Character Embeddings for Deep Learning and Automatic Text Generation

R Notebook with visualizations of character embeddings from derived 300D character vectors.

Problems with Predicting Post Performance on Reddit and Other Link Aggregators

R Notebook on how to process and visualize both Reddit and Hacker News data.

Stylistic Word Clouds

Python scripts for creating stylistic word clouds

textgenrnn

Easily generate text using a pretrained character-based recurrent neural network.

The Decline of Imgur on Reddit and the Rise of Reddit’s Native Image Hosting

R Notebook for analyzing the decline of Imgur on Reddit.

Video to GIF macOS

A set of utilities that allow the user to easily convert video files to very-high-quality GIFs on macOS.

Visualizing One Million NCAA Basketball Shots

R Notebook on how to process and visualize NCAA basketball data.

What Percent of the Top-Voted Comments in Reddit Threads Were Also 1st Comment?

R Notebook for querying, analyzing, and visualizing Reddit data to determine the impact of the first comment in a Reddit thread.