<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Programming on Max Woolf&#39;s Blog</title>
    <link>https://minimaxir.com/tag/programming/</link>
    <description>Recent content in Programming on Max Woolf&#39;s Blog</description>
    <image>
      <title>Max Woolf&#39;s Blog</title>
      <url>https://minimaxir.com/android-chrome-512x512.png</url>
      <link>https://minimaxir.com/android-chrome-512x512.png</link>
    </image>
    <generator>Hugo</generator>
    <language>en</language>
    <copyright>Copyright Max Woolf © 2026</copyright>
    <lastBuildDate>Tue, 23 Sep 2014 08:00:00 -0700</lastBuildDate>
    <atom:link href="https://minimaxir.com/tag/programming/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>The Statistical Difference Between 1-Star and 5-Star Reviews on Yelp</title>
      <link>https://minimaxir.com/2014/09/one-star-five-stars/</link>
      <pubDate>Tue, 23 Sep 2014 08:00:00 -0700</pubDate>
      <guid>https://minimaxir.com/2014/09/one-star-five-stars/</guid>
      <description>It can be proven that language has a strong statistical effect on review ratings, but that is intuitive enough. How have review ratings changed?</description>
<content:encoded><![CDATA[<p>Many businesses in the real world encourage their customers to &ldquo;Rate us on Yelp!&rdquo; <a href="http://www.yelp.com/">Yelp</a>, the &ldquo;best way to find local businesses,&rdquo; relies on user reviews to help its users find the best places. Both positive and negative reviews are helpful in this mission: positive reviews on Yelp identify the best places, while negative reviews identify places where people <em>shouldn&rsquo;t</em> go. Usually, both positive and negative reviews are based not on objective attributes of the business, but on the experience the writer had with the establishment.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp_review_pos_hu_ddcb34306c4121d.webp 320w,/2014/09/one-star-five-stars/yelp_review_pos.png 620w" src="yelp_review_pos.png"/> 
</figure>

<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp_review_neg_hu_5472e07a6e063134.webp 320w,/2014/09/one-star-five-stars/yelp_review_neg.png 633w" src="yelp_review_neg.png"/> 
</figure>

<p>I analyzed the language present in 1,125,458 Yelp Reviews using the dataset from the <a href="http://www.yelp.com/dataset_challenge">Yelp Dataset Challenge</a> containing reviews of businesses in the cities of Phoenix, Las Vegas, Madison, Waterloo and Edinburgh. Users can rate businesses 1, 2, 3, 4, or 5 stars. When comparing the most-frequent two-word phrases between 1-star and 5-star reviews, the difference is apparent.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/Yelp-2-Gram-Small_hu_a3184278e17792da.webp 320w,/2014/09/one-star-five-stars/Yelp-2-Gram-Small_hu_93816ee646e301fc.webp 768w,/2014/09/one-star-five-stars/Yelp-2-Gram-Small_hu_6c0c9d7f59903afe.webp 1024w,/2014/09/one-star-five-stars/Yelp-2-Gram-Small.jpg 1200w" src="Yelp-2-Gram-Small.jpg"/> 
</figure>

<p>The 5-star Yelp reviews contain many instances of &ldquo;Great&rdquo;, &ldquo;Good&rdquo;, and &ldquo;Happy&rdquo;. In contrast, the 1-star Yelp reviews use very little positive language, and instead discuss the number of &ldquo;minutes,&rdquo; presumably after long and unfortunate waits at the establishment. (Las Vegas is one of the cities where the reviews were collected, which is why it appears prominently in both 1-star and 5-star reviews.)</p>
<p>Looking at three-word phrases tells more of a story.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/Yelp-3-Gram-Small_hu_e46cca474f4fc455.webp 320w,/2014/09/one-star-five-stars/Yelp-3-Gram-Small_hu_d70187d5b3a7bd24.webp 768w,/2014/09/one-star-five-stars/Yelp-3-Gram-Small_hu_51480d3275b8941b.webp 1024w,/2014/09/one-star-five-stars/Yelp-3-Gram-Small.jpg 1200w" src="Yelp-3-Gram-Small.jpg"/> 
</figure>

<p>1-star reviews frequently contain warnings to potential customers, promises that the author will &ldquo;never go back&rdquo;, and a strong impression that issues stem from conflicts with &ldquo;the front desk&rdquo;, such as those at hotels. 5-star reviews &ldquo;love this place&rdquo; and &ldquo;can&rsquo;t wait to&rdquo; go back.</p>
<p>Can this language be used to predict reviews?</p>
<h2 id="regression-of-language">Regression of Language</h2>
<p>To determine the impact of positive and negative words on the number of stars given in a review, we can perform a simple linear regression of stars on the number of positive words in the review, the number of negative words in the review, and the number of words in the review itself (since the length of the review is related to the number of positive/negative words: the longer the review, the more words).</p>
<p>A quick-and-dirty way to determine the number of positive/negative words in a given Yelp review is to compare each word of the review against a lexicon of positive/negative words, and count the number of review words in the lexicon. In this case, I use the <a href="http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html">lexicons compiled by UIC professor Bing Liu</a>.</p>
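<p>As a rough illustration, here&rsquo;s a minimal sketch of that setup in R. The column names (<code>stars</code>, <code>text</code>) and the lexicon objects are assumptions for illustration, not necessarily the exact code used for this post:</p>
<pre tabindex="0"><code># reviews: data frame with a `stars` rating and the review `text`
# pos_lexicon / neg_lexicon: character vectors of the Liu lexicon words
count_matches &lt;- function(text, lexicon) {
  words &lt;- unlist(strsplit(tolower(text), "[^a-z']+"))
  sum(words %in% lexicon)
}

reviews$review_words &lt;- sapply(strsplit(reviews$text, "\\s+"), length)
reviews$pos_words &lt;- sapply(reviews$text, count_matches, lexicon = pos_lexicon)
reviews$neg_words &lt;- sapply(reviews$text, count_matches, lexicon = neg_lexicon)

fit &lt;- lm(stars ~ pos_words + neg_words + review_words, data = reviews)
summary(fit)
</code></pre>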
<p>Running a regression of # stars in a Yelp review on # positive words, # negative words, and # words in the review returns these results:</p>
<pre tabindex="0"><code>Coefficients:
               Estimate	 Std. Error  t value  Pr(&gt;|t|)
(Intercept)    3.692      1.670e-03  2210.0   &lt;2e-16 ***
pos_words      0.122      2.976e-04   411.3   &lt;2e-16 ***
neg_words     -0.154      4.887e-04  -315.9   &lt;2e-16 ***
review_words  -0.003      1.984e-05  -169.4   &lt;2e-16 ***


Residual standard error: 1.119 on 1125454 degrees of freedom
Multiple R-squared:  0.2589,	Adjusted R-squared:  0.2589
F-statistic: 1.311e+05 on 3 and 1125454 DF,  p-value: &lt; 2.2e-16
</code></pre><p>The regression output explains these things:</p>
<ul>
<li>If a reviewer posted a blank review with no text in it, the model predicts an average rating of 3.692.</li>
<li>Each positive word increases the predicted star rating by 0.122 on average (e.g. 8 positive words indicate a 1-star increase)</li>
<li>Each negative word decreases the predicted star rating by 0.154 on average (e.g. 6-7 negative words indicate a 1-star decrease)</li>
<li>The number of words in the review has a lesser, negative effect. (A 333-word review indicates a 1-star decrease, but the average Yelp review is about 130 words)</li>
<li>This model explains 25.89% of the variation in the number of stars given in a review. This sounds like a low percentage, but it is impressive for such a simple model using unstructured real-world data.</li>
</ul>
<p>All of these conclusions are <em>extremely</em> statistically significant due to the large sample size.</p>
<p>Additionally, you could rephrase the regression as a logistic classification problem, where reviews rated 1, 2, or 3 stars are classified as &ldquo;negative,&rdquo; and reviews rated 4 or 5 stars are classified as &ldquo;positive.&rdquo; Then, run a logistic regression to determine the likelihood that a given review is positive. Running this regression (not shown) results in a logistic model with up to <em>75% accuracy</em>, a noted improvement over the &ldquo;no information rate&rdquo; of 66%, which is the accuracy of a model that simply guesses that every review is positive. The logistic model also reaches similar conclusions for the predictor variables as the linear model.</p>
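<p>For the curious, a hedged sketch of that classification variant, reusing the hypothetical data frame from the earlier sketch (a reconstruction, not the exact code behind the accuracy figures above):</p>
<pre tabindex="0"><code># label reviews with 4 or 5 stars as "positive"
reviews$positive &lt;- as.integer(reviews$stars &gt;= 4)

logit_fit &lt;- glm(positive ~ pos_words + neg_words + review_words,
                 data = reviews, family = binomial)

# classify with a 50% probability cutoff and compute accuracy
pred &lt;- predict(logit_fit, type = "response") &gt; 0.5
mean(pred == reviews$positive)
</code></pre>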
<p>It can be proven that language has a strong statistical effect on review ratings, but that&rsquo;s intuitive enough. How have review ratings changed?</p>
<h2 id="1-star-and-5-star-reviews-visualized">1-Star and 5-Star Reviews, Visualized</h2>
<p>Since 2005, Yelp has had incredible growth in the number of new reviews.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-time-series_hu_c7a13c976c5495e.webp 320w,/2014/09/one-star-five-stars/yelp-review-time-series_hu_a1b4db49122e2298.webp 768w,/2014/09/one-star-five-stars/yelp-review-time-series_hu_f6943ed84c603de9.webp 1024w,/2014/09/one-star-five-stars/yelp-review-time-series.png 1200w" src="yelp-review-time-series.png"/> 
</figure>

<p>From that chart, it appears that each of the five rating brackets has grown at the same rate, but that isn&rsquo;t the case. Here&rsquo;s a chart showing how the proportions of new reviews of each rating have changed over time.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-time-proportion_hu_153ca7093861ad13.webp 320w,/2014/09/one-star-five-stars/yelp-review-time-proportion_hu_ae3ee3f52d98ee95.webp 768w,/2014/09/one-star-five-stars/yelp-review-time-proportion_hu_7953d250442d19e0.webp 1024w,/2014/09/one-star-five-stars/yelp-review-time-proportion.png 1200w" src="yelp-review-time-proportion.png"/> 
</figure>

<p>Early Yelp had mostly 4-star and 5-star reviews, as one might expect for an early Web 2.0 startup: the only users who would put in the effort to write a review were those who had positive experiences. However, the behavior from 2010 onward is interesting: the relative proportions of both 1-star reviews <em>and</em> 5-star reviews increase over time.</p>
<p>As a result, the proportions of ratings in reviews from Yelp&rsquo;s beginning in 2005 and its present in 2014 are incredibly different.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/Yelp-2005-2014_hu_39ad524a7eb9df70.webp 320w,/2014/09/one-star-five-stars/Yelp-2005-2014_hu_c9fa061ae2bda217.webp 768w,/2014/09/one-star-five-stars/Yelp-2005-2014_hu_395f9de928083b7c.webp 1024w,/2014/09/one-star-five-stars/Yelp-2005-2014.png 1600w" src="Yelp-2005-2014.png"/> 
</figure>

<p>More negativity, more positivity. Do they cancel out?</p>
<h2 id="how-positive-are-yelp-reviews">How Positive Are Yelp Reviews?</h2>
<p>We can calculate the relative <strong>positivity</strong> of a review by taking the number of positive words in the review and dividing it by the total number of words in the review.</p>
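<p>With the word counts from the earlier sketch, positivity (and the negativity score used later) is a simple ratio; again, the column names are assumptions:</p>
<pre tabindex="0"><code># positivity/negativity as fractions of total review words
reviews$positivity &lt;- reviews$pos_words / reviews$review_words
reviews$negativity &lt;- reviews$neg_words / reviews$review_words

mean(reviews$positivity)  # the post reports ~5.6% overall
mean(reviews$negativity)  # the post reports ~2.0% overall
</code></pre>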
<p>The average positivity among all reviews is <em>5.6%</em>. Over time, the positivity has been relatively flat.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-time-series-positivity_hu_f3209866d8404a7f.webp 320w,/2014/09/one-star-five-stars/yelp-review-time-series-positivity_hu_31c147031a1203e7.webp 768w,/2014/09/one-star-five-stars/yelp-review-time-series-positivity_hu_3f2e651879ad63c4.webp 1024w,/2014/09/one-star-five-stars/yelp-review-time-series-positivity.png 1200w" src="yelp-review-time-series-positivity.png"/> 
</figure>

<p>Flat, but still increasing, most likely due to the increasing proportion of 5-star reviews. But the proportion of 1-star reviews also increased: do the two offset each other?</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-positivity_hu_637131981fcef452.webp 320w,/2014/09/one-star-five-stars/yelp-review-positivity_hu_c921212195eb39d7.webp 768w,/2014/09/one-star-five-stars/yelp-review-positivity_hu_41cd19db408d2367.webp 1024w,/2014/09/one-star-five-stars/yelp-review-positivity.png 1200w" src="yelp-review-positivity.png"/> 
</figure>

<p>This histogram of positivity scores shows that 1-star reviews have low positivity and rarely high positivity, while 5-star reviews rarely have low positivity and instead skew very high. The distribution for each star rating is close to a <a href="http://en.wikipedia.org/wiki/Normal_distribution">Normal distribution</a>, with each successive rating category peaking at a higher positivity value.</p>
<p>The relative proportion of each star rating reinforces this.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-positivity-density_hu_f23db88cd1ac716.webp 320w,/2014/09/one-star-five-stars/yelp-review-positivity-density_hu_76cfe57f52f13af.webp 768w,/2014/09/one-star-five-stars/yelp-review-positivity-density_hu_fc116b0474c61bb9.webp 1024w,/2014/09/one-star-five-stars/yelp-review-positivity-density.png 1200w" src="yelp-review-positivity-density.png"/> 
</figure>

<p>Over half of the 0% positivity reviews are 1-star reviews, while over three-quarters of the reviews at the highest positivity levels are 5-star reviews. (Note that the 2-star, 3-star, and 4-star ratings are not as significant at either extreme.)</p>
<h2 id="how-negative-are-yelp-reviews">How Negative Are Yelp Reviews?</h2>
<p>When working with the negativity of reviews, calculated by taking the number of negative words and dividing it by the total number of words in the review, the chart looks much different.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-time-series-negativity_hu_86f7acc4985c237b.webp 320w,/2014/09/one-star-five-stars/yelp-review-time-series-negativity_hu_1809307409787cc1.webp 768w,/2014/09/one-star-five-stars/yelp-review-time-series-negativity_hu_d07d1e1fdef155a9.webp 1024w,/2014/09/one-star-five-stars/yelp-review-time-series-negativity.png 1200w" src="yelp-review-time-series-negativity.png"/> 
</figure>

<p>The average negativity among all reviews is <em>2.0%</em>. Since the average positivity is 5.6%, this implies that the net sentiment among all reviews is positive, despite the increase in 1-star reviews over time.</p>
<p>The histogram of negative reviews looks much different as well.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-negativity_hu_d5d4e839efa115f7.webp 320w,/2014/09/one-star-five-stars/yelp-review-negativity_hu_757c09188bd04406.webp 768w,/2014/09/one-star-five-stars/yelp-review-negativity_hu_9b14213657352703.webp 1024w,/2014/09/one-star-five-stars/yelp-review-negativity.png 1200w" src="yelp-review-negativity.png"/> 
</figure>

<p>Even 1-star reviews aren&rsquo;t completely negative all the time.</p>
<p>The chart is heavily skewed right, making it difficult to determine the proportions of each rating at first glance.</p>
<p>So here&rsquo;s another proportion chart.</p>
<figure>

    <img loading="lazy" srcset="/2014/09/one-star-five-stars/yelp-review-negativity-density_hu_dba0956b28ad9c05.webp 320w,/2014/09/one-star-five-stars/yelp-review-negativity-density_hu_6c544b1696ff5283.webp 768w,/2014/09/one-star-five-stars/yelp-review-negativity-density_hu_60ad65564b53f2bd.webp 1024w,/2014/09/one-star-five-stars/yelp-review-negativity-density.png 1200w" src="yelp-review-negativity-density.png"/> 
</figure>

<p>At low negativity, the proportions of negative review scores (1-star, 2-stars, 3-stars) and positive review scores (4-stars, 5-stars) are about equal, implying that negative reviews can be just as civil as positive reviews. But high negativity is solely present in 1-star and 2-star reviews.</p>
<p>From this article, you&rsquo;ve seen that 5-star Yelp reviews are generally positive and 1-star Yelp reviews are generally negative. Yes, this blog post is essentially &ldquo;Pretty Charts Made By Captain Obvious,&rdquo; but what&rsquo;s important is the confirmation of these assumptions. Language plays a huge role in determining the ratings of reviews, and that knowledge could be applied to many other industries and review websites.</p>
<h2 id="four-stars">Four Stars</h2>
<p>I&rsquo;d give this blog post a solid 4-stars. The content was great, but the length was long, although not as long as <a href="http://minimaxir.com/2014/06/reviewing-reviews/">some others</a>. Can&rsquo;t wait to read this post again!</p>
<hr>
<ul>
<li><em>Yelp reviews were preprocessed with Python by simultaneously converting the data from JSON to a tabular structure, tokenizing the words in each review, counting the positive/negative words, and storing bigrams and trigrams in a dictionary to later be exported for creating the word clouds.</em></li>
<li><em>All data analysis was performed using R, and all charts were made using ggplot2. <a href="http://www.pixelmator.com/">Pixelmator</a> was used to manually add relevant annotations when necessary.</em></li>
<li><em>You can view both the Python and R code used to process and chart the data <a href="https://github.com/minimaxir/yelp-review-analysis">in this GitHub repository</a>. Note that since Yelp prevents redistribution of the data, the code may not be reproducible.</em></li>
<li><em>You can download full-resolution PNGs of the two word clouds [5000x2000px] in <a href="https://www.dropbox.com/s/f20gwh9jvkibi4z/Yelp_Wordclouds_5000_200.zip?dl=0">this ZIP file</a> [18 MB]</em></li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>The Interesting Percentages of Female Students in MIT and Harvard Online Courses</title>
      <link>https://minimaxir.com/2014/07/gender-course/</link>
      <pubDate>Fri, 04 Jul 2014 10:30:00 -0700</pubDate>
      <guid>https://minimaxir.com/2014/07/gender-course/</guid>
      <description>The proportion of female students in each of Harvard and MIT&amp;rsquo;s online courses range from 5% to 49%.</description>
      <content:encoded><![CDATA[<p>At the end of May, <a href="http://www.harvard.edu/">Harvard</a> and <a href="http://web.mit.edu/">MIT</a> jointly <a href="http://newsoffice.mit.edu/2014/mit-and-harvard-release-de-identified-learning-data-open-online-courses">released a dataset</a> containing statistics about their online courses in the Academic Year of 2013. This <a href="http://dx.doi.org/10.7910/DVN/26147">Person-Course De-Identified dataset</a> contains 476,532 students who have taken up to 13 unique courses from a variety of topics:</p>
<figure>

    <img loading="lazy" srcset="/2014/07/gender-course/mit-harvard-courses_hu_7940e4f3b6f7a13a.webp 320w,/2014/07/gender-course/mit-harvard-courses.png 560w" src="mit-harvard-courses.png"/> 
</figure>

<p>About half of the courses involve subjects in the humanities, while the other half involve computer science and electrical engineering.</p>
<p>One of the statistics I wanted to analyze was the gender ratio of students in online courses. In the data set, 425,105 students have a gender on record, with 311,534 male students (73.3%) and 113,571 female students (26.7%). This population proportion of female students is surprisingly low, especially since the male/female ratio is <a href="http://colleges.findthebest.com/q/1929/1270/What-is-the-male-to-female-ratio-at-Harvard-University">about 50:50</a> at MIT and Harvard themselves.</p>
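<p>These proportions reduce to a few lines of R; here&rsquo;s a minimal sketch, assuming column names like <code>gender</code> and <code>course_id</code> (check the dataset&rsquo;s codebook for the real ones):</p>
<pre tabindex="0"><code># person_course: one row per (student, course) pair
with_gender &lt;- subset(person_course, gender %in% c("m", "f"))

# overall proportion of female students
mean(with_gender$gender == "f")

# proportion of female students per course
tapply(with_gender$gender == "f", with_gender$course_id, mean)
</code></pre>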
<p>Therefore, I took a look at the gender distribution of each of the 13 unique courses. Is the gender ratio similar across all classes, or is there a huge difference between classes?</p>
<figure>

    <img loading="lazy" srcset="/2014/07/gender-course/course-female_hu_8a3574152a2f4856.webp 320w,/2014/07/gender-course/course-female_hu_b147ce256fa08c5b.webp 768w,/2014/07/gender-course/course-female_hu_97ec908d28732b5f.webp 1024w,/2014/07/gender-course/course-female.png 1500w" src="course-female.png"/> 
</figure>

<p>Yeah, there&rsquo;s a huge difference.</p>
<p>The proportion of female students in each of Harvard and MIT&rsquo;s online courses range from <strong>5% to 49%</strong>.</p>
<p>The top half of the gender ratios are all well above the 26.7% average. All six of these courses are in the humanities or the life sciences. The bottom half of the gender ratios are all well below the 26.7% average. All seven of these courses are engineering or computer science courses with a strong focus on mathematics. (For clarification, the <a href="https://www.edx.org/course/mitx/mitx-2-01x-elements-structures-1759#.U7ZfKvldV8F">Elements of Structures</a> course at MIT is a physics course with linear algebra programming.)</p>
<p>Why is the overall proportion so low, then? As it turns out, both Harvard&rsquo;s Introduction to Computer Science I (169,621 students; about 40% of all students) and MIT&rsquo;s Introduction to CS/Programming (124,446 students across both semesters) are so popular that the low percentage of women in those particular classes drastically drags down the average.</p>
<p>The presence and interest of <a href="http://www.whitehouse.gov/administration/eop/ostp/women">women in STEM fields</a> (science, technology, engineering, and mathematics) has been a topic of <a href="http://www.huffingtonpost.com/stella-kasdagli/should-women-avoid-jobs-in-stem_b_5549016.html">controversy</a> for a very long time. However, the chart shows that indeed the percentage of women interested in STEM classes is measurably lower than other fields, and hopefully awareness of this issue will help cause changes in the future.</p>
<hr>
<ul>
<li><em>Data was processed using R and the chart was made using ggplot2. (w/ a few annotations added using a photo editor)</em></li>
<li><em>You can view code necessary to reproduce these results in <a href="https://github.com/minimaxir/gender-course">this GitHub repository</a>. Since MIT/Harvard prevent redistribution of the dataset, you&rsquo;ll have to <a href="http://dx.doi.org/10.7910/DVN/26147">download the dataset</a> yourself.</em></li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>A Statistical Analysis of 1.2 Million Amazon Reviews</title>
      <link>https://minimaxir.com/2014/06/reviewing-reviews/</link>
      <pubDate>Tue, 17 Jun 2014 08:20:00 -0700</pubDate>
      <guid>https://minimaxir.com/2014/06/reviewing-reviews/</guid>
      <description>Analyzing the dataset of 1.2 million Amazon reviews, I found some interesting statistical trends; some are intuitive and obvious, but others give insight to how Amazon&amp;rsquo;s review system actually works.</description>
      <content:encoded><![CDATA[<p>When buying the latest products on <a href="http://www.amazon.com/">Amazon</a>, reading reviews is an important part of the purchasing process.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/ore_hu_a023cb91d2d5bbec.webp 320w,/2014/06/reviewing-reviews/ore.png 554w" src="ore.png"/> 
</figure>

<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amazon-review_hu_e37f25bba24a903e.webp 320w,/2014/06/reviewing-reviews/amazon-review.png 495w" src="amazon-review.png"/> 
</figure>

<p>Customer reviews from customers who have actually purchased and used the product in question can give you more context to the product itself. Each reviewer rates the product from 1 to 5 stars, and provides a text summary of their experiences and opinions about the product. The ratings for each product are averaged together in order to get an overall product rating.</p>
<p>The number of reviews on Amazon has grown over the years.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-basic-time-count_hu_8c9a16a5c5892b45.webp 320w,/2014/06/reviewing-reviews/amzn-basic-time-count_hu_9ed51550cf6967d7.webp 768w,/2014/06/reviewing-reviews/amzn-basic-time-count_hu_5718b80f7ce8a708.webp 1024w,/2014/06/reviewing-reviews/amzn-basic-time-count.png 1200w" src="amzn-basic-time-count.png"/> 
</figure>

<p>But how do people write reviews? What types of ratings do reviewers give? How many of these reviews are considered helpful?</p>
<p>Stanford researchers Julian McAuley and Jure Leskovec collected <a href="https://snap.stanford.edu/data/web-Amazon.html">all Amazon reviews</a> from the service&rsquo;s online debut in 1995 to 2013. Analyzing the dataset of 1.2 million Amazon reviews of products in the Electronics section, I found some interesting statistical trends; some are intuitive and obvious, but others give insight into how Amazon&rsquo;s review system actually works.</p>
<h2 id="describing-the-data">Describing the Data</h2>
<p>First, let&rsquo;s see how the user ratings are distributed among the reviews.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-basic-score_hu_59c031b5274368e7.webp 320w,/2014/06/reviewing-reviews/amzn-basic-score_hu_6eb4ff83d005ab3.webp 768w,/2014/06/reviewing-reviews/amzn-basic-score_hu_c6472b19e29a9fe6.webp 1024w,/2014/06/reviewing-reviews/amzn-basic-score.png 1200w" src="amzn-basic-score.png"/> 
</figure>

<p>More than half of the reviews give a 5-star rating. Aside from perfect reviews, most reviewers give 4-star or 1-star ratings, with relatively few giving 2-star or 3-star ratings.</p>
<p>As a result, the statistical average for all review ratings is on the high end of the scale at about <strong>3.90</strong>. In fact, the average rating of newly-written reviews has varied from 3.4 to 4.2 over time.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-basic-time-rating_hu_52cea30c7eeb25e2.webp 320w,/2014/06/reviewing-reviews/amzn-basic-time-rating_hu_2957bab5d470c910.webp 768w,/2014/06/reviewing-reviews/amzn-basic-time-rating_hu_6a7aecd8bbdede75.webp 1024w,/2014/06/reviewing-reviews/amzn-basic-time-rating.png 1200w" src="amzn-basic-time-rating.png"/> 
</figure>

<p>Another metric used to measure reviews is review helpfulness. Other Amazon reviewers can rate a particular review as &ldquo;helpful&rdquo; or &ldquo;not helpful.&rdquo; A &ldquo;review helpfulness&rdquo; statistic can be calculated by taking the number of &ldquo;is-helpful&rdquo; indicators and dividing it by the total number of is-helpful/is-not-helpful indicators (in the example at the beginning of the article, 639/665 people found the review helpful, so the helpfulness rating would be 96%). This gives an indication of review quality to a prospective buyer. Only 10% of the reviews had at least 10 is-helpful/is-not-helpful data points, and of those reviews, the vast majority had perfect helpfulness scores.</p>
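<p>A quick sketch of that calculation, assuming vote-count fields named <code>helpful_yes</code> and <code>helpful_total</code> (hypothetical names for illustration):</p>
<pre tabindex="0"><code># helpfulness = is-helpful votes / total votes (NaN when no votes exist)
reviews$helpfulness &lt;- reviews$helpful_yes / reviews$helpful_total

639 / 665  # the example review above: ~0.96, i.e. 96% helpful

# keep only reviews with at least 10 votes before charting
rated &lt;- subset(reviews, helpful_total &gt;= 10)
</code></pre>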
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-basic-helpful_hu_6374fe7678c45848.webp 320w,/2014/06/reviewing-reviews/amzn-basic-helpful_hu_fa5de8d1b024cf5.webp 768w,/2014/06/reviewing-reviews/amzn-basic-helpful_hu_b43d4f9b8739f463.webp 1024w,/2014/06/reviewing-reviews/amzn-basic-helpful.png 1200w" src="amzn-basic-helpful.png"/> 
</figure>

<p>That would make sense; if you&rsquo;re writing a review (especially a 5 star review), you&rsquo;re writing with the intent to help other prospective buyers.</p>
<p>Another consideration is review length. Do reviewers frequently write essays, or do they typically write a single paragraph?</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-basic-length_hu_47c45b6e84c3e877.webp 320w,/2014/06/reviewing-reviews/amzn-basic-length_hu_36ecbd0ae249486f.webp 768w,/2014/06/reviewing-reviews/amzn-basic-length_hu_42feb4443896b5f3.webp 1024w,/2014/06/reviewing-reviews/amzn-basic-length.png 1200w" src="amzn-basic-length.png"/> 
</figure>

<p>Most reviews are 100-150 characters, but the average number of characters in a review is about <strong>582</strong> (there are some outlier reviews with 30,000+ characters!). Assuming that the average paragraph <a href="http://wiki.answers.com/Q/How_many_characters_does_the_average_paragraph_have">has 352 characters</a>, reviewers typically write about half a paragraph. Interestingly, reviews are rarely less than a sentence. (The <a href="http://www.amazon.com/gp/community-help/customer-reviews-guidelines">Review Guidelines</a> suggest a minimum of 20 words in a review, so this discrepancy could be attributed to moderator removal of short, one-liner reviews.)</p>
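<p>Character counts like these are a one-liner in R, assuming a <code>text</code> column as in the sketches above:</p>
<pre tabindex="0"><code># distribution of review lengths, in characters
summary(nchar(reviews$text))
mean(nchar(reviews$text))  # ~582 in this dataset
</code></pre>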
<h2 id="particularizing-the-products">Particularizing the Products</h2>
<p>The 1.2 million reviews in the Electronics data set address 82,003 distinct products. However, most of those entries represent different SKUs of the same product (e.g. different colors of headphones). Of those products, only 30,577 have pricing information which identifies them as the source product.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-product-price_hu_58971098a111aa36.webp 320w,/2014/06/reviewing-reviews/amzn-product-price_hu_3ef361e65687d666.webp 768w,/2014/06/reviewing-reviews/amzn-product-price_hu_89981cc6ca8be307.webp 1024w,/2014/06/reviewing-reviews/amzn-product-price.png 1200w" src="amzn-product-price.png"/> 
</figure>

<p>Over two-thirds of Amazon Electronics products are priced between $0 and $50, which makes sense, as popular electronics such as television remotes and phone cases are not extremely expensive. However, there&rsquo;s no statistical correlation between the price of a product and the number of reviews it receives.</p>
<p>For the overall rating of a particular product, which is the average rating of all reviews for that product, the ratings are no longer limited to discrete numbers between 1 and 5, and can take decimal values between those numbers as well. The distribution of product ratings is similar to the distribution of review ratings.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-product-rating_hu_1521ed87824a3e14.webp 320w,/2014/06/reviewing-reviews/amzn-product-rating_hu_bbcde884366ab6c2.webp 768w,/2014/06/reviewing-reviews/amzn-product-rating_hu_f305fe55ecfa3298.webp 1024w,/2014/06/reviewing-reviews/amzn-product-rating.png 1200w" src="amzn-product-rating.png"/> 
</figure>

<p>Again, the perfect rating of 5 is most popular for products. This distribution resembles the distribution of scores of all reviews for the discrete rating values, but this view reveals local maxima at the midpoint between each discrete value. (i.e. 3-and-a-half stars and 4-and-a-half stars are surprisingly common ratings)</p>
<p>What happens when you plot product rating and product price together?</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-product-score-price_hu_b271b61ddc8a67b5.webp 320w,/2014/06/reviewing-reviews/amzn-product-score-price_hu_5448fe68fcfc3bb8.webp 768w,/2014/06/reviewing-reviews/amzn-product-score-price_hu_65b3de5328ae68dd.webp 1024w,/2014/06/reviewing-reviews/amzn-product-score-price.png 1200w" src="amzn-product-score-price.png"/> 
</figure>

<p>The most expensive products have 4-star and 5-star overall ratings, but not 1-star and 2-star ratings. However, the correlation is very weak. (r = 0.04)</p>
<p>In contrast, the relationship between product price and the average <em>length</em> of reviews for the product is surprising.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-product-price-length_hu_31310358eb50b709.webp 320w,/2014/06/reviewing-reviews/amzn-product-price-length_hu_cde21bf380b44ae5.webp 768w,/2014/06/reviewing-reviews/amzn-product-price-length_hu_c8de453ae8e3e19d.webp 1024w,/2014/06/reviewing-reviews/amzn-product-price-length.png 1200w" src="amzn-product-price-length.png"/> 
</figure>

<p>This relationship is logarithmic with a relatively good correlation (r = 0.29), and it shows that reviewers put more time and effort into reviewing products which are worth more.</p>
<h2 id="reviewing-the-reviewers">Reviewing the Reviewers</h2>
<p>As you might expect, most people leave only 1 or 2 reviews on Amazon, but some have left <em>hundreds</em> of reviews. Out of 1.2 million reviews, there are 510,434 distinct reviewers.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-reviewer-count_hu_6f6049ea689edace.webp 320w,/2014/06/reviewing-reviews/amzn-reviewer-count_hu_347234ec3a8db6e6.webp 768w,/2014/06/reviewing-reviews/amzn-reviewer-count_hu_d3bd3fe21815e2a9.webp 1024w,/2014/06/reviewing-reviews/amzn-reviewer-count.png 1200w" src="amzn-reviewer-count.png"/> 
</figure>

<p>Over 80% of the reviewers of Amazon electronics left only 1 review. Analyzing reviewers who have left only 1 review is not helpful statistically, so for the rest of the analysis, only reviewers who have written 5 or more reviews (whose reviews have received at least 1 is-helpful/is-not-helpful indicator) will be considered. This makes it much easier to get the overall profile of a reviewer. 11,676 reviewers fit these criteria.</p>
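<p>A sketch of that per-reviewer aggregation, again with assumed column names (<code>reviewer_id</code>, <code>stars</code>, <code>helpfulness</code>):</p>
<pre tabindex="0"><code># average rating and helpfulness for each reviewer
per_reviewer &lt;- aggregate(cbind(stars, helpfulness) ~ reviewer_id,
                          data = reviews, FUN = mean)

# attach each reviewer's review count, then filter to 5+ reviews
counts &lt;- table(reviews$reviewer_id)
per_reviewer$n_reviews &lt;- as.integer(counts[as.character(per_reviewer$reviewer_id)])
repeat_reviewers &lt;- subset(per_reviewer, n_reviews &gt;= 5)
nrow(repeat_reviewers)  # 11,676 in this post's data
</code></pre>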
<p>Do repeat Amazon users tend to give 5-star reviews?</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-reviewer-score_hu_732050183c11716.webp 320w,/2014/06/reviewing-reviews/amzn-reviewer-score_hu_8ebfb0af49fa84ed.webp 768w,/2014/06/reviewing-reviews/amzn-reviewer-score_hu_cbe8342aed8a7b4b.webp 1024w,/2014/06/reviewing-reviews/amzn-reviewer-score.png 1200w" src="amzn-reviewer-score.png"/> 
</figure>

<p>The distribution of review ratings, when averaged across each reviewer, is similar to the other distributions of review ratings. However, this distribution is less skewed toward 5 stars and is more uniform between 4 stars and 5 stars.</p>
<p>What about the average helpfulness of the reviews written by a single reviewer? If a reviewer has enjoyed Amazon enough such that they make 5 or more reviews, chances are that their reviews are high quality.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-reviewer-helpfulness_hu_3396b6f6f7c36442.webp 320w,/2014/06/reviewing-reviews/amzn-reviewer-helpfulness_hu_709e280dd4ad021b.webp 768w,/2014/06/reviewing-reviews/amzn-reviewer-helpfulness_hu_42e3979cc26ce7cd.webp 1024w,/2014/06/reviewing-reviews/amzn-reviewer-helpfulness.png 1200w" src="amzn-reviewer-helpfulness.png"/> 
</figure>

<p>Again, the data is slightly skewed. 8% of the reviewers have perfect helpfulness scores on all their reviews, and the average helpfulness score for all repeat reviews is 80%. Interestingly, a few repeat reviewers have average helpfulness scores of 0.</p>
<p>If you plot <em>both</em> average score and average helpfulness in a single chart, the picture becomes much more clear:</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-reviewer-count-score_hu_d5f52b74f0303f85.webp 320w,/2014/06/reviewing-reviews/amzn-reviewer-count-score_hu_c2ce7b3548ec0cce.webp 768w,/2014/06/reviewing-reviews/amzn-reviewer-count-score_hu_d9432e36b058bedc.webp 1024w,/2014/06/reviewing-reviews/amzn-reviewer-count-score.png 1200w" src="amzn-reviewer-count-score.png"/> 
</figure>

<p>As the chart shows, there&rsquo;s a good positive correlation (r = 0.27) between rating and helpfulness, with a discernible cluster at the top. However, I don&rsquo;t think it&rsquo;s a causal relationship. Reviewers who give a product a 4 - 5 star rating are more passionate about the product and likely to write better reviews than someone who writes a 1 - 2 star &ldquo;this product sucks and you suck too!&rdquo; review.</p>
<p>Another interesting bivariate relationship is the one between the helpfulness of a review and the length of a review. Intuitively, you might think that longer reviews are more helpful reviews. And in the case of Amazon&rsquo;s Electronics reviews, you&rsquo;d be correct.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-reviewer-helpful-length_hu_130d8f3444e197a4.webp 320w,/2014/06/reviewing-reviews/amzn-reviewer-helpful-length_hu_93c5104c96670a15.webp 768w,/2014/06/reviewing-reviews/amzn-reviewer-helpful-length_hu_7f33170521ce93ef.webp 1024w,/2014/06/reviewing-reviews/amzn-reviewer-helpful-length.png 1200w" src="amzn-reviewer-helpful-length.png"/> 
</figure>

<p>Again, there&rsquo;s a good positive correlation (r = 0.26) between average helpfulness and average length, which the trend line supports. (The dip at the end is caused by the high number of low-character reviews.) All the longer reviews have high helpfulness; there are very, very few unhelpful reviews that are also long.</p>
<h2 id="completing-the-conclusion">Completing the Conclusion</h2>
<p>The reviews on Amazon&rsquo;s Electronics products very frequently rate the product 4 or 5 stars, and such reviews are almost always considered helpful. 1-star reviews are used to signify disapproval, and 2-star and 3-star reviews have no significant impact at all. If that&rsquo;s the case, then what&rsquo;s the point of having a 5-star rating system at all if the vast majority of reviewers favor the product? Would Amazon benefit if they made review ratings a binary like/dislike?</p>
<p>Having a 5-star system can allow the prospective customer to make more informed comparisons between two products: a customer may be more likely to buy a product that&rsquo;s rated 4.2 stars than a product that is rated 3.8 stars, which is a subtlety that can&rsquo;t easily be emulated with a like/dislike system. Likewise, if products are truly bad, the propensity toward 5-star reviews can help obfuscate the low quality of the product when a like/dislike system would make the low quality more apparent.</p>
<p>Unfortunately, only Amazon has the data that would answer all these questions.</p>
<p>Of course, there are many other secrets to be uncovered from Amazon reviews. The Stanford researchers who collected the initial data used <a href="http://i.stanford.edu/~julian/pdfs/recsys13.pdf">machine learning techniques on the review text</a> to predict the rating given by a review from the review text alone. Other potential topics for analysis are comparisons between <em>types</em> of Electronics (e.g. MP3 players, headphones) or using natural language processing to determine the common syntax in reviews.</p>
<figure>

    <img loading="lazy" srcset="/2014/06/reviewing-reviews/amzn-word-review-start_hu_d1ceead5636a4804.webp 320w,/2014/06/reviewing-reviews/amzn-word-review-start_hu_5932602b953da6be.webp 768w,/2014/06/reviewing-reviews/amzn-word-review-start_hu_d484e026176d66f7.webp 1024w,/2014/06/reviewing-reviews/amzn-word-review-start.png 1200w" src="amzn-word-review-start.png"/> 
</figure>

<p>That&rsquo;s a topic for another blog post. :)</p>
<hr>
<ul>
<li><em>Data analysis was performed using R, and all charts were made using ggplot2.</em></li>
<li><em>You can download a ZIP file containing CSVs of the time series, the aggregate product data, and the anonymized aggregate reviewer data <a href="https://dl.dropboxusercontent.com/u/2017402/amazon_data.zip">here</a>.</em></li>
<li><em>No, I have no relation to &ldquo;<a href="http://www.amazon.com/review/R1KHEP16MXXWCN/ref=cm_cr_rdp_perm?ie=UTF8&amp;ASIN=B000796XXM">M. Wolff</a>&rdquo;.</em></li>
</ul>
]]></content:encoded>
    </item>
  </channel>
</rss>
