This article is the second in a series on problems with sentiment analysis – describing common pitfalls and difficulties that need to be understood in order to correctly use these tools/models. Enjoy!
The previous article in this series offered a brief overview of sentiment analysis and the kinds of datasets we work with. This overview highlighted some of the problems that occur when your model trains on data significantly different from the data you’ll actually make predictions about – for example, using Yelp restaurant reviews to predict Amazon pest control product reviews (“Dead bugs all over!” means two very different things).
This article will deal with problems inherent to text itself. I’ll give a few quick definitions before we dive into actual examples. Continue reading “Problems with sentiment analysis: spam, sarcasm, and human error”
This article is the first in a series on problems with sentiment analysis – describing common pitfalls and difficulties that need to be understood in order to correctly use these tools/models. Enjoy!
Short background to sentiment models
A classical sentiment model learns the sentiment value of given words. For example, “FANTASTIC” is generally positive (so it’d have a high sentiment score) and “WORST” is usually negative (meaning a low sentiment score). The model then combines those words to form an overall sentiment score. A document with lots of negative words should probably have a negative score and the opposite is also true. Continue reading “Problems with sentiment analysis: Domain”
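The word-level idea above can be sketched in a few lines. This is only an illustration: the word scores are hypothetical (a real model would learn them from labeled data), the -1 to 1 scale is an assumption, and averaging is just one simple way to combine words into a document score.

```python
# Hypothetical word scores on an assumed -1 (negative) to 1 (positive) scale.
# A real model would learn these values from labeled training data.
WORD_SCORES = {"fantastic": 0.9, "great": 0.7, "worst": -0.9, "terrible": -0.8}


def document_score(text):
    """Average the scores of the known words; 0.0 if no word is known."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    known = [WORD_SCORES[w] for w in words if w in WORD_SCORES]
    if not known:
        return 0.0
    return sum(known) / len(known)


print(document_score("FANTASTIC food, great service!"))
print(document_score("The WORST meal I have had."))
```

A document full of negative words ends up with a negative average, and the opposite holds for positive words.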
One popular topic in Computer Science (among other disciplines) is whether professors are out-of-touch with “real world” programming. Those who argue for fewer academics in CS education claim that university degrees prepare students poorly for industrial roles. Plagued by an environment in which theory and research are preferred over practical skills for daily software development, these students enter the workforce ill-equipped to write professional software. Continue reading “Should academics or professionals teach programmers?”
So you’d like to interview for a programming position. Congratulations! You will find there are many helpful methods to make the experience as arbitrary and frustrating as possible. Continue reading “How to interview programmers badly”
The modern anti-competition potential of Internet Service Providers is nuclear. Far from being merely an industry-siloed cartel, they control both the products and the means of discovery.
Today the FCC voted to overturn Obama-era policy that prevented Internet Service Providers from blocking or slowing access to certain web content. Chairman Ajit Pai, who spearheaded the successful campaign, argued that the repealed rules stifled competition and represented government interference in the otherwise free market. “The internet wasn’t broken in 2015. We weren’t living in a digital dystopia. To the contrary, the internet is perhaps the one thing in American society we can all agree has been a stunning success.” Continue reading “Free markets: the economic and technical arguments for strong net neutrality”
In August 2015, The Economist published an article entitled “Automation angst” in which they explored the dichotomy of feelings about automation – one side representing the thrill of cheaper production and the other warning of an impending existential crisis. When repetitive human labor is replaced, do the laborers feel better off?
Continue reading “What is automation?”
Every once in a while, there are really big ideas in academia, ideas that change the way we think about the world. Nash equilibrium is one of those ideas.
John Nash wrote about games where people make decisions based on the way they think other people will behave, eventually reaching an equilibrium where no individual can improve their own situation by changing. This equilibrium, however, does not mean that the entire group has achieved an optimal result.
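A small worked example may make this concrete. The sketch below uses the classic prisoner's dilemma with hypothetical payoffs (neither the game nor the numbers come from the article) and checks every pair of choices to see whether either player could do better by changing alone.

```python
from itertools import product

# Hypothetical payoffs, higher is better.
# PAYOFFS[(a, b)] = (payoff to player A, payoff to player B)
ACTIONS = ("cooperate", "defect")
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"): (-3, 0),
    ("defect", "cooperate"): (0, -3),
    ("defect", "defect"): (-2, -2),
}


def is_nash(a, b):
    """True if neither player gains by changing only their own action."""
    a_cannot_improve = all(
        PAYOFFS[(a, b)][0] >= PAYOFFS[(alt, b)][0] for alt in ACTIONS
    )
    b_cannot_improve = all(
        PAYOFFS[(a, b)][1] >= PAYOFFS[(a, alt)][1] for alt in ACTIONS
    )
    return a_cannot_improve and b_cannot_improve


equilibria = [pair for pair in product(ACTIONS, ACTIONS) if is_nash(*pair)]
print(equilibria)
```

With these numbers the only equilibrium is for both players to defect, even though both would be better off if they both cooperated. That is the gap between an equilibrium and a group-optimal result.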
Continue reading “Nash Equilibrium and Graph Theory”
Despite Stephen Curry’s recent downturn that cost his team the NBA Finals, he remains the most prolific three-point shooter of our day. With 482 three-pointers made in the most recent season, he eclipses both second place (teammate Klay Thompson at 374) and third place (Damian Lillard at 271). This makes the Golden State Warriors a pain in the neck to defend – they score from farther out than any other team in the league. So how does everybody else stay competitive? They play like Steph Curry.
Continue reading “2016: The Year the NBA Played Like Steph Curry”
This guide is intended to be a very unsophisticated, very broad overview of the most basic kind of sentiment analysis. You can use this to get results fast, but they’ll be dirty results. I’ll begin by throwing out the broad outline and then address several problems. We’ll begin with the basic steps: (1) Seeding, (2) Training, and (3) Evaluation.
For our purposes, we’re going to assume that all texts have a sentiment somewhere between 0 and 1 where 0 is very negative and 1 is very positive. A neutral text has a sentiment score of 0.5 under this system.
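As a rough illustration of how a Naive Bayes classifier could produce a score on this 0-to-1 scale, here is a minimal sketch. The tiny seed dataset, the function names, and the use of add-one smoothing are all assumptions made for the example, not necessarily the method the guide goes on to describe.

```python
import math
from collections import Counter

# Hypothetical seed data: (text, label) pairs with labels "pos" or "neg".
SEED = [
    ("fantastic food great service", "pos"),
    ("worst meal terrible service", "neg"),
]


def train(labeled_texts):
    """Count word occurrences and documents per label."""
    counts = {"pos": Counter(), "neg": Counter()}
    docs = Counter()
    for text, label in labeled_texts:
        counts[label].update(text.lower().split())
        docs[label] += 1
    return counts, docs


def score(text, counts, docs):
    """Return P(positive | text), a value between 0 and 1.

    Assumes both labels appear at least once in the training data.
    Words never seen in training are ignored.
    """
    vocab = set(counts["pos"]) | set(counts["neg"])
    total_pos = sum(counts["pos"].values())
    total_neg = sum(counts["neg"].values())
    log_odds = math.log(docs["pos"] / docs["neg"])  # class prior
    for word in text.lower().split():
        if word not in vocab:
            continue
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_pos = (counts["pos"][word] + 1) / (total_pos + len(vocab))
        p_neg = (counts["neg"][word] + 1) / (total_neg + len(vocab))
        log_odds += math.log(p_pos / p_neg)
    return 1 / (1 + math.exp(-log_odds))


counts, docs = train(SEED)
print(score("great food", counts, docs))
print(score("terrible", counts, docs))
```

With balanced training data, a text made up entirely of unseen words gets 0.5, which matches the neutral score in the scale above.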
Continue reading “A primer on Naive Bayes for sentiment analysis”
LeBron James is widely considered one of the best NBA athletes of the past decade, consistently scoring about 0.8 points per minute of play (for comparison, Kobe Bryant scored 0.62 and Blake Griffin scored 0.63 in their last 55 games). He’s also one of the best-paid athletes, contracted for 24 million USD in the 2016-2017 season. Last year, Quora user Shane Hiller calculated that LeBron makes about $107 per second of gameplay. That’s a lot of money, and a good coach should try to maximize LeBron’s performance. Continue reading “Does Lebron James play worse when he’s tired?”