Words have impact. They are so strong that even when they’re not directed at us, they can change how we feel. Movies stir up a lot of emotions in us as spectators, with all the scenery, the action, the story, and the characters, but what about the impact of the words in movies? I did a small analysis and found some interesting things.

The analysis covers 617 movies released between the 1920s and 2010, adding up to 220,579 fictional conversations and more than 300,000 utterances. To keep the visualization interesting, I removed the so-called stop words. In computing, stop words are words that carry very little meaning on their own, such as “and”, “the”, “a”, and “I”, and they are filtered out before or after processing natural language data. Let’s see the result of our first try (if you want to see the data for your favorite movie, just use the filter):
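If you want to try the filtering step yourself, it could look something like the sketch below. This is not the exact process behind the chart, just a minimal Python illustration. The tiny stop-word list, the `count_words` helper, and the two sample lines are all made up for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch);
# a real analysis would use a much longer one.
STOP_WORDS = {"and", "the", "a", "i", "to", "of", "you", "it", "is", "in"}

def count_words(utterances, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across the utterances."""
    counts = Counter()
    for line in utterances:
        # Lowercase the line, then treat runs of letters/apostrophes as words.
        for word in re.findall(r"[a-z']+", line.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts

# Two invented lines of dialogue, standing in for the real utterances.
sample = ["I love you and the sea.", "You love the danger, I know."]
word_counts = count_words(sample)
print(word_counts.most_common(3))
```

With the stop words gone, what remains are the words that actually carry some meaning, which is what makes the chart worth looking at.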

Hm, OK, that’s interesting, but it doesn’t say a lot by itself, right? So I downloaded the Opinion Lexicon, made available through a study at the University of Illinois at Chicago. It consists of lists of positive and negative words, which let us classify the previous words by sentiment.
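To give an idea of how that classification might work, here is a rough Python sketch. The `POSITIVE` and `NEGATIVE` sets are small hypothetical stand-ins for the lexicon's word lists, and the word counts are invented. Words that appear in neither list are simply left out.

```python
from collections import Counter

# Hypothetical stand-ins for the lexicon's positive and negative word lists.
POSITIVE = {"love", "good", "great", "happy"}
NEGATIVE = {"death", "bad", "kill", "hate"}

def classify(word):
    """Return 'positive', 'negative', or None if the word is in neither list."""
    if word in POSITIVE:
        return "positive"
    if word in NEGATIVE:
        return "negative"
    return None

def sentiment_totals(word_counts):
    """Add up word occurrences per sentiment, skipping unclassified words."""
    totals = Counter()
    for word, n in word_counts.items():
        label = classify(word)
        if label is not None:
            totals[label] += n
    return totals

# Invented counts, e.g. the output of a word-frequency step.
word_counts = {"love": 5, "death": 2, "good": 3, "car": 7}
totals = sentiment_totals(word_counts)
print(totals["positive"], totals["negative"])
```

A plain lookup like this ignores context (a "not good" still counts as positive), so the totals are best read as a rough signal rather than a precise measurement.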

Positive or negative words? Who will triumph?

Good or bad? Love or death? The visualization below says a lot more now that each word has a sentiment attached. Go ahead and explore, and don’t forget you can filter the values by year or genre.

In general, it seems that good keeps winning against bad, even when we’re talking about words in movies. But was it always like that? To find out, I created one last visualization that puts all the good and bad words on a timeline, to see whether at some point more negative words were said than positive ones.

The view below shows the average number of positive and negative words per film over the years, and the difference between the two. Since some years only have one or two movies, I added a pre-filter to show only years with 3 or more movies, for a better sample. Again, you can click on the positive/negative legend to highlight the dimension, and you can also filter to see the trend for a specific movie genre.
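The arithmetic behind a view like this is simple enough to sketch in Python. The `yearly_averages` function and the per-film numbers below are invented for illustration, and `min_movies=3` mirrors the pre-filter described above.

```python
from collections import defaultdict

def yearly_averages(movies, min_movies=3):
    """Average positive/negative word counts per film for each year.

    movies: iterable of (year, positive_count, negative_count), one per film.
    Years with fewer than min_movies films are dropped.
    """
    by_year = defaultdict(list)
    for year, pos, neg in movies:
        by_year[year].append((pos, neg))

    result = {}
    for year, films in sorted(by_year.items()):
        if len(films) < min_movies:
            continue  # too few films for a meaningful average
        avg_pos = sum(p for p, _ in films) / len(films)
        avg_neg = sum(n for _, n in films) / len(films)
        # Third value is the gap: below zero means negative words lead.
        result[year] = (avg_pos, avg_neg, avg_pos - avg_neg)
    return result

# Invented per-film counts: three films in 1994, only one in 1995.
movies = [(1994, 120, 150), (1994, 90, 110), (1994, 150, 130), (1995, 200, 100)]
trend = yearly_averages(movies)
print(trend)
```

In this made-up sample, 1995 disappears because it has a single film, and 1994 ends up with a negative gap, meaning negative words outnumber positive ones on average.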

Is the trend changing?

Clearly, a firmer conclusion would need a bigger sample, but if we can take a guess, this visualization shows a trend change starting in the 1990s. Negative words outnumbered positive ones for almost two decades, possibly related to the steady rise in popularity of action movies and the slight decline in comedy.

I personally believe we will see this trend changing again soon. Director Tom Six said in a recent interview that our society is currently living in a politically correct time and that “the movie world only makes play-it-safe movies now”. Maybe the positive words will triumph once again? I guess we need to wait and see.


The dataset was made available through a study from Cornell University, written by Cristian Danescu-Niculescu-Mizil and Lillian Lee.

