Filler words and function words

Today, I found this interesting article on NPR:

Our Use Of Little Words Can, Uh, Reveal Hidden Interests

Here’s a short excerpt:

“When two people are paying close attention, they use language in the same way,” he says. “And it’s one of these things that humans do automatically.”

Pennebaker has counted words to better understand lots of things. He’s looked at lying, at leadership, at who will recover from trauma.

Here is Prof. Pennebaker’s web page, discussing some of the details of his findings:

The World of Words

An excerpt:

Style-related words can also reveal basic social and personality processes, including:

  • Lying vs telling the truth. When people tell the truth, they are more likely to use 1st person singular pronouns. They also use more exclusive words like except, but, without, excluding. Words such as this indicate that a person is making a distinction between what they did do and what they didn’t do. Liars have a problem with such complex ideas.
  • Dominance in a conversation. Analyze the relative use of the word “I” between two speakers in an interaction. Usually, the higher status speaker will use fewer “I” words.
  • Social bonding after a trauma. In the days and weeks after a cultural upheaval, people become more self-less (less use of “I”) and more oriented towards others (increased use of “we”).
  • Depression and suicide-proneness. Public figures speaking in press conferences and published poets in their poetry use more 1st person singular when they are depressed or prone to suicide.
  • Testosterone levels. In two case studies, it was found that when people’s testosterone levels increased rapidly, they dropped in their use of references to other people.
  • Basic self-reported personality dimensions. Multiple studies are now showing that style-related words do much better than chance at distinguishing people who are high or low in the Big Five dimensions of personality: neuroticism, extraversion, openness, agreeableness, and conscientiousness.
  • Consumer patterns. By knowing people’s linguistic styles, we are able to predict (at reasonable rates) their music and radio station preference, liking for various consumer goods, car preferences, etc.
  • And much, much more.

And finally, here’s a link to the paper published in The Journal of Language and Social Psychology:

Um . . . Who Like Says You Know: Filler Word Use as a Function of Age, Gender, and Personality

I find it fascinating that they were able to extract this information without using any complicated analysis of syntax, as far as I can tell.
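To make the "no syntax needed" point concrete, here is a minimal sketch of this kind of word counting. The word lists are tiny and made up for illustration (the real LIWC dictionaries are far larger and are not reproduced here), and reporting each count as a percentage of all words is my assumption about how such scores are calculated.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
# Fillers such as "like" or "you know" are left out because a plain
# word match cannot tell filler uses from ordinary ones.
CATEGORIES = {
    "self_references": {"i", "me", "my"},
    "articles": {"a", "an", "the"},
    "fillers": {"um", "uh"},
}


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def category_percentages(text):
    """Return the percentage of all words that fall into each category."""
    words = tokenize(text)
    total = len(words)
    if total == 0:
        return {}
    result = {
        name: 100.0 * sum(1 for w in words if w in vocab) / total
        for name, vocab in CATEGORIES.items()
    }
    # "Big words" are defined by length alone: more than six letters.
    result["big_words"] = 100.0 * sum(1 for w in words if len(w) > 6) / total
    return result


sample = "I have a dream that my four little children will one day live in a nation"
for name, pct in category_percentages(sample).items():
    print(f"{name}: {pct:.2f}")
```

Nothing here looks at grammar or word order: the text is treated as a bag of words, and each category is just a lookup or a length check.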

I played with the free-to-use, public version of LIWC (Linguistic Inquiry and Word Count, Pennebaker’s text-analysis program). It seems to report the raw results of the analysis without drawing any conclusions from them. I fed it the “I Have A Dream” speech by Martin Luther King, Jr. Here were my results:

Details of Writer: 34 year old Male
Date/Time: 1 September 2014, 2:43 pm

LIWC Dimension                 Your data   Personal texts   Formal texts
Self-references (I, me, my)         4.08             11.4            4.2
Social words                        6.58              9.5            8.0
Positive emotions                   3.74              2.7            2.6
Negative emotions                   0.79              2.6            1.6
Overall cognitive words             2.27              7.8            5.4
Articles (a, an, the)               8.50              5.0            7.2
Big words (> 6 letters)            18.03             13.1           19.6

The text you submitted was 882 words in length.

The numbers don’t have units, so I’m not sure how I’m supposed to interpret them. Nonetheless, it’s interesting to compare the contents of the speech to “personal” and “formal” texts in relative terms, I suppose.
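Even without knowing the units, the relative comparison can be done mechanically. The sketch below copies the numbers from the LIWC output above and, for each dimension, reports the ratio of the speech's score to each reference value and which reference it sits closer to. It assumes the second and third numeric columns are the "personal" and "formal" reference texts, in that order; the function names are my own.

```python
# Scores copied from the LIWC output above: (speech, second column, third column).
# Assumption: the second and third columns are the "personal" and "formal"
# reference texts, in that order.
SCORES = {
    "Self-references": (4.08, 11.4, 4.2),
    "Social words": (6.58, 9.5, 8.0),
    "Positive emotions": (3.74, 2.7, 2.6),
    "Negative emotions": (0.79, 2.6, 1.6),
    "Overall cognitive words": (2.27, 7.8, 5.4),
    "Articles": (8.50, 5.0, 7.2),
    "Big words": (18.03, 13.1, 19.6),
}


def closer_reference(speech, personal, formal):
    """Say which reference value the speech's score is closer to."""
    to_personal = abs(speech - personal)
    to_formal = abs(speech - formal)
    if to_personal < to_formal:
        return "personal"
    if to_personal > to_formal:
        return "formal"
    return "tie"


def compare(scores):
    """Return {dimension: (ratio to personal, ratio to formal, closer reference)}."""
    return {
        name: (speech / personal, speech / formal,
               closer_reference(speech, personal, formal))
        for name, (speech, personal, formal) in scores.items()
    }


for name, (vs_personal, vs_formal, closer) in compare(SCORES).items():
    print(f"{name:<24} {vs_personal:5.2f}x personal  {vs_formal:5.2f}x formal  closer to {closer}")
```

This is only a crude distance check, not a statistical test: it says nothing about how much variation there is within the personal or formal reference texts.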

I looked around on the Internet, and found a Reddit comment referring to the work of Fairclough, Van Dijk and Wodak.

Here’s one article about Critical Discourse Analysis, the category this type of study falls under: Teun A. Van Dijk – Critical Discourse Analysis. From that article:

Critical analysis of conversation is very different from an analysis of news reports in the press or of lessons and teaching at school.

This may be one reason why “I Have A Dream” was not a good example for playing with the LIWC tool: a prepared public speech is a different kind of text from a conversation.