The 25 Meanings behind Favoriting on Twitter

I found an interesting paper describing the 25 possible motivations for clicking the “Favorite” button on someone’s tweet:

More than Liking and Bookmarking? Towards Understanding Twitter Favouriting Behaviour

The team that came up with the fav button probably did not anticipate all of these uses. I’m guessing they thought of three or four at most. This stresses the importance of studying user behavior and motivation: any change to how the fav button behaves will affect many of these motivations in unpredictable ways.

I wonder if Facebook’s “Like” button has similar connotations. I’d like to see a study comparing similar functionality across multiple social media sites.

I found this paper when reading this post on Medium, which is interesting in itself:

What’s Wrong with Twitter’s Latest Experiment with Broadcasting Favorites: It Steps over Social Signals While Looking for Technical Solutions

A hidden gem in Manning and Schütze: what to call 4+-grams?


I’m a longtime fan of Chris Manning and Hinrich Schütze’s “Foundations of Statistical Natural Language Processing”: I’ve learned from it, I’ve taught from it, and I still find myself thumbing through it from time to time. Last week, I wrote a blog post on SXSW titles that involved looking at n-grams of different lengths, including unigrams, bigrams, trigrams and … well, what do we call the next one up? Manning and Schütze devoted an entire paragraph to the question on page 193, which I absolutely love and thought would be fun to share with those who haven’t seen it.

Before continuing with model-building, let us pause for a brief interlude on naming. The cases of n-gram language models that people usually use are for n=2,3,4, and these alternatives are usually referred to as a bigram, a trigram, and a four-gram model, respectively. Revealing this will surely be enough to…
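
For readers who have not worked with n-grams, here is a minimal sketch in Go of extracting them from a tokenized sentence. The function name `ngrams` and the sample sentence are illustrative choices of mine, not anything taken from the book or the original post.

```go
package main

import (
	"fmt"
	"strings"
)

// ngrams returns every contiguous run of n tokens, joined by single spaces.
// It returns nil when n is not positive or exceeds the number of tokens.
func ngrams(tokens []string, n int) []string {
	if n <= 0 || n > len(tokens) {
		return nil
	}
	out := make([]string, 0, len(tokens)-n+1)
	for i := 0; i+n <= len(tokens); i++ {
		out = append(out, strings.Join(tokens[i:i+n], " "))
	}
	return out
}

func main() {
	// Whitespace tokenization is a simplification; real text needs more care.
	tokens := strings.Fields("what do we call the next one up")
	for n := 1; n <= 4; n++ {
		fmt.Printf("n=%d: %q\n", n, ngrams(tokens, n))
	}
}
```

A sentence of k tokens yields k-n+1 n-grams, so the lists get shorter as n grows.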


Detecting Social Power in Written Dialog

This semester, I’ll be working on a project at Columbia CCLS with Prof. Owen Rambow and Vinod Prabhakaran. It falls under the broad category of discourse analysis. Specifically, we’ll be looking into detecting displays of power in email threads and discussion boards. Here’s some prior work that explains the subject better:

Written Dialog and Social Power: Manifestations of Different Types of Power in Dialog Behavior

Extracting social meaning from text analysis is an interesting subject, and I’m excited to get started on it.

How I Start

I recently found this website on Hacker News called “How I Start.” It’s a series of tutorials for learning new programming languages. Right now, there are only a few, but I think they’re expanding to add more.

I’m learning Go for one of my projects this semester, so their tutorial for Go is very useful:

How I Start. – Go (with Peter Bourgon)

In it, he sets up a web server, queries a weather API, and displays some results.
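
To give a feel for the shape of such a program, here is a rough sketch of my own. It is not the tutorial’s code. The `/weather` route, the `city` query parameter, and the `fetchTemperature` helper are all hypothetical, and the helper returns canned values instead of calling a real weather API so that the example runs offline.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetchTemperature is a hypothetical stand-in for a real weather API call.
// It returns canned Celsius readings so the sketch needs no network access.
func fetchTemperature(city string) (float64, bool) {
	canned := map[string]float64{"seattle": 12.5, "new-york": 18.0}
	t, ok := canned[city]
	return t, ok
}

// formatWeather renders one line of plain-text output for a city.
func formatWeather(city string, celsius float64) string {
	return fmt.Sprintf("%s: %.1f C\n", city, celsius)
}

// weatherHandler serves GET /weather?city=<name> (an assumed route).
func weatherHandler(w http.ResponseWriter, r *http.Request) {
	city := r.URL.Query().Get("city")
	temp, ok := fetchTemperature(city)
	if !ok {
		http.Error(w, "unknown city", http.StatusNotFound)
		return
	}
	fmt.Fprint(w, formatWeather(city, temp))
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/weather", weatherHandler)

	// Exercise the handler in-process so the example terminates.
	// A long-running server would call http.ListenAndServe(":8080", mux) instead.
	req := httptest.NewRequest(http.MethodGet, "/weather?city=seattle", nil)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, req)
	fmt.Print(rec.Body.String())
}
```

Keeping the data lookup in its own function means a real HTTP client call could replace the canned map later without touching the handler.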

I’ll be setting up an identical toy webserver soon, and I’ll post my results here.


It seems that Satoshi Nakamoto has been hacked or somehow compromised. This led me to learn more about Bitcoin and its history so far.

Like many others, I’m kicking myself for not buying a few Bitcoins back in 2009. I was a college student then, and I had no disposable income. Still, I was considering buying one or two, just to walk myself through the process and use an exchange or a client. Maybe it’s a good thing I didn’t get involved in it at all, since its legality and widespread acceptance are still big issues.

Anyway, I’m trying to drown out that regret by satisfying my fascination with the concept of cryptocurrency.

First, here’s the original whitepaper about Bitcoin:

Bitcoin: A Peer-to-Peer Electronic Cash System

This Bitcoin series on Khan Academy is a good introduction as well:

Bitcoin: What Is It?

This Wikipedia article has some interesting background:

History of Bitcoin

And finally, take a look at these photos taken at a Bitcoin farm:

Gallery: Inside a Top Bitcoin Mine in China


Just a quick productivity tip today.

Noise is probably my #1 distractor. I find it very difficult to concentrate in noisy environments, and since I moved from Seattle to New York, it has become even more of a problem.

I’ve found that ambient noise, with some soft instrumental music layered on top, is the best way to protect myself from noisy distractions. Here’s what I use for ambient noise: Noisli.


This page is one of the pinned tabs on my browser. I like to mix thunder, rain (at a low volume), and the crackling fire. The rain noise gives the right amount of noise at the right frequencies. Plus, the thunder and fire give it some texture. I find this combination far superior to simple white noise, which actually gives me a headache.

I haven’t used the visuals or the text editor, so I can’t speak for their effectiveness. But I recommend that you try out Noisli’s ambient noise features.

Character encoding

Every software developer needs to know the basics of character encoding. However, I find it a very dry and dull topic. So here are some entertaining introductions to it.

First, a video explaining Unicode and UTF-8, and what makes UTF-8 so elegant.

Now, read this popular Joel on Software blog post:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
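
To make the core idea concrete: in UTF-8, a single code point takes between one and four bytes, and plain ASCII characters stay at one byte each. Here is a small sketch in Go that shows the difference between counting bytes and counting code points. The helper name `describe` and the sample characters are my own picks for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports how many bytes and how many code points a string holds.
func describe(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// Escapes keep the source file ASCII-only: A, e-acute, euro sign, an emoji.
	samples := []string{"A", "\u00e9", "\u20ac", "\U0001F600"}
	for _, s := range samples {
		b, r := describe(s)
		fmt.Printf("%q: %d byte(s), %d code point(s), hex % x\n", s, b, r, []byte(s))
	}
}
```

In Go, `len` on a string counts bytes, not characters, which is exactly the kind of mistake the articles above warn about.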