In the burgeoning field of data science, the volume of tools and resources being produced and the velocity at which they are evolving are simply astounding. Every day brings a new toolkit or a novel technique for working with data. But how many of these advancements actually benefit the people and organizations who want to solve the real-world problems they face?

As a data scientist, I have seen many cases where people are stuck at the very beginning: how should I translate my issue into a data problem? In other words, how should I measure the phenomenon of interest so that I collect relevant data? In my view, given recent advances, applying the appropriate tools and techniques is relatively straightforward once you have the right data on hand.

Measurement can be defined as the quantification of real-world phenomena into numbers. In some cases there are familiar measures, such as age or temperature. Often, however, defining an appropriate measure is non-trivial, because the real world is multifaceted and not everything is directly observable.

Let’s take the example of user satisfaction for a content-oriented online service such as a blog. While you can observe what users do on your website, interpreting this behavior to measure their satisfaction is not straightforward. Should we assume that users are happier if they click on more articles and spend more time on your website? It depends on who they are and what they were trying to do.
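
To make the ambiguity concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `PageView` record, the `behavioral_measures` function, the session names, and the toy log are all invented, not taken from any real service. It computes the two observable measures mentioned above, articles clicked and time on site, for two hypothetical sessions.

```python
from dataclasses import dataclass


@dataclass
class PageView:
    """One article view in a hypothetical blog access log."""
    session_id: str
    article: str
    seconds_on_page: float


def behavioral_measures(views):
    """Return {session_id: (articles_clicked, total_seconds_on_site)}."""
    totals = {}
    for view in views:
        clicks, seconds = totals.get(view.session_id, (0, 0.0))
        totals[view.session_id] = (clicks + 1, seconds + view.seconds_on_page)
    return totals


# Toy data, invented purely for illustration.
views = [
    # A reader browsing for pleasure, enjoying each article.
    PageView("leisure_reader", "travel-notes", 120),
    PageView("leisure_reader", "book-review", 100),
    PageView("leisure_reader", "weekend-recipe", 80),
    # A reader hunting for one specific answer and not finding it.
    PageView("answer_seeker", "setup-guide", 120),
    PageView("answer_seeker", "faq", 100),
    PageView("answer_seeker", "troubleshooting", 80),
]

measures = behavioral_measures(views)
for session_id, (clicks, seconds) in measures.items():
    print(f"{session_id}: {clicks} articles, {seconds:.0f} seconds")
```

Both sessions produce identical numbers, yet the comments describe one plausibly satisfied reader and one plausibly frustrated one. The intent behind each session is not in the log, which is exactly why the raw counts alone cannot serve as a measure of satisfaction.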