New Product Development

Oil and Diamonds in Online Communities

A common pillar of research aimed at predicting the market acceptance of new products is the identification of the “Voice of the Customer”. In another post we suggested adopting the term “Voice of the User” instead. The current emphasis on customers rather than users is probably linked to some limits of traditional research on new product development. When it is time to run surveys or organize focus groups, it is common practice to select participants from among the customers of the target brand. This is not always the case, but it is definitely easier. Hence the term “Voice of the Customer”.

In new product development, listening to customers rather than users reveals a “submissive” attitude: the new product is aimed at keeping current customers rather than winning new ones.

The concept behind the “Voice of the User” is different: you state that your future product is meant not only to strengthen the loyalty of your customers, but also to acquire new clients and even to draw from your competitors’ customer base.

However, plain, unqualified users are difficult to involve in market research, at least under the traditional paradigm of surveys and focus groups. Hence the need to capture their voice (their attitudes) in an unsolicited way, by analyzing the places where they spontaneously talk about products: we will call these places “sources”.

Forums are one of the primary sources for detecting unsolicited user attitudes. Their spontaneous, bottom-up nature makes them ideal candidates for market research on new product development. Moreover, they are generally made up of truly passionate users, who form self-monitored and self-managed groups where information noise (such as that introduced by fake comments) is filtered out by the community itself.

What is even more interesting about forums is that one can harvest two different kinds of information, which we call “oil” and “diamonds”. Recall that we are interested in information about a specific product or “thing”. More precisely, we are interested in mining information that allows us to predict which features of our new product will trigger market acceptance. Let’s now consider an online community such as the one associated with the Wine Spectator web site. It is quite an active community, counting, at the time of writing, about 654,000 posts in the two most populated forums of the site. The structure of the vast majority of these posts is similar: I buy a wine, I taste it, and I share my impressions (and notes) with other members of my community.

This is what we call a “dense” social medium, or an “oil” source: as is evident from the conceptual tag cloud associated with the 654k posts, the product or thing, namely wine, is really the central discussion item. As such, each discussion takes the form of a kind of “collective review”, where the density of features is very high and where basically any adjective used in a post can be associated with a specific wine.

Conversely, let’s take the example of bikers’ forums, and let’s assume that we want to investigate the impact of certain bike features. For this purpose we took the three biggest French forums for passionate bikers (both street bikes and mountain bikes), namely Velo Vert, Velo 101 and Velo de Route. Out of about five million sentences, only half a million contain a direct or indirect reference to a bike together with some relevant feature. These are what we call low-density sources, or “diamonds”. They are characterized by the fact that the product is not necessarily the focus of discussion, but rather the artifact around which the passion of users gravitates. In most cases users describe an activity, such as going on a tour or taking part in a bike competition: they might happen to describe their bike or its parts, but this is a rare event.
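To make the notion of density concrete, here is a minimal sketch of how one might estimate it on a sample of sentences. It assumes plain keyword matching, which is far cruder than the analysis behind the figures above (it cannot catch indirect references, for instance); the function name, the term lists and the sample sentences are all hypothetical and serve only as illustration.

```python
import re

def mention_density(sentences, product_terms, feature_terms):
    """Fraction of sentences that name the product together with a feature.

    Assumption: a "mention" is a sentence containing at least one product
    term and at least one feature term (simple keyword matching).
    """
    product = {t.lower() for t in product_terms}
    features = {t.lower() for t in feature_terms}
    hits = 0
    for sentence in sentences:
        tokens = set(re.findall(r"\w+", sentence.lower()))
        if tokens & product and tokens & features:
            hits += 1
    return hits / len(sentences) if sentences else 0.0

# Hypothetical sample: two of the four sentences describe the bike itself.
sample = [
    "We rode 80 km along the coast on Sunday.",
    "My new bike has a very light carbon frame.",
    "The race starts at nine, who is coming?",
    "The brakes on this bike feel spongy in the rain.",
]
density = mention_density(sample, ["bike"], ["frame", "brakes", "saddle"])
print(f"sample density: {density:.0%}")

# The same ratio with the orders of magnitude quoted for the biker forums:
# about half a million relevant sentences out of about five million.
forum_density = 500_000 / 5_000_000
print(f"forum density: {forum_density:.0%}")
```

With the figures quoted for the three biker forums, roughly one sentence in ten talks about the bike and its features, which is what we mean by a low-density source.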

The availability of oil vs. diamond sources does not depend on the product itself, but on the focus of the sources. For instance, if we were looking for oil sources for bikes, it would be enough to consider Amazon reviews in the bike category. If we were looking for diamond sources for wines, we would rather mine cooking forums such as Discuss Cooking or ChefTalk, or even plain lifestyle communities. The interesting point is that both are probably needed in order to conduct an effective study of the voice of the user for new products. On the one hand, oil sources provide a huge quantity of information and millions of mentions of product features, which gives the study a statistically significant quantitative dimension. On the other hand, diamond sources capture user attitudes in a specific usage context and over the whole period of their relationship with the product.

So the two kinds of sources provide us with different but equally valuable information. Each also has its own drawbacks: oil sources are affected by the “pollution” due to deceptive (or fake) reviews (more than 5% according to a study by MIT). Diamond sources might introduce some statistical bias, given how sporadically and randomly the product is mentioned.

Be that as it may, both oil and diamond sources represent a fundamental resource for conducting research on the impact of future products. The ability to capture user attitudes in an unsolicited way is indeed crucial for designing new features in a convincing way. And the ability to analyze the language of the users (i.e. the way people speak about things) is fuel for the marketing machinery. Of course one must be able to automatically mine all this out of terabytes of conversations: but that is another story.