Voice of the Consumer: Is Text Analytics Finally Growing Up?


Text analytics (also called unstructured data analysis) involves making sense of all those words written on blogs, review sites, and social networks. Text analytics, or text mining, was always the ugly stepchild of marketing analytics because, unlike the analysis of quantitative data, judgment replaces statistics and bias can result from small sample sizes. As the amount of available data exploded with the advent of social networks and other sources of text data, text analytics was often reduced to simple categories or sentiment analysis and suffered neglect. However, new products, like those from Qualtrics and Talkwalker, seek to address this critical need with software that measures the voice of the consumer, a key metric for assessing consumer sentiment related to your product and its key competitors.

[Image: voice of the consumer, courtesy of Talkwalker]


Voice of the consumer metric

Enter “Voice of the Consumer” (VOC). Here’s what resulted from the Wharton School’s Customer Analytics Initiative:

Every year, American companies spend more than $10 billion seeking to learn what consumers think about their products or how they rate against their competitors. The techniques they typically use have guided the field of market research for the last three decades – methods such as focus groups or detailed customer surveys. These practices still dominate even as thousands of the most engaged and avid customers – larger than a focus group by a factor of thousands – have been taking to the Internet daily in growing numbers for nearly 25 years to hold free-ranging and unsolicited discussions of what they like about consumer brands, what they dislike, or to compare similar products.

If so much data is available, why don’t more firms close up their marketing research efforts in favor of this richer pool of data? The answer, according to a report by Dr. Netzer published by MSI (Marketing Science Institute), lies in the complexity associated with teasing out insights from this voice of the consumer:

“Consumer-generated content on the Web is both a blessing and a curse.” The sheer size of the data pool can be overwhelming, and the freewheeling nature of the consumer chat raises problems with spelling and grammar, in addition to interpretation.

This is why firms reduce the complexity (and obscure interpretation) by placing consumer-generated content into general buckets or categorizing utterances as positive, negative, or neutral to score consumer sentiment.
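To make the bucketing approach concrete, here is a minimal sketch of a keyword-based sentiment scorer. The word lists and the `bucket_sentiment` function are hypothetical and purely illustrative; they are not taken from any vendor’s product, and real tools rely on far larger lexicons or trained models.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
POSITIVE = {"love", "great", "excellent", "fast", "reliable"}
NEGATIVE = {"hate", "broken", "slow", "terrible", "confusing"}


def bucket_sentiment(comment: str) -> str:
    """Assign a comment to a positive, negative, or neutral bucket."""
    words = re.findall(r"[a-z']+", comment.lower())
    # Score = count of positive words minus count of negative words.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "I love how fast this laptop is",
    "The keyboard is broken and support was slow",
    "It arrived on Tuesday",
]
labels = [bucket_sentiment(c) for c in comments]
print(labels)
```

A comment with no listed words falls into the neutral bucket, and nothing in the output says which feature drove the reaction. That is the loss of interpretation that comes with reducing comments to three labels.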

Delivering value

An obvious consideration in VOC programs is balancing cost against value. These programs are notoriously expensive and fraught with misinformation unless the data is collected without subject bias and interpreted in an unbiased manner by trained researchers (although AI and software do much of the heavy lifting in interpreting text data these days). A well-run program, however, offers the following rewards based on recent studies:

[Image: voice of the consumer, courtesy of Super Office]

Improving VOC programs

There are ways to improve your VOC data collection and interpretation efforts to deliver more accurate results that lead to increased value from the program. Let’s move on to that part of today’s discussion.

Collecting from various data sources

The best way to avoid bias is to develop a plan involving multiple data sources to increase the variety of the data you collect. Omnichannel sources work best as they help you avoid groupthink that emerges when a strong voice or several strong voices influence the conversation to a disproportionate degree. Also, by using a variety of data sources, you end up with a richer assessment of VOC. Newer software allows you to scrape data from numerous sources, combining it into one analysis that includes:

  • social media
  • review sites
  • blog posts
  • call logs and chatbot transcripts
  • messaging app texts

Collecting the right data

Negative comments can provide the best information to improve your performance by highlighting elements that cause frustration, confusion, or dissatisfaction, so don’t ignore them in building your insights. If you’re using offline sources, such as focus groups or interviews, as supplements, don’t think too narrowly about your questions.

The same goes for discarding data in the early stages of analyzing text data: don’t throw something out just because you don’t immediately see its value. It’s too time-consuming to scour the data a second time to catch something you missed on the first pass. I sometimes use a word cloud as a first pass in analyzing my text data since many software solutions require you to seed the analysis with keywords.
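A word cloud is essentially a picture of word frequencies, so a quick frequency count gives the same first-pass view and suggests seed keywords. The sketch below uses only the Python standard library; the stop-word list, the sample comments, and the `top_terms` function are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; extend it for real data.
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "to", "of", "my", "i", "on", "after"}


def top_terms(comments, n=5):
    """Return the n most frequent non-stop-words across all comments."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


comments = [
    "The battery drains fast",
    "Battery life is short and the screen flickers",
    "Screen cracked after a week, battery still fine",
]
print(top_terms(comments, n=2))  # [('battery', 3), ('screen', 2)]
```

The most frequent terms are candidates for seed keywords, not conclusions; a term such as “battery” shows up in both complaints and praise.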

Interpreting the voice of the consumer

Increasingly, efforts to glean more nuanced insights involve collaborations between marketing practitioners, academics, and computer scientists. That’s because interpreting text material requires automatically scraping relevant conversations, then determining WHAT attributes consumers mention and HOW they frame their conversation about these attributes. Bradlow and Lee, both professors at Wharton, developed such a tool and found it much better at providing insights than traditional market research methods such as conjoint analysis. It remains to be seen whether the tool works equally well for other types of text analysis and whether it can work in commercial applications of text analysis.

Del Moro points to five important characteristics of successful voice of the customer programs.


Many tools tout their listening ability when, in fact, they only HEAR what consumers say to them or, even worse, on their own properties like Facebook pages. Listening involves gathering insights from customer-generated content on other platforms as well as peer-to-peer conversations. Then, build your insights using that data strategically to improve firm offerings.

Unfortunately, many available software options offer an anemic view of what consumers are saying about the brand and its competitors, simply dividing comments into positive, negative, and neutral sentiments. At the Text Analytics conference, Mark Eduljee, who heads social listening at Microsoft (and is my co-author on our upcoming social media analytics book), and I presented a paper about listening to derive actionable insights (see below for the slide deck).

The overwhelming problem discussed by multiple presenters was the difficulty of even discerning sentiment in English text, which requires some context for accurate interpretation. For instance, a simple phrase, “it’s the shit,” may mean a consumer is very dissatisfied with the performance of a product or that it far exceeded expectations, depending on paraverbal elements that are missing from written text.

As an example, Mark Eduljee talks about problems customers voiced after the introduction of the Surface computer. He used a random sample of comments and analyzed them without resorting to software to glean a deeper understanding of what consumers were saying about the product. He searched for commonalities within the complaints to quantify which problems accounted for the biggest dissatisfaction. Within days, Mark heard their conversations across Microsoft-owned properties as well as those owned by others, identified issues that garnered the most criticism, and sent the information to the design team responsible for improving the system. Within a week of the product’s introduction, the design team was on the job working on fixes and building the next version of the product.

Separate actionable insights from noise

Text data is everywhere: in blogs, on review sites, on social networks like X (formerly Twitter) and Facebook, in chat rooms and forums, etc. Thus, an automated process is needed to decrease the noisiness of the digital environment. IBM, SAS, SAP, and several other software companies offer excellent solutions designed for voice of the consumer analysis. There are even some free text analytics solutions, including one recently released by Stanford, which unfortunately only classifies voice of the consumer data rather than aiding interpretation.

In Mark’s portion of our presentation, he highlights the problem with trying to analyze everything; namely, it’s frustrating, expensive, and doesn’t necessarily provide insights you can use to improve your performance with consumers. Instead, start with the question you want answered. Then, explore the data scraped from a variety of sources to discover how consumers frame their relationship with your brand and those of your competition.

Look for actionable insights from your text mining efforts. For instance, use negative comments to determine how to improve your performance in the eyes of the consumer rather than simply charting negative comments. While working on a book related to text analytics, I interviewed a pharmaceutical company that used an intern to collect as many brand-related comments as he could find in an Excel spreadsheet. The plethora of data resulted in analysis paralysis that caused the company to overlook complaints until the US FDA (Food and Drug Administration) sent queries regarding the efficacy of the drug. The company should have “heard” these complaints before they reached this level so it could correct the problem by adapting the product to meet customer needs or changing its messaging so it didn’t overpromise the drug’s performance.

In statistics, we frequently consider data that fits within two or three standard deviations from the mean. Data points outside this range often represent mistakes in answering questions or coding the answers or disingenuous consumer opinions. With text analytics, we don’t have these nice, clean demarcations to follow but we still have the same problem, especially when it comes to disingenuous comments. Eliminating these statements provides a better picture of the reality forming the voice of the consumer. Amazon tries to eliminate text and ratings that can skew results used by shoppers to guide their purchase decisions by ensuring there’s no relationship between the reviewer and the seller. They also differentiate between verified buyers who post reviews and others so shoppers can make informed decisions. In analyzing your textual data, consider throwing out voices at both ends of the spectrum to focus on those in the “normal” range.
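For numeric data, the standard deviation rule is easy to sketch. The example below assumes you have a numeric score attached to each comment (for instance, a satisfaction rating); the scores, the 0-100 scale, and the `within_k_sd` function are hypothetical. Text itself has no such cutoff, so treat this as an illustration of the statistical idea rather than a recipe for filtering comments.

```python
from statistics import mean, stdev


def within_k_sd(values, k=2.0):
    """Keep values no more than k sample standard deviations from the mean."""
    mu = mean(values)
    sd = stdev(values)
    return [v for v in values if abs(v - mu) <= k * sd]


# Hypothetical satisfaction scores on a 0-100 scale; 5 looks like an outlier.
scores = [72, 75, 70, 68, 74, 71, 73, 69, 76, 5]
kept = within_k_sd(scores, k=2.0)
print(kept)  # [72, 75, 70, 68, 74, 71, 73, 69, 76]
```

With a small sample, one extreme value inflates the standard deviation itself, so the choice of `k` matters. Review what gets dropped before discarding it, since an extreme score may be a genuine complaint rather than a disingenuous one.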

Develop integrated solutions

Integrated solutions for the voice of the customer capture data across multiple sources, including customer support conversations by phone, email, and mail, as well as online platforms such as chatbots and social media. Are there differences across these platforms in terms of what consumers and customers are saying? If so, you may want to investigate your messaging to ensure you’re making the same promises on every platform (underdelivering on promises is one of the top reasons for dissatisfaction). Ensure you don’t overpromise and that you deliver on the promises you make.
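One simple way to check for differences across platforms is to compare the share of negative comments by channel. The sketch below assumes each comment already carries a channel tag and a sentiment label assigned upstream; the records and the `negative_share_by_channel` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (channel, sentiment label assigned upstream).
records = [
    ("phone", "negative"),
    ("phone", "positive"),
    ("email", "negative"),
    ("email", "negative"),
    ("social", "positive"),
    ("social", "neutral"),
    ("social", "negative"),
    ("social", "positive"),
]


def negative_share_by_channel(rows):
    """Return the fraction of negative comments for each channel."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for channel, label in rows:
        totals[channel] += 1
        if label == "negative":
            negatives[channel] += 1
    return {ch: negatives[ch] / totals[ch] for ch in totals}


shares = negative_share_by_channel(records)
for channel, share in sorted(shares.items()):
    print(f"{channel}: {share:.0%} negative")
```

A large gap between channels is a prompt to read the underlying comments, not proof of a messaging problem; with only a handful of comments per channel, the shares are too noisy to act on.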

Assuming that there’s no significant difference in the voice of the consumer across sources, you must build an integrated solution to issues identified in the text analysis so you “fix” whatever issues were identified and retain a consistent voice across all your marketing channels.

Make it mobile

Customers increasingly rely on mobile devices to shop, post comments, access reviews, and spend time. That means you have to assess the voice of the consumer on these mobile apps, as well. Google recently revamped its analytics platform (GA4) to better integrate mobile apps into the rest of your online analysis, allowing you to track app downloads and clicks to your website from mobile devices (even splitting out tablets from smartphones) to better assess what mobile customers want.

Voice of the consumer as part of your business process

The voice of the consumer can’t be a stand-alone solution. User experience (UX) greatly impacts what consumers think about your brand so building a great customer experience is important if you want to improve your reputation with customers. That means working across functional areas including product design, marketing, and business intelligence.

You need to integrate it with other business processes and data, such as your sales and customer loyalty data. Moreover, voice of the consumer insights must inform ongoing business strategy, as we saw earlier with insights from Microsoft via Mark Eduljee.


Assessing the voice of the consumer must be an ongoing part of your marketing process if you want to see the continued growth of your brand and improved market share over your competition. This post provides some of the insights you need to build a voice of the consumer program to meet your needs.


Hausman and Associates, the publisher of MKT Maven, is a full-service marketing agency operating at the intersection of marketing and digital media. Check out our full range of services.