A Data Professionals Community

Why Big Data Isn’t The Only Thing Insurance Companies Should Be Thinking About


We’ve all heard about the potential of big data. This brave new world is an exciting one, especially for the insurance industry, which depends on a wealth of customer and competitive insights to build a holistic view of the market and the risks it faces.

However, before getting too excited about the latest shiny object, there’s an easier opportunity available: exploiting the information you already have (or should have!) on hand.

This is information that’s already yours… but that’s buried in dark data.

If you’re not capturing content properly or fully leveraging the information you already have, can you really say you’re ready to invest in new big data capabilities?

Why big data in insurance isn’t necessarily the answer

There’s an inherent correlation between how structured content is and how easy it is to extract value from it. If you have a simple, well-structured database, for example, it’s a piece of cake for an analytics engine like Watson to churn out answers. Standardized documents are easily accessible to employees and software alike.

However, if you’re scanning in files from customers – such as onboarding and policy-related documents, or claims – many programs won’t be able to read the content. Ultimately, this means it will be impossible to access any information from those scanned files without hours of manual rework.


According to IDC, the likelihood of data going dark grows with the amount of data you’re handling: if you focus on big data without first cleaning up the data you already hold, the majority of it could be unusable.

Furthermore, according to research by KPMG, only 16% of organizations believe that the models they’re producing with their data are accurate. Could this be because so much of it is dark, and therefore valueless?

If you haven’t harnessed the power of your existing content, that should be the first place to look for insights – particularly customer-facing content, since the volume available for analysis is so vast. Recognizing information about your customers during the capture process deepens your organization’s knowledge, which in turn improves the customer experience: you learn their pain points to create more streamlined procedures, and personalize interactions based on their previous behaviour.

And how should organizations be doing this today? Not by pursuing big data dreams and vaporware… but by harnessing the power of the data you already have at your disposal, lost in your endless piles of content.

The dominance of unstructured data

Unstructured data continues to rule within the insurance industry–and so brings the continual problem of not being able to access the value within it.

According to a 2016 study by Veritas, on average, two-thirds of the content that companies hold is not searchable or exposed for analysis–and out of the content that’s accessible, over 60% is redundant, obsolete, or trivial, leaving only 13% of your content with business value.
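That 13% figure is consistent with the other two numbers: roughly one-third of content is accessible, and only about 40% of that survives the redundant/obsolete/trivial filter. A quick sanity check:

```python
accessible = 1 - 2 / 3              # two-thirds is not searchable or exposed
valuable = accessible * (1 - 0.60)  # over 60% of the accessible share is ROT
print(f"{valuable:.0%}")            # → 13% of all content has business value
```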

Knowing this, thinking about how to find the proverbial needle in the haystack, or even to organize that hay, suddenly becomes a lot more daunting. The shifts occurring in the industry mean that more and more content is emerging – not only in greater amounts, but also in different formats.

Rather than solely scanning files in, companies are now collecting them via email as well. The formats in which content is received and input may not be compatible with existing technology, which is often optimized for paper and archiving rather than for recognizing and analyzing the data within.

Ultimately, this means that many of those legacy files are unreadable for a number of reasons: human error in filling them out (so a 0 turns into an O once scanned), or the fact that many files, once scanned, end up in a format no software can parse, leaving it to the poor employees whose job it is to make sense of them all.
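Character-level confusions like the 0/O mix-up can be partially corrected after extraction, for fields that are known to be numeric. A toy sketch (the confusion table and the sample policy number are illustrative assumptions; production pipelines use context-aware recognition rather than a fixed mapping):

```python
# Toy post-OCR cleanup for fields known to be numeric (e.g. a policy number),
# where scanners commonly confuse visually similar characters.
OCR_CONFUSIONS = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1",
                                "S": "5", "B": "8"})

def normalize_numeric_field(raw: str) -> str:
    """Map commonly confused letters back to digits in a numeric-only field."""
    return raw.translate(OCR_CONFUSIONS)

print(normalize_numeric_field("O12-3SB"))  # → 012-358
```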

However, this problem can be solved by beginning to structure content via OCR (optical character recognition), which makes scanned text machine-readable: it becomes possible to standardize your content into a universal, searchable format (such as PDF) that can quickly be understood by software and humans alike.

Done manually, this kind of cleanup quickly becomes prohibitive in both time and money, meaning it simply doesn’t happen. By cleaning up this content electronically instead, rather than investing further in big data capabilities or analytics, you can redirect efforts more strategically towards something that helps you today rather than in the future, and save everyone a lot of time.

Inspecting data more constructively

Managing massive amounts of data, especially unstructured data, is a clear challenge for the insurance industry whether or not you’re focusing on big data. Focusing on data capture and recognition, to leverage the insights you already have but that are locked within unstructured content, can provide a more integrated way to digest data and automate the process of uncovering insights.

