AI projects need data. It doesn’t have to be perfect, but it needs enough quality and consistency for useful patterns to emerge. However, many companies are overwhelmed by the volume, velocity and variety of their data and find themselves unable to reach data’s fourth V: value. So how should you think about data preparation to avoid data paralysis or over-ambition in your AI projects?
The better the data, the better the AI. But for many companies, there’s a problem: 85 percent of their data is either dark (meaning its value is unknown), redundant, obsolete or trivial.
It’s not always easy to determine where you will find value, but in order to even understand the landscape, the data needs to be cleaned up and integrated into your business. It’s all about making sure that the data has a structure and format that will enable you to develop it into the training data you need for your AI models.
Restructuring the data is a task that can be both mundane and enormous. Phone numbers, for instance, need to be formatted consistently, with spaces in the same places. Or consider addresses: if one person has given their city as “New York”, another “New-York”, and a third has given “NYC”, AI models will represent these as three separate entities unless trained to associate them.
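To make the phone number and city examples concrete, here is a minimal Python sketch of this kind of cleanup. It rests on assumptions that are not from any particular project: the alias table `CITY_ALIASES` is hypothetical and hand-built, and `normalize_phone` assumes 10-digit North American numbers with an optional leading country code of 1. Real data would need broader rules.

```python
import re

# Hypothetical alias table; a real project would derive this from its own data.
CITY_ALIASES = {
    "new york": "New York",
    "new-york": "New York",
    "nyc": "New York",
}


def normalize_city(raw: str) -> str:
    """Map known spelling variants of a city to one canonical name."""
    # Lowercase and collapse repeated whitespace before the lookup.
    key = " ".join(raw.strip().lower().split())
    # Unknown values are returned trimmed but otherwise unchanged.
    return CITY_ALIASES.get(key, raw.strip())


def normalize_phone(raw: str) -> str:
    """Format a phone number with spaces in the same places every time."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    # Assumption: North American numbers, optionally prefixed with 1.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        # Leave unrecognized values untouched so they can be reviewed by hand.
        return raw.strip()
    return f"{digits[:3]} {digits[3:6]} {digits[6:]}"


cities = [normalize_city(c) for c in ["New York", "New-York", "NYC"]]
phones = [normalize_phone(p) for p in ["(212) 555-0100", "+1 212.555.0100"]]
```

After this step, the three city spellings collapse into a single value and both phone numbers share one format, so a model no longer treats them as separate entities.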
Leaders know they need to use their data to stay competitive, but they also know the monumental tasks they face in cleaning up that data. It’s time-consuming and expensive, and in many cases, they don’t know the best way to go about it—which can make them feel paralyzed.
Putting data scientists on the task isn’t always the best solution. When they are tapped for this purpose, this valuable talent often gets trapped in “the data dungeon”: spending too much time on tedious data preparation work and often feeling paralyzed by the volume of data to be cleaned. Too little time is then spent on their main objective, which is uncovering powerful insights that can transform the business, create new customer experiences, and much more.
Overcoming the paralysis, while avoiding over-ambition
To get the most impact from data and pursue valuable data-driven transformation, companies will want to avoid this data paralysis and find ways to move the AI agenda forward. However, they also need to consider the risk of over-ambitious strategies, which can be just as damaging as data paralysis. The example below shows both:
- Large-scale disorder. This financial firm had major operations in over five countries; however, it was managing most of its operations from a mountain of Excel spreadsheets. Unsurprisingly, there were inconsistencies and inefficiencies across its global operations—its subsidiary in one country, for instance, didn’t track customers in the same way as the subsidiary in another.
- Change without progress. A new leadership member, a Chief Data Officer (CDO), had recently joined the firm. The CDO’s approach was to aggregate all the spreadsheets—tens of millions of them—into one giant data lake. However, that approach was too indiscriminate, complex and unwieldy. The firm spent a significant amount of money trying to organize and get value from that data but got nowhere—they took on far too much too fast, and lacked a focused, strategic approach.
- Achievable ambitions. When we began to help, we did not try to revolutionize all of the organization’s data in one go. Instead, we looked at the firm’s most critical pain points and its most valuable business units and geographies. This helped us determine where data improvements and AI would have the greatest impact. Focusing on smaller-scale but highly valuable transformation has increased the speed of returns and will accelerate enterprise-wide transformation in the long run. We have now established a repeatable process that helps the firm rapidly replicate and scale digital transformation in other areas of its organization.
Choose prioritization over perfection
My advice is to assess what needs to be done with your data across business functions, but then isolate a small area that is a priority for the organization. This is where you can make a focused, valuable start and gain some momentum, with a view to gradually expanding or replicating the approach over time.
A business might start by choosing, for example, 10 pain points where it needs to improve, then ranking them and finding that by focusing on just the top two, it can achieve a substantial improvement on a key metric. Find an example like that in your business and zero in on it before moving to the next pain points. In this way, you secure tangible AI successes, win the confidence of stakeholders and establish methods you can replicate and scale up.
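The ranking step can be sketched in a few lines of Python. This is only one possible way to score pain points, not a method described in this article: the `impact` and `effort` fields, the impact-per-effort ratio, and every name and number in the sample list are hypothetical placeholders you would replace with your own estimates.

```python
from dataclasses import dataclass


@dataclass
class PainPoint:
    name: str
    impact: float  # estimated improvement on the key metric (placeholder units)
    effort: float  # estimated data cleanup effort (placeholder units)


def top_priorities(points, k=2):
    """Rank pain points by impact per unit of effort and return the k best."""
    ranked = sorted(points, key=lambda p: p.impact / p.effort, reverse=True)
    return ranked[:k]


def share_of_impact(selected, points):
    """Fraction of the total estimated impact covered by the selected items."""
    return sum(p.impact for p in selected) / sum(p.impact for p in points)


# Placeholder entries for illustration only, not real figures.
pain_points = [
    PainPoint("inconsistent customer records", impact=8, effort=2),
    PainPoint("duplicate supplier entries", impact=3, effort=3),
    PainPoint("unformatted contact data", impact=9, effort=3),
    PainPoint("stale product codes", impact=1, effort=1),
]

focus = top_priorities(pain_points, k=2)
coverage = share_of_impact(focus, pain_points)
```

Here `focus` holds the two items to tackle first, and `coverage` shows how much of the total estimated impact those two account for, which is the kind of figure that helps win stakeholder confidence before expanding to the next pain points.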