In brief

  • Unstructured data, which makes up almost 80 percent of an enterprise’s data, holds untapped value when it comes to addressing challenges and embracing opportunities.
  • Extracting valuable insights from unstructured data has been difficult because it involves complex and time-consuming data analytics processes.
  • However, with the help of natural language processing and machine learning, this is changing fast.


Imagine you’re a doctor who needs to analyze hundreds of memos about a patient’s history and symptoms to spot life-threatening patterns. Or you’re a lawyer in need of a faster, more accurate way to spot risky contract language and speed up the decision process of your legal team.

The time it takes us, as humans, to complete these exacting tasks can be difficult to rationalize. But thanks to advances in AI, these analyses can be completed automatically and accurately, as computers learn desired patterns and replicate them to improve future results.

Whether or not organizations are aware of it, memos and legal documents are only two examples from the veritable mountain of unstructured data that they have. In fact, nearly 80% of the data enterprises have is unstructured – including CVs, emails, text documents, research and legal reports, voice recordings, videos, and social media posts.

If businesses could take full advantage of the value this information holds as and when they need it, they would be able to resolve and even prevent issues more efficiently across the whole enterprise. However, unlike structured data such as tables and spreadsheets, which have long been put to good use within enterprises, unstructured data is much more difficult to leverage and a lot harder to proactively analyze. Until recently.

Change is afoot

In recent years, pragmatic AI has become a critical driver of enterprise evolution, as a variety of intelligent tools transform the data supply chain for better insight discovery. Three prominent AI capabilities are driving the lion’s share of the change:

  1. Internet of Things (IoT): applying multiple technologies, such as real-time analytics, machine learning (ML), and smart sensors, to manage and analyze machine-generated structured data
  2. Computer Vision: using digital imaging technologies, ML, and pattern recognition to interpret image and video content
  3. Document Understanding: combining natural language processing (NLP) and machine learning (ML) to help gain insights into human-generated, natural language unstructured text

Of the three, document understanding is helping to make it possible for organizations to extract valuable insights from hitherto untapped, unstructured data sources. This should play an increasingly influential role in the future of enterprise transformation, as the number of unstructured sources within organizations will likely increase. However, this is most likely one of the first times you’ve heard of document understanding; it’s relatively new on the scene. And that’s why this article will dig into it a little further.

Natural language processing gains momentum

The practical applications of natural language processing (NLP) make it an ideal tool for businesses looking to leverage insights from unstructured data to transform their operations.

As a result, NLP has been going through a transformation of its own, with two main factors, we believe, driving this change.

First, as business employees become more accustomed to search engines like Google and digital assistants in their personal lives, they begin to expect the same knowledge-seeking experience at work. This is fueling high-performing enterprise NLP applications that can understand and respond to natural language queries.

Second, NLP no longer relies on manually written rules alone. For greater automation, scalability, and accuracy, NLP is now being paired with ML, which enables tools such as document understanding applications. At its simplest, document understanding combines NLP and ML to gain insights from human-generated, natural language unstructured text, and it can now reliably aid process automation and decision making.

This trend reflects what Accenture’s Jean-Luc Chatelain, CTO of Applied Intelligence, describes in his 2019 Predictions: today’s enterprises will need to move beyond “search engines” to “find engines” to gain actionable insights.

In its July 2018 Magic Quadrant for Insight Engines, Gartner also predicted that by 2022, “information will proactively find more employees more often, thereby providing the insight needed to progress decisions and actions and reducing reactive searching by 20 percent”.

This shift also explains why the analysis of unstructured data is increasingly moving away from reactive searches and towards the proactive generation of insights to feed existing or anticipated business needs. Document understanding applications that combine NLP and ML deliver just that.

Document understanding drives productivity

By combining search and analytics with pragmatic AI technologies like NLP and ML, document understanding automatically extracts relevant information from unstructured data sources, saving businesses the time and resources needed to search manually.
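
To make “automatically extracts relevant information” a little more concrete, here is a minimal Python sketch that flags contract sentences containing unlimited-liability language. It is an illustration under stated assumptions, not how any particular product works: the `RED_FLAG_PATTERNS` table, the `find_red_flags` helper, and the sample contract text are all hypothetical, and a hand-written pattern stands in for the trained models a real document understanding application would rely on.

```python
import re

# Hypothetical pattern table: the phrasings below are assumptions chosen
# for illustration, not a vetted legal vocabulary.
RED_FLAG_PATTERNS = {
    "unlimited liability": re.compile(
        r"\b(?:unlimited|uncapped|without\s+limit(?:ation)?)\b[^.]{0,60}?\bliabilit(?:y|ies)\b"
        r"|\bliabilit(?:y|ies)\b[^.]{0,60}?\b(?:unlimited|uncapped|without\s+limit(?:ation)?)\b",
        re.IGNORECASE,
    ),
}


def find_red_flags(text):
    """Return (label, sentence) pairs for sentences that match a pattern."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        for label, pattern in RED_FLAG_PATTERNS.items():
            if pattern.search(sentence):
                hits.append((label, sentence))
    return hits


# Made-up contract excerpt used only to exercise the sketch.
contract = (
    "The Supplier shall deliver the goods within 30 days. "
    "The Buyer's liability under this Agreement shall be unlimited. "
    "Either party may terminate with 60 days' notice."
)

for label, sentence in find_red_flags(contract):
    print(f"[{label}] {sentence}")
```

A pattern like this catches several word orders that a plain keyword search would miss, but it cannot tell “unlimited” from “not unlimited”. That gap is the reason rules are paired with ML, which learns such distinctions from labeled examples.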

As these applications further develop, they can deliver advanced actionable insights to improve business processes and customer experience. In fact, businesses across a number of industries have started to apply document understanding to help surface insights, including:

  • Legal departments – Reducing risks by automatically analyzing legal contracts for specific “red-flag” terms. For example, in merger & acquisition (M&A) processes, any variations of risky contract language referencing unlimited liability can be automatically identified, highlighted, and shared with the legal team. This helps deliver faster, automated insights beyond traditional keyword search.
  • Government agencies – Analyzing digitized incoming mail to route relevant letters to the right departments, eliminating manual effort and saving hundreds of thousands of agent hours.
  • Recruiting – Taking on rote tasks like sifting through millions of resumes and automatically matching CVs to job postings. Now, recruiters can review a curated and prioritized set of CVs and candidates, allowing them to focus on the people instead of the paperwork. Algorithms can also improve future results by learning and replicating desired patterns when hiring.
  • Banks and financial services – Automatically cross-analyzing loans or mortgages with the borrowers’ profiles from multiple independent sources to deliver better customer experience and engagement.
  • Content creation – Automatically identifying an article’s main theme, finding similar articles from other sources, and sharing those articles with the author to aid writing.
  • Storage optimization – Using automated business rules to identify the appropriate action to take with documents stored in expensive on-premises storage: move them to lower-cost storage, delete them if obsolete, or archive them. ML can also accurately and quickly detect duplicates or near-duplicates, allowing for storage cost savings as well as a 360-degree view of enterprise data.
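
The near-duplicate detection mentioned in the last use case can be sketched in a few lines of Python. This is a simplified illustration, assuming word-level shingles compared with Jaccard similarity, a common baseline technique and not necessarily what any given product uses. The function names, the 0.5 threshold, and the sample documents are all hypothetical.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in a text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5, k=3):
    """Return (name, name, score) for document pairs at or above threshold."""
    sets = {name: shingles(text, k) for name, text in docs.items()}
    pairs = []
    for x, y in combinations(sorted(sets), 2):
        score = jaccard(sets[x], sets[y])
        if score >= threshold:
            pairs.append((x, y, round(score, 2)))
    return pairs


# Made-up documents used only to exercise the sketch.
docs = {
    "memo_a": "Quarterly report for the finance team covering revenue and costs.",
    "memo_b": "Quarterly report for the finance team covering revenue and expenses.",
    "minutes": "Minutes of the weekly engineering meeting held on Monday.",
}

for first, second, score in near_duplicates(docs):
    print(first, second, score)
```

Comparing every pair of documents works for a small collection. At enterprise scale, the same idea is usually combined with hashing techniques such as MinHash so that only likely matches are compared.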

Poised to reach new potential

With the increasing range of pragmatic AI solutions available, from open source frameworks and evolving vendors to cloud-based APIs, enterprises stand to benefit more than ever from this ecosystem. They now have the flexibility to integrate appropriate approaches and technologies for their use cases.

While NLP is not perfect, it is being consistently enhanced. And the ML algorithms supporting NLP are seeing significant advances with industry giants like Google, Microsoft, and Amazon making strides to improve accuracy.

We’re also leveraging our own technology assets at Accenture to orchestrate different components of NLP applications, making them easily maintainable and scalable using both custom and ready-built algorithms. This means that NLP and ML are slowly gaining maturity, helping businesses to use document understanding to tackle increasingly complex challenges and finally begin to unlock the full potential of unstructured data.

Now that you know more about document understanding, what potential use cases do you see within your organization? What value will it help you unlock?

Kamran Khan

Managing Director – Accenture Applied Intelligence
