

Unleashing the power of unstructured data


April 26, 2021

Today’s petabytes of structured health data are just the tip of a very large digital iceberg.

I recently contributed to the AHA’s Leveraging data for health care innovation report, which I highly recommend to industry professionals interested in the future of healthcare from a digital perspective. That got me thinking about all the unstructured data the health industry generates but can’t process because it remains unorganized.

Unstructured data comes in all shapes and sizes, including text, images, media, sensor data and other information that’s not in a traditional row/column format. Industry estimates put the proportion of unstructured data at about 80% or higher[1]. Most care providers and payers are swimming in unstructured information that could contain vital clinical and business-related insights.

Recent breakthroughs in AI and natural language processing make it possible to machine-read an unstructured medical record and identify concepts that providers and payers can employ in a variety of healthcare use cases.

Icebergs ahead!

Most patient data is unstructured and takes the form of notes, transcripts and documents that aren’t machine readable or computable. In fact, structured health data represents only the tip of the industry’s digital iceberg. Creating structured concepts that are machine readable is the chief reason doctors and other care providers spend so much time typing during patient interactions. Think about it: If the goal were getting words on the page, they could simply dictate them. Instead, they type to create structured concepts that organizations can use for analytical purposes and decision support. Unfortunately, typing also creates a significant drag on caregivers’ productivity and can interfere with their conversations with patients.

A peer-reviewed study[2] on waste in the healthcare system supports this point, identifying administrative complexity as a major challenge. Digging deeper into the findings shows that one of the largest sources of administrative waste involves putting data into or taking information out of the system, either for quality reporting or to facilitate payments. This is the labor-intensive side of dealing with unstructured data, and it costs healthcare organizations millions and perhaps billions of dollars in lost productivity and value.

Unstructured data isn’t computable – sure, other doctors can read it, but that’s about it. You can’t examine it using big data analytics packages to glean new insights about patient care or to identify potential cost savings. Now, however, recent breakthroughs in artificial intelligence (AI) and natural language processing make it possible to machine-read an unstructured medical record and identify concepts that providers and payers can employ in a variety of healthcare use cases, all without the need for a human being to read and edit the document first. The use cases can range from insurance eligibility and payment reviews to clinical trials and clinical decision support. Instead of relying on already-structured data, such a system can take unstructured data, structure it for a particular need, and then execute decision support based on the findings.

AI coupled with either a reading device like natural language processing or a listening device will soon increase the amount of data available for all these use cases while simultaneously reducing the human effort needed to obtain it. But getting there will take more than simply installing new hardware and software. Developers must “train” the system to recognize a specific use case and understand a document in that context. For example, an insurance company trying to decide whether you’re eligible for a disability claim would read a medical record differently than an orthopedic surgeon diagnosing an illness. The words are the same, but the context is different.  
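To make the "same words, different context" point concrete, here is a deliberately simplified sketch. It assumes plain keyword matching in place of a trained language model, and every name in it (the use cases, the trigger phrases, the concept labels and the sample note) is hypothetical and chosen only for illustration. It shows one note being structured into different concepts depending on who is reading it.

```python
from __future__ import annotations

# Hypothetical, hand-written vocabularies (an assumption for illustration):
# each use case maps trigger phrases to the structured concept it cares about.
# A real system would learn these associations from training data.
USE_CASE_CONCEPTS = {
    "disability_claim": {
        "unable to work": "work_limitation",
        "cannot lift": "functional_limitation",
        "chronic": "long_term_condition",
    },
    "orthopedic_diagnosis": {
        "cannot lift": "reduced_shoulder_function",
        "swelling": "inflammation_sign",
        "fracture": "possible_fracture",
    },
}


def extract_concepts(note: str, use_case: str) -> list[str]:
    """Return the sorted concept labels whose trigger phrase appears in the note."""
    text = note.lower()
    vocabulary = USE_CASE_CONCEPTS[use_case]
    return sorted({label for phrase, label in vocabulary.items() if phrase in text})


# A made-up free-text note; the same words feed both use cases.
note = "Patient reports chronic shoulder pain with swelling and cannot lift the right arm."

claim_view = extract_concepts(note, "disability_claim")
clinical_view = extract_concepts(note, "orthopedic_diagnosis")

print(claim_view)
print(clinical_view)
```

The phrase "cannot lift" appears in both vocabularies but yields a different concept in each, which is the sense in which the system has to be trained for a specific use case rather than simply installed.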

Gaining new levels of sophistication

Structured data enables providers and payers to accomplish far more than analysis and fact-based decision making. We’re starting to see a new level of sophistication in some clinics focused on concierge medicine that use it to automate and execute workflows. For example, as soon as the doctor leaves the patient, the system automatically structures the data generated during the visit to schedule a technician to perform a series of lab tests based on its understanding of the caregiver’s intent. No wasted time, no interminable handoffs.

I’d like to leave you with two questions on unstructured data. First, how much time do your clinicians spend reading records? They could be doctors, nurses, coders or literally anybody else in the organization. That’s a lot of eyeballs spending a lot of time perusing medical records. There is a better way: An Accenture study makes the case that humans and machines working together can find information more efficiently than either can alone.[3] The combined approach harnesses the strengths of both parties, the machine’s tireless accuracy and the human’s special knowledge and experience. It’s the definition of synergy!

Second, how much time do your doctors spend typing? Even though some doctors think they’re creating a transcription of the patient visit, what they’re really doing is structuring concepts to support computation and guide subsequent workflow. Now, what if you could give them all that reading and typing time back? Time for the activities only they – not technology – can do? Soon, it should be possible.

The current shortage of caregivers represents one of healthcare’s greatest challenges, because patients don’t get the attention they need, and doctors and clinicians burn out from overwork. It also leads to higher care costs. We now have a way to hand the document reading and writing that caregivers do to collect information over to machines, giving caregivers back the time to do the things only trained medical staff can do for their patients and communities.


Kaveh Safavi, MD, JD

Senior Managing Director – Consulting Global Health