Businesses can use crowdsourced data enrichment to add attributes to unstructured data in several ways, according to Accenture Technology Labs:
Improve Customer Experiences – For example, analyzing customers’ social media posts, online product reviews or other posted material can reveal specific preferences and tendencies, as well as more subtle cues.
Make Faster Decisions – Crowdsourcing is a viable way to distribute the work of cleansing and enriching data: checking accuracy, removing duplicates and assigning references in the correct order.
Leverage Knowledge Assets – Crowdsourcing can help classify and annotate unstructured documents to make them easier to search and reuse, route to the right people or protect with automated security safeguards.
With the maturation of machine learning and other advanced analytics techniques, one may assume that automated analytics—for instance, image recognition or text understanding—would be the best way to enrich raw data with structured meta-data.
In many cases, automation can help. However, while automation is desirable for cost reasons, many tasks that are easy for people are still beyond the capabilities of algorithmic analysis, so automation alone leaves gaps.
Businesses are starting to turn to paid crowdsourcing platforms such as Amazon Mechanical Turk, CrowdFlower and Crowdsource to enrich a variety of data sets with meta-data.
Think of these large, micro-task labor platforms as eBay-like markets for paid human labor rather than for products.
To combine crowd work and automation systematically and effectively, companies must first understand the various patterns and then determine which one works best for their specific tasks. Accenture Technology Labs identified three key patterns:
Pattern 1 - Crowd Plus Automation Pipeline: A workflow is parceled out in phases of work, with some going to the crowd and others to automated algorithms, in a sequence that makes the best use of the strengths of each.
Pattern 2 - Crowd Verifies the Automated System: Many tasks are time-consuming to perform, but once performed, quick to verify. Here the automated system does the work and crowd workers check its output.
Pattern 3 - Crowd Trains the Automated System: Automated analytics approaches often involve machine learning algorithms that need to be “trained” with a large set of sample data. Here the crowd supplies those labeled samples.
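To make the patterns more concrete, here is a minimal Python sketch of one way they might fit together. It is an illustration under stated assumptions, not the Accenture Technology Labs platform: the `auto` and `crowd` callables, the 0.8 confidence threshold and the majority-vote rule are all hypothetical choices. An automated labeler handles every item first, items it is unsure about go to the crowd, and the crowd's answers are kept as candidate training data.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical interfaces (assumptions, not a real platform API):
# an automated labeler returns (label, confidence between 0 and 1);
# a crowd function returns the labels submitted by several workers.
AutoLabeler = Callable[[str], Tuple[str, float]]
CrowdLabeler = Callable[[str], List[str]]


def majority_vote(labels: Iterable[str]) -> str:
    """Return the most common label among the crowd's answers."""
    return Counter(labels).most_common(1)[0][0]


def enrich(
    items: Iterable[str],
    auto: AutoLabeler,
    crowd: CrowdLabeler,
    threshold: float = 0.8,  # assumed cutoff; tune by experiment
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Label items automatically and route low-confidence items to the crowd.

    Returns the final label for each item, plus the crowd-labeled
    examples that could later be used to retrain the automated system.
    """
    results: Dict[str, str] = {}
    training_examples: List[Tuple[str, str]] = []
    for item in items:
        label, confidence = auto(item)
        if confidence < threshold:
            # The automated system is unsure, so people decide.
            label = majority_vote(crowd(item))
            training_examples.append((item, label))
        results[item] = label
    return results, training_examples
```

Asking several workers per item and taking the majority is only one simple way to guard against individual mistakes; a production workflow would likely need quality checks beyond this.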
To get started, companies should take inventory of new data sources available from various digital channels.
The next step is to examine and analyze these sources to determine which kinds of unstructured data could be converted into useful data and ingested into the data supply chain to help achieve business objectives.
Companies should then run experiments to determine which patterns for combining crowdsourcing and automation are most effective.
Once this is decided, the final phase is to design and develop a data enrichment production system that will allocate workflows to the crowd.
We’ve created a crowd-powered data enrichment prototype platform to facilitate this process through iterative experimentation and systematic fine-tuning of crowd workflow configurations, price-per-task and other variables.
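As a rough illustration of what such experimentation might involve, the Python sketch below compares trial runs of different workflow configurations and picks the cheapest one that still meets an accuracy target. The `TrialResult` fields, the idea of scoring each trial against a set of known-correct answers, and the selection rule are assumptions made for this example; they do not describe how the prototype platform works.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrialResult:
    """Outcome of one experimental run (hypothetical record layout)."""
    name: str
    price_per_task: float   # payment per worker per item
    workers_per_item: int   # how many workers label each item
    correct: int            # items matching the known-correct answers
    total: int              # items checked

    @property
    def accuracy(self) -> float:
        return self.correct / self.total if self.total else 0.0

    @property
    def cost_per_item(self) -> float:
        return self.price_per_task * self.workers_per_item


def cheapest_acceptable(
    trials: List[TrialResult], min_accuracy: float
) -> Optional[TrialResult]:
    """Return the lowest-cost trial meeting the accuracy target, or None."""
    acceptable = [t for t in trials if t.accuracy >= min_accuracy]
    return min(acceptable, key=lambda t: t.cost_per_item, default=None)
```

Each new round of trials can be fed back through `cheapest_acceptable` with a different target, which is one simple way to iterate on price and redundancy together.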
As we discuss crowdsourcing and data enrichment with a broad range of clients, from clothing retailers and grocers to enterprises with large internal knowledge bases to manage, we are hearing a similar story in many different contexts.
With the explosion of new sources of data, cleansing and enrichment are growing increasingly important, and automatic analysis alone still handles only a part of that challenge.
By combining paid and free crowdsourced data enrichment with automation, and using Accenture Technology Labs’ prototype platform to bring human labor into the processing loop, companies can turn the many potential streams of unstructured data into a continuously flowing river of useful data.
As this area matures, companies will want to integrate crowdsourcing seamlessly with other business systems, and work to combine mastery of the crowd with cognitive computing techniques.