If you understand what the user is searching for, you can make a better search engine: Better understanding = better results. This has been a guiding principle throughout the evolution of search engines. Trouble is, language is complicated. There can be so many words in a search query—not to mention possible combinations of those words and their varied meanings—that only now are we developing the tools to better understand user intent and make better search a reality.
Take a moment to consider the stats, and you may see why many past attempts to improve semantic understanding have failed. By far the largest percentage of words (well over 90 percent) will only be used a small number of times in our daily lives and therefore in any data set. With so few occurrences, these words are unsuitable for machine learning (ML) approaches, which need large training datasets.
On the other hand, there are a small number of words used all the time (words like “the” or “and”) that become too ambiguous and hence are also not useful for ML. This leaves a small set of “goldilocks” words that are used frequently enough to yield plenty of examples, but not so often that they become useless. So, the first challenge is identifying these words—the needles in the proverbial haystack.
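To make the three groups concrete, here is a minimal Python sketch that splits a vocabulary into rare, “goldilocks” and too-common words by simple counting. The function name `frequency_bands` and the thresholds `min_count` and `max_share` are assumptions chosen for illustration; they are not taken from any particular system, and suitable cut-offs would depend on the data set.

```python
import re
from collections import Counter


def frequency_bands(text, min_count=3, max_share=0.05):
    """Group words into rare, goldilocks and too-common bands.

    min_count: words seen fewer times than this are treated as rare.
    max_share: words making up more than this share of all tokens
               are treated as too common. Both values are illustrative.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)

    bands = {"rare": [], "goldilocks": [], "too_common": []}
    for word, n in counts.items():
        if n < min_count:
            bands["rare"].append(word)        # too few examples to learn from
        elif n / total > max_share:
            bands["too_common"].append(word)  # used everywhere, so ambiguous
        else:
            bands["goldilocks"].append(word)  # frequent enough, still informative
    return bands
```

On a realistic corpus, the rare band would be expected to hold the great majority of distinct words, which is the imbalance described above.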
A second challenge is understanding the meaning people bring to language through shared world-knowledge when they use those words in conversation. This alone goes a long way towards explaining why past approaches to semantic search have failed to deliver and why it has taken so long to get where we are today.
But what if we could train a computer to acquire the world knowledge necessary to effectively understand language, and apply it to semantic search?
Welcome to the age of new semantic search
New semantic search has shifted search engines away from surfacing content based on the literal words users type into a search bar, and toward understanding the intention behind those words and surfacing the content users really need. In other words, search engines are becoming more like find engines.
Consider, for instance, what results from typing “1 USD in GBP” or “country code 56” into a search engine such as Google. The results will give you the answer you seek, not just a series of results that include your search language.
By drawing on text data from the outside, new semantic search approaches have a broader and more accurate understanding of nuance than past ones. Thanks to neural networks (NNs) and universal sentence encoders, computers are being trained to read sentences and come up with an “abstract semantic understanding” of the content. So, for example, searches for “1 USD in GBP” or “1 dollar converted to British pounds” could be used to train a system to “understand” queries as [number][currency][action or phrase][currency]. And this pattern means “convert one currency to another.”
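The [number][currency][action or phrase][currency] pattern can be illustrated with a small Python sketch. Note that this is a hand-written regular expression, not a trained neural network: it only shows what the learned abstraction might look like once a query has been recognized. The `CURRENCIES` table, the list of action phrases and the `parse_conversion` function are all hypothetical names invented for this example.

```python
import re
from typing import Optional

# Hypothetical lookup from currency words to ISO codes (illustrative only).
CURRENCIES = {
    "usd": "USD", "dollar": "USD", "dollars": "USD",
    "gbp": "GBP", "british pound": "GBP", "british pounds": "GBP",
}

# Longest alternatives first so "british pounds" wins over "british pound".
_currency = "|".join(
    sorted((re.escape(name) for name in CURRENCIES), key=len, reverse=True)
)

# [number][currency][action or phrase][currency]
PATTERN = re.compile(
    rf"^\s*(?P<amount>\d+(?:\.\d+)?)\s+(?P<src>{_currency})\s+"
    rf"(?P<action>in|to|converted to)\s+(?P<dst>{_currency})\s*$",
    re.IGNORECASE,
)


def parse_conversion(query: str) -> Optional[dict]:
    """Return a structured intent if the query fits the pattern, else None."""
    match = PATTERN.match(query)
    if match is None:
        return None
    return {
        "intent": "convert_currency",
        "amount": float(match.group("amount")),
        "source": CURRENCIES[match.group("src").lower()],
        "target": CURRENCIES[match.group("dst").lower()],
    }
```

Both example queries from the text map to the same structured intent, while a query such as “country code 56” does not fit the pattern and returns `None`. A trained model would aim to generalize to phrasings that no hand-written rule anticipates, which is the advantage the article attributes to the neural approach.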
Typically, these NNs are trained using wide-ranging, cross-purpose text from sources such as Wikipedia and MedLine. With enough text, we will have more examples of how words (and phrases) are used, giving us a richer understanding of the content. In this way, we bring external world knowledge into our “search and find” experience to yield better results.
Bringing new semantic search to business
Introduced by Google, new semantic search capabilities are starting to redefine the challenge at hand by truly understanding what users want. It sounds simple and, as consumers, we’ve begun to expect this search ability from every service provider and search engine.
But in enterprises, that hasn’t always been possible. Without access to the thousands of data scientists and machine learning experts needed to implement more sophisticated semantic search capabilities—which can be costly and time-consuming—organizations have largely struggled to harness this approach.
In our experience, adopting an enterprise-wide approach to search has also been met with three major barriers:
- 1. Data has traditionally been inaccessible, locked away in siloed business systems.
- 2. Integrating the best search engines with the best machine learning and natural language processing (NLP) capabilities has been difficult, and at times virtually impossible, to date.
- 3. Manual coding has been required to address language ambiguity, so technology has failed to address the problem efficiently and at scale.
Fortunately, these barriers are starting to erode.
The growth of data warehousing, data lakes and data ingestion tools is breaking down silos and making data more readily available across organizations. And the advent of new tools designed specifically to implement semantic search for business applications is solving the integration challenge.
While search engines, ML and NLP remain distinct technologies, we are getting better at integrating them. In fact, many search engine companies and cloud providers (such as Google, Microsoft and AWS) now offer these capabilities in an umbrella solution, off the shelf. Additionally, evolutions in technology are enabling more accurate ambiguity resolution and NLP without the need for coding, meaning that new semantic search is fast becoming a realistic and maintainable option for enterprise organizations.
Take a client of ours in aerospace manufacturing as an example. Employees on the manufacturing floor can simply aim a barcode reader at an aircraft part’s barcode, and the tool’s system will do a corporate-wide search on how to use or maintain that part, surfacing only the most relevant information. We’ve interpreted the barcode as the user query “tell me everything about this part, including how to maintain or replace it.”
By taking corporate knowledge from experts within the organization and making that actionable through this new semantic search function, we’ve helped reduce the distance between the users and all the business’s systems and gathered knowledge.
If semantic search is, at the end of the day, about creating access to answers and information, then it stands to reason that creating a single source of truth would be part and parcel of it. Imagine having one lookup tool to find the answers to all questions across the organization. In the wider business context, new semantic search could be used to improve access to any number of information points, such as product names, charge codes, email addresses, invoice and contract numbers, office locations, and so on.
Rather than going to the finance website for invoice numbers, directory for email addresses, or IT website to ask a tech question, imagine the value—not to mention relief—of having one search tool that could understand your need and provide the correct response.
Of this we are confident—new semantic search will in time become the Swiss Army® knife for every organization’s information. Public search engines have elevated their game on the utility and experience they give to users. And now enterprises can do the same for customers and employees.