From traditional search to AI-powered search & analytics applications
Many search engine discussions focus on the major Internet search engines (Google, Bing, Baidu, etc.). Besides those well-known engines, however, there are hundreds of thousands, probably millions, of smaller search engines installed around the world for searching datasets that are smaller and more focused than the entire World Wide Web. I’m talking about business search applications built on search engine software: open source options like Elasticsearch and Solr, or commercial offerings like Microsoft Azure Search, Google Cloud Search, Sinequa, Attivio, Lucidworks, etc. So what are these search engines used for?
In my years as a search engine programmer and architect, I’ve come across lots of business-critical as well as creative uses for search engines. Here’s a roundup of ten prevalent search use cases we have helped insight-driven enterprises explore and implement over the years.
1. Knowledge management
Searching across documents
This is the most obvious application. If you have a lot of documents (maybe billions of documents amounting to petabytes of data), search them with a search engine. This applies to all organizations with large collections of documents, such as publishers, research companies, government offices, and so on.
Answering natural language questions
We use search engines to store knowledge graphs, which represent relationships between data points. Wikidata is a good example: by indexing all of Wikidata in a search engine, we can look up facts in the index when we need to answer natural language questions such as “How tall is Mount Kilimanjaro?”
We can do the same for enterprise data: a search engine stores the enterprise knowledge graph, which then helps answer questions like “What was U.S. sales revenue last quarter?”
Note that the search engine itself does not perform Natural Language Understanding (NLU) to interpret the question; a separate system does that. Once the NLU system has identified what is being asked, the search engine is used to look up the answer. Combined with Natural Language Processing (NLP) components in this way, search engines can deliver enterprise knowledge through search, chatbots, and question-answering systems.
of enterprise data consists of unstructured, natural language content - text, audio, videos, images. Combining search with NLP can help unlock hidden insights.
Validating whether the architecture plans for your building are complete
One of the more unusual uses of a search engine was indexing every element of every architectural drawing for a building. This belongs to a field of construction called “Building Information Modeling,” or BIM.
The elements of the architectural drawings, including doors, windows, stairs, studs, wires, pipes, light fixtures, electrical panels, etc., are indexed. Then a series of rules, expressed as complex search engine queries, validates that all of the building information is accurate, consistent, and fully filled out so that the building can be properly quoted and constructed.
2. People and places
Matching candidates to jobs
All recruiting companies have large databases of potential candidates. Search engines take a job description and find the candidates who are most likely to be successful in the job.
Modern recruiting search-and-match applications use Artificial Intelligence (AI) to better understand the candidate and the job. For example, AI techniques are used to help identify things like language fluency, legal requirements (e.g. "Has the lawyer passed the bar exam?" "Does the registered nurse have a certificate?"), or “worked for” and “manages” relationships.
Finding people even when you can’t remember how to spell their names
Search engines don’t have to search for whole words. They can also search for fragments of people’s names, which lets them find people with similarly spelled names even in extremely large lists of names.
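To make the idea of searching name fragments concrete, here is a minimal sketch. It assumes character 3-grams and Jaccard similarity as the scoring method, which is one common approach and not necessarily what any particular search engine does internally. The `NameIndex` class and the sample names are hypothetical.

```python
from collections import defaultdict


def char_ngrams(name: str, n: int = 3) -> set[str]:
    """Return the set of overlapping character n-grams of a padded, lowercased name."""
    padded = f"{' ' * (n - 1)}{name.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


class NameIndex:
    """Tiny in-memory inverted index from character n-gram to names (illustrative only)."""

    def __init__(self, n: int = 3) -> None:
        self.n = n
        self.postings: dict[str, set[str]] = defaultdict(set)

    def add(self, name: str) -> None:
        for gram in char_ngrams(name, self.n):
            self.postings[gram].add(name)

    def search(self, query: str, limit: int = 5) -> list[tuple[str, float]]:
        query_grams = char_ngrams(query, self.n)
        # Only names sharing at least one n-gram with the query are candidates.
        candidates: set[str] = set()
        for gram in query_grams:
            candidates |= self.postings.get(gram, set())
        scored = []
        for name in candidates:
            name_grams = char_ngrams(name, self.n)
            jaccard = len(query_grams & name_grams) / len(query_grams | name_grams)
            scored.append((name, jaccard))
        # Highest similarity first; ties broken alphabetically.
        scored.sort(key=lambda item: (-item[1], item[0]))
        return scored[:limit]


index = NameIndex()
for person in ["Katherine Johnson", "Catherine Jonson", "Kathryn Johnston", "Robert Smith"]:
    index.add(person)

# A misspelled query still retrieves the similarly spelled names.
results = index.search("Kathrine Jonson")
for name, similarity in results:
    print(f"{similarity:.2f}  {name}")
```

The inverted index is what keeps this fast on large lists: only names that share at least one fragment with the query are ever scored.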
Turning your telephone number into a word or words
I once used a search engine to search a dictionary for a word or phrase that I could use in place of my telephone number. You know, where 2 = abc, 3 = def, 4 = ghi, etc.
I did this by dicing up the words in the dictionary into overlapping 2-character patterns (i.e., overlapping 2-grams) and then searching for the combinations of character patterns found in the telephone number. It worked like a charm.
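Here is a rough sketch of the phone-number idea. Note that it swaps in a simpler technique than the 2-gram approach described above: each dictionary word is converted to its keypad digits and indexed by that digit string, and every run of digits in the number is then looked up. The tiny word list and the phone number are made up for illustration.

```python
from collections import defaultdict

# Standard telephone keypad letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {letter: digit for digit, letters in KEYPAD.items() for letter in letters}


def to_digits(word: str) -> str:
    """Convert a word made of the letters a-z to the digits that spell it on a keypad."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def build_index(words):
    """Index every purely alphabetic ASCII word by its keypad digit string."""
    index = defaultdict(list)
    for word in words:
        if word.isalpha() and word.isascii():
            index[to_digits(word)].append(word)
    return index


def words_in_number(number: str, index) -> list[tuple[int, str]]:
    """Return (start_position, word) for every indexed word spelled by a run of digits."""
    digits = "".join(ch for ch in number if ch.isdigit())
    hits = []
    for start in range(len(digits)):
        for end in range(start + 1, len(digits) + 1):
            for word in index.get(digits[start:end], []):
                hits.append((start, word))
    return hits


sample_words = ["flower", "flow", "pizza", "love"]  # hypothetical mini dictionary
index = build_index(sample_words)
hits = words_in_number("1-800-356-9377", index)     # made-up number
print(hits)
```

With a real dictionary you would load many thousands of words and then pick the combination of hits that covers as many digits as possible.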
Finding your soul mate
When you log into a dating website, you enter your own information and then fill in the “In Search Of” section, where you describe your ideal mate in terms of gender, age, location, religion, smoking or non-smoking, politics, pets, children, and taste in music. Behind the scenes, a search engine matches your profile against all other profiles to find your best match.
A search engine can do this very fast. It can also weigh each characteristic in a profile to distinguish “hard requirements” from those that are “highly desired” or simply “somewhat desired.”
So yes: A search engine is helping you find love!
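The difference between hard requirements and weighted preferences can be sketched in a few lines. This is only an illustration of the scoring idea, assuming exact-match attributes and hand-picked weights; the `Preferences` class, the attribute names, and the profiles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Preferences:
    required: dict = field(default_factory=dict)  # hard requirements: attribute -> value
    desired: dict = field(default_factory=dict)   # attribute -> (wanted value, weight)


def score(profile: dict, prefs: Preferences):
    """Return None if any hard requirement fails, else the sum of matched weights."""
    for attr, wanted in prefs.required.items():
        if profile.get(attr) != wanted:
            return None
    return sum(
        weight
        for attr, (wanted, weight) in prefs.desired.items()
        if profile.get(attr) == wanted
    )


def best_matches(profiles, prefs, limit=3):
    """Drop profiles that fail a hard requirement, then rank the rest by score."""
    scored = [(score(p, prefs), p["name"]) for p in profiles]
    ranked = sorted(
        ((s, name) for s, name in scored if s is not None),
        key=lambda item: (-item[0], item[1]),
    )
    return ranked[:limit]


profiles = [
    {"name": "Alex", "smoker": False, "pets": "dog", "music": "jazz"},
    {"name": "Sam", "smoker": False, "pets": "cat", "music": "jazz"},
    {"name": "Jo", "smoker": True, "pets": "dog", "music": "jazz"},
]
prefs = Preferences(
    required={"smoker": False},                                  # hard requirement
    desired={"pets": ("dog", 3.0), "music": ("jazz", 1.0)},      # highly vs. somewhat desired
)
matches = best_matches(profiles, prefs)
print(matches)
```

In a search engine, the hard requirements typically become filters and the weighted preferences become boosted optional clauses, but the ranking logic is the same in spirit.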
Websites like ancestry.com use a search engine to find people's ancestors.
My team and I designed and implemented the search functionality for a government organization to enable its users to do genealogy research, especially to find soldiers and war casualties.
Finding good businesses with A+ ratings
A long time ago, my team and I helped design and implement the search engine for a business directory website. This was interesting because it involved geo-sensitive searching (for example, “I want the burger joint closest to me”) and fuzzy name search.
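As a small illustration of geo-sensitive search, the sketch below filters businesses by category and sorts them by great-circle distance from the user, using the haversine formula. The business records and coordinates are invented, and a real search engine would use a spatial index instead of scanning every record.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest(businesses, lat, lon, category, limit=3):
    """Filter by category, then sort by distance from the user's location."""
    matching = [b for b in businesses if b["category"] == category]
    matching.sort(key=lambda b: haversine_km(lat, lon, b["lat"], b["lon"]))
    return matching[:limit]


# Hypothetical directory entries.
businesses = [
    {"name": "Burger B", "category": "burgers", "lat": 40.7128, "lon": -74.0060},
    {"name": "Pizza C", "category": "pizza", "lat": 40.7581, "lon": -73.9856},
    {"name": "Burger A", "category": "burgers", "lat": 40.7590, "lon": -73.9845},
]

closest = nearest(businesses, lat=40.7580, lon=-73.9855, category="burgers")
print([b["name"] for b in closest])
```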
3. IT system performance
Offloading relational databases (RDBs)
Companies that run expensive RDB systems can load their data into a search engine and run most of their searches there instead of issuing SQL queries against the RDB. This reduces the load on the RDB system, so they don’t have to buy more RDB hardware or more expensive RDB licenses.
Dashboards that show how my IT system is running
We load all of the log data from a cluster of computers into a search engine and then use the search engine to create dashboards showing how the machines are operating. The metrics on these dashboards can include RAM, disk space, CPU, processes, transactions per second, etc.
Search engines can do this effectively over billions of log lines as well as quickly produce histograms and trend lines suitable for charts and graphs.
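The histogram idea can be illustrated with a toy example that buckets log lines by minute, optionally filtered by log level. The log format here (ISO timestamp, level, message) is an assumption made for the sketch; a search engine performs the same kind of bucketing as an aggregation over its index, at far larger scale.

```python
from collections import Counter
from datetime import datetime


def minute_histogram(log_lines, level=None):
    """Count log lines per minute, optionally keeping only one log level.

    Assumes each line looks like: "<ISO timestamp> <LEVEL> <message>".
    """
    counts = Counter()
    for line in log_lines:
        timestamp, line_level, _message = line.split(" ", 2)
        if level is not None and line_level != level:
            continue
        # Truncate the timestamp to the start of its minute to form the bucket key.
        bucket = datetime.fromisoformat(timestamp).replace(second=0, microsecond=0)
        counts[bucket] += 1
    return dict(sorted(counts.items()))


# Hypothetical log lines.
logs = [
    "2024-01-01T10:00:05 INFO service started",
    "2024-01-01T10:00:40 ERROR disk almost full",
    "2024-01-01T10:01:10 ERROR disk full",
    "2024-01-01T10:01:15 INFO retrying write",
    "2024-01-01T10:01:59 ERROR write failed",
]

errors = minute_histogram(logs, level="ERROR")
for bucket, count in errors.items():
    print(bucket.isoformat(), "#" * count)
```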
Identifying files that can be deleted or archived to free up disk space (storage analytics)
We index directories of files, sometimes along with the file content, to automatically identify files that meet certain criteria so that they can be defensibly removed or archived.
Candidates can include inappropriate files (e.g. inappropriate images, illegal music), old files, files owned by employees who are no longer with the company, files that meet certain retention criteria, etc.
Search engines are good for this application because they can handle extremely large numbers of files (billions) and return large sets of files which meet the specific criteria very quickly (in seconds). The “specific criteria” are specified as complex search queries.
4. E-commerce and customer services
The e-commerce giants all use search engines for their on-site product search. We’ve implemented dozens, maybe hundreds, of e-commerce search applications.
What makes search engines especially good for e-commerce site search is that they can display the products you are most likely to buy at the top of the search results list.
This is done with a great deal of data science and analytics, which compare your past queries, past purchases, and other information (total spend, time on the site, etc.) with past customers and their purchase habits. Most of the analysis is performed on big data machines in the background. The final data (many vectors of hundreds or thousands of numbers) is then indexed into the search engine along with all of the products in the product catalog.
The search engine uses this information to make educated guesses about which products you are most likely to purchase and places those at the top of the search results.
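A heavily simplified sketch of this kind of ranking: products that match the query text are re-ordered by the dot product between a customer vector and each product’s vector. The three-dimensional vectors, the catalog, and the use of a plain dot product are assumptions for illustration only; production systems use much larger vectors and more elaborate scoring.

```python
def dot(a, b):
    """Dot product of two equal-length numeric vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_products(query, customer_vector, catalog, limit=10):
    """Keep products whose title contains every query term, then rank by affinity."""
    terms = query.lower().split()
    matching = [p for p in catalog if all(t in p["title"].lower() for t in terms)]
    matching.sort(key=lambda p: dot(customer_vector, p["vector"]), reverse=True)
    return matching[:limit]


# Hypothetical catalog; each vector is a precomputed product embedding.
catalog = [
    {"title": "Leather dress shoes", "vector": [0.0, 0.2, 0.9]},
    {"title": "Trail running shoes", "vector": [0.9, 0.1, 0.0]},
    {"title": "Running socks", "vector": [0.8, 0.1, 0.0]},
]
customer = [1.0, 0.0, 0.1]  # hypothetical vector for a customer who mostly buys sports gear

ranked = rank_products("shoes", customer, catalog)
print([p["title"] for p in ranked])
```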
In addition to product search, search engines are used to generate intelligent product recommendations, without any search at all.
After all, if you have a formula for sorting products by how likely the customer is to purchase them, such as popularity or best-seller rank, you can apply it over the entire catalog (or any filtered subset) to provide product recommendations at any time.
Examples include “Best sellers,” “Recommendations for you,” “Popular right now,” “Trending now,” and “New Releases.”
Some readers may point out that the recommendation results can be simply pre-computed and displayed on the website, and this is true. However, in most implementations I’ve seen, the results actually come from the search engine because it holds all of the product metadata necessary for prices, images, descriptions, etc.
Dashboards that show how an e-commerce system is performing
Using the same techniques as for IT dashboards, we built dashboards showing how a large e-commerce site was performing.
Metrics like revenue, conversion, top products, top categories, top customers, total discounts, etc. gave the company an up-to-the-second answer to “How is my e-commerce site performing?” Sudden drops often indicated a problem in one of their systems.
Search engines (compared to relational databases) are good for this application because they can handle billions of transactions and very high volumes of updates.
Finding segments of customers for targeted marketing
Businesses can write all of their customers’ information into a search engine and then use the search engine to find customer segments for targeted email campaigns.
Search engines are good for this application because they can very quickly (in less than a second) compute a subset of customers from a very large list of customers based on any criteria that your marketing team may think up.
5. Legal and contracts
This use case is typically called “E-Discovery.” Fun fact: it was one of the early uses of search engines. I remember working with clients on this use case in 1988.
When you are working on a legal case, you may ask the involved parties: “Give me all of your documents between these dates written by person X or containing word Y.” You will then get literal truckloads of documents. Throw them all into a search engine and search them to find the evidence that will prove your adversary was guilty. Search engines can also help you determine which specific people and words to ask for in the first place, which is an interesting use case in its own right.
Finding laws and regulations
I also helped an organization design and implement its site search so that users could find laws and regulations. This was an unusual search system because we had to split the content into lots of small pieces to achieve the search granularity we wanted.
6. Security and intelligence
Identifying public threats
Companies, mostly financial institutions, are searching their customer databases to make sure that none of their customers are public threats. They can use search engines to check every one of their customers against lists of individuals monitored by law enforcement. This can include exact name matches and searches on character patterns found in the name.
Identifying insider threats
These threats can include moles inside government intelligence agencies, traders who are violating insider information laws, bribery, harassment, and all sorts of non-compliant behavior.
By using search engines to index all emails and search for irregularities, organizations can find employees who are not playing by the rules.
Identifying website hackers
Search engines can ingest your router, switch, and web server logs to look for suspicious behavior such as Distributed Denial of Service (DDoS) attacks, traffic from suspicious locations, people probing unusual ports, etc.
Similarly, search engines can ingest Virtual Private Network (VPN) traffic to look for contractors with suspiciously large VPN downloads who may be trying to steal your customers’ information.
7. Oil and gas
Finding places to drill for oil
Oil and gas companies have been around for a long time and have tons of reports and studies on what’s in the ground. Search engines can sift through this content - often with geographic shape filters - to help them identify areas which are rich with indicators for possible trapped hydrocarbons.
8. Genomics and biotech
Genome variation research
We created a search engine for a research hospital to help them find patients whose genomes vary from a reference genome in similar ways. The idea is that if a number of patients with the same genome variation also have the same illness or symptoms, perhaps there is a causal link worth studying.
Chemical structure and chemical sub-structure search
We recently created a search engine which indexes chemical sub-structures. For example, you can submit an entire chemical structure like Aspirin (using the chemical's “SMILES” notation) and the search engine will find chemicals with a similar structure to Aspirin. It can also find chemicals with a similar sub-structure.
Search engines can do this across millions, perhaps billions, of chemicals.
Finding patients for drug trials
For medical research, we index anonymized patients and all of their data into search engines. This can include patient demographics (age, weight, height) as well as activities such as doctor visits, prescriptions, symptoms, diagnoses, etc.
This allows us to find “cohorts” of patients that have the necessary symptoms, illnesses, demographics, and doctor visit activities to include in new drug trials.
9. Content suggestions
Supporting the type-ahead or query completion search feature
When you go to a website and type a couple of characters into the search box (for example, “ab”), have you noticed that some search boxes show a list of suggested words in a drop-down? Often, a behind-the-scenes search engine produces the data in that drop-down, a feature known as type-ahead or query completion.
Instead of indexing whole words, the search engine can index “start fragments” such as “a/ab/abs/abso/absol/absolu/absolut/absolute.” This is a really fast method for taking a few characters and finding all of the words that start with those characters.
A search engine is especially good for query completion because it can produce a relevancy-ranked list of results. Relevancy ranking can be used to personalize the list as well as to add preferences, such as boosting the most common query completions.
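The “start fragments” technique is easy to sketch. The example below indexes every prefix of each word and ranks completions by a popularity count, standing in for relevancy ranking. The `TypeAhead` class, the words, and the popularity numbers are hypothetical.

```python
from collections import defaultdict


def start_fragments(word: str) -> list[str]:
    """Return every prefix of the word, shortest first."""
    return [word[:i] for i in range(1, len(word) + 1)]


class TypeAhead:
    """Maps each start fragment to the set of words that begin with it."""

    def __init__(self) -> None:
        self.by_prefix: dict[str, set[str]] = defaultdict(set)
        self.popularity: dict[str, int] = {}

    def add(self, word: str, popularity: int = 0) -> None:
        word = word.lower()
        self.popularity[word] = popularity
        for fragment in start_fragments(word):
            self.by_prefix[fragment].add(word)

    def complete(self, typed: str, limit: int = 5) -> list[str]:
        # One dictionary lookup finds every word starting with the typed characters.
        candidates = self.by_prefix.get(typed.lower(), set())
        # Most popular completions first; ties broken alphabetically.
        return sorted(candidates, key=lambda w: (-self.popularity[w], w))[:limit]


suggest = TypeAhead()
for word, count in [("absolute", 50), ("abstract", 80), ("able", 20), ("zebra", 10)]:
    suggest.add(word, popularity=count)

print(start_fragments("absolute"))
print(suggest.complete("ab"))
```

The trade-off is index size: every word is stored once per prefix, which costs space but makes each lookup a single exact match.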
And last but not least, one of the most common search use cases and one that we probably encounter every day: the search box at the top of this website as well as any other website is handled by a search engine.
Search engines are incredibly useful: they show up everywhere from business applications to everyday life (there’s one on your smartphone right now!). The search use cases discussed above have added significant value to our clients’ organizations. And more use cases are being invented every day as we continue combining search with technologies like analytics and NLP to help improve business outcomes.