In a blog I wrote earlier this year, I discussed Neural Networks as one of the key developments to watch in the search and analytics space. Neural networks can “learn” to perform tasks through pattern recognition. They can create a “semantic space” – an abstract understanding of the enterprise content, which can be used for better search, classification, and question/answer.
As more client projects unfold this year, it's clear that neural networks are increasingly useful: they go beyond simple keyword search and enable the search engine to understand the user's meaning and intent, so it can return more personalized, relevant results.
I had a chance to chat with a couple of our engineers who have recently worked on neural network applications. We’ll be sharing some neural network use cases and highlights of our approaches in this interview-style blog.
Use case #1: Neural networks for recruiting applications
Paul Nelson: Hi Mark and Mauricio. I’m glad to have you two here today to share some of the innovative neural network use cases you’ve worked on. Let’s start with Mark. You’ve mentioned a recruiting application. So how are neural networks being used effectively in recruiting?
Mark Stanger: It’s an interesting application, Paul. We use natural language processing (NLP) to extract useful information from job descriptions, such as job titles and skills. The objective of this project was to use the extracted data to automatically create suggested job descriptions for a position with some contextual richness. Recruiters and hiring managers can then manually review and edit them to ensure accuracy and consistency. Overall, it provides a more centralized and efficient process for creating and handling job descriptions.
Neural networks are beneficial in helping us build two models to extract and group together similar entities or sentences within job descriptions:
- An entity recognition model which looks at sentences within job descriptions and identifies words or phrases that match specific labels, such as technical skills, general skills, education requirements, abilities, etc.
- A sentence classification model which looks at whole sentences to determine the sentence type. In this case, it decides whether the sentence describes a task, a responsibility, or neither. This helps the machine understand what the sentence represents and what its intent is.
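To make the two models' roles concrete, here is a minimal sketch of the kind of output each one produces. A production system would use fine-tuned BERT models; the lexicons, labels, and rules below are purely illustrative assumptions.

```python
# Toy sketch of the two models' outputs. A real system would use
# fine-tuned neural models; the lexicons and rules here are
# illustrative stand-ins.

TECH_SKILLS = {"python", "sql", "elasticsearch"}
GENERAL_SKILLS = {"communication", "leadership"}

def recognize_entities(sentence):
    """Label known words in a sentence, mimicking an entity-recognition model."""
    entities = []
    for word in sentence.lower().replace(",", "").split():
        if word in TECH_SKILLS:
            entities.append((word, "TECHNICAL_SKILL"))
        elif word in GENERAL_SKILLS:
            entities.append((word, "GENERAL_SKILL"))
    return entities

def classify_sentence(sentence):
    """Decide whether a sentence describes a task, a responsibility, or neither."""
    s = sentence.lower()
    if s.startswith("responsible for"):
        return "RESPONSIBILITY"
    if any(s.startswith(v) for v in ("write", "build", "maintain")):
        return "TASK"
    return "NEITHER"

sentence = "Maintain Elasticsearch clusters and Python services"
print(recognize_entities(sentence))  # [('elasticsearch', 'TECHNICAL_SKILL'), ('python', 'TECHNICAL_SKILL')]
print(classify_sentence(sentence))   # TASK
```

The value of the neural versions is that they generalize beyond any fixed lexicon or rule set, labeling skills and sentence types they were never explicitly given.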
Paul: So, the purpose is to find and group similar things within job description documents. What are the advantages of choosing neural networks over traditional statistical techniques, such as bag of words?
Mark: Bag of words and other traditional techniques have no understanding of context within the sentence. Those techniques can spot a word in a sentence but not its position. In most use cases, syntax is vital to understanding the context and meaning of the sentence.
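Mark's point is easy to demonstrate: bag of words assigns identical representations to sentences with very different meanings, because word order is discarded. A minimal sketch:

```python
from collections import Counter

def bag_of_words(sentence):
    """Count word occurrences, discarding order and therefore syntax."""
    return Counter(sentence.lower().split())

# Two job-description lines with opposite requirements...
a = bag_of_words("experience required, degree preferred")
b = bag_of_words("degree required, experience preferred")

# ...get exactly the same bag-of-words representation.
print(a == b)  # True
```

A contextual model like BERT, by contrast, produces different vectors for these two sentences because it encodes each word together with its position and neighbors.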
Now with BERT – neural networks that have been pre-trained on massive text corpora such as Wikipedia – we can take word similarities and synonyms into account, enabling the machine to better understand the sentence.
Paul: That really is beneficial to sentence understanding as opposed to simple keyword identification. What's interesting is that the "Bidirectional" in BERT (Bidirectional Encoder Representations from Transformers) means the neural network looks at the context on both sides of each word – before and after – to better understand the sentence. I'm excited to hear further insights and results as you work on this project.
Mark: Definitely. So far, we’ve been evaluating the performance of these models using an F1 score, which measures a balance between precision and recall. And we’ve had pretty good scores from both models.
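The F1 score Mark mentions is the harmonic mean of precision and recall, so a model only scores well when both are high. A minimal sketch of the computation:

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = harmonic mean of precision and recall."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# Hypothetical run: the model found 80 correct entities,
# flagged 10 spurious ones, and missed 20 real ones.
print(round(f1_score(80, 10, 20), 3))  # 0.842
```

Because it penalizes imbalance, F1 is a common choice for entity recognition, where a model could otherwise game either precision (by predicting very little) or recall (by predicting everything).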
Use case #2: Neural networks for question/answer systems
Paul: Moving on to Mauricio with another exciting use case. Mauricio, I know you’re using neural networks to improve a question/answer application. Can you share some details on this project?
Mauricio Rizo Sandi: Sure, Paul. This application is very relevant in this pandemic, and I think we all would like to get the latest answers to our COVID-19 related questions. This question/answer system does exactly that – it searches for similar text within the organization's knowledge base and provides answers to questions that are similar to the existing ones in the knowledge base. The system makes it easier and more productive for the organization to serve similar answers to similar questions. It also helps searchers save time by suggesting a list of possible questions as they type in the search box.
Paul: Great. So how do neural networks identify similar questions?
Mauricio: We use these neural networks to convert text into high-dimensional vectors using pre-trained language models. We then use those vectors to compare sentence similarities. The closer the vectors, the more similar the sentences. We started with pre-trained BERT models and applied some scoring techniques to improve them.
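Vector closeness is typically measured with cosine similarity. The sketch below uses hand-made 3-dimensional vectors to keep the example self-contained; real BERT embeddings have hundreds of dimensions and would come from an encoder model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for real BERT sentence vectors.
q1 = [0.9, 0.1, 0.0]   # "should I go to the doctor"
q2 = [0.8, 0.2, 0.1]   # "should I go to a clinic"
q3 = [0.0, 0.1, 0.9]   # an unrelated question

# The paraphrase scores much closer to q1 than the unrelated question does.
print(cosine_similarity(q1, q2) > cosine_similarity(q1, q3))  # True
```

Ranking knowledge-base questions by this score against the user's query vector is what surfaces "similar questions" in the application.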
Paul: What techniques did you use to evaluate and improve the quality of the analysis?
Mauricio: We went through a few steps to do that:
- Convert query text to vectors
- Store the vectors in Elasticsearch which can handle large volumes of unstructured data
- Query the vectors for similarities
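The steps above can be sketched against Elasticsearch's dense_vector support (available since 7.3). The index, field names, and dimensions below are illustrative assumptions, not the project's actual schema; `script_score` with `cosineSimilarity` is the standard way to rank stored vectors against a query vector.

```python
# Illustrative Elasticsearch mapping and query for vector similarity
# search; field names and dimensions are assumptions for this sketch.

mapping = {
    "mappings": {
        "properties": {
            "question": {"type": "text"},
            "question_vector": {"type": "dense_vector", "dims": 768},
        }
    }
}

def similarity_query(query_vector, top_k=5):
    """Build a script_score query ranking stored questions by cosine similarity."""
    return {
        "size": top_k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # +1.0 shifts scores to be non-negative, as Elasticsearch requires
                    "source": "cosineSimilarity(params.query_vector, 'question_vector') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }
```

The query body would be sent through an Elasticsearch client's search API; the top hits are the knowledge-base questions most similar to the user's query.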
The pre-built BERT models returned decent results, but we decided to further improve the models for queries that didn’t have similar questions returned. For this, we used Saga NLU – our innovative NLP middleware – to help enhance NLP results for domain-specific queries (COVID-19 in this case).
Paul: That’s a great point. BERT was trained prior to COVID-19 so its knowledge base may not be comprehensive in this particular domain. And that’s why it’s so helpful to make neural networks models more domain-specific for COVID-19 related terms.
Mauricio: Right. Adding to that, the models were tweaked to better handle the various ways the same question can be asked, and to disregard words that don't provide meaning or value. For example, questions such as "should I go to the doctor," "should I stay home," and "should I go to a clinic" contain different wording but essentially mean the same thing. Or a question might include phrases like "right?" or "is this correct?" which don't really help in our sentence similarity analysis.
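The filler-stripping step Mauricio describes can be sketched as a small normalization pass applied before computing embeddings. The filler list here is a toy assumption; a production list would be tuned on real query logs.

```python
import re

# Illustrative filler tags that add no meaning to the question.
FILLER_PATTERNS = [r"\bright\?*$", r"\bis this correct\?*$"]

def normalize_question(text):
    """Lowercase and strip trailing filler before computing embeddings."""
    text = text.strip().lower()
    for pattern in FILLER_PATTERNS:
        text = re.sub(pattern, "", text).strip(" ,?")
    return text

print(normalize_question("I should stay home, right?"))  # "i should stay home"
```

After normalization, the paraphrased questions produce closer vectors because the meaningless tails no longer perturb the embeddings.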
Paul: So data cleansing helps remove words and phrases that aren't useful or don't add any context to sentences. You mentioned you tried various models to see which produced the best results?
Mauricio: The visuals below are some examples of the scoring we developed to evaluate model performance. We tested and scored multiple variations of BERT. While each model is trained in a different way and has different structures and advantages, the highest-scoring models signal the highest performance at identifying and grouping similar sentences for our use case.
Figure 1: Example of graphs representing sentence similarity grouping performance
Figure 2: Example scores of neural network model performance
Where are neural networks heading?
Paul: Thanks, Mark and Mauricio, for sharing these innovative use cases and techniques. Neural network applications are helping us solve challenges that traditional statistical techniques couldn't. They're better at understanding the user's query and at identifying the best matching sentences or paragraphs within a vast database of documents to return relevant answers. Instead of just displaying a list of documents containing the keyword, we can take the user straight to a specific sentence within a document, making search much more useful and productive.
We’re turning this work into end-to-end sentence search capabilities with all components packaged together. These neural-network-enabled solutions would be beneficial for question/answer applications handling highly curated, complex documents, such as procedures, policies, manuals, and more. With advancements being made to current models at a fast pace, I expect that neural networks will become an essential technology in modern enterprise data-driven use cases.