We’ve talked about knowledge graphs on this blog before, and for good reason. With a knowledge graph representation of a collection of data, we can perform logical reasoning, infer relationships between concepts, and uncover new insights that weren’t possible before. We’ve used knowledge graphs here at Labs to drive innovation across industries, including accelerating 5G networks with the digital twin and thread, and improving and adding explainability to recommendation systems. We’ve also open-sourced Ampligraph, a suite of machine learning models that can uncover new knowledge from existing knowledge graphs.
We’re also exploring how knowledge graphs can contribute to social good. Working with Lambert Hogenhout, Chief of Data Analytics for the United Nations, we organized a “Knowledge Graphs for Social Good” workshop. The effort focused on the UN’s 17 Sustainable Development Goals (SDGs): concrete areas like poverty, hunger, and health where action is needed to improve the lives and well-being of people around the world. (Read more about the SDGs here.)
You might wonder how a knowledge graph can reduce poverty or improve health. The answer is in the insights that knowledge graphs let us draw from existing global data. Better insights let us understand the impact of different interventions; we can examine what policies or events are affecting the overall progress toward achieving specific SDGs. Our workshop explored a number of topics, from an introduction to modelling the UN’s data through to automatic modelling techniques, and some inspiring work in the space from other organizations.
Getting started with the UN data
The UN maintains a massive collection of both structured and unstructured data, starting with the SDG indicator datasets. One key way knowledge graphs can help in using this data to achieve the SDGs is by determining explanations for some of the metrics tracked in the data. Digging through the structured data, we found that the SDG target data is broken into “indicators” that measure progress toward goals (for example, poverty by gender with respect to a region). We wanted to see if it was possible to automatically attach related information to this data to provide additional context, such as a news article or Wikipedia document, and it turns out that we could. We successfully linked journal articles to SDG goal-target-indicator information for countries through time using simple natural language processing with descriptions, synonyms, and values. Now we (or anyone!) can pull up an indicator and quickly find more information that may provide an explanation for the current data. The images below show the model of the knowledge graph and an example output for one country (Nigeria). You can read more about how this works, or learn how to approach the problem from the top down, in our “Working Towards Knowledge Graph Representations” workshop section and specifically in the presentation “Building A Simple Knowledge Graph with UN Data: Quick Start Example, Common Methodologies, and Tooling.”
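To make the linking idea concrete, here is a minimal sketch of matching articles to indicators by shared terms drawn from an indicator's description and synonyms. It is an illustration under our own assumptions, not the pipeline from the workshop: the `Indicator` class, the `link_articles` function, the `hasRelatedArticle` edge label, the stopword list, and the sample data are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "by", "in", "of", "the", "to", "with"}


@dataclass
class Indicator:
    code: str             # hypothetical identifier, not a real SDG code
    description: str
    synonyms: tuple = ()  # extra phrases that signal the same concept


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def link_articles(indicators, articles, min_overlap=2):
    """Return (indicator, relation, article) edges for articles that share
    at least `min_overlap` non-stopword terms with an indicator."""
    links = []
    for ind in indicators:
        terms = tokenize(ind.description + " " + " ".join(ind.synonyms))
        terms -= STOPWORDS
        for title, body in articles.items():
            shared = terms & tokenize(body)
            if len(shared) >= min_overlap:
                links.append((ind.code, "hasRelatedArticle", title))
    return links


# Made-up sample data, for illustration only.
indicators = [
    Indicator(
        code="IND-A",
        description="Proportion of population living below the poverty line",
        synonyms=("poor", "poverty rate"),
    )
]
articles = {
    "article-1": "The poverty rate among the rural population fell last year.",
    "article-2": "New stadium opens downtown.",
}

links = link_articles(indicators, articles)
print(links)
```

Each resulting tuple can be read as an edge in the graph, from an indicator node to an article node. Term overlap is deliberately crude; it stands in for whatever matching on descriptions, synonyms, and values a production system would use.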
Extracting knowledge automatically from unstructured sources
We also wanted to go a step further, mapping higher-level concepts (and relationships between those concepts) to do more scalable modelling both within and across different Sustainable Development Goals. This is easier said than done: modelling a schema is instrumental to understanding a dataset, but manual modelling is difficult to scale and requires a domain expert. We needed a combination of top-down and bottom-up approaches to model the data from our understanding, along with automatic methods to extract various relevant concepts and relationships.
We used a toolkit to extract knowledge in the form of machine-readable triplets. The main effort of this portion of the workshop was to apply our tooling for automated knowledge graph creation, demonstrate scalability, and see how far we could get with modelling the SDG data. The images below show a sample of the input text and the outcome of mapping the free-form text, discovering potential concepts and relationships.
The tool is able to connect concepts and relations. We are working on improving the algorithms and refining the feedback learning mechanisms to produce better mappings and entity categorizations. You can read more about what we achieved in this space in "Knowledge Graph Extraction from Unstructured Text."
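For readers unfamiliar with triplets, the sketch below shows what "machine-readable" means in practice: each fact becomes a (subject, relation, object) tuple that can be loaded into a graph. This is a toy, pattern-based extractor written for illustration and is not the toolkit we used. The relation phrases, function names, and sample sentences are assumptions of ours.

```python
import re
from collections import defaultdict

# Hypothetical relation lexicon; a real extractor would learn or parse these.
RELATION_PHRASES = ("reduces", "increases", "causes", "depends on")


def extract_triples(text):
    """Split text into sentences and emit (subject, relation, object)
    for the first known relation phrase found in each sentence."""
    triples = []
    for sentence in re.split(r"[.!?]\s*", text):
        sentence = sentence.strip()
        if not sentence:
            continue
        lowered = sentence.lower()
        for phrase in RELATION_PHRASES:
            marker = f" {phrase} "
            idx = lowered.find(marker)
            if idx > 0:
                subject = lowered[:idx].strip()
                obj = lowered[idx + len(marker):].strip()
                if subject and obj:
                    triples.append((subject, phrase, obj))
                break  # keep at most one triple per sentence
    return triples


def build_graph(triples):
    """Index triples by subject so each concept lists its outgoing edges."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


# Made-up sample text, for illustration only.
text = "Clean water reduces disease. Drought increases food prices. Nothing to see here."
triples = extract_triples(text)
graph = build_graph(triples)
print(triples)
print(graph)
```

Sentences without a known relation phrase produce no triple, which is one reason hand-written patterns do not scale and automated extraction with feedback is worth pursuing.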
We’re proud to say that our workshop inspired a number of submissions from others who also wanted to put the UN’s data to use in trying to achieve the SDGs. Two of these submissions were presented as part of the workshop. Luis Gonzalez Morales of the UN gave an excellent talk that expanded on our first approach, with an application that automatically extracts key concepts from various text documents, linking them to the most relevant Sustainable Development Goals, targets, indicators and series. You can try out the application here.
Luigi Assom from Nifty.works presented a method for coarse-grained analysis of the global food trade. This work illustrated the impact of food choices on hunger, and raised possible avenues for addressing the Zero Hunger SDG, such as embedding sustainability into food discovery services. You can see his presentation here.
Just the beginning
There is much to be done with the Sustainable Development Goals. Modelling the concepts is key and can be done manually or semi-automatically, following a top-down approach of scoping; defining and modelling; and ingesting data. But automation is the key to handling an ambiguous space with an unknown number of concepts, and it is critical to driving the kind of impact that the world needs on a global scale.
Knowledge graphs provide a means of understanding the context of data, and that is the key to comprehending SDG data. Though we have only scratched the surface in testing our approaches, our workshop presentations show how you can provide context for some of the metrics, and how to automate some of the modelling tasks.
We are keen to see what others have done with the data in next year’s workshop! We challenge everyone to work together, form teams and groups, and see what you can do with the SDG data. Show us how you would model it, your methods for deriving new insights into the outcomes, or how one SDG intervention affects another. In the meantime, see the presentations from this year’s workshop here, watch the recording here, and email Colin Puri and Vivek Khetan to learn more about our methods or join us in our efforts.