AI for massive knowledge graphs

Explore how Neuro-Symbolic integration enables AI agents to process massive knowledge graphs for scalable, actionable analytics.

 

Integrating knowledge graph technologies with AI agents and graph analytics opens new ways to interact with your data and extract actionable information for your use case, and it allows even massive graphs to be processed in a scalable manner. Here, we explore a use case that combines AI and massive knowledge graphs through several neuro-symbolic approaches so that analytics can be performed even on extremely large graphs. Keep reading!

 

 

The challenge: Extracting insights from massive knowledge graphs

 

Knowledge graphs are a useful tool to model information that represents complex relationships in your data. Typically, with knowledge graphs, there are standard symbolic approaches and algorithms that work well to address your needs for useful data analytics. 

 

When knowledge graphs become massive, however, and the relationships and semantics in your data become more intricate, it is often necessary to re-examine the traditional approaches and find strategies tailored to massive graphs. This need often arises from memory limits: the graph may be too big to store on a single computer or even a server, or too big for algorithms that must hold large portions of the data in memory at once.

 

In industry and academia, there are very large graphs that represent publications, authors, domains and the relationships between them. 

 

In this article, we will investigate an example of such a massive research knowledge graph and show how AI approaches can be leveraged to address the unique challenges posed when developing algorithms to utilize a massive knowledge graph. 

 

Massive Knowledge Graphs: what are they?

Massive knowledge graphs are knowledge graphs (KGs) that are, for one reason or another, much too large to manage with strategies that are typically effective for smaller graphs. Often, a knowledge graph is associated with an ontology, which formally describes the intended structure of the data in the knowledge graph in a way that is understandable by both computers and humans. "Too large" can mean different things: the graph may simply be too big to fit on a standard computer, for example, or too complex to process with typical algorithms.

 

Scalable analytics for a massive research knowledge graph: Exploring the SemOpenAlex use case

 

Now let’s look at an existing use case that combines AI with massive knowledge graphs.  

 

Researchers and engineers at metaphacts are contributing to an EU project called Graph-Massivizer, where scalable solutions are being developed to process and extract insights from massive graphs with the Graph-Massivizer Toolkit. Composed of 12 partners from eight EU countries, the Graph-Massivizer project brings together the world-leading roles of European researchers in graph processing and serverless computing and uses leadership-class European infrastructure in the computing continuum. This project develops optimizations and AI/ML tools connected with the use cases described here, and motivates the bigger-picture topics we’ve discussed.

 

One of the use cases for this project is a development and integration scenario in which test cases are developed for SemOpenAlex, a massive knowledge graph of open academic publication data. Designing scalable graph analytics algorithms for SemOpenAlex is challenging due to its sheer size, so the project uses it as a test case to demonstrate the functionality under development. The use case features multiple interesting applications of these techniques for automated workflow execution and agentic AI, which are shown in the following sections.

 

One example from this use case examines a scholarly research collaboration network and answers the following query: “How do I find the shortest path from myself to the most popular researcher in my field?” The use case draws on the co-authorship relations between publications in SemOpenAlex and attempts to find connections between influential researchers in the same field by tracing those relations. The knowledge graph includes additional metadata that can be leveraged by learning algorithms, such as publication titles, authors, venues, and authors’ research fields. The use case defines a “popular researcher” using a metric that considers how extensively a researcher has published and how many co-authors they have.
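
To make the query concrete, here is a minimal sketch in Python on a toy co-authorship network. The author names, the paper list, and the popularity metric (number of papers plus number of distinct co-authors) are assumptions made for illustration; the article does not give the exact metric used in the project, and a real run would load the data from SemOpenAlex instead.

```python
from collections import deque

# Toy data (hypothetical): each paper is a list of its authors.
papers = [
    ["me", "alice"],
    ["alice", "bob", "carol"],
    ["carol", "dana"],
    ["bob", "dana", "erin"],
    ["dana", "frank"],
]

def build_coauthor_graph(papers):
    """Return an undirected adjacency map: author -> set of co-authors."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set()).update(b for b in authors if b != a)
    return graph

def popularity(author, papers, graph):
    """Assumed metric: number of papers plus number of distinct co-authors."""
    n_papers = sum(author in authors for authors in papers)
    return n_papers + len(graph[author])

def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        # Sorting the neighbors makes the chosen path deterministic.
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

graph = build_coauthor_graph(papers)
most_popular = max(sorted(graph), key=lambda a: popularity(a, papers, graph))
path = shortest_path(graph, "me", most_popular)
print(most_popular, path)
```

Breadth-first search is sufficient here because every co-authorship link counts as one hop. On a graph the size of SemOpenAlex, the same logic would first need a subgraph that fits in memory, which is the problem the rest of the use case addresses.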

 

Image: The workflow diagram for the SemOpenAlex use case

 

The workflow for the use case, shown in the diagram above, is modeled in the user interface according to an ontology that is discussed in the next section. Tasks in the workflow include algorithms such as loading a subgraph using a SPARQL query or computing the betweenness centrality values for nodes in the graph. Using the metaphactory interface, a user can select predefined algorithms based on the available hardware and programming languages to execute their workflow. Because the workflow is expressed in RDF, it can be understood by a human as well as by the computer responsible for executing it. The workflow contains pointers to the data and functions it describes, provided by the technical user who sets it up, so a program or agent can execute it automatically with minimal input.
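
As an illustration of one such task, the sketch below computes betweenness centrality for a small, unweighted, undirected graph. The article does not say which algorithm or implementation the toolkit uses; this example uses Brandes' algorithm, a common choice, and assumes the subgraph has already been loaded into memory as an edge list (the edges shown are made up).

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    `graph` maps each node to the set of its neighbors.
    Returns unnormalized scores: the number of shortest paths between
    other node pairs that pass through each node (fractions when ties exist).
    """
    centrality = {v: 0.0 for v in graph}
    for s in graph:
        # Phase 1: BFS from s, counting shortest paths (sigma) and predecessors.
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                centrality[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}

# Hypothetical subgraph: a simple chain a - b - c - d.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

scores = betweenness_centrality(graph)
print(scores)
```

In the chain, the two inner nodes each sit on two shortest paths between other nodes, while the endpoints sit on none.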

 

Executable workflows for graph algorithms

Workflows can be represented explicitly in RDF according to an ontology that also connects to a data source. This allows an agent, whether human or AI, to interact with and execute user-specified workflows dynamically, provided the system has access to the required algorithms. If the workflow also specifies data flow, it can be used to execute an entire pipeline of different functions, sending the output of one function to the next automatically and showing the user the end result. This is advantageous when integrating knowledge graphs with AI because it abstracts away what is already known about how to use the algorithms, so a developer or user can customize the execution of a workflow without having to give detailed, step-by-step instructions each time.
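
The data-flow idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the workflow is a plain ordered list of named functions rather than RDF, and the step names and the hard-coded edge list are hypothetical stand-ins for real tasks such as a SPARQL-backed loader.

```python
def load_subgraph(_):
    """Stand-in for a SPARQL-backed loader; returns a hard-coded edge list."""
    return [("a", "b"), ("b", "c"), ("b", "d")]

def build_adjacency(edges):
    """Turn an edge list into an undirected adjacency map."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def degrees(graph):
    """Count the neighbors of every node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

def top_node(degree_map):
    """Return the node with the highest degree (ties broken alphabetically)."""
    return max(sorted(degree_map), key=degree_map.get)

def run_pipeline(steps, initial_input=None):
    """Run the steps in order, feeding each step's output into the next."""
    value = initial_input
    trace = []
    for name, func in steps:
        value = func(value)
        trace.append(name)
    return value, trace

steps = [
    ("load_subgraph", load_subgraph),
    ("build_adjacency", build_adjacency),
    ("degrees", degrees),
    ("top_node", top_node),
]
result, trace = run_pipeline(steps)
print(result, trace)
```

The caller only declares which steps to run and in what order; the executor handles passing data between them, which is the part that the workflow description abstracts away.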

 

Image: workflow ontology

 

The workflows used in the SemOpenAlex use case are structured according to the ontology shown above. At the top, you can see the workflow class, which can have at most one first task, which itself can have any number of next tasks. This can model a sequential or even tree-like structure based on user needs, where the first task is the root of the tree and all next tasks branch from there. This tree-like structure is intended to correspond roughly to the directed acyclic graph (DAG) that the toolkit requires to define its workflows, which ensures that operations are well defined and cannot run in an infinite loop. When an end user interacts with the system, all they are required to do is specify the workflow of tasks and associate them with the available algorithms, which are preloaded according to the other parts of the ontology.
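
The structure described above can be mirrored with two small Python classes. This is a sketch, not the project's data model: the class and field names (`Workflow`, `Task`, `first_task`, `next_tasks`) are hypothetical names chosen to echo the ontology, and the task names are illustrative. The traversal returns an execution order and rejects cycles, reflecting the DAG requirement.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Task:
    name: str
    next_tasks: List["Task"] = field(default_factory=list)

@dataclass
class Workflow:
    name: str
    first_task: Optional[Task] = None  # at most one first task

def execution_order(workflow):
    """Return task names in an order where every task precedes its next tasks.

    Raises ValueError if the tasks form a cycle, since a cyclic workflow
    could never finish.
    """
    finished = []    # task names in depth-first post-order
    on_path = set()  # ids of tasks on the current traversal path
    done = set()     # ids of tasks that are fully processed

    def visit(task):
        if id(task) in on_path:
            raise ValueError(f"cycle detected at task {task.name!r}")
        if id(task) in done:
            return
        on_path.add(id(task))
        # Visit in reverse so siblings keep their declared order after reversal.
        for nxt in reversed(task.next_tasks):
            visit(nxt)
        on_path.remove(id(task))
        done.add(id(task))
        finished.append(task.name)

    if workflow.first_task is not None:
        visit(workflow.first_task)
    return finished[::-1]

# A sequential workflow similar to the one in the use case (names illustrative).
path_task = Task("shortest_path")
centrality_task = Task("betweenness_centrality", [path_task])
load_task = Task("load_subgraph", [centrality_task])
workflow = Workflow("collaboration_analysis", load_task)

order = execution_order(workflow)
print(order)
```

Reversing the depth-first post-order gives a valid order for trees and for DAGs in which a task is reachable through more than one branch.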

 

A developer who sets up the system specifies the available algorithms using the class BGO, or basic graph operation. This class represents a type of graph algorithm, such as breadth-first search. Each of these abstract BGOs is then connected with at least one implemented algorithm. It is possible to include multiple implementations of the same algorithm in various languages or with different inputs and outputs, so that different execution pathways and scenarios are available to support diverse use cases. The system is then able to dynamically choose the algorithm and runtime environment best suited to executing the task.
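
A minimal sketch of this selection step is shown below. Everything in it is an assumption for illustration: the `Implementation` fields, the `max_edges` capacity hint, and the policy of picking the smallest implementation that is still large enough are hypothetical, and the article does not describe how the toolkit actually makes its choice.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Implementation:
    bgo: str        # the abstract basic graph operation it implements
    language: str   # runtime or language needed to run it
    hardware: str   # hardware it targets
    max_edges: int  # hypothetical capacity hint: largest graph it handles well

# Hypothetical registry set up in advance by a developer.
REGISTRY = [
    Implementation("breadth_first_search", "python", "cpu", 1_000_000),
    Implementation("breadth_first_search", "cpp", "cpu", 100_000_000),
    Implementation("breadth_first_search", "cuda", "gpu", 1_000_000_000),
]

def choose_implementation(registry, bgo, languages, hardware, graph_edges):
    """Pick an implementation of `bgo` that can run in the given environment.

    Assumed policy: among the implementations that match the environment and
    can handle the graph size, prefer the one with the smallest capacity.
    Returns None when nothing fits.
    """
    candidates = [
        impl for impl in registry
        if impl.bgo == bgo
        and impl.language in languages
        and impl.hardware in hardware
        and impl.max_edges >= graph_edges
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda impl: impl.max_edges)

small = choose_implementation(
    REGISTRY, "breadth_first_search", {"python", "cpp"}, {"cpu"}, 500_000)
large = choose_implementation(
    REGISTRY, "breadth_first_search", {"python", "cpp"}, {"cpu"}, 5_000_000)
none_fit = choose_implementation(
    REGISTRY, "breadth_first_search", {"python"}, {"cpu"}, 5_000_000)
print(small, large, none_fit)
```

Keeping the abstract operation separate from its implementations means that a workflow only needs to name the operation, and the environment-specific choice can change without editing the workflow.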

The motivation behind modeling and developing these workflows is that they enable an expert or team of experts to configure the Graph-Massivizer Toolkit in advance with the algorithms and workflows they wish to make available to an end user. The end user can then use these advanced AI techniques via a simple interface in metaphactory that builds and executes workflows correctly, without needing a deep technical understanding of the techniques involved. Next, we will see an example of how the same type of approach can be integrated with a neuro-symbolic system that has its own unique advantages.

 

Analytics and agentic AI with metis

Graph analytics and agentic AI can be combined with neuro-symbolic approaches to achieve unique advantages. One such system, metis, is a knowledge-driven AI platform combining large language models and knowledge graphs to deliver AI agents that provide generative power, semantic precision and contextual, explainable insights. 

 

metis, which sits on top of metaphactory, can explore existing data in a knowledge graph and can also guide users through the modeling process when they design their ontology. Agents can be customized or selected from an agent repository. The AI component aids a user in their interaction with the knowledge graph in a conversational interface by translating natural language into machine-readable actions and instructions. We will see how this tool can be integrated with the use case discussed previously so that a user is able to perform analytics using a simple conversational interface.

 

Because metis allows interaction with a knowledge graph, users can utilize AI tools to improve productivity without sacrificing the essential trust that comes with using a well-designed ontology to manage their data.

 

Algorithms for graph analytics are supported as extensions to metaphactory applications and can also be integrated with metis. This makes the adoption of analytics features seamless with the platform usage in general. An app can act as a tool for metis, allowing it to answer complex questions over the knowledge graph that require advanced machine learning algorithms. 

 

Image: a sample conversation in metis executing the use case with SemOpenAlex data

 

In the image above, you can see a sample conversation where metis is used to execute the same workflow that we saw in the previous section. Using only natural language, a user can request complex workflows such as finding the shortest path between an instance and the instance with the maximum centrality in a graph. To answer this question, the agent not only needs to execute the algorithms but must also understand how to generate workflows when multiple operations are requested in sequence. With this functionality, users don’t need to compose workflows themselves; they can simply ask the agent in natural language.

Conclusion

We have looked at various methods for using AI and knowledge graphs together in complementary ways. The SemOpenAlex use case from the Graph-Massivizer project featured multiple interesting applications of these techniques for automated workflow execution and agentic AI. There are many new, exciting possibilities being developed every day to support AI and large knowledge graphs at metaphacts in this as well as other projects and domains, so look forward to more updates from our team about new applications for AI with your knowledge graphs.

 
