
Context:

I have an XML file (DEXPI) and I want to use it as the data source for a Retrieval-Augmented Generation (RAG) system built with llama-index, so that the correct context can be fetched for any natural language query.

Current Issue:

  • I cannot treat the XML file like a plain text document.
  • llama-index does not provide any splitter for XML data, so the XML cannot be correctly divided into chunks (nodes).
  • Even if we write a custom chunker/splitter, a lot of unwanted jargon would still remain in the chunks, such as XML tags and other XML-related metadata.
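For reference, a custom chunker does not have to keep the tags at all. The sketch below (a minimal assumption, not a llama-index API; the sample element names are made up, not real DEXPI) flattens each element to a line of tag name, attributes, and text, then packs those lines into size-bounded chunks:

```python
import xml.etree.ElementTree as ET

def xml_to_text_chunks(xml_string, max_chars=500):
    """Flatten an XML document into plain-text chunks, dropping the
    angle-bracket syntax and keeping only tag names, attributes, and text."""
    root = ET.fromstring(xml_string)
    chunks = []
    for elem in root.iter():
        parts = [elem.tag.split("}")[-1]]  # strip any namespace prefix
        parts += [f"{k}={v}" for k, v in elem.attrib.items()]
        if elem.text and elem.text.strip():
            parts.append(elem.text.strip())
        line = " ".join(parts)
        # merge small lines until a chunk reaches roughly max_chars
        if chunks and len(chunks[-1]) + len(line) + 1 <= max_chars:
            chunks[-1] += "\n" + line
        else:
            chunks.append(line)
    return chunks

sample = '<Plant><Equipment ID="P-101" ComponentClass="Pump">Main feed pump</Equipment></Plant>'
print(xml_to_text_chunks(sample))
```

The resulting strings can be wrapped in llama-index `TextNode` objects directly, which sidesteps the missing XML splitter.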

What did I try?

To solve this issue, I have two approaches:

Approach 1:

Convert the XML into SQL tables (or CSVs). Convert these tables into natural-language English text. Then pass this text to llama-index for further processing. While preparing the knowledge graph index, llama-index will automatically figure out the vertices (entities) and the edges (relationships) between them.
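The "tables to English text" step of this approach could look like the sketch below. The column names (`TagName`, `Type`, `ConnectedTo`) are hypothetical; substitute whatever your DEXPI extraction actually yields:

```python
def rows_to_sentences(rows):
    """Render tabular rows (extracted from the XML) as English sentences
    suitable for feeding into a knowledge-graph index."""
    sentences = []
    for row in rows:
        s = f"{row['TagName']} is a {row['Type']}."
        if row.get("ConnectedTo"):
            s += f" {row['TagName']} is connected to {row['ConnectedTo']}."
        sentences.append(s)
    return sentences

rows = [
    {"TagName": "P-101", "Type": "centrifugal pump", "ConnectedTo": "V-201"},
    {"TagName": "V-201", "Type": "storage vessel", "ConnectedTo": ""},
]
print(rows_to_sentences(rows))
# -> ['P-101 is a centrifugal pump. P-101 is connected to V-201.',
#     'V-201 is a storage vessel.']
```

Keeping the sentences short and uniform like this tends to help the index extract triples consistently, since each sentence encodes exactly one or two relations.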

Approach 2:

Convert the XML into SQL tables (or CSVs). Convert these SQL tables into graph DB entities and relationships manually. Then query the graph DB using a graph query generated by an LLM.
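The manual tables-to-graph step of this approach amounts to emitting (subject, relation, object) triples per row. The sketch below uses a plain in-memory list as a stand-in for a real graph DB (e.g. Neo4j queried with LLM-generated Cypher); the column and relation names are again hypothetical:

```python
def rows_to_triples(rows):
    """Turn tabular rows into (subject, relation, object) triples."""
    triples = []
    for row in rows:
        triples.append((row["TagName"], "IS_A", row["Type"]))
        if row.get("ConnectedTo"):
            triples.append((row["TagName"], "CONNECTED_TO", row["ConnectedTo"]))
    return triples

def query(triples, subject, relation):
    """Toy lookup standing in for an LLM-generated graph query."""
    return [o for s, r, o in triples if s == subject and r == relation]

rows = [{"TagName": "P-101", "Type": "centrifugal pump", "ConnectedTo": "V-201"}]
triples = rows_to_triples(rows)
print(query(triples, "P-101", "CONNECTED_TO"))  # -> ['V-201']
```

In a real deployment the triples would be loaded into the graph DB once, and only the query side would involve the LLM at runtime.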

My Questions:

  1. I need suggestions on which approach to choose and how effective each one is.
  2. Are there any better approaches for dealing with XML data when using llama-index?
  • I would create a PowerShell script to parse the XML into another format like CSV. With PowerShell you can also connect to a SQL Server and store the data in the database.
    – jdweng
    Commented Dec 6, 2023 at 13:52
