---
title: "Ontology-Enhanced RAG System"
emoji: "🧠"
colorFrom: "indigo"
colorTo: "blue"
sdk: "streamlit"
sdk_version: "1.44.0"
app_file: "app.py"
pinned: false
---
# Enhanced Ontology-RAG System
## Project Overview
This repository contains a Retrieval-Augmented Generation (RAG) system that integrates structured ontologies with language models. It demonstrates how formal ontological knowledge representation can complement vector-based retrieval to produce answers that are more accurate, contextually richer, and more logically consistent.
The architecture combines:
- JSON-based ontology representation with classes, relationships, rules, and instances
- Knowledge graph visualization for exploring entity relationships
- Semantic path finding for multi-hop reasoning between concepts
- Comparative analysis between traditional vector-based RAG and ontology-enhanced RAG
The application uses **Streamlit** for the frontend interface, **FAISS** for vector similarity search over embeddings, **NetworkX** for graph representation, and **OpenAI's language models** for generating responses.
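As a rough illustration of the idea, the sketch below appends ontology facts about the entities in a query to the text chunks a vector search would return. It is not the code in `src/`: the triple layout, the function names, the `engineering` identifier, and the placeholder chunk are all assumptions, and the vector search step is stubbed out as a plain list of strings so the example has no dependencies.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical triples; the names echo examples from this README,
# but the data layout itself is an assumption.
TRIPLES: List[Triple] = [
    ("product1", "ownedBy", "engineering"),
    ("engineering", "managedBy", "manager1"),
    ("customer1", "purchases", "product1"),
]


def ontology_facts(entity: str, triples: List[Triple]) -> List[str]:
    """Return one short statement per triple that mentions the entity."""
    return [f"{s} {r} {o}" for s, r, o in triples if entity in (s, o)]


def build_context(chunks: List[str], entities: List[str],
                  triples: List[Triple]) -> str:
    """Join retrieved text chunks with de-duplicated ontology facts."""
    facts: List[str] = []
    for entity in entities:
        for fact in ontology_facts(entity, triples):
            if fact not in facts:
                facts.append(fact)
    return "\n".join(chunks + facts)


# The chunk stands in for a vector-search hit; it is placeholder text.
context = build_context(
    chunks=["Placeholder chunk returned by vector search."],
    entities=["product1"],
    triples=TRIPLES,
)
```

A traditional RAG prompt would contain only `chunks`; the ontology-enhanced variant adds the facts that directly involve the detected entities.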
## Key Features
1. **RAG Comparison Demo**
- Side-by-side comparison of traditional and ontology-enhanced RAG
- Analysis of differences in answers and retrieved context
2. **Knowledge Graph Visualization**
- Interactive network graph for exploring the ontology structure
- Multiple layout algorithms (force-directed, hierarchical, radial, circular)
- Entity relationship exploration with customizable focus
3. **Ontology Structure Analysis**
- Visualization of class hierarchies and statistics
- Relationship usage and domain-range distribution analysis
- Graph statistics including node counts, edge counts, and centrality metrics
4. **Entity Exploration**
- Detailed entity information cards showing properties and relationships
- Relationship graphs centered on specific entities
- Neighborhood exploration for entities
5. **Semantic Path Visualization**
- Path visualization between entities with step-by-step explanation
- Visual representation of paths through the knowledge graph
- Connection to relevant business rules
6. **Reasoning Trace Visualization**
- Query analysis with entity and relationship detection
- Sankey diagrams showing information flow in the RAG process
- Explanation of reasoning steps
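The semantic path feature (item 5) can be approximated with a breadth-first search over a labeled graph. The sketch below is a dependency-free illustration under assumed names (`GRAPH`, `find_path`, `explain`, and the `engineering` identifier are hypothetical); the project itself represents the graph with NetworkX, which provides `networkx.shortest_path` for the same purpose.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]      # (relation, target)
Hop = Tuple[str, str, str]  # (source, relation, target)

# Hypothetical adjacency list built from relationships named in this README.
GRAPH: Dict[str, List[Edge]] = {
    "customer1": [("purchases", "product1")],
    "product1": [("ownedBy", "engineering")],
    "engineering": [("managedBy", "manager1")],
    "manager1": [],
}


def find_path(graph: Dict[str, List[Edge]], start: str,
              goal: str) -> Optional[List[Hop]]:
    """Breadth-first search; returns the shortest list of hops, or None."""
    if start == goal:
        return []
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        for relation, target in graph.get(node, []):
            if target in visited:
                continue
            hops = path + [(node, relation, target)]
            if target == goal:
                return hops
            visited.add(target)
            queue.append((target, hops))
    return None


def explain(path: List[Hop]) -> List[str]:
    """Render each hop as a numbered, human-readable step."""
    return [f"Step {i}: {s} --{r}--> {t}"
            for i, (s, r, t) in enumerate(path, start=1)]


path = find_path(GRAPH, "customer1", "manager1")
steps = explain(path)
```

Because the edges here are directed, searching from `manager1` back to `customer1` returns `None`; following inverse relationships would require adding the reverse edges.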
## Ontology Structure Example
The `data/enterprise_ontology.json` file contains a rich enterprise ontology that models organizational knowledge. Here's a breakdown of its key components:
### Classes (Entity Types)
The ontology defines a hierarchical class structure with inheritance relationships. For example:
- **Entity** (base class)
- **FinancialEntity** → Budget, Revenue, Expense
- **Asset** → PhysicalAsset, DigitalAsset, IntellectualProperty
- **Person** → InternalPerson → Employee → Manager
- **Process** → BusinessProcess, DevelopmentProcess, SupportProcess
- **Market** → GeographicMarket, DemographicMarket, BusinessMarket
Each class has a description and a set of defined properties. For instance, the `Employee` class includes properties like role, hire date, and performance rating.
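To make the inheritance chain concrete, here is a minimal sketch that stores each class's parent and walks upward to answer subclass questions. The `PARENT` mapping and both helper functions are hypothetical, and the sketch assumes the top-level classes listed above derive directly from `Entity`; the actual JSON layout may differ.

```python
from typing import Dict, List, Optional

# Hypothetical child -> parent mapping for part of the hierarchy above.
PARENT: Dict[str, Optional[str]] = {
    "Entity": None,
    "Person": "Entity",
    "InternalPerson": "Person",
    "Employee": "InternalPerson",
    "Manager": "Employee",
    "FinancialEntity": "Entity",
    "Budget": "FinancialEntity",
}


def ancestors(cls: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the superclasses of cls, nearest first."""
    chain: List[str] = []
    current = parent.get(cls)
    while current is not None:
        chain.append(current)
        current = parent.get(current)
    return chain


def is_subclass(cls: str, other: str,
                parent: Dict[str, Optional[str]]) -> bool:
    """True if cls is other or inherits from it."""
    return cls == other or other in ancestors(cls, parent)


manager_chain = ancestors("Manager", PARENT)
```

A lookup like this lets a query about "employees" also match instances typed as `Manager`.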
### Relationships
The ontology defines explicit relationships between entity types, including:
- `ownedBy`: Connects Product to Department
- `managedBy`: Connects Department to Manager
- `worksOn`: Connects Employee to Product
- `purchases`: Connects Customer to Product
- `provides`: Connects Customer to Feedback
- `optimizedBy`: Relates Product to Feedback
Each relationship has metadata such as domain, range, cardinality, and inverse relationship name.
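One way to picture this metadata is as a small record per relationship. The sketch below is illustrative only: the `Relationship` dataclass, its field names, the `many-to-one` cardinality, and the inverse name `owns` are assumptions rather than values taken from `enterprise_ontology.json`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    name: str
    domain_class: str  # class of the subject
    range_class: str   # class of the object
    cardinality: str   # e.g. "many-to-one" (assumed notation)
    inverse: str       # name of the inverse relationship


# Hypothetical values for the `ownedBy` relationship.
OWNED_BY = Relationship("ownedBy", "Product", "Department",
                        "many-to-one", "owns")


def accepts(rel: Relationship, subject_class: str, object_class: str) -> bool:
    """Check that an edge respects the relationship's domain and range."""
    return (rel.domain_class == subject_class
            and rel.range_class == object_class)


def inverse_of(rel: Relationship) -> Relationship:
    """Build the inverse: swap domain and range, flip the cardinality."""
    flipped = "-to-".join(reversed(rel.cardinality.split("-to-")))
    return Relationship(rel.inverse, rel.range_class, rel.domain_class,
                        flipped, rel.name)


owns = inverse_of(OWNED_BY)
```

Deriving the inverse from the forward definition keeps the two directions consistent, which matters when paths are traversed in both directions.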
### Business Rules
The ontology contains formal business rules that constrain the knowledge model:
- "Every Product must be owned by exactly one Department"
- "Every Department must be managed by exactly one Manager"
- "Critical support tickets must be assigned to Senior employees or managers"
- "Product Lifecycle stages must follow a predefined sequence"
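A cardinality rule such as "Every Product must be owned by exactly one Department" can be checked mechanically against the instance data. The following sketch shows one possible check; the function name, the data layout, and the extra instance `product2` are hypothetical and exist only to produce a violation.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def check_exactly_one(instances: Dict[str, str], triples: List[Triple],
                      subject_class: str, relation: str,
                      object_class: str) -> List[str]:
    """Report subjects that lack exactly one related object of the class."""
    violations: List[str] = []
    for inst_id, cls in instances.items():
        if cls != subject_class:
            continue
        targets = [o for s, r, o in triples
                   if s == inst_id and r == relation
                   and instances.get(o) == object_class]
        if len(targets) != 1:
            violations.append(
                f"{inst_id}: expected exactly one {relation} "
                f"{object_class}, found {len(targets)}"
            )
    return violations


# Instance id -> class name; product2 is made up and has no owner.
INSTANCES = {
    "product1": "Product",
    "product2": "Product",
    "engineering": "Department",
}
TRIPLES: List[Triple] = [("product1", "ownedBy", "engineering")]

violations = check_exactly_one(INSTANCES, TRIPLES,
                               "Product", "ownedBy", "Department")
```

Violations found this way could be surfaced alongside an answer so that the response does not contradict a stated rule.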
### Instances
The ontology includes concrete instances of the defined classes, such as:
- `product1`: An "Enterprise Analytics Suite" owned by the Engineering department
- `manager1`: A director named "Jane Smith" who manages the Engineering department
- `customer1`: "Acme Corp" who has purchased product1 and provided feedback
Each instance has properties and relationships to other instances, forming a connected knowledge graph.
This structured knowledge representation allows the system to perform semantic reasoning beyond what would be possible with simple text-based approaches, enabling it to answer complex queries that require understanding of hierarchical relationships, business rules, and multi-step connections between entities.
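As an illustration of how instances form a connected graph, the sketch below parses a small JSON fragment into triples and looks up an entity's neighborhood. The JSON keys (`id`, `type`, `properties`, `relationships`, `target`) and the `engineering` identifier are assumptions made for this example; consult `data/enterprise_ontology.json` for the real schema.

```python
import json
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical instance layout; the actual file may be structured differently.
SAMPLE = """
[
  {"id": "product1", "type": "Product",
   "properties": {"name": "Enterprise Analytics Suite"},
   "relationships": [{"type": "ownedBy", "target": "engineering"}]},
  {"id": "engineering", "type": "Department",
   "properties": {"name": "Engineering"},
   "relationships": [{"type": "managedBy", "target": "manager1"}]},
  {"id": "manager1", "type": "Manager",
   "properties": {"name": "Jane Smith"},
   "relationships": []},
  {"id": "customer1", "type": "Customer",
   "properties": {"name": "Acme Corp"},
   "relationships": [{"type": "purchases", "target": "product1"}]}
]
"""


def to_triples(instances: List[dict]) -> List[Triple]:
    """Flatten each instance's relationships into triples."""
    return [(inst["id"], rel["type"], rel["target"])
            for inst in instances
            for rel in inst.get("relationships", [])]


def neighbors(entity: str,
              triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Collect outgoing and incoming edges as (relation, other entity)."""
    return {
        "outgoing": [(r, o) for s, r, o in triples if s == entity],
        "incoming": [(r, s) for s, r, o in triples if o == entity],
    }


triples = to_triples(json.loads(SAMPLE))
product_neighbors = neighbors("product1", triples)
```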
## Getting Started
### Prerequisites
- Python 3.8+
- OpenAI API key
### Installation
1. Clone this repository
2. Install the required dependencies:
```
pip install -r requirements.txt
```
3. Set your OpenAI API key as an environment variable or in the Streamlit secrets.
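The key lookup might be handled along these lines. This is a sketch, not the code in `app.py`: the variable name `OPENAI_API_KEY` and the helper `resolve_api_key` are assumptions, and the environment is checked before the secrets mapping.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(secrets: Optional[Mapping[str, str]] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key from the environment, else from a secrets mapping."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY") or (secrets or {}).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set in the environment "
            "or in Streamlit secrets"
        )
    return key
```

Inside a Streamlit app, `st.secrets` could be passed as the `secrets` argument; note that Streamlit may raise an error when `st.secrets` is accessed without a secrets file, so that call may need its own guard.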
### Running the Application
To run the application locally:
```
streamlit run app.py
```
For deployment instructions, please refer to the [DEPLOYMENT_GUIDE.md](DEPLOYMENT_GUIDE.md).
## Project Structure
```
ontology-rag/
├── .streamlit/
│   └── config.toml                  # Streamlit configuration
├── data/
│   ├── enterprise_ontology.json     # Enterprise ontology data
│   └── enterprise_ontology.txt      # Simplified text representation of ontology
├── src/
│   ├── __init__.py
│   ├── knowledge_graph.py           # Knowledge graph processing
│   ├── ontology_manager.py          # Ontology management
│   ├── semantic_retriever.py        # Semantic retrieval
│   └── visualization.py             # Visualization functions
├── static/
│   └── css/
│       └── styles.css               # Custom styles
├── app.py                           # Main application
├── requirements.txt                 # Dependencies list
├── DEPLOYMENT_GUIDE.md              # Deployment instructions
└── README.md                        # This file
```
## Use Cases
### Enterprise Knowledge Management
The ontology-enhanced RAG system can help organizations effectively organize and access their knowledge assets, connecting information across different departments and systems to provide more comprehensive business insights.
### Product Development Decision Support
By understanding the relationships between customer feedback, product features, and market data, the system can provide more valuable support for product development decisions.
### Complex Compliance Queries
In compliance scenarios where multiple rules and relationships need to be considered, the ontology-enhanced RAG can provide rule-based reasoning to ensure recommendations comply with all applicable policies and regulations.
### Diagnostics and Troubleshooting
In technical support and troubleshooting scenarios, the system can connect symptoms, causes, and solutions through multi-hop reasoning to provide more accurate diagnoses.
## Acknowledgments
This project demonstrates the integration of ontological knowledge with RAG systems for enhanced query answering capabilities. It builds upon research in knowledge graphs, semantic web technologies, and large language models.
## License
This project is licensed under the MIT License - see the license file for details.
## Reference
Sharma, K., Kumar, P. and Li, Y., 2024. OG-RAG: Ontology-Grounded Retrieval-Augmented Generation For Large Language Models. arXiv preprint arXiv:2412.15235. Available at: https://arxiv.org/abs/2412.15235 [Accessed 16 Apr. 2025].