Update README.md
README.md
CHANGED
@@ -121,53 +121,53 @@ The system is modular, consisting of several Python components:
 ## 🚧 Limitations

 - **Language**
-
+  Optimized for English. Performance may degrade significantly on other languages.

 - **Domain Specificity**
-
+  Achieves best results in AI/ML domains. Adaptation (e.g., domain-specific rules or keywords) is required for other fields.

 - **PDF Quality**
-
+  Heavily reliant on clean text extraction. Scanned PDFs, complex layouts, or poor OCR significantly reduce accuracy.

 - **Scalability**
-
+  Processing very large corpora (e.g., >10,000 papers) may require significant computational resources or distributed infrastructure.

 - **Relationship Nuance**
-
+  Relationships are extracted based on co-occurrence and semantic similarity. Logical or causal connections may not be captured.

 - **Temporal Accuracy**
-
+  Depends on accurate publication date extraction from metadata or filenames. Errors may affect timeline analysis.

 - **Visualization Clutter**
-
+  Interactive graph visualizations become cluttered and less interpretable when node count exceeds ~1000.

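The Relationship Nuance limitation can be made concrete with a minimal sketch. It assumes relations are scored by document-level co-occurrence counts plus cosine similarity between concept embeddings. The function names, thresholds, and toy vectors are hypothetical and not taken from the project's code.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how often each unordered concept pair shares a document."""
    counts = Counter()
    for concepts in docs:
        # sorted(set(...)) makes each pair unique and order-independent
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def extract_relations(docs, embeddings, min_count=2, min_sim=0.5):
    """Keep pairs that co-occur often enough and have similar embeddings."""
    relations = []
    for (a, b), n in cooccurrence_counts(docs).items():
        sim = cosine(embeddings[a], embeddings[b])
        if n >= min_count and sim >= min_sim:
            relations.append((a, b, n, sim))
    return relations


# Toy input: each document is the list of concepts found in it (assumed shape).
docs = [
    ["transformer", "attention"],
    ["transformer", "attention", "cnn"],
    ["cnn", "dropout"],
]
embeddings = {
    "transformer": [0.9, 0.1],
    "attention": [1.0, 0.0],
    "cnn": [0.2, 0.8],
    "dropout": [0.1, 0.9],
}

# Only the (attention, transformer) pair co-occurs in two documents.
for a, b, n, sim in extract_relations(docs, embeddings):
    print(f"{a} -- {b}: count={n}, similarity={sim:.2f}")
```

The resulting edge is undirected and untyped: it records that the two concepts appear together and have similar vectors, but not that one uses, causes, or extends the other.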
 ---

 ## 🌱 Future Work

-- **Multi-language Support**
-
+- **Multi-language Support**
+  Integration of multilingual NLP models to support non-English documents.

 - **Citation Integration**
-
+  Incorporating citation links and citation graph data into network analysis.

 - **ML-based Extraction**
-
+  Training supervised or semi-supervised models to improve concept and relation extraction quality.

 - **Advanced Visualizations**
-
+  Implementation of timeline views, dashboards, and alternative graph layouts (e.g., hierarchical, clustered).

 - **Improved Temporal Modeling**
-
+  Use of advanced time-series techniques to detect emerging trends and historical shifts.

 - **Web Interface**
-
+  A user-friendly UI for uploading documents, viewing visualizations, and downloading results.

 - **Knowledge Graph Export**
-
+  Export capabilities for standard knowledge graph formats like RDF, OWL, or JSON-LD.

 - **Concept Disambiguation**
-
+  Methods to differentiate between identically named but contextually distinct concepts.

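As an illustration of the Knowledge Graph Export item, here is a minimal sketch of a JSON-LD serializer built only on the standard library. The `BASE` namespace, the `to_jsonld` helper, and the input shape (a list of concept names plus undirected edges) are assumptions for the example, not the project's actual data model. The vocabulary terms reuse `rdfs:label` and `skos:related`.

```python
import json

# Hypothetical namespace for concept IRIs; a real deployment would pick its own.
BASE = "http://example.org/concept/"

# Maps the short keys used below onto existing RDF vocabulary terms.
CONTEXT = {
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
    "related": {
        "@id": "http://www.w3.org/2004/02/skos/core#related",
        "@type": "@id",  # values of "related" are IRIs, not string literals
    },
}


def concept_iri(name):
    """Build an IRI from a concept name (lowercased, spaces to underscores)."""
    return BASE + name.strip().lower().replace(" ", "_")


def to_jsonld(concepts, edges):
    """Serialize concepts and undirected edges as a JSON-LD document (dict).

    Every edge endpoint is assumed to appear in `concepts`.
    """
    related = {name: [] for name in concepts}
    for a, b in edges:
        related[a].append(concept_iri(b))
        related[b].append(concept_iri(a))

    graph = []
    for name in concepts:
        node = {"@id": concept_iri(name), "label": name}
        if related[name]:
            node["related"] = sorted(related[name])
        graph.append(node)
    return {"@context": CONTEXT, "@graph": graph}


doc = to_jsonld(
    ["Transformer", "Attention", "Dropout"],
    [("Transformer", "Attention")],
)
print(json.dumps(doc, indent=2))
```

Keeping the output a plain dictionary means any JSON-LD processor can later convert it to other RDF serializations, so the exporter itself needs no RDF dependency.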
 ---
