NextGenC committed on
Commit 6875d38 · verified · 1 Parent(s): 36332c0

Update README.md

Files changed (1)
  1. README.md +16 -16
README.md CHANGED
@@ -121,53 +121,53 @@ The system is modular, consisting of several Python components:
  ## 🚧 Limitations

  - **Language**
- 1- Optimized for English. Performance may degrade significantly on other languages.
+ Optimized for English. Performance may degrade significantly on other languages.

  - **Domain Specificity**
- 1. Achieves best results in AI/ML domains. Adaptation (e.g., domain-specific rules or keywords) is required for other fields.
+ Achieves best results in AI/ML domains. Adaptation (e.g., domain-specific rules or keywords) is required for other fields.

  - **PDF Quality**
- 1. Heavily reliant on clean text extraction. Scanned PDFs, complex layouts, or poor OCR significantly reduce accuracy.
+ Heavily reliant on clean text extraction. Scanned PDFs, complex layouts, or poor OCR significantly reduce accuracy.

  - **Scalability**
- 1. Processing very large corpora (e.g., >10,000 papers) may require significant computational resources or distributed infrastructure.
+ Processing very large corpora (e.g., >10,000 papers) may require significant computational resources or distributed infrastructure.

  - **Relationship Nuance**
- 1. Relationships are extracted based on co-occurrence and semantic similarity. Logical or causal connections may not be captured.
+ Relationships are extracted based on co-occurrence and semantic similarity. Logical or causal connections may not be captured.

  - **Temporal Accuracy**
- 1. Depends on accurate publication date extraction from metadata or filenames. Errors may affect timeline analysis.
+ Depends on accurate publication date extraction from metadata or filenames. Errors may affect timeline analysis.

  - **Visualization Clutter**
- 1. Interactive graph visualizations become cluttered and less interpretable when node count exceeds ~1000.
+ Interactive graph visualizations become cluttered and less interpretable when node count exceeds ~1000.

  ---

  ## 🌱 Future Work

- - **Multi-language Support**
- 1. Integration of multilingual NLP models to support non-English documents.
+ - **Multi-language Support**
+ Integration of multilingual NLP models to support non-English documents.

  - **Citation Integration**
- 1. Incorporating citation links and citation graph data into network analysis.
+ Incorporating citation links and citation graph data into network analysis.

  - **ML-based Extraction**
- 1. Training supervised or semi-supervised models to improve concept and relation extraction quality.
+ Training supervised or semi-supervised models to improve concept and relation extraction quality.

  - **Advanced Visualizations**
- 1. Implementation of timeline views, dashboards, and alternative graph layouts (e.g., hierarchical, clustered).
+ Implementation of timeline views, dashboards, and alternative graph layouts (e.g., hierarchical, clustered).

  - **Improved Temporal Modeling**
- 1. Use of advanced time-series techniques to detect emerging trends and historical shifts.
+ Use of advanced time-series techniques to detect emerging trends and historical shifts.

  - **Web Interface**
- 1. A user-friendly UI for uploading documents, viewing visualizations, and downloading results.
+ A user-friendly UI for uploading documents, viewing visualizations, and downloading results.

  - **Knowledge Graph Export**
- 1. Export capabilities for standard knowledge graph formats like RDF, OWL, or JSON-LD.
+ Export capabilities for standard knowledge graph formats like RDF, OWL, or JSON-LD.

  - **Concept Disambiguation**
- 1. Methods to differentiate between identically named but contextually distinct concepts.
+ Methods to differentiate between identically named but contextually distinct concepts.

  ---

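
Reviewer note on the "Relationship Nuance" limitation in this README: a minimal sketch of why co-occurrence-based extraction can miss logical or causal connections. This is not the project's code. The function name `cooccurrence_edges`, the sentence-level window, and the sample sentences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sentences, concepts):
    """Count how often each pair of concepts appears in the same sentence.

    Hypothetical helper: uses plain substring matching and a
    sentence-sized window, which is only one of many possible choices.
    """
    counts = Counter()
    for sentence in sentences:
        text = sentence.lower()
        # Concepts found in this sentence, sorted so each pair has one key.
        present = sorted(c for c in concepts if c.lower() in text)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


# Two made-up sentences with opposite meanings about the same concepts.
sentences = [
    "Dropout reduces overfitting in deep networks.",
    "Overfitting is unrelated to dropout in this ablation.",
]
edges = cooccurrence_edges(sentences, ["dropout", "overfitting"])
print(edges)
```

Both sentences contribute to the same undirected edge, so the count alone cannot distinguish "reduces" from "is unrelated to". That is the gap the limitation describes.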