Chunhua Liao committed on
Commit 838e605 · 1 Parent(s): 15bbd23

Update TODO: Add MCP as enabling technology for features

Files changed (1)
  1. TODO.md +27 -23
TODO.md CHANGED
@@ -6,50 +6,54 @@ This list outlines the top 10 priority features identified to make this system a
     * Implement database storage (e.g., SQLite, PostgreSQL) for research goals, context memory (hypotheses, results), run history, and user feedback.
     * *Why:* Essential for any non-trivial use; allows resuming runs, comparing results across sessions, and managing generated data effectively.
 
-2. **Enhanced Novelty/Feasibility Validation:**
-    * Integrate external data sources for validation. Examples:
-        * PubMed API search for novelty checks against existing literature.
+2. **Leverage Model Context Protocol (MCP):**
+    * Implement MCP servers to provide tools for external interactions (web search, database lookups, API calls).
+    * Refactor agents to use `use_mcp_tool` for accessing external data/functionality needed for validation, RAG, market analysis, etc.
+    * *Why:* Provides a standardized, robust, and extensible way to connect the AI to real-time external information and capabilities, enabling many other features on this list.
+
+3. **Enhanced Novelty/Feasibility Validation (via MCP):**
+    * Build MCP tools that wrap external APIs/services for validation:
+        * PubMed API search tool for novelty checks.
         * Patent database search (e.g., Google Patents API) for IP checks.
         * Use specialized LLMs or structured data for deeper technical feasibility analysis.
-    * *Why:* Grounds LLM assessments in real-world data, crucial for scientific rigor and entrepreneurial due diligence.
+    * *Why:* Grounds LLM assessments in real-world data, crucial for scientific rigor and entrepreneurial due diligence. (Enabled by MCP.)
 
-3. **Market/Impact Assessment Agent:**
-    * Add a dedicated agent or enhance the Reflection agent to evaluate hypotheses based on potential market size, societal impact, commercial viability, or alignment with business goals.
-    * *Why:* Directly addresses entrepreneurial needs, adding a crucial dimension beyond purely technical assessment.
+4. **Market/Impact Assessment Agent (via MCP):**
+    * Add a dedicated agent, or enhance the Reflection agent, to use MCP tools (e.g., web search, financial data APIs) to assess market size, impact, and viability.
+    * *Why:* Directly addresses entrepreneurial needs, adding a crucial dimension beyond purely technical assessment. (Enabled by MCP.)
 
-4. **User Feedback Integration & Weighted Ranking:**
+5. **User Feedback Integration & Weighted Ranking:**
     * Allow users to rate, tag, or comment on hypotheses within the UI during/after cycles.
     * Modify the ranking system (e.g., Elo or a new multi-objective approach) to incorporate user feedback as a weighted factor alongside novelty, feasibility, etc.
     * *Why:* Makes the user an active participant, tailoring results to their domain expertise and specific priorities.
 
-5. **Experimental Design / Next Step Suggestion Agent:**
-    * Add an agent that analyzes top-ranked hypotheses and suggests concrete next steps.
-    * Examples: Propose specific experiments, identify necessary resources/expertise, suggest validation methods.
-    * *Why:* Increases the actionability of the generated hypotheses, bridging the gap between idea and practical execution for scientists.
+6. **Experimental Design / Next Step Suggestion Agent (via MCP):**
+    * Add an agent that uses MCP tools (e.g., querying protocol databases or chemical suppliers) to suggest concrete next steps, experiments, and resources.
+    * *Why:* Increases the actionability of the generated hypotheses, bridging the gap between idea and practical execution for scientists. (Potentially enabled by MCP.)
 
-6. **Literature Corpus Integration (RAG):**
+7. **Literature Corpus Integration (RAG via MCP):**
     * Allow users to upload relevant documents (PDFs) or specify research areas/keywords.
-    * Implement Retrieval-Augmented Generation (RAG) to provide this specific corpus as context to the LLM during generation and reflection steps.
-    * *Why:* Massively improves the quality, relevance, and grounding of hypotheses within a specific domain.
+    * Implement RAG, potentially using an MCP server/tool to interface with a vector database or document retrieval system containing the user-provided corpus.
+    * *Why:* Massively improves the quality, relevance, and grounding of hypotheses within a specific domain. (Enabled by MCP.)
 
-7. **Advanced Visualization & Interaction:**
+8. **Advanced Visualization & Interaction:**
     * Enhance the similarity graph (e.g., node clustering, filtering by score/tags, sizing nodes by Elo).
     * Add plots showing score/metric trends over iterations.
     * Allow direct interaction with visualizations (e.g., selecting nodes to prune/evolve).
     * *Why:* Improves usability, facilitates pattern recognition, and allows for more intuitive exploration of the hypothesis space.
 
-8. **Structured Input & Constraints:**
+9. **Structured Input & Constraints:**
     * Develop a more structured way for users to define the research goal, including key variables, target outcomes, specific technical/budgetary constraints, target user/market segments.
     * Use this structured input to generate more targeted prompts for agents.
     * *Why:* Provides finer-grained control over the process, leading to more relevant and constrained results.
 
-9. **Advanced Evolution Strategies:**
+10. **Advanced Evolution Strategies:**
     * Implement more sophisticated methods for evolving hypotheses beyond simple combination.
     * Examples: LLM-driven mutation based on critiques, varied crossover techniques, targeted refinement prompts.
     * *Why:* Improves the core refinement capability of the system, potentially leading to more creative and robust hypotheses.
 
-10. **Run Comparison & Trend Analysis:**
-    * Store results from multiple runs (potentially with different settings).
-    * Add UI features to compare results across runs (e.g., which settings produced higher-ranked hypotheses?).
-    * Visualize trends over time (e.g., average Elo score per cycle).
-    * *Why:* Enables meta-analysis of the system's performance and helps users understand how different parameters influence the outcomes.
+11. **Run Comparison & Trend Analysis:**
+    * Requires persistent storage (#1). Store results from multiple runs.
+    * Add UI features to compare results across runs (e.g., which settings produced higher-ranked hypotheses?).
+    * Visualize trends over time (e.g., average Elo score per cycle).
+    * *Why:* Enables meta-analysis of the system's performance and shows how different settings influence outcomes.
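The weighted ranking in the "User Feedback Integration & Weighted Ranking" item could be sketched as below. This is a minimal illustration, not the repo's implementation: the `user_weight_a` parameter and its clamping rule are assumptions about how positive user feedback for hypothesis A might bias a standard pairwise Elo update.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update_elo(r_a: float, r_b: float, outcome: float, k: float = 32.0,
               user_weight_a: float = 0.0) -> tuple[float, float]:
    """Update two hypothesis ratings after one pairwise comparison.

    outcome: 1.0 if A won, 0.0 if B won, 0.5 for a tie.
    user_weight_a (hypothetical): shifts the effective outcome toward A
    in proportion to positive user feedback, clamped to [0, 1].
    """
    effective = min(1.0, max(0.0, outcome + user_weight_a))
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (effective - e_a)
    new_b = r_b + k * ((1.0 - effective) - (1.0 - e_a))
    return new_a, new_b
```

Because the two updates are symmetric, total rating mass is conserved; user feedback only redistributes it between the compared hypotheses.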
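For the PubMed novelty check ("Enhanced Novelty/Feasibility Validation"), an MCP tool would ultimately wrap NCBI's E-utilities ESearch endpoint. A sketch of the query-building half, assuming the ESearch JSON interface (`db`, `term`, `retmode`, `retmax` parameters); the function name and the "large count means well-trodden" heuristic are illustrative assumptions:

```python
from urllib.parse import urlencode
# import urllib.request  # uncomment to actually issue the request

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_pubmed_query_url(hypothesis_keywords: str, retmax: int = 20) -> str:
    """Build an NCBI ESearch URL for a novelty check.

    The JSON response includes the count of matching PubMed records;
    a large count suggests the hypothesis space is already well explored.
    """
    params = {"db": "pubmed", "term": hypothesis_keywords,
              "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}?{urlencode(params)}"
```

An MCP server would expose this as a tool so agents can call it via `use_mcp_tool` instead of embedding HTTP logic in each agent.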
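Items 1 and 11 fit together: once runs are persisted, trend queries become one SQL statement. A minimal SQLite sketch, assuming a hypothetical `hypothesis_scores` table; the actual schema would be a design decision:

```python
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) per-cycle score table."""
    conn.execute("""CREATE TABLE IF NOT EXISTS hypothesis_scores (
        run_id TEXT, cycle INTEGER, hypothesis TEXT, elo REAL)""")

def record(conn: sqlite3.Connection, run_id: str, cycle: int,
           hypothesis: str, elo: float) -> None:
    conn.execute("INSERT INTO hypothesis_scores VALUES (?, ?, ?, ?)",
                 (run_id, cycle, hypothesis, elo))

def avg_elo_per_cycle(conn: sqlite3.Connection, run_id: str):
    """Return [(cycle, average Elo)] for one run, ready for a trend plot."""
    cur = conn.execute(
        "SELECT cycle, AVG(elo) FROM hypothesis_scores "
        "WHERE run_id = ? GROUP BY cycle ORDER BY cycle", (run_id,))
    return cur.fetchall()
```

Comparing runs is then the same query grouped by `run_id` as well, which is what the "which settings produced higher-ranked hypotheses?" UI feature would sit on top of.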
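The retrieval half of "Literature Corpus Integration (RAG via MCP)" reduces to: embed the query, score it against corpus passages, and return the top k as LLM context. A dependency-free bag-of-words sketch (a real system would use a vector database behind an MCP tool; the tokenization and scoring here are deliberately naive):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the query."""
    qv = Counter(query.lower().split())
    return sorted(corpus,
                  key=lambda d: cosine(qv, Counter(d.lower().split())),
                  reverse=True)[:k]
```

The retrieved passages would then be prepended to the generation and reflection prompts, which is what grounds hypotheses in the user-provided corpus.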