Dylan Hillier
DylanASHillier
Reasoning
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 117
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 109
- Orca-Math: Unlocking the potential of SLMs in Grade School Math
  Paper • 2402.14830 • Published • 25
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 50
Imitative Learning
Embodied useful
Model Internals
State Space Models
- Repeat After Me: Transformers are Better than State Space Models at Copying
  Paper • 2402.01032 • Published • 24
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
  Paper • 2402.04248 • Published • 32
- Linear Transformers with Learnable Kernel Functions are Better In-Context Models
  Paper • 2402.10644 • Published • 81
- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 42
Learning from feedback dir
- Suppressing Pink Elephants with Direct Principle Feedback
  Paper • 2402.07896 • Published • 11
- Policy Improvement using Language Feedback Models
  Paper • 2402.07876 • Published • 9
- Direct Language Model Alignment from Online AI Feedback
  Paper • 2402.04792 • Published • 34
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
  Paper • 2401.01335 • Published • 68
Sample Efficiency
STLM
Benchmarks etc.