adrish's Collections
		Papers to Read
		
 - Speculative Streaming: Fast LLM Inference without Auxiliary Models (Paper • 2402.11131 • Published • 43)
 - Generative Representational Instruction Tuning (Paper • 2402.09906 • Published • 54)
 - Chain-of-Thought Reasoning Without Prompting (Paper • 2402.10200 • Published • 109)
 - BitDelta: Your Fine-Tune May Only Be Worth One Bit (Paper • 2402.10193 • Published • 22)
 - Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models (Paper • 2402.13064 • Published • 50)
 - FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models (Paper • 2402.10986 • Published • 80)
 - 2D Matryoshka Sentence Embeddings (Paper • 2402.14776 • Published • 6)
 - RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation (Paper • 2403.05313 • Published • 9)
 - Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Paper • 2312.00752 • Published • 146)
 - Long-context LLMs Struggle with Long In-context Learning (Paper • 2404.02060 • Published • 37)
 - ReALM: Reference Resolution As Language Modeling (Paper • 2403.20329 • Published • 22)
 - ProAgent: Building Proactive Cooperative AI with Large Language Models (Paper • 2308.11339 • Published)
 - ProAgent: From Robotic Process Automation to Agentic Process Automation (Paper • 2311.10751 • Published • 10)
 - Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs (Paper • 2404.05719 • Published • 83)
 - General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model (Paper • 2409.01704 • Published • 83)