With the big hype around AI agents these days, I couldnโt stop thinking about how AI agents could truly enhance real-world activities. What sort of applications could we build with those AI agents: agentic RAG? self-correcting text-to-sql? Nah, boringโฆ
Passionate about outdoors, Iโve always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. Thatโs why I built ๐๐น๐ฝ๐ถ๐ป๐ฒ ๐๐ด๐ฒ๐ป๐, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.
Built using Hugging Face's ๐๐บ๐ผ๐น๐ฎ๐ด๐ฒ๐ป๐๐ library, Alpine Agent combines the power of AI with trusted resources like ๐๐ฌ๐ช๐ต๐ฐ๐ถ๐ณ.๐ง๐ณ (https://skitour.fr/) and METEO FRANCE. Whether itโs suggesting a route with moderate difficulty or analyzing avalanche risks and weather conditions, this agent dynamically integrates data to deliver personalized recommendations.
In my latest blog post, I share how I developed this projectโfrom defining tools and integrating APIs to selecting the best LLMs like ๐๐ธ๐ฆ๐ฏ2.5-๐๐ฐ๐ฅ๐ฆ๐ณ-32๐-๐๐ฏ๐ด๐ต๐ณ๐ถ๐ค๐ต, ๐๐ญ๐ข๐ฎ๐ข-3.3-70๐-๐๐ฏ๐ด๐ต๐ณ๐ถ๐ค๐ต, or ๐๐๐-4.
โท๏ธ Curious how AI can enhance adventure planning?โจTry the app and share your thoughts: florentgbelidji/alpine-agent ๐ Want to build your own agents? Whether for cooking, sports training, or other passions, the possibilities are endless. Check out the blog post to learn more: https://huggingface.co/blog/florentgbelidji/alpine-agent
Many thanks to @m-ric for helping on building this tool with smolagents!
This work from Chinese startup @MiniMax-AI introduces a novel architecture that achieves state-of-the-art performance while handling context windows up to 4 million tokens - roughly 20x longer than current models. The key was combining lightning attention, mixture of experts (MoE), and a careful hybrid approach.
๐๐ฒ๐ ๐ถ๐ป๐๐ถ๐ด๐ต๐๐:
๐๏ธ MoE with novel hybrid attention: โฃ Mixture of Experts with 456B total parameters (45.9B activated per token) โฃ Combines Lightning attention (linear complexity) for most layers and traditional softmax attention every 8 layers
๐ Outperforms leading models across benchmarks while offering vastly longer context: โฃ Competitive with GPT-4/Claude-3.5-Sonnet on most tasks โฃ Can efficiently handle 4M token contexts (vs 256K for most other LLMs)
๐ฌ Technical innovations enable efficient scaling: โฃ Novel expert parallel and tensor parallel strategies cut communication overhead in half โฃ Improved linear attention sequence parallelism, multi-level padding and other optimizations achieve 75% GPU utilization (that's really high, generally utilization is around 50%)
๐ฏ Thorough training strategy: โฃ Careful data curation and quality control by using a smaller preliminary version of their LLM as a judge!
Overall, not only is the model impressive, but the technical paper is also really interesting! ๐ It has lots of insights including a great comparison showing how a 2B MoE (24B total) far outperforms a 7B model for the same amount of FLOPs.
๐ช๐ฒ'๐๐ฒ ๐ท๐๐๐ ๐ฟ๐ฒ๐น๐ฒ๐ฎ๐๐ฒ๐ฑ ๐๐บ๐ผ๐น๐ฎ๐ด๐ฒ๐ป๐๐ ๐๐ญ.๐ฏ.๐ฌ ๐, and it comes with a major feature: you can now log agent runs using OpenTelemetry to inspect them afterwards! ๐
This interactive format is IMO much easier to inspect big multi-step runs than endless console logs.