Jafta (Jafta September)

liked a Space about 1 month ago

146

ReconViaGen

🖥

High-fidelity 3D Geometry Generation from multi-view images

liked a model about 1 month ago

stabilityai/stable-fast-3d

Image-to-3D • 1B • Updated Apr 8 • 2.59k • 702

liked a model about 2 months ago

unsloth/LFM2-350M

Text Generation • 0.4B • Updated Jul 14 • 3.77k • 2

liked 2 models about 1 year ago

stepfun-ai/GOT-OCR2_0

Image-Text-to-Text • 0.7B • Updated Feb 4 • 44.5k • 1.52k

AI-MO/NuminaMath-7B-TIR

Text Generation • 7B • Updated Aug 14, 2024 • 416 • 348

liked 4 models over 1 year ago

reacted to m-ric's post with ❤️ over 1 year ago

𝗛𝗼𝘄 𝗱𝗼𝗲𝘀 𝗯𝗲𝗮𝗺 𝘀𝗲𝗮𝗿𝗰𝗵 𝗱𝗲𝗰𝗼𝗱𝗶𝗻𝗴 𝘄𝗼𝗿𝗸? ➡️ 𝙉𝙚𝙬 𝙫𝙞𝙨𝙪𝙖𝙡𝙞𝙯𝙖𝙩𝙞𝙤𝙣 𝙩𝙤𝙤𝙡! 👀

In Decoder-type LLMs like GPT4 or Mistral-Large, the output is generated one token (=word part) at a time. That's why they're nicknamed "stochastic parrots": the "thinking" process only happens one step at a time, so it can seem really myopic.

𝐒𝐨 𝐡𝐨𝐰 𝐢𝐬 𝐭𝐡𝐞 𝐧𝐞𝐱𝐭 𝐭𝐨𝐤𝐞𝐧 𝐬𝐞𝐥𝐞𝐜𝐭𝐞𝐝?

📊 Given an input sentence like "𝘞𝘩𝘢𝘵 𝘪𝘴 𝘵𝘩𝘦 7𝘵𝘩 𝘍𝘪𝘣𝘰𝘯𝘢𝘤𝘤𝘪 𝘯𝘶𝘮𝘣𝘦𝘳? 𝘛𝘩𝘦 7𝘵𝘩 𝘍𝘪𝘣𝘰𝘯𝘢𝘤𝘤𝘪 𝘯𝘶𝘮𝘣𝘦𝘳", the Decoder LLM generates, for each token in its vocabulary, a score that represents this token's probability of coming next.
For instance: "𝙞𝙨" gets score 0.56, and "𝙘𝙖𝙣" gets score 0.35.
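
To make the scoring step concrete, here is a minimal sketch of how raw model scores (logits) are usually turned into next-token probabilities with a softmax. The logit values and the three-token vocabulary are made up for illustration; a real LLM scores tens of thousands of tokens.

```python
import math

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

# Hypothetical logits for three candidate next tokens.
logits = {"is": 2.0, "can": 1.5, "was": 0.0}
probs = softmax(logits)

for token, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
    print(f"{token!r}: {p:.2f}")
```

The token with the highest logit always gets the highest probability, so the ranking is unchanged; the softmax only rescales the scores so they can be read as a distribution.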

🤑 𝐆𝐫𝐞𝐞𝐝𝐲 𝐝𝐞𝐜𝐨𝐝𝐢𝐧𝐠 is the naive option where you simply take the most probable next token at each step. But this creates paths that maximize very short-term rewards, and thus may overlook paths that are better in the long term (like that time you played FIFA all evening and arrived unprepared for your school exam the next day).
In our example, the highest-scoring next token might be "𝙞𝙨", but this will strongly bias the LLM towards giving a hasty response. Conversely, starting with "𝙘𝙖𝙣" could have been completed with "𝘣𝘦 𝘰𝘣𝘵𝘢𝘪𝘯𝘦𝘥 𝘧𝘳𝘰𝘮 𝘤𝘰𝘮𝘱𝘶𝘵𝘪𝘯𝘨 𝘱𝘳𝘦𝘷𝘪𝘰𝘶𝘴 𝘍𝘪𝘣𝘰𝘯𝘢𝘤𝘤𝘪 𝘯𝘶𝘮𝘣𝘦𝘳𝘴 𝘧𝘪𝘳𝘴𝘵", which steers the LLM towards correct reasoning!
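
A rough sketch of greedy decoding, assuming a hypothetical toy "model" that is just a lookup table of next-token probabilities. The first row reuses the 0.56 / 0.35 scores from the example; every other number and token is invented for illustration.

```python
# Hypothetical toy model: tokens generated so far -> next-token probabilities.
TOY_MODEL = {
    (): {"is": 0.56, "can": 0.35, "the": 0.09},
    ("is",): {"13": 0.40, "21": 0.30, "8": 0.30},
    ("can",): {"be": 0.95, "also": 0.05},
}

def next_token_probs(prefix):
    """Look up next-token probabilities; unknown prefixes just end the text."""
    return TOY_MODEL.get(tuple(prefix), {"<eos>": 1.0})

def greedy_decode(max_steps):
    """At every step, keep only the single most probable next token."""
    tokens, prob = [], 1.0
    for _ in range(max_steps):
        probs = next_token_probs(tokens)
        best = max(probs, key=probs.get)
        tokens.append(best)
        prob *= probs[best]
    return tokens, prob

greedy_tokens, greedy_prob = greedy_decode(max_steps=2)
print(greedy_tokens, greedy_prob)
```

In this toy table, greedy decoding commits to "is" and ends with a sequence probability of about 0.224, even though the path through "can" then "be" has a higher overall probability (0.35 × 0.95 ≈ 0.33).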

🗺️ 𝐁𝐞𝐚𝐦 𝐬𝐞𝐚𝐫𝐜𝐡 improves on greedy decoding by keeping several candidate paths - called beams - at each step instead of one. This allows the generation to explore a much larger space, and thus find better completions. In our example, both the "𝙞𝙨" and the "𝙘𝙖𝙣" completion could be tested. ✅
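
Here is a minimal beam search sketch over a hypothetical toy lookup-table "model" (the probabilities beyond 0.56 and 0.35 are invented, and this is not the code behind the visualizer). Scores are summed log-probabilities, which is the usual way to rank whole sequences.

```python
import math

# Hypothetical toy model: tokens generated so far -> next-token probabilities.
TOY_MODEL = {
    (): {"is": 0.56, "can": 0.35, "the": 0.09},
    ("is",): {"13": 0.40, "21": 0.30, "8": 0.30},
    ("can",): {"be": 0.95, "also": 0.05},
}

def next_token_probs(prefix):
    """Look up next-token probabilities; unknown prefixes just end the text."""
    return TOY_MODEL.get(tuple(prefix), {"<eos>": 1.0})

def beam_search(num_beams, max_steps):
    """Keep the num_beams highest-scoring partial sequences at each step."""
    beams = [((), 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(max_steps):
        candidates = []
        for tokens, score in beams:
            for token, p in next_token_probs(tokens).items():
                candidates.append((tokens + (token,), score + math.log(p)))
        candidates.sort(key=lambda cand: cand[1], reverse=True)
        beams = candidates[:num_beams]
    return beams

beams = beam_search(num_beams=2, max_steps=2)
best_tokens, best_score = beams[0]
print(best_tokens, math.exp(best_score))
```

With `num_beams=2` the "can" path survives the first step and ends up on top, because "can" followed by "be" is more probable overall than any continuation of "is". With `num_beams=1` the same function reduces to greedy decoding and returns the "is" path.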

👉 I've created a tool to let you visualize it. Thank you @joaogante for your great help!
𝙏𝙧𝙮 𝙞𝙩 𝙝𝙚𝙧𝙚: m-ric/beam_search_visualizer

reacted to Jaward's post with ❤️ over 1 year ago

Retrieval-Augmented Generation (RAG)
Redeemer of the "hallucination problem"

It is fair enough to argue that "hallucinations" in LLMs are mere reflections of what we humans occasionally do - well, it gets worse as we get older - but these models are brain-inspired, thus such behaviors are likely inherently unavoidable. After all, we are just dreamers trying to make sense of this life.

The best we can do is minimize and control it - but how, humanly? By first feeding on relevant facts and then developing a habit that allows us to easily access those facts when needed. This is what RAG is all about - it's just a control mechanism that keeps the LLM aligned with reality and fact.

But How Does RAG Work?

Well, to some extent it is domain-specific but the overall workflow boils down to the following:

1. It makes use of a retrieval mechanism that hunts for facts relevant to a query - a retriever (Query Encoder + Document Index or Source of Truth) is paired with a pre-trained generative model, and the combination can be trained end-to-end with backpropagation.

2. The generative model then uses the retrieved facts, and performs some verification, to give a more accurate response.

To summarize, the RAG architecture combines a pre-trained generative model, whose knowledge lives in its weights (termed parametric memory), with a Source-of-Truth or vector-indexed data store (termed non-parametric memory) that is accessed by a pre-trained neural retriever, in order to produce more informed, contextually appropriate and factually correct responses.
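
The retrieve-then-generate workflow can be sketched very roughly as follows. Everything here is a simplifying assumption: the three documents are made up, `embed` is a bag-of-words stand-in for a neural query/document encoder, no training or backpropagation happens, and the final call to a generative model is left out (the sketch stops at building the prompt).

```python
import math
import re
from collections import Counter

# Hypothetical "source of truth": a tiny in-memory document index.
DOCUMENTS = [
    "The Fibonacci sequence starts 1, 1, 2, 3, 5, 8, 13.",
    "Beam search keeps several candidate sequences at each decoding step.",
    "Freetown is the capital of Sierra Leone.",
]

def embed(text):
    """Toy stand-in for a neural encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, k=1):
    """Put the retrieved facts in front of the question for the generator."""
    facts = "\n".join(retrieve(query, k))
    return f"Context:\n{facts}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("What is the capital of Sierra Leone?")
print(prompt)
```

The generator would then complete this prompt, so its answer is conditioned on the retrieved fact (non-parametric memory) rather than only on what it memorized during training (parametric memory).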

Sort of a "Genius Engine" if you will. If only we humans could harness such a thing, AGI would arrive much, much sooner lol.

In the meantime, I have been Jaward Sesay (Chinese name 苏杰 Sujie) - a young Sierra Leonean, aspiring AI Researcher. I like to read, share and try implementing AI research papers. I also like dunking on big tech while rooting for open-source. My mentor is @karpathy, and I dream of him following me back on X lol. Thanks.