22 days ago

•

Thank you so much for your hard work ! This is an incredible model ! Better than the original as for me, it gave me an answer for 7 pages in WORD in small print, or answer for 3800 tokens.

I noticed that the GROK gives more detailed answers than the GPT version there too exactly but it's short and to the point. they are both good in their own way.

I was wondering if in the future there will be versions of GEMMA that can Reasoning mode with version 12B based on GROK?

reedmayhew

Owner 15 days ago

•

edited 15 days ago

Thank you so much for the feedback! I greatly appreciate it.

I've come up with a pretty good workflow for capturing the functions of new models as they come out, so I'm so glad to hear that it's working! Gemma 3 seems to take on the features of these datasets very well when it comes to finetuning on them.

As for reasoning with Grok 3, I'm not sure if I have the ability to access it's reasoning responses through the API yet. I might be able to trick it into reasoning, and if that's the case, then I can see about making a reasoning dataset for Gemma 3 to finetune on.

With my other models that reason, I actually have to get the reasoning output from the original models and then embed it into the dataset in order for it to work.

I'll see if I can though and update you here if I make any progress!

UPDATE:

Good news! I tested it and it seems like I can force it to reason. I'm re-running the dataset requests with reasoning from the model, and then I'll finetune new Gemma models with reasoning shortly!

E7Reine

15 days ago

•

edited 15 days ago

Greetings!
I am very pleased to hear about your successes, I have tried many different models, and I must admit Gemma is a success, especially the versions of it that you create.
why I focused on the GROK version it usually does not reduce the information point by point, breaking it down into sub categories 1. 1.1 1.1 1.2 etc.
Grok produces amazing informative results with more detailed information.

I also want to thank you for "o4-mini-high-gemma3-12B-distilled" it is excellent and in my opinion even surpasses GPT 4.5.

Reasoning GROK or o4-mini this would be very welcome, such model variants being the golden mean and as I think better than the regular versions at 27b🙏🙏
Thank you sincerely for your hard work.
I bow to you with deep respect.

reedmayhew

Owner 13 days ago

•

edited 13 days ago

Thank you! I very much appreciate it.

Unfortunately, I am running into an error in the Colab notebook that happens once in a while, where it says:

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 100 | Num Epochs = 6 | Total steps = 72
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 1,047,527,424/12,000,000,000 (8.73% trained)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/unsloth_zoo/loss_utils.py in _unsloth_get_batch_samples(self, epoch_iterator, num_batches, device, *args, **kwargs)
    281                 num_items_in_batch = sum(
--> 282                     [(x["labels"][..., 1:] != -100)\
    283                     .sum() for x in batch_samples]

4 frames
TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/unsloth_zoo/loss_utils.py in _unsloth_get_batch_samples(self, epoch_iterator, num_batches, device, *args, **kwargs)
    294                 num_items_in_batch = num_items_in_batch.to(device)
    295         except Exception as exception:
--> 296             raise RuntimeError(exception)
    297     pass
    298     return batch_samples, num_items_in_batch

RuntimeError: 'NoneType' object is not subscriptable

I know it's something inside the dataset that is causing it to happen, because I have had this occur when modifying a dataset and adding new lines.

However, I've never figured out what causes it, and because this is happening with a freshly generated dataset, I can't just undo a few recent changes to fix it, as it was untouched since it was created.

So, as of right now, I'm stuck. If someone knows what causes this error in Unsloth notebooks, I would greatly appreciate the insight, as I'd love to fix this error and provide you with the reasoning version of the model!

Wish me luck, and hopefully, we can get some insight on this. - Reed

reedmayhew

Owner 13 days ago

•

edited 12 days ago

Spoke too soon! GPT 4.1 to the rescue.

It found that one of the responses in the dataset had: "item متنContent = track.title;" in it, and Unsloth doesn't like the "متن" character.

Trying again to see if that fixed it.

Update: Training now! 12B and 4B reasoning versions should be available shortly.

Update 2: and... Another bug... Can't these notebooks just work???

https://github.com/unslothai/unsloth/issues/2422

https://github.com/unslothai/unsloth-zoo/issues/123

I'll update shortly...

https://github.com/unslothai/unsloth-zoo/pull/125

Waiting for pull request to get implemented...

reedmayhew

Owner 12 days ago

•

edited 12 days ago

Don't ask me how I got it to work...

I have a headache...

https://huggingface.co/reedmayhew/Grok-3-reasoning-gemma3-12B-distilled-GGUF

https://huggingface.co/reedmayhew/Grok-3-reasoning-gemma3-4B-distilled-GGUF

Enjoy!

reedmayhew

Owner 11 days ago

•

edited 11 days ago

The reasoning model works really well!!!

It was able to generate this with little effort: (Much more mesmerizing/smooth in a browser.)

CODE:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Spinning 3D Cube</title>
    <!-- Tailwind CSS CDN -->
    <script src="https://cdn.tailwindcss.com"></script>
    <style>
        /* Container for the cube */
        .cube-container {
            width: 200px;
            height: 200px;
            position: relative;
            margin: 50px auto;
        }

        /* The cube itself, styled with transforms */
        .cube {
            transform-style: preserve-3d;
            animation: spinCube 5s linear infinite;
        }

        /* Individual faces of the cube */
        .face {
            position: absolute;
            width: 200px;
            height: 200px;
            display: flex;
            justify-content: center;
            align-items: center;
            font-size: 1rem;
            opacity: 0.8;
        }

        /* Styling for each face with gradients */
        .front {
            background: linear-gradient(45deg, #3a7bd5, #2962ff);
            color: white;
            transform: translateZ(100px);
        }

        .back {
            background: linear-gradient(45deg, #e040fb, #d81b60);
            color: white;
            transform: rotateY(180deg) translateZ(100px);
        }

        .top {
            background: linear-gradient(to top, #ff7f50, #f9a35a);
            color: black;
            transform: rotateX(90deg) translateZ(100px);
        }

        .bottom {
            background: linear-gradient(to bottom, #2ecc71, #27ae60);
            color: white;
            transform: rotateX(-90deg) translateZ(100px);
        }

        .left {
            background: linear-gradient(90deg, #bdc3c7, #95a5a6);
            color: black;
            transform: rotateY(-90deg) translateZ(100px);
        }

        .right {
            background: linear-gradient(90deg, #1abc9c, #16a085);
            color: white;
            transform: rotateY(90deg) translateZ(100px);
        }

        /* Animation to spin the cube on all axes */
        

@keyframes
	 spinCube {
            from {
                transform: rotateX(0deg) rotateY(0deg) rotateZ(0deg);
            }
            to {
                transform: rotateX(360deg) rotateY(360deg) rotateZ(360deg);
            }
        }
    </style>
</head>
<body class="bg-gray-100 flex items-center justify-center h-screen">
    <div class="cube-container">
        <div class="cube">
            <div class="face front">Front</div>
            <div class="face back">Back</div>
            <div class="face top">Top</div>
            <div class="face bottom">Bottom</div>
            <div class="face left">Left</div>
            <div class="face right">Right</div>
        </div>
    </div>
</body>
</html>

PSM24

5 days ago

Spoke too soon! GPT 4.1 to the rescue.

It found that one of the responses in the dataset had: "item متنContent = track.title;" in it, and Unsloth doesn't like the "متن" character.

Trying again to see if that fixed it.

Update: Training now! 12B and 4B reasoning versions should be available shortly.

Update 2: and... Another bug... Can't these notebooks just work???

https://github.com/unslothai/unsloth/issues/2422

https://github.com/unslothai/unsloth-zoo/issues/123

I'll update shortly...

https://github.com/unslothai/unsloth-zoo/pull/125

Waiting for pull request to get implemented...

Can you create a Gemma thinking version using this Gemini-2.5 Pro dataset?
https://huggingface.co/datasets/PSM24/gemini-2.5-pro-100x

reedmayhew

Owner 5 days ago

Hello! I'm definitely can look into it. I might have to reproduce a similar dataset since there's a specific way I format the reasoning in order to get it to work, which is done behind the scenes in order to prompt the model to give the answer correctly.

But I can definitely look into making a Gemini 2.5 Pro reasoning model!

reedmayhew
/

Grok-3-gemma3-12B-distilled

SUPER !

UPDATE:

Update: Training now! 12B and 4B reasoning versions should be available shortly.

Update 2: and... Another bug... Can't these notebooks just work???

Don't ask me how I got it to work...

Update: Training now! 12B and 4B reasoning versions should be available shortly.

Update 2: and... Another bug... Can't these notebooks just work???