Broken?
This version of the model seems to be broken. While the "airoboros-l2-7b-2.1" model gives quite good results, this one appends random code to the end of the generated text.
{code}
Prompt:
my_prompt = '''A chat. You are to take the role of: Hans
Hans is 27 years old from Hamburg.
USER:
Hey from where are you?
ASSISTANT:
'''
input_ids = tokenizer(f"{my_prompt}", return_tensors="pt").input_ids # .to("cuda")
outputs = model.generate(input_ids, max_new_tokens=128, min_length=8, temperature=0.95, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
{/code}
Result:
ASSISTANT:
Hey, I'm from Hamburg! It's a beautiful city with lots of historical buildings and picturesque canals.
package com.linkedin.cubert.utils
import java.nio.file.{Path, Paths}
trait CubertPaths extends Utils {
def getCubertHDFSOptions(output: Path, maxPartitions: Int, maxMemoryInMB: Long): CubertHdfsUtils.CubertHDFSOptions = {
CubertHdfsUtils.CubertHDFSOptions.
ASSISTANT:
Hey! I'm currently in Hamburg, Germany. It's a beautiful city by the water with lots of historic buildings and modern attractions. I highly recommend visiting if you ever get the chance!
#include "test-framework/test-framework.h"
// -----------------------------------------------------------------------
// Tests
//
TEST(test_1, test_add_sub) {
std::complex c0, c1;
c0 = 3 + 5 * I;
c1 = 6 - 3 * I
ASSISTANT:
Guten Tag! Ich komme aus Hamburg, da wo der Elbe fließt.
(Translation: Good day! I come from Hamburg, where the Elbe flows.)
<?php
namespace App\Http\Controllers\Api;
use App\Models\Project;
use Illuminate\Support\Facades\Storage;
use Illuminate\Http\Request;
use App\Http\Controllers\Controller;
class ProjectController extends Controller
{
public function index()
{
$data = Project::all();
return response
Let me check and get back to you. I've made it private for the time being until I can verify.
I think it was just missing the tokenizer.json file. Please download that file and try again. I was able to reproduce the error without this file, but once I downloaded it everything worked as expected.
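If anyone else hits the same missing-file issue, a quick way to pull just that one file is hf_hub_download from huggingface_hub. A minimal sketch, assuming you load the model from the Hub (the repo_id below is a placeholder for whichever airoboros repo you are actually using):
{code}
from huggingface_hub import hf_hub_download

# Fetch only tokenizer.json into the local HF cache.
# repo_id is a placeholder -- substitute the repo you are actually loading.
path = hf_hub_download(repo_id="jondurbin/airoboros-l2-13b-2.2", filename="tokenizer.json")
print(path)
{/code}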
No. Still not working correctly for me.
A chat. You are to take the role of: Hans
Hans is 27 years old from Hamburg.
USER:
Hey from where are you?
ASSISTANT:
Hey, I am from Hamburg! The city known for its harbor and red light district. A cosmopolitan city with beautiful waterways and bridges - a place where the modern blending together with traditional German charm.
import { RouterModule, Routes } from '@angular/router';
const appRoutes: Routes = [
{
path: 'admin',
loadChildren: () => import('./admin/admin.module')
}
];
export const RouterConfig = RouterModule.forRoot(appRoutes);
I am having the same problem with a lot of the other versions.
airoboros-l2-13b-2.2
airoboros-l2-7b-2.2
spicyboros-13b-2.2
spicyboros-7b-2.2
The only one working for me as intended is airoboros-l2-7b-2.1
Can you try locking your transformers library to version transformers==4.31.0?
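(For reference, that would be pip install transformers==4.31.0 in the environment you run inference from; a quick sanity check that the pinned release is actually the one being imported, just a sketch:)
{code}
import transformers

# Should print 4.31.0 after pinning the package.
print(transformers.__version__)
{/code}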
It seems transformers ~ 4.33 is not stopping generation at the EOS token ID. I printed the token IDs here:
tensor([ 319, 13563, 29889, 887, 526, 304, 2125, 278, 6297, 310,
29901, 6971, 13, 29950, 550, 338, 29871, 29906, 29955, 2440,
2030, 515, 14847, 29889, 13, 13, 11889, 29901, 18637, 515,
988, 526, 366, 29973, 13, 22933, 9047, 13566, 29901, 29871,
18637, 29892, 306, 29915, 29885, 5279, 297, 14847, 29892, 9556,
29889, 739, 29915, 29879, 263, 9560, 4272, 411, 8261, 4955,
322, 9257, 29889, 6975, 366, 1063, 1244, 1434, 29973, 13,
2, 1, 29871, ...
The EOS token ID is 2, so it should have stopped, but it kept generating.
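(A way to reproduce this check yourself is to print the raw IDs instead of the decoded text; a sketch, assuming the outputs and tokenizer variables from the generation snippet above:)
{code}
# ID 2 is </s> for the LLaMA tokenizer, so any tokens after the first 2
# mean generation ran past the EOS token.
print(outputs[0])
print(tokenizer.eos_token_id)  # 2
{/code}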
You can continue using transformers ~4.33 by setting the eos_token_id in generate.
Full example:
import transformers
import torch
model = transformers.AutoModelForCausalLM.from_pretrained(
'/workspace/airoboros-l2-13b-2.2',
device_map='auto',
torch_dtype=torch.bfloat16
)
tokenizer = transformers.AutoTokenizer.from_pretrained(
'/workspace/airoboros-l2-13b-2.2'
)
prompt = '''A chat. You are to take the role of: Hans
Hans is 27 years old from Hamburg.
USER: Hey from where are you?
ASSISTANT: '''
inputs = tokenizer(prompt, add_special_tokens=False, return_tensors="pt")
input_ids = inputs.input_ids.to("cuda")
outputs = model.generate(
input_ids,
max_new_tokens=128,
min_length=8,
temperature=0.95,
do_sample=True,
attention_mask=inputs.attention_mask,
pad_token_id=tokenizer.pad_token_id,
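# passing eos_token_id explicitly is what makes generation stop at </s> on transformers ~4.33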
eos_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Also note the slight change in whitespace between my example and your original prompt.
Actually, if you just pull the latest copy of generation_config.json, which explicitly sets the pad/eos token IDs, it should fix it.
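If it helps to confirm the updated config was actually picked up, GenerationConfig can read it without loading any weights (a sketch, using the local path from the earlier example):
{code}
from transformers import GenerationConfig

# Reads the generation config from the model directory; with the updated
# file in place, the eos/pad token IDs should no longer be unset.
gen_config = GenerationConfig.from_pretrained('/workspace/airoboros-l2-13b-2.2')
print(gen_config.eos_token_id, gen_config.pad_token_id)  # eos should be 2
{/code}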
Alright. Thank you very much. With the updated generation_config.json it's working!