---
datasets:
- wikimedia/wikipedia
language:
- en
---

# Model Info

xLSTM trained on a shuffled copy of the wikimedia/wikipedia 20231101.en dataset (seed=42).

Model checkpoints are available as branches of this repository.

Training hyperparameters:

```
per_device_train_batch_size=32,
logging_steps=3650,
gradient_accumulation_steps=8,
num_train_epochs=1,
weight_decay=0.1,
warmup_steps=1_000,
lr_scheduler_type="cosine",
learning_rate=5e-4,
save_steps=3650,
fp16=True,
```

## How to use

Install:

```
pip install xlstm
pip install mlstm_kernels
pip install 'transformers @ git+https://git@github.com/NX-AI/transformers.git@integrate_xlstm_clean'
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hub
xlstm = AutoModelForCausalLM.from_pretrained("J4bb4wukis/xlstm_406m_wikipedia_en_shuffeld")
tokenizer = AutoTokenizer.from_pretrained("J4bb4wukis/xlstm_406m_wikipedia_en_shuffeld")

# Tokenize the prompt and sample up to 100 new tokens
prompts = "Angela Merkel is"
inputs = tokenizer(prompts, return_tensors='pt').input_ids
outputs = xlstm.generate(inputs, max_new_tokens=100, do_sample=True, top_k=10, top_p=0.95)

print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```
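
Since checkpoints are stored as branches and `save_steps=3650`, it can help to estimate how much data each checkpoint has seen, and to load a specific branch. The following is a minimal sketch, not part of the original training code. It assumes that `save_steps` counts optimizer steps (as in the Hugging Face `Trainer`) and that training ran on a single device; the device count is not stated in this card. The helper names are hypothetical, and the branch name passed to `load_checkpoint` must be taken from the repository's branch list.

```python
# Values copied from the training hyperparameters in this card.
PER_DEVICE_TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 8
SAVE_STEPS = 3650

REPO_ID = "J4bb4wukis/xlstm_406m_wikipedia_en_shuffeld"


def sequences_per_step(num_devices: int = 1) -> int:
    """Training sequences consumed per optimizer step.

    num_devices is an assumption; the card does not state it.
    """
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def sequences_seen(checkpoint_index: int, num_devices: int = 1) -> int:
    """Sequences seen by the n-th saved checkpoint (1-based),
    assuming one checkpoint every SAVE_STEPS optimizer steps."""
    return checkpoint_index * SAVE_STEPS * sequences_per_step(num_devices)


def load_checkpoint(revision: str):
    """Load the model from a given branch of the repository.

    `revision` is a standard argument of `from_pretrained`; the branch
    name itself must be looked up on the Hub.
    """
    # Imported lazily so the arithmetic helpers work without transformers.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(REPO_ID, revision=revision)


if __name__ == "__main__":
    print(sequences_per_step())  # sequences per optimizer step on one device
    print(sequences_seen(1))     # sequences seen at the first checkpoint
```

If training used more than one device, pass the actual count as `num_devices`; the estimates scale linearly with it.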