mamba-retriever-do_not_open-regular_chunk

This model is a fine-tuned version of state-spaces/mamba2-130m for information retrieval tasks.

Model Details

  • Base Model: state-spaces/mamba2-130m
  • Training Dataset: lei-ucsd/do_not_open
  • Configuration: regular_chunk
  • Task: Binary classification for passage relevance in information retrieval

Training Details

  • Dataset: do_not_open
  • Chunking Strategy: regular_chunk
  • Training Steps: 1000 (checkpoint-1000)

Usage

from transformers import AutoTokenizer, AutoModel
import torch

# Load the model and tokenizer (the checkpoint adds a custom binary
# classification head on top of the Mamba backbone)
tokenizer = AutoTokenizer.from_pretrained("lei-ucsd/mamba-retriever-do_not_open-regular_chunk")
model = AutoModel.from_pretrained("lei-ucsd/mamba-retriever-do_not_open-regular_chunk")
model.eval()

# Score a text for relevance (for retrieval this is typically the query
# and a candidate passage together; see the ranking sketch after this block)
text = "Your query text here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    outputs = model(**inputs)
    # Use binary_logits for relevance scoring
    binary_logits = outputs.binary_logits
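
The head's output format is not documented here; the snippet below sketches one way to turn binary_logits into a ranking over candidate passages. It assumes the head emits one logit pair per input ([not-relevant, relevant]) and that query and passage are scored jointly as a single concatenated input; both the separator and the logit layout are assumptions, so check the training configuration before relying on them.

import torch.nn.functional as F

query = "what is a state-space model?"              # hypothetical query
passages = ["Passage one ...", "Passage two ..."]   # hypothetical candidates

scores = []
for passage in passages:
    # Joint query/passage input; the " " separator is an assumption
    pair = tokenizer(query + " " + passage, return_tensors="pt",
                     padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**pair).binary_logits
    # Assumes logits has shape (1, 2): [not-relevant, relevant]
    scores.append(F.softmax(logits, dim=-1)[0, -1].item())

# Rank candidates by relevance probability, highest first
for score, passage in sorted(zip(scores, passages), reverse=True):
    print(f"{score:.3f}  {passage[:60]}")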

Model Architecture

The model extends the base Mamba architecture with a binary classification head for passage relevance scoring.
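
As a rough illustration of that design, here is a minimal sketch of such an architecture, assuming a single linear head applied to the last token's hidden state; the class name, pooling scheme, and number of labels are assumptions, not details taken from the actual checkpoint.

import torch.nn as nn
from transformers import AutoModel

class MambaBinaryRetriever(nn.Module):
    """Hypothetical sketch: Mamba backbone plus a linear relevance head."""
    def __init__(self, backbone_name="state-spaces/mamba2-130m", num_labels=2):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone_name)
        self.binary_head = nn.Linear(self.backbone.config.hidden_size, num_labels)

    def forward(self, input_ids):
        hidden_states = self.backbone(input_ids=input_ids).last_hidden_state
        # Pool with the final token's state (an assumption; the real
        # checkpoint may use mean pooling or another scheme)
        pooled = hidden_states[:, -1, :]
        return self.binary_head(pooled)  # -> binary_logits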

Training Data

This model was trained on the lei-ucsd/do_not_open dataset with regular_chunk chunking strategy.
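
The card does not define regular_chunk, but the name suggests splitting documents into fixed-length token windows before scoring. A minimal, purely illustrative sketch of that strategy (the 512-token window and non-overlapping stride are assumed defaults):

def regular_chunk(text, tokenizer, chunk_size=512, stride=512):
    """Hypothetical fixed-size chunker: split a document into equal-length
    token windows (non-overlapping when stride == chunk_size)."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    chunks = []
    for start in range(0, len(ids), stride):
        window = ids[start:start + chunk_size]
        if not window:
            break
        chunks.append(tokenizer.decode(window))
    return chunks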

Limitations

This model is designed specifically for information retrieval tasks and may not perform well on other text classification tasks.

Citation

If you use this model, please cite the original Mamba paper and the dataset used for training.
