SweRankEmbed-Large is a 7B bi-encoder for code retrieval. It significantly outperforms other embedding models on the issue localization task.

The model has been trained on large-scale issue localization data collected from public Python GitHub repositories. Check out our blog post and paper for more details!

You can combine SweRankEmbed with our SweRankLLM-Small or SweRankLLM-Large rerankers for even higher quality ranking performance.
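The two-stage setup described above (embed and shortlist first, then rerank) can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the SweRank pipeline itself: `retrieve_then_rerank` and `rerank_fn` are hypothetical names, and the actual SweRankLLM interface is not shown here. Any callable that takes a query and a candidate list and returns the candidates in a new order can be plugged in.

```python
from typing import Callable, List, Sequence

# Hypothetical reranker signature: (query, candidates) -> reordered candidates.
Reranker = Callable[[str, List[str]], List[str]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product; equals cosine similarity for L2-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_then_rerank(
    query: str,
    query_embedding: Sequence[float],
    document_embeddings: Sequence[Sequence[float]],
    documents: Sequence[str],
    rerank_fn: Reranker,
    top_k: int = 10,
) -> List[str]:
    """Stage 1: shortlist by embedding score. Stage 2: let the reranker reorder it."""
    scores = [dot(query_embedding, emb) for emb in document_embeddings]
    # Indices of documents sorted by descending similarity score.
    order = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    shortlist = [documents[i] for i in order[:top_k]]
    return rerank_fn(query, shortlist)
```

Keeping the reranker behind a plain callable means the shortlist size `top_k` is the main knob: a larger shortlist gives the reranker more chances to recover a missed function, at the cost of more reranker calls.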

Link to code: https://github.com/gangiswag/SweRank

Performance

SweRank models achieve state-of-the-art localization performance on benchmarks such as SWE-Bench-Lite and LocBench, considerably outperforming agent-based approaches that rely on Claude 3.5.

| Model Name | SWE-Bench-Lite Func@10 | LocBench Func@15 |
|---|---|---|
| OpenHands (Claude 3.5) | 70.07 | 59.29 |
| LocAgent (Claude 3.5) | 77.37 | 60.71 |
| CodeRankEmbed (137M) | 58.76 | 50.89 |
| GTE-Qwen2-7B-Instruct (7B) | 70.44 | 57.14 |
| SweRankEmbed-Small (137M) | 74.45 | 63.39 |
| SweRankEmbed-Large (7B) | 82.12 | 67.32 |
| + GPT-4.1 reranker | 87.96 | 74.64 |
| + SweRankLLM-Small (7B) reranker | 86.13 | 74.46 |
| + SweRankLLM-Large (32B) reranker | 88.69 | 76.25 |
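For readers who want to reproduce a Func@k-style number on their own data, here is a small sketch. It assumes that an issue counts as localized only when every gold function appears among the top-k ranked candidates; please consult the paper for the exact definition used in the table. The function names and the string identifiers are illustrative.

```python
from typing import List, Sequence, Set


def hit_at_k(ranked: Sequence[str], gold: Set[str], k: int) -> bool:
    """True when every gold function appears among the top-k ranked candidates."""
    return gold.issubset(ranked[:k])


def func_at_k(rankings: List[Sequence[str]], golds: List[Set[str]], k: int) -> float:
    """Percentage of issues whose gold functions are all found in the top k."""
    if len(rankings) != len(golds):
        raise ValueError("rankings and golds must have the same length")
    hits = [hit_at_k(ranked, gold, k) for ranked, gold in zip(rankings, golds)]
    return 100.0 * sum(hits) / len(hits)
```

Each entry in `rankings` is one issue's ranked list of function identifiers, and the matching entry in `golds` is the set of functions changed by the reference fix.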

Requirements

transformers>=4.39.2
flash_attn>=2.5.6

Usage with Sentence-Transformers

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Salesforce/SweRankEmbed-Large", trust_remote_code=True)
# In case you want to reduce the maximum length:
model.max_seq_length = 8192

queries = ['Calculate the n-th factorial']
documents = ['def fact(n):\n if n < 0:\n  raise ValueError\n return 1 if n == 0 else n * fact(n - 1)']

query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)

scores = query_embeddings @ document_embeddings.T

for query, query_scores in zip(queries, scores):
    doc_score_pairs = list(zip(documents, query_scores))
    doc_score_pairs = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
    # Output passages & scores
    print("Query:", query)
    for document, score in doc_score_pairs:
        print(score, document)

See config_sentence_transformers.json for all pre-built prompt names.
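If you have a local copy of that file, the prompt names can be listed with a few lines of standard-library code. This sketch assumes the prompts live under a top-level `prompts` key, which is the usual Sentence-Transformers convention; `load_prompts` is an illustrative helper, not part of any library.

```python
import json
from pathlib import Path
from typing import Dict


def load_prompts(config_path: str) -> Dict[str, str]:
    """Return the prompt-name -> prompt-text mapping, or an empty dict if absent."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # Assumption: prompts are stored under the top-level "prompts" key.
    return config.get("prompts") or {}
```

Any name returned here can be passed as `prompt_name` to `model.encode`, as done with `"query"` in the example above.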

Usage with Huggingface Transformers

Important: the query prompt must include the following task instruction prefix: "Instruct: Given a github issue, identify the code that needs to be changed to fix the issue.\nQuery: "

import torch
import torch.nn.functional as F

from torch import Tensor
from transformers import AutoTokenizer, AutoModel

def last_token_pool(last_hidden_states: Tensor, attention_mask: Tensor) -> Tensor:
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]

def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery: {query}'

# Each query must come with a one-sentence instruction that describes the task
task = 'Given a github issue, identify the code that needs to be changed to fix the issue.'

tokenizer = AutoTokenizer.from_pretrained('Salesforce/SweRankEmbed-Large',  trust_remote_code=True)
model = AutoModel.from_pretrained('Salesforce/SweRankEmbed-Large',  trust_remote_code=True)
model.eval()

max_length = 8192

queries = ['Calculate the n-th factorial']
queries_with_prefix  = [get_detailed_instruct(task, query) for query in queries]
query_inputs = tokenizer(queries_with_prefix, padding=True, truncation=True, return_tensors='pt', max_length=max_length)

documents = ['def fact(n):\n if n < 0:\n  raise ValueError\n return 1 if n == 0 else n * fact(n - 1)']
document_inputs = tokenizer(documents, padding=True, truncation=True, return_tensors='pt', max_length=max_length)

# Compute token embeddings
with torch.no_grad():
    query_embeddings = last_token_pool(model(**query_inputs).last_hidden_state, query_inputs["attention_mask"])
    document_embeddings = last_token_pool(model(**document_inputs).last_hidden_state, document_inputs["attention_mask"])


# normalize embeddings
query_embeddings = F.normalize(query_embeddings, p=2, dim=1)
document_embeddings = F.normalize(document_embeddings, p=2, dim=1)

scores = torch.mm(query_embeddings, document_embeddings.transpose(0, 1))
for query, query_scores in zip(queries, scores):
    doc_score_pairs = list(zip(documents, query_scores))
    doc_score_pairs = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
    # Output passages & scores
    print("Query:", query)
    for document, score in doc_score_pairs:
        print(score, document)
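To make the pooling step above easier to follow, here is a dependency-free sketch that mirrors the indexing of `last_token_pool` on plain Python lists. The names `last_token_index` and `last_token_pool_py` are illustrative only. It decides the padding side per row rather than per batch, which gives the same result for standard left- or right-padded batches.

```python
from typing import List


def last_token_index(mask_row: List[int]) -> int:
    """Index of the last real (non-padding) token in one attention-mask row."""
    if mask_row[-1] == 1:
        # Left padding (or no padding): the final position is a real token.
        return len(mask_row) - 1
    # Right padding: real tokens come first, so count them.
    return sum(mask_row) - 1


def last_token_pool_py(
    hidden_states: List[List[List[float]]],
    attention_mask: List[List[int]],
) -> List[List[float]]:
    """Pick, for each sequence, the hidden state of its last real token."""
    return [
        row[last_token_index(mask)]
        for row, mask in zip(hidden_states, attention_mask)
    ]
```

With a right-padded mask such as `[1, 1, 1, 0, 0]` the selected position is 2, while a left-padded mask such as `[0, 0, 1, 1, 1]` selects position 4. In both cases this is the last token the model actually attended to.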

Citation

If you find this model useful in your research, please consider citing our paper:

@article{reddy2025swerank,
  title={SweRank: Software Issue Localization with Code Ranking},
  author={Reddy, Revanth Gangi and Suresh, Tarun and Doo, JaeHyeok and Liu, Ye and Nguyen, Xuan Phi and Zhou, Yingbo and Yavuz, Semih and Xiong, Caiming and Ji, Heng and Joty, Shafiq},
  journal={arXiv preprint arXiv:2505.07849},
  year={2025}
}