This is a text classification model based on answerdotai/ModernBERT-base, fine-tuned to distinguish between human-written and AI-generated code. The model was trained on the DroidCollection dataset and is designed as a binary classifier to address the core task of AI code detection.
A key feature of this model is its training objective, which combines standard Cross-Entropy Loss with a Batch-Hard Triplet Loss. This contrastive loss component encourages the model to learn more discriminative embeddings by pushing representations of human vs. machine code further apart in the vector space.
Total Loss = CrossEntropyLoss + 0.1 * TripletLoss
The model predicts one of two classes. The mapping from ID to label is as follows:
```json
{
  "0": "HUMAN_GENERATED",
  "1": "MACHINE_GENERATED"
}
```
The following code defines the model architecture and training objective, and can be used for reproducibility:
```python
import torch.nn as nn
import torch.nn.functional as F
from sentence_transformers import losses

TEXT_EMBEDDING_DIM = 768  # hidden size of ModernBERT-base
NUM_CLASSES = 2           # HUMAN_GENERATED vs. MACHINE_GENERATED


class TLModel(nn.Module):
    def __init__(self, text_encoder, projection_dim=128, num_classes=NUM_CLASSES, class_weights=None):
        super().__init__()
        self.text_encoder = text_encoder
        self.num_classes = num_classes
        text_output_dim = TEXT_EMBEDDING_DIM
        # Batch-hard triplet loss (soft margin), applied to the projected embeddings
        self.additional_loss = losses.BatchHardSoftMarginTripletLoss(self.text_encoder)
        self.text_projection = nn.Linear(text_output_dim, projection_dim)
        self.classifier = nn.Linear(projection_dim, num_classes)
        self.class_weights = class_weights

    def forward(self, labels=None, input_ids=None, attention_mask=None):
        actual_labels = labels
        # Mean-pool the token embeddings into a single sequence embedding
        sentence_embeddings = self.text_encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        sentence_embeddings = sentence_embeddings.mean(dim=1)
        projected_text = F.relu(self.text_projection(sentence_embeddings))
        logits = self.classifier(projected_text)

        loss = None
        cross_entropy_loss = None
        contrastive_loss = None
        if actual_labels is not None:
            loss_fct_ce = nn.CrossEntropyLoss(
                weight=self.class_weights.to(logits.device) if self.class_weights is not None else None
            )
            cross_entropy_loss = loss_fct_ce(logits.view(-1, self.num_classes), actual_labels.view(-1))
            contrastive_loss = self.additional_loss.batch_hard_triplet_loss(
                embeddings=projected_text, labels=actual_labels
            )
            # Total Loss = CrossEntropyLoss + 0.1 * TripletLoss
            lambda_contrast = 0.1
            loss = cross_entropy_loss + lambda_contrast * contrastive_loss

        output = {"logits": logits, "fused_embedding": projected_text}
        if loss is not None:
            output["loss"] = loss
        if cross_entropy_loss is not None:
            output["cross_entropy_loss"] = cross_entropy_loss
        if contrastive_loss is not None:
            output["contrastive_loss"] = contrastive_loss
        return output
```
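A minimal usage sketch follows, showing how the architecture is wired together with the answerdotai/ModernBERT-base encoder. It is an illustration under stated assumptions: the fine-tuned weights would still need to be loaded into `TLModel` separately (e.g. via `load_state_dict`), and the example code snippet is purely hypothetical.

```python
# Minimal usage sketch (assumption: fine-tuned TLModel weights are loaded separately;
# this only demonstrates how the pieces fit together).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
encoder = AutoModel.from_pretrained("answerdotai/ModernBERT-base")

model = TLModel(text_encoder=encoder)
model.eval()

code_snippet = "def add(a, b):\n    return a + b"  # hypothetical input
inputs = tokenizer(code_snippet, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"])

pred = outputs["logits"].argmax(dim=-1).item()
print({0: "HUMAN_GENERATED", 1: "MACHINE_GENERATED"}[pred])
```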