---
license: apache-2.0
datasets:
- project-droid/DroidCollection
base_model:
- answerdotai/ModernBERT-large
pipeline_tag: text-classification
---

# DroidDetect-Large

This is a text classification model based on `answerdotai/ModernBERT-large`, fine-tuned to distinguish between **human-written** and **AI-generated** code.

The model was trained on the `DroidCollection` dataset. It's designed as a **binary classifier** to address the core task of AI code detection.

A key feature of this model is its training objective, which combines standard **Cross-Entropy Loss** with a **Batch-Hard Triplet Loss**. This contrastive loss component encourages the model to learn more discriminative embeddings by pushing representations of human vs. machine code further apart in the vector space.
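
Concretely, batch-hard mining selects, for each anchor in a batch, the farthest example with the same label (hardest positive) and the closest example with a different label (hardest negative); the soft-margin variant then penalizes anchors whose hardest positive is not clearly closer than their hardest negative. The sketch below only illustrates the idea; training uses the `sentence-transformers` implementation shown in the Model Code section:

```python
import torch
import torch.nn.functional as F


def batch_hard_soft_margin_triplet_loss(embeddings, labels):
    """Toy illustration only; assumes each batch contains both classes."""
    # Pairwise Euclidean distances between all embeddings in the batch.
    dist = torch.cdist(embeddings, embeddings)

    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos_mask = same.clone().fill_diagonal_(False)  # same label, excluding the anchor itself
    neg_mask = ~same                               # different label

    # Hardest positive: the farthest same-label example for each anchor.
    hardest_pos = (dist * pos_mask).max(dim=1).values
    # Hardest negative: the closest different-label example for each anchor.
    hardest_neg = dist.masked_fill(~neg_mask, float("inf")).min(dim=1).values

    # Soft margin: log(1 + exp(d_pos - d_neg)) instead of max(d_pos - d_neg + margin, 0).
    return F.softplus(hardest_pos - hardest_neg).mean()


# Example: four embeddings, two per class.
emb = torch.randn(4, 128)
lab = torch.tensor([0, 0, 1, 1])
print(batch_hard_soft_margin_triplet_loss(emb, lab))
```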

***

## Model Details

* **Base Model:** `answerdotai/ModernBERT-large`
* **Loss Function:** `Total Loss = CrossEntropyLoss + 0.1 * TripletLoss`
* **Dataset:** Filtered training set of the [DroidCollection](https://huggingface.co/datasets/project-droid/DroidCollection).

#### Label Mapping

The model predicts one of two classes. The mapping from class ID to label is as follows:

```json
{
  "0": "HUMAN_GENERATED",
  "1": "MACHINE_GENERATED",
}
```
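
If the published weights follow the standard sequence-classification format implied by the `pipeline_tag`, inference might be as simple as the sketch below (the repository id is assumed from the model name; adjust it to the actual checkpoint path):

```python
from transformers import pipeline

# Repository id assumed from the model name; replace with the actual checkpoint location.
detector = pipeline("text-classification", model="project-droid/DroidDetect-Large")

print(detector("def add(a, b):\n    return a + b"))
# e.g. [{'label': 'HUMAN_GENERATED', 'score': ...}]
```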

## Model Code

The following code can be used for reproducibility:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from sentence_transformers import losses

TEXT_EMBEDDING_DIM = 1024  # hidden size of ModernBERT-large
NUM_CLASSES = 2            # HUMAN_GENERATED vs. MACHINE_GENERATED


class TLModel(nn.Module):
    def __init__(self, text_encoder, projection_dim=128, num_classes=NUM_CLASSES, class_weights=None):
        super().__init__()
        self.text_encoder = text_encoder
        self.num_classes = num_classes
        text_output_dim = TEXT_EMBEDDING_DIM
        self.additional_loss = losses.BatchHardSoftMarginTripletLoss(self.text_encoder)

        self.text_projection = nn.Linear(text_output_dim, projection_dim)
        self.classifier = nn.Linear(projection_dim, num_classes)
        self.class_weights = class_weights

    def forward(self, labels=None, input_ids=None, attention_mask=None):
        actual_labels = labels
        # Mean-pool the encoder's final hidden states into one embedding per input.
        sentence_embeddings = self.text_encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        sentence_embeddings = sentence_embeddings.mean(dim=1)
        # Project into the lower-dimensional space shared by the classifier and the triplet loss.
        projected_text = F.relu(self.text_projection(sentence_embeddings))
        logits = self.classifier(projected_text)
        loss = None
        cross_entropy_loss = None
        contrastive_loss = None

        if actual_labels is not None:
            # Weighted cross-entropy on the classification logits.
            loss_fct_ce = nn.CrossEntropyLoss(weight=self.class_weights.to(logits.device) if self.class_weights is not None else None)
            cross_entropy_loss = loss_fct_ce(logits.view(-1, self.num_classes), actual_labels.view(-1))
            # Batch-hard soft-margin triplet loss on the projected embeddings.
            contrastive_loss = self.additional_loss.batch_hard_triplet_loss(embeddings=projected_text, labels=actual_labels)
            lambda_contrast = 0.1
            loss = cross_entropy_loss + lambda_contrast * contrastive_loss


        output = {"logits": logits, "fused_embedding": projected_text}
        if loss is not None:
            output["loss"] = loss
        if cross_entropy_loss is not None:
             output["cross_entropy_loss"] = cross_entropy_loss
        if contrastive_loss is not None:
             output["contrastive_loss"] = contrastive_loss

        return output
```
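
For reference, a minimal usage sketch of the class above. It assumes the `answerdotai/ModernBERT-large` tokenizer and encoder from `transformers`, and that you load the fine-tuned `TLModel` weights yourself (the checkpoint filename in the comment is only an example):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")
encoder = AutoModel.from_pretrained("answerdotai/ModernBERT-large")

model = TLModel(text_encoder=encoder)
# Load the fine-tuned weights here, e.g. model.load_state_dict(torch.load("pytorch_model.bin")).
model.eval()

snippet = "for i in range(10):\n    print(i)"
inputs = tokenizer(snippet, return_tensors="pt", truncation=True)

with torch.no_grad():
    out = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"])

id2label = {0: "HUMAN_GENERATED", 1: "MACHINE_GENERATED"}
print(id2label[out["logits"].argmax(dim=-1).item()])
```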