DaniilOr committed · Commit 5f55da8 · verified · 1 Parent(s): a363982

Update README.md

Files changed (1): README.md (+85 −4)

---
license: apache-2.0
datasets:
- project-droid/DroidCollection
base_model:
- answerdotai/ModernBERT-large
pipeline_tag: text-classification
---
 
# DroidDetect-Large

This is a text classification model based on `answerdotai/ModernBERT-large`, fine-tuned to distinguish between **human-written**, **AI-generated**, **AI-refined**, and **adversarially AI-generated** code.

The model was trained on the `DroidCollection` dataset and is designed as a **4-class classifier** for the core task of AI code detection.

A key feature of this model is its training objective, which combines standard **Cross-Entropy Loss** with a **Batch-Hard Triplet Loss**. This contrastive component encourages the model to learn more discriminative embeddings by pushing representations of human-written and machine-generated code further apart in the embedding space.
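
For intuition, the sketch below re-implements the batch-hard, soft-margin triplet loss idea on its own (an illustrative re-implementation, not the `sentence_transformers` code used in the training script further down):

```python
import torch
import torch.nn.functional as F


def batch_hard_soft_margin_triplet(embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Illustrative batch-hard triplet loss with a soft margin."""
    # Pairwise Euclidean distances between every pair of embeddings in the batch.
    dist = torch.cdist(embeddings, embeddings, p=2)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    # Hardest positive: the farthest embedding that shares the anchor's label.
    hardest_pos = dist.masked_fill(~same | self_mask, float("-inf")).amax(dim=1)
    # Hardest negative: the closest embedding with a different label.
    hardest_neg = dist.masked_fill(same, float("inf")).amin(dim=1)
    # Soft margin: log(1 + exp(d_pos - d_neg)), i.e. softplus of the gap.
    return F.softplus(hardest_pos - hardest_neg).mean()
```

Driving `hardest_pos - hardest_neg` down pulls same-class embeddings together and pushes different-class embeddings apart, which is the separation effect described above.
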
***

## Model Details

* **Base Model:** `answerdotai/ModernBERT-large`
* **Loss Function:** `Total Loss = CrossEntropyLoss + 0.1 * TripletLoss`
* **Dataset:** Filtered training set of the [DroidCollection](https://huggingface.co/datasets/project-droid/DroidCollection).

#### Label Mapping

The model predicts one of 4 classes. The mapping from ID to label is as follows:

```json
{
  "0": "HUMAN_GENERATED",
  "1": "MACHINE_GENERATED",
  "2": "MACHINE_REFINED",
  "3": "MACHINE_GENERATED_ADVERSARIAL"
}
```
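
For reference, a minimal inference sketch is shown below. This is hypothetical usage: it assumes the fine-tuned weights are loaded into the `TLModel` class defined in the next section.

```python
import torch
from transformers import AutoModel, AutoTokenizer

ID2LABEL = {
    0: "HUMAN_GENERATED",
    1: "MACHINE_GENERATED",
    2: "MACHINE_REFINED",
    3: "MACHINE_GENERATED_ADVERSARIAL",
}

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")
encoder = AutoModel.from_pretrained("answerdotai/ModernBERT-large")
model = TLModel(text_encoder=encoder)  # load the fine-tuned checkpoint weights here
model.eval()

snippet = "def add(a, b):\n    return a + b"
batch = tokenizer(snippet, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])["logits"]
print(ID2LABEL[int(logits.argmax(dim=-1))])
```
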
## Model Code

The following code can be used for reproducibility:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from sentence_transformers import losses

NUM_CLASSES = 4
# Hidden size of the text encoder; ModernBERT-large produces 1024-dim states.
TEXT_EMBEDDING_DIM = 1024


class TLModel(nn.Module):
    def __init__(self, text_encoder, projection_dim=128, num_classes=NUM_CLASSES, class_weights=None):
        super().__init__()
        self.text_encoder = text_encoder
        self.num_classes = num_classes
        text_output_dim = TEXT_EMBEDDING_DIM
        # Only the batch_hard_triplet_loss method of this helper is used, so the
        # raw Hugging Face encoder is passed in place of a SentenceTransformer.
        self.additional_loss = losses.BatchHardSoftMarginTripletLoss(self.text_encoder)

        self.text_projection = nn.Linear(text_output_dim, projection_dim)
        self.classifier = nn.Linear(projection_dim, num_classes)
        self.class_weights = class_weights

    def forward(self, labels=None, input_ids=None, attention_mask=None):
        actual_labels = labels
        # Mean-pool the encoder's token states into a single sentence embedding.
        sentence_embeddings = self.text_encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        sentence_embeddings = sentence_embeddings.mean(dim=1)
        projected_text = F.relu(self.text_projection(sentence_embeddings))
        logits = self.classifier(projected_text)
        loss = None
        cross_entropy_loss = None
        contrastive_loss = None

        if actual_labels is not None:
            # Standard (optionally class-weighted) cross-entropy over the 4 classes.
            loss_fct_ce = nn.CrossEntropyLoss(weight=self.class_weights.to(logits.device) if self.class_weights is not None else None)
            cross_entropy_loss = loss_fct_ce(logits.view(-1, self.num_classes), actual_labels.view(-1))
            # Batch-hard soft-margin triplet loss on the projected embeddings.
            contrastive_loss = self.additional_loss.batch_hard_triplet_loss(embeddings=projected_text, labels=actual_labels)
            lambda_contrast = 0.1
            loss = cross_entropy_loss + lambda_contrast * contrastive_loss

        output = {"logits": logits, "fused_embedding": projected_text}
        if loss is not None:
            output["loss"] = loss
        if cross_entropy_loss is not None:
            output["cross_entropy_loss"] = cross_entropy_loss
        if contrastive_loss is not None:
            output["contrastive_loss"] = contrastive_loss

        return output
```
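
Because `forward` returns a dictionary containing a `loss` key whenever `labels` are provided, the model should plug directly into `transformers.Trainer` without a custom `compute_loss`.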