Fix inconsistencies in README
README.md
CHANGED
@@ -9,7 +9,7 @@ base_model:
 
 # **RocqStar Ranker Embedder**
 
-A self‑attentive embedding model for premise / proof selection in Rocq
+A self‑attentive embedding model for premise / proof selection in Rocq ITP. RocqStar is fine‑tuned from CodeBERT so that distances in the embedding space correlate with proof similarity rather than with surface‑level statement overlap. It replaces BM25/Jaccard similarity in retrieval‑augmented generation (RAG) pipelines such as **CoqPilot**, leading to higher proof success rates on the IMM‑300 benchmark.
 
 ## Model Details
 
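The added description contrasts surface‑level statement overlap with proof similarity in embedding space. Below is a minimal sketch of the two ranking signals; the `embed` step is only a placeholder for the RocqStar encoder, and none of the names here come from the model card itself:

```python
import numpy as np

def jaccard(stmt_a: str, stmt_b: str) -> float:
    """Surface-level overlap: fraction of shared tokens between two statements."""
    ta, tb = set(stmt_a.split()), set(stmt_b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Distance signal used by an embedding-based ranker such as RocqStar."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# A Jaccard/BM25 ranker scores candidate premises by how many tokens their
# statements share with the current goal; an embedding ranker scores them by
# cosine similarity of encoder outputs, so theorems with different wording
# but similar proofs can still rank highly.
```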
@@ -19,7 +19,7 @@ RocqStar is a 125 M‑parameter Transformer encoder (768‑dim hidden size) wi
 
 * **Model type:** Transformer encoder with self‑attentive pooling
 * **Language(s):** Rocq / Coq (Gallina) syntax (tokens)
-* **License
+* **License:** Apache‑2.0
 * **Fine‑tuned from:** `microsoft/codebert-base`
 
 ### Model Sources
@@ -27,12 +27,6 @@ RocqStar is a 125 M‑parameter Transformer encoder (768‑dim hidden size) wi
 * **Repository:** [https://github.com/JetBrains-Research/big-rocq](https://github.com/JetBrains-Research/big-rocq)
 * **Paper:** (coming soon)
 
-## Bias, Risks, and Limitations
-
-* **Domain limitation:** Trained only on Rocq projects; performance degrades on other languages or heavily tactic‑macro‑based proofs.
-* **Data leakage:** May memorise proofs present in training data and surface them verbatim.
-* **Proof similarity assumption:** While improved over BM25, cosine distance may still fail on very long or highly creative proofs.
-
 ## How to Get Started
 
 Go to [https://github.com/JetBrains-Research/big-rocq](https://github.com/JetBrains-Research/big-rocq). In the `ranker-server` subdirectory you can find the server that you need to run in order to use the model from the CoqPilot plugin; it also serves as a reference example of how to run inference with the model.
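For a rough picture of inference before opening `ranker-server`, here is a hedged sketch that loads a checkpoint with `transformers` and ranks candidate statements by cosine similarity. The checkpoint path and the mean pooling are illustrative assumptions; the actual model uses a self‑attentive pooling head, and the reference serving code lives in the repository above.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative path: point this at the downloaded RocqStar checkpoint.
CHECKPOINT = "path/to/rocqstar-ranker-embedder"

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModel.from_pretrained(CHECKPOINT).eval()

@torch.no_grad()
def embed(texts):
    """Encode Rocq statements/proofs into unit-norm vectors.

    Mean pooling over token states is used here as a stand-in for the
    model's self-attentive pooling head."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state              # (B, T, 768)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (B, 768)
    return torch.nn.functional.normalize(pooled, dim=-1)

goal = "forall n : nat, n + 0 = n"
candidates = ["forall n : nat, 0 + n = n", "forall l : list nat, rev (rev l) = l"]
scores = embed([goal]) @ embed(candidates).T               # cosine similarities
print(scores)
```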
@@ -46,8 +40,8 @@ Go to [https://github.com/JetBrains-Research/big-rocq](https://github.com/JetBra
 
 ### Training Procedure
 
-* **Objective:** InfoNCE
-* **Batch size:**
+* **Objective:** InfoNCE
+* **Batch size:** 32
 * **Optimizer / LR:** AdamW, lr = 4e‑6, linear warm‑up 10 %, 22k steps
 * **Hardware:** 1× NVIDIA H100 GPU, 160 GB RAM, 14 h wall‑clock
 
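For reference, the InfoNCE objective listed above can be written as an in‑batch contrastive loss. The temperature value and the way positive pairs are formed are assumptions for illustration, not the exact training configuration:

```python
import torch
import torch.nn.functional as F

def info_nce(anchors: torch.Tensor, positives: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """In-batch InfoNCE: row i of `positives` is the positive for anchor i;
    all other rows in the batch act as negatives."""
    a = F.normalize(anchors, dim=-1)      # (B, D)
    p = F.normalize(positives, dim=-1)    # (B, D)
    logits = a @ p.T / temperature        # (B, B) pairwise cosine similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

# With the reported batch size of 32, each anchor sees 31 in-batch negatives.
loss = info_nce(torch.randn(32, 768), torch.randn(32, 768))
```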
@@ -71,4 +65,4 @@ Go to [https://github.com/JetBrains-Research/big-rocq](https://github.com/JetBra
 
 #### Summary
 
-RocqStar delivers consistent gains, up to 28% relative improvement over
+RocqStar delivers consistent gains, up to 28% relative improvement over Jaccard-index based retrieval, especially for medium‑length theorems where proof similarity diverges most from statement similarity.