tomaarsen
/

span-marker-xlm-roberta-large-conll03-doc-context

@@ -8,11 +8,49 @@ tags:
 - ner
 - named-entity-recognition
 pipeline_tag: token-classification
 ---
 # SpanMarker for Named Entity Recognition
 This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition. In particular, this SpanMarker model uses [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) as the underlying encoder. See [train.py](train.py) for the training script.
 ## Usage
@@ -28,7 +66,7 @@ You can then run inference with this model like so:
 from span_marker import SpanMarkerModel
 # Download from the 🤗 Hub
-model = SpanMarkerModel.from_pretrained("span_marker_model_name")
 # Run inference
 entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
 ```

 - ner
 - named-entity-recognition
 pipeline_tag: token-classification
+widget:
+  - text: >-
+      Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic
+      to Paris.
+    example_title: Amelia Earhart
+model-index:
+  - name: >-
+      SpanMarker w. xlm-roberta-large on CoNLL03 with document-level context by Tom Aarsen
+    results:
+      - task:
+          type: token-classification
+          name: Named Entity Recognition
+        dataset:
+          type: conll2003
+          name: CoNLL03 w. document context
+          split: test
+          revision: 01ad4ad271976c5258b9ed9b910469a806ff3288
+        metrics:
+          - type: f1
+            value: 0.9442
+            name: F1
+          - type: precision
+            value: 0.9411
+            name: Precision
+          - type: recall
+            value: 0.9473
+            name: Recall
+datasets:
+  - conll2003
+  - tomaarsen/conll2003
+language:
+  - en
+metrics:
+  - f1
+  - recall
+  - precision
 ---
 # SpanMarker for Named Entity Recognition
 This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition. In particular, this SpanMarker model uses [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) as the underlying encoder. See [train.py](train.py) for the training script.
+Note that this model was trained with document-level context, so it performs best when the input includes enough surrounding context. For that reason, it is recommended to call `model.predict` with a 🤗 Dataset that has `tokens`, `document_id` and `sentence_id` columns.
+See the [documentation](https://tomaarsen.github.io/SpanMarkerNER/api/span_marker.modeling.html#span_marker.modeling.SpanMarkerModel.predict) of the `model.predict` method for more information.
 ## Usage
 from span_marker import SpanMarkerModel
 # Download from the 🤗 Hub
+model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-xlm-roberta-large-conll03-doc-context")
 # Run inference
 entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
 ```
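
The document-level input recommended above can be sketched as follows. This is a minimal illustration, not the author's own preprocessing: the helper `build_document_columns` is hypothetical, the second sentence is invented example text, and the use of zero-based integer ids with `sentence_id` restarting in each document is an assumption. Check the `model.predict` documentation for the exact requirements.

```python
from typing import Dict, List


def build_document_columns(documents: List[List[List[str]]]) -> Dict[str, list]:
    """Flatten tokenized documents into `tokens`, `document_id` and `sentence_id` columns.

    `documents` is a list of documents, each a list of sentences,
    each sentence a list of word-level tokens.
    """
    columns: Dict[str, list] = {"tokens": [], "document_id": [], "sentence_id": []}
    for document_id, sentences in enumerate(documents):
        # Assumption: sentence ids restart at 0 for every document.
        for sentence_id, tokens in enumerate(sentences):
            columns["tokens"].append(tokens)
            columns["document_id"].append(document_id)
            columns["sentence_id"].append(sentence_id)
    return columns


# One document with two sentences; the second sentence is made up for illustration.
documents = [
    [
        ["Amelia", "Earhart", "flew", "her", "single", "engine", "Lockheed", "Vega", "5B",
         "across", "the", "Atlantic", "to", "Paris", "."],
        ["She", "was", "a", "pioneering", "pilot", "."],
    ],
]
columns = build_document_columns(documents)

# Assumed follow-up, requiring the `datasets` and `span_marker` packages:
# from datasets import Dataset
# from span_marker import SpanMarkerModel
# model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-xlm-roberta-large-conll03-doc-context")
# entities = model.predict(Dataset.from_dict(columns))
```

Keeping the column construction separate from the model call makes it easy to inspect the ids before running inference.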