Update README.md
README.md
CHANGED
@@ -58,7 +58,7 @@ print(outputs[0]["generated_text"][-1])
 
 ## Evaluation Results
 
-We evaluate
+We evaluate Nemotron-UltraLong-8B on a diverse set of benchmarks, including long-context tasks (e.g., RULER, LV-Eval, and InfiniteBench) and standard tasks (e.g., MMLU, MATH, GSM-8K, and HumanEval). UltraLong-8B achieves superior performance on ultra-long context tasks while maintaining competitive results on standard benchmarks.
 
 ### Needle in a Haystack
 