Joel Niklaus

joelniklaus

https://niklaus.ai

AI & ML interests

Pretraining, Instruction Tuning, Domain Adaptation, Benchmarks, Legal Datasets

Recent Activity

new activity 9 days ago

joelniklaus/legal-english-roberta-base: Adding `safetensors` variant of this model

new activity 9 days ago

rcds/distilbert-SBD-de-judgements-laws: Adding `safetensors` variant of this model

new activity 9 days ago

joelniklaus/legal-spanish-roberta-large: Adding `safetensors` variant of this model

joelniklaus's collections (13)

SwiLTra-Bench

The code for creating the datasets is available at https://github.com/JoelNiklaus/SwissLegalTranslations.

joelniklaus/SwissLegalTranslations

Viewer • Updated Nov 25, 2024 • 673k • 22
joelniklaus/SwissLawTranslations

Viewer • Updated Mar 4 • 293k • 45
joelniklaus/SwissSupremeCourtPressReleaseTranslations

Viewer • Updated Mar 4 • 1.17k • 24
joelniklaus/SwissDecisionSummaryTranslations

Viewer • Updated Mar 4 • 75.5k • 59

SCALE Models

Scaling up the Complexity for Advanced Language Model Evaluation

joelniklaus/legal-swiss-roberta-large

Fill-Mask • Updated Aug 6, 2023 • 5 • 1
joelniklaus/legal-swiss-roberta-base

Fill-Mask • Updated Aug 6, 2023 • 20
joelniklaus/legal-swiss-longformer-base

Fill-Mask • 0.2B • Updated Aug 6, 2023 • 14 • 2
rcds/MiniLM-swiss_citation_extraction-de-fr-it

Token Classification • 0.1B • Updated Jun 16, 2023 • 4

MultiLegalPile Models

A 689GB Multilingual Legal Corpus

joelniklaus/legal-croatian-roberta-base

Fill-Mask • 0.1B • Updated Aug 6, 2023 • 22 • 2
joelniklaus/legal-romanian-roberta-base

Fill-Mask • Updated Feb 13, 2023 • 5 • 1
joelniklaus/legal-xlm-longformer-base

Fill-Mask • 0.2B • Updated Aug 6, 2023 • 710 • 4
joelniklaus/legal-english-roberta-large

Fill-Mask • 0.3B • Updated 9 days ago • 16 • 1

MultiLegalSBD Datasets

A Multilingual Legal Sentence Boundary Detection Dataset

rcds/MultiLegalSBD

Updated Nov 21, 2024 • 109 • 3

ClassActionPrediction Datasets

A Challenging Benchmark for Legal Judgment Prediction of Class Action Cases in the US

darrow-ai/USClassActionOutcomes_ExpertsAnnotations

Viewer • Updated Nov 6, 2022 • 200 • 75
darrow-ai/USClassActions

Viewer • Updated Jan 24, 2024 • 3k • 842 • 1

Anonymization

Automatic Anonymization of Swiss Federal Supreme Court Rulings

joelniklaus/legal-german-roberta-base

Fill-Mask • Updated Jan 7, 2023 • 15 • 1
joelniklaus/legal-french-roberta-base

Fill-Mask • Updated Jan 14, 2023 • 5
joelniklaus/legal-italian-roberta-base

Fill-Mask • Updated Jan 14, 2023 • 4 • 1

LegalLens Datasets

Datasets for the paper https://arxiv.org/abs/2402.04335

darrow-ai/LegalLensNLI

Viewer • Updated Jul 8, 2024 • 312 • 91 • 6
darrow-ai/LegalLensNER

Viewer • Updated Jul 8, 2024 • 1.33k • 43 • 4

LegalLMs

XLM-RoBERTa models with continued pretraining on the MultiLegalPile

joelniklaus/legal-xlm-roberta-large

Fill-Mask • 0.4B • Updated Aug 6, 2023 • 149 • 4
joelniklaus/legal-xlm-roberta-base

Fill-Mask • 0.2B • Updated Nov 18, 2024 • 22 • 3
joelniklaus/legal-xlm-longformer-base

Fill-Mask • 0.2B • Updated Aug 6, 2023 • 710 • 4
joelniklaus/legal-swiss-roberta-large

Fill-Mask • Updated Aug 6, 2023 • 5 • 1

SCALE Datasets

Scaling up the Complexity for Advanced Language Model Evaluation

rcds/swiss_legislation

Viewer • Updated Oct 10, 2024 • 35.7k • 42 • 6
rcds/swiss_rulings

Viewer • Updated Jul 20, 2023 • 637k • 79 • 1
rcds/swiss_citation_extraction

Viewer • Updated Aug 31, 2023 • 255k • 39
rcds/swiss_leading_decision_summarization

Viewer • Updated Jul 20, 2023 • 18.2k • 42 • 5

MultiLegalPile Datasets

A 689GB Multilingual Legal Corpus

joelniklaus/Multi_Legal_Pile

Updated Jan 12, 2024 • 1.06k • 59
joelniklaus/Multi_Legal_Pile_Commercial

Updated Oct 18, 2023 • 19 • 8

MultiLegalSBD Models

A Multilingual Legal Sentence Boundary Detection Dataset

rcds/distilbert-SBD-de-judgements-laws

Token Classification • 0.1B • Updated 9 days ago • 7
rcds/distilbert-SBD-en-judgements-laws

Token Classification • 0.1B • Updated Dec 12, 2024 • 24
rcds/distilbert-SBD-es-judgements-laws

Token Classification • Updated Oct 23, 2023 • 4
rcds/distilbert-SBD-it-judgements-laws

Token Classification • Updated Oct 23, 2023 • 4

Anonymity at Risk? Datasets

Assessing Re-Identification Capabilities of Large Language Models

rcds/swiss_rulings

Viewer • Updated Jul 20, 2023 • 637k • 79 • 1
rcds/wikipedia-persons-masked

Viewer • Updated Dec 14, 2022 • 68.7k • 26 • 3
rcds/wikipedia-for-mask-filling

Viewer • Updated Mar 8, 2023 • 828k • 234

Explainability Datasets

Datasets for the paper https://arxiv.org/abs/2402.17013

rcds/occlusion_swiss_judgment_prediction

Viewer • Updated Mar 28, 2023 • 56.8k • 124
rcds/lower_court_insertion_swiss_judgment_prediction

Viewer • Updated Mar 28, 2023 • 2.25k • 31
