Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:1810.04805

CLEAR: Character Unlearning in Textual and Visual Modalities

Paper • 2410.18057 • Published 25 days ago • 198
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Paper • 2410.23090 • Published 18 days ago • 53
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Paper • 2410.23743 • Published 17 days ago • 58
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Paper • 2411.02355 • Published 13 days ago • 44

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 44
Playing Atari with Deep Reinforcement Learning

Paper • 1312.5602 • Published Dec 19, 2013
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11

This is information about BERT

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
jjzha/skillspan

Viewer • Updated Sep 7, 2023 • 11.5k • 302

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 44
LLaMA: Open and Efficient Foundation Language Models

Paper • 2302.13971 • Published Feb 27, 2023 • 13
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30 • 16
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Paper • 2407.21770 • Published Jul 31 • 22

Must-read Papers

Some of the most important and insightful (in my opinion) AI papers of today, with a focus on NLP and LLMs.

ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 14
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 44
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 104

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 44
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 14
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11

Fundamentals NLP

Distributed Representations of Sentences and Documents

Paper • 1405.4053 • Published May 16, 2014
Sequence to Sequence Learning with Neural Networks

Paper • 1409.3215 • Published Sep 10, 2014 • 3
PaLM: Scaling Language Modeling with Pathways

Paper • 2204.02311 • Published Apr 5, 2022 • 2
Recent Trends in Deep Learning Based Natural Language Processing

Paper • 1708.02709 • Published Aug 9, 2017

Must Reads On Language Model

Dive into the world of generative AI with some prominent papers of Language Model, unlocking the secrets of natural language processing.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 7
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11
OPT: Open Pre-trained Transformer Language Models

Paper • 2205.01068 • Published May 2, 2022 • 2

Research Papers

Research papers related to NLP.

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 44
Self-Attention with Relative Position Representations

Paper • 1803.02155 • Published Mar 6, 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

Paper • 2401.12954 • Published Jan 23 • 29

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs