metadata

language:
  - en
license: apache-2.0
widget:
  - text: The nodes of a computer network may include [MASK].
library_name: transformers

NetBERT 📶

NetBERT is a BERT-base model further pre-trained on a large corpus of computer networking text (~23 GB).

Usage

You can use the raw model for masked language modeling (MLM), but it is mostly intended to be fine-tuned on a downstream task, especially one that uses the whole sentence to make decisions, such as text classification, extractive question answering, or semantic search.

You can use this model directly with a pipeline for masked language modeling:

from transformers import pipeline

unmasker = pipeline('fill-mask', model='antoinelouis/netbert')
unmasker("The nodes of a computer network may include [MASK].")
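The fill-mask pipeline returns a list of dictionaries, one per candidate token, with the keys `score`, `token`, `token_str`, and `sequence`. A minimal sketch of how that output could be condensed into readable (token, score) pairs follows; the helper names `summarize_predictions` and `top_fillers` and the `min_score` threshold are illustrative assumptions, not part of NetBERT or transformers.

```python
def summarize_predictions(predictions, min_score=0.0):
    """Return (token, score) pairs sorted by descending score.

    `predictions` is expected to look like the fill-mask pipeline output for
    an input with a single [MASK]: a list of dicts with 'token_str' and 'score'.
    """
    kept = [
        (p["token_str"].strip(), round(p["score"], 4))
        for p in predictions
        if p["score"] >= min_score  # hypothetical threshold, not a library option
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def top_fillers(text, top_k=5, model_name="antoinelouis/netbert"):
    """Hypothetical convenience wrapper; downloads the model on first use."""
    from transformers import pipeline  # imported lazily to keep the helper above dependency-free

    unmasker = pipeline("fill-mask", model=model_name)
    # top_k controls how many candidate tokens the pipeline returns.
    return summarize_predictions(unmasker(text, top_k=top_k))
```

Calling `top_fillers("The nodes of a computer network may include [MASK].")` would then return the highest-scoring candidates for the masked position. Inputs with several `[MASK]` tokens return a nested list instead and are not handled by this sketch.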

You can also use this model to extract the features of a given text:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('antoinelouis/netbert')
model = AutoModel.from_pretrained('antoinelouis/netbert')

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')  # return PyTorch tensors
output = model(**encoded_input)  # output.last_hidden_state holds one vector per token
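The model output contains one vector per token, so tasks such as semantic search usually need a further step that reduces these to a single vector per text. One common choice is mask-aware mean pooling, sketched below. This is an assumption for illustration rather than a method prescribed for NetBERT, and the helper names `mean_pool` and `embed` are hypothetical. Since the raw model was not trained to produce sentence embeddings, fine-tuning will likely be needed for good retrieval quality.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings per sequence, ignoring padding positions."""
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over the hidden dimension.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Clamp the token count to avoid division by zero for an all-padding row.
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts


def embed(texts, model_name="antoinelouis/netbert"):
    """Hypothetical helper: encode a list of strings into one vector per string."""
    from transformers import AutoModel, AutoTokenizer  # downloads weights on first use

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout for deterministic features
    encoded = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for feature extraction
        output = model(**encoded)
    return mean_pool(output.last_hidden_state, encoded["attention_mask"])
```

The resulting vectors can be compared with `torch.nn.functional.cosine_similarity` to rank texts by similarity.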

Documentation

Detailed documentation on the pre-trained model, its implementation, and the data can be found on GitHub.

Citation

For attribution in academic contexts, please cite this work as:

@mastersthesis{louis2020netbert,
    title={NetBERT: A Pre-trained Language Representation Model for Computer Networking},
    author={Louis, Antoine},
    year={2020},
    school={University of Liege}
}