arxiv:1708.06025

Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks

Published on Aug 20, 2017

Abstract

Evaluation of various word embedding models on Portuguese text shows that task-specific evaluations are more reliable than word analogies.

AI-generated summary

Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems. In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants. We trained 31 word embedding models using FastText, GloVe, Wang2Vec and Word2Vec. We evaluated them intrinsically on syntactic and semantic analogies and extrinsically on POS tagging and sentence semantic similarity tasks. The obtained results suggest that word analogies are not appropriate for word embedding evaluation; task-specific evaluations appear to be a better option.
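
For readers unfamiliar with the intrinsic evaluation mentioned above, the following is a minimal sketch of how a word analogy question ("a is to b as c is to ?") is commonly scored with vector offsets and cosine similarity. The abstract does not state which scoring method the authors used, so the offset method here is an assumption; the toy vectors and the function names (`cosine`, `solve_analogy`, `analogy_accuracy`) are invented for illustration and do not come from the paper.

```python
import math

# Toy 2-dimensional vectors, invented for illustration only (not from the paper).
toy_vectors = {
    "homem": [1.0, 0.0],
    "mulher": [1.0, 1.0],
    "rei": [3.0, 0.0],
    "rainha": [3.0, 1.0],
    "menino": [0.5, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, vectors):
    """Return the word closest to (b - a + c), excluding the three query words."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def analogy_accuracy(questions, vectors):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(a, b, c, vectors) == expected
                  for a, b, c, expected in questions)
    return correct / len(questions)


# One hypothetical question: homem is to mulher as rei is to rainha.
questions = [("homem", "mulher", "rei", "rainha")]
print(analogy_accuracy(questions, toy_vectors))
```

A score like this depends only on the geometry of the vectors, which is one way to see why it might diverge from performance on downstream tasks such as POS tagging.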
