Papers
arxiv:1906.01542

Natural Vocabulary Emerges from Free-Form Annotations

Published on Jun 4, 2019
Authors:
,
,

Abstract

An approach for automatically enriching and structuring free-form text annotations from untrained annotators to discover a natural vocabulary of object classes.

AI-generated summary

We propose an approach for annotating object classes using free-form text written by undirected and untrained annotators. Free-form labeling is natural for annotators, they intuitively provide very specific and exhaustive labels, and no training stage is necessary. We first collect 729 labels on 15k images using 124 different annotators. Then we automatically enrich the structure of these free-form annotations by discovering a natural vocabulary of 4020 classes within them. This vocabulary represents the natural distribution of objects well and is learned directly from data, instead of being an educated guess done before collecting any labels. Hence, the natural vocabulary emerges from a large mass of free-form annotations. To do so, we (i) map the raw input strings to entities in an ontology of physical objects (which gives them an unambiguous meaning); and (ii) leverage inter-annotator co-occurrences, as well as biases and knowledge specific to individual annotators. Finally, we also automatically extract natural vocabularies of reduced size that have high object coverage while remaining specific. These reduced vocabularies represent the natural distribution of objects much better than commonly used predefined vocabularies. Moreover, they feature more uniform sample distribution over classes.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/1906.01542 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/1906.01542 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.