metadata
license: apache-2.0
language:
  - en
base_model:
  - nasa-impact/nasa-smd-ibm-v0.1
pipeline_tag: text-classification
tags:
  - unified
  - astronomy
  - thesaurus
  - uat
widget:
  - text: >-
      Solar Observations by Angelo Secchi. I. Digitization of Original Documents
      and Analysis of Group Numbers over the Period of 1853-1878. Angelo Secchi,
      an Italian Jesuit and prominent scientist of the 19th century, and one of
      the founders of modern astrophysics, observed the Sun regularly at the
      Collegio Romano in Rome, Italy, for more than 25 yr. Results from his
      observations are reported in articles published in the scientific journals
      of the time, as well as in drawings and personal notebooks that are stored
      in the historical archive of the Istituto Nazionale di Astrofisica
      Osservatorio Astronomico di Roma. The latter material, which reports solar
      observations performed from 1853-1878, includes original documents from
      Secchi and from a few of his close collaborators. The above unique
      material has recently been digitized for preservation purposes and for
      allowing the scientific exploitation of data not easily accessible so far.
      A total of more than 5400 digital images have been produced. Here we
      present the archival material and the new digital data derived from it. We
      also present results obtained from our primary analysis of the new digital
      data. In particular, we produced new measurements of the group number from
      1853-1878, which will be available for future recalibration of the group
      number series.
    example_title: Solar Observations by Angelo Secchi
  - text: >-
      Disk-resolved Photometric Properties of Pluto and the Coloring Materials
      across its Surface. A multiwavelength regionally dependent photometric
      analysis of Pluto's anti-Charon-facing hemisphere using images collected
      by New Horizons' Multispectral Visible Imaging Camera (MVIC) reveals large
      variations in the absolute value and spectral slope of the
      single-scattering albedo. Four regions of interest are analyzed: the dark
      equatorial belt, Pluto's north pole, nitrogen-rich regions, and the
      mid-latitude terrains. Regions dominated by volatile ices such as Lowell
      Regio and Sputnik Planitia present single-scattering albedos of ∼0.98 at
      492 nm, almost neutral across MVIC's visible wavelength range (400-910
      nm), indicating limited contributions from tholin materials. Pluto's dark
      equatorial regions, informally named Cthulhu and Krun Maculae, have
      single-scattering albedos of ∼0.16 at 492 nm and are the reddest regions.
      Applying the Hapke radiative transfer model to combined MVIC and Linear
      Etalon Imaging Spectral Array (LEISA) spectra (400-2500 nm) of Cthulhu
      Macula and Lowell Regio successfully reproduces the spectral properties of
      these two regions of dramatically disparate coloration, composition, and
      morphology. Since this model uses only a single coloring agent, very
      similar to the Titan-like tholin of Khare et al., to account for all of
      Pluto's colors, this result supports the Grundy et al. conclusion that
      Pluto's coloration is the result of photochemical products mostly produced
      in the atmosphere. Although cosmic rays and extreme ultraviolet photons
      reach Pluto's surface where they can drive chemical processing,
      observations of diverse surface colors do not require different chemical
      products produced in different environments. We report a correction
      scaling factor in the LEISA radiometric calibration of 0.74 ± 0.05.
    example_title: Photometric Properties of Pluto

KAILAS

KAILAS (also known as Keyword Labeler At SciX, Indus-UAT-Labeler, or nasa-smd-ibm-v0.1_UAT_Labeler) is a RoBERTa-based, encoder-only transformer model, domain-adapted for NASA Science Mission Directorate (SMD) applications. It is fine-tuned on scientific journals and articles relevant to NASA SMD, with the aim of improving natural language technologies such as information retrieval and intelligent search.
This specific fork was fine-tuned on proprietary data from the SciX Digital Library (https://scixplorer.org/, formerly NASA-ADS) to label text with Unified Astronomy Thesaurus (UAT) concepts (https://astrothesaurus.org/).
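
As a rough illustration of how a text-classification model like this one might be queried, the sketch below uses the Hugging Face `transformers` pipeline. Several things here are assumptions rather than documented behavior: `MODEL_ID` is a placeholder that must be replaced with the actual repository ID, the model is assumed to be multi-label (hence the sigmoid and per-label threshold), and the 0.5 threshold and the helper names `select_labels` and `label_text` are purely illustrative.

```python
from typing import Dict, List

# Placeholder, not the real repository ID: replace before use.
MODEL_ID = "your-namespace/KAILAS"


def select_labels(scores: List[Dict], threshold: float = 0.5) -> List[str]:
    """Return labels whose score meets the threshold, highest score first."""
    kept = [s for s in scores if s["score"] >= threshold]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [s["label"] for s in kept]


def label_text(text: str, threshold: float = 0.5) -> List[str]:
    """Score one title+abstract string and keep the labels above the threshold.

    Assumes a multi-label head, so each label is scored independently
    with a sigmoid instead of a softmax over all labels.
    """
    # Imported lazily so select_labels can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    # Passing a list returns one result per input; top_k=None asks for all labels.
    per_input = classifier(
        [text],
        top_k=None,
        function_to_apply="sigmoid",
        truncation=True,
    )
    return select_labels(per_input[0], threshold)
```

Calling `label_text(title + ". " + abstract)` on input shaped like the widget examples in the metadata would then return a list of label strings. In practice the threshold would likely need tuning against held-out data.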

Model Details

  • Base Model: nasa-impact/nasa-smd-ibm-v0.1 (RoBERTa architecture)
  • Tokenizer: Custom
  • Parameters: 125M

Training Data

  • Titles, abstracts, body text, and acknowledgments from 18K recent, high-quality astronomy papers
  • Approximately 217M tokens

Contact

KAILAS is maintained by Dr. Felix Grezes and Dr. Jennifer Lynn Bartlett.