tomaarsen HF Staff
Add new SentenceTransformer model
6d5b5b1 verified
language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:99231
  - loss:CachedMultipleNegativesRankingLoss
base_model: microsoft/mpnet-base
widget:
  - source_sentence: who led the army that defeated the aztecs
    sentences:
      - >-
        Spanish conquest of the Aztec Empire The Spanish conquest of the Aztec
        Empire, or the Spanish-Aztec War (1519-21)[3] was one of the most
        significant and complex events in world history. There are multiple
        sixteenth-century narratives of the events by Spanish conquerors, their
        indigenous allies, and the defeated Aztecs. It was not solely a contest
        between a small contingent of Spaniards defeating the Aztec Empire, but
        rather the creation of a coalition of Spanish invaders with tributaries
        to the Aztecs, and most especially the Aztecs' indigenous enemies and
        rivals. They combined forces to defeat the Mexica of Tenochtitlan over a
        two-year period. For the Spanish, the expedition to Mexico was part of a
        project of Spanish colonization of the New World after twenty-five years
        of permanent Spanish settlement and further exploration in the
        Caribbean. The Spanish made landfall in Mexico in 1517. A Spanish
        settler in Cuba, Hernán Cortés, led an expedition (entrada) to Mexico,
        landing in February 1519, following an earlier expedition led by Juan de
        Grijalva to Yucatán in 1517. Two years later Cortés and his retinue set
        sail, thus beginning the expedition of exploration and conquest.[4] The
        Spanish campaign against the Aztec Empire had its final victory on
        August 13, 1521, when a coalition army of Spanish forces and native
        Tlaxcalan warriors led by Cortés and Xicotencatl the Younger captured
        the emperor Cuauhtemoc and Tenochtitlan, the capital of the Aztec
        Empire. The fall of Tenochtitlan marks the beginning of Spanish rule in
        central Mexico, and they established their capital of Mexico City on the
        ruins of Tenochtitlan.
      - >-
        The Girl with All the Gifts Justineau awakens in the Rosalind Franklin.
        Melanie leads her to a group of intelligent hungries, to whom Justineau,
        wearing an environmental protection suit, starts teaching the alphabet.
      - >-
        Wendy Makkena In 1992 she had a supporting role in the movie Sister Act
        as the shy but talented singing nun Sister Mary Robert, a role she
        reprised in Sister Act 2: Back in the Habit the following year. She
        appeared in various other television roles until 1997, when she starred
        in Air Bud, followed by the independent film Finding North. She
        continued appearing on television shows such as The Job, Oliver Beene,
        and Listen Up![citation needed]
  - source_sentence: who went to the most nba finals in a row
    sentences:
      - >-
        List of NBA franchise post-season streaks The San Antonio Spurs hold the
        longest active consecutive playoff appearances with 21 appearances,
        starting in the 1998 NBA Playoffs (also the longest active playoff
        streak in any major North American sports league as of 2017). The Spurs
        have won five NBA championships during the streak. The Philadelphia
        76ers (formerly known as Syracuse Nationals) hold the all-time record
        for consecutive playoff appearances with 22 straight appearances between
        1950 and 1971. The 76ers won two NBA championships during their streak.
        The Boston Celtics hold the longest consecutive NBA Finals appearance
        streak with ten appearances between 1957 and 1966. During the streak,
        the Celtics won eight consecutive NBA championships—also an NBA
        record.
      - >-
        Dear Dumb Diary Dear Dumb Diary is a series of children's novels by Jim
        Benton. Each book is written in the first person view of a middle school
        girl named Jamie Kelly. The series is published by Scholastic in English
        and Random House in Korean. Film rights to the series have been optioned
        by the Gotham Group.[2]
      - >-
        Voting rights in the United States Eligibility to vote in the United
        States is established both through the federal constitution and by state
        law. Several constitutional amendments (the 15th, 19th, and 26th
        specifically) require that voting rights cannot be abridged on account
        of race, color, previous condition of servitude, sex, or age for those
        above 18; the constitution as originally written did not establish any
        such rights during 1787–1870. In the absence of a specific federal law
        or constitutional provision, each state is given considerable discretion
        to establish qualifications for suffrage and candidacy within its own
        respective jurisdiction; in addition, states and lower level
        jurisdictions establish election systems, such as at-large or single
        member district elections for county councils or school boards.
  - source_sentence: who did the vocals on mcdonald's jingle i'm loving it
    sentences:
      - >-
        I'm Lovin' It (song) "I'm Lovin' It" is a song recorded by American
        singer-songwriter Justin Timberlake. It was written by Pusha T and
        produced by The Neptunes.
      - >-
        Vallabhbhai Patel As the first Home Minister and Deputy Prime Minister
        of India, Patel organised relief efforts for refugees fleeing from
        Punjab and Delhi and worked to restore peace across the nation. He led
        the task of forging a united India, successfully integrating into the
        newly independent nation those British colonial provinces that had been
        "allocated" to India. Besides those provinces that had been under direct
        British rule, approximately 565 self-governing princely states had been
        released from British suzerainty by the Indian Independence Act of 1947.
        Employing frank diplomacy with the expressed option to deploy military
        force, Patel persuaded almost every princely state to accede to India.
        His commitment to national integration in the newly independent country
        was total and uncompromising, earning him the sobriquet "Iron Man of
        India".[3] He is also affectionately remembered as the "Patron saint of
        India's civil servants" for having established the modern all-India
        services system. He is also called the Unifier of India.[4]
      - >-
        National debt of the United States As of July 31, 2018, debt held by the
        public was $15.6 trillion and intragovernmental holdings were $5.7
        trillion, for a total or "National Debt" of $21.3 trillion.[5] Debt held
        by the public was approximately 77% of GDP in 2017, ranked 43rd highest
        out of 207 countries.[6] The Congressional Budget Office forecast in
        April 2018 that the ratio will rise to nearly 100% by 2028, perhaps
        higher if current policies are extended beyond their scheduled
        expiration date.[7] As of December 2017, $6.3 trillion or approximately
        45% of the debt held by the public was owned by foreign investors, the
        largest being China (about $1.18 trillion) then Japan (about $1.06
        trillion).[8]
  - source_sentence: who is the actress of harley quinn in suicide squad
    sentences:
      - >-
        Tariffs in United States history Tariffs were the main source of revenue
        for the federal government from 1789 to 1914. During this period, there
        was vigorous debate between the various political parties over the
        setting of tariff rates. In general Democrats favored a tariff that
        would pay the cost of government, but no higher. Whigs and Republicans
        favored higher tariffs to protect and encourage American industry and
        industrial workers. Since the early 20th century, however, U.S. tariffs
        have been very low and have been much less a matter of partisan debate.
      - >-
        The Rolling Stones The Rolling Stones are an English rock band formed in
        London, England in 1962. The first stable line-up consisted of Brian
        Jones (guitar, harmonica), Mick Jagger (lead vocals), Keith Richards
        (guitar, backing vocals), Bill Wyman (bass), Charlie Watts (drums), and
        Ian Stewart (piano). Stewart was removed from the official line-up in
        1963 but continued as a touring member until his death in 1985. Jones
        left the band less than a month prior to his death in 1969, having
        already been replaced by Mick Taylor, who remained until 1974. After
        Taylor left the band, Ronnie Wood took his place in 1975 and has been on
        guitar in tandem with Richards ever since. Following Wyman's departure
        in 1993, Darryl Jones joined as their touring bassist. Touring
        keyboardists for the band have been Nicky Hopkins (1967–1982), Ian
        McLagan (1978–1981), Billy Preston (through the mid-1970s) and Chuck
        Leavell (1982–present). The band was first led by Brian Jones, but after
        developing into the band's songwriters, Jagger and Richards assumed
        leadership while Jones dealt with legal and personal troubles.
      - >-
        Margot Robbie After moving to the United States, Robbie starred in the
        short-lived ABC drama series Pan Am (2011–2012). In 2013, she made her
        big screen debut in Richard Curtis's romantic comedy-drama film About
        Time and co-starred in Martin Scorsese's biographical black comedy The
        Wolf of Wall Street. In 2015, Robbie co-starred in the romantic
        comedy-drama film Focus, appeared in the romantic World War II drama
        film Suite Française and starred in the science fiction film Z for
        Zachariah. That same year, she played herself in The Big Short. In 2016,
        she portrayed Jane Porter in the action-adventure film The Legend of
        Tarzan and Harley Quinn in the superhero film Suicide Squad. She
        appeared on Time magazine's "The Most Influential People of 2017"
        list.[4]
  - source_sentence: what is meaning of am and pm in time
    sentences:
      - >-
        America's Got Talent America's Got Talent (often abbreviated as AGT) is
        a televised American talent show competition, broadcast on the NBC
        television network. It is part of the global Got Talent franchise
        created by Simon Cowell, and is produced by Fremantle North America and
        SYCOtv, with distribution done by Fremantle. Since its premiere in June
        2006, each season is run during the network's summer schedule, with the
        show having featured various hosts - it is currently hosted by Tyra
        Banks, since 2017.[2] It is the first global edition of the franchise,
        after plans for a British edition in 2005 were suspended, following a
        dispute between Paul O'Grady, the planned host, and the British
        broadcaster ITV; production of this edition later resumed in 2007.[3]
      - >-
        Times Square Times Square is a major commercial intersection, tourist
        destination, entertainment center and neighborhood in the Midtown
        Manhattan section of New York City at the junction of Broadway and
        Seventh Avenue. It stretches from West 42nd to West 47th Streets.[1]
        Brightly adorned with billboards and advertisements, Times Square is
        sometimes referred to as "The Crossroads of the World",[2] "The Center
        of the Universe",[3] "the heart of The Great White Way",[4][5][6] and
        the "heart of the world".[7] One of the world's busiest pedestrian
        areas,[8] it is also the hub of the Broadway Theater District[9] and a
        major center of the world's entertainment industry.[10] Times Square is
        one of the world's most visited tourist attractions, drawing an
        estimated 50 million visitors annually.[11] Approximately 330,000 people
        pass through Times Square daily,[12] many of them tourists,[13] while
        over 460,000 pedestrians walk through Times Square on its busiest
        days.[7]
      - >-
        12-hour clock The 12-hour clock is a time convention in which the 24
        hours of the day are divided into two periods:[1] a.m. (from the Latin,
        ante meridiem, meaning before midday) and p.m. (post meridiem, meaning
        past midday).[2] Each period consists of 12 hours numbered: 12 (acting
        as zero),[3] 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and 11. The 24 hour/day
        cycle starts at 12 midnight (often indicated as 12 a.m.), runs through
        12 noon (often indicated as 12 p.m.), and continues to the midnight at
        the end of the day. The 12-hour clock was developed over time from the
        mid-second millennium BC to the 16th century AD.
datasets:
  - sentence-transformers/natural-questions
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
co2_eq_emissions:
  emissions: 100.49635551408996
  energy_consumed: 0.37551604693967594
  source: codecarbon
  training_type: fine-tuning
  on_cloud: false
  cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
  ram_total_size: 31.777088165283203
  hours_used: 0.956
  hardware_used: 1 x NVIDIA GeForce RTX 3090
model-index:
  - name: MPNet base trained on Natural Questions pairs
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoClimateFEVER
          type: NanoClimateFEVER
        metrics:
          - type: cosine_accuracy@1
            value: 0.34
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.4
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.52
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.68
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.34
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.15333333333333332
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.124
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.08199999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.15666666666666665
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.19666666666666666
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.245
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.32899999999999996
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2874373031622126
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.40833333333333327
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.23720812652159773
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoDBPedia
          type: NanoDBPedia
        metrics:
          - type: cosine_accuracy@1
            value: 0.5
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.86
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.49999999999999994
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.456
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.4
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.033460574803481
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.1379456043967994
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.19236008288900686
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.27718360985007373
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.4815217166466425
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.6581904761904761
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.34038401700749055
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoFEVER
          type: NanoFEVER
        metrics:
          - type: cosine_accuracy@1
            value: 0.56
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.68
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.78
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.84
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.56
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2333333333333333
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.16
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.08599999999999998
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.55
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.66
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.75
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.8
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.6774195829582105
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.6484444444444444
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.6375252923987764
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoFiQA2018
          type: NanoFiQA2018
        metrics:
          - type: cosine_accuracy@1
            value: 0.32
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.48
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.54
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.62
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.32
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.19333333333333336
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.16
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.092
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.16969047619047622
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.2751031746031746
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.35526984126984124
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.43926984126984125
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.3484809857651704
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.412
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.29317086609082804
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoHotpotQA
          type: NanoHotpotQA
        metrics:
          - type: cosine_accuracy@1
            value: 0.5
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.62
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.66
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.72
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.26666666666666666
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.172
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.102
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.25
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.4
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.43
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.51
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.4646363664054244
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.5711666666666667
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.40372265204584074
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoMSMARCO
          type: NanoMSMARCO
        metrics:
          - type: cosine_accuracy@1
            value: 0.24
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.5
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.6
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.72
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.24
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.16666666666666663
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.12000000000000002
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.07200000000000001
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.24
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.5
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.6
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.72
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.4662300989052903
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.38596825396825385
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.3966342163469757
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoNFCorpus
          type: NanoNFCorpus
        metrics:
          - type: cosine_accuracy@1
            value: 0.38
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.5
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.52
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.6
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.38
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.32
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.28
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.22399999999999998
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.013385350979353738
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.04273144042314974
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.05311513788935319
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.09414123513400076
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.26378524693375555
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.44563492063492055
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.08765440666496092
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoNQ
          type: NanoNQ
        metrics:
          - type: cosine_accuracy@1
            value: 0.38
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.6
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.66
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.76
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.38
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.21333333333333332
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.14
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.08199999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.36
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.59
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.64
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.74
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.56110661357524
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.507888888888889
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.5087680789990747
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoQuoraRetrieval
          type: NanoQuoraRetrieval
        metrics:
          - type: cosine_accuracy@1
            value: 0.9
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.92
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.92
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.94
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.37999999999999995
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.236
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.12799999999999997
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.7873333333333332
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.8786666666666667
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.8893333333333334
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.93
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9060776365512109
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9133333333333334
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8958434676434677
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoSCIDOCS
          type: NanoSCIDOCS
        metrics:
          - type: cosine_accuracy@1
            value: 0.42
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.58
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.62
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.84
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.42
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2866666666666666
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.23600000000000004
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.182
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.08866666666666669
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.1776666666666667
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.2426666666666667
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.3746666666666666
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.3499542026448552
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.524547619047619
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.26373167166015693
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoArguAna
          type: NanoArguAna
        metrics:
          - type: cosine_accuracy@1
            value: 0.16
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.62
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.72
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.88
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.16
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.20666666666666667
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.14400000000000002
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.088
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.16
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.62
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.72
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.88
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.5149778577506615
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.39771428571428574
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.4040354269913093
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoSciFact
          type: NanoSciFact
        metrics:
          - type: cosine_accuracy@1
            value: 0.42
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.62
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.64
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.72
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.42
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.21999999999999997
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.136
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.08199999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.385
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.59
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.61
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.71
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.5604524461977042
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.525357142857143
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.5119986720475846
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: NanoTouche2020
          type: NanoTouche2020
        metrics:
          - type: cosine_accuracy@1
            value: 0.5306122448979592
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8571428571428571
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.8979591836734694
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9591836734693877
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5306122448979592
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.5714285714285714
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.5346938775510204
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.41836734693877553
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.04151746748360336
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.12605393943663232
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.1939653912321698
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.2801024018430986
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.4767149331432611
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.6991739552964044
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.3725605659989721
            name: Cosine Map@100
      - task:
          type: nano-beir
          name: Nano BEIR
        dataset:
          name: NanoBEIR mean
          type: NanoBEIR_mean
        metrics:
          - type: cosine_accuracy@1
            value: 0.43466248037676614
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.6290109890109891
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.6875353218210362
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.7830141287284145
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.43466248037676614
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.28549450549450545
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2229764521193093
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.15679748822605966
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.24890157970181387
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.3996026276045966
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.45551618871387467
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5449510580587446
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.48913807620304917
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.54598102464429
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.4117874969551566
            name: Cosine Map@100

MPNet base trained on Natural Questions pairs

This is a sentence-transformers model finetuned from microsoft/mpnet-base on the natural-questions dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: microsoft/mpnet-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: natural-questions
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Router(
    (sub_modules): ModuleDict(
      (query): Sequential(
        (0): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      )
      (document): Sequential(
        (0): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
      )
    )
  )
)
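
The architecture printout shows that the Router sends queries through CLS-token pooling and documents through last-token pooling. Below is a minimal pure-Python sketch of those two pooling modes, assuming right-padded inputs and an attention mask that marks real tokens with 1. The helper names `cls_pool` and `last_token_pool` are illustrative only and are not the library's API.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """CLS pooling: use the embedding of the first token as the text embedding."""
    return list(token_embeddings[0])


def last_token_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Last-token pooling: use the embedding of the last non-padding token.

    Assumes right padding, so the last real token sits at index sum(mask) - 1.
    """
    last_index = sum(attention_mask) - 1
    return list(token_embeddings[last_index])


# Toy sequence: 3 real tokens followed by 1 padding token, 2 dimensions each.
tokens = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.0, 0.0]]
mask = [1, 1, 1, 0]

query_embedding = cls_pool(tokens)                  # what the "query" route keeps
document_embedding = last_token_pool(tokens, mask)  # what the "document" route keeps
print(query_embedding, document_embedding)
```

Both routes share the same Transformer module, so only the choice of which token vector to keep differs between queries and documents.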

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("tomaarsen/cls-last-split-pooling")
# Run inference
queries = [
    "what is meaning of am and pm in time",
]
documents = [
    '12-hour clock The 12-hour clock is a time convention in which the 24 hours of the day are divided into two periods:[1] a.m. (from the Latin, ante meridiem, meaning before midday) and p.m. (post meridiem, meaning past midday).[2] Each period consists of 12 hours numbered: 12 (acting as zero),[3] 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and 11. The 24 hour/day cycle starts at 12 midnight (often indicated as 12 a.m.), runs through 12 noon (often indicated as 12 p.m.), and continues to the midnight at the end of the day. The 12-hour clock was developed over time from the mid-second millennium BC to the 16th century AD.',
    "America's Got Talent America's Got Talent (often abbreviated as AGT) is a televised American talent show competition, broadcast on the NBC television network. It is part of the global Got Talent franchise created by Simon Cowell, and is produced by Fremantle North America and SYCOtv, with distribution done by Fremantle. Since its premiere in June 2006, each season is run during the network's summer schedule, with the show having featured various hosts - it is currently hosted by Tyra Banks, since 2017.[2] It is the first global edition of the franchise, after plans for a British edition in 2005 were suspended, following a dispute between Paul O'Grady, the planned host, and the British broadcaster ITV; production of this edition later resumed in 2007.[3]",
    'Times Square Times Square is a major commercial intersection, tourist destination, entertainment center and neighborhood in the Midtown Manhattan section of New York City at the junction of Broadway and Seventh Avenue. It stretches from West 42nd to West 47th Streets.[1] Brightly adorned with billboards and advertisements, Times Square is sometimes referred to as "The Crossroads of the World",[2] "The Center of the Universe",[3] "the heart of The Great White Way",[4][5][6] and the "heart of the world".[7] One of the world\'s busiest pedestrian areas,[8] it is also the hub of the Broadway Theater District[9] and a major center of the world\'s entertainment industry.[10] Times Square is one of the world\'s most visited tourist attractions, drawing an estimated 50 million visitors annually.[11] Approximately 330,000 people pass through Times Square daily,[12] many of them tourists,[13] while over 460,000 pedestrians walk through Times Square on its busiest days.[7]',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.5485, 0.0270, 0.1584]])
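
The similarity scores can be turned into a ranking by sorting document indices by score. Below is a small sketch, assuming the scores printed in the usage example have been copied into a plain list (with the real tensor, `similarities[0].tolist()` should give data of the same shape). The `cosine_similarity` helper is illustrative and only shows what the cosine similarity function computes.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Dot product of the two vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Scores copied from the example output: one query against three documents.
scores = [0.5485, 0.0270, 0.1584]

# Sort document indices from most to least similar.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
# [0, 2, 1]
```

The 12-hour clock passage ranks first, followed by Times Square and then America's Got Talent.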

Evaluation

Metrics

Information Retrieval

  • Datasets: NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
  • Evaluated with InformationRetrievalEvaluator
| Metric | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cosine_accuracy@1 | 0.34 | 0.5 | 0.56 | 0.32 | 0.5 | 0.24 | 0.38 | 0.38 | 0.9 | 0.42 | 0.16 | 0.42 | 0.5306 |
| cosine_accuracy@3 | 0.4 | 0.8 | 0.68 | 0.48 | 0.62 | 0.5 | 0.5 | 0.6 | 0.92 | 0.58 | 0.62 | 0.62 | 0.8571 |
| cosine_accuracy@5 | 0.52 | 0.86 | 0.78 | 0.54 | 0.66 | 0.6 | 0.52 | 0.66 | 0.92 | 0.62 | 0.72 | 0.64 | 0.898 |
| cosine_accuracy@10 | 0.68 | 0.9 | 0.84 | 0.62 | 0.72 | 0.72 | 0.6 | 0.76 | 0.94 | 0.84 | 0.88 | 0.72 | 0.9592 |
| cosine_precision@1 | 0.34 | 0.5 | 0.56 | 0.32 | 0.5 | 0.24 | 0.38 | 0.38 | 0.9 | 0.42 | 0.16 | 0.42 | 0.5306 |
| cosine_precision@3 | 0.1533 | 0.5 | 0.2333 | 0.1933 | 0.2667 | 0.1667 | 0.32 | 0.2133 | 0.38 | 0.2867 | 0.2067 | 0.22 | 0.5714 |
| cosine_precision@5 | 0.124 | 0.456 | 0.16 | 0.16 | 0.172 | 0.12 | 0.28 | 0.14 | 0.236 | 0.236 | 0.144 | 0.136 | 0.5347 |
| cosine_precision@10 | 0.082 | 0.4 | 0.086 | 0.092 | 0.102 | 0.072 | 0.224 | 0.082 | 0.128 | 0.182 | 0.088 | 0.082 | 0.4184 |
| cosine_recall@1 | 0.1567 | 0.0335 | 0.55 | 0.1697 | 0.25 | 0.24 | 0.0134 | 0.36 | 0.7873 | 0.0887 | 0.16 | 0.385 | 0.0415 |
| cosine_recall@3 | 0.1967 | 0.1379 | 0.66 | 0.2751 | 0.4 | 0.5 | 0.0427 | 0.59 | 0.8787 | 0.1777 | 0.62 | 0.59 | 0.1261 |
| cosine_recall@5 | 0.245 | 0.1924 | 0.75 | 0.3553 | 0.43 | 0.6 | 0.0531 | 0.64 | 0.8893 | 0.2427 | 0.72 | 0.61 | 0.194 |
| cosine_recall@10 | 0.329 | 0.2772 | 0.8 | 0.4393 | 0.51 | 0.72 | 0.0941 | 0.74 | 0.93 | 0.3747 | 0.88 | 0.71 | 0.2801 |
| cosine_ndcg@10 | 0.2874 | 0.4815 | 0.6774 | 0.3485 | 0.4646 | 0.4662 | 0.2638 | 0.5611 | 0.9061 | 0.35 | 0.515 | 0.5605 | 0.4767 |
| cosine_mrr@10 | 0.4083 | 0.6582 | 0.6484 | 0.412 | 0.5712 | 0.386 | 0.4456 | 0.5079 | 0.9133 | 0.5245 | 0.3977 | 0.5254 | 0.6992 |
| cosine_map@100 | 0.2372 | 0.3404 | 0.6375 | 0.2932 | 0.4037 | 0.3966 | 0.0877 | 0.5088 | 0.8958 | 0.2637 | 0.404 | 0.512 | 0.3726 |
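
For readers unfamiliar with the metric names, the sketch below shows how each one is commonly defined for a single query, assuming binary relevance (a document is either relevant or not). The function names and the toy ranking are illustrative and are not taken from InformationRetrievalEvaluator; the reported values are these per-query scores averaged over all queries of a dataset.

```python
import math
from typing import List, Set


def accuracy_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)


def mrr_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Reciprocal rank of the first relevant document in the top k (0 if none)."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal


# Toy query: two relevant documents, retrieved at ranks 2 and 4.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2"}

print(accuracy_at_k(ranked, relevant, 1), accuracy_at_k(ranked, relevant, 3))
print(precision_at_k(ranked, relevant, 3), recall_at_k(ranked, relevant, 3))
print(mrr_at_k(ranked, relevant, 10), ndcg_at_k(ranked, relevant, 5))
```

Under these definitions precision@1 always equals accuracy@1, as in the table. When every query has exactly one relevant document, recall@k also equals accuracy@k, which is consistent with the NanoMSMARCO and NanoArguAna columns.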

Nano BEIR

  • Dataset: NanoBEIR_mean
  • Evaluated with NanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "climatefever",
            "dbpedia",
            "fever",
            "fiqa2018",
            "hotpotqa",
            "msmarco",
            "nfcorpus",
            "nq",
            "quoraretrieval",
            "scidocs",
            "arguana",
            "scifact",
            "touche2020"
        ]
    }
    
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.4347 |
| cosine_accuracy@3 | 0.629 |
| cosine_accuracy@5 | 0.6875 |
| cosine_accuracy@10 | 0.783 |
| cosine_precision@1 | 0.4347 |
| cosine_precision@3 | 0.2855 |
| cosine_precision@5 | 0.223 |
| cosine_precision@10 | 0.1568 |
| cosine_recall@1 | 0.2489 |
| cosine_recall@3 | 0.3996 |
| cosine_recall@5 | 0.4555 |
| cosine_recall@10 | 0.545 |
| cosine_ndcg@10 | 0.4891 |
| cosine_mrr@10 | 0.546 |
| cosine_map@100 | 0.4118 |
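
The NanoBEIR_mean values appear to be the unweighted mean over the 13 datasets. The sketch below checks this for cosine_ndcg@10 using the rounded per-dataset values from the Information Retrieval table, so the result can only be expected to match to about four decimal places.

```python
# cosine_ndcg@10 per dataset, copied from the Information Retrieval table.
ndcg_at_10 = {
    "NanoClimateFEVER": 0.2874,
    "NanoDBPedia": 0.4815,
    "NanoFEVER": 0.6774,
    "NanoFiQA2018": 0.3485,
    "NanoHotpotQA": 0.4646,
    "NanoMSMARCO": 0.4662,
    "NanoNFCorpus": 0.2638,
    "NanoNQ": 0.5611,
    "NanoQuoraRetrieval": 0.9061,
    "NanoSCIDOCS": 0.35,
    "NanoArguAna": 0.515,
    "NanoSciFact": 0.5605,
    "NanoTouche2020": 0.4767,
}

# Unweighted mean: every dataset counts equally, regardless of its size.
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(round(mean_ndcg, 4))
# 0.4891
```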

Training Details

Training Dataset

natural-questions

  • Dataset: natural-questions at f9e894e
  • Size: 99,231 training samples
  • Columns: query and answer
  • Approximate statistics based on the first 1000 samples:
    |         | query | answer |
    |---|---|---|
    | type    | string | string |
    | details | min: 10 tokens, mean: 11.74 tokens, max: 24 tokens | min: 15 tokens, mean: 137.2 tokens, max: 508 tokens |
  • Samples:
    query answer
    who is required to report according to the hmda Home Mortgage Disclosure Act US financial institutions must report HMDA data to their regulator if they meet certain criteria, such as having assets above a specific threshold. The criteria is different for depository and non-depository institutions and are available on the FFIEC website.[4] In 2012, there were 7,400 institutions that reported a total of 18.7 million HMDA records.[5]
    what is the definition of endoplasmic reticulum in biology Endoplasmic reticulum The endoplasmic reticulum (ER) is a type of organelle in eukaryotic cells that forms an interconnected network of flattened, membrane-enclosed sacs or tube-like structures known as cisternae. The membranes of the ER are continuous with the outer nuclear membrane. The endoplasmic reticulum occurs in most types of eukaryotic cells, but is absent from red blood cells and spermatozoa. There are two types of endoplasmic reticulum: rough and smooth. The outer (cytosolic) face of the rough endoplasmic reticulum is studded with ribosomes that are the sites of protein synthesis. The rough endoplasmic reticulum is especially prominent in cells such as hepatocytes. The smooth endoplasmic reticulum lacks ribosomes and functions in lipid manufacture and metabolism, the production of steroid hormones, and detoxification.[1] The smooth ER is especially abundant in mammalian liver and gonad cells. The lacy membranes of the endoplasmic reticulum were first seen in 1945 using elect...
    what does the ski mean in polish names Polish name Since the High Middle Ages, Polish-sounding surnames ending with the masculine -ski suffix, including -cki and -dzki, and the corresponding feminine suffix -ska/-cka/-dzka were associated with the nobility (Polish szlachta), which alone, in the early years, had such suffix distinctions.[1] They are widely popular today.
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 16,
        "gather_across_devices": false
    }
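
To illustrate what this loss optimizes, here is a minimal pure-Python sketch of the in-batch-negatives objective with cosine similarity and `scale = 20.0`: for each query, its paired answer is the positive and every other answer in the batch acts as a negative. The `mnrl_loss` function is an illustrative assumption, not the library's implementation. The cached variant used for this model is designed to produce the same loss value while processing the batch in mini-batches (here 16) to save memory, which this sketch omits.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnrl_loss(queries: List[List[float]], answers: List[List[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where answers[i] is the positive for queries[i]
    and all other answers in the batch serve as negatives."""
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, answer) for answer in answers]
        # Numerically stable log-sum-exp over the scaled similarities.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(value - top) for value in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)


queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnrl_loss(queries, [[1.0, 0.0], [0.0, 1.0]])   # positives match: loss near 0
swapped = mnrl_loss(queries, [[0.0, 1.0], [1.0, 0.0]])   # positives mismatched: loss near 20
uniform = mnrl_loss([[1.0, 0.0]] * 4, [[1.0, 0.0]] * 4)  # no signal: loss equals ln(4)
print(aligned, swapped, uniform)
```

When the model cannot yet tell answers apart, the loss is the logarithm of the batch size. With a batch size of 256 that is ln(256) ≈ 5.55, close to the first logged training loss of 5.5974.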
    

Evaluation Dataset

natural-questions

  • Dataset: natural-questions at f9e894e
  • Size: 1,000 evaluation samples
  • Columns: query and answer
  • Approximate statistics based on the first 1000 samples:
    |         | query | answer |
    |---|---|---|
    | type    | string | string |
    | details | min: 10 tokens, mean: 11.78 tokens, max: 22 tokens | min: 11 tokens, mean: 135.64 tokens, max: 512 tokens |
  • Samples:
    query answer
    difference between russian blue and british blue cat Russian Blue The coat is known as a "double coat", with the undercoat being soft, downy and equal in length to the guard hairs, which are an even blue with silver tips. However, the tail may have a few very dull, almost unnoticeable stripes. The coat is described as thick, plush and soft to the touch. The feeling is softer than the softest silk. The silver tips give the coat a shimmering appearance. Its eyes are almost always a dark and vivid green. Any white patches of fur or yellow eyes in adulthood are seen as flaws in show cats.[3] Russian Blues should not be confused with British Blues (which are not a distinct breed, but rather a British Shorthair with a blue coat as the British Shorthair breed itself comes in a wide variety of colors and patterns), nor the Chartreux or Korat which are two other naturally occurring breeds of blue cats, although they have similar traits.
    who played the little girl on mrs doubtfire Mara Wilson Mara Elizabeth Wilson[2] (born July 24, 1987) is an American writer and former child actress. She is known for playing Natalie Hillard in Mrs. Doubtfire (1993), Susan Walker in Miracle on 34th Street (1994), Matilda Wormwood in Matilda (1996) and Lily Stone in Thomas and the Magic Railroad (2000). Since retiring from film acting, Wilson has focused on writing.
    what year did the movie the sound of music come out The Sound of Music (film) The film was released on March 2, 1965 in the United States, initially as a limited roadshow theatrical release. Although critical response to the film was widely mixed, the film was a major commercial success, becoming the number one box office movie after four weeks, and the highest-grossing film of 1965. By November 1966, The Sound of Music had become the highest-grossing film of all-time—surpassing Gone with the Wind—and held that distinction for five years. The film was just as popular throughout the world, breaking previous box-office records in twenty-nine countries. Following an initial theatrical release that lasted four and a half years, and two successful re-releases, the film sold 283 million admissions worldwide and earned a total worldwide gross of $286,000,000.
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "mini_batch_size": 16,
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • batch_sampler: no_duplicates
  • router_mapping: {'query': 'query', 'answer': 'document'}
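
The batch size, learning rate, and warmup ratio above determine the learning-rate schedule. The sketch below derives the approximate step counts and a linear warmup and decay curve, assuming the default linear scheduler and that nearly all 99,231 samples are used in the single epoch (the no_duplicates sampler can shift the exact count slightly). The `linear_lr` helper is illustrative and is not a library function.

```python
import math

num_samples = 99_231   # training set size
batch_size = 256       # per_device_train_batch_size on a single GPU
base_lr = 2e-5         # learning_rate
warmup_ratio = 0.1     # warmup_ratio

total_steps = math.ceil(num_samples / batch_size)
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(total_steps, warmup_steps)
# 388 39
for step in (1, warmup_steps, 200, total_steps):
    print(step, linear_lr(step))
```

The estimate of about 388 steps is consistent with the training logs, where step 385 corresponds to epoch 0.9923.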

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {'query': 'query', 'answer': 'document'}
  • learning_rate_mapping: {}

Training Logs

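| Epoch | Step | Training Loss | Validation Loss | NanoClimateFEVER_cosine_ndcg@10 | NanoDBPedia_cosine_ndcg@10 | NanoFEVER_cosine_ndcg@10 | NanoFiQA2018_cosine_ndcg@10 | NanoHotpotQA_cosine_ndcg@10 | NanoMSMARCO_cosine_ndcg@10 | NanoNFCorpus_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoQuoraRetrieval_cosine_ndcg@10 | NanoSCIDOCS_cosine_ndcg@10 | NanoArguAna_cosine_ndcg@10 | NanoSciFact_cosine_ndcg@10 | NanoTouche2020_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| -1 | -1 | - | - | 0.0630 | 0.1506 | 0.1219 | 0.0264 | 0.1597 | 0.0674 | 0.0332 | 0.0715 | 0.3045 | 0.0708 | 0.1367 | 0.1019 | 0.1166 | 0.1096 |
| 0.0026 | 1 | 5.5974 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0129 | 5 | 5.3551 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0258 | 10 | 5.0092 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0387 | 15 | 4.4557 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0515 | 20 | 3.2672 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0644 | 25 | 1.9936 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0773 | 30 | 1.3587 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0902 | 35 | 1.0273 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1031 | 40 | 0.7769 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1160 | 45 | 0.5939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1289 | 50 | 0.4739 | 0.2552 | 0.2749 | 0.4683 | 0.6305 | 0.3643 | 0.4449 | 0.4521 | 0.1968 | 0.4122 | 0.8490 | 0.3413 | 0.5203 | 0.4890 | 0.4388 | 0.4525 |
| 0.1418 | 55 | 0.417 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1546 | 60 | 0.4114 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1675 | 65 | 0.3787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1804 | 70 | 0.3349 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1933 | 75 | 0.3161 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2062 | 80 | 0.3358 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2191 | 85 | 0.2999 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2320 | 90 | 0.3039 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2448 | 95 | 0.2502 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2577 | 100 | 0.225 | 0.1430 | 0.2907 | 0.4866 | 0.6736 | 0.3518 | 0.4464 | 0.4605 | 0.2073 | 0.4936 | 0.8874 | 0.3542 | 0.5146 | 0.5442 | 0.4653 | 0.4751 |
| 0.2706 | 105 | 0.263 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2835 | 110 | 0.3001 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2964 | 115 | 0.224 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3093 | 120 | 0.2394 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3222 | 125 | 0.2487 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3351 | 130 | 0.1954 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3479 | 135 | 0.2194 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3608 | 140 | 0.2514 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3737 | 145 | 0.2145 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3866 | 150 | 0.2053 | 0.1190 | 0.2912 | 0.4807 | 0.6543 | 0.3429 | 0.4563 | 0.4598 | 0.2422 | 0.5075 | 0.9008 | 0.3552 | 0.5323 | 0.5575 | 0.4581 | 0.4799 |
| 0.3995 | 155 | 0.2405 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4124 | 160 | 0.2207 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4253 | 165 | 0.1908 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4381 | 170 | 0.1832 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4510 | 175 | 0.2108 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4639 | 180 | 0.1901 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4768 | 185 | 0.2118 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4897 | 190 | 0.1813 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5026 | 195 | 0.1848 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5155 | 200 | 0.1932 | 0.1043 | 0.2857 | 0.4838 | 0.6747 | 0.3592 | 0.4611 | 0.4738 | 0.2415 | 0.5336 | 0.8939 | 0.3539 | 0.5101 | 0.5442 | 0.4726 | 0.4837 |
| 0.5284 | 205 | 0.2004 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5412 | 210 | 0.1874 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5541 | 215 | 0.1548 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5670 | 220 | 0.1662 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5799 | 225 | 0.158 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5928 | 230 | 0.1951 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6057 | 235 | 0.1935 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6186 | 240 | 0.1665 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6314 | 245 | 0.1557 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6443 | 250 | 0.1987 | 0.0963 | 0.2834 | 0.4801 | 0.6737 | 0.3522 | 0.4610 | 0.4736 | 0.2478 | 0.5643 | 0.8999 | 0.3437 | 0.5225 | 0.5353 | 0.4802 | 0.4860 |
| 0.6572 | 255 | 0.1612 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6701 | 260 | 0.1859 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6830 | 265 | 0.1983 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6959 | 270 | 0.1688 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7088 | 275 | 0.1949 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7216 | 280 | 0.1684 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7345 | 285 | 0.1834 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7474 | 290 | 0.1673 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7603 | 295 | 0.185 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7732 | 300 | 0.1529 | 0.0902 | 0.2827 | 0.4798 | 0.6636 | 0.3486 | 0.4528 | 0.4634 | 0.2530 | 0.5602 | 0.9064 | 0.3519 | 0.5204 | 0.5531 | 0.4727 | 0.4853 |
| 0.7861 | 305 | 0.2042 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7990 | 310 | 0.1995 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8119 | 315 | 0.1579 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8247 | 320 | 0.1711 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8376 | 325 | 0.17 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8505 | 330 | 0.1539 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8634 | 335 | 0.151 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8763 | 340 | 0.1642 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8892 | 345 | 0.1669 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9021 | 350 | 0.1475 | 0.0911 | 0.2874 | 0.4843 | 0.6724 | 0.3450 | 0.4536 | 0.4590 | 0.2616 | 0.5611 | 0.9064 | 0.3501 | 0.5114 | 0.5675 | 0.4718 | 0.4870 |
| 0.9149 | 355 | 0.1842 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9278 | 360 | 0.1858 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9407 | 365 | 0.2033 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9536 | 370 | 0.181 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9665 | 375 | 0.1525 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9794 | 380 | 0.1722 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9923 | 385 | 0.1547 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| -1 | -1 | - | - | 0.2874 | 0.4815 | 0.6774 | 0.3485 | 0.4646 | 0.4662 | 0.2638 | 0.5611 | 0.9061 | 0.3500 | 0.5150 | 0.5605 | 0.4767 | 0.4891 |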

Environmental Impact

Carbon emissions were measured using CodeCarbon.

  • Energy Consumed: 0.376 kWh
  • Carbon Emitted: 0.100 kg of CO2
  • Hours Used: 0.956 hours
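
Two derived quantities follow from these figures by simple division. They are not reported by the model card itself, and the sketch assumes that "Hours Used" covers the same period over which the energy was measured.

```python
energy_kwh = 0.376   # Energy Consumed
carbon_kg = 0.100    # Carbon Emitted
hours = 0.956        # Hours Used

# Average power draw over the run, in watts.
average_power_w = energy_kwh / hours * 1000
# Implied carbon intensity of the electricity, in kg of CO2 per kWh.
carbon_intensity = carbon_kg / energy_kwh

print(f"{average_power_w:.0f} W, {carbon_intensity:.3f} kg CO2/kWh")
# 393 W, 0.266 kg CO2/kWh
```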

Training Hardware

  • On Cloud: No
  • GPU Model: 1 x NVIDIA GeForce RTX 3090
  • CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
  • RAM Size: 31.78 GB

Framework Versions

  • Python: 3.11.6
  • Sentence Transformers: 5.2.0.dev0
  • Transformers: 4.56.0.dev0
  • PyTorch: 2.7.1+cu126
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}