---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:100000
- loss:MultipleNegativesRankingLoss
base_model: embaas/sentence-transformers-e5-large-v2
widget:
- source_sentence: 'Query: The Gaumont Film Company was founded before a studio that
was established in Denmark by what Danish filmmaker?
Context:
1. It is the first and oldest film company in the world, founded before other
studios such as Pathé (founded in 1896), Titanus (1904), Nordisk Film (1906),
Universal and Paramount Pictures (both founded in 1912).'
sentences:
- 'Nordisk Film (or Nordisk Film Distribution, USA affiliate: Great Northern Film
Company), established in Denmark in 1906 by Danish filmmaker Ole Olsen and also
the oldest continuously active film studio in the world. It is the third oldest
studio in the world behind the Gaumont Film Company and Pathé. Olsen started
his company in the Copenhagen suburb of Valby under the name "Ole Olsen''s Film
Factory" but soon changed it to the Nordisk Film Kompagni. In 1908, Olsen opened
an affiliate branch in New York, the Great Northern Film Company, to handle distribution
of his films to the American market. As Nordisk Film, it became a publicly traded
company in 1911.'
- The West Lodge, also known as the West Gate Lodge, to Cardiff Castle is a Grade
II* listed building, currently used as a tea room, in the centre of Cardiff, Wales. It
is approximately 100 m west of the Castle, with the Animal Wall running in-between.
- 'Julmust (Swedish: "jul" "Yule" and "must " "not yet fermented juice of fruit
or berries", though there is no such juice in "julmust") is a soft drink that
is mainly consumed in Sweden around Christmas. During the other part of the year
it is usually quite difficult to find in stores, but sometimes it is sold at other
times of the year under the name "must". At Easter the name is påskmust (from
"påsk ", "Easter" / "Paschal" ["q.v."]). The content is the same regardless of
the marketing name, although the length of time it is stored before bottling differs;
however, the beverage is more closely associated with Christmas, somewhat less
with Easter and traditionally not at all with the summer. 45 million litres of
"julmust" are consumed during December, which is around 50% of the total soft
drink volume in December and 75% of the total yearly must sales.'
- source_sentence: 'Query: Finding Dory features the voice of an actress from "It''s
Always Sunny in Philadelphia" who plays what character on that show?
Context:
1. William J. "Willem" Dafoe (born July 22, 1955) is an American actor. A member
of the experimental theatre company the Wooster Group, he was nominated for the
Academy Award for Best Supporting Actor for his roles as Elias in Oliver Stone''s
"Platoon" (1986) and Max Schreck in the comedy-horror film "Shadow of the Vampire"
(2000). His other film appearances include "The Last Temptation of Christ" (1988),
"Mississippi Burning" (1988),"The English Patient" (1996), "American Psycho" (2000),
the "Spider-Man" trilogy (2002–2007), "John Wick" (2014), "The Grand Budapest
Hotel" (2014), and "Justice League" (2017). He has also had voice roles in "Finding
Nemo" (2003) and its sequel "Finding Dory" (2016), "Fantastic Mr. Fox" (2009),
"John Carter" (2012) and the recent adaptation of "Death Note" (2017).
2. Gregory Grene is an American musician based in New York City, who grew up in
Chicago and County Cavan, Ireland. He is the son of the classicist David Grene. With
his band The Prodigals, he writes and plays a style of music that melds Irish
traditional and rock influences, and he has also recorded a solo album, FlipSides
(2008), with musicians ranging from John Doyle, former guitarist with Solas, to
Tony Cedras, a multi-instrumentalist who has played in Paul Simon''s band since
the Graceland tour. Grene''s music has received critical acclaim over the years,
was included in the Rough Guide to Irish Music compilation, and was featured in
the soundtrack for "Pride and Glory", a movie starring Ed Norton, Colin Farrell,
Jon Voight and Noah Emmerich, as well as on television in the ABC show "Mercy"
and the FX series "It''s Always Sunny in Philadelphia.
3. Albert Lawrence Brooks (born Albert Lawrence Einstein; July 22, 1947) is an
American actor, filmmaker, author, and comedian. He received an Academy Award
nomination for Best Supporting Actor for 1987''s "Broadcast News" and was widely
praised for his performance in the 2011 film "Drive". His voice acting credits
include Marlin in "Finding Nemo" (2003) and "Finding Dory" (2016), and recurring
guest voices for "The Simpsons", including Russ Cargill in "The Simpsons Movie"
(2007). He has directed, written, and starred in several comedy films, such as
"Modern Romance" (1981), "Lost in America" (1985), and "Defending Your Life" (1991). He
is also the author of "2030: The Real Story of What Happens to America" (2011).'
sentences:
- Finding Dory is a 2016 American 3D computer-animated comedy adventure film produced
by Pixar Animation Studios and released by Walt Disney Pictures. Directed by
Andrew Stanton with co-direction by Angus MacLane, the screenplay was written
by Stanton and Victoria Strouse. The film is a sequel/spinoff to 2003's "Finding
Nemo" and features the returning voices of Ellen DeGeneres and Albert Brooks,
with Hayden Rolence (replacing Alexander Gould), Ed O'Neill, Kaitlin Olson, Ty
Burrell, Diane Keaton and Eugene Levy joining the cast. The film focuses on the
amnesiac fish Dory, who journeys to be reunited with her parents.
- Mali Finn (March 8, 1938 November 28, 2007), born Mary Alice Mann, was an American
casting director and former English and drama teacher. She cast numerous actors
in successful films, including Edward Furlong, Leonardo DiCaprio, and Russell
Crowe.
- Dave Mader III (born June 30, 1955) is an American stock car racing driver from
Maylene, Alabama. Winner of the 1978 Snowball Derby, he is a former competitor
in all three of NASCAR's national touring series.
- source_sentence: 'Query: Which American journalist born 1893 is known for having
an affair with a United Tax Court Judge?
Context:
1. The United States Revenue Act of 1924 (43 Stat.  253 ) (June 2, 1924), also
known as the Mellon tax bill cut federal tax rates and established the U.S. Board
of Tax Appeals, which was later renamed the United States Tax Court in 1942. The
bill was named after U.S. Secretary of the Treasury Andrew Mellon.
2. The United States Tax Court is composed of 19 members appointed by the President
and confirmed by the Senate. Reappointment, when requested by a Tax Court judge
is generally "pro forma" regardless of the political party of the appointing President
and the political party of the re-appointing (sitting) President. By statute,
Congress has granted to the President the power to remove the judges of the U.S.
Tax Court "for inefficiency, neglect of duty, or malfeasance in office..."
3. In a ruling issued in June 2015, Tax Court Judge Vito Bianco ruled that the
hospital would be required to pay property taxes on nearly all of its 40 acres
campus.
4. Marion Janet Harron (September 3, 1903 – September 26, 1972) was a United States
Tax Court judge (c.1936), and best known for having an affair with Lorena Hickok.'
sentences:
- '"PC Principal Final Justice" (also known as "PC Principal") is the tenth and
final episode of the nineteenth season and the 267th overall episode of the animated
television series "South Park", written and directed by series co-creator Trey
Parker. The episode premiered on Comedy Central on December 9, 2015. It is the
third and final part of a three-episode story arc that began with the episode
"Sponsored Content" and continued in the episode "Truth and Advertising", which
collectively serve as the season finale. The episode parodies the abundance of
online advertising, as well as gun politics in the United States, as part of its
season-long lampoon of political correctness.'
- Lorena Alice Hickok (March 7, 1893 May 1, 1968) was an American journalist known
for her close romantic relationship with First Lady Eleanor Roosevelt.
- The Tragedy of Julius Caesar is a tragedy by William Shakespeare, believed to
have been written in 1599. It is one of several plays written by Shakespeare
based on true events from Roman history, which also include "Coriolanus" and "Antony
and Cleopatra".
- source_sentence: 'Query: What computer was developed first, the Matra or the Orao?
Context:
1. Alice is an open-source object-based educational programming language with
an integrated development environment (IDE). Alice uses a drag and drop environment
to create computer animations using 3D models. The software was developed first
at University of Virginia in 1994, then Carnegie Mellon (from 1997), by a research
group led by Randy Pausch.
2. Mécanique Aviation Traction or Matra ("M"écanique "A"viation "TRA"ction) was
a French company covering a wide range of activities mainly related to automobiles,
bicycles, aeronautics and weaponry.'
sentences:
- Yakov Naumovich Pokhis, better known as Yakov Smirnoff (born 24 January 1951),
is a Soviet-born American comedian, actor and writer. After emigrating to the
United States in 1977, Smirnoff began performing as a stand-up comic. He reached
his biggest success in the mid-to-late 1980s, appearing in several films and the
television sitcom vehicle "What a Country! ". His comic persona was of a naive
immigrant from the Soviet Union who was perpetually confused and delighted by
life in the United States. His humor combined a mockery of life under Communism
and of consumerism in the United States, as well as word play caused by misunderstanding
of American phrases and culture, all punctuated by the catchphrase, "And I thought,
'What a country!'
- Guardians of Order was a Canadian company founded in 1996 by Mark C. MacKinnon
in Guelph, Ontario. The company's business output consisted of role-playing games
(RPGs). Their first game is the anime inspired "Big Eyes, Small Mouth". In 2006
Guardians of Order ceased operations due to overwhelming debt.
- Orao (en. "Eagle") was an 8-bit computer developed by PEL Varaždin in 1984. Its
marketing and distribution was done by "Velebit Informatika". It was used as
a standard primary school computer in Croatia and Vojvodina from 1985 to 1991.
- source_sentence: 'Query: Are Quarto and Strange Synergy both types of games?
Context:
1. An exotic star is a hypothetical compact star composed of something other than
electrons, protons, neutrons, or muons; and balanced against gravitational collapse
by degeneracy pressure or other quantum properties. These include quark stars
(composed of quarks) and perhaps strange stars (based upon strange quark matter,
a condensate of up, down and strange quarks), as well as speculative preon stars
(composed of preons, which are hypothetical particles and "building blocks" of
quarks, if quarks prove to be decomposable into component sub-particles). Of
the various types of exotic star proposed, the most well evidenced and understood
is the quark star.
2. Philip Lavery (born August 17, 1990 in Dublin) is an Irish racing cyclist who
most recently rode for the Synergy Baku team. Lavery won the 2010 Tour of the
North and won a bronze medal at the 2010 Commonwealth Games in India, as part
of the Northern Irish team pursuit squad. During the summer of 2013, Lavery joined
the Cofidis team as a "stagiaire", after taking several victories in French domestic
racing.
3. Quarto is a board game for two players invented by Swiss mathematician Blaise
Müller in 1991.'
sentences:
- The Lacy Dog or Blue Lacy Dog is a breed of working dog that originated in Texas
in the mid-19th century. The Lacy was first recognized in 2001 by the Texas Senate. In
Senate Resolution No. 436, the 77th Legislature honored the Lacy as "a true Texas
breed". In June 2005, Governor Rick Perry signed the legislation adopting the
Blue Lacy as "the official State Dog Breed of Texas". As expected, the vast majority
of Lacy dogs are found in Texas. However, as the breed becomes more well recognized,
there are breeding populations being established across the United States, Canada,
and most recently in Europe.
- 'Holingol (a.k.a. Huolin Gol; Mongolian: ᠬᠣᠣᠯᠢᠠ ᠭᠣᠤᠯ ᠬᠣᠲᠠ (Хоолингол хот); Chinese:
霍林郭勒 "Huolinguole") is a county-level city of Inner Mongolia, China.'
- Strange Synergy is a card game published by Steve Jackson Games in which players
build a team of super heroes to battle an opponent's team.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on embaas/sentence-transformers-e5-large-v2
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: eval split
type: eval_split
metrics:
- type: cosine_accuracy@1
value: 0.923
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9865
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.991
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9945
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.923
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.32883333333333326
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19820000000000004
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09945000000000001
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.923
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9865
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.991
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9945
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9651637010519428
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9551150793650793
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9553006088171921
name: Cosine Map@100
---
# SentenceTransformer based on embaas/sentence-transformers-e5-large-v2
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [embaas/sentence-transformers-e5-large-v2](https://huggingface.co/embaas/sentence-transformers-e5-large-v2). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [embaas/sentence-transformers-e5-large-v2](https://huggingface.co/embaas/sentence-transformers-e5-large-v2) <!-- at revision 86001e787b4f6bda8cdc8c2095c0493dd135484e -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
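
Since the similarity function listed above is cosine similarity, a small sketch of how candidate passages could be ranked against a query embedding may help. This is a pure-Python illustration on toy 2-dimensional vectors, not the library's implementation; the helper names `cosine_similarity` and `rank_by_cosine` are hypothetical, and real embeddings from this model would have 1024 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_cosine(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [(cosine_similarity(query, c), i) for i, c in enumerate(candidates)]
    return [i for _, i in sorted(scores, key=lambda s: -s[0])]


# Toy vectors standing in for model embeddings (assumed values, for illustration only).
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_by_cosine(query, candidates)
```

For embeddings that are already unit-length, the cosine similarity reduces to a plain dot product, which is why a normalization step before comparison is a common design choice.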
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: PeftModelForFeatureExtraction
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
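
The architecture above chains a transformer, mean-token pooling (`pooling_mode_mean_tokens: True`), and a `Normalize()` step. As a rough sketch of what the last two stages compute, the following pure-Python example averages token vectors over non-padding positions and then scales the result to unit length. It is a simplified illustration for a single sequence under assumed toy values; the function names `mean_pool` and `l2_normalize` are hypothetical, and the actual library operates on batched tensors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j in range(dim):
                summed[j] += vec[j]
    count = max(count, 1)  # avoid dividing by zero if every position is masked
    return [s / count for s in summed]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


# Toy 2-dimensional token vectors; the third position is padding (mask = 0).
tokens = [[6.0, 0.0], [0.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)    # mean of the first two vectors
embedding = l2_normalize(pooled)    # unit-length sentence vector
```

Because the final vector has unit length, comparing two such embeddings by dot product gives the same value as their cosine similarity.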