---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:100000
- loss:MultipleNegativesRankingLoss
base_model: embaas/sentence-transformers-e5-large-v2
widget:
- source_sentence: 'Query: The Gaumont Film Company was founded before a studio that
    was established in Denmark by what Danish filmmaker?


    Context:


    1. It is the first and oldest film company in the world, founded before other
    studios such as Pathé (founded in 1896), Titanus (1904), Nordisk Film (1906),
    Universal and Paramount Pictures (both founded in 1912).'
  sentences:
  - 'Nordisk Film (or Nordisk Film Distribution, USA affiliate: Great Northern Film
    Company), established in Denmark in 1906 by Danish filmmaker Ole Olsen and also
    the oldest continuously active film studio in the world. It is the third oldest
    studio in the world behind the Gaumont Film Company and Pathé. Olsen started
    his company in the Copenhagen suburb of Valby under the name "Ole Olsen''s Film
    Factory" but soon changed it to the Nordisk Film Kompagni. In 1908, Olsen opened
    an affiliate branch in New York, the Great Northern Film Company, to handle distribution
    of his films to the American market. As Nordisk Film, it became a publicly traded
    company in 1911.'
  - The West Lodge, also known as the West Gate Lodge, to Cardiff Castle is a Grade
    II* listed building, currently used as a tea room, in the centre of Cardiff, Wales. It
    is approximately 100 m west of the Castle, with the Animal Wall running in-between.
  - 'Julmust (Swedish: "jul" "Yule" and "must " "not yet fermented juice of fruit
    or berries", though there is no such juice in "julmust") is a soft drink that
    is mainly consumed in Sweden around Christmas. During the other part of the year
    it is usually quite difficult to find in stores, but sometimes it is sold at other
    times of the year under the name "must". At Easter the name is påskmust (from
    "påsk ", "Easter" / "Paschal" ["q.v."]). The content is the same regardless of
    the marketing name, although the length of time it is stored before bottling differs;
    however, the beverage is more closely associated with Christmas, somewhat less
    with Easter and traditionally not at all with the summer. 45 million litres of
    "julmust" are consumed during December, which is around 50% of the total soft
    drink volume in December and 75% of the total yearly must sales.'
- source_sentence: 'Query: Finding Dory features the voice of an actress from "It''s
    Always Sunny in Philadelphia" who plays what character on that show?


    Context:


    1. William J. "Willem" Dafoe (born July 22, 1955) is an American actor. A member
    of the experimental theatre company the Wooster Group, he was nominated for the
    Academy Award for Best Supporting Actor for his roles as Elias in Oliver Stone''s
    "Platoon" (1986) and Max Schreck in the comedy-horror film "Shadow of the Vampire"
    (2000). His other film appearances include "The Last Temptation of Christ" (1988),
    "Mississippi Burning" (1988),"The English Patient" (1996), "American Psycho" (2000),
    the "Spider-Man" trilogy (2002–2007), "John Wick" (2014), "The Grand Budapest
    Hotel" (2014), and "Justice League" (2017). He has also had voice roles in "Finding
    Nemo" (2003) and its sequel "Finding Dory" (2016), "Fantastic Mr. Fox" (2009),
    "John Carter" (2012) and the recent adaptation of "Death Note" (2017).

    2. Gregory Grene is an American musician based in New York City, who grew up in
    Chicago and County Cavan, Ireland. He is the son of the classicist David Grene. With
    his band The Prodigals, he writes and plays a style of music that melds Irish
    traditional and rock influences, and he has also recorded a solo album, FlipSides
    (2008), with musicians ranging from John Doyle, former guitarist with Solas, to
    Tony Cedras, a multi-instrumentalist who has played in Paul Simon''s band since
    the Graceland tour. Grene''s music has received critical acclaim over the years,
    was included in the Rough Guide to Irish Music compilation, and was featured in
    the soundtrack for "Pride and Glory", a movie starring Ed Norton, Colin Farrell,
    Jon Voight and Noah Emmerich, as well as on television in the ABC show "Mercy"
    and the FX series "It''s Always Sunny in Philadelphia.

    3. Albert Lawrence Brooks (born Albert Lawrence Einstein; July 22, 1947) is an
    American actor, filmmaker, author, and comedian. He received an Academy Award
    nomination for Best Supporting Actor for 1987''s "Broadcast News" and was widely
    praised for his performance in the 2011 film "Drive". His voice acting credits
    include Marlin in "Finding Nemo" (2003) and "Finding Dory" (2016), and recurring
    guest voices for "The Simpsons", including Russ Cargill in "The Simpsons Movie"
    (2007). He has directed, written, and starred in several comedy films, such as
    "Modern Romance" (1981), "Lost in America" (1985), and "Defending Your Life" (1991). He
    is also the author of "2030: The Real Story of What Happens to America" (2011).'
  sentences:
  - Finding Dory is a 2016 American 3D computer-animated comedy adventure film produced
    by Pixar Animation Studios and released by Walt Disney Pictures. Directed by
    Andrew Stanton with co-direction by Angus MacLane, the screenplay was written
    by Stanton and Victoria Strouse. The film is a sequel/spinoff to 2003's "Finding
    Nemo" and features the returning voices of Ellen DeGeneres and Albert Brooks,
    with Hayden Rolence (replacing Alexander Gould), Ed O'Neill, Kaitlin Olson, Ty
    Burrell, Diane Keaton and Eugene Levy joining the cast. The film focuses on the
    amnesiac fish Dory, who journeys to be reunited with her parents.
  - Mali Finn (March 8, 1938 – November 28, 2007), born Mary Alice Mann, was an American
    casting director and former English and drama teacher. She cast numerous actors
    in successful films, including Edward Furlong, Leonardo DiCaprio, and Russell
    Crowe.
  - Dave Mader III (born June 30, 1955) is an American stock car racing driver from
    Maylene, Alabama. Winner of the 1978 Snowball Derby, he is a former competitor
    in all three of NASCAR's national touring series.
- source_sentence: 'Query: Which American journalist born 1893 is known for having
    an affair with a United Tax Court Judge?


    Context:


    1. The United States Revenue Act of 1924 (43 Stat. 253 ) (June 2, 1924), also
    known as the Mellon tax bill cut federal tax rates and established the U.S. Board
    of Tax Appeals, which was later renamed the United States Tax Court in 1942. The
    bill was named after U.S. Secretary of the Treasury Andrew Mellon.

    2. The United States Tax Court is composed of 19 members appointed by the President
    and confirmed by the Senate. Reappointment, when requested by a Tax Court judge
    is generally "pro forma" regardless of the political party of the appointing President
    and the political party of the re-appointing (sitting) President. By statute,
    Congress has granted to the President the power to remove the judges of the U.S.
    Tax Court "for inefficiency, neglect of duty, or malfeasance in office..."

    3. In a ruling issued in June 2015, Tax Court Judge Vito Bianco ruled that the
    hospital would be required to pay property taxes on nearly all of its 40 acres
    campus.

    4. Marion Janet Harron (September 3, 1903 – September 26, 1972) was a United States
    Tax Court judge (c.1936), and best known for having an affair with Lorena Hickok.'
  sentences:
  - '"PC Principal Final Justice" (also known as "PC Principal") is the tenth and
    final episode of the nineteenth season and the 267th overall episode of the animated
    television series "South Park", written and directed by series co-creator Trey
    Parker. The episode premiered on Comedy Central on December 9, 2015. It is the
    third and final part of a three-episode story arc that began with the episode
    "Sponsored Content" and continued in the episode "Truth and Advertising", which
    collectively serve as the season finale. The episode parodies the abundance of
    online advertising, as well as gun politics in the United States, as part of its
    season-long lampoon of political correctness.'
  - Lorena Alice Hickok (March 7, 1893 – May 1, 1968) was an American journalist known
    for her close romantic relationship with First Lady Eleanor Roosevelt.
  - The Tragedy of Julius Caesar is a tragedy by William Shakespeare, believed to
    have been written in 1599. It is one of several plays written by Shakespeare
    based on true events from Roman history, which also include "Coriolanus" and "Antony
    and Cleopatra".
- source_sentence: 'Query: What computer was developed first, the Matra or the Orao?


    Context:


    1. Alice is an open-source object-based educational programming language with
    an integrated development environment (IDE). Alice uses a drag and drop environment
    to create computer animations using 3D models. The software was developed first
    at University of Virginia in 1994, then Carnegie Mellon (from 1997), by a research
    group led by Randy Pausch.

    2. Mécanique Aviation Traction or Matra ("M"écanique "A"viation "TRA"ction) was
    a French company covering a wide range of activities mainly related to automobiles,
    bicycles, aeronautics and weaponry.'
  sentences:
  - Yakov Naumovich Pokhis, better known as Yakov Smirnoff (born 24 January 1951),
    is a Soviet-born American comedian, actor and writer. After emigrating to the
    United States in 1977, Smirnoff began performing as a stand-up comic. He reached
    his biggest success in the mid-to-late 1980s, appearing in several films and the
    television sitcom vehicle "What a Country! ". His comic persona was of a naive
    immigrant from the Soviet Union who was perpetually confused and delighted by
    life in the United States. His humor combined a mockery of life under Communism
    and of consumerism in the United States, as well as word play caused by misunderstanding
    of American phrases and culture, all punctuated by the catchphrase, "And I thought,
    'What a country!'
  - Guardians of Order was a Canadian company founded in 1996 by Mark C. MacKinnon
    in Guelph, Ontario. The company's business output consisted of role-playing games
    (RPGs). Their first game is the anime inspired "Big Eyes, Small Mouth". In 2006
    Guardians of Order ceased operations due to overwhelming debt.
  - Orao (en. "Eagle") was an 8-bit computer developed by PEL Varaždin in 1984. Its
    marketing and distribution was done by "Velebit Informatika". It was used as
    a standard primary school computer in Croatia and Vojvodina from 1985 to 1991.
- source_sentence: 'Query: Are Quarto and Strange Synergy both types of games?


    Context:


    1. An exotic star is a hypothetical compact star composed of something other than
    electrons, protons, neutrons, or muons; and balanced against gravitational collapse
    by degeneracy pressure or other quantum properties. These include quark stars
    (composed of quarks) and perhaps strange stars (based upon strange quark matter,
    a condensate of up, down and strange quarks), as well as speculative preon stars
    (composed of preons, which are hypothetical particles and "building blocks" of
    quarks, if quarks prove to be decomposable into component sub-particles). Of
    the various types of exotic star proposed, the most well evidenced and understood
    is the quark star.

    2. Philip Lavery (born August 17, 1990 in Dublin) is an Irish racing cyclist who
    most recently rode for the Synergy Baku team. Lavery won the 2010 Tour of the
    North and won a bronze medal at the 2010 Commonwealth Games in India, as part
    of the Northern Irish team pursuit squad. During the summer of 2013, Lavery joined
    the Cofidis team as a "stagiaire", after taking several victories in French domestic
    racing.

    3. Quarto is a board game for two players invented by Swiss mathematician Blaise
    Müller in 1991.'
  sentences:
  - The Lacy Dog or Blue Lacy Dog is a breed of working dog that originated in Texas
    in the mid-19th century. The Lacy was first recognized in 2001 by the Texas Senate. In
    Senate Resolution No. 436, the 77th Legislature honored the Lacy as "a true Texas
    breed". In June 2005, Governor Rick Perry signed the legislation adopting the
    Blue Lacy as "the official State Dog Breed of Texas". As expected, the vast majority
    of Lacy dogs are found in Texas. However, as the breed becomes more well recognized,
    there are breeding populations being established across the United States, Canada,
    and most recently in Europe.
  - 'Holingol (a.k.a. Huolin Gol; Mongolian: ᠬᠣᠣᠯᠢᠠ ᠭᠣᠤᠯ ᠬᠣᠲᠠ (Хоолингол хот); Chinese:
    霍林郭勒 "Huolinguole") is a county-level city of Inner Mongolia, China.'
  - Strange Synergy is a card game published by Steve Jackson Games in which players
    build a team of super heroes to battle an opponent's team.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on embaas/sentence-transformers-e5-large-v2
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: eval split
      type: eval_split
    metrics:
    - type: cosine_accuracy@1
      value: 0.923
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9865
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.991
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9945
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.923
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32883333333333326
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19820000000000004
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09945000000000001
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.923
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9865
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.991
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9945
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9651637010519428
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9551150793650793
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9553006088171921
      name: Cosine Map@100
---

# SentenceTransformer based on embaas/sentence-transformers-e5-large-v2

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [embaas/sentence-transformers-e5-large-v2](https://huggingface.co/embaas/sentence-transformers-e5-large-v2). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
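
As a rough illustration of the semantic search use case, the sketch below ranks candidate passages against a query by cosine similarity. The three-dimensional vectors are made-up stand-ins for the 1024-dimensional embeddings the model would produce, and `cosine` and `rank_passages` are hypothetical helper names, not part of the sentence-transformers API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return (index, score) pairs for the top_k most similar passages."""
    scored = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy stand-ins for real embeddings (assumption: real vectors have 1024 dims).
query = [1.0, 0.0, 1.0]
passages = [
    [0.9, 0.1, 1.1],    # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],    # orthogonal to the query
    [-1.0, 0.0, -1.0],  # points in the opposite direction
]

ranking = rank_passages(query, passages, top_k=2)
print(ranking)
```

With real embeddings, the passage vectors would typically be computed once and reused across queries.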

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [embaas/sentence-transformers-e5-large-v2](https://huggingface.co/embaas/sentence-transformers-e5-large-v2) <!-- at revision 86001e787b4f6bda8cdc8c2095c0493dd135484e -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: PeftModelForFeatureExtraction
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
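
The last two modules can be sketched in plain Python. Mean pooling (`pooling_mode_mean_tokens: True`) averages the token vectors that are not padding, and `Normalize()` rescales the result to unit length, so a dot product between two outputs equals their cosine similarity. This is a simplified illustration, not the library's implementation: it assumes padding is marked by a 0/1 attention mask, `mean_pool` and `l2_normalize` are hypothetical names, and the 2-dimensional token vectors stand in for 1024-dimensional ones.

```python
import math

def mean_pool(token_vecs, attention_mask):
    """Average the token vectors whose mask entry is 1; padding is skipped."""
    dim = len(token_vecs[0])
    kept = [vec for vec, m in zip(token_vecs, attention_mask) if m == 1]
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Three token vectors; the last one is padding and must not affect the result.
tokens = [[1.0, 3.0], [5.0, 5.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)     # average of the first two tokens
embedding = l2_normalize(pooled)     # unit-length sentence embedding
print(pooled, embedding)
```

Because every output has unit length, downstream code can use a plain dot product where cosine similarity is called for.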