## facebook/tart-full-flan-t5-xl

`facebook/tart-full-flan-t5-xl` is a multi-task cross-encoder model trained via instruction tuning on approximately 40 retrieval tasks. It is initialized from [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl).

### Installation

```
git clone https://github.com/facebookresearch/tart
cd tart
pip install -r requirements.txt
cd TART
```

TART-full can be loaded through our customized EncT5 model.

```python
from src.modeling_enc_t5 import EncT5ForSequenceClassification
from src.tokenization_enc_t5 import EncT5Tokenizer
import numpy as np
import torch
import torch.nn.functional as F

# Load TART-full and its tokenizer.
model = EncT5ForSequenceClassification.from_pretrained("tart_full_flan_t5_xl")
tokenizer = EncT5Tokenizer.from_pretrained("tart_full_flan_t5_xl")

model.eval()

q = "What is the population of Tokyo?"
in_answer = "retrieve a passage that answers this question from Wikipedia"

p_1 = "The population of Japan's capital, Tokyo, dropped by about 48,600 people to just under 14 million at the start of 2022."
p_2 = "Tokyo, officially the Tokyo Metropolis (東京都, Tōkyō-to), is the capital and largest city of Japan."

# 1. TART-full can identify the more relevant paragraph.
# Each input pairs "<instruction> [SEP] <query>" with one candidate text.
features = tokenizer(['{0} [SEP] {1}'.format(in_answer, q), '{0} [SEP] {1}'.format(in_answer, q)], [p_1, p_2], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**features).logits
    # Index 1 of the softmax output is used as the relevance score.
    normalized_scores = [float(score[1]) for score in F.softmax(scores, dim=1)]

print([p_1, p_2][np.argmax(normalized_scores)])
# "The population of Japan's capital, Tokyo, dropped by about 48,600 people to just under 14 million at the start of 2022."

# 2. TART-full can identify the document that is more relevant AND follows instructions.
in_sim = "You need to find duplicated questions in Wiki forum. Could you find a question that is similar to this question"
q_1 = "How many people live in Tokyo?"
features = tokenizer(['{0} [SEP] {1}'.format(in_sim, q), '{0} [SEP] {1}'.format(in_sim, q)], [p_1, q_1], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**features).logits
    normalized_scores = [float(score[1]) for score in F.softmax(scores, dim=1)]

print([p_1, q_1][np.argmax(normalized_scores)])
# "How many people live in Tokyo?"
```
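The scoring step in the example above generalizes to any number of candidates. The following is a minimal sketch of that post-processing in plain Python, with no model required. The helper names `build_query`, `relevance_scores`, and `rank_candidates` are hypothetical and not part of the TART repository, and the logits are made-up placeholder values, not real model output. It assumes, as the example does, that index 1 of each logit pair is the relevance class.

```python
import math

SEP = " [SEP] "  # separator between instruction and query, as in the example above


def build_query(instruction, query):
    """Join an instruction and a query into one cross-encoder input string."""
    return f"{instruction}{SEP}{query}"


def relevance_scores(logits):
    """Apply a two-way softmax to each logit pair and keep the index-1 probability."""
    scores = []
    for neg, pos in logits:
        m = max(neg, pos)  # subtract the max for numerical stability
        e_neg, e_pos = math.exp(neg - m), math.exp(pos - m)
        scores.append(e_pos / (e_neg + e_pos))
    return scores


def rank_candidates(candidates, logits):
    """Return (candidate, score) pairs sorted from most to least relevant."""
    scores = relevance_scores(logits)
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)


# Placeholder logits for illustration only (NOT produced by TART-full).
toy_logits = [[-1.0, 2.0], [0.5, 0.0], [0.0, 0.0]]
toy_candidates = ["passage A", "passage B", "passage C"]

ranked = rank_candidates(toy_candidates, toy_logits)
for text, score in ranked:
    print(f"{score:.3f}  {text}")
```

In practice, `logits` would come from `model(**features).logits.tolist()`, and `build_query(in_answer, q)` would be repeated once per candidate when calling the tokenizer.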