CrossEncoder based on aubmindlab/bert-base-arabertv2

This is a Cross Encoder model fine-tuned from aubmindlab/bert-base-arabertv2 using the sentence-transformers library. It computes a score for each pair of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("yoriis/arabert-tydi-quqa-task-ar-v2")
# Get scores for pairs of texts
pairs = [
    ['هل ذكر القرآن أن التوراة تم تحريفها؟', 'يا أيها الرسول لا يحزنك الذين يسارعون في الكفر من الذين قالوا آمنا بأفواههم ولم تؤمن قلوبهم ومن الذين هادوا سماعون للكذب سماعون لقوم آخرين لم يأتوك يحرفون الكلم من بعد مواضعه يقولون إن أوتيتم هذا فخذوه وإن لم تؤتوه فاحذروا ومن يرد الله فتنته فلن تملك له من الله شيئا أولئك الذين لم يرد الله أن يطهر قلوبهم لهم في الدنيا خزي ولهم في الآخرة عذاب عظيم. سماعون للكذب أكالون للسحت فإن جاءوك فاحكم بينهم أو أعرض عنهم وإن تعرض عنهم فلن يضروك شيئا وإن حكمت فاحكم بينهم بالقسط إن الله يحب المقسطين. وكيف يحكمونك وعندهم التوراة فيها حكم الله ثم يتولون من بعد ذلك وما أولئك بالمؤمنين.'],
    ['بماذا شبه الله الذي يُتبع الحسنة بالأذى؟', 'فمن أظلم ممن كذب على الله وكذب بالصدق إذ جاءه أليس في جهنم مثوى للكافرين. والذي جاء بالصدق وصدق به أولئك هم المتقون. لهم ما يشاءون عند ربهم ذلك جزاء المحسنين. ليكفر الله عنهم أسوأ الذي عملوا ويجزيهم أجرهم بأحسن الذي كانوا يعملون. أليس الله بكاف عبده ويخوفونك بالذين من دونه ومن يضلل الله فما له من هاد. ومن يهد الله فما له من مضل أليس الله بعزيز ذي انتقام.'],
    ['هل هناك إشارات في القرآن عن نهاية الكيان الصهيوني؟', 'وهل أتاك حديث موسى. إذ رأى نارا فقال لأهله امكثوا إني آنست نارا لعلي آتيكم منها بقبس أو أجد على النار هدى. فلما أتاها نودي يا موسى. إني أنا ربك فاخلع نعليك إنك بالواد المقدس طوى. وأنا اخترتك فاستمع لما يوحى. إنني أنا الله لا إله إلا أنا فاعبدني وأقم الصلاة لذكري. إن الساعة آتية أكاد أخفيها لتجزى كل نفس بما تسعى. فلا يصدنك عنها من لا يؤمن بها واتبع هواه فتردى.'],
    ['لماذا حرم الله التبني؟', 'وقالوا هذه أنعام وحرث حجر لا يطعمها إلا من نشاء بزعمهم وأنعام حرمت ظهورها وأنعام لا يذكرون اسم الله عليها افتراء عليه سيجزيهم بما كانوا يفترون. وقالوا ما في بطون هذه الأنعام خالصة لذكورنا ومحرم على أزواجنا وإن يكن ميتة فهم فيه شركاء سيجزيهم وصفهم إنه حكيم عليم. قد خسر الذين قتلوا أولادهم سفها بغير علم وحرموا ما رزقهم الله افتراء على الله قد ضلوا وما كانوا مهتدين.'],
    ['مقاتلو داعش مثلا أو المفسدون في الأرض من التنظيمات الإرهابية، يتوضؤون أيضا، فهل هذا يجعلهم أطهارا؟', 'كيف يكون للمشركين عهد عند الله وعند رسوله إلا الذين عاهدتم عند المسجد الحرام فما استقاموا لكم فاستقيموا لهم إن الله يحب المتقين. كيف وإن يظهروا عليكم لا يرقبوا فيكم إلا ولا ذمة يرضونكم بأفواههم وتأبى قلوبهم وأكثرهم فاسقون. اشتروا بآيات الله ثمنا قليلا فصدوا عن سبيله إنهم ساء ما كانوا يعملون. لا يرقبون في مؤمن إلا ولا ذمة وأولئك هم المعتدون. فإن تابوا وأقاموا الصلاة وآتوا الزكاة فإخوانكم في الدين ونفصل الآيات لقوم يعلمون.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    pairs[0][0],  # the first query from the pairs above
    [passage for _, passage in pairs],  # the five passages from the pairs above
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
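The list returned by model.rank can be filtered before it is passed on. A minimal sketch, assuming each entry is a dict with corpus_id and score keys as in the comment above; select_passages, top_k and min_score are hypothetical names, the scores shown are made up for illustration, and the 0.5 threshold is an arbitrary placeholder rather than a value recommended for this model:

```python
from typing import Dict, List, Tuple, Union

RankEntry = Dict[str, Union[int, float]]


def select_passages(
    ranks: List[RankEntry],
    passages: List[str],
    top_k: int = 3,
    min_score: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return up to top_k (passage, score) tuples whose score is at least min_score."""
    # Sort defensively, in case the entries are not already ordered by score.
    ordered = sorted(ranks, key=lambda entry: entry["score"], reverse=True)
    kept = [entry for entry in ordered if entry["score"] >= min_score][:top_k]
    # corpus_id indexes into the list of passages that was ranked.
    return [(passages[int(entry["corpus_id"])], float(entry["score"])) for entry in kept]


# Illustrative values only, not real model output.
example_ranks = [
    {"corpus_id": 2, "score": 0.07},
    {"corpus_id": 0, "score": 0.93},
    {"corpus_id": 1, "score": 0.61},
]
example_passages = ["passage A", "passage B", "passage C"]

selected = select_passages(example_ranks, example_passages, top_k=2)
print(selected)
```

A fixed threshold only makes sense if the scores are on a known scale, so it is worth checking the range of model.predict outputs on held-out pairs before choosing one.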

Training Details

Training Dataset

Unnamed Dataset

  • Size: 7,756 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 11 characters, mean 41.94 characters, max 201 characters
    • sentence_1 (string): min 53 characters, mean 344.11 characters, max 1086 characters
    • label (float): min 0.0, mean 0.16, max 1.0
  • Samples:
    • sentence_0: هل ذكر القرآن أن التوراة تم تحريفها؟
      sentence_1: يا أيها الرسول لا يحزنك الذين يسارعون في الكفر من الذين قالوا آمنا بأفواههم ولم تؤمن قلوبهم ومن الذين هادوا سماعون للكذب سماعون لقوم آخرين لم يأتوك يحرفون الكلم من بعد مواضعه يقولون إن أوتيتم هذا فخذوه وإن لم تؤتوه فاحذروا ومن يرد الله فتنته فلن تملك له من الله شيئا أولئك الذين لم يرد الله أن يطهر قلوبهم لهم في الدنيا خزي ولهم في الآخرة عذاب عظيم. سماعون للكذب أكالون للسحت فإن جاءوك فاحكم بينهم أو أعرض عنهم وإن تعرض عنهم فلن يضروك شيئا وإن حكمت فاحكم بينهم بالقسط إن الله يحب المقسطين. وكيف يحكمونك وعندهم التوراة فيها حكم الله ثم يتولون من بعد ذلك وما أولئك بالمؤمنين.
      label: 1.0
    • sentence_0: بماذا شبه الله الذي يُتبع الحسنة بالأذى؟
      sentence_1: فمن أظلم ممن كذب على الله وكذب بالصدق إذ جاءه أليس في جهنم مثوى للكافرين. والذي جاء بالصدق وصدق به أولئك هم المتقون. لهم ما يشاءون عند ربهم ذلك جزاء المحسنين. ليكفر الله عنهم أسوأ الذي عملوا ويجزيهم أجرهم بأحسن الذي كانوا يعملون. أليس الله بكاف عبده ويخوفونك بالذين من دونه ومن يضلل الله فما له من هاد. ومن يهد الله فما له من مضل أليس الله بعزيز ذي انتقام.
      label: 0.0
    • sentence_0: هل هناك إشارات في القرآن عن نهاية الكيان الصهيوني؟
      sentence_1: وهل أتاك حديث موسى. إذ رأى نارا فقال لأهله امكثوا إني آنست نارا لعلي آتيكم منها بقبس أو أجد على النار هدى. فلما أتاها نودي يا موسى. إني أنا ربك فاخلع نعليك إنك بالواد المقدس طوى. وأنا اخترتك فاستمع لما يوحى. إنني أنا الله لا إله إلا أنا فاعبدني وأقم الصلاة لذكري. إن الساعة آتية أكاد أخفيها لتجزى كل نفس بما تسعى. فلا يصدنك عنها من لا يؤمن بها واتبع هواه فتردى.
      label: 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
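For reference, binary cross-entropy on a raw logit can be written out by hand. The sketch below assumes the loss follows the standard formulation, in which the Identity activation leaves the model output as a logit and the sigmoid is folded into the loss; bce_with_logits is a hypothetical helper, not part of sentence-transformers, and its pos_weight argument mirrors the usual positive-class weighting:

```python
import math
from typing import Optional


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_with_logits(logit: float, label: float, pos_weight: Optional[float] = None) -> float:
    """Binary cross-entropy for one example, computed from the raw logit.

    Uses -log(sigmoid(z)) = softplus(-z) and -log(1 - sigmoid(z)) = softplus(z).
    """
    weight = 1.0 if pos_weight is None else pos_weight
    return weight * label * softplus(-logit) + (1.0 - label) * softplus(logit)


loss_uncertain = bce_with_logits(0.0, 1.0)   # logit 0 means probability 0.5
loss_confident = bce_with_logits(5.0, 1.0)   # confident and correct
loss_wrong = bce_with_logits(5.0, 0.0)       # confident and wrong
print(loss_uncertain, loss_confident, loss_wrong)
```

With a mean label of 0.16 in the statistics above, positives are the minority class; a pos_weight greater than 1 is one common way to upweight them, although this card reports pos_weight as null.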
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.5400 500 0.0274
1.0799 1000 0.0003
1.6199 1500 0.0001
2.1598 2000 0.0001
2.6998 2500 0.0001
0.7418 500 0.9666
1.4837 1000 0.3318
2.2255 1500 0.2711
2.9674 2000 0.2051
1.0309 500 0.3163
2.0619 1000 0.2196
1.0309 500 0.1761
2.0619 1000 0.129
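The epoch values in the last four rows can be checked against the dataset size and batch size reported above. A short sketch of the arithmetic, assuming a single device, gradient_accumulation_steps of 1 and a final partial batch that is kept (dataloader_drop_last is False); the earlier rows do not match these numbers, so they are left out of the check:

```python
import math

train_samples = 7756   # from "Size: 7,756 training samples"
batch_size = 16        # per_device_train_batch_size

# Number of optimizer steps per epoch when the last partial batch is kept.
steps_per_epoch = math.ceil(train_samples / batch_size)


def epoch_at(step: int) -> float:
    """Fractional epoch reached after the given number of steps, rounded as in the log."""
    return round(step / steps_per_epoch, 4)


logged = {step: epoch_at(step) for step in (500, 1000)}
print(steps_per_epoch, logged)
```

Under these assumptions the script gives 485 steps per epoch and epochs 1.0309 and 2.0619 at steps 500 and 1000, which agrees with the final rows of the table.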

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.53.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.9.0
  • Datasets: 2.14.4
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model size: 135M params (Safetensors, tensor type F32)