---
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:64000
- loss:DenoisingAutoEncoderLoss
widget:
- source_sentence: đ‘€Ÿā¤šā¤¨đ‘€™đ‘€ĸ𑀟 đ‘€žā¤šđ‘€Ēā¤šđ‘€ ā¤š đ‘€Ģđ‘Ŗā¤Ēđ‘Ŗ đ‘€žā¤¨đ‘€ ā¤š đ‘€žđ‘Ŗđ‘€ąā¤š ā¤Ŧđ‘€ĸđ‘€Ēđ‘€ ā¤šđ‘€¯
  sentences:
  - ' ⤪⤚ ā¤Ŧđ‘€ĸđ‘€Ēđ‘€ ā¤š ā¤Ē⤚đ‘€Ēđ‘Ļ đ‘€Ŗā¤š đ‘€ ā¤šđ‘€Ģ⤚đ‘€ĸ𑀲đ‘€ĸ⤪⤚đ‘€Ēđ‘€ŗā¤š đ‘€Ŗā¤š ā¤ā¤šđ‘€Ÿđ‘Ļđ‘€Ÿđ‘€ŗā¤š ā¤žā¤šā¤Ŗā¤šđ‘€Ļ đ‘€žā¤šđ‘€ ā¤šđ‘€Ē ā¤Ŗā¤šđ‘€Ŗđ‘€Ŗā¤š đ‘€ ā¤šđ‘€Ģ⤚đ‘€ĸ𑀲đ‘€ĸđ‘€Ÿđ‘€ŗā¤š ⤪⤚ ā¤ĸ⤚đ‘€Ē đ‘€ĸ⤪⤚⤞đ‘€ĸđ‘€¯'
  - ' đ‘€Ŗā¤šđ‘€Ÿā¤Ŧā¤šđ‘€Ÿđ‘Ļ đ‘€Ŗā¤š đ‘€Ÿā¤šā¤¨đ‘€™đ‘€ĸ𑀟 đ‘€ đ‘Ŗā¤Ē⤚đ‘€Ēđ‘€Ļ ā¤Ēā¤šđ‘€Ÿā¤š đ‘€ĸ⤪⤚ đ‘€¤ā¤šđ‘€ ā¤š ā¤ĸ⤚ā¤ĸā¤ĸ⤚ đ‘€žđ‘Ŗ đ‘€žā¤šđ‘€Ēā¤šđ‘€ ā¤š đ‘€ĸđ‘€Ŗā¤šđ‘€Ÿ ā¤šđ‘€žā¤š đ‘€žđ‘€ąā¤šā¤Ēā¤šđ‘€Ÿā¤Ē⤚ đ‘€Ŗā¤š đ‘€ đ‘Ŗā¤Ē⤚đ‘€Ē đ‘€Ŗā¤šā¤¨đ‘€žā¤šđ‘€Ē đ‘€Ģđ‘Ŗā¤Ēđ‘Ŗ đ‘€Ŗā¤š đ‘€ŗā¤¨ā¤–đ‘€Ļ đ‘€žā¤¨đ‘€ ā¤š ⤪⤚ 𑀲đ‘€ĸ đ‘€Ÿā¤š đ‘€žđ‘Ŗđ‘€ąā¤š ā¤Ŧđ‘€ĸđ‘€Ēđ‘€ ā¤šđ‘€¯'
  - ā¤Ē⤚đ‘€Ēđ‘Ļ𑀠đ‘€ĸ ⤪⤚ ā¤ĸ⤍ā¤Ŧ⤚ đ‘€ąā¤š ā¤ā¤¨đ‘€Ÿā¤Ŧđ‘€ĸ⤪⤚đ‘€Ē ā¤đ‘€ąā¤šā¤˛ā¤˛đ‘Ŗđ‘€Ÿ ā¤ā¤šđ‘€˛ā¤š ā¤Ē⤚ ā¤žā¤šā¤˛đ‘€ĸā¤ĸđ‘€ĸ𑀟 ā¤ā¤šđ‘€ŗā¤šđ‘€Ē đ‘€ĸđ‘€Ēā¤šđ‘€Ÿ ⤚ ā¤Ŧā¤šđ‘€ŗā¤šđ‘€Ē ā¤Ē⤍đ‘€Ē𑀞đ‘€ĸ⤪⤪⤚ đ‘€žā¤¨đ‘€ ā¤š ⤪⤚ ⤤đ‘€ĸ đ‘€ąā¤š ā¤ā¤¨đ‘€Ÿā¤Ŧđ‘€ĸ⤪⤚đ‘€Ē đ‘€žđ‘€ąā¤šā¤˛ā¤˛ā¤šā¤Ŗđ‘Ļ ā¤Ĩđ‘€¯
- source_sentence: ā¤Ŗā¤šđ‘€Ÿā¤š ā¤Ŧ⤚ā¤ĸ⤚ đ‘€Ŗā¤š ⤞⤍đ‘€Ē⤚ đ‘€Ŗā¤š đ‘€Ŗā¤š ā¤Ē⤚ 𑀲đ‘€ĸ đ‘€Ŗā¤š
  sentences:
  - đ‘€˜đ‘Ŗđ‘€Ģ𑀟 𑀠đ‘€ĸ⤤đ‘€Ģ⤚đ‘Ļ⤞ đ‘Ŗā¤Ŧđ‘€ĸđ‘€Ŗđ‘€ĸ đ‘€ā¤šđ‘€Ÿ đ‘€Ģ⤚đ‘€ĸ𑀲đ‘Ļđ‘€ŗđ‘€Ģđ‘€ĸ đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē 𑀗 ā¤Ŧ⤚ đ‘€ąā¤šā¤Ēā¤šđ‘€Ÿ đ‘€Ŗđ‘€ĸđ‘€ŗā¤šđ‘€ ā¤ĸ⤚đ‘€Ļ 𑀭ā¤Ĩ𑀖ā¤Ĩđ‘€Žđ‘€¯
  - ' đ‘€ąā¤šđ‘€Ÿđ‘€Ÿā¤šđ‘€Ÿ ā¤Ŗā¤šđ‘€Ÿā¤š ā¤Ē⤚đ‘€ĸđ‘€ ā¤šđ‘€žā¤š đ‘€ąā¤š ā¤đ‘€ąā¤šđ‘€Ē⤚đ‘€Ēđ‘€Ēā¤¨đ‘€Ÿ đ‘€Ģđ‘€Ē đ‘€ŗā¤¨ ⤤đ‘€ĸ ā¤Ŧ⤚ā¤ĸ⤚ đ‘€Ŗā¤š ⤞⤍đ‘€Ē⤚ đ‘€Ŗā¤š đ‘€Ŗā¤¨đ‘€ž ā¤ĸā¤¨ā¤žā¤šā¤žā¤žđ‘Ļ𑀟 ā¤šā¤Ŗā¤Ŗā¤¨đ‘€žā¤šđ‘€Ÿđ‘€ŗā¤¨ đ‘€Ŗā¤š đ‘€ ā¤šđ‘€ŗā¤¨ 𑀟đ‘Ļđ‘€ ā¤š ā¤Ē⤚ đ‘€Ģā¤šđ‘€Ÿā¤Ŗā¤šđ‘€Ē đ‘€Ŗā¤š ā¤Ē⤚ 𑀲đ‘€ĸ đ‘€ŗā¤šā¤¨đ‘€Ēđ‘€ĸ đ‘€Ŗā¤š đ‘€ŗā¤šā¤¨ā¤đ‘€ĸ 𑀲đ‘€ĸ⤪đ‘Ļ đ‘€Ŗā¤š đ‘€Ŗā¤šđ‘€¯'
  - ' ⤚ đ‘€žā¤šđ‘€Ēđ‘€žā¤šđ‘€ŗđ‘€Ģđ‘€ĸ𑀟 đ‘€Ŗđ‘Ŗđ‘€žā¤šđ‘€Ēđ‘€Ļ đ‘€ ā¤šđ‘€˜ā¤šā¤˛đ‘€ĸđ‘€ŗā¤šđ‘€Ē ā¤˛ā¤šā¤¨ā¤Ŗđ‘Ŗā¤Ŗđ‘€ĸ𑀟 đ‘€ĸđ‘€Ÿđ‘€Ŗđ‘€ĸ⤪⤚ đ‘€ĸā¤Ē⤚ ⤤đ‘Ļ ā¤ĸ⤚ā¤ĸā¤ĸ⤚đ‘€Ē đ‘€Ģā¤¨đ‘€žā¤¨đ‘€ ā¤šđ‘€Ē đ‘€žā¤¨ā¤˛ā¤š đ‘€Ŗā¤š đ‘€Ģ⤚đ‘€Ēđ‘€žđ‘Ŗđ‘€žđ‘€ĸ𑀟 đ‘€ŗđ‘€Ģ⤚đ‘€Ēđ‘€ĸđ‘€™ā¤š ⤚ đ‘€ĸđ‘€Ÿđ‘€Ŗđ‘€ĸ⤪⤚ đ‘€Ŗā¤š đ‘€žā¤¨đ‘€ ā¤š ā¤Ē⤚ā¤ĸā¤ĸ⤚ā¤Ē⤚đ‘€Ē đ‘€Ŗā¤š ā¤ĸđ‘€ĸ𑀟 đ‘€Ŗđ‘Ŗđ‘€žā¤š đ‘€Ŗā¤š 𑀞đ‘€ĸ⤪⤚⤪đ‘Ļ đ‘€žā¤šđ‘€™đ‘€ĸđ‘€Ŗđ‘Ŗđ‘€˜đ‘€ĸ𑀟 đ‘€žđ‘€ąā¤šđ‘€Ē⤚đ‘€Ēđ‘€Ē⤍ ā¤Ē⤚ đ‘€Ģā¤šđ‘€Ÿā¤Ŗā¤šđ‘€Ē đ‘€žđ‘€ąā¤šđ‘€Ē⤚đ‘€Ēđ‘€Ēā¤¨đ‘€Ÿ ⤞⤚⤍⤪⤚ ⤚ đ‘€žā¤šđ‘€ŗā¤šđ‘€Ēđ‘€¯'
- source_sentence: đ‘€Ŗā¤¨ā¤ĸ⤚ ā¤ĸā¤ĸā¤¤đ‘€• đ‘€ ā¤šđ‘€ ā¤šđ‘€Ē ⤚⤞⤚ā¤Ēđ‘Ŗā¤¨đ‘€ đ‘€ĸ
  sentences:
  - đ‘€Ŗā¤¨ā¤ĸ⤚ đ‘€žā¤¨đ‘€ ā¤š đ‘€Ŗđ‘Ļđ‘€Ÿđ‘€žā¤ˇđ‘€Ŗđ‘Ļđ‘€Ÿđ‘€žđ‘€ ā¤šđ‘€Ÿā¤šđ‘€¤ā¤šđ‘€Ēā¤Ē⤚ ā¤ĸā¤ĸā¤¤đ‘€• đ‘€ ā¤šđ‘€ ā¤šđ‘€Ē đ‘€žā¤šđ‘€ŗđ‘€ŗđ‘Ļ⤪ ⤚⤞⤚ā¤Ēđ‘Ŗā¤¨đ‘€ đ‘€ĸ đ‘€¯
  - ' ā¤šđ‘€Ÿ đ‘€˛ā¤šđ‘€Ē⤚ đ‘€ŗā¤šđ‘€ ā¤šđ‘€Ēđ‘€ąā¤š đ‘€žā¤¨đ‘€ ā¤š đ‘€Ŗā¤šā¤Ŧ⤚ ā¤ĸ⤚⤪⤚ ā¤šđ‘€Ÿ đ‘€˛ā¤šđ‘€Ŗā¤šđ‘€Ŗā¤š ā¤šā¤Ŗā¤Ŗā¤¨đ‘€žā¤šđ‘€Ÿ ā¤Ŧ⤚ đ‘€ŗā¤šā¤¨đ‘€Ēā¤šđ‘€Ÿ đ‘€ĸ⤪⤚⤞⤚đ‘€ĸ đ‘€Ÿā¤š đ‘€Ÿā¤šđ‘€˜đ‘Ļđ‘€Ēđ‘€ĸ⤪⤚ đ‘€ ā¤šđ‘€ŗā¤¨ ⤪⤚đ‘€Ēā¤šđ‘€¯'
  - ' đ‘€Ģ⤚đ‘Ĩā¤šđ‘€žā¤š ā¤šđ‘€¤ā¤šā¤ĸā¤Ē⤚đ‘€Ēđ‘€ąā¤š ā¤Ŗā¤šđ‘€Ÿā¤š đ‘€Ŗā¤š đ‘€ąā¤šđ‘€Ģ⤚⤞⤚ đ‘€ ā¤¨đ‘€ŗā¤šđ‘€ đ‘€ ā¤šđ‘€Ÿ ⤚ ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 ā¤šā¤Ŗā¤Ŗā¤¨đ‘€žā¤šđ‘€Ÿ ā¤Ŗā¤šā¤đ‘€ĸ đ‘€Ŗā¤š ā¤Ēā¤šđ‘€ąā¤šā¤Ŧ⤚đ‘€Ēđ‘€¯'
- source_sentence: ā¤šđ‘€Ÿ
  sentences:
  - đ‘€ ā¤¨ā¤Ēā¤¨đ‘€ąā¤š ⤚ đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē ⤰ ā¤Ŧ⤚ đ‘€ąā¤šā¤Ēā¤šđ‘€Ÿ đ‘€ ā¤šā¤Ŗā¤¨đ‘€Ÿ ā¤ đ‘€§đ‘€§ā¤ đ‘€Ļ ā¤šđ‘€žā¤¨ đ‘€Ÿā¤š ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€˛ā¤šđ‘€ŗđ‘€ĸđ‘€Ÿđ‘€˜đ‘Ŗđ‘€˜đ‘€ĸ đ‘€Ŧ𑀧 đ‘€Ŗā¤š 𑀞đ‘Ļ ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€ąā¤šđ‘€Ÿđ‘€ĸ 𑀘đ‘€ĸđ‘€Ēā¤Ŧđ‘€ĸ𑀟 đ‘€Ŗā¤š ⤪⤚ ⤪đ‘€ĸ đ‘€Ģ⤚ā¤Ēđ‘€ŗā¤šđ‘€Ēđ‘€ĸ𑀟 𑀠đ‘€ĸ𑀟ā¤Ēā¤¨đ‘€Ÿā¤š đ‘€žā¤šā¤žā¤šđ‘€Ÿ ā¤ĸā¤šā¤Ŗā¤šđ‘€Ÿ ā¤Ēā¤šđ‘€ŗđ‘€Ģđ‘€ĸđ‘€Ÿđ‘€ŗā¤š ⤚ đ‘€žā¤šđ‘€Ÿđ‘Ŗđ‘€¯
  - ' ā¤šđ‘€Ÿ ⤪đ‘€ĸ đ‘€ĸđ‘€ ā¤šđ‘€Ÿđ‘€ĸ𑀟 đ‘€ŗđ‘€¯'
  - ' đ‘€˛ā¤šđ‘€Ģā¤šđ‘€Ŗ ⤪⤚ đ‘€žā¤šđ‘€ đ‘€ ā¤šā¤˛ā¤š đ‘€žā¤šđ‘€žā¤šđ‘€Ē ā¤ đ‘€§đ‘€­ā¤ ā¤Ÿđ‘€­đ‘€° đ‘€Ŗā¤š đ‘€žđ‘€ąā¤šā¤˛ā¤˛ā¤šā¤Ŗđ‘Ļ 𑀭𑀧 đ‘€ ā¤šđ‘€ŗā¤¨ ā¤ĸā¤šđ‘€Ÿ đ‘€ŗđ‘€Ģā¤šđ‘€™ā¤šđ‘€ąā¤š ⤚ đ‘€ąā¤šđ‘€ŗā¤šđ‘€Ÿđ‘€Ÿđ‘€ĸ ⤠đ‘ĸ ⤚ đ‘€Ŗā¤¨đ‘€ž ā¤Ŧā¤šđ‘€ŗā¤šđ‘€¯'
- source_sentence: ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ đ‘€ ā¤šđ‘€˛đ‘€ĸ đ‘€ ā¤šđ‘€Ģđ‘€ĸđ‘€ đ‘€ ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ē⤚đ‘€ĸđ‘€ ā¤šđ‘€žđ‘Ŗđ‘€Ÿ đ‘€Ŗā¤š đ‘€˛ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸ⤚ đ‘€ ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ⤪⤚⤪đ‘€ĸ𑀟
  sentences:
  - ā¤šđ‘€ đ‘€ĸ𑀟ā¤Ē⤚⤤⤤đ‘€ĸ⤪⤚ ⤚ ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ŗđ‘Ļđ‘€Ēđ‘€ĸđ‘Ļđ‘€ŗ đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ đ‘€ ā¤šđ‘€˛đ‘€ĸ đ‘€ ā¤šđ‘€Ģđ‘€ĸđ‘€ đ‘€ ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ē⤚đ‘€Ēđ‘Ļ đ‘€Ŗā¤š ā¤žđ‘€ĸ𑀠ā¤ĸđ‘€ĸ𑀟 đ‘€ĸ𑀟ā¤Ŧā¤šđ‘€Ÿā¤Ē⤚ā¤Ēā¤Ēā¤¨đ‘€Ÿ ā¤Ēđ‘€ŗā¤šđ‘€Ēđ‘€ĸ𑀟 ā¤Ē⤚đ‘€ĸđ‘€ ā¤šđ‘€žđ‘Ŗđ‘€Ÿ đ‘€Ŗđ‘€ĸđ‘€Ēđ‘Ļā¤ĸ⤚ đ‘€Ŗā¤š đ‘€˛ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š đ‘€Ÿā¤š ā¤šđ‘€ đ‘€ĸđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸ⤚ đ‘€ ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€žđ‘€ąā¤šđ‘€Ÿā¤¤đ‘€ĸ⤪⤚đ‘€Ē đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ⤪⤚⤪đ‘€ĸ𑀟 ā¤Ēā¤šđ‘€˛đ‘€ĸ⤪⤚đ‘€Ēđ‘€ŗā¤¨đ‘€¯
  - ā¤Ēđ‘Ŗā¤§đ‘€ŗā¤Ŗ ⤧đ‘€Ģđ‘€ĸđ‘€Ēđ‘€ĸ đ‘€ā¤šđ‘€Ÿ đ‘€Ģ⤚đ‘€ĸ𑀲đ‘Ļ đ‘€ŗđ‘€Ģđ‘€ĸ ⤚ đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē 𑀭𑀭 ā¤Ŧ⤚ đ‘€ąā¤šā¤Ēā¤šđ‘€Ÿ ⤚ā¤Ŧā¤¨đ‘€ŗā¤Ē⤚ 𑀭ā¤Ĩ𑀗𑀧𑀮 ā¤žā¤šđ‘€Ÿ đ‘€ąā¤šđ‘€ŗā¤šđ‘€Ÿ ā¤ĸā¤šđ‘€Ŗđ‘€ đ‘€ĸ𑀟ā¤Ēđ‘Ŗđ‘€Ÿ ā¤žā¤šđ‘€Ÿ đ‘€¤ā¤šđ‘€ ā¤ĸđ‘€ĸ⤚ 𑀟đ‘Ļđ‘€¯
  - ā¤Ē⤚ā¤Ŧā¤Ŧā¤šđ‘€˛ā¤šđ‘€Ŗđ‘€ĸ đ‘€ ā¤šā¤Ēđ‘€ŗā¤¨ā¤Ŧā¤¨đ‘€Ÿđ‘€ĸ𑀟 đ‘€ ā¤¨ā¤Ēā¤šđ‘€Ÿđ‘Ļ 𑀟đ‘Ļ ⤚ đ‘€ŗā¤šđ‘€ŗđ‘€Ģđ‘Ļ𑀟 ⤚đ‘€Ē⤞đ‘€ĸā¤Ē đ‘€Ŗā¤šđ‘€žđ‘Ļ ā¤Ŗā¤šđ‘€Ÿđ‘€žđ‘€ĸ𑀟 ⤚ā¤Ŧā¤šđ‘€Ŗđ‘Ļ𑀤 ⤚ ⤚đ‘€Ēđ‘Ļđ‘€ąā¤š ā¤Ē⤚ ā¤Ēđ‘€ŗā¤šđ‘€žđ‘€ĸ⤪⤚đ‘€Ē 𑀟đ‘€ĸđ‘€˜ā¤šđ‘€Ēđ‘€¯
---

# SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
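Beyond the pairwise similarity shown in the Usage section below, the same embeddings drive the other use cases listed above. As one illustration, paraphrase mining compares every sentence in a corpus against every other and returns the highest-scoring pairs. This is a minimal sketch using the library's `paraphrase_mining` utility; the English corpus strings are placeholders standing in for text in the model's target script:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import paraphrase_mining

model = SentenceTransformer("T-Blue/tsdae_pro_MiniLM_L12_2")

# Placeholder corpus; in practice these would be sentences in the
# script this model was trained on.
corpus = [
    "a first sentence",
    "an unrelated sentence",
    "a close paraphrase of the first sentence",
]

# paraphrase_mining returns [score, i, j] triples, sorted by
# descending cosine similarity between corpus[i] and corpus[j].
for score, i, j in paraphrase_mining(model, corpus):
    print(f"{score:.3f} | {corpus[i]} | {corpus[j]}")
```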
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("T-Blue/tsdae_pro_MiniLM_L12_2")
# Run inference
sentences = [
    'ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ đ‘€ ā¤šđ‘€˛đ‘€ĸ đ‘€ ā¤šđ‘€Ģđ‘€ĸđ‘€ đ‘€ ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ē⤚đ‘€ĸđ‘€ ā¤šđ‘€žđ‘Ŗđ‘€Ÿ đ‘€Ŗā¤š đ‘€˛ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸ⤚ đ‘€ ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ⤪⤚⤪đ‘€ĸ𑀟',
    'ā¤šđ‘€ đ‘€ĸ𑀟ā¤Ē⤚⤤⤤đ‘€ĸ⤪⤚ ⤚ ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 ā¤Ŧđ‘€Ģđ‘Ŗđ‘€ŗā¤Ē đ‘€ŗđ‘Ļđ‘€Ēđ‘€ĸđ‘Ļđ‘€ŗ đ‘€ĸđ‘€ĸ đ‘€ŗđ‘€Ģđ‘€ĸ𑀟đ‘Ļ đ‘€ ā¤šđ‘€˛đ‘€ĸ đ‘€ ā¤šđ‘€Ģđ‘€ĸđ‘€ đ‘€ ā¤šđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ē⤚đ‘€Ēđ‘Ļ đ‘€Ŗā¤š ā¤žđ‘€ĸ𑀠ā¤ĸđ‘€ĸ𑀟 đ‘€ĸ𑀟ā¤Ŧā¤šđ‘€Ÿā¤Ē⤚ā¤Ēā¤Ēā¤¨đ‘€Ÿ ā¤Ēđ‘€ŗā¤šđ‘€Ēđ‘€ĸ𑀟 ā¤Ē⤚đ‘€ĸđ‘€ ā¤šđ‘€žđ‘Ŗđ‘€Ÿ đ‘€Ŗđ‘€ĸđ‘€Ēđ‘Ļā¤ĸ⤚ đ‘€Ŗā¤š đ‘€˛ā¤šđ‘€ŗā¤šā¤˛ā¤¨ā¤˛ā¤˛ā¤¨đ‘€žā¤š đ‘€Ÿā¤š ā¤šđ‘€ đ‘€ĸđ‘€Ÿā¤¤đ‘€ĸđ‘€Ļ ā¤Ŗā¤šđ‘€Ÿā¤š ā¤ĸ⤚ đ‘€ ā¤šđ‘€¤ā¤šā¤¨đ‘€Ÿā¤š ⤤đ‘€ĸ𑀞đ‘€ĸ𑀟 đ‘€žđ‘€ąā¤šđ‘€Ÿā¤¤đ‘€ĸ⤪⤚đ‘€Ē đ‘€Ģā¤šđ‘€Ÿđ‘€žā¤šā¤˛đ‘€ĸ ⤪⤚⤪đ‘€ĸ𑀟 ā¤Ēā¤šđ‘€˛đ‘€ĸ⤪⤚đ‘€Ēđ‘€ŗā¤¨đ‘€¯',
    'ā¤Ēđ‘Ŗā¤§đ‘€ŗā¤Ŗ ⤧đ‘€Ģđ‘€ĸđ‘€Ēđ‘€ĸ đ‘€ā¤šđ‘€Ÿ đ‘€Ģ⤚đ‘€ĸ𑀲đ‘Ļ đ‘€ŗđ‘€Ģđ‘€ĸ ⤚ đ‘€Ēā¤šđ‘€Ÿā¤šđ‘€Ē 𑀭𑀭 ā¤Ŧ⤚ đ‘€ąā¤šā¤Ēā¤šđ‘€Ÿ ⤚ā¤Ŧā¤¨đ‘€ŗā¤Ē⤚ 𑀭ā¤Ĩ𑀗𑀧𑀮 ā¤žā¤šđ‘€Ÿ đ‘€ąā¤šđ‘€ŗā¤šđ‘€Ÿ ā¤ĸā¤šđ‘€Ŗđ‘€ đ‘€ĸ𑀟ā¤Ēđ‘Ŗđ‘€Ÿ ā¤žā¤šđ‘€Ÿ đ‘€¤ā¤šđ‘€ ā¤ĸđ‘€ĸ⤚ 𑀟đ‘Ļđ‘€¯',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 64,000 training samples
* Columns: `sentence_0` and `sentence_1`
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0 | sentence_1 |
  |:--------|:-----------|:-----------|
  | type    | string     | string     |
  | details |            |            |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | đ‘€žā¤¨đ‘€Ŗā¤¨ ā¤ĸđ‘€ĸđ‘€Ē𑀟đ‘€ĸ𑀟đ‘€Ļđ‘€žā¤¨đ‘€ŗā¤š ā¤Ēđ‘Ļđ‘€žā¤¨đ‘€Ÿ | ā¤Ēđ‘Ļđ‘€žā¤¨đ‘€Ÿ ā¤Ē⤚ā¤Ŧ⤚ ā¤Ŗā¤šđ‘€Ÿā¤š đ‘€žā¤¨đ‘€Ŗā¤¨ đ‘€Ŗā¤š ā¤ĸđ‘€ĸđ‘€Ē𑀟đ‘€ĸ𑀟đ‘€Ļđ‘€žā¤¨đ‘€ŗā¤š đ‘€Ŗā¤š ā¤Ēđ‘Ļđ‘€žā¤¨đ‘€Ÿ ā¤Ē⤚⤤đ‘€Ģđ‘Ŗā¤Ŧā¤šđ‘€¯ |
  | ⤚ ⤤đ‘€ĸā¤ĸđ‘€ĸā¤Ŗđ‘Ŗā¤Ŗđ‘€ĸ𑀟 đ‘€ŗā¤šđ‘€Ŗā¤šđ‘€Ēđ‘€ąā¤šđ‘€Ē đ‘€ŗā¤¨ ā¤ā¤šđ‘€Ē⤚ đ‘€ ā¤šā¤Ēđ‘€ŗā¤šā¤Ŗđ‘€ĸ𑀟 | ⤚ā¤ĸđ‘Ŗđ‘€žā¤šđ‘€ĸđ‘€žā¤šđ‘€ ā¤šđ‘€Ē ⤚ ā¤Ŗā¤šđ‘€ąā¤šđ‘€Ÿā¤¤đ‘€ĸ𑀟 ⤤đ‘€ĸā¤ĸđ‘€ĸā¤Ŗđ‘Ŗā¤Ŗđ‘€ĸ𑀟 đ‘€ŗā¤šđ‘€Ŗā¤šđ‘€Ēđ‘€ąā¤šđ‘€Ē đ‘€˜ā¤šđ‘€ ā¤šđ‘€™ā¤šđ‘€Ļ đ‘€ ā¤šđ‘€ŗā¤¨ ā¤šđ‘€ đ‘€˛ā¤šđ‘€Ÿđ‘€ĸ đ‘€¤ā¤š đ‘€ŗā¤¨ đ‘€ĸ⤪⤚ ā¤ā¤šđ‘€Ē⤚ đ‘€ ā¤¨ā¤Ēā¤šđ‘€Ÿđ‘Ļ ⤚ đ‘€ ā¤šā¤Ēđ‘€ŗā¤šā¤Ŗđ‘€ĸ𑀟 ⤚ā¤ĸđ‘Ŗđ‘€žā¤šđ‘€Ÿđ‘€ŗā¤¨đ‘€¯ |
  | đ‘€Ŗā¤š ā¤Ŧā¤¨đ‘€Ŗā¤¨đ‘€ đ‘€ ā¤šđ‘€ąā¤š đ‘€˜ā¤šđ‘€Ēđ‘€ĸđ‘€Ŗā¤¨đ‘€Ÿ đ‘€ ā¤¨đ‘€˜ā¤šā¤˛ā¤˛ā¤¨ ā¤Ē⤚ đ‘€¯ | ā¤Ē⤚ ā¤ĸ⤚ đ‘€Ŗā¤š ā¤Ŧā¤¨đ‘€Ŗā¤¨đ‘€ đ‘€ ā¤šđ‘€ąā¤š ā¤Ŧ⤚ đ‘€˜ā¤šđ‘€Ēđ‘€ĸđ‘€Ŗā¤¨đ‘€Ÿ ā¤šđ‘€Ÿā¤šđ‘€Ē⤤đ‘€Ģđ‘€ĸđ‘€ŗā¤Ē đ‘€Ŗā¤šā¤ĸā¤šđ‘€Ÿā¤ˇđ‘€Ŗā¤šā¤ĸā¤šđ‘€Ÿ đ‘€Ŗā¤š đ‘€ ā¤¨đ‘€˜ā¤šā¤˛ā¤˛ā¤¨ đ‘€ ā¤šđ‘€ŗā¤¨ ā¤šā¤˛ā¤šā¤ā¤š đ‘€Ŗā¤š ā¤ā¤¨đ‘€Ÿā¤Ŧđ‘€ĸ⤪⤚đ‘€Ē đ‘€ ā¤šđ‘€™ā¤šđ‘€ĸđ‘€žā¤šā¤Ē⤚ đ‘€™ā¤Ŗā¤šđ‘€Ÿā¤¤đ‘€ĸ ā¤Ē⤚ đ‘€˜ā¤šđ‘€ ā¤¨đ‘€ŗ đ‘€¯ |
* Loss: [DenoisingAutoEncoderLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#denoisingautoencoderloss) (a sketch of how such pairs are generated follows below)
### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>
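The card does not include the training script itself. Combining the dataset described above, `DenoisingAutoEncoderLoss`, and the listed hyperparameters under Sentence Transformers 3.0's trainer API, a hypothetical reconstruction might look like the sketch below; the dataset rows are placeholders, and the actual run may have differed:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import DenoisingAutoEncoderLoss

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

# Placeholder rows; the real dataset holds 64,000 (damaged, original) pairs.
train_dataset = Dataset.from_dict({
    "sentence_0": ["a damaged sentence"],    # noised input
    "sentence_1": ["the original sentence"], # reconstruction target
})

# Ties the decoder weights to the encoder, as in the TSDAE paper.
loss = DenoisingAutoEncoderLoss(model, tie_encoder_decoder=True)

args = SentenceTransformerTrainingArguments(
    output_dir="tsdae_pro_MiniLM_L12_2",
    num_train_epochs=3,              # matches `num_train_epochs` above
    per_device_train_batch_size=16,  # non-default value from this card
    learning_rate=5e-5,              # default, per the expanded list
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
model.save("tsdae_pro_MiniLM_L12_2/final")
```

With batch size 16 and 64,000 samples, one epoch is 4,000 steps, which is consistent with the 12,000 total steps over 3 epochs in the training logs below.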
### Training Logs

| Epoch | Step  | Training Loss |
|:-----:|:-----:|:-------------:|
| 0.125 | 500   | 2.5392        |
| 0.25  | 1000  | 1.4129        |
| 0.375 | 1500  | 1.3383        |
| 0.5   | 2000  | 1.288         |
| 0.625 | 2500  | 1.2627        |
| 0.75  | 3000  | 1.239         |
| 0.875 | 3500  | 1.2208        |
| 1.0   | 4000  | 1.2041        |
| 1.125 | 4500  | 1.1743        |
| 1.25  | 5000  | 1.1633        |
| 1.375 | 5500  | 1.1526        |
| 1.5   | 6000  | 1.1375        |
| 1.625 | 6500  | 1.1313        |
| 1.75  | 7000  | 1.1246        |
| 1.875 | 7500  | 1.1162        |
| 2.0   | 8000  | 1.1096        |
| 2.125 | 8500  | 1.0876        |
| 2.25  | 9000  | 1.0839        |
| 2.375 | 9500  | 1.0791        |
| 2.5   | 10000 | 1.0697        |
| 2.625 | 10500 | 1.0671        |
| 2.75  | 11000 | 1.0644        |
| 2.875 | 11500 | 1.0579        |
| 3.0   | 12000 | 1.0528        |

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.33.0
- Datasets: 2.18.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### DenoisingAutoEncoderLoss
```bibtex
@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}
```