tomaarsen HF Staff
metadata
language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sparse-encoder
  - sparse
  - splade
  - generated_from_trainer
  - dataset_size:100000
  - loss:SpladeLoss
  - loss:SparseMultipleNegativesRankingLoss
  - loss:FlopsLoss
base_model: distilbert/distilbert-base-uncased
widget:
  - text: >-
      He does it right, but there are times that he doesn't (Joana) Let's go
      there and pee? Because she does not want to wear a diaper, she rips off
      her diaper (Filomena). The family caregiver may understand this action as
      a "pang" and "tantrum", and "forget" that these episodes are part of the
      clinical picture of dementia. Conflicts related to incontinence and other
      difficult-to-manage symptoms eventually lead to a variety of
      interpretations, and past history of the emotional relationship between
      the elderly and the family caregiver can cause older emotional issues to
      surface again in these episodes.

       With psycho-functional limitations, new demands arise that can be distressing for those who care because of affective involvement. Subjective constructions are fundamental elements in upkeeping the relationship of care 10 .

       Besides the psychological aspect involved in the loss of identity and the specific cognitive aspects of dementia, some behavioral and psychiatric changes are important even in the consultation with the ESF professionals: psychotic symptoms, agitation and aggression, mood swings, disinhibited behavior and euphoria, apathy and insomnia. Some studies [11] [12] [13] pointed out the significant association between the presence of apathy and a faster cognitive and functional decline in these patients. Another very relevant situation regarding the appearance of neuropsychiatric symptoms is the association of these symptoms with the institutionalization and shorter patient survival. They also showed that the highest Neuropsychiatric Inventory (NPI) score was signifi-cantly associated with more severe cognitive impairment, greater caregiver distress, and higher cost, but was not associated with a formal diagnosis of dementia performed by the primary care physician.

       Changed behaviors and even risky behaviors, such as turning on the gas switch and not turning off, stirring in pots on a hot stove, or ingestion of liquids or toxic materials are situations in the face of neuropsychiatric manifestations in dementia. Filomena reports several neuropsychiatric symptoms of her husband. She compares his behavior to that of children who explore the environment to discover the cause and effect of things and the sensations obtained by the senses. Her role in this context resembles that of a mother trying to prevent the child from getting hurt: He lights up the gas switch, he's just like a child, sometimes he starts to eat the slipper, I have to get it out of his mouth.

       Hallucination is another neuropsychiatric symptom described by family caregivers. Joana reports that when the husband talks to people who have died, the family members feel fear and distance themselves. Filomena has fun when her mother speaks with those who have died: "She talks to those who have passed away, she sends the dog out, which does not exist". Each family caregiver experiences the symptoms presented by the dementia in a unique way, and ways to address and interpret this phenomenon and give meaning to their experience.

       The negative development of dementia perceived by Celina, Filomena, Maria, Teresa and Joana show that the disease follows a course that transcends the biological event itself. The dementia process evidences psychological and sociocultural constructions permeated by meanings and interpretations according to those who live and those who maintain interpersonal relationships with the elderly person with dementia.

       In the discourse of family caregivers, seniors with dementia have aggressive behaviors such as agitation, spitting, cursing, clawing, throwing objects, revealing a level of aggression that can impact the feelings and interpretations produced during the care routine. Freud 14 affirms that human instincts are of two types: Those who tend to preserve and unite, which we call 'erotic' [...] with a deliberate expansion of the popular conception of 'sexuality'; and those who tend to destroy and kill, which we group as an aggressive or destructive instinct. All actions in human life involve the confluence of these two instincts of preservation and destruction. The ideal situation for life in society would be the dominance of reason over the instinctual life controlling destructive impulses, which is utopian. In this perspective, aggressiveness is inherent in the human condition.

       In seniors with dementia with a declining psychological realm of the Self, the progressive loss of identity and the repercussion of cognitive decline, an actual decline in the rational realm of psychic life emerges. This decline refers to the cerebral aspect of inhibitory control and social cognition, showing that the emergence of aggressive behaviors is related to the biological component. The declining reason turns its demands and needs into instinctual acts and more basic reflexes, and can produce a continuous imbalance in the expression between the instincts of preservation and aggression.

       Aggressiveness can be triggered by situations of frustration, when they do not get what they want, when they are afraid or consider some humiliating situation, when they are exposed to environmental overstimulation or feel any physical pain or side effects from medication.
  - text: >-
      Neurosurgery is of great interest to historians of medicine and technology
      because it is relatively young, because it developed in an era of journals
      and publications, because lines and traditions of training and mentorship
      are relatively clear, and because the technologies that enabled the
      evolution of the profession and acted as inflection points in the
      emergence of certain surgical approaches and procedures are at once well
      documented and remarkably unambiguous. To the extent that is the case for
      neurosurgery as a whole, it is even more so for surgery of the skull base.

       To trace the history of skull base surgery along its full expanse is to begin with Horsley and pituitary tumors (unless one wants to start even earlier with the treatment of trigeminal neuralgia); to move to Cushing's work in the same arena (but also that of many others as well); to emphasize the impact of microsurgical techniques and new imaging modalities; to outline once radically innovative, but now widely practiced anatomical approaches to the skull base; to emphasize the importance of team approaches; to discuss emerging therapeutic strategy as well as instrumentation and techniques; to acknowledge the importance of advances in neuroanesthesia and the medical and perioperative care of the neurosurgical patient; and to recognize the contributions of the many individuals who, over the past 25 years, have added to and furthered the field in these and other ways.

       It is not hard to point to leading individuals and important techniques. It is perhaps more difficult to frame them in a meaningful historical perspective because the work has occurred relatively recently, in the time frame historians call "near history." Difficulties arise from both an evaluative and a nosological standpoint. For example, from an evaluative standpoint, how does one stratify the relative importance of corticosteroids, osmotic diuretics, and CSF drainage techniques and technologies in the control of intracranial pressure and the facilitation of exposure for base of skull surgery? How does one think about the idea of hybrid surgery and stereotactic radiation? What will be the long-term view of anatomical approaches to giant basilar aneurysms in the light of endovascular surgery? Have we reached a tipping point in the management of vestibular schwannomas, given the availability of and the outcomes associated with stereotactic radiosurgery?

       From a nosological standpoint, should we think about base of skull surgery in terms of anatomical approaches? One textbook that does just that starts with subfrontal approaches and then moves around the calvaria and down to the petrous and temporal region in a Cook's tour of exposure, in the tradition of Henry's Extensile Exposure and comparable surgical classics. 1, 6 Other publications have explored a set of technologies. 5, 7, 10 Another focuses on the contribution of great men. 9 Many surgeons have written about specific particular pathologies at the skull base.

       Introduction their colleagues write about the premodern period. Elhadi and colleagues also comment on the introduction of radiography in early neurosurgery. Gross and Grossi and their colleagues concentrate on petrosal approaches; Schmitt and Jane on third ventriculostomy; and Chittiboina and colleagues on the history of a very simple but ubiquitous instrument, the Freer elevator, and its inventor. In contrast to the more comprehensive overviews written by Goodrich, Donald, and others, these essays concentrate on selected details. While it is important not to miss the forest for the trees, sometimes the trees are worth studying no less than the forest. 

       The authors report no conflict of interest.
  - text: |
      How do neuromediators contribute to the pathogenesis of pruritus in AD?
  - text: >-
      Pericardial effusion (PE) is a life-threatening condition, as accumulation
      of fluid in the pericardial sac can lead to cardiac tamponade and fatal
      shock. 1, 2 PE is often associated with an underlying disease or
      condition, and the causes can vary widely. 3, 4 Pericardiocentesis
      performed by needle (with or without echoguidance), and various surgical
      procedures (including subxiphoid pericardial tube drainage, pericardial
      window performed through a left anterior thoracotomy, or video-assisted
      thoracoscopic surgery) can alleviate PE. 5 Our retrospective clinical
      experiences of treating PE with subxiphoid pericardiostomy are presented
      in this study.

       We reviewed the medical records of patients who underwent subxiphoid pericardiostomy to treat persistent symptomatic PE in our clinic between 1990 and 2000. Echocardiography (ECG) was used to diagnose PE and N Becit, A Özyazicioglu, M Ceviz et al.

       determine the size of the effusion. A diastolic echo-free space of < 10 mm between the left ventricular posterior wall and pericardium was determined as mild PE, 10 -20 mm as moderate, and > 20 mm as severe PE. Patients with cardiac tamponade and/or moderate to severe PE were treated by subxiphoid pericardiostomy and tube drainage.

       Some patients with pre-operative tuberculosis were treated with an adult fourdrug regimen (isoniazid, 300 mg/day and rifampin, 600 mg/day for 12 months, streptomycin, 1 g/day for 2 months, and pyrazinamide, 2 g/day for 3 months) preoperatively. The effusion was drained after a 3-week course of anti-tuberculosis therapy. In these, and patients diagnosed with tuberculous pericarditis, the tuberculosis therapy regimen was given for 12 months post-operatively.

       The technique used for subxiphoid pericardiostomy (described previously 3 ) was performed under general anaesthetic, or local anaesthesia and sedation. General anaesthesia was preferred in children and was induced with 1.5 mg/kg ketamine. Neuromuscular block was achieved with 0.1 mg/kg vecuronium, and anaesthesia maintained with 60% N 2 O, 40% O 2 and 0.5 -1.0% isoflurane. Local anaesthetic (2% lidocaine solution) was injected into the dermal and subdermal layers, and sedation and analgesia was provided by 1 mg/kg ketamine intravenously. A piece of anterior pericardium, approximately 2 -4 cm in diameter, was excised under direct vision and submitted for histopathological analysis. The pericardial cavity was decompressed and fluid samples were collected for culture and cytological analysis. To prevent acute cardiac dilatation during decompression of the pericardial cavity, intravenous digoxin was administered and the pericardial cavity was decompressed gradually.

       The pericardial cavity was examined under direct vision and/or by digital examination to detect any tumour or adhesions. Gentle digital lysis of adhesions and opening of loculations were performed as needed, to enhance satisfactory drainage. A soft chest tube was placed in the pericardial cavity, lateral to the right ventricle, after pericardiotomy for post-operative drainage. It was connected to an underwater sealed system, and was removed when fluid drainage ceased.

       Patients with mild haemorrhagic effusion and cardiac tamponade, due to trauma or invasive cardiac interventions, were considered haemodynamically unstable and unsuitable for surgical subxiphoid pericardiostomy, even under local anaesthetic. These patients underwent pericardiocentesis in the intensive care unit, which provided immediate relief. Subxiphoid pericardiostomy was performed later if haemorrhagic PE persisted. Patients were followed, with physical examinations and ECG, in the outpatient clinic for at least 1 year.

       Numerical results are given as mean ± SD. Fisher's exact test was used to compare proportions between groups (comparison of the rates of recurrence and constriction between patient groups with uraemic pericarditis, tuberculous pericarditis and non-tuberculous bacterial pericarditis). The McNemar test was used for comparison of proportions within one group (to assess the significance of rates of recurrence and constriction in patients with tuberculous pericarditis). Statistical differences were considered significant if P < 0.05.
  - text: >-
      Henry M. Blumberg, MD In this issue of Infection Control and Hospital
      Epidemiology, a potpourri of tuberculosis (TB)-related articles are being
      published. 1-7 Tuberculosisrelated issues have been an important focus for
      the past decade for those in infection control and hospital epidemiology,
      especially in urban areas where the large majority of TB cases occur, 8
      but also, because of federal regulations, for those in low-endemic areas
      or areas where no TB cases occur (approximately half of the counties in
      the United States).

       The resurgence of TB beginning in the mid1980s in the United States (in large part, due to failure and underfunding of the public health infrastructure and to the epidemic of human immunodeficiency virus [HIV] infection) and outbreaks of TB have highlighted the risk of nosocomial transmission of TB. 9,10 These outbreaks affected both healthcare workers (HCWs) and patients. The fact that outbreaks in New York and Miami, among others, involved multidrug-resistant (MDR) strains that were associated with high morbidity and mortality among HIV-infected individuals punctuated the importance of effective TB infection control measures. Commingling of patients with unsuspected TB and those who were quite immunosuppressed led to amplification of nosocomial transmission. A decade ago, few institutions were prepared for the changing epidemiology of TB.

       Several recent studies have demonstrated that infection control measures are effective in preventing nosocomial transmission of TB, 11-13 and two reports in this issue, from institutions in Kentucky 1 and New York, 2 provide additional data on decreases in HCW tuberculin skin-test (TST) conversions following implementation of TB infection control measures. In most studies, multiple interventions (administrative controls, environmental controls, and respiratory protection) were initiated at approximately the same time, making it more difficult to identify the most crucial aspect of the program. The importance of TB infection control measures in contributing to the decline in TB cases in the United States, as well as the reduction in the number of MDR-TB cases in New York City, often has been understated. Increased federal funding for TB control activities and expansion of directly observed therapy clearly are important in efforts to prevent TB, but the initial decline in TB cases and in MDR TB in the United States beginning in 1993 likely was due, in large part, to interruption of TB transmission within healthcare facilities. Unfortunately, increased funding for TB control in the United States in the last 5 years often has not trickled down to inner-city hospitals, which frequently are the first line in the battle against TB.

       From our experience and that of others, it appears clear that administrative controls are the most important component of a TB infection control program. At Grady Memorial Hospital in Atlanta, we were able to decrease TB exposure episodes markedly and concomitantly to decrease HCW TST conversions after implementing an expanded respiratory isolation policy. 11 We continue to isolate appropriately approximately 95% of those subsequently diagnosed with TB. We were able to reduce TST conver-sion rates markedly during a period of time in which we had isolation rooms that would be considered suboptimal by Centers for Disease Control and Prevention (CDC) guidelines 14 (rooms that were under negative pressure but had less than six air changes per hour) and were using submicron masks. Implementation of better-engineered isolation rooms (>12 air changes per hour) with the completion of renovations to the hospital may have put us in better compliance with regulatory agencies and made the staff feel more secure, but has had little impact on further reducing low rates of HCW TST conversions. In addition, the termination of outbreaks and reduction of TST conversion rates at several institutions took place before introduction of National Institute for Occupational Safety and Health-approved masks and fit testing. 2,15,16 United States healthcare institutions are required by regulatory mandates to develop a "respiratory protection program" (including fit testing), which can be time-consuming, expensive, and logistically difficult. 17 Data published to date suggest that the impact of formal fit testing on proper mask use is small. 18 These federal mandates also have turned some well-meaning (trying to comply fully with the Occupational Safety and Health Administration [OSHA] regulations) but misguided infection control practitioners into "facial hair police." 
These types of processes divert time, effort, and resources away from what truly is effective in preventing nosocomial transmission of TB, as well as from other important infection control activities such as preventing nosocomial bloodstream infections or transmission of highly resistant pathogens such as vancomycin-resistant Enterococcus or preparing for the onslaught of vancomycin-resistant Staphylococcus aureus. At a time when US healthcare institutions are under enormous pressure due to healthcare reform, market forces, and managed care, it is essential that federal regulatory agencies look carefully at scientific data when issuing regulations.
datasets:
  - tomaarsen/miriad-4.4M-split
pipeline_tag: feature-extraction
library_name: sentence-transformers
metrics:
  - dot_accuracy@1
  - dot_accuracy@3
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@3
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@3
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@10
  - dot_mrr@10
  - dot_map@100
  - query_active_dims
  - query_sparsity_ratio
  - corpus_active_dims
  - corpus_sparsity_ratio
co2_eq_emissions:
  emissions: 59.41576277345894
  energy_consumed: 0.1528568486230041
  source: codecarbon
  training_type: fine-tuning
  on_cloud: false
  cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
  ram_total_size: 31.777088165283203
  hours_used: 0.431
  hardware_used: 1 x NVIDIA GeForce RTX 3090
model-index:
  - name: DistilBERT base trained on MIRIAD question-passage tuples
    results:
      - task:
          type: sparse-information-retrieval
          name: Sparse Information Retrieval
        dataset:
          name: miriad eval
          type: miriad_eval
        metrics:
          - type: dot_accuracy@1
            value: 0.7881
            name: Dot Accuracy@1
          - type: dot_accuracy@3
            value: 0.8885
            name: Dot Accuracy@3
          - type: dot_accuracy@5
            value: 0.9135
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 0.9374
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.7881
            name: Dot Precision@1
          - type: dot_precision@3
            value: 0.29616666666666663
            name: Dot Precision@3
          - type: dot_precision@5
            value: 0.1827
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09374
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.7881
            name: Dot Recall@1
          - type: dot_recall@3
            value: 0.8885
            name: Dot Recall@3
          - type: dot_recall@5
            value: 0.9135
            name: Dot Recall@5
          - type: dot_recall@10
            value: 0.9374
            name: Dot Recall@10
          - type: dot_ndcg@10
            value: 0.865908870046696
            name: Dot Ndcg@10
          - type: dot_mrr@10
            value: 0.8426501984126957
            name: Dot Mrr@10
          - type: dot_map@100
            value: 0.8445697764573693
            name: Dot Map@100
          - type: query_active_dims
            value: 41.33399963378906
            name: Query Active Dims
          - type: query_sparsity_ratio
            value: 0.9986457637234195
            name: Query Sparsity Ratio
          - type: corpus_active_dims
            value: 219.5446014404297
            name: Corpus Active Dims
          - type: corpus_sparsity_ratio
            value: 0.9928070047362416
            name: Corpus Sparsity Ratio
      - task:
          type: sparse-information-retrieval
          name: Sparse Information Retrieval
        dataset:
          name: miriad test
          type: miriad_test
        metrics:
          - type: dot_accuracy@1
            value: 0.787
            name: Dot Accuracy@1
          - type: dot_accuracy@3
            value: 0.8825
            name: Dot Accuracy@3
          - type: dot_accuracy@5
            value: 0.9095
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 0.9362
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.787
            name: Dot Precision@1
          - type: dot_precision@3
            value: 0.29416666666666663
            name: Dot Precision@3
          - type: dot_precision@5
            value: 0.18190000000000003
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09362000000000002
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.787
            name: Dot Recall@1
          - type: dot_recall@3
            value: 0.8825
            name: Dot Recall@3
          - type: dot_recall@5
            value: 0.9095
            name: Dot Recall@5
          - type: dot_recall@10
            value: 0.9362
            name: Dot Recall@10
          - type: dot_ndcg@10
            value: 0.8636765666523286
            name: Dot Ndcg@10
          - type: dot_mrr@10
            value: 0.8402174603174577
            name: Dot Mrr@10
          - type: dot_map@100
            value: 0.842249922303071
            name: Dot Map@100
          - type: query_active_dims
            value: 41.19380187988281
            name: Query Active Dims
          - type: query_sparsity_ratio
            value: 0.9986503570578638
            name: Query Sparsity Ratio
          - type: corpus_active_dims
            value: 221.07510375976562
            name: Corpus Active Dims
          - type: corpus_sparsity_ratio
            value: 0.9927568605019407
            name: Corpus Sparsity Ratio

DistilBERT base trained on MIRIAD question-passage tuples

This is a SPLADE Sparse Encoder model finetuned from distilbert/distilbert-base-uncased on the miriad-4.4M-split dataset using the sentence-transformers library. It maps sentences and paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
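
To give a feel for what "sparse" means in a 30522-dimensional space, the sketch below relates the number of active (non-zero) dimensions to a sparsity ratio. It assumes the ratio is defined as 1 minus the active dimensions divided by the vocabulary size; the function name is illustrative and not part of the library. The two inputs are the average query and corpus active dimensions reported in this card's evaluation metadata.

```python
VOCAB_SIZE = 30522  # output dimensionality stated in this card


def sparsity_ratio(active_dims: float, total_dims: int = VOCAB_SIZE) -> float:
    """Fraction of dimensions that are zero (assumed definition)."""
    return 1.0 - active_dims / total_dims


# Average active dimensions from the "miriad eval" results in this card.
query_ratio = sparsity_ratio(41.33399963378906)
corpus_ratio = sparsity_ratio(219.5446014404297)

print(f"{query_ratio:.4f} {corpus_ratio:.4f}")  # prints "0.9986 0.9928"
```

Under that assumed definition, the computed values agree with the `query_sparsity_ratio` and `corpus_sparsity_ratio` figures listed for the eval set: on average only about 41 of 30522 dimensions are non-zero for a query, and about 220 for a passage.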

Model Details

Model Description

  • Model Type: SPLADE Sparse Encoder
  • Base model: distilbert/distilbert-base-uncased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 30522 dimensions
  • Similarity Function: Dot Product
  • Training Dataset: tomaarsen/miriad-4.4M-split
  • Language: en
  • License: apache-2.0
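
Because the similarity function is a dot product and the vectors are sparse, a score only needs the dimensions that are active in both vectors. The following is a minimal sketch of that idea using plain dictionaries; `sparse_dot` is a hypothetical helper rather than a library function, and the dimension indices and weights are invented for illustration.

```python
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the smaller vector; only shared dimensions contribute.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[dim] for dim, weight in a.items() if dim in b)


# Hypothetical vectors: indices and weights are made up, not model output.
query = {101: 1.5, 2054: 0.5, 7000: 2.0}
doc_a = {101: 1.0, 7000: 0.5, 12000: 3.0}
doc_b = {2054: 0.25, 30000: 4.0}

scores = {"doc_a": sparse_dot(query, doc_a), "doc_b": sparse_dot(query, doc_b)}
best = max(scores, key=scores.get)
print(scores, best)
```

Here `doc_a` shares two dimensions with the query and `doc_b` only one, so `doc_a` ranks first. Real retrieval systems typically get the same effect with an inverted index rather than pairwise comparison.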

Model Sources

Full Model Architecture

SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False}) with MLMTransformer model: DistilBertForMaskedLM 
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
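
The pooling step can be sketched in a few lines of plain Python. This assumes the usual SPLADE formulation, where each masked-language-model logit is passed through log(1 + ReLU(x)) and the result is max-pooled over token positions for every vocabulary dimension. It is an illustration of the technique, not the library's implementation, which may differ in details such as how padding tokens are masked.

```python
import math


def splade_pool(token_logits: list[list[float]]) -> list[float]:
    """Max-pool log(1 + relu(logit)) over tokens, per vocabulary dimension."""
    vocab_size = len(token_logits[0])
    pooled = [0.0] * vocab_size
    for logits in token_logits:
        for dim, logit in enumerate(logits):
            activated = math.log1p(max(logit, 0.0))  # log(1 + ReLU(x))
            pooled[dim] = max(pooled[dim], activated)
    return pooled


# Toy input: 2 tokens over a 4-entry "vocabulary" (the real model uses 30522).
toy_logits = [
    [-1.0, 0.0, 2.0, -3.0],
    [0.5, -2.0, 1.0, -0.5],
]
vector = splade_pool(toy_logits)
active = [dim for dim, weight in enumerate(vector) if weight > 0]
print(active)  # prints "[0, 2]"
```

Dimensions whose logits are never positive for any token stay at exactly zero, which is where the sparsity of the output vector comes from.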

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("tomaarsen/splade-distilbert-base-uncased-miriad")
# Run inference
queries = [
    "How have infection control measures been effective in preventing nosocomial transmission of TB?\n",
]
documents = [
    'Henry M. Blumberg, MD In this issue of Infection Control and Hospital Epidemiology, a potpourri of tuberculosis (TB)-related articles are being published. 1-7 Tuberculosisrelated issues have been an important focus for the past decade for those in infection control and hospital epidemiology, especially in urban areas where the large majority of TB cases occur, 8 but also, because of federal regulations, for those in low-endemic areas or areas where no TB cases occur (approximately half of the counties in the United States).\n\n The resurgence of TB beginning in the mid1980s in the United States (in large part, due to failure and underfunding of the public health infrastructure and to the epidemic of human immunodeficiency virus [HIV] infection) and outbreaks of TB have highlighted the risk of nosocomial transmission of TB. 9,10 These outbreaks affected both healthcare workers (HCWs) and patients. The fact that outbreaks in New York and Miami, among others, involved multidrug-resistant (MDR) strains that were associated with high morbidity and mortality among HIV-infected individuals punctuated the importance of effective TB infection control measures. Commingling of patients with unsuspected TB and those who were quite immunosuppressed led to amplification of nosocomial transmission. A decade ago, few institutions were prepared for the changing epidemiology of TB.\n\n Several recent studies have demonstrated that infection control measures are effective in preventing nosocomial transmission of TB, 11-13 and two reports in this issue, from institutions in Kentucky 1 and New York, 2 provide additional data on decreases in HCW tuberculin skin-test (TST) conversions following implementation of TB infection control measures. In most studies, multiple interventions (administrative controls, environmental controls, and respiratory protection) were initiated at approximately the same time, making it more difficult to identify the most crucial aspect of the program. 
The importance of TB infection control measures in contributing to the decline in TB cases in the United States, as well as the reduction in the number of MDR-TB cases in New York City, often has been understated. Increased federal funding for TB control activities and expansion of directly observed therapy clearly are important in efforts to prevent TB, but the initial decline in TB cases and in MDR TB in the United States beginning in 1993 likely was due, in large part, to interruption of TB transmission within healthcare facilities. Unfortunately, increased funding for TB control in the United States in the last 5 years often has not trickled down to inner-city hospitals, which frequently are the first line in the battle against TB.\n\n From our experience and that of others, it appears clear that administrative controls are the most important component of a TB infection control program. At Grady Memorial Hospital in Atlanta, we were able to decrease TB exposure episodes markedly and concomitantly to decrease HCW TST conversions after implementing an expanded respiratory isolation policy. 11 We continue to isolate appropriately approximately 95% of those subsequently diagnosed with TB. We were able to reduce TST conver-sion rates markedly during a period of time in which we had isolation rooms that would be considered suboptimal by Centers for Disease Control and Prevention (CDC) guidelines 14 (rooms that were under negative pressure but had less than six air changes per hour) and were using submicron masks. Implementation of better-engineered isolation rooms (>12 air changes per hour) with the completion of renovations to the hospital may have put us in better compliance with regulatory agencies and made the staff feel more secure, but has had little impact on further reducing low rates of HCW TST conversions. 
In addition, the termination of outbreaks and reduction of TST conversion rates at several institutions took place before introduction of National Institute for Occupational Safety and Health-approved masks and fit testing. 2,15,16 United States healthcare institutions are required by regulatory mandates to develop a "respiratory protection program" (including fit testing), which can be time-consuming, expensive, and logistically difficult. 17 Data published to date suggest that the impact of formal fit testing on proper mask use is small. 18 These federal mandates also have turned some well-meaning (trying to comply fully with the Occupational Safety and Health Administration [OSHA] regulations) but misguided infection control practitioners into "facial hair police." These types of processes divert time, effort, and resources away from what truly is effective in preventing nosocomial transmission of TB, as well as from other important infection control activities such as preventing nosocomial bloodstream infections or transmission of highly resistant pathogens such as vancomycin-resistant Enterococcus or preparing for the onslaught of vancomycin-resistant Staphylococcus aureus. At a time when US healthcare institutions are under enormous pressure due to healthcare reform, market forces, and managed care, it is essential that federal regulatory agencies look carefully at scientific data when issuing regulations.',
    'Drug Reaction with Eosinophilia and Systemic Symptoms (DRESS) syndrome is a severe and potentially life-threatening hypersensitivity reaction caused by exposure to certain medications (Phillips et al., 2011; Bocquet et al., 1996) . It is extremely heterogeneous in its manifestation but has characteristic delayed-onset cutaneous and multisystem features with a protracted natural history. The reaction typically starts with a fever, followed by widespread skin eruption of variable nature. This progresses to inflammation of internal organs such as hepatitis, pneumonitis, myocarditis and nephritis, and haematological abnormalities including eosinophilia and atypical lymphocytosis (Kardaun et al., 2013; Cho et al., 2017) .\n\n DRESS syndrome is most commonly classified according to the international scoring system developed by the RegiSCAR group (Kardaun et al., 2013) . RegiSCAR accurately defines the syndrome by considering the major manifestations, with each feature scored between −1 and 2, and 9 being the maximum total number of points. According to this classification, a score of < 2 means no case, 2-3 means possible case, 4-5 means probable case, and 6 or above means definite DRESS syndrome. Table 1 gives an overview of the RegiSCAR scoring system. DRESS syndrome usually develops 2 to 6 weeks after exposure to the causative drug, with resolution of symptoms after drug withdrawal in the majority of cases (Husain et al., 2013a) . Some patients require supportive treatment with corticosteroids, although there is a lack of evidence surrounding the most effective dose, route and duration of the therapy (Adwan, 2017) . Although extremely rare, with an estimated population risk of between 1 and 10 in 10,000 drug exposures, it is significant due to its high mortality rate, at around 10% (Tas and The pathogenesis of DRESS syndrome remains largely unknown. 
Current evidence suggests that patients may be genetically predisposed to this form of hypersensitivity, with a superimposed risk resulting from Human Herpes Virus (HHV) exposure and subsequent immune reactivation (Cho et al., 2017; Husain et al., 2013a) . In fact, the serological detection of HHV-6 has even been proposed as an additional diagnostic marker for DRESS syndrome (Shiohara et al., 2007) . Other potential risk factors identified are family history (Sullivan and Shear, 2001; Pereira De Silva et al., 2011) and concomitant drug use, particularly antibiotics . DRESS syndrome appears to occur in patients of any age, with patient demographics from several reviews finding age ranges between 6 and 89 years (Picard et al., 2010; Kano et al., 2015; Cacoub et al., 2013) . DRESS syndrome was first described as an adverse reaction to antiepileptic therapy, but has since been recognised as a complication of an extremely wide range of medications (Adwan, 2017) . In rheumatology, it has been classically associated with allopurinol and sulfasalazine, but has also been documented in association with many other drugs including leflunomide, hydroxychloroquine, febuxostat and NSAIDs (Adwan, 2017) . Recent evidence has also identified a significant risk of DRESS syndrome with strontium ranelate use (Cacoub et al., 2013) . Thus far, that is the only anti-osteoporotic drug associated with DRESS syndrome, although there are various cases of other adverse cutaneous reactions linked to anti-osteoporotic medications, ranging from benign maculopapular eruption to Stevens-Johnson syndrome (SJS) and Toxic Epidermal Necrolysis (TEN) . 
Denosumab, an antiresorptive RANK ligand (RANKL) inhibitor licensed for osteoporosis, is currently known to be associated with some dermatological manifestations including dermatitis, eczema, pruritus and, less commonly, cellulitis (Prolia, n.d.).\n\n We hereby describe the first documented case of DRESS syndrome associated with denosumab treatment.\n\n The patient is a 76-year old female with osteoporosis and a background of alcoholic fatty liver disease and lower limb venous insufficiency. Osteoporosis was first diagnosed in 2003 and treated with risedronate, calcium and vitamin D, until 2006. While on this treatment, the patient sustained T12 and L3 fractures, the latter treated with kyphoplasty, and was therefore deemed a non-responder to risedronate.',
    "The regulation of these events is known to go awry in certain pathologies especially in diseases associated with neurodegeneration. Mitochondrial fission helps to enhance the number of mitochondria, which can be efficiently distributed to each corner of neuronal cells and thus helps them to maintain their energy demands. Mitochondrial fission is highly essential during the periods of energy starvation to produce new, efficient mitochondrial energy generating systems. However, enhanced fission associated with bioenergetic crisis causes BAX foci formation on mitochondrial membrane and thus causes mitochondrial outer membrane permeabilization (MOMP), releasing cytochrome c and other pro apoptotic mediators into cytosol, results in apoptosis [93] . Impairment in the mitochondrial dynamics has also been observed in case of inflammatory neuropathies and oxaliplatin induced neuropathy [94] . Excessive nitric oxide is known to cause s-nitrosylation of dynamin related protein-1 (Drp-1), and increases the mitochondrial fission [95, 96] . Tumor necrosis factor-α (TNF-α) reported to inhibit the kinensin 1 protein, and thus impairs trafficking by halting mitochondrial movement along axons [97] . In addition to impaired dynamics, aggregates of abnormal shaped, damaged mitochondria are responsible for aberrant mitochondrial trafficking, which contributes to axonal degeneration observed in various peripheral neuropathies [81] .\n\n Autophagy is the discerning cellular catabolic process responsible for recycling the damaged proteins/ organelles in the cells [98] . Mitophagy is a selective autophagic process involved in recycling of damaged mitochondria and helps in supplying the constituents for mitochondrial biogenesis [99] . Excessive accumulation and impaired clearance of dysfunctional mitochondria are known to be observed in various disorders associated with oxidative stress [100] . 
Oxidative damage to Atg 4, a key component involved in mitophagy causes impaired autophagosome formation and clearance of damaged mitochondria [101] . Loss in the function of molecular chaperons and associated accumulation of damaged proteins are known to be involved in various peripheral neuropathies including trauma induced neuropathy [102, 103] . A model of demyelinating neuropathy corresponds to the accumulation of improperly folded myelin protein PMP-22 is also being observed recently [104, 105] .\n\n Mitochondrial dysfunction and associated disturbances are well connected to neuroinflammatory changes that occur in various neurodegenerative diseases [106] . Dysfunctional mitochondria are also implicated in several pathologies such as cardiovascular and neurodegenerative diseases. Several mitochondrial toxins have been found to inhibit the respiration in microglial cells and also inhibit IL-4 induced alternative anti inflammatory response and thus potentiates neuroinflammation [107] . Mitochondrial ROS are well identified to be involved in several inflammatory pathways such as NF-κB, MAPK activation [108] . Similarly, the pro inflammatory mediators released as a result of an inflammatory episode found to be interfere with the functioning of the mitochondrial electron transport chain and thus compromise ATP production [109] . TNF-α is known to inhibit the complex I, IV of ETC and decreases energy production. Nitric oxide (NO) is a potent inhibitor of cytochrome c oxidase (complex IV) and similarly IL-6 is also known to enhance mitochondrial generation of superoxide [110] . Mitochondrial dysfunction initiates inflammation by increased formation of complexes of damaged mitochondrial parts and cytoplasmic pattern recognition receptors (PRR's). The resulting inflammasome directed activation of interleukin-1β production, which starts an immune response and leads to Fig. (4) . 
Mitotoxicity in peripheral neuropathies: Various pathophysiological insults like hyperglycemic, chemotherapeutic and traumatic injury to the peripheral nerves results in mitochondrial dysfunction through enhanced generation of ROS induced biomolecular damage and bioenergetic crisis. Following the nerve injury accumulation of mitochondria occurs resulting in the release of mtDNA & formyl peptides into circulation which acts as Death associated molecular patterns (DAMP's). These are recognized by immune cells as foreign bodies and can elicit a local immune/inflammatory response. Interaction between inflammatory mediators and structural proteins involved in mitochondrial trafficking will cause impairment in mitochondrial motility. Oxidative stress induced damage to the mt proteins like Atg4, Parkin etc cause insufficient mitophagy. Excess nitrosative stress also results in excessive mt fission associated with apoptosis. In addition, mtDNA damage impairs its transcription and reduces mitochondrial biogenesis. Ca 2+ dyshomeostasis, loss in mitochondrial potential and bioenergetic crisis cause neuronal death via apoptosis/necrosis. All these modifications cause defects in ultra structure, physiology and trafficking of mitochondria resulting in loss of neuronal function producing peripheral neuropathy.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[42.9040,  1.9962,  2.1958]])
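The scores above are dot products over the 30522-dimensional vocabulary space, so only dimensions that are active in both the query and the document contribute to a score. The following is a minimal pure-Python sketch of that computation. The `sparse_dot` helper, the token ids, and the weights are all hypothetical and chosen for illustration; they are not output of this model and this is not how the library implements `model.similarity`.

```python
# Sparse vectors stored as {dimension index: weight}.
# All ids and weights below are made up for illustration.
def sparse_dot(query: dict[int, float], document: dict[int, float]) -> float:
    """Dot product that only visits dimensions active in both vectors."""
    # Iterate over the smaller vector to minimise dictionary lookups.
    if len(query) <= len(document):
        small, large = query, document
    else:
        small, large = document, query
    return sum(weight * large[dim] for dim, weight in small.items() if dim in large)


query_vec = {101: 1.5, 2054: 0.5, 7592: 2.0}
doc_match = {101: 2.0, 7592: 1.0, 9999: 0.7}  # shares two dimensions with the query
doc_other = {42: 1.2, 2054: 0.2}              # shares one low-weight dimension

print(sparse_dot(query_vec, doc_match))  # 1.5*2.0 + 2.0*1.0
print(sparse_dot(query_vec, doc_other))  # 0.5*0.2
```

Because most dimensions are zero, the cost of scoring depends on the number of active dimensions rather than on the vocabulary size.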

Evaluation

Metrics

Sparse Information Retrieval

Metric miriad_eval miriad_test
dot_accuracy@1 0.7881 0.787
dot_accuracy@3 0.8885 0.8825
dot_accuracy@5 0.9135 0.9095
dot_accuracy@10 0.9374 0.9362
dot_precision@1 0.7881 0.787
dot_precision@3 0.2962 0.2942
dot_precision@5 0.1827 0.1819
dot_precision@10 0.0937 0.0936
dot_recall@1 0.7881 0.787
dot_recall@3 0.8885 0.8825
dot_recall@5 0.9135 0.9095
dot_recall@10 0.9374 0.9362
dot_ndcg@10 0.8659 0.8637
dot_mrr@10 0.8427 0.8402
dot_map@100 0.8446 0.8422
query_active_dims 41.334 41.1938
query_sparsity_ratio 0.9986 0.9987
corpus_active_dims 219.5446 221.0751
corpus_sparsity_ratio 0.9928 0.9928
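
The sparsity ratios in the table can be cross-checked against the active-dimension counts. The sketch below assumes the ratio is defined as one minus the mean number of active dimensions divided by the embedding width (30522, as printed in the usage example); the `sparsity_ratio` helper is illustrative and not part of the library.

```python
VOCAB_SIZE = 30522  # embedding width printed in the usage example


def sparsity_ratio(active_dims: float, vocab_size: int = VOCAB_SIZE) -> float:
    """Fraction of dimensions that are zero, given the mean count of non-zero ones."""
    return 1.0 - active_dims / vocab_size


# (mean active dims, reported sparsity ratio) copied from the table above
reported = {
    "query, miriad_eval": (41.334, 0.9986),
    "query, miriad_test": (41.1938, 0.9987),
    "corpus, miriad_eval": (219.5446, 0.9928),
    "corpus, miriad_test": (221.0751, 0.9928),
}

for name, (active, ratio) in reported.items():
    print(f"{name}: computed {sparsity_ratio(active):.4f}, reported {ratio}")
```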

Training Details

Training Dataset

miriad-4.4_m-split

  • Dataset: miriad-4.4_m-split at 596b9ab
  • Size: 100,000 training samples
  • Columns: question and passage_text
  • Approximate statistics based on the first 1000 samples:
    column          question        passage_text
    type            string          string
    details: min    9 tokens        511 tokens
    details: mean   23.38 tokens    512.0 tokens
    details: max    71 tokens       512 tokens
  • Samples:
    question passage_text
    What factors may contribute to increased pulmonary conduit durability in patients who undergo the Ross operation compared to those with right ventricular outflow tract obstruction?
    I n 1966, Ross and Somerville 1 reported the first use of an aortic homograft to establish right ventricle-to-pulmonary artery continuity in a patient with tetralogy of Fallot and pulmonary atresia. Since that time, pulmonary position homografts have been used in a variety of right-sided congenital heart lesions. Actuarial 5-year homograft survivals for cryopreserved homografts are reported to range between 55% and 94%, with the shortest durability noted in patients less than 2 years of age. 4 Pulmonary position homografts also are used to replace pulmonary autografts explanted to repair left-sided outflow disease (the Ross operation). Several factors may be likely to favor increased pulmonary conduit durability in Ross patients compared with those with right ventricular outflow tract obstruction, including later age at operation (allowing for larger homografts), more normal pulmonary artery architecture, absence of severe right ventricular hypertrophy, and more natural positioning of ...
    How does MCAM expression in hMSC affect the growth and maintenance of hematopoietic progenitors?
    After culture in a 3-dimensional hydrogel-based matrix, which constitutes hypoxic conditions, MCAM expression is lost. Concordantly, Tormin et al. demonstrated that MCAM is down-regulated under hypoxic conditions. 10 Furthermore, it was shown by others and our group that oxygen tension causes selective modification of hematopoietic cell and mesenchymal stromal cell interactions in co-culture systems as well as influence HSPC metabolism. [44] [45] [46] Thus, the observed differences between Sharma et al. and our data in HSPC supporting capacity of hMSC are likely due to the different culture conditions used. Further studies are required to clarify the influence of hypoxia in our model system. Altogether these findings provide further evidence for the importance of MCAM in supporting HSPC. Furthermore, previous reports have shown that MCAM is down-regulated in MSC after several passages as well as during aging and differentiation. 19, 47 Interestingly, MCAM overexpression in hMSC enhance...
    What is the relationship between Fanconi anemia and breast and ovarian cancer susceptibility genes?
    ( 31 ) , of which 5% -10 % may be caused by genetic factors ( 32 ) , up to half a million of these patients may be at risk of secondary hereditary neoplasms. The historic observation of twofold to fi vefold increased risks of cancers of the ovary, thyroid, and connective tissue after breast cancer ( 33 ) presaged the later syndromic association of these tumors with inherited mutations of BRCA1, BRCA2, PTEN, and p53 ( 16 ) . By far the largest cumulative risk of a secondary cancer in BRCA mutation carriers is associated with cancer in the contralateral breast, which may reach a risk of 29.5% at 10 years ( 34 ) . The Breast Cancer Linkage Consortium ( 35 , 36 ) also documented threefold to fi vefold increased risks of subsequent cancers of prostate, pancreas, gallbladder, stomach, skin (melanoma), and uterus in BRCA2 mutation carriers and twofold increased risks of prostate and pancreas cancer in BRCA1 mutation carriers; these results are based largely on self-reported family history inf...
  • Loss: SpladeLoss with these parameters:
    {
        "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
        "lambda_corpus": 3e-05,
        "lambda_query": 5e-05
    }
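
As a rough illustration of how these parameters combine, the sketch below adds an in-batch-negatives ranking term (dot-product scores, `scale=1.0`) to two FLOPS regularisation terms weighted by `lambda_query` and `lambda_corpus`. It follows the general formulation in the papers cited at the end of this card and assumes non-negative activations; the function names are hypothetical, and the library's implementation may differ in details.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ranking_loss(queries, documents, scale=1.0):
    """Cross-entropy over in-batch scores; document i is the positive for query i."""
    total = 0.0
    for i, q in enumerate(queries):
        scores = [scale * dot(q, d) for d in documents]
        log_norm = math.log(sum(math.exp(s) for s in scores))
        total += log_norm - scores[i]
    return total / len(queries)


def flops(embeddings):
    """Sum over dimensions of the squared mean activation across the batch."""
    n = len(embeddings)
    return sum((sum(column) / n) ** 2 for column in zip(*embeddings))


def splade_loss(queries, documents, lambda_query=5e-05, lambda_corpus=3e-05):
    return (
        ranking_loss(queries, documents)
        + lambda_query * flops(queries)
        + lambda_corpus * flops(documents)
    )


# Toy batch of two (query, passage) pairs in a 3-dimensional "vocabulary".
toy_queries = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
toy_documents = [[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]]
print(splade_loss(toy_queries, toy_documents))
```

The regularisation terms penalise dimensions that are active on average across the batch, which pushes the model toward sparse outputs; the smaller weight on the corpus side is consistent with passages keeping more active dimensions than queries in the metrics above.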
    

Evaluation Dataset

miriad-4.4_m-split

  • Dataset: miriad-4.4_m-split at 596b9ab
  • Size: 1,000 evaluation samples
  • Columns: question and passage_text
  • Approximate statistics based on the first 1000 samples:
    column          question        passage_text
    type            string          string
    details: min    8 tokens        512 tokens
    details: mean   23.55 tokens    512.0 tokens
    details: max    74 tokens       512 tokens
  • Samples:
    question passage_text
    What are some hereditary cancer syndromes that can result in various forms of cancer?
    Hereditary Cancer Syndromes, including Hereditary Breast and Ovarian Cancer (HBOC) and Lynch Syndrome (LS), can result in various forms of cancer due to germline mutations in cancer predisposition genes. While the major contributory genes for these syndromes have been identified and well-studied (BRCA1/ BRCA2 for HBOC and MSH2/MSH6/MLH1/PMS2/ EPCAM for LS), there remains a large percentage of associated cancer cases that are negative for germline mutations in these genes, including 80% of women with a personal or family history of breast cancer who are negative for BRCA1/2 mutations [1] . Similarly, between 30 and 50% of families fulfill stringent criteria for LS and test negative for germline mismatch repair gene mutations [2] . Adding complexity to these disorders is the significant overlap in the spectrum of cancers observed between various hereditary cancer syndromes, including many cancer susceptibility syndromes. Some that contribute to elevated breast cancer risk include Li-Frau...
    How do MAK-4 and MAK-5 exert their antioxidant properties?
    Hybrid F1 mice were injected with urethane (300 mg/kg) at 8 days of age. A group was then put on a MAK-supplemented diet, another group was fed a standard pellet diet. At 36 weeks of age the mice were sacrificed and the livers examined for the presence of tumors mouse (Panel A) and for the number of nodules per mouse (Panel B) (* p < 0.05, ** P < 0.001). Statistical analysis was performed by Two Way ANOVA Test followed by Post Hoc Bonferroni analysis.

    We than measured the influence of the MAK-4+5 combination on the expression of the three liver-specific connexins (cx26, cx32, and cx43). The level of cx26 expression was similar in all the groups of mice treated with the MAK-supplemented diet and in the control (Figure 4, Panel A) . A significant, time-dependent increase in cx32 was observed in the liver of all the groups of MAK treated mice compared to the normal diet-fed controls. Cx32 expression increased 2-fold after 1 week of treatment, and 3-to 4-fold at 3 months (Figure 4, Pane...
    What are the primary indications for a decompressive craniectomy, and what role does neurocritical care play in determining the suitability of a patient for this procedure?
    Decompressive craniectomy is a valid neurosurgical strategy now a day as an alternative to control an elevated intracranial pressure (ICP) and controlling the risk of uncal and/or subfalcine herniation, in refractory cases to the postural, ventilator, and pharmacological measures to control it. The neurocritical care and the ICP monitorization are key determinants to identify and postulate the inclusion criteria to consider a patient as candidate to this procedure, as it is always considered a rescue surgical technique. Head trauma and ischemic or hemorrhagic cerebrovascular disease with progressive deterioration due to mass effect are some of the cases that may require a decompressive craniectomy with its different variants. However, this procedure per se can have complications described in the postcraniectomy syndrome and may occur in short, medium, or even long term.

    1,2 The paradoxical herniation is a condition in which there is a deviation of the midline with mass effect, even t...
  • Loss: SpladeLoss with these parameters:
    {
        "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
        "lambda_corpus": 3e-05,
        "lambda_query": 5e-05
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
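
To make the interaction of these settings concrete, the sketch below derives the step count and the learning-rate schedule they imply. It assumes a single device with no gradient accumulation, exactly 100,000 / 16 batches per epoch (the `no_duplicates` sampler could change this slightly), and the linear scheduler listed under All Hyperparameters. The `learning_rate` helper is an illustration of linear warmup followed by linear decay, not the trainer's own code.

```python
import math

TRAIN_SAMPLES = 100_000
BATCH_SIZE = 16       # per_device_train_batch_size
EPOCHS = 1            # num_train_epochs
BASE_LR = 2e-05       # learning_rate
WARMUP_RATIO = 0.1    # warmup_ratio

total_steps = math.ceil(TRAIN_SAMPLES / BATCH_SIZE) * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def learning_rate(step: int) -> float:
    """Linear warmup to BASE_LR, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    return BASE_LR * max(0, total_steps - step) / (total_steps - warmup_steps)


print(f"total steps: {total_steps}, warmup steps: {warmup_steps}")
for step in (0, 200, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step}: epoch {step / total_steps:.3f}, lr {learning_rate(step):.2e}")
```

Under these assumptions step 200 corresponds to epoch 0.032, which matches the first row of the training logs.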

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss miriad_eval_dot_ndcg@10 miriad_test_dot_ndcg@10
0.032 200 92.8349 - - -
0.064 400 0.6065 - - -
0.096 600 0.165 - - -
0.128 800 0.1471 - - -
0.16 1000 0.1321 0.1521 0.8001 -
0.192 1200 0.1269 - - -
0.224 1400 0.1015 - - -
0.256 1600 0.1049 - - -
0.288 1800 0.0893 - - -
0.32 2000 0.0921 0.0723 0.8505 -
0.352 2200 0.0845 - - -
0.384 2400 0.106 - - -
0.416 2600 0.0679 - - -
0.448 2800 0.0934 - - -
0.48 3000 0.1054 0.0810 0.8554 -
0.512 3200 0.1052 - - -
0.544 3400 0.0883 - - -
0.576 3600 0.0809 - - -
0.608 3800 0.0791 - - -
0.64 4000 0.0535 0.0688 0.8608 -
0.672 4200 0.0718 - - -
0.704 4400 0.0633 - - -
0.736 4600 0.0657 - - -
0.768 4800 0.0648 - - -
0.8 5000 0.0721 0.0662 0.8651 -
0.832 5200 0.0637 - - -
0.864 5400 0.0637 - - -
0.896 5600 0.0553 - - -
0.928 5800 0.062 - - -
0.96 6000 0.0542 0.0625 0.8663 -
0.992 6200 0.0595 - - -
-1 -1 - - 0.8659 0.8637

Environmental Impact

Carbon emissions were measured using CodeCarbon.

  • Energy Consumed: 0.153 kWh
  • Carbon Emitted: 0.059 kg of CO2
  • Hours Used: 0.431 hours
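
Two derived quantities follow directly from these figures: the average power draw of the tracked hardware and the carbon intensity implied by the report. The short calculation below only rearranges the three reported numbers; the variable names are illustrative.

```python
energy_kwh = 0.153   # Energy Consumed
carbon_kg = 0.059    # Carbon Emitted
hours = 0.431        # Hours Used

avg_power_w = energy_kwh / hours * 1000      # mean draw in watts
carbon_intensity = carbon_kg / energy_kwh    # kg of CO2 per kWh

print(f"average power: {avg_power_w:.0f} W")
print(f"implied carbon intensity: {carbon_intensity:.3f} kg CO2/kWh")
```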

Training Hardware

  • On Cloud: No
  • GPU Model: 1 x NVIDIA GeForce RTX 3090
  • CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
  • RAM Size: 31.78 GB

Framework Versions

  • Python: 3.11.6
  • Sentence Transformers: 4.2.0.dev0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.1
  • Datasets: 2.21.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

SpladeLoss

@misc{formal2022distillationhardnegativesampling,
      title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
      author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
      year={2022},
      eprint={2205.04733},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2205.04733},
}

SparseMultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

FlopsLoss

@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}