Audio-to-Audio · Transformers · Safetensors · speech_language_model

gallilmaimon committed · Commit a10a98b · verified · 1 Parent(s): b003bad

Update README.md

Files changed (1)
  1. README.md +7 -7

README.md CHANGED
@@ -33,13 +33,13 @@ The model was trained by next-token prediction over a subset of LibriSpeech, Lib
 
 ### Model Sources
 
-- **Repository:** [https://github.com/slp-rl/slam](https://github.com/slp-rl/slam)
 - **Paper:** [Soon!]
 - **Demo:** [Link](https://pages.cs.huji.ac.il/adiyoss-lab/slamming/)
 
 ## Uses
-This is a base SpeechLM and as such can be used to generate continuations for speech segments, or as a base for further tuning. See the _slam_
-[codebase](https://github.com/slp-rl/slam) for more details on usage, and check out the [demo page](https://pages.cs.huji.ac.il/adiyoss-lab/slamming/) for some generation examples.
 
 ### Out-of-Scope Use
 This model was trained on curated speech datasets which contain mainly audiobooks and stories; as such, the outputs should not be treated as factual in any way.
@@ -47,7 +47,7 @@ This model was trained on curated speech datasets which contain mainly audio-boo
 
 
 ## How to Get Started with the Model
-We refer users to the official repository for full usage explanations - [github](https://github.com/slp-rl/slam).
 
 
 ## Training Details
@@ -62,12 +62,12 @@ dataset [SpokenSwag](https://huggingface.co/datasets/slprl/SpokenSwag).
 
 ### Training Procedure
 This model was trained by next-token prediction over several datasets, and then trained with DPO over [SpokenSwag](https://huggingface.co/datasets/slprl/SpokenSwag).
-Please refer to the [paper]() or [code](https://github.com/slp-rl/slam) for the full training recipes.
 
 #### Preprocessing
 Speech tokens are extracted from the audio using [Hubert-25hz](https://huggingface.co/slprl/mhubert-base-25hz), and quantised using the
 official kmeans released with the model in [textlesslib](https://github.com/facebookresearch/textlesslib/tree/main). Units are de-duplicated.
-We encourage you to explore the official repository for full details - [github](https://github.com/slp-rl/slam).
 
 
 ## Evaluation
@@ -92,7 +92,7 @@ This model was trained as part of ["*Slamming*: Training a Speech Language Model
 This model was trained using **only a single Nvidia A5000 GPU**, 16 CPU cores and 24 GB of RAM for **24 hours**.
 
 #### Software
-The model was trained using the [*Slam*](https://github.com/slp-rl/slam) codebase which builds upon 🤗transformers, extending it to support
 easy and efficient training of Speech Language Models.
 
 ## Citation
 
 
 ### Model Sources
 
+- **Repository:** [https://github.com/slp-rl/slamkit](https://github.com/slp-rl/slamkit)
 - **Paper:** [Soon!]
 - **Demo:** [Link](https://pages.cs.huji.ac.il/adiyoss-lab/slamming/)
 
 ## Uses
+This is a base SpeechLM and as such can be used to generate continuations for speech segments, or as a base for further tuning. See the _SlamKit_
+[codebase](https://github.com/slp-rl/slamkit) for more details on usage, and check out the [demo page](https://pages.cs.huji.ac.il/adiyoss-lab/slamming/) for some generation examples.
 
 ### Out-of-Scope Use
 This model was trained on curated speech datasets which contain mainly audiobooks and stories; as such, the outputs should not be treated as factual in any way.
 
 
 
 ## How to Get Started with the Model
+We refer users to the official repository for full usage explanations - [github](https://github.com/slp-rl/slamkit).
 
 
 ## Training Details
 
 
 ### Training Procedure
 This model was trained by next-token prediction over several datasets, and then trained with DPO over [SpokenSwag](https://huggingface.co/datasets/slprl/SpokenSwag).
+Please refer to the [paper]() or [code](https://github.com/slp-rl/slamkit) for the full training recipes.
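For intuition about the DPO step, the sketch below computes the standard DPO objective for a single preference pair from summed sequence log-probabilities. This is an illustration of the general DPO loss, not SlamKit's implementation; the function name, `beta` value, and toy numbers are invented for the example.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response
    (here: a speech-unit sequence) under the policy or reference model.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = (policy_logp_chosen - ref_logp_chosen) - \
             (policy_logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)); shrinks as the margin grows.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Toy log-probabilities where the policy already prefers the chosen response.
loss = dpo_loss(-10.0, -14.0, -12.0, -13.0, beta=0.1)
```

Minimizing this loss pushes the policy to assign relatively higher probability to the preferred (chosen) speech continuation than the reference model does.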
 
 #### Preprocessing
 Speech tokens are extracted from the audio using [Hubert-25hz](https://huggingface.co/slprl/mhubert-base-25hz), and quantised using the
 official kmeans released with the model in [textlesslib](https://github.com/facebookresearch/textlesslib/tree/main). Units are de-duplicated.
+We encourage you to explore the official repository for full details - [github](https://github.com/slp-rl/slamkit).
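De-duplication here means collapsing consecutive repeats of the same quantised unit into a single token. A minimal sketch of that idea (the actual SlamKit/textlesslib code may differ; the function name is invented):

```python
from itertools import groupby

def deduplicate_units(units):
    """Collapse runs of identical speech units into one token each.

    E.g. [52, 52, 52, 7, 7, 13, 52] -> [52, 7, 13, 52]: consecutive
    duplicates are merged, but later reoccurrences of a unit are kept.
    """
    return [unit for unit, _run in groupby(units)]
```

This shortens the unit sequences the LM must model while preserving the order in which distinct units occur.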
 
 
  ## Evaluation
 
 This model was trained using **only a single Nvidia A5000 GPU**, 16 CPU cores and 24 GB of RAM for **24 hours**.
 
 #### Software
+The model was trained using the [*SlamKit*](https://github.com/slp-rl/slamkit) codebase which builds upon 🤗transformers, extending it to support
 easy and efficient training of Speech Language Models.
 
 ## Citation