Update README.md

README.md CHANGED

@@ -75,25 +75,6 @@ This pipeline can be broken up into three key steps:

## Why did we choose DeepSpeed?

**DeepSpeed Training:**

The `main.py` script takes the DeepSpeed config via the argument `--deepspeed_config ./ds_config.json`.
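
With the `deepspeed` launcher this typically looks like `deepspeed main.py --deepspeed_config ./ds_config.json`, though the repository's own run scripts may differ.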

We read the DeepSpeed documentation and created a specific configuration based on their work. The JSON file `ds_config.json` here is set to use the [ZeRO-2](https://www.microsoft.com/en-us/research/blog/ZeRO-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/) stage and FP16, allowing much faster training and saving GPU memory. Note that ZeRO-2 is just one of the options DeepSpeed offers: you may also use ZeRO-1, ZeRO-3, ZeRO-Offload, and ZeRO-Infinity. For more information on the DeepSpeed ZeRO family, please see this [tutorial](https://www.deepspeed.ai/tutorials/zero/) for ZeRO-1/2/3 and this [tutorial](https://www.deepspeed.ai/tutorials/zero-offload/) for ZeRO-Offload.
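
A minimal sketch of what such a `ds_config.json` could look like (the values below are illustrative assumptions, not the repository's actual settings):

```json
{
  "train_batch_size": 32,
  "gradient_accumulation_steps": 1,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2
  }
}
```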

To enable DeepSpeed ZeRO training, we added a few lines of code, e.g.:

```python
# Wrap the model, optimizer, and LR scheduler into a DeepSpeed engine
model, optimizer, _, lr_scheduler = deepspeed.initialize(model=model,
                                                         optimizer=optimizer,
                                                         args=args,
                                                         lr_scheduler=lr_scheduler,
                                                         dist_init_required=True)
```
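
For context, a typical DeepSpeed training step then drives the returned engine directly (a hedged sketch; `data_loader` and the loss computation are assumptions, and the actual loop in `main.py` may differ):

```python
for batch in data_loader:
    loss = model(batch)   # forward pass through the DeepSpeed engine
    model.backward(loss)  # engine handles FP16 loss scaling and ZeRO gradient partitioning
    model.step()          # optimizer step, LR scheduler step, and gradient zeroing
```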

## **Acknowledgements**

We thank the following papers and open-source repositories, and we especially thank the DeepSpeed team for their framework.