Custom 4-bit Finetuning 5-7 times faster inference than QLoRA
pinned
1
#9 opened over 1 year ago
by
rmihaylov
Update tokenizer_config.json
#92 opened 15 days ago
by
Snanni
Update tokenizer_config.json
1
#91 opened 23 days ago
by
Maryammmmmm
falcon-40b-instruct error on Inference endpoint while deploying
#90 opened about 2 months ago
by
digitalsanjeev
AI World
#89 opened 10 months ago
by
MohammadMuzamil
Adding `safetensors` variant of this model
#88 opened 12 months ago
by
Dennison33
combining falcon 40b instruct with langchain
#87 opened about 1 year ago
by
rra21
Update generation_config.json
1
#85 opened over 1 year ago
by
nkasmanoff
Update generation_config.json
1
#84 opened over 1 year ago
by
nkasmanoff
Getting gibberish output with Falcon-40b instruct
2
#83 opened over 1 year ago
by
harsh244
Falcon 40B Inference on GKE Autopilot A100 40GB
3
#82 opened over 1 year ago
by
bshongwe
Adding `safetensors` variant of this model
#81 opened over 1 year ago
by
Flolight
Adding `safetensors` variant of this model
2
#80 opened over 1 year ago
by
Flolight
CPU or GPU
1
#76 opened over 1 year ago
by
lalit34
Optimizing Inference Time for Chat Conversations on Falcon
#73 opened over 1 year ago
by
humza-sami
Use input attention mask instead of causal mask in attention
#72 opened over 1 year ago
by
CyberZHG
Is there a way to not use trust_remote_code=True?
#71 opened over 1 year ago
by
momentumhd
Unable to load and run finetuned falcon model
#70 opened over 1 year ago
by
DioulaD
Parameters contain NaN values when loading model locally
#69 opened over 1 year ago
by
yunsxie
ValueError: sharded is not supported for AutoModel ERROR
8
#68 opened over 1 year ago
by
peyers
ValueError in KoboldAI when loading the model
1
#66 opened over 1 year ago
by
JermemyHaschal
Cannot set "instructions" when invoking inference endpoint
1
#65 opened over 1 year ago
by
aruana
Changes in modelling_RW.py to be able to handle past_key_values for faster model generations
#64 opened over 1 year ago
by
puru22
Model sometimes generates '</s>'
1
#63 opened over 1 year ago
by
jlzhou
Correct blogpost link
#62 opened over 1 year ago
by
isydmr
Error: ShardCannotStart
#61 opened over 1 year ago
by
Bhupesh2003
Finetuning Falcon-40B-Instruct For ChatBot Use Case
1
#59 opened over 1 year ago
by
sdkramer10
Adding `safetensors` variant of this model
2
#58 opened over 1 year ago
by
nth-attempt
Add `tokenizer_class` to get `pipeline` to load tokenizer
#57 opened over 1 year ago
by
chiragjn
Adding `safetensors` variant of this model
#56 opened over 1 year ago
by
shayan
ValueError: Error raised by inference API: Model tiiuae/falcon-40b-instruct time out using HuggingFaceHub
1
#55 opened over 1 year ago
by
nicoleds
Question about Apache 2.0 license
2
#54 opened over 1 year ago
by
psinger
Running the Falcon-40B-Instruct model on Azure Kubernetes Service
#53 opened over 1 year ago
by
zioproto
Experimental ggml demos
2
#52 opened over 1 year ago
by
matthoffner
Truncated output from API call through langchain
4
#51 opened over 1 year ago
by
TMTechnology
Experiences with complex instructions
1
#50 opened over 1 year ago
by
Tuana
Update README.md
#49 opened over 1 year ago
by
saattrupdan
Why Rotary Positional Embeddings Over ALiBi?
#48 opened over 1 year ago
by
mallorbc
About Input validation error: `inputs` tokens + `max_new_tokens` must be <= 1512.
3
#47 opened over 1 year ago
by
Holynull
Is an ALiBi version available for fine-tuning to a large context window?
3
#46 opened over 1 year ago
by
run
Finetune Falcon-4b with large token size.
2
#44 opened over 1 year ago
by
amnasher
Model returns entire input prompt together with output
11
#43 opened over 1 year ago
by
andee96
Instruction prompt
3
#42 opened over 1 year ago
by
mazzaqq
Update README.md
#41 opened over 1 year ago
by
zagg8705
Arabic Language support
2
#40 opened over 1 year ago
by
Hgdawy
Request: DOI
#39 opened over 1 year ago
by
ongkn
What is the input token length of Falcon-40B and -7B models?
3
#38 opened over 1 year ago
by
sermolin
AttributeError: 'RWConfig' object has no attribute 'n_hea'
2
#36 opened over 1 year ago
by
ibrim
CUDA error on more than 400 words
#35 opened over 1 year ago
by
a749734
test case one
1
#33 opened over 1 year ago
by
FALCONBoy