Shuyue Jia (Bruce)

shuyuej

https://shuyuej.com

SuperBruceJia

AI & ML interests

A Ph.D. Student at @vkola-lab, Boston University. Passionate about Large Language Models (LLMs), Multimodal Foundation Models, Generative AI, and Medical AI.

New activity in shuyuej/MedLLaMA3-70B-base-INT2-GPTQ 4 months ago

not run

#1 opened 4 months ago by rakmik

New activity in meta-llama/Llama-3.3-70B-Instruct 7 months ago

What Happens If the Prompt Exceeds 8,196 Tokens? And difference between input limit and context length limit?

👀 1

#36 opened 7 months ago by averyyu99

quant versions?

👍 5

#12 opened 7 months ago by apol

New activity in TheBloke/h2ogpt-research-oasst1-llama-65B-GPTQ 7 months ago

RecursionError: maximum recursion depth exceeded

#1 opened about 2 years ago by WajihUllahBaig

New activity in shuyuej/e5-mistral-7b-instruct-GPTQ 11 months ago

missing model.safetensors.index.json

#1 opened 11 months ago by kresimirfijacko

New activity in shuyuej/Mistral-Nemo-Instruct-2407-GPTQ 11 months ago

Can you create gptq 8 bits quants?

#1 opened 11 months ago by rjmehta

New activity in hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4 11 months ago

Update quantize_config.json

#12 opened 11 months ago by shuyuej

Update config.json

#11 opened 11 months ago by shuyuej

Source codes to quantize the LLaMA 3.1 405B model

#10 opened 11 months ago by shuyuej

New activity in hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4 12 months ago

Request for Mistral Large Instruct GPTQ INT4

#2 opened 12 months ago by sparsh35

New activity in mistralai/Mamba-Codestral-7B-v0.1 12 months ago

Missing config.json

➕ 2

#6 opened 12 months ago by wxl2001

New activity in CohereLabs/c4ai-command-r-v01 12 months ago

Learning Rate during pretraining

#58 opened 12 months ago by shuyuej

New activity in NovaSearch/stella_en_1.5B_v5 12 months ago

Model max_seq_length

#6 opened 12 months ago by shuyuej

New activity in Salesforce/SFR-Embedding-2_R 12 months ago

Model max_seq_length

#4 opened 12 months ago by shuyuej

New activity in openlifescienceai/open_medical_llm_leaderboard about 1 year ago

Where can we find `eval_medical_llm.py` and `main.py`

#15 opened about 1 year ago by shuyuej

New activity in google/gemma-7b about 1 year ago

Fine-Tune a gemma model for question answering

👍 1

#62 opened over 1 year ago by Iamexperimenting

Weird Performance Issue with Gemma-7b compared to Gemma-2b with Qlora

#91 opened about 1 year ago by UserDAN

New activity in mistralai/Mixtral-8x7B-Instruct-v0.1 over 1 year ago

What is the actual context size of mistralai/Mixtral-8x7B-Instruct-v0.1 model

#186 opened over 1 year ago by Pradeep1995

New activity in google/gemma-7b over 1 year ago

Very different results with float16. [Actually, gemma-7b-it does not work with float16]

👍 3

#33 opened over 1 year ago by EarthWorm001
