Dmytro Dzhulgakov
dzhulgakov
6 followers · 10 following
AI & ML interests
None yet
Organizations
dzhulgakov's activity
New activity in
deepseek-ai/DeepSeek-V3.1
2 months ago
Add tools to the end of the system prompt
#20 opened 2 months ago by
dzhulgakov
New activity in
moonshotai/Kimi-K2-Instruct
4 months ago
Adjust number of reserved tokens to match the model
#15 opened 4 months ago by
dzhulgakov
New activity in
deepseek-ai/DeepSeek-V3
10 months ago
Bug in fp8_cast_bf16.py
1
#4 opened 10 months ago by
dzhulgakov
New activity in
deepseek-ai/DeepSeek-Coder-V2-Instruct
over 1 year ago
How important is the grouped_topk?
👀
1
#6 opened over 1 year ago by
dzhulgakov
New activity in
google/gemma-2-9b
over 1 year ago
Can't repro MMLU: sliding window attention implementation seems broken
3
#11 opened over 1 year ago by
dzhulgakov
New activity in
google/gemma-7b-it
over 1 year ago
Running sample code gives me a shape error
1
#22 opened over 1 year ago by
dzhulgakov
New activity in
DiscoResearch/mixtral-7b-8expert
almost 2 years ago
Update modeling_moe_mistral.py
2
#1 opened almost 2 years ago by
bjoernp
commented
a paper
about 2 years ago
Mistral 7B
Paper • 2310.06825 • Published Oct 10, 2023 • 55 • 8