Sherman Chann

152334H

https://152334H.github.io

commented 2 papers 10 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 141

commented 2 papers 12 months ago

DeepSpeak Dataset v1.0

Paper • 2408.05366 • Published Aug 9, 2024 • 14

OpenResearcher: Unleashing AI for Accelerated Scientific Research

Paper • 2408.06941 • Published Aug 13, 2024 • 33

New activity in meta-llama/Llama-3.1-405B 12 months ago

8-kv-heads

#21 opened about 1 year ago by

New activity in 152334H/miqu-1-70b-sf about 1 year ago

Adding Evaluation Results

#23 opened about 1 year ago by

leaderboard-pr-bot

commented 4 papers about 1 year ago

Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification

Paper • 2407.19340 • Published Jul 27, 2024 • 59

Unveiling Encoder-Free Vision-Language Models

Paper • 2406.11832 • Published Jun 17, 2024 • 55

A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression

Paper • 2406.11430 • Published Jun 17, 2024 • 24

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Paper • 2406.11069 • Published Jun 16, 2024 • 14

commented a paper over 1 year ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 623

New activity in 152334H/miqu-1-70b-sf over 1 year ago

vllm support?

#19 opened over 1 year ago by

Remove extra degrees of freedom by dequantizing the `q5_K_M`, `q4_K_M` and `q2_K` models together?

#18 opened over 1 year ago by

Sticking a restrictive license on a model that's not even yours to begin with?

#14 opened over 1 year ago by

2.4bpp exl2 waiting room

#3 opened over 1 year ago by

Model load fail

#13 opened over 1 year ago by

Chat template

#11 opened over 1 year ago by

Commercial Use?

#12 opened over 1 year ago by

more quantized versions？

#10 opened over 1 year ago by

New activity in miqudev/miqu-1-70b over 1 year ago

An interesting yet useless consideration over the fp16 being out or not.

#21 opened over 1 year ago by