Clarification on the layer hidden state source
#1, opened by websterbei
The default Eagle 3 layer indices for this model would be 2, 18, 33.
The config specifies 1, 17, 32.
To clarify: with layers indexed from 0 to 35 for this model, does it use the input hidden states to layers 1, 17, 32, or the output hidden states from layers 1, 17, 32?
That’s just a different notation for the same layers. SGLang uses the input hidden_states of layers (2, 18, 33), while TRT-LLM uses the output hidden_states of layers (1, 17, 32). Since the output of layer i is the input to layer i+1, the two are essentially the same.
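To illustrate why the two index sets pick out the same tensors, here is a minimal sketch in plain Python. It does not use SGLang or TRT-LLM code; the toy `make_layer` function and the `run` helper are hypothetical stand-ins, and the only assumption is a plain sequential stack of 36 layers indexed from 0 to 35.

```python
# Toy stand-in for a 36-layer decoder stack (indices 0..35).
# Real layers are transformer blocks; any deterministic function
# is enough to show the indexing relationship.
NUM_LAYERS = 36


def make_layer(i):
    # Hypothetical layer i: returns a new list derived from its input.
    return lambda h: [x * 1.01 + i for x in h]


layers = [make_layer(i) for i in range(NUM_LAYERS)]


def run(h):
    """Run the stack, recording what enters and leaves each layer."""
    inputs, outputs = [], []
    for layer in layers:
        inputs.append(h)   # hidden state entering this layer
        h = layer(h)
        outputs.append(h)  # hidden state leaving this layer
    return inputs, outputs


inputs, outputs = run([0.5, -1.0, 2.0])

output_ids = (1, 17, 32)  # "output of layer" notation
input_ids = (2, 18, 33)   # "input to layer" notation

captured_as_outputs = [outputs[i] for i in output_ids]
captured_as_inputs = [inputs[i] for i in input_ids]

# Both notations select the same hidden states.
assert captured_as_outputs == captured_as_inputs
```

The final assertion holds because, in a sequential stack, whatever leaves layer i is exactly what enters layer i+1.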
Got it. The hit rate seems to drop a lot with multi-GPU + TRT-LLM. Any clue?