flash attention 2
#3 · by Itaykatzir
Does it support flash attention 2?
Can you give an example of usage?
We use `torch.nn.functional.scaled_dot_product_attention`, which under the hood dispatches to FlashAttention-2 when the input conditions are met (a CUDA device, fp16/bf16 inputs, etc.).
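
Not this repo's official example, just a minimal sketch of how `scaled_dot_product_attention` is typically called. The tensor shapes and `is_causal` flag here are illustrative assumptions; the `sdpa_kernel` context manager requires a reasonably recent PyTorch (the older equivalent is `torch.backends.cuda.sdp_kernel`):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

device = "cuda"        # FlashAttention-2 requires a CUDA device
dtype = torch.float16  # and fp16 or bf16 inputs

# Illustrative shapes: (batch, num_heads, seq_len, head_dim)
q = torch.randn(2, 8, 1024, 64, device=device, dtype=dtype)
k = torch.randn(2, 8, 1024, 64, device=device, dtype=dtype)
v = torch.randn(2, 8, 1024, 64, device=device, dtype=dtype)

# Default call: PyTorch picks the fastest available backend
# (flash, memory-efficient, or math) based on the inputs.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# To confirm FA2 is actually used, restrict dispatch to the flash
# backend; this errors out if the inputs don't meet FA2's requirements.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```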