flash attention 2
#3 · by Itaykatzir
Does it support flash attention 2?
Can you give an example of usage?
We use `torch.nn.functional.scaled_dot_product_attention`, which under the hood dispatches to FlashAttention-2 when the input conditions are met (a CUDA device, fp16/bf16 inputs, etc.).
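
Not this repo's official example, just a minimal sketch of how `scaled_dot_product_attention` is typically called. The tensor shapes and `is_causal` flag here are illustrative assumptions; the `sdpa_kernel` context manager requires a reasonably recent PyTorch (the older equivalent is `torch.backends.cuda.sdp_kernel`):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

device = "cuda"        # FlashAttention-2 requires a CUDA device
dtype = torch.float16  # and fp16 or bf16 inputs

# Illustrative shapes: (batch, num_heads, seq_len, head_dim)
q = torch.randn(2, 8, 1024, 64, device=device, dtype=dtype)
k = torch.randn(2, 8, 1024, 64, device=device, dtype=dtype)
v = torch.randn(2, 8, 1024, 64, device=device, dtype=dtype)

# Default call: PyTorch picks the fastest available backend
# (flash, memory-efficient, or math) based on the inputs.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# To confirm FA2 is actually used, restrict dispatch to the flash
# backend; this errors out if the inputs don't meet FA2's requirements.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```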