AlpinDale
|
614ca6b0bf
feat: support logits soft capping with flash attention backend
|
4 months ago |
AlpinDale
|
6b1fdd07bd
chore: add isort and refactor formatting script and utils
|
4 months ago |
AlpinDale
|
22305c91e9
refactor _prepare_model_input_tensor and attn metadata builder for most backends
|
5 months ago |
AlpinDale
|
9d7beaa5b9
chore: separate kv_scale into k_scale and v_scale
|
5 months ago |
AlpinDale
|
2105e4fd6b
feat: correctly invoke prefill & decode kernels for cross-attention
|
5 months ago |
AlpinDale
|
405bb74612
Control plane comms refactor (#573)
|
5 months ago |
AlpinDale
|
6a57861fca
feat: initial XPU support via intel_extension_for_pytorch (#571)
|
5 months ago |