.. |
backends | 92bcbdf975 | fix: megacore setting for TPU v5e-litepod | 6 months ago |
ops | d8f9f0ec16 | fix: prefix prefill kernels for fp32 data type | 6 months ago |
__init__.py | a94de94c44 | refactor: combine the prefill and decode into a single API (#553) | 7 months ago |
layer.py | 2105e4fd6b | feat: correctly invoke prefill & decode kernels for cross-attention | 6 months ago |
selector.py | b6e60143e7 | Flashinfer for prefill phase (#580) | 7 months ago |