AlpinDale ca6b69966d fix: explicitly end_forward() calls to flashinfer 7 months ago
..
backends ca6b69966d fix: explicitly end_forward() calls to flashinfer 7 months ago
ops 7d79c0e726 chore: use nvml query to avoid accidental cuda initialization 7 months ago
__init__.py a94de94c44 refactor: combine the prefill and decode into a single API (#553) 7 months ago
layer.py 7e66e8f899 fix: only add `Attention.kv_scale` if kv cache quant is enabled 7 months ago
selector.py b6e60143e7 Flashinfer for prefill phase (#580) 7 months ago