sgsdxzy fcfb72af24 Support arbitrary model in GGUF. (#381) 8 months ago
..
attention 1270b5567e triton compile error for flash_attn 8 months ago
common fcfb72af24 Support arbitrary model in GGUF. (#381) 8 months ago
distributed b1caee23a6 cache the p2p access check for memory saving 8 months ago
endpoints b1caee23a6 cache the p2p access check for memory saving 8 months ago
engine bd0ddf1cfe feat: EETQ quantization (#408) 8 months ago
executor 373e0d3c01 fix neuron 8 months ago
kv_quant e42a78381a feat: switch from pylint to ruff (#322) 10 months ago
lora fe17712f29 fully working chunked prefill 8 months ago
modeling fcfb72af24 Support arbitrary model in GGUF. (#381) 8 months ago
processing fe17712f29 fully working chunked prefill 8 months ago
spec_decode 4d33ce60da feat: Triton flash attention backend for ROCm (#407) 8 months ago
task_handler 6e0761ba5d make init_distributed_environment compatible with init_process_group 8 months ago
transformers_utils c18bf116da fix stop strings not being excluded from outputs 8 months ago
__init__.py c2aaaefd57 allow out-of-tree model registry 9 months ago
py.typed 1c988a48b2 fix logging and add py.typed 1 year ago