sgsdxzy b28011e86e fix: shard exl2 weights more evenly between ranks (#437) 9 months ago
guided_decoding d8c4193704 feat: Speculative Decoding using a draft model (#432) 9 months ago
layers b28011e86e fix: shard exl2 weights more evenly between ranks (#437) 9 months ago
models a3b1602391 fix: rope scaling for cohere and qwen (#436) 9 months ago
__init__.py 0f1399c135 feat: attention refactor part 2 9 months ago
hf_downloader.py 58b0616dd3 feat: support sharded ggufs (#420) 9 months ago
loader.py 589fe0c73e fix: split the exl2 weight loading and SQ+ init (#423) 9 months ago
neuron_loader.py d1786645a3 fix formatting 9 months ago
sampling_metadata.py f67b5be198 chore: port sampler+metadata changes from main to dev (#427) 9 months ago
utils.py d1786645a3 fix formatting 9 months ago