sgsdxzy | 214151b04c | fix: max_num_batched_tokens for chunked_prefill (#412) | 9 months ago
sgsdxzy | fcfb72af24 | Support arbitrary model in GGUF. (#381) | 9 months ago
AlpinDale | 50c2434267 | move megatron to a top-level directory | 9 months ago
AlpinDale | 7528e0ce3e | make detokenization optional | 9 months ago
AlpinDale | 23a1114e4f | enable hf_transfer if installed | 9 months ago
AlpinDale | 071269e406 | feat: FP8 E4M3 KV Cache (#405) | 9 months ago
AlpinDale | 41beab5dc1 | add exllamav2 tensor parallel, fused MoE for GPTQ/AWQ | 10 months ago
AlpinDale | 609710b940 | LockFile -> SoftLockFile | 10 months ago
AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 10 months ago
AlpinDale | e42a78381a | feat: switch from pylint to ruff (#322) | 10 months ago
AlpinDale | 89c32b40ec | chore: add new imatrix quants (#320) | 10 months ago
AlpinDale | c41462cfcd | feat: exllamav2 quantization (#305) | 11 months ago
AlpinDale | c2d77b1822 | chore: logging refactor (#302) | 11 months ago
AlpinDale | 842912d022 | feat: on-the-fly gguf conversion (#250) | 1 year ago
AlpinDale | 8b6790d504 | fix: gguf config not recognized | 1 year ago
AlpinDale | 4faf78ba29 | fix: grab correct quant config from revisions (#246) | 1 year ago
AlpinDale | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 1 year ago
AlpinDale | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 1 year ago
AlpinDale | f013d714c0 | chore: merge dev branch into main (#177) | 1 year ago
AlpinDale | 2755a48d51 | merge dev branch into main (#153) | 1 year ago
AlpinDale | 887e03669a | feat: add exllamav2 for GPTQ (#99) | 1 year ago
AlpinDale | 74604eb252 | fix: pylint complaints (#91) | 1 year ago
AlpinDale | efc6f7fbec | chore: reformats (#90) | 1 year ago
AlpinDale | f04588203e | feat: mistral AWQ support and file blacklisting | 1 year ago
AlpinDale | 0495c50a3e | GPTQ+exllama support (#21) | 1 year ago
AlpinDale | 779148bfc3 | fix missing import in llama modeling | 1 year ago
AlpinDale | 303c782c79 | fix initialization code | 1 year ago
AlpinDale | d9c1d4f6e5 | add awq support | 1 year ago
AlpinDale | 39beed0b87 | Revert "Refactor AWQ support." | 1 year ago
AlpinDale | d09e27f5d4 | Refactor AWQ support. | 1 year ago