Commit History

Author SHA1 Message Date
  sgsdxzy 214151b04c fix: max_num_batched_tokens for chunked_prefill (#412) 9 months ago
  sgsdxzy fcfb72af24 Support arbitrary model in GGUF. (#381) 9 months ago
  AlpinDale 50c2434267 move megatron to a top-level directory 9 months ago
  AlpinDale 7528e0ce3e make detokenization optional 9 months ago
  AlpinDale 23a1114e4f enable hf_transfer if installed 9 months ago
  AlpinDale 071269e406 feat: FP8 E4M3 KV Cache (#405) 9 months ago
  AlpinDale 41beab5dc1 add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ 10 months ago
  AlpinDale 609710b940 LockFile -> SoftLockFile 10 months ago
  AlpinDale f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 10 months ago
  AlpinDale e42a78381a feat: switch from pylint to ruff (#322) 10 months ago
  AlpinDale 89c32b40ec chore: add new imatrix quants (#320) 10 months ago
  AlpinDale c41462cfcd feat: exllamav2 quantization (#305) 11 months ago
  AlpinDale c2d77b1822 chore: logging refactor (#302) 11 months ago
  AlpinDale 842912d022 feat: on-the-fly gguf conversion (#250) 1 year ago
  AlpinDale 8b6790d504 fix: gguf config not recognized 1 year ago
  AlpinDale 4faf78ba29 fix: grab correct quant config from revisions (#246) 1 year ago
  AlpinDale c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago
  AlpinDale 8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221) 1 year ago
  AlpinDale f013d714c0 chore: merge dev branch into main (#177) 1 year ago
  AlpinDale 2755a48d51 merge dev branch into main (#153) 1 year ago
  AlpinDale 887e03669a feat: add exllamav2 for GPTQ (#99) 1 year ago
  AlpinDale 74604eb252 fix: pylint complaints (#91) 1 year ago
  AlpinDale efc6f7fbec chore: reformats (#90) 1 year ago
  AlpinDale f04588203e feat: mistral AWQ support and file blacklisting 1 year ago
  AlpinDale 0495c50a3e GPTQ+exllama support (#21) 1 year ago
  AlpinDale 779148bfc3 fix missing import in llama modeling 1 year ago
  AlpinDale 303c782c79 fix initialization code 1 year ago
  AlpinDale d9c1d4f6e5 add awq support 1 year ago
  AlpinDale 39beed0b87 Revert "Refactor AWQ support." 1 year ago
  AlpinDale d09e27f5d4 Refactor AWQ support. 1 year ago