Commit History

Author SHA1 Message Date
  AlpinDale 7e1d2c9feb fix: add images/ to gitignore 7 months ago
  AlpinDale 8d77c69cbd feat: support image processor and add llava example 7 months ago
  AlpinDale 00acf371f9 rocm: fused topk softmax 7 months ago
  AlpinDale 78de98463b feat: return max_model_len in /v1/models 7 months ago
  AlpinDale 8c61fb9c19 fix: prevent LLM.encode() to be used with causal models 7 months ago
  AlpinDale 5fecc6b025 when was this deprecated? 7 months ago
  AlpinDale 690110a051 feat: bitsandbytes quantization 7 months ago
  AlpinDale 0307da9e15 refactor: bitsandbytes -> autoquant 7 months ago
  AlpinDale f2c6791527 feat: update cutlass fp8 configs 7 months ago
  AlpinDale 54f4f1e7f3 allow the cutlass kernels to take scales that reside on the GPU 7 months ago
  AlpinDale 52474b8fa9 build: parallelize all build extensions 7 months ago
  AlpinDale 67084aca5b do not build cutlass kernels if cuda version is too low 7 months ago
  AlpinDale b029a544ff optimize eager mode host time with numpy 7 months ago
  AlpinDale ced1b36b8b feat: support head size of 192 7 months ago
  AlpinDale 4ab4c5c87c oops 7 months ago
  AlpinDale 9e79a15b9f fix: ignore warnings for sparseml 7 months ago
  AlpinDale d45c846c8c do not build sm_90a for cuda 11 7 months ago
  AlpinDale 08f639b8aa remove duplicate seq_lens_tensor 7 months ago
  AlpinDale 072aec1062 automatically detect sparseml models 7 months ago
  AlpinDale 5cedee9024 fix gemma with gptq marlin 7 months ago
  AlpinDale 9d19811d4f avoid the need to pass `None` values to `Sequence.inputs` 7 months ago
  AlpinDale f2b7a42c4e fix: async cancels in merge_async_iterators for python>=3.9 7 months ago
  AlpinDale 9099040472 feat: cross-attention kv caching support 7 months ago
  AlpinDale b2fd915c35 improve p2p access check 7 months ago
  AlpinDale 7194047318 remove vllm-nccl 7 months ago
  AlpinDale 6785d78d82 fix: do not expose EOS token in the API 7 months ago
  AlpinDale 90ceab32ff refactor: consolidate prompt args to LLM engines 7 months ago
  AlpinDale e4ea3da1ad fix: tensor parallel with embedding model 7 months ago
  AlpinDale f40b809d3b allow using v2 block manager with sliding window 7 months ago
  AlpinDale 2649f3f14e aqlm works on pascal 7 months ago