Commit History

Author SHA1 Message Date
  AlpinDale 0f4a9ee77b quantized lm_head (#582) 6 months ago
  AlpinDale cf472315cc refactor: isolate FP8 from mixtral 7 months ago
  AlpinDale ae04f57ec1 feat: Pipeline Parallel support (#581) 7 months ago
  AlpinDale dd378ea063 feat: MLPSpeculator with tensor parallel 7 months ago
  AlpinDale 3a0fdf7b9b chore: remove `image_input_type` from VLM config 7 months ago
  AlpinDale 63b735bc2a chore: optimize v2 block manager to match the performance of v1 7 months ago
  AlpinDale de7e6919c0 feat: support tied weights and input scale for MLPSpeculator 7 months ago
  AlpinDale 9da2448964 fix: ensure worker model loop is always stopped at the right time 7 months ago
  AlpinDale ca6b69966d fix: explicitly end_forward() calls to flashinfer 7 months ago
  AlpinDale 3b2666314d fix: add chunking mechanism to fused_moe 7 months ago
  AlpinDale 70ecdefc4e fix: Ray ActorDiedError not available on older ray versions 7 months ago
  AlpinDale 7253e9052d feat: integrate typical acceptance sampling for spec decoding 7 months ago
  AlpinDale 7d79c0e726 chore: use nvml query to avoid accidental cuda initialization 7 months ago
  AlpinDale ddb3323f94 refactor: have w8a8 compressed tensors use `process_weights_after_load` for fp8 7 months ago
  AlpinDale 17f7089e26 fix: `get_min_capability` for all quants 7 months ago
  AlpinDale 0a6db357d8 fix: use safetensor keys instead of adapter_config.json to find unexpected modules 7 months ago
  AlpinDale 4f87a14998 chore: allow base64 embeddings 7 months ago
  AlpinDale aea0b52e52 fix: torchvision version for rocm 7 months ago
  AlpinDale 4cdc810b1c fix: minor TP issues with vision models 7 months ago
  AlpinDale 336eb4dbf8 fix: raise error in moe kernel if it receives more than 65k tokens 7 months ago
  AlpinDale bcc60a6555 chore: optimize SequenceStatus.is_finished by switching to IntEnum 7 months ago
  AlpinDale 7b04361934 fix: support getting `eos_token_id` from the config file 7 months ago
  AlpinDale b8a19ba27f chore: extend aphrodite metrics logging api 7 months ago
  AlpinDale 301ec7c77d fix: pad slot id in tpu runner 7 months ago
  AlpinDale d0ff3fd59e fix: tpu sampler output 7 months ago
  AlpinDale b6e60143e7 Flashinfer for prefill phase (#580) 7 months ago
  AlpinDale 7e66e8f899 fix: only add `Attention.kv_scale` if kv cache quant is enabled 7 months ago
  AlpinDale bbde979ecd DeepSeek-V2 (#579) 7 months ago
  AlpinDale 272c64ab88 chore: allow loading fp8 models with fused qkv/mlp 7 months ago
  AlpinDale 772a126c08 chore: simplify fp8 weight loading 7 months ago