Commit History

Author SHA1 Message Date
  AlpinDale 1441326ac8 fix: cleanup minicpm-v and port na_vit model 6 months ago
  AlpinDale e3f07b22c3 feat: support for QQQ W4A8 quantization (#612) 6 months ago
  AlpinDale d0378c617f fix: logit processor exceeding vocab size 6 months ago
  AlpinDale 269e9aabda fix: set readonly=True for non-root TPU devices 6 months ago
  AlpinDale 705e50f4bd fix: broadcasting logic for multi_modal_kwargs 6 months ago
  AlpinDale 6b1fdd07bd chore: add isort and refactor formatting script and utils 6 months ago
  AlpinDale 9fcf331f1b feat: add yaml config parsing (#610) 6 months ago
  AlpinDale d8ddc230b6 fix: formatting 6 months ago
  ewof 544c020eb6 chore: sort args (#608) 6 months ago
  AlpinDale ce00ca628c fix: wrap all outlines imports 6 months ago
  AlpinDale 758aef4c17 fix: conditionally import outlines.caching 6 months ago
  AlpinDale 848731f527 chore: add punica sizes for mistral nemo 6 months ago
  AlpinDale 406647ad1f fix: remove artifact 6 months ago
  AlpinDale 1d8616e4f7 fix: massively improve throughput with high number of prompts 6 months ago
  AlpinDale 869ad77843 fix: remove scaled_fp8_quant_kernel padding footgun 6 months ago
  AlpinDale c85ae34877 chore: bump openvino toolkit to pre-release 6 months ago
  AlpinDale dc1b59df9c fix: compiler warnings for _C and _moe 6 months ago
  AlpinDale d8a51d05a7 fix: seeded gens with pipeline parallel 6 months ago
  AlpinDale 9d66a933f2 fix: paligemma mmp 6 months ago
  AlpinDale eef647deab fix: greedy decoding in TPU 6 months ago
  AlpinDale fb22ae6d49 chore: tune int8 kernels for ada lovelace 6 months ago
  AlpinDale 49a2836d61 fix: divide-by-zero warnings in marlin kernels 6 months ago
  AlpinDale 04efb16716 fix: unused variables in awq gemm kernel 6 months ago
  AlpinDale 8d3fb94679 feat: add allowed_token_ids 6 months ago
  AlpinDale 4abbbdad78 chore: make triton fully optional 6 months ago
  AlpinDale 2a042fd7b4 fix: remove timm as a hardcoded requirement 6 months ago
  AlpinDale fbec255dc1 chore: enable tpu tensor parallel in async engine 6 months ago
  AlpinDale e8d34d75e6 fix: deprecation warnings in squeezellm quant_cuda_kernel 6 months ago
  AlpinDale c81023a90a fix: reduce unnecessary compute when logprobs=None 6 months ago
  AlpinDale 682a9db0ed chore: tune fp8 kernels for ada lovelace cards 6 months ago