Commit History

Author     SHA1        Message                                                   Date
AlpinDale  9aaeb5d349  add speculative config and arg for later                  9 months ago
AlpinDale  df7ae8ce01  fix spec_decode and block imports                         9 months ago
AlpinDale  d3351c75f1  fix minor cuda version mismatch with runtime              9 months ago
AlpinDale  a304f76d89  feat: Intel CPU support (#403)                            9 months ago
AlpinDale  fa083286e3  Speculative Decoding Part 4: Lookahead scheduling (#402)  9 months ago
AlpinDale  aa1db50131  simplify tokenizer.py                                     9 months ago
AlpinDale  d68fad5a79  feat: add optimized layernorm kernels (#398)              9 months ago
AlpinDale  95faf27d2b  fix build for 7.5                                         9 months ago
AlpinDale  ea26c91e52  proper typing                                             9 months ago
sgsdxzy    47370d2ad5  Fix cohere for command-r+ (#394)                          9 months ago
AlpinDale  2c512a0824  CMake build system (#395)                                 9 months ago
AlpinDale  3abc641d68  directly use in forward pass                              9 months ago
sgsdxzy    255c2f1d67  small fixes (#393)                                        9 months ago
AlpinDale  04c38f2b91  cache tokenizer len                                       9 months ago
AlpinDale  c6fc4d2c90  fix case when API request to top_k is 0                   9 months ago
AlpinDale  0f0ec6832b  rccl path for ROCm                                        9 months ago
AlpinDale  a472276ef3  fix codespell path                                        9 months ago
AlpinDale  98a565d5ec  do not use codespell on kernels                           9 months ago
AlpinDale  c3c374396b  logprobs fixes                                            9 months ago
AlpinDale  d49187c231  this is kinda dumb if you ask me                          9 months ago
AlpinDale  aa244761ed  formatting and typing                                     9 months ago
AlpinDale  41beab5dc1  add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ     9 months ago
AlpinDale  10e708726e  enable multi-node inference                               9 months ago
AlpinDale  7533d4d458  optional vision language config for neuron                9 months ago
AlpinDale  f845a661dd  Chunked Prefill Part 2: data update                       9 months ago
AlpinDale  bd44122b8e  add qwen2moe support (needs transformers git)             9 months ago
AlpinDale  eff5eb16c5  ruff                                                      9 months ago
AlpinDale  753f6dc51b  add v2 block manager                                      9 months ago
AlpinDale  211f040107  formatting                                                9 months ago
AlpinDale  aa23ca6ba9  add dbrx support                                          9 months ago