Commit History

Author SHA1 Message Date
  AlpinDale d3351c75f1 fix minor cuda version mismatch with runtime 9 months ago
  AlpinDale a304f76d89 feat: Intel CPU support (#403) 9 months ago
  AlpinDale fa083286e3 Speculative Decoding Part 4: Lookahead scheduling (#402) 9 months ago
  AlpinDale aa1db50131 simplify tokenizer.py 9 months ago
  AlpinDale d68fad5a79 feat: add optimized layernorm kernels (#398) 9 months ago
  AlpinDale 95faf27d2b fix build for 7.5 9 months ago
  AlpinDale ea26c91e52 proper typing 9 months ago
  sgsdxzy 47370d2ad5 Fix cohere for command-r+ (#394) 9 months ago
  AlpinDale 2c512a0824 CMake build system (#395) 9 months ago
  AlpinDale 3abc641d68 directly use in forward pass 9 months ago
  sgsdxzy 255c2f1d67 small fixes (#393) 10 months ago
  AlpinDale 04c38f2b91 cache tokenizer len 10 months ago
  AlpinDale c6fc4d2c90 fix case when API request to top_k is 0 10 months ago
  AlpinDale 0f0ec6832b rccl path for ROCm 10 months ago
  AlpinDale a472276ef3 fix codespell path 10 months ago
  AlpinDale 98a565d5ec do not use codespell on kernels 10 months ago
  AlpinDale c3c374396b logprobs fixes 10 months ago
  AlpinDale d49187c231 this is kinda dumb if you ask me 10 months ago
  AlpinDale aa244761ed formatting and typing 10 months ago
  AlpinDale 41beab5dc1 add exllamav2 tensor parallel, fused MoE for GPTQ/AWQ 10 months ago
  AlpinDale 10e708726e enable multi-node inference 10 months ago
  AlpinDale 7533d4d458 optional vision language config for neuron 10 months ago
  AlpinDale f845a661dd Chunked Prefill Part 2: data update 10 months ago
  AlpinDale bd44122b8e add qwen2moe support (needs transformers git) 10 months ago
  AlpinDale eff5eb16c5 ruff 10 months ago
  AlpinDale 753f6dc51b add v2 block manager 10 months ago
  AlpinDale 211f040107 formatting 10 months ago
  AlpinDale aa23ca6ba9 add dbrx support 10 months ago
  AlpinDale 19659949f1 fix nccl path 10 months ago
  AlpinDale ec5f25e17f support gemma 1.1 models with approximate gelu 10 months ago