Commit History

Author SHA1 Message Date
  AlpinDale c577c31aaa feat: tree attention 9 months ago
  50h100a f67b5be198 chore: port sampler+metadata changes from main to dev (#427) 9 months ago
  AlpinDale 8c67b37131 fix docstrings 9 months ago
  AlpinDale fe17712f29 fully working chunked prefill 9 months ago
  AlpinDale 50c2434267 move megatron to a top-level directory 9 months ago
  AlpinDale 071269e406 feat: FP8 E4M3 KV Cache (#405) 9 months ago
  AlpinDale f845a661dd Chunked Prefill Part 2: data update 10 months ago
  AlpinDale 5f851e45e5 ruff 10 months ago
  AlpinDale 41f5af0426 add python nccl wrapper, remove cupy 10 months ago
  AlpinDale 7b9c08afae vision model support 10 months ago
  AlpinDale 0f1399c135 feat: attention refactor part 2 10 months ago
  AlpinDale 2319b411ce refactor: neuron support 10 months ago
  AlpinDale 15308ffb5b compute logits in model_runner 10 months ago
  AlpinDale 78d66f16d1 Chunked Prefill Part 1 (#384) 10 months ago
  AlpinDale 9181fa0396 feat: Triton kernels for sampling (#383) 10 months ago
  AlpinDale 4b99ac15b7 fix: do not deepcopy metadata 10 months ago
  AlpinDale 17b034613d chore: make metadata a dataclass (#377) 10 months ago
  AlpinDale f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 10 months ago
  50h100a b9e0ae87c5 fix fine-grained seeding. 10 months ago
  AlpinDale e42a78381a feat: switch from pylint to ruff (#322) 10 months ago
  sgsdxzy 50c0875c32 chore: log total memory usage (#316) 10 months ago
  AlpinDale c2d77b1822 chore: logging refactor (#302) 11 months ago
  AlpinDale 9810daa699 feat: INT8 KV Cache (#298) 11 months ago
  AlpinDale ac82b67f75 feat: naive context shift and various QoL changes (#289) 11 months ago
  AlpinDale 4d04ade9ef feat: fine-grained seeds (#279) 11 months ago
  AlpinDale 697c06c4f5 fix: LoRA support for mixtral (#276) 11 months ago
  AlpinDale 4b80b42362 fix: memory leaks due to nccl cuda graphs (#275) 11 months ago
  AlpinDale ea0f57b233 feat: allow further support for non-cuda devices (#247) 1 year ago
  AlpinDale 1a94ccf3cf fix: prefix cache fail with lora (#239) 1 year ago
  AlpinDale c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago