Commit History

Author SHA1 Message Date
  AlpinDale 9bdf8d5bfa mamba: enable continuous batching for mamba kernels (#1055) 1 month ago
  AlpinDale a985143768 core: add cuda graph support for encoder-decoder models (#1051) 1 month ago
  AlpinDale 4593a3b306 chore: remove dead code from triton sampling kernels (#1049) 1 month ago
  AlpinDale 638c08d9dc fix: clean shutdown issues (#1047) 1 month ago
  AlpinDale 65a59bbb6b cpu: raise error if using encoder-decoder models (#1027) 1 month ago
  AlpinDale f1ea7711bd core: do not compile ScalarType for torch < 2.4.0 (#938) 1 month ago
  AlpinDale 22a4cd4595 core: fix spec decode metrics and envs circular import (#889) 1 month ago
  AlpinDale 901900854e chore: consolidate environment variables within one file (#882) 1 month ago
  AlpinDale 9fc6473b18 server: log the process occupying our port (#866) 2 months ago
  AlpinDale 0f1af04cf5 frontend: minor logging improvements (#787) 3 months ago
  AlpinDale 0256ed236b feat: windows support (#790) 3 months ago
  50h100a 371d57af82 filesize-driven progress bar for loading tensors 3 months ago
  AlpinDale 0b8b407b6d feat: support profiling with multiple multi-modal inputs per prompt (#712) 4 months ago
  AlpinDale 5d37ec1016 suppress tpu import warning (#696) 4 months ago
  AlpinDale 4fe371b7fa fix: allow passing float for GiB arguments (#690) 4 months ago
  AlpinDale 3f712cd287 feat: add progress bar for loading individual weight modules (#640) 4 months ago
  AlpinDale 7df7b8ca53 optimization: reduce end-to-end overhead from python obj allocation (#666) 4 months ago
  AlpinDale 62111fab17 feat: allow serving encoder-decoder models in the API server (#664) 4 months ago
  AlpinDale 0e5bb11503 fix: make `merge_async_iterators.is_cancelled()` optional (#656) 4 months ago
  AlpinDale a2344d3617 fix: move zeromq rpc frontend to IPC instead of TCP (#652) 4 months ago
  AlpinDale 31f82da8bd chore: deduplicate nvlink check to cuda platform (#643) 4 months ago
  AlpinDale 77c4fbd5c9 fix: better async request cancellation (#641) 4 months ago
  AlpinDale 308501daa5 fix: default api port and attention selector (#634) 5 months ago
  AlpinDale a0e446a17d feat: initial encoder-decoder support with BART model (#633) 5 months ago
  AlpinDale f1d0b77c92 [0.6.0] Release Candidate (#481) 5 months ago
  AlpinDale 9d81716bfd [v0.5.3] Release Candidate (#388) 8 months ago
  AlpinDale e3252edd07 fix: remove event and stream, add typing (#382) 10 months ago
  AlpinDale 33b3786175 fix: cache neuron checks (#379) 10 months ago
  AlpinDale f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 10 months ago
  AlpinDale e53842bd5d fix: cuda home detection for fp8 kv cache 10 months ago