AlpinDale ac79d115b3 add guards for prefix caching, fp8, chunked, etc 7 months ago
..
guided_decoding 5b0c11d190 support pipeline parallel pynccl groups 7 months ago
layers e8b7f53321 allow prompt token IDs in the logits processor api 7 months ago
model_loader 7d0884de9a fix mistral v0.3 weight loading 7 months ago
models ac79d115b3 add guards for prefix caching, fp8, chunked, etc 7 months ago
__init__.py 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
pooling_metadata.py be8154a8a0 feat: proper embeddings API with e5-mistral-7b support 7 months ago
sampling_metadata.py 35ae01d7ba refactor: attention metadata term 7 months ago
utils.py 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago