guided_decoding | 5b0c11d190 | support pipeline parallel pynccl groups | 7 months ago
layers | e8b7f53321 | allow prompt token IDs in the logits processor api | 7 months ago
model_loader | 7d0884de9a | fix mistral v0.3 weight loading | 7 months ago
models | ac79d115b3 | add guards for prefix caching, fp8, chunked, etc | 7 months ago
__init__.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
pooling_metadata.py | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 7 months ago
sampling_metadata.py | 35ae01d7ba | refactor: attention metadata term | 7 months ago
utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago