AlpinDale
|
05e45aeb53
fix: dtype mismatch for paligemma
|
5 months ago |
AlpinDale
|
bf4f113ef1
feat: add paligemma vision model support
|
5 months ago |
AlpinDale
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
AlpinDale
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
AlpinDale
|
b6ff0623a6
chore: clean up branding
|
6 months ago |
AlpinDale
|
c5d8028668
fix: no need to redefine supports_vision and supports_lora in model class
|
6 months ago |
AlpinDale
|
56e0b8223c
chore: add base class for LoRA-supported models
|
6 months ago |
AlpinDale
|
656459fd84
make fp8_e4m3 work on nvidia
|
6 months ago |
AlpinDale
|
eaa06fdd14
fix some f-strings
|
6 months ago |
AlpinDale
|
50b7c13db0
refactor: attention selector (#552)
|
6 months ago |
AlpinDale
|
b178ae4b4a
chore: generalize linear_method to be quant_method (#540)
|
6 months ago |
AlpinDale
|
fca911ee0a
vLLM Upstream Sync (#526)
|
7 months ago |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
9 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
10 months ago |
AlpinDale
|
da223153c6
feat&fix: cohere support and missing GPU blocks (#333)
|
11 months ago |
AlpinDale
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
11 months ago |
AlpinDale
|
c41462cfcd
feat: exllamav2 quantization (#305)
|
11 months ago |
AlpinDale
|
e31c6f0b45
feat: refactor modeling logic and support more models (#274)
|
11 months ago |