AlpinDale | 93bc863591 | feat: Machete Kernels for Hopper GPUs (#842) | 1 month ago
AlpinDale | 0256ed236b | feat: windows support (#790) | 2 months ago
AlpinDale | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 9 months ago
AlpinDale | e42a78381a | feat: switch from pylint to ruff (#322) | 9 months ago
AlpinDale | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago
AlpinDale | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 11 months ago
AlpinDale | 2755a48d51 | merge dev branch into main (#153) | 1 year ago
AlpinDale | 9b317aa26a | feat: finish up tests and workflows (#87) | 1 year ago
AlpinDale | a6a4220fa6 | feat: refactor megatron and quants (#57) | 1 year ago
henk717 | 0b2b62fe96 | Micromamba Runtime (#54) | 1 year ago
AlpinDale | 76b2e4a445 | Merge dev branch into main (#7) | 1 year ago
AlpinDale | b188d1093b | test: throughput | 1 year ago
AlpinDale | 908091008e | readme: typo | 1 year ago
AlpinDale | 6dfca19dda | fix: gpt-j loading | 1 year ago
AlpinDale | 35ec43f478 | fix: remove aria2 for now | 1 year ago
AlpinDale | a69f1ecf51 | chore: qol improvements | 1 year ago
AlpinDale | 16df1763c8 | fix: typos in the attention file | 1 year ago
AlpinDale | b8f4337c5b | chore: various fixes | 1 year ago
AlpinDale | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
AlpinDale | beb966180b | fix: various typo and import error fixes | 1 year ago
AlpinDale | 3c3944153c | feat: add generic attention and FP32 dtype kernels | 1 year ago
AlpinDale | b6804de3c7 | chore: add pycache to gitignore | 1 year ago
AlpinDale | 2e86d50e19 | feat: draft for ray | 1 year ago