AlpinDale 62b2c4119d feat: re-write GPTQ and refactor exllama kernels (#152) 1 year ago
quantization 62b2c4119d feat: re-write GPTQ and refactor exllama kernels (#152) 1 year ago
__init__.py 07aa2a492f upstream: add option to specify tokenizer 1 year ago
activation.py 5dbd5f8c30 fix: quant TP (#129) 1 year ago
attention.py 653da510d1 chore: rewrite InputMetadata (#143) 1 year ago
layernorm.py 1aab8a7d6f feat: speedup compilation times by 3x (#130) 1 year ago
linear.py 62b2c4119d feat: re-write GPTQ and refactor exllama kernels (#152) 1 year ago
rotary_embedding.py e386032ae8 fix: rope duplication (#142) 1 year ago
sampler.py 653da510d1 chore: rewrite InputMetadata (#143) 1 year ago
sampler_mirostat.py 653da510d1 chore: rewrite InputMetadata (#143) 1 year ago
vocab_parallel_embedding.py e7b6a2d5a0 chore: tensor parallel refactors part 2 (#116) 1 year ago