Tri Dao
|
3f7d5786ba
Pass alibi slopes to flash_attn_with_kvcache during generation
|
11 months ago |
Tri Dao
|
2c7d7b7396
Implement norm head for Baichuan2
|
11 months ago |
Tri Dao
|
c3b2196652
Add Alibi to MHA, test with Baichuan-13B
|
11 months ago |
Tri Dao
|
dfe29f5e2b
[Gen] Don't use ft_attention, use flash_attn_with_kvcache instead
|
1 year ago |
Tri Dao
|
8a733cbd53
[Gen] Fix calling update_graph_cache in tests
|
1 year ago |
Tri Dao
|
a86442f0f3
[Gen] Use flash_attn_with_kvcache in generation
|
1 year ago |
Tri Dao
|
9795159082
[Rotary] Set device before launching Triton kernel to avoid error
|
1 year ago |
Tri Dao
|
913922cac5
[Gen] Refactor decoding function
|
1 year ago |
Tri Dao
|
798858f9f1
Fix test_baichuan
|
1 year ago |
GAOXinyu
|
a8c35b4f57
FEAT: add codes which supporting for baichuan-inc/Baichuan-7B (#425)
|
1 year ago |