Tri Dao
|
f1a73d0740
Run isort and black on python files
|
1 year ago |
Xuechen Li
|
bb4cded17b
support when num_heads is not divisible by world_size; resolves #459 (#461)
|
1 year ago |
Tri Dao
|
93383bd55b
[TP] Implement TensorParallel without sequence parallel
|
1 year ago |
Tri Dao
|
c6ecd40a59
Tweak CrossEntropyLoss to take process_group in init
|
1 year ago |
Tri Dao
|
b4018a5028
Implement Tensor Parallel for GPT model
|
2 years ago |
Tri Dao
|
226a1b721d
Implement TensorParallel for FusedDense and FusedDenseGeluDense
|
2 years ago |