The files here are taken from the NVIDIA [Megatron-LM](https://github.com/NVIDIA/Megatron-LM/tree/main/megatron/core) repository and include only the inference-related code.