---
outline: deep
---

# Installation with Neuron

Aphrodite supports inference on AWS Trainium and Inferentia chips. Paged Attention is not yet supported in the Neuron SDK, but naive continuous batching is supported in transformers-neuronx. The Neuron SDK currently supports the FP16 and BF16 data types.

## Requirements

- Linux
- Python 3.8 - 3.11
- Accelerator: NeuronCore_v2 (in trn1/inf2 instances)
- PyTorch 2.0.1/2.1.1
- AWS Neuron SDK 2.16/2.17

## Building from Source

The following instructions are for Neuron SDK 2.16 and above.

### Launch Trn1/Inf2 instances

Follow these steps to launch a trn1/inf2 instance on which to install [PyTorch Neuron (“torch-neuronx”) Setup on Ubuntu 22.04 LTS](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/neuron-setup/pytorch/neuronx/ubuntu/torch-neuronx-ubuntu22.html).

- Follow the instructions at [launch an Amazon EC2 Instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html#ec2-launch-instance).
- Refer to these pages for more info about instance sizes and pricing: [Trainium1](https://aws.amazon.com/ec2/instance-types/trn1/), [Inferentia2](https://aws.amazon.com/ec2/instance-types/inf2/).
- Select the Ubuntu Server 22.04 LTS AMI.
- When launching, set your primary EBS volume size to at least 512 GB.
- After launching, follow the instructions in [Connect to your instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstancesLinux.html).

### Install drivers and tools

If the [Deep Learning AMI Neuron](https://docs.aws.amazon.com/dlami/latest/devguide/appendix-ami-release-notes.html) is installed, this step is unnecessary. Otherwise, run the following:

```sh
# Configure Linux for Neuron repository updates
. /etc/os-release
sudo tee /etc/apt/sources.list.d/neuron.list > /dev/null <