Aphrodite supports basic model inference on Intel Datacenter GPUs (XPU). The quickest way to try it is with the provided Dockerfile; build and run the image as follows:
docker build -f Dockerfile.xpu -t aphrodite-xpu --shm-size=4g .
docker run -it \
--rm \
--network=host \
--ipc=host \
--device /dev/dri \
-v /dev/dri/by-path:/dev/dri/by-path \
aphrodite-xpu
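Inside the running container you can then serve a model. The command below is a minimal sketch, not a verified invocation: it assumes the image includes Aphrodite's OpenAI-compatible server module (aphrodite.endpoints.openai.api_server) with --model and --dtype flags, and the model name is only a placeholder; adjust both to your setup.
# inside the container; module path, flags, and model name are assumptions/placeholders
python -m aphrodite.endpoints.openai.api_server \
    --model facebook/opt-125m \
    --dtype float16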
To build from source instead, first install the required GPU driver and Intel oneAPI 2024.1 or later. Second, install the Python packages for the Aphrodite XPU backend:
source /opt/intel/oneapi/setvars.sh
pip install -U pip
pip install -v -r requirements-xpu.txt
Finally, build and install the XPU backend in development (editable) mode:
APHRODITE_TARGET_DEVICE=xpu python setup.py develop
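Once the build finishes, a quick smoke test is to confirm the package is importable and resolves to your checkout:
# verify that the editable install is importable
python -c "import aphrodite; print(aphrodite.__file__)"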
Currently, only the FP16 data type is supported on the XPU backend.
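For offline inference, the dtype can be set explicitly. The snippet below is a rough sketch that assumes Aphrodite exposes a vLLM-style LLM/SamplingParams Python API; the model name is only a placeholder.
python - <<'EOF'
# offline-inference sketch; the LLM/SamplingParams interface and model name are assumptions
from aphrodite import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m", dtype="float16")  # FP16 is the only dtype supported on XPU
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
EOF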