
fix: guard for lora + chunked prefill

AlpinDale, 7 months ago
parent
commit
ee174ea4fd
1 changed file with 2 additions and 0 deletions

aphrodite/common/config.py  +2 -0

@@ -1171,6 +1171,8 @@ class LoRAConfig:
                 "Due to limitations of the custom LoRA CUDA kernel, "
                 "max_num_batched_tokens must be <= 65528 when "
                 "LoRA is enabled.")
+        if scheduler_config.chunked_prefill_enabled:
+            raise ValueError("LoRA is not supported with chunked prefill yet.")
 
 
 @dataclass
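
For context, a minimal sketch of where a guard like this typically lives: a validation method on LoRAConfig that receives the scheduler configuration and rejects unsupported combinations at startup. The method name verify_with_scheduler_config and the simplified SchedulerConfig/LoRAConfig fields below are assumptions inferred from the diff context, not taken verbatim from the repository.

from dataclasses import dataclass


@dataclass
class SchedulerConfig:
    # Only the fields relevant to this check; the real class has more.
    max_num_batched_tokens: int = 2048
    chunked_prefill_enabled: bool = False


@dataclass
class LoRAConfig:
    max_lora_rank: int = 16

    def verify_with_scheduler_config(self, scheduler_config: SchedulerConfig) -> None:
        # Pre-existing limit from the custom LoRA CUDA kernel (visible in the diff context).
        if scheduler_config.max_num_batched_tokens > 65528:
            raise ValueError(
                "Due to limitations of the custom LoRA CUDA kernel, "
                "max_num_batched_tokens must be <= 65528 when "
                "LoRA is enabled.")
        # Guard added by this commit: fail fast at config time instead of
        # hitting an unsupported LoRA + chunked-prefill path at runtime.
        if scheduler_config.chunked_prefill_enabled:
            raise ValueError(
                "LoRA is not supported with chunked prefill yet.")

With a check like this, an engine configured with both LoRA and chunked prefill is rejected during configuration validation rather than failing later inside the scheduler or kernels.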