fix: guard for lora + chunked prefill

AlpinDale, 7 months ago
Commit ee174ea4fd
1 file changed, 2 insertions, 0 deletions

+ 2 - 0
aphrodite/common/config.py

@@ -1171,6 +1171,8 @@ class LoRAConfig:
                 "Due to limitations of the custom LoRA CUDA kernel, "
                 "max_num_batched_tokens must be <= 65528 when "
                 "LoRA is enabled.")
+        if scheduler_config.chunked_prefill_enabled:
+            raise ValueError("LoRA is not supported with chunked prefill yet.")
 
 
 @dataclass
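
Note: the hunk shows only the tail of the check, so the following is a minimal, self-contained sketch of how the new guard appears to behave, not the project's actual code. `SchedulerConfigSketch`, `LoRAConfigSketch`, `check_scheduler`, and `rejection_reason` are hypothetical names introduced for illustration. The `> 65528` condition is an assumption inferred from the error message in the context lines. Only the `chunked_prefill_enabled` check and both message strings come from the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchedulerConfigSketch:
    """Hypothetical stand-in: only the two fields the checks read."""
    max_num_batched_tokens: int
    chunked_prefill_enabled: bool = False


@dataclass
class LoRAConfigSketch:
    """Hypothetical stand-in for LoRAConfig; all real fields are omitted."""

    def check_scheduler(self, scheduler_config: SchedulerConfigSketch) -> None:
        # Assumed condition, inferred from the message shown in the diff context.
        if scheduler_config.max_num_batched_tokens > 65528:
            raise ValueError(
                "Due to limitations of the custom LoRA CUDA kernel, "
                "max_num_batched_tokens must be <= 65528 when "
                "LoRA is enabled.")
        # The guard added by this commit: reject the combination up front.
        if scheduler_config.chunked_prefill_enabled:
            raise ValueError("LoRA is not supported with chunked prefill yet.")


def rejection_reason(scheduler_config: SchedulerConfigSketch) -> Optional[str]:
    """Return the error message if the config is rejected, else None."""
    try:
        LoRAConfigSketch().check_scheduler(scheduler_config)
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(rejection_reason(SchedulerConfigSketch(4096)))
    print(rejection_reason(SchedulerConfigSketch(4096, chunked_prefill_enabled=True)))
```

Under these assumptions, raising at config validation time means an unsupported LoRA plus chunked prefill setup is rejected with a clear message before any engine work begins.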