This directory contains tuned configurations for the fused_moe kernel.
For different settings of
- E (number of experts)
- N (intermediate size)
- device_name (torch.cuda.get_device_name())
the JSON file contains a mapping from M (batch size) to the chosen configuration.
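
As an illustration, a config file for E = 8 and N = 7168 might look roughly like
the following. The parameter names shown (Triton block sizes, num_warps,
num_stages) are assumptions about what the tuning script records, and the
numbers are placeholders rather than actual tuned values:

    {
        "1":  {"BLOCK_SIZE_M": 16, "BLOCK_SIZE_N": 64, "BLOCK_SIZE_K": 64,
               "GROUP_SIZE_M": 1, "num_warps": 4, "num_stages": 4},
        "64": {"BLOCK_SIZE_M": 64, "BLOCK_SIZE_N": 128, "BLOCK_SIZE_K": 64,
               "GROUP_SIZE_M": 8, "num_warps": 8, "num_stages": 4}
    }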
For example, Mixtral has intermediate size N = 14336, so for TP2 we have
N = 7168 and for TP4 we have N = 3584.
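
Below is a minimal sketch of how such a file could be located and queried at
runtime, assuming a filename convention of E=<E>,N=<N>,device_name=<name>.json
(spaces in the device name replaced by underscores) and a nearest-M fallback.
Both assumptions and the helper names are illustrative, not a description of
the actual loader:

    import json
    import os

    import torch


    def load_moe_configs(E: int, N: int) -> dict | None:
        """Return the M -> kernel-config mapping for this (E, N, GPU), if a tuned file exists."""
        device_name = torch.cuda.get_device_name().replace(" ", "_")
        fname = f"E={E},N={N},device_name={device_name}.json"
        path = os.path.join(os.path.dirname(__file__), fname)
        if not os.path.exists(path):
            return None  # caller falls back to a default configuration
        with open(path) as f:
            # JSON keys are strings; convert them back to integer batch sizes M.
            return {int(m): cfg for m, cfg in json.load(f).items()}


    # E.g. for Mixtral (E = 8, N = 14336) under TP2, each rank holds N = 7168:
    configs = load_moe_configs(E=8, N=14336 // 2)
    if configs is not None:
        M = 64  # current batch size
        # Use the configuration tuned for the batch size closest to M.
        best = configs[min(configs, key=lambda m: abs(m - M))]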