vllm.model_executor.models.granitemoeshared

Inference-only GraniteMoeShared model.

The architecture is the same as GraniteMoe, with the addition of shared experts that run alongside the routed experts.
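To illustrate what "shared experts" means in a mixture-of-experts block, here is a minimal pure-Python sketch. It is not the vLLM implementation: the function name `moe_with_shared_expert`, the toy scaling experts, and the choice to apply softmax over the selected top-k logits are all assumptions made for illustration. The idea it shows is that routed experts are chosen per token by a router, while the shared expert processes every token and its output is added to the routed result.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_with_shared_expert(x, router_logits, routed_experts, shared_expert, top_k):
    """Hypothetical helper: combine top-k routed experts with a shared expert.

    x              -- one token's hidden state, as a list of floats
    router_logits  -- one score per routed expert for this token
    routed_experts -- callables mapping a hidden state to a hidden state
    shared_expert  -- callable applied to every token, regardless of routing
    top_k          -- number of routed experts to activate
    """
    # Select the top_k experts with the highest router scores.
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    selected = order[:top_k]

    # Assumption: normalize the selected scores into mixing weights.
    weights = softmax([router_logits[i] for i in selected])

    # Weighted sum of the selected routed experts' outputs.
    routed_out = [0.0] * len(x)
    for w, i in zip(weights, selected):
        y = routed_experts[i](x)
        routed_out = [acc + w * v for acc, v in zip(routed_out, y)]

    # The shared expert always runs; its output is added to the routed output.
    shared_out = shared_expert(x)
    return [r + s for r, s in zip(routed_out, shared_out)]


def make_scale_expert(factor):
    """Toy expert that multiplies the hidden state by a constant."""
    return lambda h: [factor * v for v in h]


routed = [make_scale_expert(1.0), make_scale_expert(2.0), make_scale_expert(3.0)]
shared = make_scale_expert(10.0)

result = moe_with_shared_expert(
    x=[1.0, 2.0],
    router_logits=[0.0, 5.0, 5.0],
    routed_experts=routed,
    shared_expert=shared,
    top_k=2,
)
print(result)
```

In this toy setup, experts 1 and 2 tie on router score and each receives weight 0.5, so the routed part scales the input by 2.5 and the shared expert adds a further 10x contribution. Without the shared expert the block reduces to an ordinary top-k mixture of experts, which is the relationship to GraniteMoe described above.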