vllm.v1.attention.backends.registry ¶
Attention backend registry
AttentionBackendEnum ¶
Bases: Enum
Enumeration of all supported attention backends.
The enum value is the default class path, but this can be overridden at runtime using register_backend().
To get the actual backend class (respecting overrides), use: backend.get_class()
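The override behaviour described above can be illustrated with a small, self-contained sketch. It is not vLLM's implementation: `ToyBackend`, `_OVERRIDES`, and `set_override` are hypothetical names, and standard-library classes stand in for real backend classes. It only shows the general pattern of an enum whose value is a default class path that a runtime override table can replace.

```python
import importlib
from enum import Enum

# Hypothetical override table: enum member -> overriding class path.
_OVERRIDES: dict[Enum, str] = {}


class ToyBackend(Enum):
    """Toy stand-in for a backend enum; each value is a default class path."""

    ORDERED = "collections.OrderedDict"  # stdlib class used as a fake backend
    CUSTOM = ""  # no default path: must be registered before use

    def get_path(self) -> str:
        # An override, if present, wins over the enum's default value.
        path = _OVERRIDES.get(self, self.value)
        if not path:
            raise ValueError(f"{self.name} is used without being registered")
        return path

    def get_class(self) -> type:
        # Split "pkg.module.ClassName" and import the module lazily.
        module_name, _, class_name = self.get_path().rpartition(".")
        module = importlib.import_module(module_name)  # may raise ImportError
        return getattr(module, class_name)

    def clear_override(self) -> None:
        # Fall back to the default path stored in the enum value.
        _OVERRIDES.pop(self, None)


def set_override(backend: ToyBackend, class_path: str) -> None:
    """Hypothetical helper playing the role of a registration call."""
    _OVERRIDES[backend] = class_path


# Usage: override, resolve, then restore the default.
set_override(ToyBackend.ORDERED, "collections.Counter")
overridden = ToyBackend.ORDERED.get_class()
ToyBackend.ORDERED.clear_override()
default = ToyBackend.ORDERED.get_class()
```

Storing class paths as strings and importing on demand means a backend's module is only loaded when that backend is actually requested.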
Source code in vllm/v1/attention/backends/registry.py
clear_override ¶
get_class ¶
get_class() -> type[AttentionBackend]
Get the backend class (respects overrides).
Returns:
| Type | Description |
|---|---|
| type[AttentionBackend] | The backend class |
Raises:
| Type | Description |
|---|---|
| ImportError | If the backend class cannot be imported |
| ValueError | If AttentionBackendEnum.CUSTOM is used without being registered |
get_path ¶
Get the class path for this backend (respects overrides).
Returns:
| Type | Description |
|---|---|
| str | The fully qualified class path string |
Raises:
| Type | Description |
|---|---|
| ValueError | If AttentionBackendEnum.CUSTOM is used without being registered |
MambaAttentionBackendEnum ¶
Bases: Enum
Enumeration of all supported mamba attention backends.
The enum value is the default class path, but this can be overridden at runtime using register_backend().
To get the actual backend class (respecting overrides), use: backend.get_class()
clear_override ¶
get_class ¶
get_class() -> type[AttentionBackend]
Get the backend class (respects overrides).
Returns:
| Type | Description |
|---|---|
| type[AttentionBackend] | The backend class |
Raises:
| Type | Description |
|---|---|
| ImportError | If the backend class cannot be imported |
| ValueError | If Backend.CUSTOM is used without being registered |
get_path ¶
Get the class path for this backend (respects overrides).
Returns:
| Type | Description |
|---|---|
| str | The fully qualified class path string |
Raises:
| Type | Description |
|---|---|
| ValueError | If Backend.CUSTOM is used without being registered |
_AttentionBackendEnumMeta ¶
Bases: EnumMeta
Metaclass for AttentionBackendEnum to provide better error messages.
__getitem__ ¶
__getitem__(name: str)
Get backend by name with helpful error messages.
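A metaclass can customise name lookup on an enum, which is the general technique this class relies on. The following is a minimal sketch and not the vLLM source: `HelpfulEnumMeta` and `NamedBackend` are hypothetical names, and both the choice of `ValueError` and the message wording are assumptions.

```python
from enum import Enum, EnumMeta


class HelpfulEnumMeta(EnumMeta):
    """Hypothetical metaclass: replaces the bare KeyError on a bad name."""

    def __getitem__(cls, name: str):
        try:
            return super().__getitem__(name)
        except KeyError:
            # List the valid member names so the caller can correct a typo.
            valid = ", ".join(cls.__members__)
            raise ValueError(
                f"Unknown backend '{name}'. Valid options are: {valid}"
            ) from None


class NamedBackend(Enum, metaclass=HelpfulEnumMeta):
    # Placeholder values; a real registry would store class paths here.
    FLASH_ATTN = "flash"
    CUSTOM = "custom"


# Usage: a valid name resolves as usual, an invalid one explains itself.
member = NamedBackend["FLASH_ATTN"]
try:
    NamedBackend["FLASH_ATTENTION"]
except ValueError as exc:
    message = str(exc)
```

Only `Enum["NAME"]` lookups go through `__getitem__`; attribute access such as `NamedBackend.FLASH_ATTN` is unaffected.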
register_backend ¶
register_backend(
backend: AttentionBackendEnum
| MambaAttentionBackendEnum,
class_path: str | None = None,
is_mamba: bool = False,
) -> Callable[[type], type]
Register or override a backend implementation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
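| backend | AttentionBackendEnum \| MambaAttentionBackendEnum | The AttentionBackendEnum or MambaAttentionBackendEnum member to register | required |
| class_path | str \| None | Optional class path. If omitted and register_backend is used as a decorator, the path is generated from the decorated class. | None |
| is_mamba | bool | Whether backend is a mamba attention backend; the mamba example passes is_mamba=True. | False |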
Returns:
| Type | Description |
|---|---|
| Callable[[type], type] | Decorator function if class_path is None, otherwise a no-op |
Examples:
Override an existing attention backend¶
@register_backend(AttentionBackendEnum.FLASH_ATTN)
class MyCustomFlashAttn: ...
Override an existing mamba attention backend¶
@register_backend(MambaAttentionBackendEnum.LINEAR, is_mamba=True)
class MyCustomMambaAttn: ...
Register a custom third-party attention backend¶
@register_backend(AttentionBackendEnum.CUSTOM)
class MyCustomBackend: ...
Direct registration¶
register_backend(AttentionBackendEnum.CUSTOM, "my.module.MyCustomBackend")