vllm.inputs.data ¶
DecoderInputs module-attribute ¶
DecoderInputs: TypeAlias = TokenInputs | MultiModalInputs
A processed decoder prompt from InputPreprocessor which can be passed to InputProcessor for encoder-decoder models.
DecoderOnlyInputs module-attribute ¶
DecoderOnlyInputs: TypeAlias = (
TokenInputs | EmbedsInputs | MultiModalInputs
)
A processed prompt from InputPreprocessor which can be passed to InputProcessor for decoder-only models.
DecoderOnlyPrompt module-attribute ¶
DecoderOnlyPrompt: TypeAlias = (
str
| TextPrompt
| list[int]
| TokensPrompt
| EmbedsPrompt
)
Schema of a prompt for a decoder-only model:

- A text prompt (string or TextPrompt)
- A tokenized prompt (list of token IDs, or TokensPrompt)
- An embeddings prompt (EmbedsPrompt)
For encoder-decoder models, passing a singleton prompt is shorthand for passing ExplicitEncoderDecoderPrompt(encoder_prompt=prompt, decoder_prompt=None).
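As an illustrative sketch of these forms (the model name is an assumption, and prompt-embedding support may require extra engine configuration):

```python
from vllm import LLM
from vllm.inputs.data import TextPrompt, TokensPrompt

llm = LLM(model="facebook/opt-125m")  # assumed example model

# A text prompt, either as a plain string or a TextPrompt dict.
llm.generate("Hello, my name is")
llm.generate(TextPrompt(prompt="Hello, my name is"))

# A tokenized prompt as a TokensPrompt dict.
llm.generate(TokensPrompt(prompt_token_ids=[2, 31414, 6, 127, 766, 16]))

# An embeddings prompt (commented out: needs a [num_tokens, hidden_size]
# tensor and an engine configured to accept prompt embeddings).
# import torch
# from vllm.inputs.data import EmbedsPrompt
# llm.generate(EmbedsPrompt(prompt_embeds=torch.randn(6, 768)))
```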
DecoderPrompt module-attribute ¶
DecoderPrompt: TypeAlias = (
str | TextPrompt | list[int] | TokensPrompt
)
Schema of a prompt for the decoder part of an encoder-decoder model:

- A text prompt (string or TextPrompt)
- A tokenized prompt (list of token IDs, or TokensPrompt)
Note
Multi-modal inputs are not supported for decoder prompts.
EncoderDecoderPrompt module-attribute ¶
EncoderDecoderPrompt: TypeAlias = (
EncoderPrompt | ExplicitEncoderDecoderPrompt
)
Schema for a prompt for an encoder-decoder model.
You can pass a singleton encoder prompt, in which case the decoder prompt is considered to be None (i.e., inferred automatically).
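For example (a minimal sketch; any singleton prompt form works as the encoder prompt), the two spellings below are equivalent:

```python
from vllm.inputs.data import ExplicitEncoderDecoderPrompt

# Singleton shorthand: the decoder prompt is left as None and inferred.
prompt = "What is the capital of France?"

# Explicit equivalent:
prompt = ExplicitEncoderDecoderPrompt(
    encoder_prompt="What is the capital of France?",
    decoder_prompt=None,
)
```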
EncoderInputs module-attribute ¶
EncoderInputs: TypeAlias = (
TokenInputs | MultiModalEncDecInputs
)
A processed encoder prompt from InputPreprocessor which can be passed to InputProcessor for encoder-decoder models.
EncoderPrompt module-attribute ¶
EncoderPrompt: TypeAlias = (
str | TextPrompt | list[int] | TokensPrompt
)
Schema of a prompt for the encoder part of an encoder-decoder model:

- A text prompt (string or TextPrompt)
- A tokenized prompt (list of token IDs, or TokensPrompt)
ProcessorInputs module-attribute ¶
ProcessorInputs: TypeAlias = (
DecoderOnlyInputs | EncoderDecoderInputs
)
A processed prompt from InputPreprocessor which can be passed to InputProcessor.
PromptType module-attribute ¶
PromptType: TypeAlias = (
DecoderOnlyPrompt | EncoderDecoderPrompt
)
Schema for any prompt, regardless of model type.
This is the input format accepted by most LLM APIs.
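Because PromptType is the union of all accepted schemas, a batch passed to an API such as LLM.generate may mix them freely; a brief sketch (model name assumed):

```python
from vllm import LLM
from vllm.inputs.data import TokensPrompt

llm = LLM(model="facebook/opt-125m")  # assumed example model

# A single batch may mix any of the PromptType variants.
outputs = llm.generate([
    "A plain text prompt",
    {"prompt": "A TextPrompt dict"},
    TokensPrompt(prompt_token_ids=[2, 100, 200]),
])
```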
SingletonInputs module-attribute ¶
SingletonInputs: TypeAlias = (
DecoderOnlyInputs | MultiModalEncDecInputs
)
The inputs for a single encoder/decoder prompt.
SingletonPrompt module-attribute ¶
SingletonPrompt: TypeAlias = (
DecoderOnlyPrompt | EncoderPrompt | DecoderPrompt
)
Schema for a single prompt, as opposed to a data structure that encapsulates multiple prompts, such as ExplicitEncoderDecoderPrompt.
DataPrompt ¶
Bases: _PromptOptions
Represents generic inputs that are converted to PromptType by IO processor plugins.
EmbedsInputs ¶
A processed prompt in token-embedding form; constructed via embeds_inputs.
EmbedsPrompt ¶
Bases: _PromptOptions
Schema for a prompt provided via token embeddings.
prompt instance-attribute ¶
prompt: NotRequired[str]
The prompt text corresponding to the token embeddings, if available.
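A hedged sketch of filling this schema from a Hugging Face embedding layer (the model name is an assumption, and the engine must be configured to accept prompt embeddings):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from vllm.inputs.data import EmbedsPrompt

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
hf_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

token_ids = tokenizer("Hello, world", return_tensors="pt").input_ids
with torch.no_grad():
    # [1, num_tokens, hidden_size] -> [num_tokens, hidden_size]
    embeds = hf_model.get_input_embeddings()(token_ids).squeeze(0)

prompt = EmbedsPrompt(prompt_embeds=embeds, prompt="Hello, world")
```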
EncoderDecoderInputs ¶
Bases: TypedDict
A processed pair of encoder and decoder singleton prompts from InputPreprocessor which can be passed to InputProcessor for encoder-decoder models.
ExplicitEncoderDecoderPrompt ¶
Bases: TypedDict
Schema for a pair of encoder and decoder singleton prompts.
Note
This schema is not valid for decoder-only models.
decoder_prompt instance-attribute ¶
decoder_prompt: DecoderPrompt | None
The prompt for the decoder part of the model.
Passing None will cause the prompt to be inferred automatically.
encoder_prompt instance-attribute ¶
encoder_prompt: EncoderPrompt
The prompt for the encoder part of the model.
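A sketch putting both fields together (the model name and decoder token IDs are placeholders):

```python
from vllm import LLM
from vllm.inputs.data import ExplicitEncoderDecoderPrompt, TokensPrompt

llm = LLM(model="facebook/bart-base")  # assumed encoder-decoder model

prompt = ExplicitEncoderDecoderPrompt(
    encoder_prompt="An article to summarize ...",
    # Pin the decoder to specific start tokens; passing None instead
    # would let the decoder prompt be inferred automatically.
    decoder_prompt=TokensPrompt(prompt_token_ids=[2, 0]),
)
outputs = llm.generate(prompt)
```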
StreamingInput dataclass ¶
Input data for a streaming generation request.
This is used with generate() to support multi-turn streaming sessions where inputs are provided via an async generator.
TextPrompt ¶
Schema for a text prompt.
TokenInputs ¶
A processed prompt in token-ID form; constructed via token_inputs.
TokensPrompt ¶
Bases: _PromptOptions
Schema for a tokenized prompt.
prompt instance-attribute ¶
prompt: NotRequired[str]
The prompt text corresponding to the token IDs, if available.
prompt_token_ids instance-attribute ¶
prompt_token_ids: list[int]
A list of token IDs to pass to the model.
token_type_ids instance-attribute ¶
token_type_ids: NotRequired[list[int]]
A list of token type IDs to pass to the cross encoder model.
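For instance, text tokenized outside vLLM can be passed through as-is (a sketch; the model name is an assumption):

```python
from transformers import AutoTokenizer
from vllm.inputs.data import TokensPrompt

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
text = "Hello, world"

prompt = TokensPrompt(
    prompt_token_ids=tokenizer.encode(text),
    prompt=text,  # optional: keep the original text alongside the IDs
)
```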
_InputOptions ¶
Bases: TypedDict
Additional options available to all input types.
cache_salt instance-attribute ¶
cache_salt: NotRequired[str]
Optional cache salt to be used for prefix caching.
_PromptOptions ¶
Bases: TypedDict
Additional options available to all SingletonPrompt types.
cache_salt instance-attribute ¶
cache_salt: NotRequired[str]
Optional cache salt to be used for prefix caching.
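For example, two otherwise-identical prompts with different salts will not share prefix-cache entries (a sketch; the salt values are arbitrary):

```python
# Identical text, different salts: cached prefix blocks are not shared,
# which can be used to isolate tenants or users from one another.
shared_prefix = "System: you are a helpful assistant.\nUser: "
prompt_a = {"prompt": shared_prefix + "Hi!", "cache_salt": "tenant-a"}
prompt_b = {"prompt": shared_prefix + "Hi!", "cache_salt": "tenant-b"}
```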
mm_processor_kwargs instance-attribute ¶
mm_processor_kwargs: NotRequired[dict[str, Any] | None]
Optional multi-modal processor kwargs to be forwarded to the multimodal input mapper & processor. Note that if multiple modalities have registered mappers, etc. for the model being considered, we attempt to pass the mm_processor_kwargs to each of them.
multi_modal_data instance-attribute ¶
multi_modal_data: NotRequired[MultiModalDataDict | None]
Optional multi-modal data to pass to the model, if the model supports it.
multi_modal_uuids instance-attribute ¶
multi_modal_uuids: NotRequired[MultiModalUUIDDict]
Optional user-specified UUIDs for multimodal items, mapped by modality. Lists must match the number of items per modality and may contain None. For None entries, the hasher will compute IDs automatically; non-None entries override the default hashes for caching, and MUST be unique per multimodal item.
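A combined sketch of these three options on a text prompt (the model, chat template, and processor kwargs are assumptions that vary per model):

```python
from PIL import Image
from vllm import LLM

llm = LLM(model="llava-hf/llava-1.5-7b-hf")  # assumed multimodal model

outputs = llm.generate({
    "prompt": "USER: <image>\nDescribe this image. ASSISTANT:",
    "multi_modal_data": {"image": Image.open("example.jpg")},
    # Forwarded to the model's multi-modal processor; valid keys are
    # model-specific ("num_crops" here is purely illustrative).
    "mm_processor_kwargs": {"num_crops": 4},
    # One entry per image; None would let vLLM hash the item itself.
    "multi_modal_uuids": {"image": ["product-123-photo"]},
})
```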
embeds_inputs ¶
embeds_inputs(
prompt_embeds: Tensor, cache_salt: str | None = None
) -> EmbedsInputs
Construct EmbedsInputs from optional values.
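A brief sketch (the tensor values are placeholders):

```python
import torch
from vllm.inputs.data import embeds_inputs

inputs = embeds_inputs(
    prompt_embeds=torch.randn(6, 768),  # [num_tokens, hidden_size]
    cache_salt="session-42",
)
```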
token_inputs ¶
token_inputs(
prompt_token_ids: list[int],
cache_salt: str | None = None,
) -> TokenInputs
Construct TokenInputs from optional values.
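And the analogous sketch for token IDs:

```python
from vllm.inputs.data import token_inputs

inputs = token_inputs(
    prompt_token_ids=[2, 31414, 6, 127, 766, 16],
    cache_salt="session-42",
)
```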