Codebook#
- pydantic model vision_architectures.layers.codebook.CodebookConfig[source]#
Bases: CustomBaseModel
- Config:
arbitrary_types_allowed: bool = True
extra: str = ignore
validate_default: bool = True
validate_assignment: bool = True
validate_return: bool = True
- Fields:
- field num_vectors: int [Required]#
  Number of vectors in the codebook
- field dim: int [Required]#
  Dimension of each vector in the codebook
- field revive_dead_vectors_after_n_steps: int = 100#
  Number of steps after which a vector is declared dead and is revived (0 means never revive)
- field ema_decay: float | None = 0.99#
  EMA decay rate for updating codebook vectors
- property use_ema: bool#
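How `use_ema` relates to `ema_decay` can be sketched with a plain dataclass. `CodebookConfigSketch` is a hypothetical stand-in (the real class is a pydantic model with validation), and the rule that EMA is enabled exactly when `ema_decay` is not `None` is our assumption, not confirmed by the source:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodebookConfigSketch:
    """Illustrative stand-in for CodebookConfig with the documented defaults."""
    num_vectors: int
    dim: int
    revive_dead_vectors_after_n_steps: int = 100
    ema_decay: Optional[float] = 0.99

    @property
    def use_ema(self) -> bool:
        # Assumption: EMA updates are enabled whenever a decay rate is set.
        return self.ema_decay is not None

config = CodebookConfigSketch(num_vectors=512, dim=64)
print(config.use_ema)  # True with the default ema_decay of 0.99
print(CodebookConfigSketch(num_vectors=512, dim=64, ema_decay=None).use_ema)  # False
```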
- class vision_architectures.layers.codebook.Codebook(config={}, **kwargs)[source]#
Bases: Module, PyTorchModelHubMixin

Codebook that can be used for vector quantization. This implementation maintains the vectors in distributed settings. It also supports exponential moving average (EMA) updates of the codebook vectors as well as reviving dead vectors.
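The EMA update mentioned above usually follows the standard EMA-VQ recipe: each codebook vector drifts toward the mean of the inputs assigned to it. A minimal pure-Python sketch (the function name is ours, and the library's exact bookkeeping, e.g. smoothing of cluster counts or distributed reduction, may differ):

```python
def ema_update(code, assigned_vectors, decay=0.99):
    """One EMA step for a single codebook vector: move it toward the
    mean of the input vectors that selected it this step."""
    if not assigned_vectors:
        return code  # no inputs chose this vector; leave it unchanged
    dim = len(code)
    mean = [sum(v[d] for v in assigned_vectors) / len(assigned_vectors) for d in range(dim)]
    return [decay * c + (1.0 - decay) * m for c, m in zip(code, mean)]

# With decay=0.9, the vector moves 10% of the way toward the assigned mean.
print(ema_update([0.0, 0.0], [[1.0, 1.0], [1.0, 3.0]], decay=0.9))  # ≈ [0.1, 0.2]
```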
- __init__(config={}, **kwargs)[source]#
Initialize the Codebook.
- Parameters:
  - config (CodebookConfig) – Configuration for the codebook.
  - **kwargs – Additional keyword arguments for configuration.
- calculate_perplexity(indices)#
Calculate perplexity of the codebook usage.
- Parameters:
  indices (Tensor) – Indices of the codebook vectors chosen for each input vector.
- Return type:
  Tensor
- Returns:
  Perplexity of the codebook usage.
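Usage perplexity is conventionally the exponential of the entropy of the empirical distribution over chosen codebook indices: it equals `num_vectors` when usage is uniform and 1 when the codebook has collapsed to a single vector. A library-independent sketch (the helper name and signature are illustrative, not the library API):

```python
import math
from collections import Counter

def perplexity(indices, num_vectors):
    """Perplexity of codebook usage: exp of the entropy of the
    empirical distribution over chosen codebook indices."""
    counts = Counter(indices)
    total = len(indices)
    entropy = 0.0
    for i in range(num_vectors):
        p = counts.get(i, 0) / total
        if p > 0:
            entropy -= p * math.log(p)
    return math.exp(entropy)

print(perplexity([0, 1, 2, 3], num_vectors=4))  # ≈ 4.0 (maximum: uniform usage)
print(perplexity([2, 2, 2, 2], num_vectors=4))  # 1.0 (minimum: collapsed usage)
```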
- calculate_losses(x, z)[source]#
Calculate codebook and commitment losses.
- Parameters:
  - x (Tensor) – Input vectors. Should be of shape (BS, C) where BS is a combination of batch and spatial/temporal dimensions.
  - z (Tensor) – Quantized vectors. Should be of shape (BS, C).
- Returns:
  - codebook_loss – Codebook loss.
  - commitment_loss – Commitment loss.
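In the usual VQ-VAE convention (an assumption about this implementation, not confirmed by the source), both losses are the same squared distance between `x` and `z`; they differ only in gradient flow, which a pure-Python sketch cannot express, so it is noted in comments:

```python
def mse(a, b):
    """Mean squared error between two equally-shaped lists of vectors."""
    n = sum(len(row) for row in a)
    return sum((ai - bi) ** 2 for ra, rb in zip(a, b) for ai, bi in zip(ra, rb)) / n

# Standard VQ-VAE losses, where sg[.] is stop-gradient:
#   codebook loss   = ||sg[x] - z||^2   (updates the codebook vectors)
#   commitment loss = ||x - sg[z]||^2   (pulls the encoder toward the codebook)
# Numerically both reduce to the same MSE; only the gradients differ.
x = [[0.0, 1.0], [2.0, 3.0]]   # input vectors, shape (BS, C)
z = [[0.0, 0.0], [2.0, 2.0]]   # their quantized versions, shape (BS, C)
print(mse(x, z))  # 0.5
```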
- quantize(x)[source]#
Quantize the input vectors using the codebook and return along with losses and perplexity.
- Parameters:
  x (Tensor) – Input vectors to be quantized. Should be of shape (BS, C) where BS is a combination of batch and spatial/temporal dimensions.
- Returns:
  - z – Quantized vectors of shape (BS, C).
  - codebook_loss – Codebook loss.
  - commitment_loss – Commitment loss.
  - perplexity – Perplexity of the codebook usage.
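The core lookup behind quantization is a nearest-neighbour search over the codebook. A minimal pure-Python sketch (the real implementation operates on tensors, is distributed-aware, and also returns losses and perplexity):

```python
def quantize(x, codebook):
    """Map each input vector to its nearest codebook vector (squared L2
    distance). Returns the chosen indices and the quantized vectors."""
    indices, z = [], []
    for vec in x:
        dists = [sum((vi - ci) ** 2 for vi, ci in zip(vec, code)) for code in codebook]
        best = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(best)
        z.append(list(codebook[best]))
    return indices, z

codebook = [[0.0, 0.0], [1.0, 1.0]]
indices, z = quantize([[0.1, -0.2], [0.9, 1.2]], codebook)
print(indices)  # [0, 1]
print(z)        # [[0.0, 0.0], [1.0, 1.0]]
```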
- revive_dead_vectors()[source]#
Revive dead vectors in the codebook by replacing them with noised commonly used vectors.
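The revival step described above can be sketched as replacing unused vectors with noised copies of a heavily used one. This is illustrative only: the helper name and signature are ours, and the real method also tracks steps-since-use (per `revive_dead_vectors_after_n_steps`) and synchronises across processes:

```python
import random

def revive_dead_vectors(codebook, usage_counts, noise_scale=0.01):
    """Replace unused ('dead') vectors with noised copies of the most
    frequently used vector, so they land in a populated region."""
    most_used = max(range(len(codebook)), key=usage_counts.__getitem__)
    for i, count in enumerate(usage_counts):
        if count == 0:  # declared dead
            codebook[i] = [c + random.gauss(0.0, noise_scale) for c in codebook[most_used]]

cb = [[1.0, 1.0], [9.0, 9.0], [1.5, 0.5]]
revive_dead_vectors(cb, usage_counts=[5, 0, 3])
print(cb[1])  # now a slightly noised copy of cb[0], the most-used vector
```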
- forward(x, channels_first=None)[source]#
Quantize the input tensor using the codebook, update the codebook vectors if using EMA, and revive dead vectors if applicable.
- Parameters:
  - x (Tensor) – Input tensor to be quantized. Should be of shape (B, …, C) if channels_first is False; (B, C, …) if channels_first is True, or if channels_first is None and ndim != 3; or (B, T, C) if channels_first is None and ndim == 3.
  - channels_first (Optional[bool]) – Whether the input tensor has channels as the first dimension after the batch dimension.
- Returns:
  - z – Quantized tensor of the same shape as the input.
  - codebook_loss – Codebook loss.
  - commitment_loss – Commitment loss.
  - perplexity – Perplexity of the codebook usage.
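The channels_first resolution rule in the forward docstring can be made concrete with a small helper. This is our reading of the documented rule (the helper is hypothetical, not part of the library): when channels_first is None, a 3-D input is treated as channels-last (B, T, C) and any other rank as channels-first (B, C, …):

```python
def infer_channels_first(shape, channels_first=None):
    """Resolve the channels_first flag as the forward docstring describes:
    an explicit True/False wins; otherwise 3-D inputs are assumed to be
    (B, T, C) (channels-last) and all other ranks (B, C, ...)."""
    if channels_first is not None:
        return channels_first
    return len(shape) != 3

print(infer_channels_first((2, 16, 64)))        # False -> treated as (B, T, C)
print(infer_channels_first((2, 64, 8, 8, 8)))   # True  -> treated as (B, C, D, H, W)
print(infer_channels_first((2, 64, 8), True))   # True  -> explicit override
```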