Latent Space#

pydantic model vision_architectures.layers.latent_space.LatentEncoderConfig[source]#

Bases: CNNBlockConfig

JSON schema:
{
   "title": "LatentEncoderConfig",
   "type": "object",
   "properties": {
      "in_channels": {
         "description": "Number of input channels",
         "title": "In Channels",
         "type": "integer"
      },
      "out_channels": {
         "description": "Number of output channels",
         "title": "Out Channels",
         "type": "integer"
      },
      "kernel_size": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "items": {
                  "type": "integer"
               },
               "type": "array"
            }
         ],
         "description": "Kernel size for the convolution",
         "title": "Kernel Size"
      },
      "padding": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "items": {
                  "type": "integer"
               },
               "type": "array"
            },
            {
               "type": "string"
            }
         ],
         "default": "same",
         "description": "Padding for the convolution. Can be 'same' or an integer/tuple of integers.",
         "title": "Padding"
      },
      "stride": {
         "default": 1,
         "description": "Stride for the convolution",
         "title": "Stride",
         "type": "integer"
      },
      "conv_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the convolution layer",
         "title": "Conv Kwargs",
         "type": "object"
      },
      "transposed": {
         "default": false,
         "description": "Whether to perform ConvTranspose instead of Conv",
         "title": "Transposed",
         "type": "boolean"
      },
      "normalization": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "batchnorm3d",
         "description": "Normalization layer type.",
         "title": "Normalization"
      },
      "normalization_pre_args": {
         "default": [],
         "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.",
         "items": {},
         "title": "Normalization Pre Args",
         "type": "array"
      },
      "normalization_post_args": {
         "default": [],
         "description": "Arguments for the normalization layer after providing the dimension.",
         "items": {},
         "title": "Normalization Post Args",
         "type": "array"
      },
      "normalization_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the normalization layer",
         "title": "Normalization Kwargs",
         "type": "object"
      },
      "activation": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "relu",
         "description": "Activation function type.",
         "title": "Activation"
      },
      "activation_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the activation function.",
         "title": "Activation Kwargs",
         "type": "object"
      },
      "sequence": {
         "default": "CNA",
         "description": "Sequence of operations in the block.",
         "enum": [
            "C",
            "AC",
            "CA",
            "CD",
            "CN",
            "DC",
            "NC",
            "ACD",
            "ACN",
            "ADC",
            "ANC",
            "CAD",
            "CAN",
            "CDA",
            "CDN",
            "CNA",
            "CND",
            "DAC",
            "DCA",
            "DCN",
            "DNC",
            "NAC",
            "NCA",
            "NCD",
            "NDC",
            "ACDN",
            "ACND",
            "ADCN",
            "ADNC",
            "ANCD",
            "ANDC",
            "CADN",
            "CAND",
            "CDAN",
            "CDNA",
            "CNAD",
            "CNDA",
            "DACN",
            "DANC",
            "DCAN",
            "DCNA",
            "DNAC",
            "DNCA",
            "NACD",
            "NADC",
            "NCAD",
            "NCDA",
            "NDAC",
            "NDCA"
         ],
         "title": "Sequence",
         "type": "string"
      },
      "drop_prob": {
         "default": 0.0,
         "description": "Dropout probability.",
         "title": "Drop Prob",
         "type": "number"
      },
      "init_low_var": {
         "default": false,
         "description": "Whether to initialize weights such that output variance is low",
         "title": "Init Low Var",
         "type": "boolean"
      }
   },
   "required": [
      "in_channels",
      "out_channels",
      "kernel_size"
   ]
}

Config:
  • arbitrary_types_allowed: bool = True

  • extra: str = ignore

  • validate_default: bool = True

  • validate_assignment: bool = True

  • validate_return: bool = True

Fields:
  • init_low_var (bool)

Validators:
  • validate_before  »  all fields

field init_low_var: bool = False#

Whether to initialize weights such that output variance is low

Validated by:
  • validate_before
validator validate_before  »  all fields[source]#

Base class method for validating data before creating the model.

property dim#
property latent_dim#
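
Since the config is a plain pydantic model, it can be constructed directly from keyword arguments. A minimal sketch with hypothetical channel sizes; per the schema above, only in_channels, out_channels and kernel_size are required, and all other fields keep their defaults:

from vision_architectures.layers.latent_space import LatentEncoderConfig

# Hypothetical sizes for illustration only.
config = LatentEncoderConfig(
    in_channels=32,     # feature channels entering the encoder
    out_channels=8,     # latent channels produced by the encoder
    kernel_size=1,
    init_low_var=True,  # initialize weights so that the output variance starts low
)

print(config.sequence)  # "CNA" (default)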
pydantic model vision_architectures.layers.latent_space.LatentDecoderConfig[source]#

Bases: CNNBlockConfig

JSON schema:
{
   "title": "LatentDecoderConfig",
   "type": "object",
   "properties": {
      "in_channels": {
         "description": "Number of input channels",
         "title": "In Channels",
         "type": "integer"
      },
      "out_channels": {
         "description": "Number of output channels",
         "title": "Out Channels",
         "type": "integer"
      },
      "kernel_size": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "items": {
                  "type": "integer"
               },
               "type": "array"
            }
         ],
         "description": "Kernel size for the convolution",
         "title": "Kernel Size"
      },
      "padding": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "items": {
                  "type": "integer"
               },
               "type": "array"
            },
            {
               "type": "string"
            }
         ],
         "default": "same",
         "description": "Padding for the convolution. Can be 'same' or an integer/tuple of integers.",
         "title": "Padding"
      },
      "stride": {
         "default": 1,
         "description": "Stride for the convolution",
         "title": "Stride",
         "type": "integer"
      },
      "conv_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the convolution layer",
         "title": "Conv Kwargs",
         "type": "object"
      },
      "transposed": {
         "default": false,
         "description": "Whether to perform ConvTranspose instead of Conv",
         "title": "Transposed",
         "type": "boolean"
      },
      "normalization": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "batchnorm3d",
         "description": "Normalization layer type.",
         "title": "Normalization"
      },
      "normalization_pre_args": {
         "default": [],
         "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.",
         "items": {},
         "title": "Normalization Pre Args",
         "type": "array"
      },
      "normalization_post_args": {
         "default": [],
         "description": "Arguments for the normalization layer after providing the dimension.",
         "items": {},
         "title": "Normalization Post Args",
         "type": "array"
      },
      "normalization_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the normalization layer",
         "title": "Normalization Kwargs",
         "type": "object"
      },
      "activation": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "relu",
         "description": "Activation function type.",
         "title": "Activation"
      },
      "activation_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the activation function.",
         "title": "Activation Kwargs",
         "type": "object"
      },
      "sequence": {
         "default": "CNA",
         "description": "Sequence of operations in the block.",
         "enum": [
            "C",
            "AC",
            "CA",
            "CD",
            "CN",
            "DC",
            "NC",
            "ACD",
            "ACN",
            "ADC",
            "ANC",
            "CAD",
            "CAN",
            "CDA",
            "CDN",
            "CNA",
            "CND",
            "DAC",
            "DCA",
            "DCN",
            "DNC",
            "NAC",
            "NCA",
            "NCD",
            "NDC",
            "ACDN",
            "ACND",
            "ADCN",
            "ADNC",
            "ANCD",
            "ANDC",
            "CADN",
            "CAND",
            "CDAN",
            "CDNA",
            "CNAD",
            "CNDA",
            "DACN",
            "DANC",
            "DCAN",
            "DCNA",
            "DNAC",
            "DNCA",
            "NACD",
            "NADC",
            "NCAD",
            "NCDA",
            "NDAC",
            "NDCA"
         ],
         "title": "Sequence",
         "type": "string"
      },
      "drop_prob": {
         "default": 0.0,
         "description": "Dropout probability.",
         "title": "Drop Prob",
         "type": "number"
      }
   },
   "required": [
      "in_channels",
      "out_channels",
      "kernel_size"
   ]
}

Config:
  • arbitrary_types_allowed: bool = True

  • extra: str = ignore

  • validate_default: bool = True

  • validate_assignment: bool = True

  • validate_return: bool = True

Fields:

Validators:
  • validate_before  »  all fields
validator validate_before  »  all fields[source]#

Base class method for validating data before creating the model.

property latent_dim#
property dim#
pydantic model vision_architectures.layers.latent_space.GaussianLatentSpaceConfig[source]#

Bases: CustomBaseModel

JSON schema:
{
   "title": "GaussianLatentSpaceConfig",
   "type": "object",
   "properties": {}
}

Config:
  • arbitrary_types_allowed: bool = True

  • extra: str = ignore

  • validate_default: bool = True

  • validate_assignment: bool = True

  • validate_return: bool = True

Validators:

class vision_architectures.layers.latent_space.LatentEncoder(config={}, checkpointing_level=0, **kwargs)[source]#

Bases: Module

__init__(config={}, checkpointing_level=0, **kwargs)[source]#

Initialize internal Module state, shared by both nn.Module and ScriptModule.

init_low_var(bias_constant=-1.0)[source]#
forward(x, prior_mu=None, prior_log_var=None, return_log_var=False, max_mu=100.0, max_log_var=10.0, channels_first=True)[source]#

Get the latent space representation of the input by mapping it to the latent dimension and extracting the mean and standard deviation of the latent distribution. If a prior distribution is provided, the input is expected to predict the deviation from that prior; if none is provided, the module can be thought of as predicting the deviation from a standard normal distribution. The output is the mean and standard deviation of the latent space.

Parameters:
  • x (Tensor) – The input feature tensor.

  • prior_mu (Optional[Tensor]) – The mean of the prior distribution. If None, it is assumed to be the mean of a standard normal distribution. Defaults to None.

  • prior_log_var (Optional[Tensor]) – The log-variance of the prior distribution. If None, it is assumed to be the log-variance of a standard normal distribution. Defaults to None.

  • return_log_var (bool) – Whether to return the log-variance too. Defaults to False.

  • max_mu (float) – Clamps mu to the allowed range [-max_mu, max_mu]. Defaults to 100.0.

  • max_log_var (float) – Clamps the log-variance to the range [-max_log_var, max_log_var]. Defaults to 10.0, which corresponds to variances from 0.000045 (std=0.006737) to 22026.465 (std=148.413).

  • channels_first (bool) – Whether the inputs are in channels-first format (B, C, …) or channels-last (B, …, C). Defaults to True.

Returns:

  • z_mu – The mean of the latent space.

  • z_sigma – The standard deviation of the latent space.
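
A hedged usage sketch with hypothetical tensor sizes; the 5D input is assumed because the config defaults to batchnorm3d, and the three-element return under return_log_var=True is an assumption based on the parameter description:

import torch

from vision_architectures.layers.latent_space import LatentEncoder, LatentEncoderConfig

# Hypothetical sizes for illustration only.
encoder = LatentEncoder(LatentEncoderConfig(in_channels=32, out_channels=8, kernel_size=1))

x = torch.randn(2, 32, 8, 8, 8)  # (B, C, D, H, W), matching channels_first=True
z_mu, z_sigma = encoder(x)

# Assumption: return_log_var=True appends the log-variance to the returned tuple.
z_mu, z_sigma, z_log_var = encoder(x, return_log_var=True)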

class vision_architectures.layers.latent_space.LatentDecoder(config={}, checkpointing_level=0, **kwargs)[source]#

Bases: Module

__init__(config={}, checkpointing_level=0, **kwargs)[source]#

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(z, channels_first=True)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class vision_architectures.layers.latent_space.GaussianLatentSpace(config={}, **kwargs)[source]#

Bases: Module

__init__(config={}, **kwargs)[source]#

Initialize internal Module state, shared by both nn.Module and ScriptModule.

sample(z_mu, z_sigma, force_sampling=False)[source]#
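
The source does not document the body of sample; a minimal sketch of the standard reparameterization trick that a Gaussian latent space presumably implements (force_sampling likely forces sampling even outside training, which is an assumption) looks like this:

import torch

def reparameterized_sample(z_mu: torch.Tensor, z_sigma: torch.Tensor) -> torch.Tensor:
    # z = mu + sigma * eps with eps ~ N(0, I); the noise carries no gradient,
    # so the sample stays differentiable with respect to mu and sigma.
    eps = torch.randn_like(z_sigma)
    return z_mu + z_sigma * eps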
static kl_divergence(z_mu, z_sigma, prior_mu=None, prior_sigma=None, reduction='allsum', channels_first=True)[source]#
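
For reference, the closed-form KL divergence between the predicted Gaussian and the prior (a standard normal when prior_mu and prior_sigma are None, matching the LatentEncoder docs) is sketched below; the exact semantics of the 'allsum' reduction are an assumption:

import torch

def gaussian_kl(z_mu, z_sigma, prior_mu=None, prior_sigma=None):
    # KL( N(z_mu, z_sigma^2) || N(prior_mu, prior_sigma^2) ), computed elementwise.
    if prior_mu is None:
        prior_mu = torch.zeros_like(z_mu)        # standard-normal prior mean
    if prior_sigma is None:
        prior_sigma = torch.ones_like(z_sigma)   # standard-normal prior std
    kl = (
        torch.log(prior_sigma / z_sigma)
        + (z_sigma**2 + (z_mu - prior_mu) ** 2) / (2 * prior_sigma**2)
        - 0.5
    )
    return kl.sum()  # assumed meaning of reduction="allsum": sum over every element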
forward(z_mu, z_sigma, prior_mu=None, prior_sigma=None, kl_divergence_reduction='allsum', force_sampling=False, channels_first=True)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
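
Putting the three modules together in a VAE-style round trip; channel sizes are hypothetical, and the (z, kl) return of GaussianLatentSpace.forward is an assumption based on its parameters:

import torch

from vision_architectures.layers.latent_space import (
    GaussianLatentSpace,
    LatentDecoder,
    LatentDecoderConfig,
    LatentEncoder,
    LatentEncoderConfig,
)

# Hypothetical channel sizes for illustration only.
encoder = LatentEncoder(LatentEncoderConfig(in_channels=32, out_channels=8, kernel_size=1))
latent_space = GaussianLatentSpace()
decoder = LatentDecoder(LatentDecoderConfig(in_channels=8, out_channels=32, kernel_size=1))

x = torch.randn(2, 32, 8, 8, 8)      # (B, C, D, H, W) channels-first 3D features
z_mu, z_sigma = encoder(x)           # predict the latent Gaussian
z, kl = latent_space(z_mu, z_sigma)  # assumed return: (sampled latent, KL divergence)
x_recon = decoder(z)                 # map the sample back to feature space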