CNN#

pydantic model vision_architectures.blocks.cnn.CNNBlockConfig[source]#

Bases: CustomBaseModel

Show JSON schema
{
   "title": "CNNBlockConfig",
   "type": "object",
   "properties": {
      "in_channels": {
         "description": "Number of input channels",
         "title": "In Channels",
         "type": "integer"
      },
      "out_channels": {
         "description": "Number of output channels",
         "title": "Out Channels",
         "type": "integer"
      },
      "kernel_size": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "items": {
                  "type": "integer"
               },
               "type": "array"
            }
         ],
         "description": "Kernel size for the convolution",
         "title": "Kernel Size"
      },
      "padding": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "items": {
                  "type": "integer"
               },
               "type": "array"
            },
            {
               "type": "string"
            }
         ],
         "default": "same",
         "description": "Padding for the convolution. Can be 'same' or an integer/tuple of integers.",
         "title": "Padding"
      },
      "stride": {
         "default": 1,
         "description": "Stride for the convolution",
         "title": "Stride",
         "type": "integer"
      },
      "conv_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the convolution layer",
         "title": "Conv Kwargs",
         "type": "object"
      },
      "transposed": {
         "default": false,
         "description": "Whether to perform ConvTranspose instead of Conv",
         "title": "Transposed",
         "type": "boolean"
      },
      "normalization": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "batchnorm3d",
         "description": "Normalization layer type.",
         "title": "Normalization"
      },
      "normalization_pre_args": {
         "default": [],
         "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.",
         "items": {},
         "title": "Normalization Pre Args",
         "type": "array"
      },
      "normalization_post_args": {
         "default": [],
         "description": "Arguments for the normalization layer after providing the dimension.",
         "items": {},
         "title": "Normalization Post Args",
         "type": "array"
      },
      "normalization_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the normalization layer",
         "title": "Normalization Kwargs",
         "type": "object"
      },
      "activation": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "relu",
         "description": "Activation function type.",
         "title": "Activation"
      },
      "activation_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the activation function.",
         "title": "Activation Kwargs",
         "type": "object"
      },
      "sequence": {
         "default": "CNA",
         "description": "Sequence of operations in the block.",
         "enum": [
            "C",
            "AC",
            "CA",
            "CD",
            "CN",
            "DC",
            "NC",
            "ACD",
            "ACN",
            "ADC",
            "ANC",
            "CAD",
            "CAN",
            "CDA",
            "CDN",
            "CNA",
            "CND",
            "DAC",
            "DCA",
            "DCN",
            "DNC",
            "NAC",
            "NCA",
            "NCD",
            "NDC",
            "ACDN",
            "ACND",
            "ADCN",
            "ADNC",
            "ANCD",
            "ANDC",
            "CADN",
            "CAND",
            "CDAN",
            "CDNA",
            "CNAD",
            "CNDA",
            "DACN",
            "DANC",
            "DCAN",
            "DCNA",
            "DNAC",
            "DNCA",
            "NACD",
            "NADC",
            "NCAD",
            "NCDA",
            "NDAC",
            "NDCA"
         ],
         "title": "Sequence",
         "type": "string"
      },
      "drop_prob": {
         "default": 0.0,
         "description": "Dropout probability.",
         "title": "Drop Prob",
         "type": "number"
      }
   },
   "required": [
      "in_channels",
      "out_channels",
      "kernel_size"
   ]
}

Config:
  • arbitrary_types_allowed: bool = True

  • extra: str = ignore

  • validate_default: bool = True

  • validate_assignment: bool = True

  • validate_return: bool = True

Fields:
Validators:
field in_channels: int [Required]#

Number of input channels

Validated by:
field out_channels: int [Required]#

Number of output channels

Validated by:
field kernel_size: int | tuple[int, ...] [Required]#

Kernel size for the convolution

Validated by:
field padding: int | tuple[int, ...] | str = 'same'#

Padding for the convolution. Can be ‘same’ or an integer/tuple of integers.

Validated by:
field stride: int = 1#

Stride for the convolution

Validated by:
field conv_kwargs: dict[str, Any] = {}#

Additional keyword arguments for the convolution layer

Validated by:
field transposed: bool = False#

Whether to perform ConvTranspose instead of Conv

Validated by:
field normalization: str | None = 'batchnorm3d'#

Normalization layer type.

Validated by:
field normalization_pre_args: list = []#

Arguments for the normalization layer before providing the dimension. Useful when GroupNorm layers are being used, to specify the number of groups.

Validated by:
field normalization_post_args: list = []#

Arguments for the normalization layer after providing the dimension.

Validated by:
field normalization_kwargs: dict = {}#

Additional keyword arguments for the normalization layer

Validated by:
field activation: str | None = 'relu'#

Activation function type.

Validated by:
field activation_kwargs: dict = {}#

Additional keyword arguments for the activation function.

Validated by:
field sequence: Literal['C', 'AC', 'CA', 'CD', 'CN', 'DC', 'NC', 'ACD', 'ACN', 'ADC', 'ANC', 'CAD', 'CAN', 'CDA', 'CDN', 'CNA', 'CND', 'DAC', 'DCA', 'DCN', 'DNC', 'NAC', 'NCA', 'NCD', 'NDC', 'ACDN', 'ACND', 'ADCN', 'ADNC', 'ANCD', 'ANDC', 'CADN', 'CAND', 'CDAN', 'CDNA', 'CNAD', 'CNDA', 'DACN', 'DANC', 'DCAN', 'DCNA', 'DNAC', 'DNCA', 'NACD', 'NADC', 'NCAD', 'NCDA', 'NDAC', 'NDCA'] = 'CNA'#

Sequence of operations in the block.

Validated by:
field drop_prob: float = 0.0#

Dropout probability.

Validated by:
validator validate  »  all fields[source]#

Base method for validating the model after creation.
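The sequence field selects the order of operations in the block, one letter per operation. A minimal sketch of how such a string could be expanded (the letter-to-operation mapping is inferred from the field names and the "CNA" default, not taken from the library's source):

```python
# Hypothetical sketch: expand a sequence string such as "CNA" into an
# ordered list of operation names. The real block builds nn layers
# (Conv, Norm, Activation, Dropout) in this order instead.
OP_NAMES = {"C": "conv", "N": "normalization", "A": "activation", "D": "dropout"}

def expand_sequence(sequence: str) -> list[str]:
    return [OP_NAMES[ch] for ch in sequence]

expand_sequence("CNA")  # -> ['conv', 'normalization', 'activation']
```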

pydantic model vision_architectures.blocks.cnn.MultiResCNNBlockConfig[source]#

Bases: CNNBlockConfig

Show JSON schema
{
   "title": "MultiResCNNBlockConfig",
   "type": "object",
   "properties": {
      "in_channels": {
         "description": "Number of input channels",
         "title": "In Channels",
         "type": "integer"
      },
      "out_channels": {
         "description": "Number of output channels",
         "title": "Out Channels",
         "type": "integer"
      },
      "kernel_size": {
         "default": 3,
         "description": "Kernel size for the convolution. Only kernel_size=3 is supported for MultiResCNNBlock.",
         "title": "Kernel Size",
         "type": "integer"
      },
      "padding": {
         "const": "same",
         "default": "same",
         "description": "Padding for the convolution. Only 'same' is supported for MultiResCNNBlock.",
         "title": "Padding",
         "type": "string"
      },
      "stride": {
         "default": 1,
         "description": "Stride for the convolution",
         "title": "Stride",
         "type": "integer"
      },
      "conv_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the convolution layer",
         "title": "Conv Kwargs",
         "type": "object"
      },
      "transposed": {
         "default": false,
         "description": "Whether to perform ConvTranspose instead of Conv",
         "title": "Transposed",
         "type": "boolean"
      },
      "normalization": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "batchnorm3d",
         "description": "Normalization layer type.",
         "title": "Normalization"
      },
      "normalization_pre_args": {
         "default": [],
         "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.",
         "items": {},
         "title": "Normalization Pre Args",
         "type": "array"
      },
      "normalization_post_args": {
         "default": [],
         "description": "Arguments for the normalization layer after providing the dimension.",
         "items": {},
         "title": "Normalization Post Args",
         "type": "array"
      },
      "normalization_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the normalization layer",
         "title": "Normalization Kwargs",
         "type": "object"
      },
      "activation": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "relu",
         "description": "Activation function type.",
         "title": "Activation"
      },
      "activation_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Additional keyword arguments for the activation function.",
         "title": "Activation Kwargs",
         "type": "object"
      },
      "sequence": {
         "default": "CNA",
         "description": "Sequence of operations in the block.",
         "enum": [
            "C",
            "AC",
            "CA",
            "CD",
            "CN",
            "DC",
            "NC",
            "ACD",
            "ACN",
            "ADC",
            "ANC",
            "CAD",
            "CAN",
            "CDA",
            "CDN",
            "CNA",
            "CND",
            "DAC",
            "DCA",
            "DCN",
            "DNC",
            "NAC",
            "NCA",
            "NCD",
            "NDC",
            "ACDN",
            "ACND",
            "ADCN",
            "ADNC",
            "ANCD",
            "ANDC",
            "CADN",
            "CAND",
            "CDAN",
            "CDNA",
            "CNAD",
            "CNDA",
            "DACN",
            "DANC",
            "DCAN",
            "DCNA",
            "DNAC",
            "DNCA",
            "NACD",
            "NADC",
            "NCAD",
            "NCDA",
            "NDAC",
            "NDCA"
         ],
         "title": "Sequence",
         "type": "string"
      },
      "drop_prob": {
         "default": 0.0,
         "description": "Dropout probability.",
         "title": "Drop Prob",
         "type": "number"
      },
      "kernel_sizes": {
         "default": [
            3,
            5,
            7
         ],
         "description": "Kernel sizes for each conv layer.",
         "items": {
            "anyOf": [
               {
                  "type": "integer"
               },
               {
                  "items": {
                     "type": "integer"
                  },
                  "type": "array"
               }
            ]
         },
         "title": "Kernel Sizes",
         "type": "array"
      },
      "filter_ratios": {
         "default": [
            1,
            2,
            3
         ],
         "description": "Ratio of filters to out_channels for each conv layer. Will be scaled to sum to 1.",
         "items": {
            "type": "number"
         },
         "title": "Filter Ratios",
         "type": "array"
      }
   },
   "required": [
      "in_channels",
      "out_channels"
   ]
}

Config:
  • arbitrary_types_allowed: bool = True

  • extra: str = ignore

  • validate_default: bool = True

  • validate_assignment: bool = True

  • validate_return: bool = True

Fields:
Validators:
field kernel_sizes: tuple[int | tuple[int, ...], ...] = (3, 5, 7)#

Kernel sizes for each conv layer.

Validated by:
field filter_ratios: tuple[float, ...] = (1, 2, 3)#

Ratio of filters to out_channels for each conv layer. Will be scaled to sum to 1.

Validated by:
field padding: Literal['same'] = 'same'#

Padding for the convolution. Only ‘same’ is supported for MultiResCNNBlock.

Validated by:
field kernel_size: int = 3#

Kernel size for the convolution. Only kernel_size=3 is supported for MultiResCNNBlock.

Validated by:
validator scale_filter_ratios  »  filter_ratios[source]#
validator validate  »  all fields[source]#

Base method for validating the model after creation.
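The scale_filter_ratios validator's documented behaviour ("will be scaled to sum to 1") can be sketched as follows. This is a hypothetical re-implementation for illustration, not the library's code:

```python
# Hypothetical sketch: normalize filter_ratios so they sum to 1,
# matching the documented behaviour of the scale_filter_ratios validator.
def scale_filter_ratios(ratios: tuple[float, ...]) -> tuple[float, ...]:
    total = sum(ratios)
    return tuple(r / total for r in ratios)

scale_filter_ratios((1, 2, 3))  # -> (1/6, 1/3, 1/2)
```

Each conv layer in the block would then receive roughly out_channels times its scaled ratio as its filter count.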

class vision_architectures.blocks.cnn.CNNBlock3D(config={}, checkpointing_level=0, **kwargs)[source]#

Bases: _CNNBlock

A block to perform a sequence of convolution, activation, normalization, and dropout operations. This class is designed for 3D inputs, e.g. medical images, videos, etc.

__init__(config={}, checkpointing_level=0, **kwargs)[source]#

Initialize the CNNBlock3D block. Activation checkpointing level 1.

Parameters:
  • config (CNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary and the instance will be created automatically.

  • checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.

  • **kwargs – Additional keyword arguments for configuration.

class vision_architectures.blocks.cnn.CNNBlock2D(config={}, checkpointing_level=0, **kwargs)[source]#

Bases: _CNNBlock

A block to perform a sequence of convolution, activation, normalization, and dropout operations. This class is designed for 2D inputs, e.g. natural images.

__init__(config={}, checkpointing_level=0, **kwargs)[source]#

Initialize the CNNBlock2D block. Activation checkpointing level 1.

Parameters:
  • config (CNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary and the instance will be created automatically.

  • checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.

  • **kwargs – Additional keyword arguments for configuration.

class vision_architectures.blocks.cnn.MultiResCNNBlock3D(config={}, checkpointing_level=0, **kwargs)[source]#

Bases: _MultiResCNNBlock

__init__(config={}, checkpointing_level=0, **kwargs)[source]#

Initialize the MultiResCNNBlock3D block. Activation checkpointing level 2.

Parameters:
  • config (MultiResCNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary and the instance will be created automatically.

  • checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.

  • **kwargs – Additional keyword arguments for configuration.

class vision_architectures.blocks.cnn.MultiResCNNBlock2D(config={}, checkpointing_level=0, **kwargs)[source]#

Bases: _MultiResCNNBlock

__init__(config={}, checkpointing_level=0, **kwargs)[source]#

Initialize the MultiResCNNBlock2D block. Activation checkpointing level 2.

Parameters:
  • config (MultiResCNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary and the instance will be created automatically.

  • checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.

  • **kwargs – Additional keyword arguments for configuration.

class vision_architectures.blocks.cnn.TensorSplittingConv(conv, num_splits, optimize_num_splits=True)[source]#

Bases: Module

Convolution layer that operates on splits of a tensor on the desired device and concatenates the results to give a lossless output. This is useful for large input tensors whose intermediate buffers in the conv layer would not fit in memory. Works for both 2D and 3D convolutions.

__init__(conv, num_splits, optimize_num_splits=True)[source]#

Initialize the TensorSplittingConv layer.

Parameters:
  • conv (Module) – Convolution layer to be used for splitting. Must be either nn.Conv2d or nn.Conv3d.

  • num_splits (int | tuple[int, ...]) – Number of splits for each spatial dimension. If an int is provided, it will be used for all spatial dimensions. If a tuple is provided, it must have the same length as the number of spatial dimensions.

  • optimize_num_splits (bool) – Whether to optimize the number of splits based on the input shape. An example of optimization is provided below. Defaults to True.

get_receptive_field()#

Calculate the receptive field of the convolution layer.

Return type:

tuple[int, ...]
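For a single conv layer, the receptive field along each spatial dimension is dilation * (kernel - 1) + 1. A sketch of that arithmetic (a hypothetical helper, not the module's actual implementation):

```python
# Hypothetical sketch: receptive field of a single conv layer along
# each spatial dimension, given per-dim kernel sizes and dilations.
def receptive_field(kernel_size: tuple[int, ...],
                    dilation: tuple[int, ...]) -> tuple[int, ...]:
    return tuple(d * (k - 1) + 1 for k, d in zip(kernel_size, dilation))

receptive_field((3, 3, 3), (1, 1, 1))  # -> (3, 3, 3)
```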

get_edge_context()#

Calculate the context size required to eliminate edge effects when merging the conv outputs into one.

get_input_shape(input_shape)[source]#

Get the input shape of the convolution layer. This function removes any unnecessary dimensions and ensures that the input shape is of length equal to the number of spatial dimensions.

Parameters:

input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.

Return type:

tuple[int, ...]

Returns:

Tuple of the input shape for the convolution layer, with only the spatial dimensions.

get_optimized_num_splits(input_shape)[source]#

Optimize the number of splits for each dimension based on the input shape and number of splits

Example

Suppose the input shape is (110, 110) and num_splits is (12, 12). The input would first be padded to (120, 120) and then split into pieces of size (10+overlap, 10+overlap) each. Notice, however, that the padding of 10 equals the split size itself and is therefore completely unnecessary: the same result can be achieved with num_splits = (11, 11), reducing the number of splits to process from 144 to 121 (i.e. 23 fewer).

Parameters:

input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.

Return type:

tuple[int, ...]

Returns:

Tuple of optimized number of splits for each dimension.
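The optimization described in the example above can be sketched per dimension: compute the per-split size implied by the requested split count, then take the fewest splits that keep that size. This is a hypothetical re-implementation for illustration, not the library's code:

```python
import math

# Hypothetical sketch: for each spatial dim, keep the per-split size
# implied by the requested num_splits, but use the fewest splits that
# still cover the input at that size, avoiding useless padding.
def optimize_num_splits(input_shape: tuple[int, ...],
                        num_splits: tuple[int, ...]) -> tuple[int, ...]:
    optimized = []
    for size, n in zip(input_shape, num_splits):
        split_size = math.ceil(size / n)                 # size of each split
        optimized.append(math.ceil(size / split_size))   # fewest splits needed
    return tuple(optimized)

optimize_num_splits((110, 110), (12, 12))  # -> (11, 11), as in the example
```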

pad_input_for_divisibility(x, num_splits=None)[source]#

Pad the input at the end of every spatial dimension such that it is perfectly divisible by the number of splits.

Parameters:
  • x (Tensor) – Input tensor of shape (batch_size, in_channels, [z], y, x).

  • num_splits (Optional[tuple[int, ...]]) – Number of splits for each spatial dimension. If None, the default num_splits will be used.

Return type:

Tensor

Returns:

Padded input tensor of shape (batch_size, in_channels, [z], y, x).
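The padding amount itself is simple modular arithmetic: pad each spatial dimension up to the next multiple of its split count. A stdlib-only sketch of that calculation (hypothetical helper, not the library's code):

```python
# Hypothetical sketch: trailing pad per spatial dim so each dim
# becomes exactly divisible by its number of splits.
def divisibility_padding(spatial_shape: tuple[int, ...],
                         num_splits: tuple[int, ...]) -> tuple[int, ...]:
    return tuple((n - s % n) % n for s, n in zip(spatial_shape, num_splits))

divisibility_padding((110, 110), (12, 12))  # -> (10, 10)
```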

get_split_size(input_shape, num_splits=None)[source]#

Calculate the split size for each dimension based on the input shape and number of splits.

Parameters:
  • input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.

  • num_splits (Optional[tuple[int, ...]]) – Number of splits for each spatial dimension. If None, the default num_splits will be used.

Return type:

tuple[int, ...]

Returns:

Tuple of split sizes for each dimension.

get_split_stride(input_shape, num_splits=None)[source]#

Calculate the split stride for each dimension based on the input shape and context size.

Parameters:
  • input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.

  • num_splits (Optional[tuple[int, ...]]) – Number of splits for each spatial dimension. If None, the default num_splits will be used.

Return type:

tuple[int, ...]

Returns:

Tuple of split strides for each dimension.

pad_input_for_context(x)[source]#

Pad the input with the context size for consistent merging.

Parameters:

x (Tensor) – Input tensor of shape (batch_size, in_channels, [z], y, x).

Return type:

Tensor

Returns:

Padded input tensor of shape (batch_size, in_channels, [z], y, x).

forward(x, channels_first=True)[source]#
Forward pass through the convolution layer with tensor splitting parallelism. The main convolution occurs on the layer's device, but the output is built on the input tensor's device.

Parameters:

x (Tensor) – Tensor of shape (B, C, Z, Y, X) or (B, Z, Y, X, C) representing the input features.

Return type:

Tensor

Returns:

Tensor of shape (B, C, Z, Y, X) or (B, Z, Y, X, C) representing the output features.
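The lossless split-and-merge idea behind this forward pass can be demonstrated with a plain nn.Conv2d, independent of TensorSplittingConv itself: run the conv on two overlapping halves (one pixel of context per side for a 3x3 kernel) and trim the overlap, and the concatenated result matches the full-tensor output. A minimal sketch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(1, 1, kernel_size=3, padding=1)
x = torch.randn(1, 1, 8, 16)

# Full pass over the whole tensor.
full = conv(x)

# Split the last dim at column 8 with 1 column of context on each
# inner edge (the receptive-field radius of a 3x3 kernel), then trim
# the context columns from the outputs before concatenating.
left = conv(x[..., : 8 + 1])[..., :8]
right = conv(x[..., 8 - 1 :])[..., 1:]
merged = torch.cat([left, right], dim=-1)

assert torch.allclose(full, merged, atol=1e-5)
```

The interior edges of each split see real neighboring values rather than zero padding, which is what makes the merge exact; this context size is what get_edge_context computes for the wrapped layer.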

extra_repr()[source]#

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

vision_architectures.blocks.cnn.add_tsp_to_module(module, num_splits_2d=None, num_splits_3d=None, strict=True)[source]#

Recursively add TensorSplittingConv to the module for all Conv2d and Conv3d layers.

Parameters:
  • module (Module) – The module to modify.

  • num_splits_2d (Union[int, tuple[int, int], None]) – Number of splits for 2D convolutions. If None, 2D convolutions will not be modified.

  • num_splits_3d (Union[int, tuple[int, int, int], None]) – Number of splits for 3D convolutions. If None, 3D convolutions will not be modified.

  • strict (bool) – Whether to raise an error if a conversion fails. If False, it will log the error and continue.

Return type:

Module

Returns:

The modified module with TensorSplittingConv layers.

Raises:
  • ValueError – If both num_splits_2d and num_splits_3d are None.

  • Exception – If a conversion fails and strict is True.

vision_architectures.blocks.cnn.remove_tsp_from_module(module)[source]#

Recursively remove TensorSplittingConv from the module and replace it with the original convolution layer.

Parameters:

module (Module) – The module to modify.

Return type:

Module

Returns:

The modified module with TensorSplittingConv layers replaced by the original convolution layers.