CNN#
- pydantic model vision_architectures.blocks.cnn.CNNBlockConfig[source]#
Bases: CustomBaseModel
{ "title": "CNNBlockConfig", "type": "object", "properties": { "in_channels": { "description": "Number of input channels", "title": "In Channels", "type": "integer" }, "out_channels": { "description": "Number of output channels", "title": "Out Channels", "type": "integer" }, "kernel_size": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" } ], "description": "Kernel size for the convolution", "title": "Kernel Size" }, "padding": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" }, { "type": "string" } ], "default": "same", "description": "Padding for the convolution. Can be 'same' or an integer/tuple of integers.", "title": "Padding" }, "stride": { "default": 1, "description": "Stride for the convolution", "title": "Stride", "type": "integer" }, "conv_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the convolution layer", "title": "Conv Kwargs", "type": "object" }, "transposed": { "default": false, "description": "Whether to perform ConvTranspose instead of Conv", "title": "Transposed", "type": "boolean" }, "normalization": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "batchnorm3d", "description": "Normalization layer type.", "title": "Normalization" }, "normalization_pre_args": { "default": [], "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.", "items": {}, "title": "Normalization Pre Args", "type": "array" }, "normalization_post_args": { "default": [], "description": "Arguments for the normalization layer after providing the dimension.", "items": {}, "title": "Normalization Post Args", "type": "array" }, "normalization_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the normalization layer", "title": "Normalization Kwargs", "type": "object" }, "activation": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "relu", "description": "Activation function type.", "title": "Activation" }, "activation_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the activation function.", "title": "Activation Kwargs", "type": "object" }, "sequence": { "default": "CNA", "description": "Sequence of operations in the block.", "enum": [ "C", "AC", "CA", "CD", "CN", "DC", "NC", "ACD", "ACN", "ADC", "ANC", "CAD", "CAN", "CDA", "CDN", "CNA", "CND", "DAC", "DCA", "DCN", "DNC", "NAC", "NCA", "NCD", "NDC", "ACDN", "ACND", "ADCN", "ADNC", "ANCD", "ANDC", "CADN", "CAND", "CDAN", "CDNA", "CNAD", "CNDA", "DACN", "DANC", "DCAN", "DCNA", "DNAC", "DNCA", "NACD", "NADC", "NCAD", "NCDA", "NDAC", "NDCA" ], "title": "Sequence", "type": "string" }, "drop_prob": { "default": 0.0, "description": "Dropout probability.", "title": "Drop Prob", "type": "number" } }, "required": [ "in_channels", "out_channels", "kernel_size" ] }
- Config:
arbitrary_types_allowed: bool = True
extra: str = ignore
validate_default: bool = True
validate_assignment: bool = True
validate_return: bool = True
- Validators:
  validate » all fields
- field in_channels: int [Required]#
  Number of input channels
  Validated by: validate
- field out_channels: int [Required]#
  Number of output channels
  Validated by: validate
- field kernel_size: int | tuple[int, ...] [Required]#
  Kernel size for the convolution
  Validated by: validate
- field padding: int | tuple[int, ...] | str = 'same'#
  Padding for the convolution. Can be 'same' or an integer/tuple of integers.
  Validated by: validate
- field stride: int = 1#
  Stride for the convolution
  Validated by: validate
- field conv_kwargs: dict[str, Any] = {}#
  Additional keyword arguments for the convolution layer
  Validated by: validate
- field transposed: bool = False#
  Whether to perform ConvTranspose instead of Conv
  Validated by: validate
- field normalization: str | None = 'batchnorm3d'#
  Normalization layer type.
  Validated by: validate
- field normalization_pre_args: list = []#
  Arguments for the normalization layer that are passed before the dimension argument. Useful when GroupNorm layers are used, to specify the number of groups.
  Validated by: validate
- field normalization_post_args: list = []#
  Arguments for the normalization layer that are passed after the dimension argument.
  Validated by: validate
- field normalization_kwargs: dict = {}#
  Additional keyword arguments for the normalization layer
  Validated by: validate
- field activation: str | None = 'relu'#
  Activation function type.
  Validated by: validate
- field activation_kwargs: dict = {}#
  Additional keyword arguments for the activation function.
  Validated by: validate
- field sequence: Literal['C', 'AC', 'CA', 'CD', 'CN', 'DC', 'NC', 'ACD', 'ACN', 'ADC', 'ANC', 'CAD', 'CAN', 'CDA', 'CDN', 'CNA', 'CND', 'DAC', 'DCA', 'DCN', 'DNC', 'NAC', 'NCA', 'NCD', 'NDC', 'ACDN', 'ACND', 'ADCN', 'ADNC', 'ANCD', 'ANDC', 'CADN', 'CAND', 'CDAN', 'CDNA', 'CNAD', 'CNDA', 'DACN', 'DANC', 'DCAN', 'DCNA', 'DNAC', 'DNCA', 'NACD', 'NADC', 'NCAD', 'NCDA', 'NDAC', 'NDCA'] = 'CNA'#
  Sequence of operations in the block, where C = convolution, N = normalization, A = activation, and D = dropout.
  Validated by: validate
- field drop_prob: float = 0.0#
  Dropout probability.
  Validated by: validate
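A minimal construction sketch (hedged: the values shown in the comments follow the defaults documented above):

```python
from vision_architectures.blocks.cnn import CNNBlockConfig

# Only in_channels, out_channels, and kernel_size are required;
# everything else falls back to the documented defaults.
config = CNNBlockConfig(in_channels=1, out_channels=16, kernel_size=3)
print(config.sequence)  # 'CNA' -> convolution, then normalization, then activation
print(config.padding)   # 'same'
```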
- pydantic model vision_architectures.blocks.cnn.MultiResCNNBlockConfig[source]#
Bases: CNNBlockConfig
{ "title": "MultiResCNNBlockConfig", "type": "object", "properties": { "in_channels": { "description": "Number of input channels", "title": "In Channels", "type": "integer" }, "out_channels": { "description": "Number of output channels", "title": "Out Channels", "type": "integer" }, "kernel_size": { "default": 3, "description": "Kernel size for the convolution. Only kernel_size=3 is supported for MultiResCNNBlock.", "title": "Kernel Size", "type": "integer" }, "padding": { "const": "same", "default": "same", "description": "Padding for the convolution. Only 'same' is supported for MultiResCNNBlock.", "title": "Padding", "type": "string" }, "stride": { "default": 1, "description": "Stride for the convolution", "title": "Stride", "type": "integer" }, "conv_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the convolution layer", "title": "Conv Kwargs", "type": "object" }, "transposed": { "default": false, "description": "Whether to perform ConvTranspose instead of Conv", "title": "Transposed", "type": "boolean" }, "normalization": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "batchnorm3d", "description": "Normalization layer type.", "title": "Normalization" }, "normalization_pre_args": { "default": [], "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.", "items": {}, "title": "Normalization Pre Args", "type": "array" }, "normalization_post_args": { "default": [], "description": "Arguments for the normalization layer after providing the dimension.", "items": {}, "title": "Normalization Post Args", "type": "array" }, "normalization_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the normalization layer", "title": "Normalization Kwargs", "type": "object" }, "activation": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "relu", "description": "Activation function type.", "title": "Activation" }, "activation_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the activation function.", "title": "Activation Kwargs", "type": "object" }, "sequence": { "default": "CNA", "description": "Sequence of operations in the block.", "enum": [ "C", "AC", "CA", "CD", "CN", "DC", "NC", "ACD", "ACN", "ADC", "ANC", "CAD", "CAN", "CDA", "CDN", "CNA", "CND", "DAC", "DCA", "DCN", "DNC", "NAC", "NCA", "NCD", "NDC", "ACDN", "ACND", "ADCN", "ADNC", "ANCD", "ANDC", "CADN", "CAND", "CDAN", "CDNA", "CNAD", "CNDA", "DACN", "DANC", "DCAN", "DCNA", "DNAC", "DNCA", "NACD", "NADC", "NCAD", "NCDA", "NDAC", "NDCA" ], "title": "Sequence", "type": "string" }, "drop_prob": { "default": 0.0, "description": "Dropout probability.", "title": "Drop Prob", "type": "number" }, "kernel_sizes": { "default": [ 3, 5, 7 ], "description": "Kernel sizes for each conv layer.", "items": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" } ] }, "title": "Kernel Sizes", "type": "array" }, "filter_ratios": { "default": [ 1, 2, 3 ], "description": "Ratio of filters to out_channels for each conv layer. Will be scaled to sum to 1.", "items": { "type": "number" }, "title": "Filter Ratios", "type": "array" } }, "required": [ "in_channels", "out_channels" ] }
- Config:
arbitrary_types_allowed: bool = True
extra: str = ignore
validate_default: bool = True
validate_assignment: bool = True
validate_return: bool = True
- Validators:
  validate » all fields
  scale_filter_ratios » filter_ratios
- field kernel_sizes: tuple[int | tuple[int, ...], ...] = (3, 5, 7)#
  Kernel sizes for each conv layer.
  Validated by: validate
- field filter_ratios: tuple[float, ...] = (1, 2, 3)#
  Ratio of filters to out_channels for each conv layer. Will be scaled to sum to 1.
  Validated by: scale_filter_ratios, validate
- field padding: Literal['same'] = 'same'#
  Padding for the convolution. Only 'same' is supported for MultiResCNNBlock.
  Validated by: validate
- field kernel_size: int = 3#
  Kernel size for the convolution. Only kernel_size=3 is supported for MultiResCNNBlock.
  Validated by: validate
- validator scale_filter_ratios » filter_ratios[source]#
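A sketch of the scaling behavior (hedged: the exact normalized values assume the validator divides each ratio by their sum):

```python
from vision_architectures.blocks.cnn import MultiResCNNBlockConfig

config = MultiResCNNBlockConfig(in_channels=8, out_channels=64)
# The default filter_ratios (1, 2, 3) are rescaled to sum to 1,
# i.e. roughly (0.167, 0.333, 0.5), so the three conv layers receive
# about 1/6, 2/6, and 3/6 of out_channels respectively.
print(config.filter_ratios)
```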
- class vision_architectures.blocks.cnn.CNNBlock3D(config={}, checkpointing_level=0, **kwargs)[source]#
Bases: _CNNBlock
A block to perform a sequence of convolution, activation, normalization, and dropout operations. This class is designed for 3D inputs, e.g. medical images, videos, etc.
- __init__(config={}, checkpointing_level=0, **kwargs)[source]#
Initialize the CNNBlock3D block. Activation checkpointing level 1.
- Parameters:
  - config (CNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary, in which case the instance will be created automatically.
  - checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.
  - **kwargs – Additional keyword arguments for configuration.
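A usage sketch (assuming the block is invoked like a standard nn.Module and that padding='same' with stride=1 preserves spatial size):

```python
import torch
from vision_architectures.blocks.cnn import CNNBlock3D

# Conv -> BatchNorm3d -> ReLU: the default sequence "CNA" with the
# default normalization ('batchnorm3d') and activation ('relu').
block = CNNBlock3D(config={"in_channels": 1, "out_channels": 16, "kernel_size": 3})
x = torch.randn(2, 1, 16, 32, 32)  # (B, C, Z, Y, X)
y = block(x)
print(y.shape)  # expected: torch.Size([2, 16, 16, 32, 32])
```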
- class vision_architectures.blocks.cnn.CNNBlock2D(config={}, checkpointing_level=0, **kwargs)[source]#
Bases: _CNNBlock
A block to perform a sequence of convolution, activation, normalization, and dropout operations. This class is designed for 2D inputs, e.g. natural images.
- __init__(config={}, checkpointing_level=0, **kwargs)[source]#
Initialize the CNNBlock2D block. Activation checkpointing level 1.
- Parameters:
  - config (CNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary, in which case the instance will be created automatically.
  - checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.
  - **kwargs – Additional keyword arguments for configuration.
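A 2D sketch. The normalization is set explicitly here as an assumption, since the config default ('batchnorm3d') targets 3D inputs; the 2D class may already handle this internally:

```python
import torch
from vision_architectures.blocks.cnn import CNNBlock2D

block = CNNBlock2D(
    config={
        "in_channels": 3,
        "out_channels": 32,
        "kernel_size": 3,
        "normalization": "batchnorm2d",  # assumption: override the 3D default
    }
)
y = block(torch.randn(2, 3, 64, 64))  # (B, C, Y, X)
```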
- class vision_architectures.blocks.cnn.MultiResCNNBlock3D(config={}, checkpointing_level=0, **kwargs)[source]#
Bases: _MultiResCNNBlock
- __init__(config={}, checkpointing_level=0, **kwargs)[source]#
Initialize the MultiResCNNBlock3D block. Activation checkpointing level 2.
- Parameters:
  - config (MultiResCNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary, in which case the instance will be created automatically.
  - checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.
  - **kwargs – Additional keyword arguments for configuration.
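A usage sketch (assuming the block is invoked like a standard nn.Module; per the config docs above, out_channels is apportioned across the conv layers with kernel sizes (3, 5, 7) by the scaled filter_ratios):

```python
import torch
from vision_architectures.blocks.cnn import MultiResCNNBlock3D

block = MultiResCNNBlock3D(config={"in_channels": 8, "out_channels": 64})
y = block(torch.randn(1, 8, 16, 32, 32))  # (B, C, Z, Y, X)
```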
- class vision_architectures.blocks.cnn.MultiResCNNBlock2D(config={}, checkpointing_level=0, **kwargs)[source]#
Bases: _MultiResCNNBlock
- __init__(config={}, checkpointing_level=0, **kwargs)[source]#
Initialize the MultiResCNNBlock2D block. Activation checkpointing level 2.
- Parameters:
  - config (MultiResCNNBlockConfig) – An instance of the Config class that contains all the configuration parameters. It can also be passed as a dictionary, in which case the instance will be created automatically.
  - checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.
  - **kwargs – Additional keyword arguments for configuration.
- class vision_architectures.blocks.cnn.TensorSplittingConv(conv, num_splits, optimize_num_splits=True)[source]#
Bases: Module
Convolution layer that operates on splits of a tensor on the desired device and concatenates the results to give a lossless output. This is useful for large input tensors whose intermediate buffers in the conv layer do not fit in memory. Works for both 2D and 3D convolutions.
- __init__(conv, num_splits, optimize_num_splits=True)[source]#
Initialize the TensorSplittingConv layer.
- Parameters:
  - conv (Module) – Convolution layer to be used for splitting. Must be either nn.Conv2d or nn.Conv3d.
  - num_splits (int | tuple[int, ...]) – Number of splits for each spatial dimension. If an int is provided, it will be used for all spatial dimensions. If a tuple is provided, it must have the same length as the number of spatial dimensions.
  - optimize_num_splits (bool) – Whether to optimize the number of splits based on the input shape. An example of the optimization is provided below. Defaults to True.
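A usage sketch based on the documented behavior (the wrapped layer produces the same output as the plain convolution, with smaller intermediate buffers):

```python
import torch
import torch.nn as nn
from vision_architectures.blocks.cnn import TensorSplittingConv

conv = nn.Conv3d(1, 8, kernel_size=3, padding=1)
# Split each spatial dimension into 2, run the conv on each piece
# (with enough context to avoid edge effects), and merge losslessly.
tsp_conv = TensorSplittingConv(conv, num_splits=2)

x = torch.randn(1, 1, 64, 128, 128)  # (B, C, Z, Y, X)
y = tsp_conv(x)  # equivalent to conv(x), at lower peak memory
```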
- get_receptive_field()#
Calculate the receptive field of the convolution layer.
- Return type:
tuple[int,...]
- get_edge_context()#
Calculate the context size required to eliminate edge effects when merging the conv outputs into one.
- get_input_shape(input_shape)[source]#
Get the input shape for the convolution layer. This function removes any unnecessary dimensions and ensures that the input shape has length equal to the number of spatial dimensions.
- Parameters:
  - input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.
- Return type:
  tuple[int, ...]
- Returns:
  Tuple of the input shape for the convolution layer, with only the spatial dimensions.
- get_optimized_num_splits(input_shape)[source]#
Optimize the number of splits for each dimension based on the input shape and the number of splits.
Example
Say the input shape is (110, 110) and num_splits is (12, 12). The input would first be padded to (120, 120) and then split into pieces of size (10+overlap, 10+overlap) each. Notice, however, that the padding of 10 is completely unnecessary: the same result can be achieved with num_splits = (11, 11), since 110 is already divisible by 11, reducing the number of splits to process from 144 to 121.
- Parameters:
  - input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.
- Return type:
  tuple[int, ...]
- Returns:
  Tuple of optimized number of splits for each dimension.
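A hypothetical run of the example above (the expected value follows the documented reasoning):

```python
import torch.nn as nn
from vision_architectures.blocks.cnn import TensorSplittingConv

conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)
tsp = TensorSplittingConv(conv, num_splits=(12, 12))
# 110 is already divisible by 11, so the padding implied by 12 splits
# is unnecessary and the split count can be reduced.
print(tsp.get_optimized_num_splits((110, 110)))  # expected: (11, 11)
```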
- pad_input_for_divisibility(x, num_splits=None)[source]#
Pad the input at the end of every spatial dimension such that it is perfectly divisible by the number of splits.
- Parameters:
  - x (Tensor) – Input tensor of shape (batch_size, in_channels, [z], y, x).
  - num_splits (Optional[tuple[int, ...]]) – Number of splits for each spatial dimension. If None, the default num_splits will be used.
- Return type:
  Tensor
- Returns:
  Padded input tensor of shape (batch_size, in_channels, [z], y, x).
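A sketch of the padding, reusing the numbers from the example above (the expected shape assumes padding only to the next multiple of the split count):

```python
import torch
import torch.nn as nn
from vision_architectures.blocks.cnn import TensorSplittingConv

conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)
# optimize_num_splits=False keeps (12, 12) so the padding is visible.
tsp = TensorSplittingConv(conv, num_splits=(12, 12), optimize_num_splits=False)
x = torch.randn(1, 1, 110, 110)
padded = tsp.pad_input_for_divisibility(x)
print(padded.shape)  # expected: (1, 1, 120, 120), so 12 divides each spatial dim
```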
- get_split_size(input_shape, num_splits=None)[source]#
Calculate the split size for each dimension based on the input shape and number of splits.
- Parameters:
  - input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.
  - num_splits (Optional[tuple[int, ...]]) – Number of splits for each spatial dimension. If None, the default num_splits will be used.
- Return type:
  tuple[int, ...]
- Returns:
  Tuple of split sizes for each dimension.
- get_split_stride(input_shape, num_splits=None)[source]#
Calculate the split stride for each dimension based on the input shape and context size.
- Parameters:
  - input_shape (tuple[int, ...] | Size | Tensor) – Shape of the input tensor. If a tensor is provided, its shape will be used.
  - num_splits (Optional[tuple[int, ...]]) – Number of splits for each spatial dimension. If None, the default num_splits will be used.
- Return type:
  tuple[int, ...]
- Returns:
  Tuple of split strides for each dimension.
- pad_input_for_context(x)[source]#
Pad the input with the context size for consistent merging.
- Parameters:
  - x (Tensor) – Input tensor of shape (batch_size, in_channels, [z], y, x).
- Return type:
  Tensor
- Returns:
  Padded input tensor of shape (batch_size, in_channels, [z], y, x).
- forward(x, channels_first=True)[source]#
Forward pass through the convolution layer with tensor splitting parallelism. The main convolution occurs on its device, but the output is built on the input tensor's device.
- Parameters:
  - x (Tensor) – Tensor of shape (B, C, Z, Y, X) or (B, Z, Y, X, C) representing the input features.
  - channels_first (bool) – Whether the channel dimension precedes the spatial dimensions, i.e. (B, C, Z, Y, X) rather than (B, Z, Y, X, C). Defaults to True.
- Return type:
  Tensor
- Returns:
  Tensor of shape (B, C, Z, Y, X) or (B, Z, Y, X, C) representing the output features.
- vision_architectures.blocks.cnn.add_tsp_to_module(module, num_splits_2d=None, num_splits_3d=None, strict=True)[source]#
Recursively add TensorSplittingConv to the module for all Conv2d and Conv3d layers.
- Parameters:
  - module (Module) – The module to modify.
  - num_splits_2d (Union[int, tuple[int, int], None]) – Number of splits for 2D convolutions. If None, 2D convolutions will not be modified.
  - num_splits_3d (Union[int, tuple[int, int, int], None]) – Number of splits for 3D convolutions. If None, 3D convolutions will not be modified.
  - strict (bool) – Whether to raise an error if a conversion fails. If False, it will log the error and continue.
- Return type:
  Module
- Returns:
  The modified module with TensorSplittingConv layers.
- Raises:
  - ValueError – If both num_splits_2d and num_splits_3d are None.
  - Exception – If a conversion fails and strict is True.
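A usage sketch pairing this helper with remove_tsp_from_module (documented below):

```python
import torch.nn as nn
from vision_architectures.blocks.cnn import add_tsp_to_module, remove_tsp_from_module

model = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv3d(8, 8, kernel_size=3, padding=1),
)
# Wrap every Conv3d in a TensorSplittingConv with 2 splits per spatial dim.
model = add_tsp_to_module(model, num_splits_3d=2)

# ... run memory-heavy inference ...

# Restore the original convolution layers afterwards.
model = remove_tsp_from_module(model)
```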
- vision_architectures.blocks.cnn.remove_tsp_from_module(module)[source]#
Recursively remove TensorSplittingConv from the module and replace it with the original convolution layer.
- Parameters:
  - module (Module) – The module to modify.
- Return type:
  Module
- Returns:
  The modified module with TensorSplittingConv layers replaced by the original convolution layers.