UPerNet3D
- pydantic model vision_architectures.nets.upernet_3d.UPerNet3DFusionConfig
Bases: CNNBlockConfig
JSON schema:
{ "title": "UPerNet3DFusionConfig", "type": "object", "properties": { "in_channels": { "default": null, "description": "Calculated based on other parameters", "title": "In Channels", "type": "null" }, "out_channels": { "default": null, "description": "Calculated based on other parameters", "title": "Out Channels", "type": "null" }, "kernel_size": { "default": 3, "description": "Kernel size for the convolutional layers", "title": "Kernel Size", "type": "integer" }, "padding": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" }, { "type": "string" } ], "default": "same", "description": "Padding for the convolution. Can be 'same' or an integer/tuple of integers.", "title": "Padding" }, "stride": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" } ], "default": 1, "description": "Stride for the convolution", "title": "Stride" }, "conv_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the convolution layer", "title": "Conv Kwargs", "type": "object" }, "transposed": { "default": false, "description": "Whether to perform ConvTranspose instead of Conv", "title": "Transposed", "type": "boolean" }, "normalization": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "batchnorm3d", "description": "Normalization layer type.", "title": "Normalization" }, "normalization_pre_args": { "default": [], "description": "Arguments for the normalization layer before providing the dimension. 
Useful when using GroupNorm layers are being used to specify the number of groups.", "items": {}, "title": "Normalization Pre Args", "type": "array" }, "normalization_post_args": { "default": [], "description": "Arguments for the normalization layer after providing the dimension.", "items": {}, "title": "Normalization Post Args", "type": "array" }, "normalization_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the normalization layer", "title": "Normalization Kwargs", "type": "object" }, "activation": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "relu", "description": "Activation function type.", "title": "Activation" }, "activation_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the activation function.", "title": "Activation Kwargs", "type": "object" }, "sequence": { "default": "CNA", "description": "Sequence of operations in the block.", "enum": [ "C", "AC", "CA", "CD", "CN", "DC", "NC", "ACD", "ACN", "ADC", "ANC", "CAD", "CAN", "CDA", "CDN", "CNA", "CND", "DAC", "DCA", "DCN", "DNC", "NAC", "NCA", "NCD", "NDC", "ACDN", "ACND", "ADCN", "ADNC", "ANCD", "ANDC", "CADN", "CAND", "CDAN", "CDNA", "CNAD", "CNDA", "DACN", "DANC", "DCAN", "DCNA", "DNAC", "DNCA", "NACD", "NADC", "NCAD", "NCDA", "NDAC", "NDCA" ], "title": "Sequence", "type": "string" }, "drop_prob": { "default": 0.0, "description": "Dropout probability.", "title": "Drop Prob", "type": "number" }, "num_features": { "description": "Number of input feature maps", "title": "Num Features", "type": "integer" }, "dim": { "description": "Dimension of the fused feature map", "title": "Dim", "type": "integer" }, "fused_shape": { "anyOf": [ { "maxItems": 3, "minItems": 3, "prefixItems": [ { "type": "integer" }, { "type": "integer" }, { "type": "integer" } ], "type": "array" }, { "type": "null" } ], "default": null, "description": "Shape of the fused feature map. 
It can also be provided during runtime. If None, highest input resolution is used.", "title": "Fused Shape" }, "interpolation_mode": { "default": "trilinear", "description": "Interpolation mode for the FPN block.", "title": "Interpolation Mode", "type": "string" } }, "required": [ "num_features", "dim" ] }
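The `sequence` enum in the schema above appears to be every ordering of the letters A, C, D and N that includes C. The sketch below regenerates that list with the standard library. Reading the letters as activation, convolution, dropout and normalization is an assumption based on the neighbouring field names; the schema itself does not spell it out, and `valid_sequences` is a hypothetical helper, not part of the library.

```python
from itertools import combinations, permutations

# Assumed meaning: Activation, Convolution, Dropout, Normalization.
LETTERS = "ACDN"


def valid_sequences(letters: str = LETTERS, required: str = "C") -> list:
    """Return every ordering of every subset of `letters` that contains `required`."""
    sequences = set()
    for size in range(1, len(letters) + 1):
        for subset in combinations(letters, size):
            if required not in subset:
                continue  # a block without a convolution is not listed in the enum
            for ordering in permutations(subset):
                sequences.add("".join(ordering))
    # Sort by length first, then alphabetically, to mirror the enum's ordering.
    return sorted(sequences, key=lambda s: (len(s), s))


SEQUENCES = valid_sequences()
```

Under that reading, the default `"CNA"` would mean convolution, then normalization, then activation.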
- Config:
arbitrary_types_allowed: bool = True
extra: str = ignore
validate_default: bool = True
validate_assignment: bool = True
validate_return: bool = True
- field num_features: int [Required]
  Number of input feature maps.
- field kernel_size: int = 3
  Kernel size for the convolutional layers.
- field dim: int [Required]
  Dimension of the fused feature map.
- field fused_shape: tuple[int, int, int] | None = None
  Shape of the fused feature map. It can also be provided at runtime. If None, the highest input resolution is used.
- field interpolation_mode: str = 'trilinear'
  Interpolation mode for the FPN block.
- field in_channels: None = None
  Calculated based on other parameters.
- field out_channels: None = None
  Calculated based on other parameters.
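The class documentation on this page notes that a config can be passed as a dictionary. As a minimal sketch of what such a dictionary might look like, the snippet below merges user overrides onto the defaults listed in the JSON schema. `FUSION_DEFAULTS`, `FUSION_REQUIRED` and `build_fusion_config` are hypothetical names used for illustration only; the library's own validation is done by pydantic and may behave differently.

```python
# Defaults copied from the UPerNet3DFusionConfig JSON schema (subset shown).
FUSION_DEFAULTS = {
    "kernel_size": 3,
    "padding": "same",
    "stride": 1,
    "normalization": "batchnorm3d",
    "activation": "relu",
    "sequence": "CNA",
    "drop_prob": 0.0,
    "fused_shape": None,
    "interpolation_mode": "trilinear",
}

# Fields the schema marks as required.
FUSION_REQUIRED = ("num_features", "dim")


def build_fusion_config(**overrides) -> dict:
    """Hypothetical helper: merge overrides onto the schema defaults."""
    missing = [name for name in FUSION_REQUIRED if name not in overrides]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return {**FUSION_DEFAULTS, **overrides}


# Illustrative values: four input feature maps fused into 128 channels.
fusion_config = build_fusion_config(num_features=4, dim=128, fused_shape=(32, 64, 64))
```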
- pydantic model vision_architectures.nets.upernet_3d.UPerNet3DConfig
Bases: FPN3DConfig
JSON schema:
{ "title": "UPerNet3DConfig", "type": "object", "properties": { "blocks": { "description": "List of configs for the FPN blocks.", "items": { "$ref": "#/$defs/FPN3DBlockConfig" }, "title": "Blocks", "type": "array" }, "fusion": { "$ref": "#/$defs/UPerNet3DFusionConfig", "description": "Configuration for the UPerNet3D fusion block" }, "enabled_outputs": { "default": [ "object" ], "description": "Select which outputs to enable", "items": { "enum": [ "object", "part", "scene", "material", "texture" ], "type": "string" }, "title": "Enabled Outputs", "type": "array", "uniqueItems": true }, "num_objects": { "anyOf": [ { "type": "integer" }, { "type": "null" } ], "default": null, "description": "Number of object classes", "title": "Num Objects" } }, "$defs": { "FPN3DBlockConfig": { "properties": { "in_channels": { "default": null, "description": "Calculated based on other parameters", "title": "In Channels", "type": "null" }, "out_channels": { "default": null, "description": "Calculated based on other parameters", "title": "Out Channels", "type": "null" }, "kernel_size": { "default": 3, "description": "Kernel size for the convolutional layers in the FPN block.", "title": "Kernel Size", "type": "integer" }, "padding": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" }, { "type": "string" } ], "default": "same", "description": "Padding for the convolution. 
Can be 'same' or an integer/tuple of integers.", "title": "Padding" }, "stride": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" } ], "default": 1, "description": "Stride for the convolution", "title": "Stride" }, "conv_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the convolution layer", "title": "Conv Kwargs", "type": "object" }, "transposed": { "default": false, "description": "Whether to perform ConvTranspose instead of Conv", "title": "Transposed", "type": "boolean" }, "normalization": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "batchnorm3d", "description": "Normalization layer type.", "title": "Normalization" }, "normalization_pre_args": { "default": [], "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.", "items": {}, "title": "Normalization Pre Args", "type": "array" }, "normalization_post_args": { "default": [], "description": "Arguments for the normalization layer after providing the dimension.", "items": {}, "title": "Normalization Post Args", "type": "array" }, "normalization_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the normalization layer", "title": "Normalization Kwargs", "type": "object" }, "activation": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "relu", "description": "Activation function type.", "title": "Activation" }, "activation_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the activation function.", "title": "Activation Kwargs", "type": "object" }, "sequence": { "default": "CNA", "description": "Sequence of operations in the block.", "enum": [ "C", "AC", "CA", "CD", "CN", "DC", "NC", "ACD", "ACN", "ADC", "ANC", "CAD", "CAN", "CDA", "CDN", "CNA", "CND", "DAC", 
"DCA", "DCN", "DNC", "NAC", "NCA", "NCD", "NDC", "ACDN", "ACND", "ADCN", "ADNC", "ANCD", "ANDC", "CADN", "CAND", "CDAN", "CDNA", "CNAD", "CNDA", "DACN", "DANC", "DCAN", "DCNA", "DNAC", "DNCA", "NACD", "NADC", "NCAD", "NCDA", "NDAC", "NDCA" ], "title": "Sequence", "type": "string" }, "drop_prob": { "default": 0.0, "description": "Dropout probability.", "title": "Drop Prob", "type": "number" }, "dim": { "description": "Input channel dimension of the FPN block.", "title": "Dim", "type": "integer" }, "skip_conn_dim": { "description": "Input channel dimension of the skip connection.", "title": "Skip Conn Dim", "type": "integer" }, "interpolation_mode": { "default": "trilinear", "description": "Interpolation mode for the FPN block.", "title": "Interpolation Mode", "type": "string" }, "merge_method": { "default": "add", "description": "Merge method for the FPN block.", "enum": [ "add", "concat" ], "title": "Merge Method", "type": "string" } }, "required": [ "dim", "skip_conn_dim" ], "title": "FPN3DBlockConfig", "type": "object" }, "UPerNet3DFusionConfig": { "properties": { "in_channels": { "default": null, "description": "Calculated based on other parameters", "title": "In Channels", "type": "null" }, "out_channels": { "default": null, "description": "Calculated based on other parameters", "title": "Out Channels", "type": "null" }, "kernel_size": { "default": 3, "description": "Kernel size for the convolutional layers", "title": "Kernel Size", "type": "integer" }, "padding": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" }, { "type": "string" } ], "default": "same", "description": "Padding for the convolution. 
Can be 'same' or an integer/tuple of integers.", "title": "Padding" }, "stride": { "anyOf": [ { "type": "integer" }, { "items": { "type": "integer" }, "type": "array" } ], "default": 1, "description": "Stride for the convolution", "title": "Stride" }, "conv_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the convolution layer", "title": "Conv Kwargs", "type": "object" }, "transposed": { "default": false, "description": "Whether to perform ConvTranspose instead of Conv", "title": "Transposed", "type": "boolean" }, "normalization": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "batchnorm3d", "description": "Normalization layer type.", "title": "Normalization" }, "normalization_pre_args": { "default": [], "description": "Arguments for the normalization layer before providing the dimension. Useful when using GroupNorm layers are being used to specify the number of groups.", "items": {}, "title": "Normalization Pre Args", "type": "array" }, "normalization_post_args": { "default": [], "description": "Arguments for the normalization layer after providing the dimension.", "items": {}, "title": "Normalization Post Args", "type": "array" }, "normalization_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the normalization layer", "title": "Normalization Kwargs", "type": "object" }, "activation": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": "relu", "description": "Activation function type.", "title": "Activation" }, "activation_kwargs": { "additionalProperties": true, "default": {}, "description": "Additional keyword arguments for the activation function.", "title": "Activation Kwargs", "type": "object" }, "sequence": { "default": "CNA", "description": "Sequence of operations in the block.", "enum": [ "C", "AC", "CA", "CD", "CN", "DC", "NC", "ACD", "ACN", "ADC", "ANC", "CAD", "CAN", "CDA", "CDN", "CNA", "CND", "DAC", 
"DCA", "DCN", "DNC", "NAC", "NCA", "NCD", "NDC", "ACDN", "ACND", "ADCN", "ADNC", "ANCD", "ANDC", "CADN", "CAND", "CDAN", "CDNA", "CNAD", "CNDA", "DACN", "DANC", "DCAN", "DCNA", "DNAC", "DNCA", "NACD", "NADC", "NCAD", "NCDA", "NDAC", "NDCA" ], "title": "Sequence", "type": "string" }, "drop_prob": { "default": 0.0, "description": "Dropout probability.", "title": "Drop Prob", "type": "number" }, "num_features": { "description": "Number of input feature maps", "title": "Num Features", "type": "integer" }, "dim": { "description": "Dimension of the fused feature map", "title": "Dim", "type": "integer" }, "fused_shape": { "anyOf": [ { "maxItems": 3, "minItems": 3, "prefixItems": [ { "type": "integer" }, { "type": "integer" }, { "type": "integer" } ], "type": "array" }, { "type": "null" } ], "default": null, "description": "Shape of the fused feature map. It can also be provided during runtime. If None, highest input resolution is used.", "title": "Fused Shape" }, "interpolation_mode": { "default": "trilinear", "description": "Interpolation mode for the FPN block.", "title": "Interpolation Mode", "type": "string" } }, "required": [ "num_features", "dim" ], "title": "UPerNet3DFusionConfig", "type": "object" } }, "required": [ "blocks", "fusion" ] }
- Config:
arbitrary_types_allowed: bool = True
extra: str = ignore
validate_default: bool = True
validate_assignment: bool = True
validate_return: bool = True
- Validators:
  validate » all fields
  validate_before » all fields
- field fusion: UPerNet3DFusionConfig [Required]
  Configuration for the UPerNet3D fusion block.
- field enabled_outputs: set[Literal['object', 'part', 'scene', 'material', 'texture']] = {'object'}
  Select which outputs to enable.
- field num_objects: int | None = None
  Number of object classes.
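To make the nesting in the schema concrete, here is a sketch of a full config dictionary together with a small structural check. The channel sizes and the number of blocks are illustrative only, and `check_upernet_config` is a hypothetical helper that mirrors the `required` and `enum` entries of the schema; it is not the library's validator.

```python
# Output names allowed by the enabled_outputs enum in the schema.
ALLOWED_OUTPUTS = {"object", "part", "scene", "material", "texture"}

upernet_config = {
    # Each FPN block requires "dim" and "skip_conn_dim" (values are illustrative).
    "blocks": [
        {"dim": 128, "skip_conn_dim": 256},
        {"dim": 128, "skip_conn_dim": 128},
        {"dim": 128, "skip_conn_dim": 64},
    ],
    # The fusion block requires "num_features" and "dim".
    "fusion": {"num_features": 4, "dim": 128},
    "enabled_outputs": {"object"},
    "num_objects": 3,
}


def check_upernet_config(config: dict) -> None:
    """Hypothetical check: raise ValueError if a required key or enum value is wrong."""
    for name in ("blocks", "fusion"):
        if name not in config:
            raise ValueError(f"missing required field: {name}")
    for index, block in enumerate(config["blocks"]):
        for name in ("dim", "skip_conn_dim"):
            if name not in block:
                raise ValueError(f"blocks[{index}] is missing {name}")
    for name in ("num_features", "dim"):
        if name not in config["fusion"]:
            raise ValueError(f"fusion is missing {name}")
    # The schema default for enabled_outputs is ["object"].
    unknown = set(config.get("enabled_outputs", {"object"})) - ALLOWED_OUTPUTS
    if unknown:
        raise ValueError(f"unknown outputs: {sorted(unknown)}")


check_upernet_config(upernet_config)
```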
- class vision_architectures.nets.upernet_3d.UPerNet3DFusion(config={}, checkpointing_level=0, **kwargs)
Bases: Module
Fusion block for UPerNet3D. This class is designed for 3D inputs, e.g. medical images or videos.
- __init__(config={}, checkpointing_level=0, **kwargs)
Initialize the UPerNet3DFusion block.
- Parameters:
  config (UPerNet3DFusionConfig) – An instance of the config class that contains all the configuration parameters. It can also be passed as a dictionary, in which case the instance is created automatically.
  checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.
  **kwargs – Additional keyword arguments for configuration.
- concat_features(features, fused_shape=None)
Interpolate features from different resolutions to the same size and concatenate them.
- Parameters:
  features (list[Tensor]) – A list of channels-first 3D multi-scale features of shapes [(b, dim, d1, h1, w1), (b, dim, d2, h2, w2), …] where d1 > d2 > …
  fused_shape (Optional[tuple[int, int, int]]) – Shape to which all feature maps will be interpolated. If None, the value from the config is used. If that is also None, the shape of the largest feature map is used.
- Return type: Tensor
- Returns: A feature map with a spatial resolution of fused_shape and concatenated channels.
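The shape bookkeeping described for `concat_features` can be sketched without any tensors. The hypothetical helper below predicts the output shape from the input shapes, following the documented fallback order for `fused_shape` (argument, then config, then largest feature map). Choosing the "largest" map by voxel count and the `config_fused_shape` argument are assumptions made for this illustration.

```python
from math import prod
from typing import Optional, Sequence, Tuple

Shape3D = Tuple[int, int, int]


def concatenated_shape(
    feature_shapes: Sequence[Tuple[int, ...]],
    fused_shape: Optional[Shape3D] = None,
    config_fused_shape: Optional[Shape3D] = None,
) -> Tuple[int, ...]:
    """Predict the channels-first shape after resizing and channel-wise concatenation."""
    if not feature_shapes:
        raise ValueError("at least one feature shape is required")
    if len({shape[0] for shape in feature_shapes}) != 1:
        raise ValueError("all features must share the same batch size")

    # Fallback order from the docs: argument, then config, then largest feature map.
    target = fused_shape if fused_shape is not None else config_fused_shape
    if target is None:
        # Assumption: "largest" is taken to mean the most voxels.
        target = max((tuple(shape[2:]) for shape in feature_shapes), key=prod)

    # Resizing leaves channels untouched, so concatenation sums them.
    channels = sum(shape[1] for shape in feature_shapes)
    return (feature_shapes[0][0], channels, *target)


shapes = [(2, 128, 32, 64, 64), (2, 128, 16, 32, 32), (2, 128, 8, 16, 16), (2, 128, 4, 8, 8)]
default_shape = concatenated_shape(shapes)
resized_shape = concatenated_shape(shapes, fused_shape=(16, 16, 16))
```

With four feature maps of `dim = 128`, the concatenated map carries `4 * 128 = 512` channels regardless of the chosen spatial size.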
- fuse_features(concatenated_features)
Fuse features from different resolutions.
- Parameters:
  concatenated_features (Tensor) – A channels-first feature map with a spatial resolution of fused_shape and concatenated channels.
- Return type: Tensor
- Returns: A fused 3D feature map.
- forward(features, fused_shape=None, channels_first=True)
Collect and fuse all of the multi-scale features.
- Parameters:
  features (list[Tensor]) – A list of 3D multi-scale features of shapes [(b, [dim], d1, h1, w1, [dim]), (b, [dim], d2, h2, w2, [dim]), …] where d1 > d2 > …
  fused_shape (Optional[tuple[int, int, int]]) – Shape to which all feature maps will be interpolated. If None, the value from the config is used. If that is also None, the shape of the largest feature map is used.
  channels_first (bool) – Whether the inputs are in channels-first format (B, C, …) or not (B, …, C). The same format is assumed for all the features.
- Return type: Tensor
- Returns: A fused 3D feature map.
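The two layouts accepted through `channels_first` differ only in where the channel axis sits. As a small sketch, the axis orders below convert a 5D shape between (B, …, C) and (B, C, …); they are the same index tuples one would pass to `torch.Tensor.permute`. Whether the library performs the conversion this way internally is an assumption, and `permute_shape` is a hypothetical helper.

```python
from typing import Sequence, Tuple

# (B, D, H, W, C) -> (B, C, D, H, W)
CHANNELS_LAST_TO_FIRST = (0, 4, 1, 2, 3)
# (B, C, D, H, W) -> (B, D, H, W, C)
CHANNELS_FIRST_TO_LAST = (0, 2, 3, 4, 1)


def permute_shape(shape: Sequence[int], order: Sequence[int]) -> Tuple[int, ...]:
    """Reorder a shape the way a tensor permute with the same axis order would."""
    if sorted(order) != list(range(len(shape))):
        raise ValueError("order must be a permutation of the shape's axes")
    return tuple(shape[axis] for axis in order)


channels_last = (2, 32, 64, 64, 128)
channels_first = permute_shape(channels_last, CHANNELS_LAST_TO_FIRST)
round_trip = permute_shape(channels_first, CHANNELS_FIRST_TO_LAST)
```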
- class vision_architectures.nets.upernet_3d.UPerNet3D(config={}, checkpointing_level=0, **kwargs)
Bases: Module, PyTorchModelHubMixin
Implementation of the UPerNet3D architecture. This class is designed for 3D inputs, e.g. medical images or videos.
- __init__(config={}, checkpointing_level=0, **kwargs)
Initialize the UPerNet3D architecture.
- Parameters:
  config (UPerNet3DConfig) – An instance of the config class that contains all the configuration parameters. It can also be passed as a dictionary, in which case the instance is created automatically.
  checkpointing_level (int) – The level of checkpointing to use for activation checkpointing. Refer to ActivationCheckpointing for more details.
  **kwargs – Additional keyword arguments for configuration.
- forward(features, fusion_shape=None, channels_first=True)
Return different outputs from the UPerNet3D architecture as per the paper.
- Parameters:
  features (list[Tensor]) – List of feature maps from the FPN. Each is a tensor of shape (B, C, Z, Y, X) or (B, Z, Y, X, C) representing the input features.
  fusion_shape (Optional[tuple[int, int, int]]) – Desired output shape for the feature fusion. If None and not specified in the config, the shape of the highest-resolution feature map is used.
  channels_first (bool) – Whether the inputs are in channels-first format (B, C, …) or not (B, …, C).
- Return type: dict[str, Tensor]
- Returns: A dictionary of outputs, one per output type. Each value is a tensor of shape (B, C, Z, Y, X) or (B, Z, Y, X, C) representing the output features.