oneflow.nn¶

These are the basic building blocks for graphs:

Parameter

Containers ¶

`Module`	Base class for all neural network modules.
`Sequential`	A sequential container.
`ModuleList`	Holds submodules in a list.
`ModuleDict`	Holds submodules in a dictionary.
`ParameterList`	Holds parameters in a list.
`ParameterDict`	Holds parameters in a dictionary.

nn.Module ¶

`add_module`	Adds a child module to the current module.
`apply`	Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`buffers`	Returns an iterator over module buffers.
`children`	Returns an iterator over immediate children modules.
`cpu`	Moves all model parameters and buffers to the CPU.
`cuda`	Moves all model parameters and buffers to the GPU.
`double`	Casts all floating point parameters and buffers to `double` datatype.
`train`	Sets the module in training mode.
`eval`	Sets the module in evaluation mode.
`extra_repr`	Sets the extra representation of the module.
`float`	Casts all floating point parameters and buffers to `float` datatype.
`forward`
`load_state_dict`	Copies parameters and buffers from `state_dict` into this module and its descendants.
`modules`	Returns an iterator over all modules in the network.
`named_buffers`	Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`	Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`	Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`	Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`	Returns an iterator over module parameters.
`register_buffer`	Adds a buffer to the module.
`register_forward_hook`	Registers a forward hook on the module.
`register_forward_pre_hook`	Registers a forward pre-hook on the module.
`register_backward_hook`	Registers a backward hook on the module.
`register_full_backward_hook`	Registers a backward hook on the module.
`register_state_dict_pre_hook`	Registers a pre-hook for `state_dict`. Hooks are called with the arguments `self`, `prefix`, and `keep_vars` before `state_dict` is called on `self`.
`register_parameter`	Adds a parameter to the module.
`requires_grad_`	Changes whether autograd should record operations on parameters in this module.
`state_dict`	Returns a dictionary containing the whole state of the module.
`to`	Moves and/or casts the parameters and buffers.
`zero_grad`	Sets gradients of all model parameters to zero.
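
The difference between the immediate-children iterators and the whole-network iterators above can be sketched with a toy container. This is a plain-Python illustration, not OneFlow's implementation: the `ToyModule` class is hypothetical, and the dotted-name prefix scheme (with an empty name for the root) is an assumption modelled on the PyTorch-compatible interface.

```python
class ToyModule:
    """Hypothetical stand-in for a module that only tracks its submodules."""

    def __init__(self):
        self._modules = {}  # name -> child, in insertion order

    def add_module(self, name, module):
        self._modules[name] = module

    def named_children(self):
        # Immediate children only.
        yield from self._modules.items()

    def named_modules(self, prefix=""):
        # The module itself first, then every descendant, depth-first.
        yield prefix, self
        for name, child in self._modules.items():
            child_prefix = f"{prefix}.{name}" if prefix else name
            yield from child.named_modules(child_prefix)


root, block, leaf = ToyModule(), ToyModule(), ToyModule()
block.add_module("conv", leaf)
root.add_module("block", block)

child_names = [name for name, _ in root.named_children()]
module_names = [name for name, _ in root.named_modules()]
```

Here `child_names` contains only the direct child, while `module_names` also lists the root and the nested module.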

Convolution Layers ¶

`nn.Conv1d`	Applies a 1D convolution over an input signal composed of several input planes.
`nn.Conv2d`	Applies a 2D convolution over an input signal composed of several input planes.
`nn.Conv3d`	Applies a 3D convolution over an input signal composed of several input planes.
`nn.ConvTranspose1d`	Applies a 1D transposed convolution operator over an input image composed of several input planes.
`nn.ConvTranspose2d`	Applies a 2D transposed convolution operator over an input image composed of several input planes.
`nn.ConvTranspose3d`	Applies a 3D transposed convolution operator over an input image composed of several input planes.
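`nn.Unfold`	Extracts sliding local blocks from a batched input tensor. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.Unfold.html.
`nn.Fold`	Combines an array of sliding local blocks into a large containing tensor. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.Fold.html.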
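
As a rough guide to what a 1D convolution computes, the following pure-Python sketch slides a kernel over a single-channel signal and also gives the usual output-length formula. It assumes the PyTorch-style convention (cross-correlation without kernel flipping, zero padding on both sides); the function names are hypothetical and this is not how OneFlow implements the layer.

```python
def conv1d_output_length(length, kernel_size, stride=1, padding=0, dilation=1):
    # Standard formula for the number of output positions.
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv1d(signal, kernel, stride=1, padding=0):
    # Single input plane, single output plane, no bias, dilation fixed at 1.
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    k = len(kernel)
    out_len = conv1d_output_length(len(signal), k, stride, padding)
    return [
        sum(padded[i * stride + j] * kernel[j] for j in range(k))
        for i in range(out_len)
    ]


edges = conv1d([1, 2, 3, 4], [1, 0, -1])
```

With several input planes, the same sliding sum is taken over every plane and the results are added together.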

Pooling Layers ¶

`nn.MaxPool1d`	Applies a 1D max pooling over an input signal composed of several input planes.
`nn.MaxPool2d`	Applies a 2D max pooling over an input signal composed of several input planes.
`nn.MaxPool3d`	Applies a 3D max pooling over an input signal composed of several input planes.
`nn.MaxUnpool1d`	Computes a partial inverse of `MaxPool1d`.
`nn.MaxUnpool2d`	Computes a partial inverse of `MaxPool2d`.
`nn.MaxUnpool3d`	Computes a partial inverse of `MaxPool3d`.
`nn.AdaptiveAvgPool1d`	Applies a 1D adaptive average pooling over an input signal composed of several input planes.
`nn.AdaptiveAvgPool2d`	Applies a 2D adaptive average pooling over an input signal composed of several input planes.
`nn.AdaptiveAvgPool3d`	Applies a 3D adaptive average pooling over an input signal composed of several input planes.
`nn.AdaptiveMaxPool1d`	Applies a 1D adaptive max pooling over an input signal composed of several input planes.
`nn.AdaptiveMaxPool2d`	Applies a 2D adaptive max pooling over an input signal composed of several input planes.
`nn.AdaptiveMaxPool3d`	Applies a 3D adaptive max pooling over an input signal composed of several input planes.
`nn.AvgPool1d`	Applies a 1D average pooling over an input signal composed of several input planes.
`nn.AvgPool2d`	Applies a 2D average pooling over an input signal composed of several input planes.
`nn.AvgPool3d`	Applies a 3D average pooling over an input signal composed of several input planes.
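
The pooling variants differ mainly in how windows are chosen and reduced. The sketch below shows 1D max, average, and adaptive average pooling on a plain list. It assumes the stride defaults to the kernel size and that adaptive bins run from floor(i*n/out) to ceil((i+1)*n/out), as in the PyTorch-compatible definition; the helper names are hypothetical.

```python
def max_pool1d(values, kernel_size, stride=None):
    stride = stride or kernel_size
    n = (len(values) - kernel_size) // stride + 1
    return [max(values[i * stride:i * stride + kernel_size]) for i in range(n)]


def avg_pool1d(values, kernel_size, stride=None):
    stride = stride or kernel_size
    n = (len(values) - kernel_size) // stride + 1
    return [
        sum(values[i * stride:i * stride + kernel_size]) / kernel_size
        for i in range(n)
    ]


def adaptive_avg_pool1d(values, output_size):
    # The window size is derived from the requested output size.
    n = len(values)
    out = []
    for i in range(output_size):
        start = (i * n) // output_size
        end = -((-(i + 1) * n) // output_size)  # ceiling division
        out.append(sum(values[start:end]) / (end - start))
    return out
```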

Padding Layers ¶

`nn.ConstantPad1d`	Pads the input tensor boundaries with a constant value.
`nn.ConstantPad2d`	Pads the input with a user-specified constant value.
`nn.ConstantPad3d`	Pads the input tensor boundaries with a constant value.
`nn.ReflectionPad1d`	This operator pads the input tensor using the reflection of the input boundary.
`nn.ReflectionPad2d`	This operator pads the input tensor using the reflection of the input boundary.
`nn.ReplicationPad1d`	Pads the input tensor using replication of the input boundary.
`nn.ReplicationPad2d`	Pads the input tensor using the replication of the input boundary.
`nn.ZeroPad2d`	Pads the input tensor boundaries with zero.
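
The three padding modes can be compared on a short 1D list. This is a plain-Python sketch with hypothetical helper names; it assumes reflection mirrors around the edge element without repeating it, and that the reflection pad width is smaller than the input length.

```python
def constant_pad1d(values, left, right, value=0.0):
    return [value] * left + list(values) + [value] * right


def reflection_pad1d(values, left, right):
    vals = list(values)
    head = vals[left:0:-1]             # mirror of the first elements, edge excluded
    tail = vals[-2:-2 - right:-1]      # mirror of the last elements, edge excluded
    return head + vals + tail


def replication_pad1d(values, left, right):
    vals = list(values)
    return [vals[0]] * left + vals + [vals[-1]] * right
```

Zero padding is the constant mode with `value=0`.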

Non-linear Activations (weighted sum, nonlinearity)¶

`nn.ELU`	Applies the element-wise function
`nn.Hardshrink`	The Hardshrink activation.
`nn.Hardsigmoid`	Applies the element-wise function:
`nn.Hardswish`	Applies the hardswish function, element-wise, as described in the paper Searching for MobileNetV3.
`nn.Hardtanh`	Applies the HardTanh function element-wise
`nn.LeakyReLU`	Applies the element-wise function:
`nn.LogSigmoid`	Applies the element-wise function:
`nn.PReLU`	Applies the element-wise function:
`nn.ReLU`	Applies the rectified linear unit function element-wise:
`nn.ReLU6`	Applies the element-wise function:
`nn.SELU`	Applies the element-wise function:
`nn.CELU`	Applies the element-wise function:
`nn.GELU`	Applies the Gaussian Error Linear Units function. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.GELU.html.
`nn.QuickGELU`	Applies GELU approximation that is fast but somewhat inaccurate.
`nn.SquareReLU`	Applies the relu^2 activation introduced in https://arxiv.org/abs/2109.08668v2
`nn.SiLU`	SiLU (Swish) activation:
`nn.Sigmoid`	Applies the element-wise function:
`nn.Mish`	Applies the element-wise function:
`nn.Softplus`	Applies the element-wise function:
`nn.Softshrink`	The Softshrink activation.
`nn.Softsign`	The SoftSign activation.
`nn.Tanh`	This operator computes the hyperbolic tangent value of Tensor.
`nn.Threshold`	The Threshold Activation.
`nn.GLU`	The GLU activation.
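
Several of the element-wise functions listed above have short closed forms. The scalar sketch below uses their commonly published definitions; the default parameter values (`negative_slope=0.01`, `alpha=1.0`) are assumptions taken from the PyTorch-compatible interface, and the real modules operate on tensors rather than Python floats.

```python
import math


def relu(x):
    return max(0.0, x)


def relu6(x):
    return min(max(0.0, x), 6.0)


def leaky_relu(x, negative_slope=0.01):
    return x if x >= 0 else negative_slope * x


def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def silu(x):
    # SiLU (Swish): the input gated by its own sigmoid.
    return x * sigmoid(x)


def softplus(x):
    # Smooth approximation of ReLU (beta fixed at 1 in this sketch).
    return math.log1p(math.exp(x))
```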

Non-linear Activations (other)¶

`nn.Softmax`	Applies the Softmax function to an n-dimensional input Tensor rescaling them so that the elements of the n-dimensional output Tensor lie in the range [0,1] and sum to 1.
`nn.LogSoftmax`	Applies the LogSoftmax function to an n-dimensional input Tensor.
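
The rescaling performed by Softmax, and its logarithmic counterpart, can be written out for a single vector. This is a minimal sketch with hypothetical function names; subtracting the maximum before exponentiating is a common numerical-stability choice and is not claimed to be what OneFlow does internally.

```python
import math


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(values):
    m = max(values)
    log_total = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_total for v in values]


probs = softmax([1.0, 2.0, 3.0])
log_probs = log_softmax([1.0, 2.0, 3.0])
```

Each entry of `probs` lies in [0, 1], the entries sum to 1, and `log_probs` matches the element-wise logarithm of `probs`.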

Normalization Layers ¶

`nn.BatchNorm1d`	Applies Batch Normalization over a 2D or 3D input (a mini-batch of 1D inputs with optional additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
`nn.BatchNorm2d`	Applies Batch Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
`nn.BatchNorm3d`	Applies Batch Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
`nn.SyncBatchNorm`	Applies Batch Normalization over an N-dimensional input (a mini-batch of [N-2]D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
`nn.FusedBatchNorm1d`	Applies Fused Batch Normalization over a 2D or 3D input, the formula is:
`nn.FusedBatchNorm2d`	Applies Fused Batch Normalization over a 4D input, the formula is:
`nn.FusedBatchNorm3d`	Applies Fused Batch Normalization over a 5D input, the formula is:
`nn.GroupNorm`	Applies Group Normalization over a mini-batch of inputs as described in the paper Group Normalization.
`nn.InstanceNorm1d`	Applies Instance Normalization over a 3D input (a mini-batch of 1D inputs with optional additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.
`nn.InstanceNorm2d`	Applies Instance Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.
`nn.InstanceNorm3d`	Applies Instance Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.
`nn.LayerNorm`	Applies Layer Normalization over a mini-batch of inputs as described in the paper Layer Normalization.
`nn.RMSLayerNorm`	Construct a layernorm module in the T5 style.
`nn.RMSNorm`	Applies Root Mean Square Layer Normalization over a mini-batch of inputs as described in the paper Root Mean Square Layer Normalization.
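
To contrast Layer Normalization with its root-mean-square variant, the sketch below normalizes a single feature vector both ways. It omits the learnable scale and shift parameters, and the `eps` defaults are assumptions; the function names are hypothetical.

```python
import math


def layer_norm(values, eps=1e-5):
    # Subtract the mean, then divide by the (biased) standard deviation.
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def rms_norm(values, eps=1e-6):
    # No mean subtraction: divide by the root mean square only.
    mean_square = sum(v * v for v in values) / len(values)
    return [v / math.sqrt(mean_square + eps) for v in values]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
rms_scaled = rms_norm([3.0, 4.0], eps=0.0)
```

`normalized` has approximately zero mean and unit variance, while `rms_scaled` has a mean square of 1 but keeps its original mean offset.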

Recurrent Layers ¶

`nn.RNN`	Applies a multi-layer Elman RNN with \(\tanh\) or \(\text{ReLU}\) non-linearity to an input sequence.
`nn.LSTM`	Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence.
`nn.GRU`	Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence.
`nn.RNNCell`	An Elman RNN cell with tanh or ReLU non-linearity.
`nn.LSTMCell`	A long short-term memory (LSTM) cell.
`nn.GRUCell`	A gated recurrent unit (GRU) cell.

Linear Layers ¶

`nn.Identity`	A placeholder identity operator that is argument-insensitive.
`nn.Linear`	Applies a linear transformation to the incoming data: \(y = xA^T + b\)
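
The transformation \(y = xA^T + b\) can be spelled out for a single input vector. In this sketch `A` is assumed to be stored as a list of rows with shape (out_features, in_features), matching the transpose in the formula; `linear` is a hypothetical helper, not the OneFlow call.

```python
def linear(x, A, b):
    # y_j = sum_i x_i * A[j][i] + b_j
    return [sum(xi * aji for xi, aji in zip(x, row)) + bj for row, bj in zip(A, b)]


y = linear([1, 2], [[1, 0], [0, 1], [1, 1]], [0, 0, 1])
```

The two-element input is mapped to three outputs, one per row of `A`.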

Dropout Layers ¶

`nn.Dropout`	During training, randomly zeroes some of the elements of the input tensor with probability `p` using samples from a Bernoulli distribution.
`nn.Dropout1d`	Randomly zero out entire channels (a channel is a 1D feature map, e.g., the \(j\)-th channel of the \(i\)-th sample in the batched input is a 1D tensor \(\text{input}[i, j]\)).
`nn.Dropout2d`	Randomly zero out entire channels (a channel is a 2D feature map, e.g., the \(j\)-th channel of the \(i\)-th sample in the batched input is a 2D tensor \(\text{input}[i, j]\)).
`nn.Dropout3d`	Randomly zero out entire channels (a channel is a 3D feature map, e.g., the \(j\)-th channel of the \(i\)-th sample in the batched input is a 3D tensor \(\text{input}[i, j]\)).
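
Element-wise dropout can be sketched as an independent Bernoulli draw per element. The rescaling of surviving elements by 1/(1-p) during training is an assumption taken from the PyTorch-compatible definition, and the `training` and `rng` arguments are hypothetical conveniences for this example.

```python
import random


def dropout(values, p=0.5, training=True, rng=random):
    # In evaluation mode (or with p == 0) the input passes through unchanged.
    if not training or p == 0.0:
        return list(values)
    if p == 1.0:
        return [0.0] * len(values)
    scale = 1.0 / (1.0 - p)
    # Each element is zeroed independently with probability p.
    return [0.0 if rng.random() < p else v * scale for v in values]


sample = dropout([2.0] * 1000, p=0.25, rng=random.Random(0))
```

The channel-wise variants apply the same draw once per channel, so a whole feature map is either kept or zeroed.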

Sparse Layers ¶

`nn.Embedding`	A simple lookup table that stores embeddings of a fixed dictionary and size.
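
The lookup-table behaviour can be illustrated with a list of rows indexed by integer ids. `ToyEmbedding` is a hypothetical stand-in: the Gaussian initialization and the `seed` argument are assumptions made for the example, and the real layer stores a trainable tensor.

```python
import random


class ToyEmbedding:
    def __init__(self, num_embeddings, embedding_dim, seed=0):
        rng = random.Random(seed)
        # One row of embedding_dim values per dictionary entry.
        self.weight = [
            [rng.gauss(0.0, 1.0) for _ in range(embedding_dim)]
            for _ in range(num_embeddings)
        ]

    def __call__(self, indices):
        # Looking up an index simply selects the matching row.
        return [self.weight[i] for i in indices]


table = ToyEmbedding(num_embeddings=10, embedding_dim=3)
vectors = table([1, 4, 1])
```

Repeated indices return the same row, so `vectors[0]` and `vectors[2]` are identical.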

Distance Functions ¶

`nn.CosineSimilarity`	Returns cosine similarity between \(x_1\) and \(x_2\), computed along dim.
`nn.PairwiseDistance`	Computes the pairwise distance between vectors \(v_1\), \(v_2\) using the p-norm:
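
Both distance functions reduce to a few lines for a single pair of vectors. This sketch assumes the cosine denominator is clamped by a small `eps` to avoid division by zero, and it leaves out any epsilon that a library implementation may add inside the p-norm; the helper names are hypothetical.

```python
import math


def cosine_similarity(x1, x2, eps=1e-8):
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    return dot / max(norm1 * norm2, eps)


def pairwise_distance(v1, v2, p=2.0):
    # p-norm of the element-wise difference.
    return sum(abs(a - b) ** p for a, b in zip(v1, v2)) ** (1.0 / p)
```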

Loss Functions ¶

`nn.BCELoss`	This operator computes the binary cross entropy loss.
`nn.BCEWithLogitsLoss`	This operator combines the Sigmoid and BCELoss together.
`nn.CTCLoss`	The Connectionist Temporal Classification loss.
`nn.CombinedMarginLoss`	Implements “margin_softmax” from InsightFace (https://github.com/deepinsight/insightface/blob/master/recognition/arcface_mxnet/train.py). The InsightFace implementation of margin_softmax is composed of multiple operators.
`nn.CrossEntropyLoss`	Computes the cross entropy loss between input and target. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.CrossEntropyLoss.html.
`nn.KLDivLoss`	The Kullback-Leibler divergence loss measure.
`nn.L1Loss`	This operator computes the L1 Loss between each element in input and target.
`nn.MSELoss`	Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input \(x\) and target \(y\).
`nn.MarginRankingLoss`	Creates a criterion that measures the loss given inputs \(x1\), \(x2\), two 1D mini-batch Tensors, and a label 1D mini-batch tensor \(y\) (containing 1 or -1).
`nn.NLLLoss`	The negative log likelihood loss.
`nn.SmoothL1Loss`	Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise.
`nn.TripletMarginLoss`	Creates a criterion that measures the triplet loss given input tensors \(x1\), \(x2\), \(x3\) and a margin with a value greater than \(0\).
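
A few of the criteria above are easy to check by hand. The sketch below assumes mean reduction throughout and a default `beta=1.0` for the smooth L1 case; the function names are hypothetical, and the logits version uses a numerically stable rearrangement rather than calling the sigmoid explicitly.

```python
import math


def mse_loss(inputs, targets):
    return sum((x - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


def l1_loss(inputs, targets):
    return sum(abs(x - y) for x, y in zip(inputs, targets)) / len(inputs)


def smooth_l1_loss(inputs, targets, beta=1.0):
    total = 0.0
    for x, y in zip(inputs, targets):
        d = abs(x - y)
        # Squared term below beta, L1 term otherwise.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(inputs)


def bce_loss(probs, targets):
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p) for p, y in zip(probs, targets)
    ) / len(probs)


def bce_with_logits_loss(logits, targets):
    # Equivalent to applying a sigmoid and then bce_loss.
    return sum(
        max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        for z, y in zip(logits, targets)
    ) / len(logits)
```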

Vision Layers ¶

`nn.PixelShuffle`	alias of `oneflow.nn.modules.pixelshuffle.PixelShufflev2`
`nn.Upsample`	Upsamples a given multi-channel 1D (temporal), 2D (spatial) or 3D (volumetric) data.
`nn.UpsamplingBilinear2d`	Applies a 2D bilinear upsampling to an input signal composed of several input channels.
`nn.UpsamplingNearest2d`	Applies a 2D nearest neighbor upsampling to an input signal composed of several input channels.

DataParallel Layers (multi-GPU, distributed)¶

nn.parallel.DistributedDataParallel

Data loading and preprocessing Layers ¶

`nn.COCOReader`
`nn.CoinFlip`	Generates random boolean values following a Bernoulli distribution.
`nn.CropMirrorNormalize`	Performs fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting.
`nn.OFRecordBytesDecoder`	This operator reads a tensor as bytes.
`nn.OFRecordImageDecoder`
`nn.OFRecordImageDecoderRandomCrop`
`nn.OFRecordRawDecoder`
`nn.OFRecordReader`

Quantization Aware Training ¶

`nn.MinMaxObserver`	Compute the quantization parameters of the input tensor.
`nn.MovingAverageMinMaxObserver`	Compute the quantization parameters based on the moving average of the input tensor’s min and max values.
`nn.FakeQuantization`	Simulate the quantize and dequantize operations in training time.
`nn.QatConv1d`	A Conv1d module attached with nn.MinMaxObserver, nn.MovingAverageMinMaxObserver and nn.FakeQuantization modules for weight and input, used for quantization aware training.
`nn.QatConv2d`	A Conv2d module attached with nn.MinMaxObserver, nn.MovingAverageMinMaxObserver and nn.FakeQuantization modules for weight and input, used for quantization aware training.
`nn.QatConv3d`	A Conv3d module attached with nn.MinMaxObserver, nn.MovingAverageMinMaxObserver and nn.FakeQuantization modules for weight and input, used for quantization aware training.
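
The observer and fake-quantization steps can be illustrated with a generic symmetric scheme. This is only a sketch under stated assumptions: the scale is taken as max(|x|) / (2^(bits-1) - 1), values are rounded and clamped to the signed integer range, and then mapped back to floating point. The function names are hypothetical, the input is assumed to contain at least one non-zero value, and OneFlow's modules support other schemes and options not shown here.

```python
def min_max_scale(values, bits=8):
    # Observer step: derive a scale from the largest magnitude seen.
    max_abs = max(abs(v) for v in values)
    return max_abs / (2 ** (bits - 1) - 1)


def fake_quantize(values, scale, bits=8):
    # Quantize to the integer grid, then dequantize straight away.
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    out = []
    for v in values:
        q = min(max(round(v / scale), lo), hi)
        out.append(q * scale)
    return out


data = [-1.0, -0.3, 0.0, 0.42, 1.0]
scale = min_max_scale(data)
simulated = fake_quantize(data, scale)
```

Each entry of `simulated` differs from the original by at most half a quantization step, which is the error the network learns to tolerate during quantization aware training.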

Utilities ¶

From the oneflow.nn.utils module

`clip_grad_norm_`	Clips gradient norm of an iterable of parameters.
`clip_grad_value_`	Clips gradient of an iterable of parameters at specified value.
`weight_norm`	Applies weight normalization to a parameter in the given module.
`remove_weight_norm`	Removes the weight normalization reparameterization from a module.
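
The two clipping utilities behave differently: one rescales all gradients together by their global norm, the other clamps each value independently. The sketch below works on a flat list of floats and returns new lists, whereas the real functions modify parameter gradients in place; the `eps` term and function names are assumptions for this example.

```python
def clip_grad_norm(grads, max_norm, norm_type=2.0, eps=1e-6):
    total_norm = sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)
    # Shrink only when the norm exceeds max_norm; never scale up.
    coef = min(1.0, max_norm / (total_norm + eps))
    return [g * coef for g in grads], total_norm


def clip_grad_value(grads, clip_value):
    return [max(-clip_value, min(clip_value, g)) for g in grads]


clipped, norm_before = clip_grad_norm([3.0, 4.0], max_norm=1.0)
```

Norm clipping preserves the direction of the gradient vector, while value clipping can change it.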

Utility functions in other modules

`nn.utils.rnn.PackedSequence`	The interface is consistent with PyTorch.
`nn.utils.rnn.pack_padded_sequence`	The interface is consistent with PyTorch.
`nn.utils.rnn.pad_packed_sequence`	The interface is consistent with PyTorch.
`nn.utils.rnn.pad_sequence`	The interface is consistent with PyTorch.
`nn.utils.rnn.pack_sequence`	Packs a list of variable length Tensors
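
Padding a batch of variable-length sequences amounts to extending each one to the longest length, while keeping the original lengths so that padding can be ignored later. This plain-Python sketch uses nested lists in a batch-first layout; the helper names are hypothetical and the real utilities operate on tensors and may default to a different dimension order.

```python
def pad_sequence(sequences, padding_value=0):
    max_len = max(len(s) for s in sequences)
    return [list(s) + [padding_value] * (max_len - len(s)) for s in sequences]


def sequence_lengths(sequences):
    # Packing routines need these to tell real steps from padding.
    return [len(s) for s in sequences]


batch = [[1, 2, 3], [4], [5, 6]]
padded = pad_sequence(batch)
lengths = sequence_lengths(batch)
```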

`nn.Flatten`	Flattens a contiguous range of dims into a tensor.

Quantized Functions ¶

Quantization refers to techniques for performing computations and storing tensors at lower bitwidths than floating point precision.

`nn.FakeQuantization`	Simulate the quantize and dequantize operations in training time.
`nn.MinMaxObserver`	Compute the quantization parameters of the input tensor.
`nn.MovingAverageMinMaxObserver`	Compute the quantization parameters based on the moving average of the input tensor’s min and max values.
`nn.Quantization`	Simulate the quantize operation in inference time.