oneflow.nn

These are the basic building blocks for graphs:

Containers

Module

Base class for all neural network modules.

Sequential

A sequential container.

ModuleList

Holds submodules in a list.

ModuleDict

Holds submodules in a dictionary.

ParameterList

Holds parameters in a list.

ParameterDict

Holds parameters in a dictionary.
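
A minimal sketch of how these containers compose submodules, assuming OneFlow's PyTorch-aligned nn API (the class names above are taken from this listing; everything else is illustrative):

    import oneflow as flow
    import oneflow.nn as nn

    # Sequential chains modules in order; the output of one feeds the next.
    mlp = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

    class Blocks(nn.Module):
        def __init__(self):
            super().__init__()
            # ModuleList/ModuleDict register submodules so their parameters
            # show up in .parameters() and state_dict().
            self.layers = nn.ModuleList([nn.Linear(4, 4) for _ in range(3)])
            self.heads = nn.ModuleDict({"cls": nn.Linear(4, 2), "reg": nn.Linear(4, 1)})

        def forward(self, x, head="cls"):
            for layer in self.layers:
                x = layer(x)
            return self.heads[head](x)

    x = flow.randn(2, 8)
    y = Blocks()(mlp(x))   # shape: (2, 2)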

nn.Module

add_module

Adds a child module to the current module.

apply

Applies fn recursively to every submodule (as returned by .children()) as well as self.

buffers

Returns an iterator over module buffers.

children

Returns an iterator over immediate children modules.

cpu

Moves all model parameters and buffers to the CPU.

cuda

Moves all model parameters and buffers to the GPU.

double

Casts all floating point parameters and buffers to double datatype.

train

Sets the module in training mode.

eval

Sets the module in evaluation mode.

extra_repr

Sets the extra representation of the module.

float

Casts all floating point parameters and buffers to float datatype.

forward

Defines the computation performed at every call.

load_state_dict

Copies parameters and buffers from state_dict into this module and its descendants.

modules

Returns an iterator over all modules in the network.

named_buffers

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

named_children

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

named_modules

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

named_parameters

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

parameters

Returns an iterator over module parameters.

register_buffer

Adds a buffer to the module.

register_forward_hook

Registers a forward hook on the module.

register_forward_pre_hook

Registers a forward pre-hook on the module.

register_backward_hook

Registers a backward hook on the module.

register_full_backward_hook

Registers a backward hook on the module.

register_state_dict_pre_hook

These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self.

register_parameter

Adds a parameter to the module.

requires_grad_

Changes whether autograd should record operations on parameters in this module.

state_dict

Returns a dictionary containing the whole state of the module.

to

Moves and/or casts the parameters and buffers.

zero_grad

Sets gradients of all model parameters to zero.
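
A brief sketch tying several of these methods together on a custom Module. It assumes OneFlow's PyTorch-compatible Module API; the network and the weight-init choice are illustrative only:

    import oneflow as flow
    import oneflow.nn as nn

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(4, 2)
            # register_buffer: saved in state_dict, but not a trainable parameter.
            self.register_buffer("step", flow.zeros(1))

        def forward(self, x):
            return self.fc(x)

    net = Net()

    # apply() visits every submodule; here it re-initializes Linear weights.
    def init_weights(m):
        if isinstance(m, nn.Linear):
            nn.init.xavier_uniform_(m.weight)
    net.apply(init_weights)

    for name, p in net.named_parameters():    # fc.weight, fc.bias
        print(name, p.shape)

    net.train()                  # training mode (affects Dropout, BatchNorm)
    net.eval()                   # evaluation mode

    state = net.state_dict()     # parameters + buffers
    net.load_state_dict(state)   # restore them
    net.zero_grad()              # clear any accumulated gradients
    # net.to("cuda")             # move parameters/buffers to GPU if available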

Convolution Layers

nn.Conv1d

Applies a 1D convolution over an input signal composed of several input planes.

nn.Conv2d

Applies a 2D convolution over an input signal composed of several input planes.

nn.Conv3d

Applies a 3D convolution over an input signal composed of several input planes.

nn.ConvTranspose1d

Applies a 1D transposed convolution operator over an input image composed of several input planes.

nn.ConvTranspose2d

Applies a 2D transposed convolution operator over an input image composed of several input planes.

nn.ConvTranspose3d

Applies a 3D transposed convolution operator over an input image composed of several input planes.

nn.Unfold

Extracts sliding local blocks from a batched input tensor. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.Unfold.html.

nn.Fold

Combines an array of sliding local blocks into a large containing tensor. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.Fold.html.
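
A small sketch of the convolution modules, assuming the PyTorch-compatible constructor signatures (in_channels, out_channels, kernel_size, ...); the concrete shapes are illustrative:

    import oneflow as flow
    import oneflow.nn as nn

    x = flow.randn(1, 3, 32, 32)                        # (N, C, H, W)
    conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)   # keeps the spatial size
    y = conv(x)                                         # (1, 16, 32, 32)

    # Transposed convolution roughly inverts the spatial downsampling of a
    # strided convolution (here it upsamples 32x32 -> 64x64).
    deconv = nn.ConvTranspose2d(16, 3, kernel_size=2, stride=2)
    z = deconv(y)                                       # (1, 3, 64, 64)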

Pooling Layers

nn.MaxPool1d

Applies a 1D max pooling over an input signal composed of several input planes.

nn.MaxPool2d

Applies a 2D max pooling over an input signal composed of several input planes.

nn.MaxPool3d

Applies a 3D max pooling over an input signal composed of several input planes.

nn.MaxUnpool1d

Computes a partial inverse of MaxPool1d.

nn.MaxUnpool2d

Computes a partial inverse of MaxPool2d.

nn.MaxUnpool3d

Computes a partial inverse of MaxPool3d.

nn.AdaptiveAvgPool1d

Applies a 1D adaptive average pooling over an input signal composed of several input planes.

nn.AdaptiveAvgPool2d

Applies a 2D adaptive average pooling over an input signal composed of several input planes.

nn.AdaptiveAvgPool3d

Applies a 3D adaptive average pooling over an input signal composed of several input planes.

nn.AdaptiveMaxPool1d

Applies a 1D adaptive max pooling over an input signal composed of several input planes.

nn.AdaptiveMaxPool2d

Applies a 2D adaptive max pooling over an input signal composed of several input planes.

nn.AdaptiveMaxPool3d

Applies a 3D adaptive max pooling over an input signal composed of several input planes.

nn.AvgPool1d

Applies a 1D average pooling over an input signal composed of several input planes.

nn.AvgPool2d

Applies a 2D average pooling over an input signal composed of several input planes.

nn.AvgPool3d

Applies a 3D average pooling over an input signal composed of several input planes.
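
A sketch contrasting fixed-window and adaptive pooling, assuming PyTorch-style arguments:

    import oneflow as flow
    import oneflow.nn as nn

    x = flow.randn(1, 8, 32, 32)

    # A fixed 2x2 window with stride 2 halves the spatial size.
    y = nn.MaxPool2d(kernel_size=2)(x)      # (1, 8, 16, 16)
    z = nn.AvgPool2d(kernel_size=2)(x)      # (1, 8, 16, 16)

    # Adaptive pooling takes the desired *output* size instead of a window size.
    g = nn.AdaptiveAvgPool2d((1, 1))(x)     # (1, 8, 1, 1), a global average pool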

Padding Layers

nn.ConstantPad1d

Pads the input tensor boundaries with a constant value.

nn.ConstantPad2d

Pads the input tensor boundaries with a constant value specified by the user.

nn.ConstantPad3d

Pads the input tensor boundaries with a constant value.

nn.ReflectionPad1d

This operator pads the input tensor using the reflection of the input boundary.

nn.ReflectionPad2d

This operator pads the input tensor using the reflection of the input boundary.

nn.ReplicationPad1d

Pads the input tensor using replication of the input boundary.

nn.ReplicationPad2d

Pads the input tensor using the replication of the input boundary.

nn.ZeroPad2d

Pads the input tensor boundaries with zero.
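
A sketch of the padding modules; as in PyTorch, the 2D variants accept either a single int applied to all sides or a (left, right, top, bottom) tuple:

    import oneflow as flow
    import oneflow.nn as nn

    x = flow.randn(1, 1, 4, 4)

    pad_const = nn.ConstantPad2d(padding=1, value=0.5)    # pad 1 on every side with 0.5
    pad_refl = nn.ReflectionPad2d(padding=(1, 1, 0, 0))   # reflect only left/right
    pad_zero = nn.ZeroPad2d(padding=2)

    print(pad_const(x).shape)   # (1, 1, 6, 6)
    print(pad_refl(x).shape)    # (1, 1, 4, 6)
    print(pad_zero(x).shape)    # (1, 1, 8, 8)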

Non-linear Activations (weighted sum, nonlinearity)

nn.ELU

Applies the ELU function element-wise.

nn.Hardshrink

The Hardshrink activation.

nn.Hardsigmoid

Applies the Hardsigmoid function element-wise.

nn.Hardswish

Applies the hardswish function, element-wise, as described in the paper Searching for MobileNetV3.

nn.Hardtanh

Applies the HardTanh function element-wise.

nn.LeakyReLU

Applies the LeakyReLU function element-wise.

nn.LogSigmoid

Applies the LogSigmoid function element-wise.

nn.PReLU

Applies the PReLU function element-wise.

nn.ReLU

Applies the rectified linear unit function element-wise.

nn.ReLU6

Applies the ReLU6 function element-wise.

nn.SELU

Applies the SELU function element-wise.

nn.CELU

Applies the CELU function element-wise.

nn.GELU

Applies the Gaussian Error Linear Units function. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.GELU.html.

nn.QuickGELU

Applies a GELU approximation that is fast but somewhat less accurate.

nn.SquareReLU

Applies the squared ReLU (relu^2) activation introduced in https://arxiv.org/abs/2109.08668v2.

nn.SiLU

Applies the SiLU (Swish) function element-wise.

nn.Sigmoid

Applies the Sigmoid function element-wise.

nn.Mish

Applies the Mish function element-wise.

nn.Softplus

Applies the Softplus function element-wise.

nn.Softshrink

The Softshrink activation.

nn.Softsign

The SoftSign activation.

nn.Tanh

Computes the hyperbolic tangent of the input tensor element-wise.

nn.Threshold

The Threshold Activation.

nn.GLU

The GLU activation.
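
These activation modules apply their function element-wise and carry no learnable parameters (PReLU being the exception); a minimal sketch, with shapes chosen only for illustration:

    import oneflow as flow
    import oneflow.nn as nn

    x = flow.randn(2, 5)

    print(nn.ReLU()(x))    # max(0, x)
    print(nn.GELU()(x))    # Gaussian Error Linear Unit
    print(nn.SiLU()(x))    # x * sigmoid(x)
    print(nn.Tanh()(x))

    # PReLU carries a learnable negative-slope parameter (a single one by default).
    prelu = nn.PReLU()
    print(list(prelu.parameters()))   # one parameter, initialized to 0.25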

Non-linear Activations (other)

nn.Softmax

Applies the Softmax function to an n-dimensional input Tensor, rescaling it so that the elements of the n-dimensional output Tensor lie in the range [0, 1] and sum to 1.

nn.LogSoftmax

Applies the LogSoftmax function to an n-dimensional input Tensor.
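
A short sketch; both modules take a dim argument naming the dimension along which the values are normalized:

    import oneflow as flow
    import oneflow.nn as nn

    logits = flow.randn(3, 10)                  # e.g. a batch of 3 class-score vectors

    probs = nn.Softmax(dim=1)(logits)           # rows are non-negative and sum to 1
    log_probs = nn.LogSoftmax(dim=1)(logits)    # numerically stabler log(softmax(x))
    print(probs.sum(dim=1))                     # ~[1., 1., 1.]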

Normalization Layers

nn.BatchNorm1d

Applies Batch Normalization over a 2D or 3D input (a mini-batch of 1D inputs with optional additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

nn.BatchNorm2d

Applies Batch Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

nn.BatchNorm3d

Applies Batch Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

nn.SyncBatchNorm

Applies Batch Normalization over an N-dimensional input (a mini-batch of [N-2]D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

nn.FusedBatchNorm1d

Applies Fused Batch Normalization over a 2D or 3D input.

nn.FusedBatchNorm2d

Applies Fused Batch Normalization over a 4D input.

nn.FusedBatchNorm3d

Applies Fused Batch Normalization over a 5D input.

nn.GroupNorm

Applies Group Normalization over a mini-batch of inputs as described in the paper Group Normalization.

nn.InstanceNorm1d

Applies Instance Normalization over a 3D input (a mini-batch of 1D inputs with optional additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.

nn.InstanceNorm2d

Applies Instance Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.

nn.InstanceNorm3d

Applies Instance Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.

nn.LayerNorm

Applies Layer Normalization over a mini-batch of inputs as described in the paper Layer Normalization.

nn.RMSLayerNorm

Constructs a layer normalization module in the T5 style (no bias and no subtraction of the mean).

nn.RMSNorm

Applies Root Mean Square Layer Normalization over a mini-batch of inputs as described in the paper Root Mean Square Layer Normalization.
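
A sketch of the most common normalization layers; shapes follow the usual PyTorch-style conventions (BatchNorm2d over (N, C, H, W), LayerNorm over the trailing dimensions):

    import oneflow as flow
    import oneflow.nn as nn

    x = flow.randn(8, 16, 10, 10)                  # (N, C, H, W)

    bn = nn.BatchNorm2d(num_features=16)           # normalizes over N, H, W per channel
    gn = nn.GroupNorm(num_groups=4, num_channels=16)
    print(bn(x).shape, gn(x).shape)                # both (8, 16, 10, 10)

    tokens = flow.randn(8, 32, 64)                 # (batch, seq, features)
    ln = nn.LayerNorm(normalized_shape=64)         # normalizes each feature vector
    print(ln(tokens).shape)                        # (8, 32, 64)

    # BatchNorm behaves differently in train vs. eval mode (running statistics).
    bn.eval()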

Recurrent Layers

nn.RNN

Applies a multi-layer Elman RNN with \(\tanh\) or \(\text{ReLU}\) non-linearity to an input sequence.

nn.LSTM

Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence.

nn.GRU

Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence.

nn.RNNCell

An Elman RNN cell with tanh or ReLU non-linearity.

nn.LSTMCell

A long short-term memory (LSTM) cell.

nn.GRUCell

A gated recurrent unit (GRU) cell.
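
A sketch of the recurrent layers, assuming the PyTorch-compatible (seq_len, batch, feature) layout when batch_first is left at its default of False:

    import oneflow as flow
    import oneflow.nn as nn

    seq_len, batch, input_size, hidden_size = 7, 4, 10, 20
    x = flow.randn(seq_len, batch, input_size)

    lstm = nn.LSTM(input_size, hidden_size, num_layers=2)
    output, (h_n, c_n) = lstm(x)
    print(output.shape)   # (7, 4, 20): top-layer hidden state for every time step
    print(h_n.shape)      # (2, 4, 20): final hidden state for each layer

    # The *Cell variants process one time step at a time.
    gru_cell = nn.GRUCell(input_size, hidden_size)
    h = flow.zeros(batch, hidden_size)
    for t in range(seq_len):
        h = gru_cell(x[t], h)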

Linear Layers

nn.Identity

A placeholder identity operator that is argument-insensitive.

nn.Linear

Applies a linear transformation to the incoming data: \(y = xA^T + b\)
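
A sketch of nn.Linear illustrating the \(y = xA^T + b\) shape contract:

    import oneflow as flow
    import oneflow.nn as nn

    fc = nn.Linear(in_features=20, out_features=30)
    x = flow.randn(128, 20)
    y = fc(x)
    print(y.shape)            # (128, 30)
    print(fc.weight.shape)    # (30, 20), i.e. A with shape (out_features, in_features)
    print(fc.bias.shape)      # (30,)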

Dropout Layers

nn.Dropout

During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution.

nn.Dropout1d

Randomly zero out entire channels (a channel is a 1D feature map, e.g., the \(j\)-th channel of the \(i\)-th sample in the batched input is a 1D tensor \(\text{input}[i, j]\)).

nn.Dropout2d

Randomly zero out entire channels (a channel is a 2D feature map, e.g., the \(j\)-th channel of the \(i\)-th sample in the batched input is a 2D tensor \(\text{input}[i, j]\)).

nn.Dropout3d

Randomly zero out entire channels (a channel is a 3D feature map, e.g., the \(j\)-th channel of the \(i\)-th sample in the batched input is a 3D tensor \(\text{input}[i, j]\)).
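
A sketch showing that dropout is only active in training mode; nn.Dropout zeroes individual elements, while nn.Dropout2d zeroes whole channels:

    import oneflow as flow
    import oneflow.nn as nn

    drop = nn.Dropout(p=0.5)
    drop2d = nn.Dropout2d(p=0.5)

    x = flow.ones(4, 3, 8, 8)

    drop.train()
    print(drop(x))     # roughly half the elements are 0, the rest scaled by 1/(1-p)

    drop.eval()
    print(drop(x))     # identity: dropout is disabled in eval mode

    print(drop2d(x))   # entire (8, 8) channels are zeroed (modules start in train mode)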

Sparse Layers

nn.Embedding

A simple lookup table that stores embeddings of a fixed dictionary and size.
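
A sketch of nn.Embedding as a lookup table mapping integer indices to dense vectors; the vocabulary size and dimensions are illustrative:

    import oneflow as flow
    import oneflow.nn as nn

    emb = nn.Embedding(num_embeddings=1000, embedding_dim=64)   # vocabulary of 1000 ids

    token_ids = flow.tensor([[3, 17, 42], [5, 5, 999]], dtype=flow.int64)  # (batch, seq)
    vectors = emb(token_ids)
    print(vectors.shape)   # (2, 3, 64): one 64-d vector per token id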

Distance Functions

nn.CosineSimilarity

Returns cosine similarity between \(x_1\) and \(x_2\), computed along dim.

nn.PairwiseDistance

Computes the pairwise distance between vectors \(v_1\), \(v_2\) using the p-norm \(\Vert x \Vert_p = \left(\sum_{i=1}^n \vert x_i \vert^p\right)^{1/p}\).
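
A brief sketch of the two distance modules:

    import oneflow as flow
    import oneflow.nn as nn

    x1 = flow.randn(5, 16)
    x2 = flow.randn(5, 16)

    cos = nn.CosineSimilarity(dim=1)
    print(cos(x1, x2).shape)         # (5,): one similarity in [-1, 1] per row pair

    dist = nn.PairwiseDistance(p=2)  # Euclidean distance
    print(dist(x1, x2).shape)        # (5,)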

Loss Functions

nn.BCELoss

This operator computes the binary cross entropy loss.

nn.BCEWithLogitsLoss

This operator combines a Sigmoid layer and the BCELoss in a single class.

nn.CTCLoss

The Connectionist Temporal Classification loss.

nn.CombinedMarginLoss

Implements the “margin_softmax” operation from InsightFace (https://github.com/deepinsight/insightface/blob/master/recognition/arcface_mxnet/train.py), where margin_softmax is composed of multiple operators.

nn.CrossEntropyLoss

Computes the cross entropy loss between input logits and target. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.CrossEntropyLoss.html.

nn.KLDivLoss

The Kullback-Leibler divergence loss measure.

nn.L1Loss

This operator computes the L1 Loss between each element in input and target.

nn.MSELoss

Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input \(x\) and target \(y\).

nn.MarginRankingLoss

Creates a criterion that measures the loss given inputs \(x1\), \(x2\), two 1D mini-batch Tensors, and a label 1D mini-batch tensor \(y\) (containing 1 or -1).

nn.NLLLoss

The negative log likelihood loss.

nn.SmoothL1Loss

Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise.

nn.TripletMarginLoss

Creates a criterion that measures the triplet loss given input tensors \(x1\), \(x2\), \(x3\) and a margin with a value greater than \(0\).
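
A sketch of the two most common criteria: nn.CrossEntropyLoss expects raw logits and integer class targets, while nn.MSELoss compares tensors of the same shape. The shapes below are illustrative:

    import oneflow as flow
    import oneflow.nn as nn

    # Classification: logits of shape (N, C), targets of shape (N,) with class indices.
    logits = flow.randn(8, 5)
    targets = flow.tensor([0, 1, 4, 3, 2, 2, 1, 0], dtype=flow.int64)
    loss_cls = nn.CrossEntropyLoss()(logits, targets)

    # Regression: prediction and target have the same shape.
    pred = flow.randn(8, 1, requires_grad=True)
    target = flow.randn(8, 1)
    loss_reg = nn.MSELoss()(pred, target)
    loss_reg.backward()                     # gradients flow back into pred
    print(loss_cls.item(), loss_reg.item())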

Vision Layers

nn.PixelShuffle

alias of oneflow.nn.modules.pixelshuffle.PixelShufflev2

nn.Upsample

Upsamples given multi-channel 1D (temporal), 2D (spatial) or 3D (volumetric) data.

nn.UpsamplingBilinear2d

Applies a 2D bilinear upsampling to an input signal composed of several input channels.

nn.UpsamplingNearest2d

Applies a 2D nearest neighbor upsampling to an input signal composed of several input channels.
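
A sketch of the vision layers; PixelShuffle rearranges channels into spatial resolution, while Upsample interpolates directly:

    import oneflow as flow
    import oneflow.nn as nn

    x = flow.randn(1, 16, 8, 8)

    # PixelShuffle with upscale factor r folds r*r channel groups into space:
    # (N, C*r*r, H, W) -> (N, C, H*r, W*r)
    ps = nn.PixelShuffle(upscale_factor=2)
    print(ps(x).shape)                          # (1, 4, 16, 16)

    up = nn.Upsample(scale_factor=2, mode="nearest")
    print(up(x).shape)                          # (1, 16, 16, 16)

    bilinear = nn.UpsamplingBilinear2d(scale_factor=2)
    print(bilinear(x).shape)                    # (1, 16, 16, 16)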

Data loading and preprocessing Layers

nn.COCOReader

nn.CoinFlip

Generates random boolean values following a Bernoulli distribution.

nn.CropMirrorNormalize

Performs fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting.

nn.OFRecordBytesDecoder

This operator reads a tensor as bytes.

nn.OFRecordImageDecoder

nn.OFRecordImageDecoderRandomCrop

nn.OFRecordRawDecoder

nn.OFRecordReader

Quantization Aware Training

nn.MinMaxObserver

Compute the quantization parameters of the input tensor.

nn.MovingAverageMinMaxObserver

Compute the quantization parameters based on the moving average of the input tensor’s min and max values.

nn.FakeQuantization

Simulate the quantize and dequantize operations in training time.

nn.QatConv1d

A Conv1d module attached with nn.MinMaxObserver, nn.MovingAverageMinMaxObserver and nn.FakeQuantization modules for weight and input, used for quantization aware training.

nn.QatConv2d

A Conv2d module attached with nn.MinMaxObserver, nn.MovingAverageMinMaxObserver and nn.FakeQuantization modules for weight and input, used for quantization aware training.

nn.QatConv3d

A Conv3d module attached with nn.MinMaxObserver, nn.MovingAverageMinMaxObserver and nn.FakeQuantization modules for weight and input, used for quantization aware training.
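
A rough sketch of how the observer and fake-quantization modules fit together. The keyword arguments below (quantization_formula, quantization_bit, quantization_scheme, per_layer_quantization) follow the pattern used in the OneFlow quantization docs, but treat the exact names and defaults as assumptions to check against your installed version:

    import oneflow as flow
    import oneflow.nn as nn

    x = flow.randn(2, 3, 4, 4)

    # Assumed signatures, modeled on the quantization-aware-training examples.
    observer = nn.MinMaxObserver(
        quantization_formula="google",
        quantization_bit=8,
        quantization_scheme="symmetric",
        per_layer_quantization=True,
    )
    scale, zero_point = observer(x)          # quantization parameters from min/max of x

    fake_quant = nn.FakeQuantization(
        quantization_formula="google",
        quantization_bit=8,
        quantization_scheme="symmetric",
    )
    x_q = fake_quant(x, scale, zero_point)   # quantize + dequantize, same shape as x
    print(x_q.shape)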

Utilities

From the oneflow.nn.utils module

clip_grad_norm_

Clips gradient norm of an iterable of parameters.

clip_grad_value_

Clips gradient of an iterable of parameters at specified value.

weight_norm

Applies weight normalization to a parameter in the given module.

remove_weight_norm

Removes the weight normalization reparameterization from a module.
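
A sketch of gradient clipping and weight normalization inside a single training step, assuming the PyTorch-compatible oneflow.nn.utils functions and oneflow.optim.SGD; the toy model and learning rate are illustrative:

    import oneflow as flow
    import oneflow.nn as nn
    from oneflow.nn.utils import clip_grad_norm_, weight_norm, remove_weight_norm

    # weight_norm reparameterizes `weight` into a direction and a magnitude.
    layer = weight_norm(nn.Linear(16, 16), name="weight")
    model = nn.Sequential(layer, nn.ReLU(), nn.Linear(16, 1))
    opt = flow.optim.SGD(model.parameters(), lr=0.1)

    x, y = flow.randn(4, 16), flow.randn(4, 1)
    loss = nn.MSELoss()(model(x), y)
    loss.backward()

    # Rescale gradients so their global norm is at most 1.0, then step.
    clip_grad_norm_(model.parameters(), max_norm=1.0)
    opt.step()
    opt.zero_grad()

    remove_weight_norm(layer)   # fold the reparameterization back into a plain weight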

Utility functions in other modules

nn.utils.rnn.PackedSequence

The interface is consistent with PyTorch.

nn.utils.rnn.pack_padded_sequence

The interface is consistent with PyTorch.

nn.utils.rnn.pad_packed_sequence

The interface is consistent with PyTorch.

nn.utils.rnn.pad_sequence

The interface is consistent with PyTorch.

nn.utils.rnn.pack_sequence

Packs a list of variable-length Tensors.

nn.Flatten

Flattens a contiguous range of dims into a tensor.
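
A sketch of padding and packing variable-length sequences, plus nn.Flatten. The rnn utilities are stated above to match the PyTorch interface, so the argument order and the returned (padded tensor, lengths) pair below are assumed from that interface:

    import oneflow as flow
    import oneflow.nn as nn
    from oneflow.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

    # Three sequences of different lengths, each (length, features), sorted by length.
    seqs = [flow.randn(5, 8), flow.randn(3, 8), flow.randn(2, 8)]
    lengths = [5, 3, 2]

    padded = pad_sequence(seqs)                        # (5, 3, 8): (max_len, batch, features)
    packed = pack_padded_sequence(padded, lengths)     # PackedSequence skipping the padding
    unpacked, out_lengths = pad_packed_sequence(packed)
    print(unpacked.shape, out_lengths)                 # (5, 3, 8) and the original lengths

    flatten = nn.Flatten(start_dim=1)                  # keep the batch dimension
    print(flatten(flow.randn(4, 3, 8, 8)).shape)       # (4, 192)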

Quantized Functions

Quantization refers to techniques for performing computations and storing tensors at lower bitwidths than floating point precision.

nn.FakeQuantization

Simulate the quantize and dequantize operations in training time.

nn.MinMaxObserver

Compute the quantization parameters of the input tensor.

nn.MovingAverageMinMaxObserver

Compute the quantization parameters based on the moving average of the input tensor’s min and max values.

nn.Quantization

Simulate the quantize operation in inference time.