oneflow.nn.functional
Functional operations for neural networks

oneflow.nn.functional.conv1d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor
The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.conv1d.html.
Applies a 1D convolution over an input signal composed of several input planes.
See Conv1d for details and output shape.
Parameters
input – input tensor of shape \((\text{minibatch} , \text{in_channels} , iW)\)
weight – filters of shape \((\text{out_channels} , \frac{\text{in_channels}}{\text{groups}} , kW)\)
bias – optional bias of shape \((\text{out_channels})\). Default: None.
stride – the stride of the convolving kernel. Can be a single number or a tuple (sW,). Default: 1
padding – implicit paddings on both sides of the input. Can be a single number or a tuple (padW,). Default: 0
dilation – the spacing between kernel elements. Can be a single number or a tuple (dW,). Default: 1
groups – split input into groups, \(\text{in_channels}\) should be divisible by the number of groups. Default: 1
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> inputs = flow.randn(33, 16, 30)
>>> filters = flow.randn(20, 16, 5)
>>> outputs = F.conv1d(inputs, filters)
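The output width of the conv1d example can be worked out by hand. The sketch below is pure Python and assumes OneFlow follows the output-shape rule given on the PyTorch Conv1d page that this entry references; `conv_output_length` is a hypothetical helper, not part of OneFlow.

```python
def conv_output_length(length, kernel, stride=1, padding=0, dilation=1):
    """Output size along one spatial axis of a convolution.

    Assumed rule: floor((L + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1
    """
    return (length + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Shapes from the conv1d example: inputs (33, 16, 30), filters (20, 16, 5).
batch, out_channels = 33, 20
print((batch, out_channels, conv_output_length(30, kernel=5)))
```

The same helper applies per axis for conv2d and conv3d, since each spatial dimension is computed independently.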

oneflow.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor
The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.conv2d.html.
Applies a 2D convolution over an input image composed of several input planes.
See Conv2d for details and output shape.
Parameters
input – input tensor of shape \((\text{minibatch} , \text{in_channels} , iH , iW)\)
weight – filters of shape \((\text{out_channels} , \frac{\text{in_channels}}{\text{groups}} , kH , kW)\)
bias – optional bias of shape \((\text{out_channels})\). Default: None.
stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1
padding – implicit paddings on both sides of the input. Can be a single number or a tuple (padH, padW). Default: 0
dilation – the spacing between kernel elements. Can be a single number or a tuple (dH, dW). Default: 1
groups – split input into groups, \(\text{in_channels}\) should be divisible by the number of groups. Default: 1
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> inputs = flow.randn(8, 4, 3, 3)
>>> filters = flow.randn(1, 4, 5, 5)
>>> outputs = F.conv2d(inputs, filters, padding=1)

oneflow.nn.functional.conv3d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor
The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.conv3d.html.
Applies a 3D convolution over an input image composed of several input planes.
See Conv3d for details and output shape.
Parameters
input – input tensor of shape \((\text{minibatch} , \text{in_channels} , iD , iH , iW)\)
weight – filters of shape \((\text{out_channels} , \frac{\text{in_channels}}{\text{groups}} , kD , kH , kW)\)
bias – optional bias of shape \((\text{out_channels})\). Default: None.
stride – the stride of the convolving kernel. Can be a single number or a tuple (sD, sH, sW). Default: 1
padding – implicit paddings on both sides of the input. Can be a single number or a tuple (padD, padH, padW). Default: 0
dilation – the spacing between kernel elements. Can be a single number or a tuple (dD, dH, dW). Default: 1
groups – split input into groups, \(\text{in_channels}\) should be divisible by the number of groups. Default: 1
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> inputs = flow.randn(20, 16, 50, 10, 20)
>>> filters = flow.randn(33, 16, 3, 3, 3)
>>> outputs = F.conv3d(inputs, filters)

oneflow.nn.functional.conv_transpose1d(input, weight, bias=None, stride=1, padding=0, output_padding=0, groups=1, dilation=1) → Tensor
The documentation is referenced from: https://pytorch.org/docs/stable/generated/torch.nn.functional.conv_transpose1d.html
Applies a 1D transposed convolution operator over an input signal composed of several input planes, sometimes also called “deconvolution”.
See ConvTranspose1d for details and output shape.
Parameters
input – input tensor of shape \((\text{minibatch} , \text{in_channels} , iW)\)
weight – filters of shape \((\text{in_channels} , \frac{\text{out_channels}}{\text{groups}} , kW)\)
bias – optional bias of shape \((\text{out_channels})\). Default: None.
stride – the stride of the convolving kernel. Can be a single number or a tuple (sW,). Default: 1
padding – dilation * (kernel_size - 1) - padding zero-padding will be added to both sides of each dimension in the input. Can be a single number or a tuple (padW,). Default: 0
output_padding – additional size added to one side of each dimension in the output shape. Can be a single number or a tuple (out_padW,). Default: 0
groups – split input into groups, \(\text{in_channels}\) should be divisible by the number of groups. Default: 1
dilation – the spacing between kernel elements. Can be a single number or a tuple (dW,). Default: 1
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> inputs = flow.randn(20, 16, 50)
>>> weights = flow.randn(16, 33, 5)
>>> outputs = F.conv_transpose1d(inputs, weights)
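For a transposed convolution the size relation runs the other way. The sketch below assumes the output-shape rule from the PyTorch ConvTranspose1d page referenced by this entry; `conv_transpose_output_length` is a hypothetical helper name used only for illustration.

```python
def conv_transpose_output_length(length, kernel, stride=1, padding=0,
                                 output_padding=0, dilation=1):
    """Output size along one spatial axis of a transposed convolution.

    Assumed rule:
    (L - 1)*stride - 2*padding + dilation*(kernel - 1) + output_padding + 1
    """
    return ((length - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


# Shapes from the conv_transpose1d example: inputs (20, 16, 50), weights (16, 33, 5).
batch, out_channels = 20, 33
print((batch, out_channels, conv_transpose_output_length(50, kernel=5)))
```

With `stride=1`, `padding=0` and no dilation, each axis grows by `kernel - 1`, the mirror image of the shrinkage in the forward convolution.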

oneflow.nn.functional.conv_transpose2d(input, weight, bias=None, stride=1, padding=0, output_padding=0, groups=1, dilation=1) → Tensor
The documentation is referenced from: https://pytorch.org/docs/stable/generated/torch.nn.functional.conv_transpose2d.html
Applies a 2D transposed convolution operator over an input image composed of several input planes, sometimes also called “deconvolution”.
See ConvTranspose2d for details and output shape.
Parameters
input – input tensor of shape \((\text{minibatch} , \text{in_channels} , iH , iW)\)
weight – filters of shape \((\text{in_channels} , \frac{\text{out_channels}}{\text{groups}} , kH , kW)\)
bias – optional bias of shape \((\text{out_channels})\). Default: None.
stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1
padding – dilation * (kernel_size - 1) - padding zero-padding will be added to both sides of each dimension in the input. Can be a single number or a tuple (padH, padW). Default: 0
output_padding – additional size added to one side of each dimension in the output shape. Can be a single number or a tuple (out_padH, out_padW). Default: 0
groups – split input into groups, \(\text{in_channels}\) should be divisible by the number of groups. Default: 1
dilation – the spacing between kernel elements. Can be a single number or a tuple (dH, dW). Default: 1
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> inputs = flow.randn(1, 4, 5, 5)
>>> weights = flow.randn(4, 8, 3, 3)
>>> outputs = F.conv_transpose2d(inputs, weights, padding=1)

oneflow.nn.functional.conv_transpose3d(input, weight, bias=None, stride=1, padding=0, output_padding=0, groups=1, dilation=1) → Tensor
The documentation is referenced from: https://pytorch.org/docs/stable/generated/torch.nn.functional.conv_transpose3d.html
Applies a 3D transposed convolution operator over an input image composed of several input planes, sometimes also called “deconvolution”.
See ConvTranspose3d for details and output shape.
Parameters
input – input tensor of shape \((\text{minibatch} , \text{in_channels} , iT , iH , iW)\)
weight – filters of shape \((\text{in_channels} , \frac{\text{out_channels}}{\text{groups}} , kT , kH , kW)\)
bias – optional bias of shape \((\text{out_channels})\). Default: None.
stride – the stride of the convolving kernel. Can be a single number or a tuple (sT, sH, sW). Default: 1
padding – dilation * (kernel_size - 1) - padding zero-padding will be added to both sides of each dimension in the input. Can be a single number or a tuple (padT, padH, padW). Default: 0
output_padding – additional size added to one side of each dimension in the output shape. Can be a single number or a tuple (out_padT, out_padH, out_padW). Default: 0
groups – split input into groups, \(\text{in_channels}\) should be divisible by the number of groups. Default: 1
dilation – the spacing between kernel elements. Can be a single number or a tuple (dT, dH, dW). Default: 1
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> inputs = flow.randn(20, 16, 50, 10, 20)
>>> weights = flow.randn(16, 33, 3, 3, 3)
>>> outputs = F.conv_transpose3d(inputs, weights)

oneflow.nn.functional.adaptive_avg_pool1d(input, output_size) → Tensor
Applies a 1D adaptive average pooling over an input signal composed of several input planes.
See AdaptiveAvgPool1d for details and output shape.
Parameters
input – the input tensor
output_size – the target output size (single integer)
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> arr = np.array([[[ 0.0558, -0.6875, -1.6544, -0.6226, 0.1018, 0.0502, -1.2538, 0.1491]]])
>>> input = flow.tensor(arr, dtype=flow.float32)
>>> flow.nn.functional.adaptive_avg_pool1d(input, output_size=[4])
tensor([[[-0.3158, -1.1385,  0.0760, -0.5524]]], dtype=oneflow.float32)
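To make the "adaptive" part concrete, here is a pure-Python sketch of 1D adaptive average pooling over a single row. It assumes the common windowing rule (window i spans floor(i*L/out) to ceil((i+1)*L/out)); the entry above does not spell this out, so treat it as an illustration rather than OneFlow's implementation. The function name is hypothetical.

```python
import math


def adaptive_avg_pool1d_row(values, output_size):
    """Average `values` into `output_size` bins whose edges adapt to len(values)."""
    length = len(values)
    pooled = []
    for i in range(output_size):
        start = (i * length) // output_size             # floor(i * L / out)
        end = math.ceil((i + 1) * length / output_size)  # ceil((i + 1) * L / out)
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled


print(adaptive_avg_pool1d_row([1, 2, 3, 4, 5, 6, 7, 8], 4))
```

When the input length is a multiple of `output_size`, the windows do not overlap and this reduces to ordinary average pooling with kernel and stride `L / output_size`.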

oneflow.nn.functional.adaptive_avg_pool2d(input, output_size) → Tensor
Applies a 2D adaptive average pooling over an input signal composed of several input planes.
See AdaptiveAvgPool2d for details and output shape.
Parameters
input – the input tensor
output_size – the target output size (single integer or double-integer tuple)
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> arr = np.array([[[[ 0.1004, 0.0488, 1.0515, 0.9466],[ 0.4538, 0.2361, 1.3437, 0.398 ],[ 0.0558, 0.6875, 1.6544, 0.6226],[ 0.1018, 0.0502, 1.2538, 0.1491]]]])
>>> input = flow.tensor(arr, dtype=flow.float32)
>>> outputs = flow.nn.functional.adaptive_avg_pool2d(input, (2, 2))

oneflow.nn.functional.adaptive_avg_pool3d(input, output_size) → Tensor
Applies a 3D adaptive average pooling over an input signal composed of several input planes.
See AdaptiveAvgPool3d for details and output shape.
Parameters
input – the input tensor
output_size – the target output size (single integer or triple-integer tuple)
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> input = flow.tensor(np.random.randn(1, 1, 4, 4, 4), dtype=flow.float32)
>>> output = flow.nn.functional.adaptive_avg_pool3d(input, (2, 2, 2))

oneflow.nn.functional.relu()
Applies the rectified linear unit function element-wise. See ReLU for more details.
Parameters
inplace – If set to True, will do this operation in-place. Default: False
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> ndarr = np.asarray([1, -2, 3])
>>> input = flow.Tensor(ndarr)
>>> output = flow.relu(input)
>>> output
tensor([1., 0., 3.], dtype=oneflow.float32)

oneflow.nn.functional.hardsigmoid(x: Tensor) → Tensor
Applies the element-wise function
\[\text{Hardsigmoid}(x) = \begin{cases} 0 & \text{if~} x \le -3, \\ 1 & \text{if~} x \ge +3, \\ x / 6 + 1 / 2 & \text{otherwise} \end{cases}\]
See Hardsigmoid for more details.

oneflow.nn.functional.hardshrink()

oneflow.nn.functional.hardswish(x: Tensor) → Tensor
Applies the hardswish function, element-wise, as described in the paper Searching for MobileNetV3:
\[\text{Hardswish}(x) = \begin{cases} 0 & \text{if~} x \le -3, \\ x & \text{if~} x \ge +3, \\ x \cdot (x + 3) / 6 & \text{otherwise} \end{cases}\]
See Hardswish for more details.
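The two piecewise definitions, hardsigmoid and hardswish, are closely related: on every branch, Hardswish(x) equals x · Hardsigmoid(x). The scalar sketch below writes both out in plain Python to show this; it illustrates the formulas only and is not OneFlow's tensor kernel.

```python
def hardsigmoid(x):
    """Piecewise-linear approximation of the sigmoid."""
    if x <= -3:
        return 0.0
    if x >= 3:
        return 1.0
    return x / 6 + 0.5


def hardswish(x):
    """Piecewise definition of hardswish."""
    if x <= -3:
        return 0.0
    if x >= 3:
        return float(x)
    return x * (x + 3) / 6


for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
    # Both columns agree: hardswish(x) == x * hardsigmoid(x)
    print(value, hardswish(value), value * hardsigmoid(value))
```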

oneflow.nn.functional.hardtanh(input, min_val=-1.0, max_val=1.0) → Tensor
Applies the HardTanh function element-wise. See Hardtanh for more details.

oneflow.nn.functional.normalize(input: Tensor, p: float = 2.0, dim: int = 0, epsilon: float = 1e-12) → Tensor
Performs \(L_p\) normalization of inputs over the specified dimension.
For a tensor input of sizes \((n_0, ..., n_{dim}, ..., n_k)\), each \(n_{dim}\)-element vector \(v\) along dimension dim is transformed as:
\[v = \frac{v}{\max(\lVert v \rVert_p, \epsilon)}.\]
With the default arguments it uses the Euclidean norm over vectors along dimension \(1\) for normalization.
Note, however, that the gradient calculation of the input tensor gives different results on different frameworks when input.shape[dim] = 1.
Parameters
input (oneflow.Tensor) – input tensor of any shape
p (float) – the exponent value in the norm formulation. Default: 2
dim (int) – the dimension to reduce. Default: 1
epsilon (float) – small value to avoid division by zero. Default: 1e-12
For example:
>>> import oneflow as flow
>>> x = flow.tensor([[1, 2], [3, 4]], dtype=flow.float32)
>>> out = flow.nn.functional.normalize(x, 2, 0)
>>> out
tensor([[0.3162, 0.4472],
        [0.9487, 0.8944]], dtype=oneflow.float32)
>>> out = flow.nn.functional.normalize(x, 2, 1)
>>> out
tensor([[0.4472, 0.8944],
        [0.6000, 0.8000]], dtype=oneflow.float32)
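The normalization formula can be checked by hand on a single vector. The sketch below is a plain-Python rendering of v / max(‖v‖_p, ε) for one vector (one slice along dim); `lp_normalize` is a hypothetical name, and the tensor version applies the same step to every such slice.

```python
def lp_normalize(vector, p=2.0, eps=1e-12):
    """Divide a vector by its L_p norm, clamping the norm from below by eps."""
    norm = sum(abs(v) ** p for v in vector) ** (1.0 / p)
    return [v / max(norm, eps) for v in vector]


# The rows of [[1, 2], [3, 4]], i.e. normalization along dim=1.
print([round(v, 4) for v in lp_normalize([1.0, 2.0])])
print([round(v, 4) for v in lp_normalize([3.0, 4.0])])
```

The `eps` clamp only matters for (near-)zero vectors, where it keeps the division well defined and returns zeros instead of NaN.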

oneflow.nn.functional.layer_norm(input, normalized_shape, weight=None, bias=None, eps=1e-05) → Tensor
Applies Layer Normalization over the last certain number of dimensions.
See LayerNorm for details.

oneflow.nn.functional.leaky_relu(x: Tensor, alpha: Float) → Tensor
Applies element-wise, \(\text{LeakyReLU}(x) = \max(0, x) + \text{negative_slope} * \min(0, x)\)
See LeakyReLU for more details.

oneflow.nn.functional.elu(x: Tensor, alpha: Float) → Tensor
Applies element-wise, \(\text{ELU}(x) = \max(0,x) + \min(0, \alpha * (\exp(x) - 1))\).
See ELU for more details.
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([-0.5, 0, 0.5]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.nn.functional.elu(input, alpha=1.0)
>>> out
tensor([-0.3935,  0.0000,  0.5000], dtype=oneflow.float32)

oneflow.nn.functional.celu(x: Tensor, alpha: Float = 1.0, inplace: bool = False) → Tensor
Applies the element-wise function:
\[\text{CELU}(x) = \max(0,x) + \min(0, \alpha * (\exp(x/\alpha) - 1))\]
See CELU for more details.
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([-0.5, 0, 0.5]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.nn.functional.celu(input, alpha=0.5)
>>> out
tensor([-0.3161,  0.0000,  0.5000], dtype=oneflow.float32)
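ELU and CELU differ only in whether the exponent is scaled by alpha. A scalar sketch in plain Python makes the difference visible; it mirrors the standard definitions and is an illustration, not OneFlow's kernel.

```python
import math


def elu(x, alpha=1.0):
    """ELU(x) = max(0, x) + min(0, alpha * (exp(x) - 1))."""
    return max(0.0, x) + min(0.0, alpha * (math.exp(x) - 1.0))


def celu(x, alpha=1.0):
    """CELU(x) = max(0, x) + min(0, alpha * (exp(x / alpha) - 1))."""
    return max(0.0, x) + min(0.0, alpha * (math.exp(x / alpha) - 1.0))


for value in (-0.5, 0.0, 0.5):
    print(value, round(elu(value, alpha=1.0), 4), round(celu(value, alpha=0.5), 4))
```

For `alpha=1.0` the two functions coincide; for other values CELU stays continuously differentiable at 0, which is the motivation for the rescaled exponent.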

oneflow.nn.functional.selu(x: Tensor) → Tensor
Applies the element-wise function \(\text{SELU}(x) = scale * (\max(0,x) + \min(0, \alpha * (\exp(x) - 1)))\), with \(\alpha=1.6732632423543772848170429916717\) and \(scale=1.0507009873554804934193349852946\).
See SELU for more details.
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([1, 2, 3]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.nn.functional.selu(input)
>>> out
tensor([1.0507, 2.1014, 3.1521], dtype=oneflow.float32)

oneflow.nn.functional.sigmoid(input) → Tensor
Applies the element-wise function \(\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}\)
See Sigmoid for more details.
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([0.81733328, 0.43621480, 0.10351428])
>>> input = flow.tensor(x, dtype=flow.float32)
>>> out = flow.nn.functional.sigmoid(input)
>>> out
tensor([0.6937, 0.6074, 0.5259], dtype=oneflow.float32)

oneflow.nn.functional.pad()
Pads tensor.
Padding size:
The padding sizes by which to pad some dimensions of input are described starting from the last dimension and moving forward. \(\left\lfloor\frac{\text{len(pad)}}{2}\right\rfloor\) dimensions of input will be padded. For example, to pad only the last dimension of the input tensor, pad has the form \((\text{padding_left}, \text{padding_right})\); to pad the last 2 dimensions of the input tensor, use \((\text{padding_left}, \text{padding_right},\) \(\text{padding_top}, \text{padding_bottom})\); to pad the last 3 dimensions, use \((\text{padding_left}, \text{padding_right},\) \(\text{padding_top}, \text{padding_bottom},\) \(\text{padding_front}, \text{padding_back})\).
Padding mode:
See oneflow.nn.ConstantPad2d, oneflow.nn.ReflectionPad2d, and oneflow.nn.ReplicationPad2d for concrete examples on how each of the padding modes works. Constant padding is implemented for arbitrary dimensions. Replicate padding is implemented for padding the last 3 dimensions of 5D input tensor, or the last 2 dimensions of 4D input tensor, or the last dimension of 3D input tensor. Reflect padding is only implemented for padding the last 2 dimensions of 4D input tensor, or the last dimension of 3D input tensor.
Parameters
input (Tensor) – N-dimensional tensor
pad (tuple) – m-elements tuple, where \(\frac{m}{2} \leq\) input dimensions and \(m\) is even.
mode – 'constant', 'reflect', 'replicate' or 'circular'. Default: 'constant'
value – fill value for 'constant' padding. Default: 0
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> pad = [2, 2, 1, 1]
>>> input = flow.tensor(np.arange(18).reshape((1, 2, 3, 3)).astype(np.float32))
>>> output = flow.nn.functional.pad(input, pad, mode = "replicate")
>>> output.shape
oneflow.Size([1, 2, 5, 7])
>>> output
tensor([[[[ 0.,  0.,  0.,  1.,  2.,  2.,  2.],
          [ 0.,  0.,  0.,  1.,  2.,  2.,  2.],
          [ 3.,  3.,  3.,  4.,  5.,  5.,  5.],
          [ 6.,  6.,  6.,  7.,  8.,  8.,  8.],
          [ 6.,  6.,  6.,  7.,  8.,  8.,  8.]],
         [[ 9.,  9.,  9., 10., 11., 11., 11.],
          [ 9.,  9.,  9., 10., 11., 11., 11.],
          [12., 12., 12., 13., 14., 14., 14.],
          [15., 15., 15., 16., 17., 17., 17.],
          [15., 15., 15., 16., 17., 17., 17.]]]], dtype=oneflow.float32)
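Because pad lists sizes from the last dimension backwards, the resulting shape is easy to get wrong. The following plain-Python sketch computes only the output shape from the rule described under "Padding size"; `padded_shape` is a hypothetical helper and performs no actual padding.

```python
def padded_shape(shape, pad):
    """Shape after padding: pad = (left, right, top, bottom, ...), last dimension first."""
    if len(pad) % 2 != 0 or len(pad) // 2 > len(shape):
        raise ValueError("pad must have an even length no greater than 2 * len(shape)")
    out = list(shape)
    for i in range(len(pad) // 2):
        dim = len(shape) - 1 - i          # walk backwards from the last dimension
        out[dim] += pad[2 * i] + pad[2 * i + 1]
    return tuple(out)


# Same numbers as the replicate-padding example: (1, 2, 3, 3) with pad [2, 2, 1, 1].
print(padded_shape((1, 2, 3, 3), [2, 2, 1, 1]))
```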

oneflow.nn.functional.prelu(x: Tensor, alpha: Tensor) → Tensor
Applies the element-wise function:
\[\text{prelu}(x) = \max(0,x) + \alpha * \min(0,x)\]
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = flow.tensor(np.asarray([[[[1, -2], [3, 4]]]]), dtype=flow.float32)
>>> alpha = flow.nn.Parameter(flow.tensor([1], dtype=flow.float32).fill_(0.25))
>>> flow.nn.functional.prelu(x, alpha)
tensor([[[[ 1.0000, -0.5000],
          [ 3.0000,  4.0000]]]], dtype=oneflow.float32, grad_fn=<prelu_backward>)
See PReLU for more details.

oneflow.nn.functional.logsigmoid(x: Tensor) → Tensor
Applies the element-wise function:
\[\text{logsigmoid}(x) = \log\left(\frac{ 1 }{ 1 + \exp(-x)}\right)\]
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([-0.5, 0, 0.5]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.nn.functional.logsigmoid(input)
>>> out
tensor([-0.9741, -0.6931, -0.4741], dtype=oneflow.float32)
See LogSigmoid for more details.

oneflow.nn.functional.log_softmax(x: Tensor, dim: int) → Tensor
LogSoftmax is defined as:
\[\text{LogSoftmax}(x_{i}) = \log\left(\frac{\exp(x_i) }{ \sum_j \exp(x_j)} \right) = x_i - \log({ \sum_j \exp(x_j)})\]
See LogSoftmax for more details.
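The second form of the LogSoftmax identity, x_i minus the log of the summed exponentials, is the one usually evaluated in practice, because the log-sum-exp term can be computed stably by subtracting the maximum first. A plain-Python sketch over one vector (an illustration, not necessarily how OneFlow implements it):

```python
import math


def log_softmax(values):
    """x_i - log(sum_j exp(x_j)), with the max subtracted for numerical stability."""
    peak = max(values)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in values))
    return [v - log_sum_exp for v in values]


out = log_softmax([1.0, 2.0, 3.0])
print([round(v, 4) for v in out])
# Exponentiating the result recovers softmax probabilities that sum to 1.
print(round(sum(math.exp(v) for v in out), 6))
```

Subtracting `peak` leaves the result unchanged mathematically but keeps `math.exp` from overflowing on large inputs such as `[1000.0, 1001.0]`.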

oneflow.nn.functional.gelu(x: Tensor) → Tensor
The equation is:
\[out = 0.5 * x * (1 + \tanh(\sqrt{\frac{2}{\pi}} * (x + 0.044715x^{3})))\]
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([-0.5, 0, 0.5]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.gelu(input)
>>> out
tensor([-0.1543,  0.0000,  0.3457], dtype=oneflow.float32)
See GELU for more details.

oneflow.nn.functional.glu(input: Tensor, dim: int) → Tensor
The equation is:
\[GLU(input) = GLU(a, b) = a \otimes sigmoid(b)\]
Note
where input is split in half along dim to form a and b, and ⊗ is the element-wise product between matrices.
For example:
>>> import oneflow as flow
>>> import oneflow.nn as nn
>>> x = flow.tensor([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=flow.float32)
>>> y = nn.functional.glu(x)
>>> y
tensor([[0.9526, 1.9640],
        [4.9954, 5.9980]], dtype=oneflow.float32)
See GLU for more details.
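The split-and-gate step can be reproduced for a single row in plain Python. This sketch handles one 1D list (the split dimension) and is only meant to show where the numbers in the GLU example come from; `glu_row` is a hypothetical name.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def glu_row(values):
    """Split an even-length row into halves a and b, then return a * sigmoid(b)."""
    if len(values) % 2 != 0:
        raise ValueError("the split dimension must have even length")
    half = len(values) // 2
    a, b = values[:half], values[half:]
    return [x * sigmoid(gate) for x, gate in zip(a, b)]


print([round(v, 4) for v in glu_row([1.0, 2.0, 3.0, 4.0])])
print([round(v, 4) for v in glu_row([5.0, 6.0, 7.0, 8.0])])
```

Note that the output has half as many elements along the split dimension as the input.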

oneflow.nn.functional.softsign(x: Tensor) → Tensor
The formula is:
\[\text{softsign}(x) = \frac{x}{1 + |x|}\]
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([1, 2, 3]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.nn.functional.softsign(input)
>>> out
tensor([0.5000, 0.6667, 0.7500], dtype=oneflow.float32)
See Softsign for more details.

oneflow.nn.functional.softmax(x: Tensor, dim: int) → Tensor
Softmax is defined as:
\[\text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}\]
See Softmax for more details.

oneflow.nn.functional.softplus(x: Tensor, beta: double = 1, threshold: double = 20) → Tensor
Applies the element-wise function:
\[\text{Softplus}(x) = \frac{1}{\beta} * \log(1 + \exp(\beta * x))\]
For numerical stability the implementation reverts to the linear function when \(input \times \beta > threshold\).
See Softplus for more details.
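The threshold rule is easiest to see in scalar form. The sketch below follows the formula and the stated fallback to the linear function in plain Python; it illustrates the behaviour and is not OneFlow's kernel.

```python
import math


def softplus(x, beta=1.0, threshold=20.0):
    """(1 / beta) * log(1 + exp(beta * x)), linear once beta * x exceeds threshold."""
    if x * beta > threshold:
        return float(x)                       # exp(beta * x) would dominate anyway
    return math.log1p(math.exp(beta * x)) / beta


for value in (-2.0, 0.0, 2.0, 50.0):
    print(value, round(softplus(value), 6))
```

Above the threshold the exact value differs from `x` by less than `exp(-threshold) / beta`, so the shortcut avoids overflow in `exp` at negligible cost in accuracy.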

oneflow.nn.functional.tanh(x: Tensor) → Tensor
The equation is:
\[out = \frac{e^x - e^{-x}}{e^x + e^{-x}}\]
See Tanh for more details.

oneflow.nn.functional.threshold()

oneflow.nn.functional.softshrink()

oneflow.nn.functional.silu(x: Tensor) → Tensor
The formula is:
\[\text{silu}(x) = x * sigmoid(x)\]
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([1, 2, 3]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.silu(input)
>>> out
tensor([0.7311, 1.7616, 2.8577], dtype=oneflow.float32)
See SiLU for more details.

oneflow.nn.functional.mish(x: Tensor) → Tensor
Applies the element-wise function:
\[\text{mish}(x) = x * \text{tanh}(\text{softplus}(x))\]
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> x = np.array([1, 2, 3]).astype(np.float32)
>>> input = flow.tensor(x)
>>> out = flow.mish(input)
>>> out
tensor([0.8651, 1.9440, 2.9865], dtype=oneflow.float32)
See Mish for more details.

oneflow.nn.functional.one_hot(input, num_classes=-1, on_value=1, off_value=0)
This operator generates a one-hot Tensor from the input Tensor.
If the input Tensor’s rank is N, the corresponding one-hot Tensor’s rank is N+1.
Parameters
input (Tensor) – The input Tensor.
num_classes (int) – The length of the one-hot Tensor.
on_value (Union[int, float], optional) – The fill value when x[i] == i. Defaults to 1.
off_value (Union[int, float], optional) – The fill value when x[i] != i. Defaults to 0.
Note
The data type of the input tensor should be int32 or int64.
Returns
oneflow.Tensor.
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> input=flow.tensor(np.array([0, 3, 1, 2]).astype(np.int64), dtype=flow.int64)
>>> out = flow.nn.functional.one_hot(input, num_classes=5)
>>> out
tensor([[1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0]], dtype=oneflow.int64)
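The rank N to N+1 mapping and the role of on_value and off_value can be illustrated with nested lists. The sketch below covers only a rank-1 input and requires an explicit `num_classes`; `one_hot_rows` is a hypothetical helper, not the OneFlow operator.

```python
def one_hot_rows(indices, num_classes, on_value=1, off_value=0):
    """Turn a list of class indices into a list of one-hot rows."""
    rows = []
    for index in indices:
        if not 0 <= index < num_classes:
            raise ValueError(f"index {index} is outside [0, {num_classes})")
        rows.append([on_value if c == index else off_value
                     for c in range(num_classes)])
    return rows


for row in one_hot_rows([0, 3, 1, 2], num_classes=5):
    print(row)
```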

oneflow.nn.functional.triplet_margin_loss()
The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.triplet_margin_loss.html.
Creates a criterion that measures the triplet loss given input tensors \(x1\), \(x2\), \(x3\) and a margin with a value greater than \(0\). This is used for measuring a relative similarity between samples. A triplet is composed of a, p and n (i.e., anchor, positive examples and negative examples respectively). The shapes of all input tensors should be \((N, D)\).
The distance swap is described in detail in the paper Learning shallow convolutional feature descriptors with triplet losses by V. Balntas, E. Riba et al.
The loss function for each sample in the mini-batch is:
\[L(a, p, n) = \max \{d(a_i, p_i) - d(a_i, n_i) + {\rm margin}, 0\}\]
where
\[d(x_i, y_i) = \left\lVert {\bf x}_i - {\bf y}_i \right\rVert_p\]
Parameters
margin (float, optional) – Default: \(1\).
p (float, optional) – The norm degree for pairwise distance. Default: \(2.0\).
swap (bool, optional) – The distance swap is described in detail in the paper Learning shallow convolutional feature descriptors with triplet losses by V. Balntas, E. Riba et al. Default: False.
reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'
Shape:
Input: \((N, D)\) where \(D\) is the vector dimension.
Output: A Tensor of shape \((N)\) if reduction is 'none', or a scalar otherwise.
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> triplet_loss = flow.nn.TripletMarginLoss(margin=1.0, p=2)
>>> anchor = np.array([[1, -1, 1],[-1, 1, -1], [1, 1, 1]])
>>> positive = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> negative = np.array([[2, 2, 2], [2, 2, 2], [2, 2, 2]])
>>> output = triplet_loss(flow.Tensor(anchor), flow.Tensor(positive), flow.Tensor(negative))
>>> output
tensor(6.2971, dtype=oneflow.float32)
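The loss formula can be traced by hand with a few lines of plain Python. The sketch below uses hypothetical helper names and sample inputs chosen for illustration, and it omits both the swap option and the small epsilon that library implementations typically add inside the distance, so the last digits may differ slightly from a framework result.

```python
def pairwise_distance(x, y, p=2.0):
    """L_p distance between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def triplet_margin_loss(anchors, positives, negatives,
                        margin=1.0, p=2.0, reduction="mean"):
    """max(d(a, p) - d(a, n) + margin, 0) per sample, then reduced."""
    losses = [
        max(pairwise_distance(a, pos, p) - pairwise_distance(a, neg, p) + margin, 0.0)
        for a, pos, neg in zip(anchors, positives, negatives)
    ]
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    return sum(losses) / len(losses)


anchor = [[1, -1, 1], [-1, 1, -1], [1, 1, 1]]
positive = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
negative = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
print(round(triplet_margin_loss(anchor, positive, negative), 4))
```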

oneflow.nn.functional.dropout(x: Tensor, p: float = 0.5, training: bool = True, generator: Generator = None, *, addend: Tensor) → Tensor
The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.dropout.html.
During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution.
Parameters
x (Tensor) – A Tensor to which dropout will be applied.
p (float) – probability of an element to be zeroed. Default: 0.5
training (bool) – If True, dropout is applied. Default: True
generator (Generator, optional) – A pseudorandom number generator for sampling
addend (Tensor, optional) – A Tensor added to the result after dropout; it can be used in a model’s residual connection structure. Default: None
 Shape:
Input: \((*)\). Input can be of any shape
Output: \((*)\). Output is of the same shape as input
For example:
Example 1:
>>> import numpy as np
>>> import oneflow as flow
>>> arr = np.array(
...    [
...        [-0.7797, 0.2264, 0.2458, 0.4163],
...        [0.4299, 0.3626, -0.4892, 0.4141],
...        [-1.4115, 1.2183, -0.5503, 0.6520],
...    ]
... )
>>> x = flow.tensor(arr, dtype=flow.float32)
>>> y = flow.nn.functional.dropout(x, p=0)
>>> generator = flow.Generator()
>>> y = flow.nn.functional.dropout(x, p=0.5, generator=generator)
Example 2:
>>> import numpy as np
>>> import oneflow as flow
>>> arr = np.array(
...    [
...        [-0.7797, 0.2264, 0.2458, 0.4163],
...        [0.4299, 0.3626, -0.4892, 0.4141],
...        [-1.4115, 1.2183, -0.5503, 0.6520],
...    ]
... )
>>> x = flow.tensor(arr, dtype=flow.float32)
>>> addend = flow.ones((3, 4), dtype=flow.float32)
>>> y = flow.nn.functional.dropout(x, p=0, addend=addend)
>>> y
tensor([[ 0.2203,  1.2264,  1.2458,  1.4163],
        [ 1.4299,  1.3626,  0.5108,  1.4141],
        [-0.4115,  2.2183,  0.4497,  1.6520]], dtype=oneflow.float32)
See Dropout for details.
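The interaction between p, training and addend can be sketched on a flat list in plain Python. Two assumptions are made here that the entry does not state: surviving elements are rescaled by 1 / (1 - p) (the usual "inverted dropout" convention), and addend is added after the dropout step. The function name and the `rng` parameter are hypothetical.

```python
import random


def dropout_list(values, p=0.5, training=True, addend=None, rng=None):
    """Zero each element with probability p, rescale survivors, then add addend."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    rng = rng or random.Random()
    if not training or p == 0.0:
        out = [float(v) for v in values]          # identity when dropout is inactive
    elif p == 1.0:
        out = [0.0 for _ in values]
    else:
        scale = 1.0 / (1.0 - p)                   # assumed inverted-dropout scaling
        out = [0.0 if rng.random() < p else v * scale for v in values]
    if addend is not None:
        out = [o + a for o, a in zip(out, addend)]
    return out


# With p=0 nothing is dropped, so the result is simply x + addend.
print(dropout_list([-0.7797, 0.2264], p=0.0, addend=[1.0, 1.0]))
print(dropout_list([1.0, 2.0, 3.0, 4.0], p=0.5, rng=random.Random(0)))
```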

oneflow.nn.functional.affine_grid(theta, size: List[int], align_corners: bool = False)
The interface is consistent with PyTorch. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.affine_grid.html.
Generates a 2D or 3D flow field (sampling grid), given a batch of affine matrices theta.
Note
This function is often used in conjunction with grid_sample() to build Spatial Transformer Networks.
Parameters
theta (Tensor) – input batch of affine matrices with shape (\(N, 2, 3\)) for 2D or (\(N, 3, 4\)) for 3D
size (oneflow.Size) – the target output image size. (\(N, C, H, W\) for 2D or \(N, C, D, H, W\) for 3D) Example: oneflow.Size((32, 3, 24, 24))
align_corners (bool) – if True, consider -1 and 1 to refer to the centers of the corner pixels rather than the image corners. Refer to grid_sample() for a more complete description. A grid generated by affine_grid() should be passed to grid_sample() with the same setting for this option. Default: False
Returns
output Tensor of size (\(N, H, W, 2\))
Return type
output (Tensor)
Examples:
>>> import oneflow as flow
>>> import numpy as np
>>> input = flow.tensor(np.arange(1., 7).reshape((1, 2, 3)), dtype=flow.float32)
>>> output = flow.nn.functional.affine_grid(input, flow.Size([1, 1, 2, 2]), align_corners=True)
>>> output
tensor([[[[ 0., -3.],
          [ 2.,  5.]],
         [[ 4.,  7.],
          [ 6., 15.]]]], dtype=oneflow.float32)
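The 2D case can be reproduced with nested lists: build the normalized base coordinates for each output pixel, then apply the 2x3 matrix to (x, y, 1). The sketch below handles a single matrix and assumes the PyTorch convention for where the base coordinates sit (pixel centres at ±1 when align_corners is True, pixel edges at ±1 otherwise); the helper names are hypothetical.

```python
def base_coords(n, align_corners):
    """Normalized coordinates in [-1, 1] for n pixels along one axis."""
    if align_corners:
        return [0.0] if n == 1 else [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [(2.0 * i + 1.0) / n - 1.0 for i in range(n)]


def affine_grid_2d(theta, height, width, align_corners=False):
    """Apply one 2x3 affine matrix to every base point; returns an H x W x 2 grid."""
    xs = base_coords(width, align_corners)
    ys = base_coords(height, align_corners)
    return [
        [[row[0] * x + row[1] * y + row[2] for row in theta] for x in xs]
        for y in ys
    ]


theta = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
for line in affine_grid_2d(theta, height=2, width=2, align_corners=True):
    print(line)
```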

oneflow.nn.functional.grid_sample(input, grid, mode: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = False)
The interface is consistent with PyTorch. The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.grid_sample.html.
Given an input and a flow-field grid, computes the output using input values and pixel locations from grid.
Currently, only spatial (4D) and volumetric (5D) input are supported.
In the spatial (4D) case, for input with shape \((N, C, H_{in}, W_{in})\) and grid with shape \((N, H_{out}, W_{out}, 2)\), the output will have shape \((N, C, H_{out}, W_{out})\).
For each output location output[n, :, h, w], the size-2 vector grid[n, h, w] specifies input pixel locations x and y, which are used to interpolate the output value output[n, :, h, w]. In the case of 5D inputs, grid[n, d, h, w] specifies the x, y, z pixel locations for interpolating output[n, :, d, h, w]. The mode argument specifies the nearest or bilinear interpolation method used to sample the input pixels.
grid specifies the sampling pixel locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. For example, values x = -1, y = -1 is the left-top pixel of input, and values x = 1, y = 1 is the right-bottom pixel of input.
If grid has values outside the range of [-1, 1], the corresponding outputs are handled as defined by padding_mode. Options are
padding_mode="zeros": use 0 for out-of-bound grid locations,
padding_mode="border": use border values for out-of-bound grid locations,
padding_mode="reflection": use values at locations reflected by the border for out-of-bound grid locations. For locations far away from the border, it will keep being reflected until becoming in bound, e.g., (normalized) pixel location x = -3.5 reflects by border -1 and becomes x' = 1.5, then reflects by border 1 and becomes x'' = 0.5.
Note
This function is often used in conjunction with affine_grid() to build Spatial Transformer Networks.
Note
NaN values in grid would be interpreted as -1.
Parameters
input (Tensor) – input of shape \((N, C, H_{in}, W_{in})\) (4D case) or \((N, C, D_{in}, H_{in}, W_{in})\) (5D case)
grid (Tensor) – flow-field of shape \((N, H_{out}, W_{out}, 2)\) (4D case) or \((N, D_{out}, H_{out}, W_{out}, 3)\) (5D case)
mode (str) – interpolation mode to calculate output values: 'bilinear' | 'nearest' | 'bicubic'. Default: 'bilinear'. Note: mode='bicubic' supports only 4D input. When mode='bilinear' and the input is 5D, the interpolation mode used internally will actually be trilinear. However, when the input is 4D, the interpolation mode will legitimately be bilinear.
padding_mode (str) – padding mode for outside grid values: 'zeros' | 'border' | 'reflection'. Default: 'zeros'
align_corners (bool) – Geometrically, we consider the pixels of the input as squares rather than points. If set to True, the extrema (-1 and 1) are considered as referring to the center points of the input’s corner pixels. If set to False, they are instead considered as referring to the corner points of the input’s corner pixels, making the sampling more resolution agnostic. This option parallels the align_corners option in interpolate(), and so whichever option is used here should also be used there to resize the input image before grid sampling. Default: False
Returns
output Tensor
Return type
output (Tensor)
Note
mode='bicubic' is implemented using the cubic convolution algorithm with \(\alpha=-0.75\). The constant \(\alpha\) might differ from package to package. For example, PIL and OpenCV use -0.5 and -0.75 respectively. This algorithm may “overshoot” the range of values it’s interpolating. For example, it may produce negative values or values greater than 255 when interpolating input in [0, 255]. Clamp the results with flow.clamp to ensure they are within the valid range.
Examples:
>>> import oneflow as flow
>>> import numpy as np
>>> input = flow.tensor(np.arange(1., 11).reshape((1, 1, 2, 5)), dtype=flow.float32)
>>> np_grid = np.array(
...    [[[-0.9, -4.1], [0, 0.2000], [1, -1], [-0.333, 1e-6], [0.5, 1.0]],
...     [[-1.0, -0.5], [0, 0.3333], [1, -1], [-0.200, 1e-6], [1.5, 0.5]]]
... ).reshape(1, 2, 5, 2)
>>> grid = flow.tensor(np_grid, dtype=flow.float32)
>>> output = flow.nn.functional.grid_sample(input, grid, mode='nearest', padding_mode='zeros',
...                                        align_corners=True)
>>> output
tensor([[[[0., 8., 5., 7., 9.],
          [1., 8., 5., 8., 0.]]]], dtype=oneflow.float32)
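The repeated-reflection rule of padding_mode="reflection" can be sketched directly in normalized coordinates. This plain-Python helper (a hypothetical name) mirrors a coordinate about whichever border it has crossed until it lands inside [-1, 1]. It is a simplified illustration that reflects about the points -1 and 1 themselves, which is an assumption about the geometry rather than a statement of how OneFlow handles the align_corners variants.

```python
def reflect_into_range(x, low=-1.0, high=1.0):
    """Mirror x about the nearest crossed border until low <= x <= high."""
    while x < low or x > high:
        if x < low:
            x = 2.0 * low - x     # reflect about the lower border
        else:
            x = 2.0 * high - x    # reflect about the upper border
    return x


# -3.5 is first mirrored about -1 (giving 1.5), then about 1.
print(reflect_into_range(-3.5))
print(reflect_into_range(0.3))
```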

oneflow.nn.functional.
interpolate
(input, size=None, scale_factor=None, mode='nearest', align_corners=None, recompute_scale_factor=None)¶ The interface is consistent with PyTorch.
The documentation is referenced from: https://pytorch.org/docs/1.10/_modules/torch/nn/functional.html#interpolate.
Down/up samples the input to either the given size or the given scale_factor.
The algorithm used for interpolation is determined by mode.
Currently temporal, spatial and volumetric sampling are supported, i.e. expected inputs are 3D, 4D or 5D in shape.
The input dimensions are interpreted in the form: minibatch x channels x [optional depth] x [optional height] x width.
The modes available for resizing are: nearest, linear (3D only), bilinear, bicubic (4D only), trilinear (5D only), area.
 Parameters
input (Tensor) – the input tensor
size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]) – output spatial size.
scale_factor (float or Tuple[float]) – multiplier for spatial size. Has to match input size if it is a tuple.
mode (str) – algorithm used for upsampling: 'nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area'. Default: 'nearest'
align_corners (bool, optional) – Geometrically, we consider the pixels of the input and output as squares rather than points. If set to True, the input and output tensors are aligned by the center points of their corner pixels, preserving the values at the corner pixels. If set to False, the input and output tensors are aligned by the corner points of their corner pixels, and the interpolation uses edge value padding for out-of-boundary values, making this operation independent of input size when scale_factor is kept the same. This only has an effect when mode is 'linear', 'bilinear', 'bicubic' or 'trilinear'. Default: False
recompute_scale_factor (bool, optional) – recompute the scale_factor for use in the interpolation calculation. When scale_factor is passed as a parameter, it is used to compute the output_size. If recompute_scale_factor is False or not specified, the passed-in scale_factor will be used in the interpolation computation. Otherwise, a new scale_factor will be computed based on the output and input sizes for use in the interpolation computation (i.e. the computation will be identical to the one performed if the computed output_size were passed in explicitly). Note that when scale_factor is floating-point, the recomputed scale_factor may differ from the one passed in due to rounding and precision issues.
Note
With mode='bicubic', it’s possible to cause overshoot, in other words it can produce negative values or values greater than 255 for images. Explicitly call result.clamp(min=0, max=255) if you want to reduce the overshoot when displaying the image.
Warning
With align_corners = True, the linearly interpolating modes (linear, bilinear, and trilinear) don’t proportionally align the output and input pixels, and thus the output values can depend on the input size. This was the default behavior for these modes up to version 0.3.1. Since then, the default behavior is align_corners = False. See Upsample for concrete examples on how this affects the outputs.
Warning
When scale_factor is specified, if recompute_scale_factor=True, scale_factor is used to compute the output_size which will then be used to infer new scales for the interpolation.
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> input = flow.tensor(np.arange(1, 5).reshape((1, 1, 4)), dtype=flow.float32)
>>> output = flow.nn.functional.interpolate(input, scale_factor=2.0, mode="linear")
>>> output
tensor([[[1.0000, 1.2500, 1.7500, 2.2500, 2.7500, 3.2500, 3.7500, 4.0000]]],
       dtype=oneflow.float32)
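To see where values such as 1.25 and 1.75 come from, the 1D linear mode can be sketched in plain Python. The function name interpolate_linear_1d is hypothetical. The sketch assumes the passed scale_factor is used directly (recompute_scale_factor not set), and that out-of-boundary source positions are clamped to the edge values as described for align_corners=False. It illustrates the coordinate arithmetic and is not the library's implementation.

```python
import math


def interpolate_linear_1d(values, scale_factor, align_corners=False):
    """Resize a 1D list of floats by linear interpolation."""
    n_in = len(values)
    n_out = int(math.floor(n_in * scale_factor))
    out = []
    for i in range(n_out):
        if align_corners:
            # Corner pixel centers of input and output coincide.
            src = i * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        else:
            # Pixel edges coincide; negative positions clamp to the first pixel.
            src = max((i + 0.5) / scale_factor - 0.5, 0.0)
        lo = min(int(math.floor(src)), n_in - 1)
        hi = min(lo + 1, n_in - 1)  # clamp at the right edge (edge value padding)
        frac = src - lo
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out


result = interpolate_linear_1d([1.0, 2.0, 3.0, 4.0], scale_factor=2.0)
print(result)  # [1.0, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75, 4.0]
```

For output index 1 the source position is (1 + 0.5) / 2 - 0.5 = 0.25, so the result is 0.75 * 1 + 0.25 * 2 = 1.25.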

oneflow.nn.functional.
ctc_greedy_decoder
()¶ Performs greedy decoding on the logits given in input (best path).
 Parameters
log_probs (oneflow.Tensor) – A Tensor of shape [input_length, batch_size, num_labels]. The logarithmized probabilities of the outputs (e.g. obtained with flow.nn.logsoftmax()).
input_lengths (oneflow.Tensor) – A Tensor of shape [batch_size]. It represents the lengths of the inputs. The lengths are specified for each sequence to achieve masking under the assumption that sequences are padded to equal lengths.
merge_repeated (bool, optional) – If merge_repeated is True, merge repeated classes in output. This means that if consecutive logits’ maximum indices are the same, only the first of these is emitted. Defaults to True.
 Returns
decoded (oneflow.Tensor): a Tensor of shape [batch_size, input_length] containing the decoded outputs.
neg_sum_logits (oneflow.Tensor): a float matrix of shape [batch_size, 1] containing, for each sequence found, the negative of the sum of the greatest logit at each time frame.
 Return type
decoded(oneflow.Tensor)
For example:
>>> import oneflow as flow
>>> import numpy as np
>>> log_probs = flow.tensor(
...     [
...         [[-1.54, -1.20, -1.95, -1.65, -1.81], [-1.84, -1.74, -1.58, -1.55, -1.12]],
...         [[-1.68, -1.48, -1.89, -1.30, -2.07], [-1.13, -1.45, -1.24, -1.61, -1.66]],
...         [[-1.56, -1.40, -2.83, -1.67, -1.48], [-1.20, -2.01, -2.05, -1.95, -1.24]],
...         [[-2.09, -1.76, -1.36, -1.67, -1.45], [-1.85, -1.48, -1.34, -2.16, -1.55]],
...     ]
... )
>>> input_lengths = flow.tensor([4, 4])
>>> decoded, neg_sum_logits = flow.nn.functional.ctc_greedy_decoder(log_probs, input_lengths)
>>> decoded
tensor([[1, 3, 1, 2],
        [0, 2, 0, 0]], dtype=oneflow.int64)
>>> neg_sum_logits
tensor([[5.2600],
        [4.7900]], dtype=oneflow.float32)
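The best-path procedure can be sketched in plain Python. The function name ctc_greedy_decode is hypothetical. Two behaviours are assumptions inferred from the documented output rather than stated in the text: the blank label is taken to be the last class index (num_labels - 1), and decoded rows are padded with 0 up to input_length.

```python
def ctc_greedy_decode(log_probs, input_lengths, merge_repeated=True):
    """Greedy (best path) CTC decoding on nested lists shaped [time][batch][label]."""
    max_len = len(log_probs)
    num_labels = len(log_probs[0][0])
    blank = num_labels - 1  # assumption: the last class index is the blank label
    decoded, neg_sum_logits = [], []
    for b, length in enumerate(input_lengths):
        path, total, prev = [], 0.0, None
        for t in range(length):
            frame = log_probs[t][b]
            best = max(range(num_labels), key=frame.__getitem__)
            total += frame[best]
            is_repeat = merge_repeated and best == prev
            if best != blank and not is_repeat:
                path.append(best)
            prev = best
        decoded.append(path + [0] * (max_len - len(path)))  # assumption: zero padding
        neg_sum_logits.append(-total)
    return decoded, neg_sum_logits


log_probs = [
    [[-1.54, -1.20, -1.95, -1.65, -1.81], [-1.84, -1.74, -1.58, -1.55, -1.12]],
    [[-1.68, -1.48, -1.89, -1.30, -2.07], [-1.13, -1.45, -1.24, -1.61, -1.66]],
    [[-1.56, -1.40, -2.83, -1.67, -1.48], [-1.20, -2.01, -2.05, -1.95, -1.24]],
    [[-2.09, -1.76, -1.36, -1.67, -1.45], [-1.85, -1.48, -1.34, -2.16, -1.55]],
]
decoded, neg_sum_logits = ctc_greedy_decode(log_probs, [4, 4])
print(decoded)
```

In the second sequence the per-frame argmax path is 4, 0, 0, 2: the leading 4 is dropped as blank and the repeated 0 is merged, leaving [0, 2] before padding.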

oneflow.nn.functional.
sparse_softmax_cross_entropy
(labels, logits)¶ The interface is consistent with TensorFlow. The documentation is referenced from: https://www.tensorflow.org/api_docs/python/tf/nn/sparse_softmax_cross_entropy_with_logits
Computes sparse softmax cross entropy between logits and labels.
Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.
A common use case is to have logits of shape [batch_size, num_classes] and have labels of shape [batch_size], but higher dimensions are supported, in which case the dim-th dimension is assumed to be of size num_classes. logits must have the dtype of float16, float32, or float64, and labels must have the dtype of int32 or int64.
 Parameters
labels (Tensor) – shape with [d_0, d_1, …, d_{r-1}] (where r is rank of labels and output) and dtype int32 or int64. Each entry in labels must be an index in [0, num_classes).
logits (Tensor) – Per-label activations (typically a linear output) of shape [d_0, d_1, …, d_{r-1}, num_classes] and dtype float16, float32, or float64. These activation energies are interpreted as unnormalized log probabilities.
 Returns
A Tensor of the same shape as labels and of the same type as logits with the softmax cross entropy loss.
 Return type
output (Tensor)
Examples:
>>> import numpy as np
>>> import oneflow as flow
>>> np_logits = np.array(
...     [
...         [2.0, -5.0, 0.5, -0.1],
...         [0.0, 0.0, 1.9, 1.4],
...         [-100.0, 100.0, -100.0, -100.0],
...     ]
... )
>>> np_labels = np.array([0, 3, 1])
>>> logits = flow.tensor(np_logits, dtype=flow.float32)
>>> labels = flow.tensor(np_labels, dtype=flow.int32)
>>> output = flow.nn.functional.sparse_softmax_cross_entropy(
...     labels=labels, logits=logits
... )
>>> output
tensor([ 2.9751e-01,  1.1448e+00, -1.4305e-06], dtype=oneflow.float32)
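The quantity being computed is, for each row, the log-sum-exp of the logits minus the logit at the labeled class. A minimal pure-Python sketch follows; the function name sparse_softmax_ce is hypothetical and only the 2D [batch_size, num_classes] case is covered.

```python
import math


def sparse_softmax_ce(labels, logits):
    """Per-row softmax cross entropy for integer class labels."""
    losses = []
    for label, row in zip(labels, logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(log_sum_exp - row[label])
    return losses


logits = [
    [2.0, -5.0, 0.5, -0.1],
    [0.0, 0.0, 1.9, 1.4],
    [-100.0, 100.0, -100.0, -100.0],
]
labels = [0, 3, 1]
losses = sparse_softmax_ce(labels, logits)
print([round(v, 4) for v in losses])
```

The third row puts almost all probability mass on class 1, which is also its label, so its loss is effectively zero; a float32 implementation may report a tiny nonzero value because of rounding.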

oneflow.nn.functional.
embedding
(input, weight, padding_idx=None, max_norm=None, norm_type=None, scale_grad_by_freq=False, sparse=False)¶ A simple lookup table that looks up embeddings in a fixed dictionary and size.
This function is often used to retrieve word embeddings using indices. Its inputs are a list of indices and the embedding matrix, and its output is the corresponding word embeddings.
See
oneflow.nn.Embedding
for more details. Parameters
input (LongTensor) – Tensor containing indices into the embedding matrix
weight (Tensor) – The embedding matrix with number of rows equal to the maximum possible index + 1, and number of columns equal to the embedding size
padding_idx (int, optional) – If specified, the entries at padding_idx do not contribute to the gradient; therefore, the embedding vector at padding_idx is not updated during training, i.e. it remains as a fixed “pad”.
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> # a batch of 2 samples of 4 indices each
>>> input = flow.tensor([[1,2,4,5],[4,3,2,9]])
>>> # an embedding matrix containing 10 tensors of size 3
>>> embedding_matrix = flow.rand(10, 3)
>>> output = F.embedding(input, embedding_matrix)
>>> output.shape
oneflow.Size([2, 4, 3])
>>> # example with padding_idx
>>> input = flow.tensor([[0,2,0,5]])
>>> output = F.embedding(input, embedding_matrix, padding_idx=0)
>>> output.shape
oneflow.Size([1, 4, 3])
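The forward pass is a row lookup: each index selects one row of the embedding matrix, so the output shape is the input shape with the embedding size appended. The sketch below uses nested lists and a hypothetical helper name, embedding_lookup. It does not model padding_idx, which only affects gradients.

```python
def embedding_lookup(indices, weight):
    """Replace every index in a 2D list with a copy of the matching row of `weight`."""
    return [[list(weight[i]) for i in row] for row in indices]


# An embedding matrix with 10 rows of size 3 (arbitrary illustrative values).
weight = [[float(r), r + 0.25, r + 0.5] for r in range(10)]
batch = [[1, 2, 4, 5], [4, 3, 2, 9]]  # 2 samples of 4 indices each

out = embedding_lookup(batch, weight)
shape = (len(out), len(out[0]), len(out[0][0]))
print(shape)  # (2, 4, 3)
```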

oneflow.nn.functional.
linear
(input, weight, bias=None)¶ Applies a linear transformation to the incoming data: \(y = xA^T + b\).
Shape:
Input: \((N, *, in\_features)\) N is the batch size, * means any number of additional dimensions
Weight: \((out\_features, in\_features)\)
Bias: \((out\_features)\)
Output: \((N, *, out\_features)\)
For example:
>>> import numpy as np
>>> import oneflow as flow
>>> input = flow.tensor(np.random.randn(128, 20))
>>> weight = flow.tensor(np.random.randn(30, 20))
>>> output = flow.nn.functional.linear(input, weight)
>>> output.size()
oneflow.Size([128, 30])
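The formula \(y = xA^T + b\) means that each output feature is the dot product of an input row with one row of the weight matrix, plus that feature's bias. A plain-Python sketch for 2D input follows; the function name linear_2d and the sample values are illustrative only.

```python
def linear_2d(x, weight, bias=None):
    """Compute x @ weight^T + bias for x of shape (N, in) and weight of shape (out, in)."""
    out = []
    for row in x:
        y = [sum(a * w for a, w in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            y = [v + b for v, b in zip(y, bias)]
        out.append(y)
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]                    # (N=3, in_features=2)
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]  # (out_features=4, in_features=2)
bias = [0.5, 0.5, 0.5, 0.5]                                 # (out_features=4)

y = linear_2d(x, weight, bias)
print(len(y), len(y[0]))  # 3 4
print(y[0])               # [1.5, 2.5, 3.5, 0.5]
```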

oneflow.nn.functional.
cosine_similarity
()¶

oneflow.nn.functional.
cross_entropy
()¶ The documentation is referenced from: https://pytorch.org/docs/1.10/generated/torch.nn.functional.cross_entropy.html.
See
CrossEntropyLoss
for details. Parameters
input (Tensor) – \((N, C)\) where C = number of classes or \((N, C, H, W)\) in case of 2D Loss, or \((N, C, d_1, d_2, ..., d_K)\) where \(K \geq 1\) in the case of K-dimensional loss. input is expected to contain unnormalized scores (often referred to as logits).
target (Tensor) – If containing class indices, shape \((N)\) where each value is \(0 \leq \text{targets}[i] \leq C-1\), or \((N, d_1, d_2, ..., d_K)\) with \(K \geq 1\) in the case of K-dimensional loss. If containing class probabilities, same shape as the input.
weight (Tensor, optional) – a manual rescaling weight given to each class. If given, has to be a Tensor of size C
ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets. Note that ignore_index is only applicable when the target contains class indices. Default: -100
reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'
For example:
>>> import oneflow as flow
>>> import oneflow.nn.functional as F
>>> input = flow.randn(3, 5, requires_grad=True)
>>> target = flow.ones(3, dtype=flow.int64)
>>> loss = F.cross_entropy(input, target)
>>> loss.backward()
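How ignore_index and reduction interact can be sketched in plain Python for the \((N, C)\) case with class-index targets. The function name cross_entropy_2d is hypothetical, and the per-class weight argument is omitted. The sketch assumes that ignored rows contribute a loss of 0 and that 'mean' divides by the number of non-ignored targets.

```python
import math


def cross_entropy_2d(logits, targets, ignore_index=-100, reduction="mean"):
    """Cross entropy for logits of shape (N, C) and integer class targets of shape (N)."""
    per_sample, kept = [], 0
    for row, target in zip(logits, targets):
        if target == ignore_index:
            per_sample.append(0.0)  # ignored targets contribute nothing
            continue
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        per_sample.append(log_sum_exp - row[target])
        kept += 1
    if reduction == "none":
        return per_sample
    total = sum(per_sample)
    if reduction == "sum":
        return total
    if reduction == "mean":
        # Average over the non-ignored targets only.
        return total / kept if kept else float("nan")
    raise ValueError(f"unknown reduction: {reduction!r}")


logits = [[0.0, 0.0], [0.0, 0.0], [1.0, 3.0]]
targets = [0, -100, 1]  # the second sample is ignored

mean_loss = cross_entropy_2d(logits, targets)
print(round(mean_loss, 4))
```

Here the first sample contributes ln 2 and the third contributes ln(1 + e^-2), so the mean is taken over two samples rather than three.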