oneflow.nn.RMSLayerNorm
class oneflow.nn.RMSLayerNorm(hidden_size, eps=1e-06)

Construct a layernorm module in the T5 style. No bias and no subtraction of the mean.
T5 uses a layer norm that only scales and does not shift, also known as Root Mean
Square Layer Normalization (https://arxiv.org/abs/1910.07467): the variance is computed
without subtracting the mean, and there is no bias term. Additionally, the accumulation for
half-precision inputs is done in fp32.
- Parameters
hidden_size (int) – number of features in the hidden state
eps (float) – a value added to the denominator for numerical stability. Default: 1e-6
- Shape:
Input: \((N, *)\)
Output: \((N, *)\) (same shape as input)
For example:
>>> import oneflow as flow
>>> x = flow.randn(2, 4, 3)
>>> m = flow.nn.RMSLayerNorm(3)
>>> y = m(x)
>>> y.size()
oneflow.Size([2, 4, 3])
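The normalization described above can be sketched in plain NumPy. This is a hypothetical reference implementation, not OneFlow's actual kernel; the function name `rms_layer_norm` and the explicit `weight` argument are assumptions for illustration. It shows the two properties the docstring emphasizes: the "variance" is a mean of squares with no mean subtraction, and half-precision inputs are accumulated in fp32.

```python
import numpy as np

def rms_layer_norm(x, weight, eps=1e-6):
    # Hypothetical reference sketch of a T5-style RMS layer norm.
    orig_dtype = x.dtype
    # Accumulate in fp32 even when the input is half precision.
    x32 = x.astype(np.float32)
    # "Variance" without subtracting the mean: the mean of squares
    # over the last (hidden) dimension, i.e. the squared RMS.
    variance = np.mean(x32 ** 2, axis=-1, keepdims=True)
    y = x32 / np.sqrt(variance + eps)
    # Scale only -- there is no bias/shift term.
    return (weight * y).astype(orig_dtype)

x = np.random.randn(2, 4, 3).astype(np.float16)
w = np.ones(3, dtype=np.float32)  # learnable scale, initialized to ones
y = rms_layer_norm(x, w)
print(y.shape)  # (2, 4, 3)
```

Note that the output dtype matches the input: the fp32 accumulation is internal, and the result is cast back to fp16 at the end.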