GLU(dim: Optional[int] = -1)
The gated linear unit (GLU) activation.
Input: \((\ast_1, N, \ast_2)\) where * means any number of additional dimensions
Output: \((\ast_1, M, \ast_2)\) where \(M=N/2\)
The formula is:
\[\text{GLU}(input) = \text{GLU}(a, b) = a \otimes \text{sigmoid}(b)\]
where input is split in half along dim to form a and b, and ⊗ is the element-wise product, so the output has half the size of the input along dim.
>>> import oneflow as flow
>>> import oneflow.nn as nn
>>> m = nn.GLU()
>>> x = flow.tensor([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=flow.float32)
>>> y = m(x)
>>> y
tensor([[0.9526, 1.9640],
        [4.9954, 5.9980]], dtype=oneflow.float32)
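
To check the formula by hand, here is a minimal pure-Python sketch of the same computation over the last dimension of a 2-D nested list. It is not OneFlow's implementation: `sigmoid` and `glu_last_dim` are hypothetical helper names introduced for illustration, and raising an error on an odd-sized dimension is an assumption of this sketch, not documented behavior.

```python
import math


def sigmoid(v: float) -> float:
    """Logistic sigmoid of a single number."""
    return 1.0 / (1.0 + math.exp(-v))


def glu_last_dim(rows):
    """Reference GLU over the last dimension of a 2-D nested list.

    Hypothetical helper for illustration only; it mirrors
    GLU(a, b) = a * sigmoid(b) with a and b being the two halves of each row.
    """
    out = []
    for row in rows:
        # Assumption of this sketch: reject rows that cannot be split evenly.
        if len(row) % 2 != 0:
            raise ValueError("size along the split dimension must be even")
        half = len(row) // 2
        a, b = row[:half], row[half:]  # first half is the value, second is the gate
        out.append([ai * sigmoid(bi) for ai, bi in zip(a, b)])
    return out


result = glu_last_dim([[1, 2, 3, 4], [5, 6, 7, 8]])
print(result)
```

For the input used in the example above, the first row splits into a = [1, 2] and b = [3, 4], so the first output element is 1 · sigmoid(3) ≈ 0.9526, which agrees with the tensor printed by `nn.GLU()`.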