oneflow.nn.RNNCell

class oneflow.nn.RNNCell(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device=None, dtype=None)

An Elman RNN cell with tanh or ReLU non-linearity.

\[h' = \tanh(W_{ih} x + b_{ih} + W_{hh} h + b_{hh})\]

If nonlinearity is 'relu', then ReLU is used in place of tanh.
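The update rule above can be sketched in plain Python for a single unbatched sample. This is only an illustration of the formula, not OneFlow's implementation; the helper name `rnn_cell_step` and the list-of-lists weight layout are assumptions made for this example.

```python
import math


def rnn_cell_step(x, h, w_ih, w_hh, b_ih, b_hh, nonlinearity="tanh"):
    """Compute h' = act(W_ih x + b_ih + W_hh h + b_hh) for one sample.

    Hypothetical helper: x and h are lists of floats, w_ih has shape
    (hidden_size, input_size) and w_hh has shape (hidden_size, hidden_size),
    both stored as lists of rows.
    """
    if nonlinearity == "tanh":
        act = math.tanh
    elif nonlinearity == "relu":
        act = lambda v: max(0.0, v)
    else:
        raise ValueError(f"unknown nonlinearity: {nonlinearity!r}")

    h_next = []
    for i in range(len(w_hh)):
        pre = b_ih[i] + b_hh[i]
        pre += sum(w * xj for w, xj in zip(w_ih[i], x))  # row i of W_ih times x
        pre += sum(w * hj for w, hj in zip(w_hh[i], h))  # row i of W_hh times h
        h_next.append(act(pre))
    return h_next


# One hidden unit, two input features, ReLU non-linearity.
h_new = rnn_cell_step([1.0, 2.0], [1.0], [[1.0, 1.0]], [[1.0]], [0.0], [0.0], "relu")
```

Feeding the returned state back in as `h` for the next input gives the recurrence over a sequence.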

The interface is consistent with PyTorch. This documentation is adapted from the PyTorch 1.10 reference: https://pytorch.org/docs/1.10/generated/torch.nn.RNNCell.html

Parameters
  • input_size – The number of expected features in the input x

  • hidden_size – The number of features in the hidden state h

  • bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True

  • nonlinearity – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'

Inputs: input, hidden
  • input: tensor containing input features

  • hidden: tensor containing the initial hidden state. Defaults to zero if not provided.

Outputs: h’
  • h’ of shape (batch, hidden_size): tensor containing the next hidden state for each element in the batch

Shape:
  • input: \((N, H_{in})\) or \((H_{in})\) tensor containing input features where \(H_{in}\) = input_size.

  • hidden: \((N, H_{out})\) or \((H_{out})\) tensor containing the initial hidden state where \(H_{out}\) = hidden_size. Defaults to zero if not provided.

  • output: \((N, H_{out})\) or \((H_{out})\) tensor containing the next hidden state.

weight_ih

the learnable input-hidden weights, of shape (hidden_size, input_size)

weight_hh

the learnable hidden-hidden weights, of shape (hidden_size, hidden_size)

bias_ih

the learnable input-hidden bias, of shape (hidden_size)

bias_hh

the learnable hidden-hidden bias, of shape (hidden_size)
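As a quick check on the shapes listed above, the following sketch tallies the learnable parameters of a cell. The function name `rnn_cell_param_count` is hypothetical and not part of OneFlow; it only applies the documented shapes.

```python
def rnn_cell_param_count(input_size, hidden_size, bias=True):
    """Return (shapes, total) for an RNN cell with the documented layout.

    Hypothetical helper for illustration; it does not query a real module.
    """
    shapes = {
        "weight_ih": (hidden_size, input_size),
        "weight_hh": (hidden_size, hidden_size),
    }
    if bias:
        shapes["bias_ih"] = (hidden_size,)
        shapes["bias_hh"] = (hidden_size,)

    total = 0
    for shape in shapes.values():
        count = 1
        for dim in shape:
            count *= dim
        total += count
    return shapes, total


# Sizes matching nn.RNNCell(10, 20): 200 + 400 + 20 + 20 parameters.
shapes, total = rnn_cell_param_count(10, 20)
```

With `bias=False` the two bias vectors are absent, so the same sizes give 600 parameters.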

Note

All the weights and biases are initialized from \(\mathcal{U}(-\sqrt{k}, \sqrt{k})\) where \(k = \frac{1}{\text{hidden\_size}}\).
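A minimal sketch of this initialization scheme using only the Python standard library follows. The helper `uniform_init` is an assumption for illustration and does not reproduce how OneFlow draws its random numbers.

```python
import math
import random


def uniform_init(rows, cols, hidden_size, rng):
    """Fill a rows x cols matrix with samples from U(-sqrt(k), sqrt(k)),
    where k = 1 / hidden_size. Hypothetical helper, not OneFlow's code."""
    bound = math.sqrt(1.0 / hidden_size)
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]


hidden_size, input_size = 20, 10
rng = random.Random(0)  # fixed seed so the sketch is repeatable
weight_ih = uniform_init(hidden_size, input_size, hidden_size, rng)
bound = math.sqrt(1.0 / hidden_size)  # about 0.2236 for hidden_size = 20
```

Note that the bound depends only on hidden_size, so weight_ih, weight_hh, and both biases share the same range.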

For example:

>>> import oneflow as flow
>>> import oneflow.nn as nn

>>> rnn = nn.RNNCell(10, 20)
>>> input = flow.randn(6, 3, 10)
>>> hx = flow.randn(3, 20)
>>> hx = rnn(input[0], hx)
>>> hx.size()
oneflow.Size([3, 20])

__init__(input_size: int, hidden_size: int, bias: bool = True, nonlinearity: str = 'tanh', device=None, dtype=None)

Initialize self. See help(type(self)) for accurate signature.

Methods

__call__(*args, **kwargs)

Call self as a function.

__delattr__(name, /)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

__ge__(value, /)

Return self>=value.

__getattr__(name)

__getattribute__(name, /)

Return getattr(self, name).

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init__(input_size, hidden_size[, bias, …])

Initialize self.

__init_subclass__

This method is called when a class is subclassed.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(**kwargs)

Create and return a new object.

__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__()

Return repr(self).

__setattr__(name, value)

Implement setattr(self, name, value).

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__

Abstract classes can override this to customize issubclass().

_apply(fn[, applied_dict])

_get_name()

_load_from_state_dict(state_dict, prefix, …)

_named_members(get_members_fn[, prefix, recurse])

_save_to_state_dict(destination, prefix, …)

_shallow_repr()

add_module(name, module)

Adds a child module to the current module.

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self.

buffers([recurse])

Returns an iterator over module buffers.

children()

Returns an iterator over immediate children modules.

cpu()

Moves all model parameters and buffers to the CPU.

cuda([device])

Moves all model parameters and buffers to the GPU.

double()

Casts all floating point parameters and buffers to double datatype.

eval()

Sets the module in evaluation mode.

extra_repr()

Sets the extra representation of the module.

float()

Casts all floating point parameters and buffers to float datatype.

forward(input[, hx])

half()

Casts all floating point parameters and buffers to half datatype.

load_state_dict(state_dict[, strict])

Copies parameters and buffers from state_dict into this module and its descendants.

modules()

Returns an iterator over all modules in the network.

named_buffers([prefix, recurse])

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

named_modules([memo, prefix])

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

named_parameters([prefix, recurse])

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

parameters([recurse])

Returns an iterator over module parameters.

register_buffer(name, tensor[, persistent])

Adds a buffer to the module.

register_forward_hook(hook)

Registers a forward hook on the module.

register_forward_pre_hook(hook)

Registers a forward pre-hook on the module.

register_parameter(name, param)

Adds a parameter to the module.

reset_parameters()

state_dict([destination, prefix, keep_vars])

Returns a dictionary containing a whole state of the module.

to([device])

Moves the parameters and buffers to the specified device.

to_consistent(*args, **kwargs)

This interface is no longer available; use oneflow.nn.Module.to_global() instead.

to_global([placement, sbp])

Convert the parameters and buffers to global.

train([mode])

Sets the module in training mode.

zero_grad([set_to_none])

Sets gradients of all model parameters to zero.