RNN
$$ h^{(t)} = \sigma( W^{hx} x^{(t)} + W^{hh}h^{(t-1)} + b_h ) $$
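A minimal sketch of this recurrence in plain Python, assuming $\sigma$ is the logistic sigmoid. The names `rnn_step` and `matvec` and the toy weights are hypothetical, chosen only for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def matvec(W, v):
    """Matrix-vector product, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(x_t, h_prev, W_hx, W_hh, b_h):
    """h^(t) = sigma(W^hx x^(t) + W^hh h^(t-1) + b_h)."""
    from_input = matvec(W_hx, x_t)
    from_state = matvec(W_hh, h_prev)
    return [sigmoid(a + b + c) for a, b, c in zip(from_input, from_state, b_h)]

# Hypothetical toy sizes: 2 inputs, 3 hidden units.
W_hx = [[0.1, -0.2], [0.0, 0.3], [0.5, 0.1]]
W_hh = [[0.2, 0.0, -0.1], [0.0, 0.1, 0.0], [-0.3, 0.2, 0.1]]
b_h = [0.0, 0.1, -0.1]

h = [0.0, 0.0, 0.0]  # initial state h^(0)
for x_t in [[1.0, 2.0], [0.5, -1.0]]:
    # The same weights are reused at every time step.
    h = rnn_step(x_t, h, W_hx, W_hh, b_h)
```

The only thing carried from one step to the next is `h`, which is why the state has to summarize everything seen so far.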
LSTM
$$ g^{(t)} = \Phi(W^{gx} x^{(t)} + W^{gh}h^{(t-1)} + b_g ) $$
$$ i^{(t)} = \sigma(W^{ix} x^{(t)} + W^{ih}h^{(t-1)} + b_i ) $$
$$ f^{(t)} = \sigma(W^{fx} x^{(t)} + W^{fh}h^{(t-1)} + b_f ) $$
$$ o^{(t)} = \sigma(W^{ox} x^{(t)} + W^{oh}h^{(t-1)} + b_o ) $$
$$ s^{(t)} = g^{(t)} \odot i^{(t)} + s^{(t-1)} \odot f^{(t)} $$
$$ h^{(t)} = \Phi (s^{(t)}) \odot o^{(t)} $$
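The gates $i$, $f$, and $o$ have the same form as $g$ (an affine function of $x^{(t)}$ and $h^{(t-1)}$ passed through a nonlinearity): each contains an $h^{(t-1)}$ term, so each carries memory of the previous state.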
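The LSTM update can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a reference implementation: the gates $i$, $f$, $o$ are taken to be logistic sigmoids, and $\Phi$ on $g$ and $s$ is taken to be $\tanh$, which is the usual choice. The names `lstm_step`, `affine`, `const`, and the `params` layout are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def matvec(W, v):
    """Matrix-vector product, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def affine(W_x, x, W_h, h, b):
    """W_x x + W_h h + b: the pre-activation shared by g, i, f, and o."""
    return [a + c + d for a, c, d in zip(matvec(W_x, x), matvec(W_h, h), b)]

def lstm_step(x_t, h_prev, s_prev, params):
    """One LSTM step. params maps 'g', 'i', 'f', 'o' to (W_x, W_h, b)."""
    pre = {k: affine(W_x, x_t, W_h, h_prev, b)
           for k, (W_x, W_h, b) in params.items()}
    g = [math.tanh(a) for a in pre["g"]]  # candidate input (assumed tanh)
    i = [sigmoid(a) for a in pre["i"]]    # input gate (assumed sigmoid)
    f = [sigmoid(a) for a in pre["f"]]    # forget gate (assumed sigmoid)
    o = [sigmoid(a) for a in pre["o"]]    # output gate (assumed sigmoid)
    # s^(t) = g * i + s^(t-1) * f, elementwise
    s = [gk * ik + sk * fk for gk, ik, sk, fk in zip(g, i, s_prev, f)]
    # h^(t) = tanh(s^(t)) * o, elementwise
    h = [math.tanh(sk) * ok for sk, ok in zip(s, o)]
    return h, s

def const(rows, cols, value):
    """A rows x cols matrix filled with one value (toy initialisation)."""
    return [[value] * cols for _ in range(rows)]

# Hypothetical toy sizes: 1 input, 2 hidden units.
n_in, n_hid = 1, 2
params = {k: (const(n_hid, n_in, 0.1), const(n_hid, n_hid, 0.1), [0.0] * n_hid)
          for k in "gifo"}

h, s = [0.0] * n_hid, [0.0] * n_hid
for x_t in [[1.0], [-1.0], [0.5]]:
    h, s = lstm_step(x_t, h, s, params)
```

Two states are carried across steps here: the internal state `s`, which is updated additively, and the output `h`, which feeds back into all four pre-activations.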
GRU
$$ z^{(t)} = \sigma(W^{zx} x^{(t)} + W^{zh}h^{(t-1)} ) $$
$$ r^{(t)} = \sigma(W^{rx} x^{(t)} + W^{rh}h^{(t-1)} ) $$
$$ \tilde{h}^{(t)} = \tanh (W^{hx} x^{(t)} + W^{hh} (r^{(t)} \odot h^{(t-1)})) $$
$$ h^{(t)} = (1-z^{(t)}) \odot h^{(t-1)} + z^{(t)} \odot \tilde{h}^{(t)} $$
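A minimal plain-Python sketch of the four GRU equations above, with $\sigma$ assumed to be the logistic sigmoid and no bias terms, as written. The function name `gru_step`, the helper names, and the toy weights are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def matvec(W, v):
    """Matrix-vector product, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def gru_step(x_t, h_prev, W_zx, W_zh, W_rx, W_rh, W_hx, W_hh):
    """One GRU step following the equations for z, r, h-tilde, and h."""
    z = [sigmoid(a) for a in add(matvec(W_zx, x_t), matvec(W_zh, h_prev))]  # update gate
    r = [sigmoid(a) for a in add(matvec(W_rx, x_t), matvec(W_rh, h_prev))]  # reset gate
    # The reset gate scales h^(t-1) before it enters the candidate state.
    gated = [rk * hk for rk, hk in zip(r, h_prev)]
    h_tilde = [math.tanh(a) for a in add(matvec(W_hx, x_t), matvec(W_hh, gated))]
    # h^(t) = (1 - z) * h^(t-1) + z * h_tilde, elementwise
    return [(1.0 - zk) * hk + zk * htk for zk, hk, htk in zip(z, h_prev, h_tilde)]

def const(rows, cols, value):
    """A rows x cols matrix filled with one value (toy initialisation)."""
    return [[value] * cols for _ in range(rows)]

# Hypothetical toy sizes: 1 input, 2 hidden units.
n_in, n_hid = 1, 2
W_x = const(n_hid, n_in, 0.1)
W_h = const(n_hid, n_hid, 0.1)

h = [0.0] * n_hid
for x_t in [[1.0], [-1.0], [0.5]]:
    h = gru_step(x_t, h, W_x, W_h, W_x, W_h, W_x, W_h)
```

Because $z^{(t)}$ lies in $(0, 1)$, each component of the new state is a convex combination of the old state and the candidate, so a single gate decides how much to keep and how much to overwrite.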