Recurrent Neural Networks and Language Models

Equivalence of Heaviside RNNs and Deterministic PFSAs

The hidden states of a Heaviside RNN live in \( \mathbb{B}^\hiddDim \) and can thus take \( 2^\hiddDim \) different values. This invites an interpretation of \( \hiddState \) as the state of an underlying FSA that transitions between states based on the Heaviside RNN's recurrence and specifies its local conditional distributions with the output matrix \( \outMtx \).
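
To make this interpretation concrete, the following sketch reads off a PFSA from a Heaviside RNN by enumerating all \( 2^\hiddDim \) binary hidden states. It assumes an Elman-style recurrence \( \hiddState' = H(U \hiddState + V e(y) + b) \) and a next-symbol distribution \( \mathrm{softmax}(\outMtx \hiddState) \); the parameter names `U`, `V`, `b`, `E` and this exact parametrization are illustrative assumptions, not the paper's formal definitions.

```python
import itertools

import numpy as np


def heaviside(x):
    return (x > 0).astype(float)


def softmax(z):
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()


def extract_pfsa(U, V, b, E, num_symbols):
    """Read off a PFSA from a Heaviside RNN: every binary hidden state is a
    PFSA state, the recurrence defines a deterministic transition for each
    symbol, and softmax(E h) gives that state's local conditional distribution."""
    D = U.shape[0]
    states = list(itertools.product([0, 1], repeat=D))
    index = {s: i for i, s in enumerate(states)}
    transitions, emissions = {}, {}
    for i, s in enumerate(states):
        h = np.array(s, dtype=float)
        emissions[i] = softmax(E @ h)                        # p(y | state i)
        for y in range(num_symbols):
            h_next = heaviside(U @ h + V @ np.eye(num_symbols)[y] + b)
            transitions[(i, y)] = index[tuple(int(v) for v in h_next)]
    return transitions, emissions
```

Note that the resulting automaton is deterministic by construction: for a fixed hidden state and input symbol, the recurrence yields exactly one successor state.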

Similarly, one can imagine designing a Heaviside RNN that simulates the transitions of a given FSA by appropriately specifying the RNN's parameters. We explore this connection formally in this section and present the main technical result of the paper. The central result characterizing the representational capacity of Heaviside RNNs can be informally summarized by the following theorem.
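
The converse direction can be illustrated with a Minsky-style construction in which hidden units index (state, symbol) pairs. The sketch below is a minimal illustration of this idea, again assuming the recurrence \( \hiddState' = H(U \hiddState + V e(y) + b) \); the parameter names and the hidden size \( |Q| \cdot |\Sigma| \) are assumptions for exposition and need not match the construction used in the formal proof, which additionally has to handle the probability weights via the output matrix.

```python
import numpy as np


def heaviside(x):
    return (x > 0).astype(float)


def rnn_from_dfa(num_states, num_symbols, delta):
    """Build U, V, b so that h' = H(U h + V onehot(y) + b) tracks the DFA.
    Hidden units are indexed by pairs (q, y): unit (q, y) is active iff the
    DFA has just read y and thereby moved into state q."""
    D = num_states * num_symbols
    idx = lambda q, y: q * num_symbols + y
    U = np.zeros((D, D))
    V = np.zeros((D, num_symbols))
    b = -1.5 * np.ones(D)           # a unit fires only if both conditions hold
    for q in range(num_states):
        for y in range(num_symbols):
            q_next = delta[(q, y)]
            V[idx(q_next, y), y] = 1.0                 # symbol y is read, and
            for y_prev in range(num_symbols):          # the current state is q
                U[idx(q_next, y), idx(q, y_prev)] = 1.0
    return U, V, b


# Toy DFA over {0, 1} tracking the parity of 1s: state 0 = even, 1 = odd.
delta = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
U, V, b = rnn_from_dfa(2, 2, delta)
h = np.zeros(4)
h[0] = 1.0                          # start in state 0 (any unit (0, *) works)
for y in [1, 0, 1, 1]:
    h = heaviside(U @ h + V @ np.eye(2)[y] + b)
print(int(np.argmax(h)) // 2)       # current DFA state: 1 (odd number of 1s)
```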

Theorem (Informal): Heaviside RNN LMs are equivalent to deterministic PFSAs.

We split this result into two questions:

  • how deterministic PFSAs can simulate Heaviside RNN LMs, and
  • how Heaviside RNN LMs can simulate deterministic PFSAs.