*This is inspired from Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy by Andrej Karpathy.*

*The blog post updated in December, 2017 based on feedback from @AlexSherstinsky; Thanks!*

This is a simple implementation of Long short-term memory (LSTM) module on numpy from scratch. This is for learning purposes. The network is trained with stochastic gradient descent with a batch size of 1 using AdaGrad algorithm (with momentum).

You can download the jupyter notebook from http://blog.varunajayasiri.com/ml/numpy_lstm.ipynb

The model usually reaches an error of about 45 after 5000 iterations when tested with 100,000 character sample from Shakespeare. However it sometimes get stuck in a local minima; reinitialize the weights if this happens.

You need to place the input text file as `input.txt` in the same folder as the python code.