# Stochastic Gradient Descent in AD

In stochastic gradient descent, the true gradient is approximated by the gradient at a single training example. As the algorithm sweeps through the training set, it performs the above update for each training example. Several passes can be made over the training set until the algorithm converges; if this is done, the data can be shuffled before each pass to prevent cycles.
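For reference, writing the loss at the single example $i$ as $Q_i(\theta)$ and the learning rate as $\eta$, the per-example update is the standard SGD step:

$$\theta \leftarrow \theta - \eta \, \nabla_{\theta} Q_i(\theta)$$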

It is faster than batch gradient descent, because in stochastic gradient descent we don't have to compute the cost function over the entire data set in each iteration.

## `stochasticGradientDescent` in AD:

This is my implementation of stochastic gradient descent in the AD library; you can get it from my fork of AD.

Its type signature is
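A plausible shape for that signature, mirroring `ad`'s existing `gradientDescent` and the `errorSingle` type in the argument list below (a sketch, not necessarily character-for-character what the fork exports):

```haskell
stochasticGradientDescent
  :: (Traversable f, Fractional a, Ord a)
  => (forall s. Reifies s Tape => f (Scalar a) -> f (Reverse s a) -> Reverse s a)
  -> [f (Scalar a)]  -- training data, one f-shaped sample per element
  -> f a             -- initial theta
  -> [f a]           -- lazy stream of successively updated thetas
```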

#### Its arguments are:

• `errorSingle :: (forall s. Reifies s Tape => f (Scalar a) -> f (Reverse s a) -> Reverse s a)`, a function that computes the error on a single training sample given `theta`
• the entire training data; you should be able to map the above `errorSingle` function over it
• and the initial `theta`

## Example:

Here is the sample data I’m running `stochasticGradientDescent` on.

It's just 97 rows of samples with two columns; the first column is `y` and the second is `x`.
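For illustration, a minimal loader for a file of that shape (hypothetical helpers, assuming comma-separated `y,x` rows):

```haskell
-- Parse one "y,x" row into [y, x]; commas are turned into spaces so the
-- standard `words`/`read` machinery can split and convert the fields.
parseRow :: String -> [Double]
parseRow = map read . words . map (\c -> if c == ',' then ' ' else c)

-- Read the whole sample file into a list of [y, x] rows.
loadSamples :: FilePath -> IO [[Double]]
loadSamples path = map parseRow . lines <$> readFile path
```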

Below is our error function, a simple squared loss error function. You can introduce regularization here if you want.
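A concrete sketch of such a squared-loss `errorSingle` (hypothetical parameterisation: one sample as `[y, x]`, `theta` as `[intercept, slope]`; kept polymorphic in `Num` for testing at plain `Double`. In the actual `ad` call the sample's components would be lifted with `auto`, but the arithmetic is the same):

```haskell
-- Squared loss for one sample: (prediction - y)^2, where prediction = t0 + t1 * x.
errorSingle :: Num a => [a] -> [a] -> a
errorSingle [y, x] [t0, t1] = (t0 + t1 * x - y) ^ (2 :: Int)
errorSingle _      _        = error "errorSingle: expected [y, x] and [t0, t1]"
```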

Running Stochastic Gradient Descent:
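Given the argument list above, the call is simply `stochasticGradientDescent errorSingle samples theta0`, which yields a lazy stream of thetas. To make the per-sample update concrete, here is a hand-rolled sketch of the same loop for the squared loss, with the gradients written out by hand and the learning rate fixed at 0.001 as in the library (an illustration only, not the `ad`-based code):

```haskell
-- Hand-derived SGD step for the squared loss (t0 + t1*x - y)^2:
--   d/dt0 = 2 * (t0 + t1*x - y)
--   d/dt1 = 2 * x * (t0 + t1*x - y)
sgdStep :: Double -> [Double] -> [Double] -> [Double]
sgdStep eta [t0, t1] [y, x] =
  let r = t0 + t1 * x - y                      -- residual for this sample
  in  [t0 - eta * 2 * r, t1 - eta * 2 * x * r]
sgdStep _ _ _ = error "sgdStep: expected [t0, t1] and [y, x]"

-- One pass over the data, emitting theta after every sample,
-- mirroring the stream of thetas the library returns.
sgdPass :: Double -> [Double] -> [[Double]] -> [[Double]]
sgdPass eta theta0 = tail . scanl (sgdStep eta) theta0
```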

## Cross checking with SGDRegressor from scikit-learn

The only restriction in our implementation of `stochasticGradientDescent` is that the learning rate is fixed at a default value of 0.001 and stays constant throughout the algorithm.

Everything else, such as the kind of regularization, the regularization parameter, and the loss function we are using, can be specified in `errorSingle`.
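For instance, L2 regularization can be folded straight into the error function (a hypothetical variant of a squared-loss `errorSingle`, with sample `[y, x]`, theta `[t0, t1]`, and penalty strength `lam`):

```haskell
-- Squared loss plus an L2 penalty on the slope; lam is the regularization
-- strength. The intercept t0 is left unpenalized, a common convention.
errorSingleL2 :: Num a => a -> [a] -> [a] -> a
errorSingleL2 lam [y, x] [t0, t1] = (t0 + t1 * x - y) ^ (2 :: Int) + lam * t1 * t1
errorSingleL2 _   _      _        = error "errorSingleL2: expected [y, x] and [t0, t1]"
```

Partially applying the penalty strength, e.g. `errorSingleL2 0.1`, gives back a two-argument function of the shape the solver expects.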

## Results:

So when `n_iter = 1`, scikit-learn went through the entire data set once, so we must check the `97th` theta from our AD regression results. Similarly, `n_iter = 2` corresponds to iteration `97*2` in our implementation, and so on.
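Picking the theta to compare against a given `n_iter` is a one-liner (hypothetical helper; `m` is the number of samples, 97 here, and the stream of thetas holds one entry per processed sample, 0-indexed):

```haskell
-- Theta after n full passes over m samples: element n*m - 1 of the stream.
thetaAfterPasses :: Int -> Int -> [[Double]] -> [Double]
thetaAfterPasses n m thetas = thetas !! (n * m - 1)
```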

In this repository, you can find the IPython notebook and Haskell code so that you can test these yourself.