
A haphazardly implemented perceptron in C++ from scratch


The goal is to make a simple perceptron which can distinguish handwritten digits

input - 28 x 28

output - 10

middle layers - I am going with 2, should be more than enough

so that gives me about 784 neurons per middle layer; each neuron has about 784 inputs, so that means about 784 weights per neuron

that gives about 784 * 784 = 614656 weights per layer
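
A minimal sketch of those sizes as constants (INPUTS, OUTPUTS, HIDDEN_LAYERS and NEURON_PER_LAYER are the names used later in these notes; WEIGHTS_PER_LAYER is just my own name for the count):

constexpr int INPUTS           = 28 * 28; // 784 pixels per image
constexpr int OUTPUTS          = 10;      // one output per digit
constexpr int HIDDEN_LAYERS    = 2;
constexpr int NEURON_PER_LAYER = 784;     // roughly one neuron per input

// 784 weights per neuron * 784 neurons = 614656 weights in one middle layer
constexpr int WEIGHTS_PER_LAYER = NEURON_PER_LAYER * NEURON_PER_LAYER;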

gonna train them with backpropagation

//NEURON -

Y = sigma(sum(weight_i * input_i) + bias)
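
As a minimal sketch, that formula for a single neuron could look like this (the function name and signature are my own; sigma stands in for whichever activation function gets picked below):

#include <cstddef>
#include <vector>

// One neuron: y = sigma(sum_i(weight[i] * input[i]) + bias)
float evaluate_neuron(const std::vector<float>& weights,
                      const std::vector<float>& inputs,
                      float bias,
                      float (*sigma)(float))
{
  float sum = bias;
  for(std::size_t i = 0; i < weights.size(); ++i)
  {
    sum += weights[i] * inputs[i];
  }
  return sigma(sum);
}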

//ACTIVATION FUNCTION -

Probably gonna go with the ReLU activation function
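
For reference, a minimal sketch of ReLU (with sigmoid alongside as the usual alternative, since its derivative is handy for backpropagation later); these are the standard definitions, not code from the repo:

#include <algorithm>
#include <cmath>

// ReLU: max(0, x)
float relu(float x)
{
  return std::max(0.0f, x);
}

// Sigmoid: 1 / (1 + e^-x)
float sigmoid(float x)
{
  return 1.0f / (1.0f + std::exp(-x));
}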

// MIDDLE LAYER PLANNING

Gonna do a separate input neuron and separate output neuron

The 1st layer will be custom fed the input in a for loop

The following layers will each take their input from the layer before them

A separate output layer will be used to translate the output from the last layer to the desired no. of outputs

//NOTES --> Weights Implementation

For 2 hidden layers we will require three weight arrays (2 hidden layers + the output layer)

The weights will be indexed as weights[layer][neuron][weight], so the outer container will be -

std::vector<std::vector<std::vector<float>>> weights; // one entry per layer

In the first layer every neuron will contain the same no. of weights as the no. of inputs

so

std::vector<std::vector<float>> layer1_special_case(NEURON_PER_LAYER, std::vector<float>(INPUTS));
weights.push_back(layer1_special_case);

Then we can fill the rest of them, where every neuron has the same no. of weights as the no. of neurons in the previous layer

for(int i = 0; i < HIDDEN_LAYERS; ++i)
{
  // the last of these acts as the output layer, which really wants OUTPUTS neurons instead
  std::vector<std::vector<float>> temp1(NEURON_PER_LAYER, std::vector<float>(NEURON_PER_LAYER));
  weights.push_back(temp1);
}

Now we have our weights
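
They still need starting values though; a minimal sketch of filling them with small random numbers (the -0.05 to 0.05 range is an assumption, not something from these notes):

#include <random>

std::mt19937 rng(std::random_device{}());
std::uniform_real_distribution<float> dist(-0.05f, 0.05f);

for(auto& layer : weights)   // weights[layer][neuron][weight]
{
  for(auto& neuron : layer)
  {
    for(auto& w : neuron)
    {
      w = dist(rng);
    }
  }
}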

//NOTES --> Evaluating the result and setting the output neuron to that

Assumptions -->

  1. For example we have a 2 3 3 2 neural network: 2 inputs, 2 hidden layers of 3 neurons each, and 2 outputs

Notation -->

  output 1 --> o1
  output 2 --> o2
  input 1 --> i1
  input 2 --> i2
  bias per neuron --> b
  Activation function --> sigma

Every weight is signified by 3 digits following the w:

  1st) --> The layer
  2nd) --> The no. of the neuron in that layer from the top
  3rd) --> The corresponding no. of the weight within that neuron

      weights for the 1st neuron of 1st layer --> w111, w112
      weights for the 2nd neuron of 1st layer --> w121, w122
      weights for the 3rd neuron of 1st layer --> w131, w132


      weights for the 1st neuron of 2nd layer --> w211, w212, w213
      weights for the 2nd neuron of 2nd layer --> w221, w222, w223
      weights for the 3rd neuron of 2nd layer --> w231, w232, w233

      weights for the 1st neuron of 3rd layer --> w311, w312, w313
      weights for the 2nd neuron of 3rd layer --> w321, w322, w323

Value of Neuron in each layer -->

Represented by L followed by two digits

  1st) --> Gives the layer
  2nd) --> Gives the no. of the neuron in the layer from the top

      Value of neuron in Layer 1 and at position 1 --> L11
      Value of neuron in Layer 1 and at position 2 --> L12
      Value of neuron in Layer 1 and at position 3 --> L13

      Value of neuron in Layer 2 and at position 1 --> L21
      Value of neuron in Layer 2 and at position 2 --> L22
      Value of neuron in Layer 2 and at position 3 --> L23

We can see that for 2 HIDDEN_LAYERS we have to run the layer calculation 3 times (2 hidden layers + the output layer)

Assuming we have inputs --> std::vector<float> inputs{i1, i2};

then we can probably just use a for loop

Therefore, calculating the values of each layer of neurons -

      L11 = sigma(i1 * w111 + i2 * w112 + b)
      L12 = sigma(i1 * w121 + i2 * w122 + b)
      L13 = sigma(i1 * w131 + i2 * w132 + b)

      L21 = sigma(L11 * w211 + L12 * w212 + L13 * w213 + b)
      L22 = sigma(L11 * w221 + L12 * w222 + L13 * w223 + b)
      L23 = sigma(L11 * w231 + L12 * w232 + L13 * w233 + b)

      o1 = sigma(L21 * w311 + L22 * w312 + L23 * w313 + b)
      o2 = sigma(L21 * w321 + L22 * w322 + L23 * w323 + b)

std::vector<float> inputs {i1, i2}; // Given vector of inputs
std::vector<std::vector<Neuron>> layers; // Will hold one vector of Neurons per layer
for(std::size_t i = 0; i < HIDDEN_LAYERS + 1; ++i)
{
  std::vector<Neuron> temp_Neuron_layer;
  for(std::size_t j = 0; j < weights[i].size(); ++j)
  {
    Neuron temp_Neuron; // Neuron just needs a float member, e.g. struct Neuron { float value = 0.0f; };
    for(std::size_t k = 0; k < weights[i][j].size(); ++k)
    {
      if( i == 0 )
      {
        // The 1st layer is fed the raw inputs
        temp_Neuron.value += inputs[k] * weights[i][j][k];
      }
      else
      {
        // Later layers read the previous layer's neuron values
        temp_Neuron.value += layers[i-1][k].value * weights[i][j][k];
      }
    }
    // Apply the bias and the activation function: Lij = sigma(sum + b)
    temp_Neuron.value = sigma(temp_Neuron.value + b);
    temp_Neuron_layer.push_back(temp_Neuron);
  }
  layers.push_back(temp_Neuron_layer);
}
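
To turn the last layer into an actual prediction, one option (my assumption, the notes do not spell this out) is to take the index of the largest output value as the predicted digit:

// Pick the neuron with the largest value in the final layer
const std::vector<Neuron>& output_layer = layers.back();
std::size_t predicted_digit = 0;
for(std::size_t n = 1; n < output_layer.size(); ++n)
{
  if( output_layer[n].value > output_layer[predicted_digit].value )
  {
    predicted_digit = n;
  }
}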

// TODO - How the fuck do I implement backpropagation

//NOTES --> Backpropagation

Following the notation of Rojas 1996, chapter 7, backpropagation computes partial derivatives of the error function E (aka cost, aka loss)

∂E/∂w[i,j] = delta[j] * o[i] where w[i,j] is the weight of the connection between neurons i and j, j being one layer higher in the network than i, and o[i] is the output (activation) of i (in the case of the "input layer", that's just the value of feature i in the training sample under consideration). How to determine delta is given in any textbook and depends on the activation function, so I won't repeat it here.

These values can then be used in weight updates, e.g.

// update rule for vanilla online gradient descent

w[i,j] -= gamma * o[i] * delta[j]

where gamma is the learning rate.

The rule for bias weights is very similar, except that there's no input from a previous layer. Instead, bias is (conceptually) caused by input from a neuron with a fixed activation of 1. So, the update rule for bias weights is

bias[j] -= gamma_bias * 1 * delta[j]

where bias[j] is the weight of the bias on neuron j, the multiplication with 1 can obviously be omitted, and gamma_bias may be set to gamma or to a different value. If I recall correctly, lower values are preferred, though I'm not sure about the theoretical justification of that.

This is from a Stack Overflow answer

Ok finally understood it
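
A minimal sketch of those two update rules using the same weights[layer][neuron][weight] layout as above. delta, biases and previous_output are placeholders: computing delta is the part the answer skips, and the bias array and the helper are my own names, not anything from these notes:

float gamma      = 0.01f; // learning rate (assumed value)
float gamma_bias = 0.01f; // bias learning rate (assumed value)

for(std::size_t i = 0; i < weights.size(); ++i)
{
  for(std::size_t j = 0; j < weights[i].size(); ++j)
  {
    for(std::size_t k = 0; k < weights[i][j].size(); ++k)
    {
      // previous_output(i, k): activation of neuron k one layer below
      // (for i == 0 that is just inputs[k])
      weights[i][j][k] -= gamma * previous_output(i, k) * delta[i][j];
    }
    biases[i][j] -= gamma_bias * delta[i][j]; // the "* 1" is omitted
  }
}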
