Jordan Savant # Software Engineer

Neural Network Linear Algebra

This topic follows the Perceptron and Multilayer Perceptron. These notes cover the linear algebra that is specific to Neural Networks.

Vectors

In a 2-dimensional space, a vector can be drawn as an arrow: it has a direction and a magnitude. We can represent this vector as: [x y].

In a 3-dimensional space the vector has three components, one for each axis: [x y z].

In 4, 5, ... up to N-dimensional space, a vector is a single list of N components, one for each dimension.

I have worked with vectors often in game development. A 2D game may have a vector of [1.3 -7.0] which could be the direction and speed of a bullet. Together the two values give a direction in an x/y coordinate system, and the length of the vector is its magnitude, here the speed. The vector can be normalized by dividing each value by that length. Every value then falls within a range of -1 to 1 and the vector keeps only its direction, without its magnitude.
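The normalization step can be sketched as follows. This is only an illustration on plain arrays; the helper names `magnitude` and `normalize` are assumptions of mine and not part of any library.

```javascript
// Hypothetical helpers for illustration, operating on plain arrays.

// Length of a vector: square root of the sum of the squared components
function magnitude(v) {
    var sumOfSquares = 0;
    for (var i = 0; i < v.length; i++) {
        sumOfSquares += v[i] * v[i];
    }
    return Math.sqrt(sumOfSquares);
}

// Divide every component by the length to get a vector of length 1
function normalize(v) {
    var len = magnitude(v);
    if (len === 0) {
        return v.slice(); // the zero vector has no direction, return a copy
    }
    return v.map(function (x) { return x / len; });
}

var bullet = [1.3, -7.0];
var speed = magnitude(bullet);     // roughly 7.12
var direction = normalize(bullet); // roughly [0.183, -0.983]
```

The normalized vector has a length of 1, so multiplying it by any speed gives a bullet travelling in the same direction at that speed.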

Note these are other ways we can notate a vector:

[x y]

[x]  // one tall square bracket on each side
[y]

(x)  // one tall pair of parentheses
(y)

Vector Operations

1) Scalar

Each element is multiplied by the scalar and produces another vector.

[2] * 2  = [4]
[3]        [6]

2) Element Wise

Each element is added to the corresponding element of another vector and produces another vector.

[2] + [-1] = [1]
[3]   [ 5]   [8]

3) Dot Product

Two vectors are multiplied together and summed to produce a scalar.

The elements in the same row are multiplied together, and the results of each row are added together.

[2] . [-1] =  ((2 * -1) + (3 * 5)) = 13
[3]   [ 5]
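The three vector operations above can be sketched on plain arrays as follows. The function names `scale`, `addVectors` and `dot` are my own choices for illustration, and the inputs are the same numbers used in the examples above.

```javascript
// Hypothetical helpers for illustration, operating on plain arrays.

// 1) Scalar: multiply every element by s
function scale(v, s) {
    return v.map(function (x) { return x * s; });
}

// 2) Element wise: add elements in matching positions
function addVectors(a, b) {
    if (a.length !== b.length) {
        throw new Error("vectors must have the same length");
    }
    return a.map(function (x, i) { return x + b[i]; });
}

// 3) Dot product: multiply matching elements, then sum the results
function dot(a, b) {
    if (a.length !== b.length) {
        throw new Error("vectors must have the same length");
    }
    var sum = 0;
    for (var i = 0; i < a.length; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

var scaled = scale([2, 3], 2);            // [4, 6]
var added = addVectors([2, 3], [-1, 5]);  // [1, 8]
var product = dot([2, 3], [-1, 5]);       // 13
```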

Matrix

A matrix, instead of a linear list of values, is a 2-dimensional grid of values. Its size is given as rows x columns.

This is an example of a 2x3 matrix:

[a b c]
[d e f]

Here is an example of a matrix implementation in Javascript

function Matrix(rows, cols) {
    this.rows = rows;
    this.cols = cols;
    this.matrix = [];
    // Initialize our 2d array with 0
    for (var i=0; i < rows; i++) {
        this.matrix[i] = [];
        for (var j=0; j < cols; j++) {
            this.matrix[i][j] = 0;
        }
    }
}
var m = new Matrix(2, 3);

Matrix Operations

1) Scalar

Each element is multiplied by the scalar, or has the scalar added to it, and produces a matrix of the same size.

[ 2 3] * 2 = [ 4  6]
[-4 9]       [-8 18]

[ 2 3] + 2 = [ 4  5]
[-4 9]       [-2 11]

Here is an implementation of this method in our Javascript example.

// Multiplication (modifies this matrix in place)
Matrix.prototype.scalarMultiply = function (scalar) {
    for (var i=0; i < this.rows; i++) {
        for (var j=0; j < this.cols; j++) {
            this.matrix[i][j] *= scalar;
        }
    }
}
// Addition (modifies this matrix in place)
Matrix.prototype.scalarAdd = function (scalar) {
    for (var i=0; i < this.rows; i++) {
        for (var j=0; j < this.cols; j++) {
            this.matrix[i][j] += scalar;
        }
    }
}

2) Element Wise

Each element is added to the corresponding element in the other matrix, producing a matrix of the same size.

[a b] + [e f] = [a+e b+f]
[c d]   [g h]   [c+g d+h]

We can update our scalarMultiply and scalarAdd to work with other matrices.

// Multiplication
Matrix.prototype.multiply = function (n) {
    if (n instanceof Matrix) {
        for (var i=0; i < this.rows; i++) {
            for (var j=0; j < this.cols; j++) {
                this.matrix[i][j] *= n.matrix[i][j]; // element wise (Hadamard Product)
            }
        }
    } else {
        for (var i=0; i < this.rows; i++) {
            for (var j=0; j < this.cols; j++) {
                this.matrix[i][j] *= n; // scalar
            }
        }
    }
}
// Addition
Matrix.prototype.add = function (n) {
    if (n instanceof Matrix) {
        for (var i=0; i < this.rows; i++) {
            for (var j=0; j < this.cols; j++) {
                this.matrix[i][j] += n.matrix[i][j]; // element wise
            }
        }
    } else {
        for (var i=0; i < this.rows; i++) {
            for (var j=0; j < this.cols; j++) {
                this.matrix[i][j] += n; // scalar
            }
        }
    }
}
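As a quick numeric check of the scalar and element-wise cases, here is a standalone sketch on plain nested arrays. It does not use the Matrix class, so it can be run on its own, and `mapMatrix` is a hypothetical helper of mine. Unlike the methods above, which change the matrix in place, it returns a new array each time.

```javascript
// Hypothetical helper: call fn(value, i, j) for every element and
// collect the results into a new nested array of the same size.
function mapMatrix(m, fn) {
    return m.map(function (row, i) {
        return row.map(function (value, j) { return fn(value, i, j); });
    });
}

var a = [[2, 3], [-4, 9]];
var b = [[1, 0], [2, -1]];

// Scalar multiply and scalar add, same numbers as the example above
var scaled = mapMatrix(a, function (v) { return v * 2; });   // [[4, 6], [-8, 18]]
var shifted = mapMatrix(a, function (v) { return v + 2; });  // [[4, 5], [-2, 11]]

// Element-wise add and element-wise multiply (Hadamard Product)
var summed = mapMatrix(a, function (v, i, j) { return v + b[i][j]; });    // [[3, 3], [-2, 8]]
var hadamard = mapMatrix(a, function (v, i, j) { return v * b[i][j]; });  // [[2, 0], [-8, -9]]
```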

4) Dot Product

A dot product can only be calculated on two matrices with compatible sizes. They do not need to be the same size, but the number of columns of the first must equal the number of rows of the second.

A = [a b c]  2x3
    [d e f]

B = [g h]  3x2
    [i j]
    [k l]

A . B

These two matrices can be dot product'ed. But the operation is not commutative, meaning the two orders do not in general produce the same result, A . B != B . A.

The resulting matrix will always have the number of rows of A by the number of columns of B. A . B = 2x2 Matrix.

To calculate the values of the resultant 2x2 matrix we take the Vector Dot Product of each row of A with each column of B.

  • Row 1 in A dot Column 1 in B for position 1,1.
  • Row 1 in A dot Column 2 in B for position 1,2.
  • Row 2 in A dot Column 1 in B for position 2,1.
  • Row 2 in A dot Column 2 in B for position 2,2.

A = [a b c]  2x3
    [d e f]

B = [g h]  3x2
    [i j]
    [k l]

A . B = C

C = [p11 p12]
    [p21 p22]

C = [ (a*g + b*i + c*k) (a*h + b*j + c*l) ]
    [ (d*g + e*i + f*k) (d*h + e*j + f*l) ]

We can expand our Javascript example Matrix to have a dot product operation.

Matrix.prototype.dot = function (n) {

    if (n instanceof Matrix) {
        // check size
        if (this.cols !== n.rows) {
            return undefined;
        }

        let result = new Matrix(this.rows, n.cols);
        for (let i=0; i < result.rows; i++) {
            for (let j=0; j < result.cols; j++) {
                // dot product of values in col
                let sum = 0;
                for (let k=0; k < this.cols; k++) {
                    sum += this.matrix[i][k] * n.matrix[k][j];
                }
                result.matrix[i][j] = sum;
            }
        }
        return result;
    } else {
        return undefined;
    }
}
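To check the row-by-column rule with real numbers, here is a standalone sketch on plain nested arrays. The helpers `dotVectors`, `column` and `matMul` are hypothetical names of mine, and the values in A and B are made up for the example. Like the method above, `matMul` returns undefined when the sizes are not compatible.

```javascript
// Hypothetical helpers for illustration, operating on plain nested arrays.

// Vector dot product: multiply matching elements and sum them
function dotVectors(a, b) {
    var sum = 0;
    for (var i = 0; i < a.length; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Pull column j out of a matrix as a flat array
function column(m, j) {
    return m.map(function (row) { return row[j]; });
}

// Position i,j of the result is row i of a dot column j of b
function matMul(a, b) {
    if (a[0].length !== b.length) {
        return undefined; // columns of a must equal rows of b
    }
    return a.map(function (row) {
        return b[0].map(function (_, j) {
            return dotVectors(row, column(b, j));
        });
    });
}

var A = [[1, 2, 3],
         [4, 5, 6]];      // 2x3
var B = [[7, 8],
         [9, 10],
         [11, 12]];       // 3x2

var AB = matMul(A, B);    // 2x2: [[58, 64], [139, 154]]
var BA = matMul(B, A);    // 3x3: [[39, 54, 69], [49, 68, 87], [59, 82, 105]]
```

Here A . B is a 2x2 matrix while B . A is a 3x3 matrix, so in this case the two orders do not even produce results of the same size.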

5) Transpose

Another important operation for matrices in neural networks is the transpose operation. It basically makes the rows the columns and columns the rows.

[a b c]
[d e f]

transposed becomes:

[a d]
[b e]
[c f]

And we can implement it in our Javascript Matrix as:

Matrix.prototype.transpose = function() {
    let result = new Matrix(this.cols, this.rows);
    for (var i=0; i < this.rows; i++) {
        for (var j=0; j < this.cols; j++) {
            result.matrix[j][i] = this.matrix[i][j];
        }
    }
    return result;
}
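The same operation can be checked on a plain nested array. This is a standalone sketch with made-up values, and the free function `transpose` is a hypothetical stand-in for the method above, written so it runs without the Matrix class.

```javascript
// Hypothetical helper: element at row i, column j moves to row j, column i
function transpose(m) {
    return m[0].map(function (_, j) {
        return m.map(function (row) { return row[j]; });
    });
}

var original = [[1, 2, 3],
                [4, 5, 6]];           // 2x3

var flipped = transpose(original);    // 3x2: [[1, 4], [2, 5], [3, 6]]
var restored = transpose(flipped);    // 2x3 again: [[1, 2, 3], [4, 5, 6]]
```

Transposing twice gives back the original matrix, which is a handy sanity check for an implementation.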