tensornn.activation

This module contains the activation functions of TensorNN. Activation functions transform their input to introduce non-linearity into the network, which lets it handle more complex problems. They behave much like a layer: each one has a forward pass and a derivative that is used for backpropagation.

Classes

Activation

Base activation class.

ELU

Exponential linear unit is similar to ReLU, but negative inputs follow a smooth exponential curve instead of being set to 0.

LeakyReLU

Leaky ReLU is extremely similar to ReLU.

LecunTanh

The LeCun Tanh function is a scaled version of the tanh function, such that LecunTanh(1) = 1 and LecunTanh(-1) = -1.

NewtonsSerpentine

NOTE: THIS IS NOT A GOOD CANDIDATE.

NoActivation

Linear activation function, doesn't change anything.

ReLU

The rectified linear unit activation function is one of the simplest activation functions.

Sigmoid

The sigmoid function's output is always between 0 and 1. Formula: 1 / (1+e^(-x)) | constants: e(Euler's number, 2.718...)

Softmax

The softmax activation function is most commonly used in the output layer.

Swish

The swish activation function is the output of the sigmoid function multiplied by x.

Tanh

The tanh function is similar to the sigmoid function, but its output is always between -1 and 1 instead of between 0 and 1.

class tensornn.activation.Activation

Bases: ABC, TensorNNObject

Base activation class. All activation classes should inherit from this.

abstractmethod derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

abstractmethod forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function
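The two abstract methods above define the whole contract of an activation. The sketch below mirrors that contract with a self-contained stand-in class, so it runs without TensorNN installed. `ActivationSketch` and `Square` are hypothetical names invented for illustration, and plain Python lists stand in for `Tensor`; a real subclass would inherit from `tensornn.activation.Activation` instead.

```python
from abc import ABC, abstractmethod
from typing import List


class ActivationSketch(ABC):
    """Hypothetical stand-in for the documented interface, not the tensornn class."""

    @abstractmethod
    def forward(self, inputs: List[float]) -> List[float]:
        """Apply the activation to the outputs of the previous layer."""

    @abstractmethod
    def derivative(self, inputs: List[float]) -> List[float]:
        """Slope of the activation at each input, used for backpropagation."""


class Square(ActivationSketch):
    """Toy activation f(x) = x^2, chosen only because its derivative is easy to check."""

    def forward(self, inputs: List[float]) -> List[float]:
        return [v * v for v in inputs]

    def derivative(self, inputs: List[float]) -> List[float]:
        return [2 * v for v in inputs]


act = Square()
x = [1.0, -2.0, 3.0]
print(act.forward(x))     # element-wise squares
print(act.derivative(x))  # element-wise 2*x
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated.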

class tensornn.activation.ELU(a: float = 1)

Bases: Activation

Exponential linear unit is similar to ReLU, but negative inputs follow a smooth exponential curve instead of being set to 0. Formula: if x>=0, x; if x<0, A*((e^x)-1) | constants: A, e(Euler’s number, 2.718…)

Ex, A=1: [12.319, -91.3, 0.132] -> [12.319, -1, 0.132]
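A minimal sketch of the behaviour shown in the example row, written in plain Python rather than with TensorNN tensors. It assumes the usual ELU definition, in which non-negative inputs pass through unchanged and only negative inputs use A*((e^x)-1); the function names `elu` and `elu_derivative` are illustrative, not part of the library.

```python
import math


def elu(x: float, a: float = 1.0) -> float:
    # Non-negative inputs pass through; negative inputs follow a*(e^x - 1).
    return x if x >= 0 else a * (math.exp(x) - 1.0)


def elu_derivative(x: float, a: float = 1.0) -> float:
    # Slope is 1 on the non-negative side and a*e^x on the negative side.
    return 1.0 if x >= 0 else a * math.exp(x)


values = [12.319, -91.3, 0.132]
outputs = [elu(v) for v in values]
print(outputs)  # [12.319, -1.0, 0.132]
```

For very negative inputs the output saturates at -A, which is why -91.3 maps to -1 when A=1.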

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.LeakyReLU(a: float = 0.1)

Bases: Activation

Leaky ReLU is extremely similar to ReLU. ReLU is LeakyReLU with A set to 0. Formula: if x>=0, x; if x<0, Ax | constants: A(leak)

Ex, A=0.1: [12.319, -91.3, 0.132] -> [12.319, -9.13, 0.132]
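The piecewise formula can be sketched in a few lines of plain Python. The names `leaky_relu` and `leaky_relu_derivative` are illustrative and not part of TensorNN; the slope at exactly x=0 follows the x>=0 branch of the formula.

```python
def leaky_relu(x: float, a: float = 0.1) -> float:
    # Non-negative inputs pass through; negative inputs are scaled by the leak a.
    return x if x >= 0 else a * x


def leaky_relu_derivative(x: float, a: float = 0.1) -> float:
    # Slope is 1 for x >= 0 and a for x < 0.
    return 1.0 if x >= 0 else a


values = [12.319, -91.3, 0.132]
print([leaky_relu(v) for v in values])       # roughly [12.319, -9.13, 0.132]
print([leaky_relu(v, a=0.0) for v in values])  # a leak of 0 behaves like ReLU
```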

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.LecunTanh

Bases: Activation

The LeCun Tanh function is a scaled version of the tanh function, such that LecunTanh(1) = 1 and LecunTanh(-1) = -1.
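The page does not state the scaling constants, so the sketch below assumes the commonly cited form 1.7159 * tanh(2x/3), which satisfies the property described above to within rounding. The constants and the function names are assumptions, not taken from TensorNN.

```python
import math

# Assumed constants for the scaled tanh; not confirmed by this page.
SCALE = 1.7159
SLOPE = 2.0 / 3.0


def lecun_tanh(x: float) -> float:
    return SCALE * math.tanh(SLOPE * x)


def lecun_tanh_derivative(x: float) -> float:
    # d/dx [SCALE * tanh(SLOPE * x)] = SCALE * SLOPE * (1 - tanh^2(SLOPE * x))
    t = math.tanh(SLOPE * x)
    return SCALE * SLOPE * (1.0 - t * t)


print(lecun_tanh(1.0))   # very close to 1
print(lecun_tanh(-1.0))  # very close to -1
```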

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.NewtonsSerpentine(a: float = 1, b: float = 1)

Bases: Activation

NOTE: THIS IS NOT A GOOD CANDIDATE. Once an input is larger in magnitude than A, the output shrinks toward 0 as the input grows, so a large value does not carry more importance. Do not use unless you want to have some fun ;)

Formula: (A*B*x)/(x^2+A^2) | A, B constants

Ex, A=1,B=1: [12.319, -91.3, 0.132] -> [0.08064402, -0.01095159, 0.12973942]

https://mathworld.wolfram.com/SerpentineCurve.html
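A short sketch of the formula above in plain Python, together with its derivative obtained from the quotient rule, (A*B*(A^2 - x^2)) / (x^2 + A^2)^2. The function names are illustrative and not part of TensorNN.

```python
def serpentine(x: float, a: float = 1.0, b: float = 1.0) -> float:
    # Formula from the docstring: (A*B*x) / (x^2 + A^2)
    return (a * b * x) / (x * x + a * a)


def serpentine_derivative(x: float, a: float = 1.0, b: float = 1.0) -> float:
    # Quotient rule: A*B*(A^2 - x^2) / (x^2 + A^2)^2
    denom = x * x + a * a
    return a * b * (a * a - x * x) / (denom * denom)


values = [12.319, -91.3, 0.132]
print([serpentine(v) for v in values])
# The derivative changes sign at |x| = A, which is where the output starts to shrink.
print(serpentine_derivative(0.5), serpentine_derivative(2.0))
```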

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.NoActivation

Bases: Activation

Linear activation function, doesn’t change anything. Use this if you don’t want an activation.

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.ReLU

Bases: Activation

The rectified linear unit activation function is one of the simplest activation functions. It is a piecewise function. Formula: if x>=0, x; if x<0, 0

Ex: [12.319, -91.3, 0.132] -> [12.319, 0, 0.132]
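The piecewise formula translates directly into plain Python. The names `relu` and `relu_derivative` are illustrative, and treating the slope at exactly x=0 as 1 is an assumption that follows the x>=0 branch of the formula; the library may choose differently.

```python
def relu(x: float) -> float:
    # Non-negative inputs pass through; negative inputs become 0.
    return x if x >= 0 else 0.0


def relu_derivative(x: float) -> float:
    # Slope is 1 where the input passes through and 0 where it is clipped.
    return 1.0 if x >= 0 else 0.0


values = [12.319, -91.3, 0.132]
print([relu(v) for v in values])             # [12.319, 0.0, 0.132]
print([relu_derivative(v) for v in values])  # [1.0, 0.0, 1.0]
```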

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.Sigmoid

Bases: Activation

The sigmoid function’s output is always between 0 and 1. Formula: 1 / (1+e^(-x)) | constants: e(Euler’s number, 2.718…)

Ex: [12.319, -91.3, 0.132] -> [9.99995534e-01, 2.23312895e-40, 5.32952167e-01]
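A plain-Python sketch of the formula and its derivative, sigmoid(x) * (1 - sigmoid(x)). The two-branch form below is one common way to avoid overflow for very negative inputs; it is an illustrative choice, not necessarily how TensorNN computes it, and the function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    # Both branches equal 1 / (1 + e^(-x)); each only exponentiates a
    # non-positive number, so math.exp cannot overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_derivative(x: float) -> float:
    s = sigmoid(x)
    return s * (1.0 - s)


values = [12.319, -91.3, 0.132]
print([sigmoid(v) for v in values])
print(sigmoid_derivative(0.0))  # 0.25, the steepest point of the curve
```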

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.Softmax

Bases: Activation

The softmax activation function is most commonly used in the output layer. If you are using this activation function, you should be using tnn.CategoricalCrossEntropy as your loss function. This is because softmax always produces a probability distribution (every value is between 0 and 1, and the values sum to 1), and tnn.CategoricalCrossEntropy is the loss function best suited to that kind of output.

The goal of softmax is to convert the predicted values of the network into percentages that add up to 1. For example, it converts [-1.42, 3.312, 0.192] to [0.00837, 0.94970, 0.04194], which is much easier to understand.

When designing this function, a big problem is negative numbers, since the final output cannot contain any. So how do we get rid of them? Do we clip them to 0? Do we square them? Do we use the absolute value? Though all these methods seem nice, they throw away what the negative numbers mean. If we clip to 0, every negative number becomes just 0, and squaring or taking the absolute value gives the opposite of what we want (a large negative number turns into a large positive number). The most effective way is exponentiation: after exponentiation, negative numbers become small positive values while positive numbers become large.

But exponentiation raises a new problem: very large numbers, which can cause overflow. Fortunately there is a simple solution. We can make all the values non-positive before exponentiation by subtracting the maximum value of our output from each value. This way our values before exponentiation range from -inf to 0, and our values after exponentiation range from 0 (e^-inf) to 1 (e^0).

Finally, to get the percentages, we work out what fraction of the total each value contributes: we divide each value by the sum of all the values.

All steps/TLDR:

Starting values (from the previous example): [-1.42, 3.312, 0.192]

Subtract the largest value so that no value is positive: 3.312 is the max, so subtract it from all values, [-4.732, 0, -3.120]

Exponentiate, raising e to each value (e^x): [0.0088088, 1, 0.04415717]

Convert to percentages by dividing each number by the sum: the sum is 1.0529660, so we divide each value by it, [0.00836574, 0.94969828, 0.04193599]
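The steps above can be sketched in plain Python as follows. This is an illustration of the described procedure on a single list of values, not the TensorNN implementation; the function name `softmax` here is a local helper.

```python
import math
from typing import List


def softmax(values: List[float]) -> List[float]:
    # Step 1: subtract the maximum so every exponent is <= 0 (avoids overflow).
    largest = max(values)
    shifted = [v - largest for v in values]
    # Step 2: exponentiate; every result is now at most 1.
    exps = [math.exp(s) for s in shifted]
    # Step 3: divide by the sum so the outputs add up to 1.
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([-1.42, 3.312, 0.192])
print([round(p, 5) for p in probs])  # [0.00837, 0.9497, 0.04194]

# Without the shift in step 1, math.exp(1001.0) would raise OverflowError.
print(softmax([1000.0, 1001.0]))
```

Subtracting the maximum does not change the result, because the common factor e^(-max) cancels between the numerator and the sum.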

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.Swish

Bases: Activation

The swish activation function is the output of the sigmoid function multiplied by x. Formula: x / (1+e^(-x)) | constants: e(Euler’s number, 2.718…)

Ex: [12.319, -91.3, 0.132] -> [1.23189450e+01, -2.03884673e-38, 7.03496861e-02]
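A plain-Python sketch of the formula, with the derivative obtained from the product rule: sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x)). The helper names are illustrative and not part of TensorNN, and the overflow-safe sigmoid is an implementation choice made for this sketch.

```python
import math


def sigmoid(x: float) -> float:
    # Overflow-safe form of 1 / (1 + e^(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x: float) -> float:
    # x * sigmoid(x), equivalent to x / (1 + e^(-x))
    return x * sigmoid(x)


def swish_derivative(x: float) -> float:
    # Product rule on x * sigmoid(x)
    s = sigmoid(x)
    return s + x * s * (1.0 - s)


values = [12.319, -91.3, 0.132]
print([swish(v) for v in values])
```

Unlike ReLU, the output for small negative inputs is slightly below 0 rather than exactly 0.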

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function

class tensornn.activation.Tanh

Bases: Activation

The tanh function is similar to the sigmoid function, but its output is always between -1 and 1 instead of between 0 and 1.
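The similarity to sigmoid can be made concrete: tanh(x) = 2 * sigmoid(2x) - 1, so tanh is a sigmoid that has been stretched and shifted to cover (-1, 1). The plain-Python sketch below checks that identity and shows the derivative 1 - tanh(x)^2. The helper names are illustrative and not part of TensorNN.

```python
import math


def sigmoid(x: float) -> float:
    # Overflow-safe form of 1 / (1 + e^(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh_derivative(x: float) -> float:
    # d/dx tanh(x) = 1 - tanh(x)^2
    t = math.tanh(x)
    return 1.0 - t * t


for v in [-2.0, -0.5, 0.0, 0.132, 3.0]:
    rescaled_sigmoid = 2.0 * sigmoid(2.0 * v) - 1.0
    # The two columns agree up to floating-point rounding.
    print(v, math.tanh(v), rescaled_sigmoid)
```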

derivative(inputs: Tensor) -> Tensor

The derivative of the function. Used for backpropagation.

Parameters:

inputs – get the derivative of the function at this input

Returns:

the derivative of the function at the given input

forward(inputs: Tensor) -> Tensor

Calculate a forwards pass of this activation function.

Parameters:

inputs – the outputs from the previous layer

Returns:

the inputs after they are passed through the activation function