tensornn.activation

This module contains the activation functions of TensorNN. Activation functions transform their input to introduce non-linearity into the network, which allows it to handle more complex problems. They behave much like a layer: each one provides a forward pass and a derivative used for backpropagation.

Classes

`Activation`	Base activation class.
`ELU`	Exponential linear unit is similar to ReLU, but negative inputs follow a smooth exponential curve instead of being clipped to 0.
`LeakyReLU`	Leaky ReLU is extremely similar to ReLU.
`LecunTanh`	The LeCun Tanh function is a scaled version of the tanh function, such that LecunTanh(1) = 1 and LecunTanh(-1) = -1.
`NewtonsSerpentine`	NOTE: THIS IS NOT A GOOD CANDIDATE.
`NoActivation`	Linear activation function, doesn't change anything.
`ReLU`	The rectified linear unit activation function is one of the simplest activation functions.
`Sigmoid`	The sigmoid function's output is always between 0 and 1. Formula: `1 / (1+e^(-x))` | constants: e(Euler's number, 2.718...)
`Softmax`	The softmax activation function is most commonly used in the output layer.
`Swish`	The swish activation function is the output of the sigmoid function multiplied by x.
`Tanh`	The tanh function is similar to the sigmoid function, but its output ranges from -1 to 1 instead of 0 to 1.

class tensornn.activation.Activation

Bases: ABC, TensorNNObject

Base activation class. All activation classes should inherit from this.

abstractmethod derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

abstractmethod forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function
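The contract above can be illustrated without importing TensorNN. The following is a minimal sketch only: `ActivationSketch` and `SquareSketch` are hypothetical names, plain Python lists stand in for `Tensor`, and the real base class also inherits from `TensorNNObject`, which is not modelled here.

```python
from abc import ABC, abstractmethod


class ActivationSketch(ABC):
    """Hypothetical stand-in mirroring the two abstract methods documented above."""

    @abstractmethod
    def forward(self, inputs):
        """Return the activation applied to each input value."""

    @abstractmethod
    def derivative(self, inputs):
        """Return the slope of the activation at each input value."""


class SquareSketch(ActivationSketch):
    """Toy activation f(x) = x^2, chosen only because its derivative is easy to check."""

    def forward(self, inputs):
        return [x * x for x in inputs]

    def derivative(self, inputs):
        return [2 * x for x in inputs]


act = SquareSketch()
print(act.forward([1.0, -2.0, 0.5]))     # [1.0, 4.0, 0.25]
print(act.derivative([1.0, -2.0, 0.5]))  # [2.0, -4.0, 1.0]
```

Because both methods are abstract, instantiating the base class directly raises a `TypeError`; only subclasses that implement `forward` and `derivative` can be created.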

class tensornn.activation.ELU(a: float = 1)

Bases: Activation

Exponential linear unit is similar to ReLU, but negative inputs follow a smooth exponential curve instead of being clipped to 0. Formula: if x>=0, x; if x<0, A*((e^x)-1) | constants: A, e(Euler’s number, 2.718…)

Ex, A=1: [12.319, -91.3, 0.132] -> [12.319, -1, 0.132]
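As a rough illustration of the formula and example above, here is a scalar sketch in plain Python. The function names `elu` and `elu_derivative` are made up for this example and are not TensorNN's API; the library itself operates on tensors.

```python
import math


def elu(x, a=1.0):
    """ELU sketch: x for x >= 0, a * (e^x - 1) for x < 0."""
    # math.expm1(x) computes e^x - 1 accurately for small |x|.
    return x if x >= 0 else a * math.expm1(x)


def elu_derivative(x, a=1.0):
    """Slope of the sketch above: 1 for x >= 0, a * e^x for x < 0."""
    return 1.0 if x >= 0 else a * math.exp(x)


inputs = [12.319, -91.3, 0.132]
outputs = [elu(x) for x in inputs]
print(outputs)  # approximately [12.319, -1.0, 0.132]
```

Large negative inputs saturate at -A, which is why -91.3 maps to about -1 when A=1.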

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.LeakyReLU(a: float = 0.1)

Bases: Activation

Leaky ReLU is extremely similar to ReLU. ReLU is LeakyReLU with A set to 0. Formula: if x>=0, x; if x<0, Ax | constants: A(leak)

Ex, A=0.1: [12.319, -91.3, 0.132] -> [12.319, -9.13, 0.132]
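The piecewise formula above can be sketched for a single value as follows. The names `leaky_relu` and `leaky_relu_derivative` are illustrative only and are not part of TensorNN.

```python
def leaky_relu(x, a=0.1):
    """Leaky ReLU sketch: x for x >= 0, a * x for x < 0."""
    return x if x >= 0 else a * x


def leaky_relu_derivative(x, a=0.1):
    """Slope of the sketch above: 1 for x >= 0, a for x < 0."""
    return 1.0 if x >= 0 else a


inputs = [12.319, -91.3, 0.132]
outputs = [leaky_relu(x) for x in inputs]
```

Setting `a=0` in this sketch clips negative inputs to zero (ReLU), while `a=1` leaves every input unchanged.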

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.LecunTanh

Bases: Activation

The LeCun Tanh function is a scaled version of the tanh function, such that LecunTanh(1) = 1 and LecunTanh(-1) = -1.
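This page does not list the scaling constants. The sketch below assumes the commonly cited form 1.7159 * tanh(2x/3), which satisfies the stated property to about four decimal places; TensorNN's own constants may differ. The function names are hypothetical.

```python
import math

# Assumed constants from the commonly cited LeCun tanh; not confirmed by this page.
SCALE = 1.7159
SLOPE = 2.0 / 3.0


def lecun_tanh(x):
    """Scaled tanh sketch: SCALE * tanh(SLOPE * x)."""
    return SCALE * math.tanh(SLOPE * x)


def lecun_tanh_derivative(x):
    """Slope of the sketch above, using d/dx tanh(u) = 1 - tanh(u)^2."""
    t = math.tanh(SLOPE * x)
    return SCALE * SLOPE * (1.0 - t * t)


at_plus_one = lecun_tanh(1.0)    # close to 1
at_minus_one = lecun_tanh(-1.0)  # close to -1
```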

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.NewtonsSerpentine(a: float = 1, b: float = 1)

Bases: Activation

NOTE: THIS IS NOT A GOOD CANDIDATE. Once |x| grows past A, larger inputs produce smaller outputs, which means being large doesn’t give importance. Do not use unless you want to have some fun ;)

Formula: (A*B*x)/(x^2+A^2) | constants: A, B

Ex, A=1,B=1: [12.319, -91.3, 0.132] -> [0.08064402, -0.01095159, 0.12973942]

https://mathworld.wolfram.com/SerpentineCurve.html
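To make the warning concrete, here is a scalar sketch of the formula and its derivative from the quotient rule. The names `serpentine` and `serpentine_derivative` are illustrative and not TensorNN's API.

```python
def serpentine(x, a=1.0, b=1.0):
    """Newton's serpentine sketch: (a * b * x) / (x^2 + a^2)."""
    return (a * b * x) / (x * x + a * a)


def serpentine_derivative(x, a=1.0, b=1.0):
    """Quotient rule: a * b * (a^2 - x^2) / (x^2 + a^2)^2."""
    denom = x * x + a * a
    return a * b * (a * a - x * x) / (denom * denom)


inputs = [12.319, -91.3, 0.132]
outputs = [serpentine(x) for x in inputs]
```

With A=1 and B=1 the output peaks at x=1 (value 0.5) and shrinks toward 0 for larger inputs, so 12.319 ends up with a smaller activation than 0.132.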

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.NoActivation

Bases: Activation

Linear activation function, doesn’t change anything. Use this if you don’t want an activation.

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.ReLU

Bases: Activation

The rectified linear unit activation function is one of the simplest activation functions. It is a piecewise function. Formula: if x>=0, x; if x<0, 0

Ex: [12.319, -91.3, 0.132] -> [12.319, 0, 0.132]
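A scalar sketch of the formula above, with illustrative names that are not TensorNN's API. The slope at exactly x=0 is a matter of convention; this sketch follows the formula's `x>=0` branch and uses 1 there, which is an assumption about the library's behaviour.

```python
def relu(x):
    """ReLU sketch: x for x >= 0, 0 for x < 0."""
    return x if x >= 0 else 0.0


def relu_derivative(x):
    """Slope of the sketch above: 1 for x >= 0 (by convention at 0), 0 for x < 0."""
    return 1.0 if x >= 0 else 0.0


inputs = [12.319, -91.3, 0.132]
outputs = [relu(x) for x in inputs]
print(outputs)  # [12.319, 0.0, 0.132]
```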

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.Sigmoid

Bases: Activation

The sigmoid function’s output is always between 0 and 1. Formula: 1 / (1+e^(-x)) | constants: e(Euler’s number, 2.718…)

Ex: [12.319, -91.3, 0.132] -> [9.99995534e-01, 2.23312895e-40, 5.32952167e-01]
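The formula can be sketched for a single value as below. The two-branch form is a common way to avoid overflow in `math.exp` for large negative inputs; it is an illustrative choice, not a statement about how TensorNN computes it, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Sigmoid sketch: 1 / (1 + e^(-x)), written so that exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0 here, so z is in (0, 1) or underflows to 0.0
    return z / (1.0 + z)


def sigmoid_derivative(x):
    """Slope of sigmoid: s * (1 - s), where s = sigmoid(x)."""
    s = sigmoid(x)
    return s * (1.0 - s)


inputs = [12.319, -91.3, 0.132]
outputs = [sigmoid(x) for x in inputs]
```

Every output lies between 0 and 1, and the slope is largest at x=0, where it equals 0.25.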

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.Softmax

Bases: Activation

The softmax activation function is most commonly used in the output layer. If you are using this activation function, you should be using tnn.CategoricalCrossEntropy as your loss function. This is because softmax always generates a probability distribution, with all values between 0 and 1 and summing to 1, and tnn.CategoricalCrossEntropy is the loss function best suited to that kind of output.

The goal of softmax is to convert the predicted values of the network into percentages that add up to 1. Ex. it converts [-1.42, 3.312, 0.192] to [0.00837, 0.94970, 0.04194], which is much easier to understand.

When coming up with a way to write this, a big problem is negative numbers, since we can’t have negative numbers in our final output. So how do we get rid of them? Do we clip them to 0? Do we square them? Do we use the absolute value? Though all these methods seem nice, they throw away the information that negative numbers carry. If we clip to 0, every negative number becomes the same 0, and squaring or taking the absolute value does the opposite of what we want (a large negative number turns into a large positive number). So the most effective way is to use exponentiation. Through exponentiation, negative numbers become small positive values while positive numbers become large.

But exponentiation raises a new problem: very large numbers, which can cause overflow. Fortunately there is a simple solution: we can make all the values non-positive prior to exponentiation by subtracting the maximum value from each value. This way our values before exponentiation range from -inf to 0, and our values after exponentiation range from 0 (e^-inf) to 1 (e^0). The shift does not change the final result, because the common factor cancels when we divide by the sum.

Finally, to come up with the percentages, we figure out what fraction of the total each value makes up, so we divide each value by the sum of all the values.

All steps/TLDR:
1. Starting values (from the previous example): [-1.42, 3.312, 0.192]
2. Subtract the largest value so that every value is non-positive: 3.312 is the max, so subtracting it from all values gives [-4.732, 0, -3.120]
3. Exponentiate, raising e to the power of each value (e^x): [0.00880884, 1, 0.04415717]
4. Come up with percentages by dividing each number by the sum: the sum is 1.05296601, so dividing each value by it gives [0.00836574, 0.94969828, 0.04193599]
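The steps above can be sketched in a few lines of plain Python. This is an illustration for a single list of values; the name `softmax` here is a local helper, not TensorNN's class, and the derivative is left out.

```python
import math


def softmax(values):
    """Softmax sketch following the steps above: shift, exponentiate, normalise."""
    peak = max(values)
    # Shift so the largest value becomes 0 and all others are negative.
    shifted = [v - peak for v in values]
    # Each exponential now lies in (0, 1], so it cannot overflow.
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    # Each value's share of the total.
    return [e / total for e in exps]


probs = softmax([-1.42, 3.312, 0.192])
print([round(p, 5) for p in probs])
```

Because of the shift, inputs such as [1000, 1001] are handled without overflow, whereas calling `math.exp(1001)` directly would raise an `OverflowError`.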

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.Swish

Bases: Activation

The swish activation function is the output of the sigmoid function multiplied by x. Formula: x / (1+e^(-x)) | constants: e(Euler’s number, 2.718…)

Ex: [12.319, -91.3, 0.132] -> [1.23189450e+01, -2.03884673e-38, 7.03496861e-02]
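A scalar sketch of the formula above, with the derivative obtained from the product rule. The helper names are illustrative and not TensorNN's API; the sigmoid helper uses a two-branch form only to avoid overflow in `math.exp`.

```python
import math


def _sigmoid(x):
    """Helper: 1 / (1 + e^(-x)), written so that exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x):
    """Swish sketch: x * sigmoid(x), which equals x / (1 + e^(-x))."""
    return x * _sigmoid(x)


def swish_derivative(x):
    """Product rule: sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))."""
    s = _sigmoid(x)
    return s + x * s * (1.0 - s)


inputs = [12.319, -91.3, 0.132]
outputs = [swish(x) for x in inputs]
```

For large positive inputs the output approaches x itself, and for large negative inputs it approaches 0 from below, which matches the example values.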

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function

class tensornn.activation.Tanh

Bases: Activation

The tanh function is similar to the sigmoid function, but its output ranges from -1 to 1 instead of 0 to 1.
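The similarity to sigmoid is exact: tanh(x) = 2 * sigmoid(2x) - 1, so tanh is a sigmoid stretched to the range -1 to 1 and centred on 0. The sketch below checks this identity for a few values using only the standard library; the function names are illustrative, not TensorNN's API.

```python
import math


def sigmoid(x):
    """Plain sigmoid; fine for the moderate inputs used in this sketch."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_derivative(x):
    """Slope of tanh: 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t


samples = [-3.0, -0.5, 0.0, 0.132, 2.0]
# Largest gap between tanh(x) and the rescaled sigmoid over the samples.
max_gap = max(abs(math.tanh(x) - (2.0 * sigmoid(2.0 * x) - 1.0)) for x in samples)
print(max_gap < 1e-12)  # True
```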

derivative(inputs: Tensor) → Tensor

The derivative of the function. Used for backpropagation.

Parameters: inputs – get the derivative of the function at this input
Returns: the derivative of the function at the given input

forward(inputs: Tensor) → Tensor

Calculate a forwards pass of this activation function.

Parameters: inputs – the outputs from the previous layer
Returns: the inputs after they are passed through the activation function