Saturday, July 29, 2023

Sigmoid Function and Python Implementation

The sigmoid function is a widely used activation function in artificial neural networks. It maps any real-valued input to a value between 0 and 1, which allows it to introduce non-linearity into the model's output. The sigmoid function is also known as the logistic function, and it is mathematically defined as:

f(x) = 1 / (1 + exp(-x))

Where:

x is the input to the function, which can be a single value or a vector (in the case of neural networks, it is usually the weighted sum of inputs to a neuron).

exp denotes the exponential function, which raises the mathematical constant "e" (approximately 2.71828) to the power of the argument.

The sigmoid function "squashes" the input into the range (0, 1), making it useful for binary classification tasks, where the output can be interpreted as a probability. When x is large and positive, the sigmoid function approaches 1, and when x is large and negative, the function approaches 0. At x = 0, the sigmoid function outputs exactly 0.5.
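This squashing behavior is easy to verify numerically. A minimal sketch using only the standard library (the function name `sigmoid` is just an illustrative choice):

```python
import math

def sigmoid(x):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Large positive inputs approach 1, large negative inputs approach 0,
# and x = 0 maps to exactly 0.5.
for x in (-10, -1, 0, 1, 10):
    print(f"sigmoid({x:3d}) = {sigmoid(x):.5f}")
```

Note the symmetry sigmoid(-x) = 1 - sigmoid(x), which follows directly from the definition.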


While the sigmoid function was one of the earliest activation functions used in neural networks, it has some limitations, particularly in deep networks:

  1. Vanishing Gradient Problem: During backpropagation, gradients become extremely small for very positive or negative inputs to the sigmoid function. As a result, the learning process may slow down, and earlier layers in a deep network may receive almost no gradient signal.
  2. Output Saturation: For very positive or negative inputs, the sigmoid function outputs values close to 0 or 1, respectively. This can cause the network to be insensitive to changes in those regions, leading to slower learning and reduced expressiveness.
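Both problems stem from the sigmoid's derivative, which has the closed form f'(x) = f(x)(1 - f(x)). It peaks at 0.25 when x = 0 and shrinks rapidly as |x| grows, which can be checked directly (a small sketch; the helper names are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # The sigmoid derivative s * (1 - s) peaks at 0.25 (at x = 0)
    # and vanishes for inputs far from zero.
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (0, 2, 5, 10):
    print(f"x = {x:2d}: gradient = {sigmoid_grad(x):.6f}")
```

At x = 10 the gradient is on the order of 1e-5, so almost no learning signal flows backward through a saturated neuron.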


Due to these limitations, alternative activation functions like ReLU (Rectified Linear Unit) and its variants have gained popularity in deep learning architectures as they mitigate the vanishing gradient problem and accelerate training. However, sigmoid functions can still be useful in certain scenarios, such as the output layer of binary classifiers where the output represents probabilities of a binary event.


Python Implementation

Implementing the sigmoid function in the Python programming language takes only a few lines of code.
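A minimal sketch of such an implementation, written with NumPy so it accepts both scalars and arrays, and plotted with Matplotlib (both libraries are assumed to be installed; the original post's exact code may differ):

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    """Vectorized sigmoid: works on scalars or NumPy arrays."""
    return 1.0 / (1.0 + np.exp(-x))

# Evaluate the sigmoid over a range of inputs and plot the curve.
x = np.linspace(-10, 10, 200)
y = sigmoid(x)

plt.plot(x, y)
plt.title("Sigmoid Activation Function")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.grid(True)
plt.show()
```

Running this produces the characteristic S-shaped curve rising from near 0 on the left to near 1 on the right.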

The result of running this code is shown in the image at the top of the article. You can run the code in the following colab notebook: sigmoid activation function
