Sigmoid Activation Function
A sigmoid activation function has a characteristic "S"-shaped curve. Using e (Euler's number), it is defined by the formula:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
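The curve produced has a fairly gradual rise, from near 0 for large negative inputs to near 1 for large positive inputs, passing through 0.5 at x = 0.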
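The derivative of the function, writing \sigma(x) for the sigmoid, is:

$$\sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \sigma(x)\,(1 - \sigma(x))$$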
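As a minimal sketch of the function and its derivative, the Python below uses only the standard library. The names sigmoid, sigmoid_derivative and numerical_derivative are illustrative choices, not part of any particular framework, and splitting on the sign of x is one common way to avoid overflow in exp rather than the only one. The finite-difference helper is included only as a sanity check on the closed-form derivative.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x use the equivalent form e^x / (1 + e^x),
    # which avoids overflow in exp(-x) when x is very negative.
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_derivative(x: float) -> float:
    """Derivative of the sigmoid, expressed as s * (1 - s) with s = sigmoid(x)."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numerical_derivative(f, x: float, h: float = 1e-5) -> float:
    """Central-difference estimate of f'(x), used here only as a check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, sigmoid(x), sigmoid_derivative(x), numerical_derivative(sigmoid, x))
```

At x = 0 the function returns 0.5 and the derivative reaches its maximum of 0.25. The closed form and the numerical estimate should agree to several decimal places at every point printed.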
Common criticisms of the sigmoid activation function include:
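Sigmoids can saturate and kill gradients. At the tails of the curve, where the output is close to 0 or 1, the gradient (the rate of change) is almost zero, so very little signal passes back through a saturated unit during training.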
Sigmoid outputs are all positive values (between 0 and 1), so they are not centered on zero. This can bias network results. The effect can be mitigated by not using sigmoids in the final layers of a network.
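Both criticisms can be seen numerically. The sketch below is illustrative only: the sample inputs are arbitrary, and the helper names sigmoid and sigmoid_derivative are assumptions made for this example. It evaluates the sigmoid and its gradient at a few symmetric points to show that the gradient collapses at the tails and that every output is positive, even when the inputs average to zero.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for very negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_derivative(x: float) -> float:
    """Gradient of the sigmoid at x."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Arbitrary sample inputs, symmetric around zero (their mean is 0).
inputs = [-10.0, -5.0, 0.0, 5.0, 10.0]
outputs = [sigmoid(x) for x in inputs]
gradients = [sigmoid_derivative(x) for x in inputs]

if __name__ == "__main__":
    for x, y, g in zip(inputs, outputs, gradients):
        print(f"x={x:6.1f}  sigmoid={y:.6f}  gradient={g:.6f}")

    # Saturation: the gradient peaks at 0.25 for x = 0 and is
    # roughly 4.5e-5 at x = +/-10, so almost nothing flows back.
    print("largest gradient:", max(gradients))
    print("smallest gradient:", min(gradients))

    # Positivity: the inputs average to 0, but the outputs average to 0.5.
    print("mean input:", sum(inputs) / len(inputs))
    print("mean output:", sum(outputs) / len(outputs))
```

Because the gradient never exceeds 0.25, each sigmoid that a gradient passes through shrinks it to a quarter of its size or less (ignoring the effect of the weights), which is why saturation is a particular concern in deeper networks.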