Smooth Maximum
An overview of the concept “Smooth Maximum”. In this work, we discuss the smooth maximum: a family of smooth, differentiable approximations to the maximum operator, useful in deep learning where gradients must flow through a max.
Smooth Maximum
For large positive values of the parameter $\alpha$, the following formulation is a smooth, differentiable approximation of the maximum function; for negative values of $\alpha$ that are large in absolute value, it approximates the minimum:

$$\mathrm{S}_\alpha(x_1, \ldots, x_n) = \frac{\sum_{i=1}^n x_i \, e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}$$
Thus, $\mathrm{S}_\alpha$ has the following useful properties:
- $\mathrm{S}_\alpha \to \max$ as $\alpha \to \infty$.
- $\mathrm{S}_0$ is the arithmetic mean of its inputs.
- $\mathrm{S}_\alpha \to \min$ as $\alpha \to -\infty$.
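As an illustration, here is a minimal NumPy sketch of this weighted-average formulation (the function name `smooth_max` and the stability shift are our own choices, not part of any standard API):

```python
import numpy as np

def smooth_max(x, alpha=1.0):
    """Boltzmann-weighted smooth maximum S_alpha(x_1, ..., x_n)."""
    x = np.asarray(x, dtype=float)
    z = alpha * x
    # Shift the exponents by their max so np.exp never overflows;
    # the exponential weights (and hence their weighted average)
    # are invariant to this shift.
    w = np.exp(z - z.max())
    return float(np.sum(x * w) / np.sum(w))
```

For example, with `x = [1.0, 2.0, 3.0]`, `alpha=10.0` returns approximately 3.0, `alpha=0.0` returns the arithmetic mean 2.0, and `alpha=-10.0` returns approximately 1.0.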
LogSumExp
Another option for a smooth maximum function is the LogSumExp:

$$\mathrm{LSE}_\alpha(x_1, \ldots, x_n) = \frac{1}{\alpha} \log \sum_{i=1}^n e^{\alpha x_i}$$
This formulation can be derived from the entropic regularization procedure used in reinforcement learning.
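A short NumPy sketch of this formulation (again, the name `log_sum_exp_max` is illustrative) might look like:

```python
import numpy as np

def log_sum_exp_max(x, alpha=1.0):
    """Smooth maximum via LogSumExp: (1/alpha) * log(sum(exp(alpha * x))).

    alpha must be nonzero; negative alpha yields a smooth minimum.
    """
    x = np.asarray(x, dtype=float)
    z = alpha * x
    m = z.max()
    # Factor out the largest exponent so np.exp stays bounded.
    return float((m + np.log(np.sum(np.exp(z - m)))) / alpha)
```

For positive $\alpha$, $\max_i x_i \le \mathrm{LSE}_\alpha(x) \le \max_i x_i + \log(n)/\alpha$, so the approximation overshoots the true maximum and tightens as $\alpha$ grows.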
p-Norm
Another smooth maximum is the p-norm,

$$\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p},$$

which tends to the maximum of the absolute values, $\max_i |x_i|$, as $p \to \infty$.
An intrinsic advantage of the p-norm is that it is a true norm. As such, it is “scale invariant” (absolutely homogeneous): $\|\lambda x\|_p = |\lambda| \, \|x\|_p$.
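A minimal NumPy sketch of the p-norm as a smooth maximum (the name `p_norm_max` and the factoring trick are our own choices) could be:

```python
import numpy as np

def p_norm_max(x, p=8.0):
    """Approximate max(|x_i|) with the p-norm (sum |x_i|^p)^(1/p)."""
    x = np.asarray(x, dtype=float)
    m = np.abs(x).max()
    if m == 0.0:
        return 0.0
    # Factor out the largest magnitude so the powers stay in [0, 1]
    # and cannot overflow even for large p.
    return float(m * np.sum((np.abs(x) / m) ** p) ** (1.0 / p))
```

Note that the p-norm approximates the maximum of the absolute values and overestimates it: $\max_i |x_i| \le \|x\|_p \le n^{1/p} \max_i |x_i|$.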