Optimizing Neural Networks with Convex Hybrid Combination of Activation Functions: A New Approach to Enhance Gradient Flow and Learning Dynamics.
Publication Date: 28/01/2025
Author(s): Samuel O. Essang, Jackson E. Ante, Augustine O. Otobi, Stephen I. Okeke, Ubong D. Akpan, Runyi E. Francis, Jonathan T. Auta, Daniel E. Essien, Sunday E. Fadugba, Olamide M. Kolawole, Edet E. Asanga, and Benedict I. Ita.
Volume/Issue: Volume 5, Issue 1 (2025)
Abstract:
Activation functions are crucial to the efficacy of neural networks: they introduce non-linearity and govern gradient propagation. Traditional activation functions, including Sigmoid, ReLU, Tanh, Leaky ReLU, and ELU, each offer distinct advantages but also exhibit limitations such as vanishing gradients and inactive (dead) neurons. This research introduces a method that combines these five activation functions through a convex combination of weighted coefficients to form a new hybrid activation function. The hybrid function seeks to combine the strengths of each component, mitigate their individual weaknesses, and improve network training and generalization. Our mathematical analysis, graphical visualizations, and illustrative tests indicate that the hybrid activation function provides better gradient flow in deeper layers, faster convergence, and improved generalization relative to the individual activation functions. Key outcomes include mitigation of the vanishing gradient and dead neuron problems, improved gradient stability, and greater expressiveness in complex neural networks. These findings suggest that hybrid activation functions can improve the learning dynamics of deep networks, offering an effective and robust alternative to conventional activation functions.
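To make the idea concrete, the sketch below shows one plausible way to realize a convex hybrid of the five activations in PyTorch. It is an illustration, not the authors' exact formulation: the use of learnable coefficients normalized by a softmax (so the weights stay non-negative and sum to one) is an assumption; fixed coefficients chosen a priori would also satisfy the convexity constraint described in the abstract.

```python
# Minimal sketch of a convex hybrid activation: a weighted sum of Sigmoid,
# ReLU, Tanh, Leaky ReLU, and ELU whose coefficients are kept non-negative
# and sum to one via a softmax over learnable logits (an assumption here).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvexHybridActivation(nn.Module):
    def __init__(self):
        super().__init__()
        # Unconstrained logits; the softmax in forward() maps them to
        # convex weights w_i >= 0 with sum(w) = 1.
        self.logits = nn.Parameter(torch.zeros(5))

    def forward(self, x):
        w = torch.softmax(self.logits, dim=0)
        return (w[0] * torch.sigmoid(x)
                + w[1] * F.relu(x)
                + w[2] * torch.tanh(x)
                + w[3] * F.leaky_relu(x, negative_slope=0.01)
                + w[4] * F.elu(x))

# Example: drop the hybrid activation into a small feed-forward network.
net = nn.Sequential(nn.Linear(16, 32), ConvexHybridActivation(), nn.Linear(32, 1))
y = net(torch.randn(8, 16))
```

Because at least one component (ReLU, Leaky ReLU, or ELU) always contributes a non-saturating branch, the blended derivative cannot collapse to zero everywhere, which is the intuition behind the improved gradient flow claimed above.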
Keywords:
Neural Networks, Activation Functions, ReLU, Sigmoid, Tanh, Leaky ReLU, ELU, Gradient Flow, Vanishing Gradient Problem, Deep Learning.