Exploding Gradients

定义 Definition

Exploding gradients（梯度爆炸）：在训练神经网络（尤其是深层网络或循环神经网络）时，反向传播得到的梯度在层与层（或时间步）之间不断放大，导致梯度数值变得非常大，从而引起参数更新过猛、训练不稳定、损失发散甚至出现数值溢出（NaN/Inf）。该术语也常与 vanishing gradients（梯度消失）并列讨论；二者都与梯度在传播过程中的缩放有关。

发音 Pronunciation (IPA)

/ɪkˈsploʊdɪŋ ˈɡreɪdiənts/

例句 Examples

The model failed to train because of exploding gradients.
由于梯度爆炸，这个模型训练失败了。

Without gradient clipping, the recurrent network may suffer exploding gradients, causing unstable updates and a rapidly increasing loss.
如果不做梯度裁剪，循环神经网络可能会出现梯度爆炸，导致参数更新不稳定、损失值快速上升。

词源 Etymology

exploding 来自动词 explode（“爆炸、猛增”），引申为“数值急剧变大”；gradients 是 gradient（“梯度”）的复数。在机器学习语境中，gradient（梯度）指损失函数对参数的偏导数；“exploding gradients”形象地描述了这些导数在反向传播中“越传越大、像爆炸一样失控”的现象。该说法在深度学习与RNN训练问题的讨论中逐渐固定为常用术语。

文献与作品 Literary Works

Ian Goodfellow, Yoshua Bengio, Aaron Courville：《Deep Learning》（书中讨论梯度消失/爆炸及训练稳定性方法）
Yoshua Bengio, Patrice Simard, Paolo Frasconi (1994)：Learning Long-Term Dependencies with Gradient Descent is Difficult（经典论文，讨论长程依赖与梯度问题）
Sepp Hochreiter (1991)：Untersuchungen zu dynamischen neuronalen Netzen（早期系统分析RNN训练中的梯度问题）
Aurélien Géron：《Hands‑On Machine Learning with Scikit‑Learn, Keras & TensorFlow》（实践书籍中常提到梯度裁剪等应对手段）

Exploding Gradients

定义 Definition

发音 Pronunciation (IPA)

例句 Examples

词源 Etymology

相关词 Related Words

文献与作品 Literary Works