The effect of Target Normalization and Momentum on Dying ReLU
KTH Royal Institute of Technology, Stockholm, Sweden.
Univrses AB, Stockholm, Sweden.
KTH Royal Institute of Technology, Stockholm, Sweden. ORCID iD: 0000-0003-2965-2953
Örebro University, School of Science and Technology. (AASS) ORCID iD: 0000-0003-3958-6179
2020 (English). In: The 32nd annual workshop of the Swedish Artificial Intelligence Society (SAIS), 2020. Conference paper, Published paper (Refereed)
Abstract [en]

Optimizing parameters with momentum, normalizing data values, and using rectified linear units (ReLUs) are popular choices in neural network (NN) regression. Although ReLUs are popular, they can collapse to a constant function and "die", effectively removing their contribution from the model. While some mitigations are known, the underlying reasons for ReLUs dying during optimization are currently poorly understood. In this paper, we consider the effects of target normalization and momentum on dying ReLUs. We find empirically that unit variance targets are well motivated and that ReLUs die more easily when target variance approaches zero. To further investigate this matter, we analyze a discrete-time linear autonomous system, and show theoretically how this relates to a model with a single ReLU and how common properties can result in dying ReLU. We also analyze the gradients of a single-ReLU model to identify saddle points and regions corresponding to dying ReLU and how parameters evolve into these regions when momentum is used. Finally, we show empirically that this problem persists, and is aggravated, for deeper models including residual networks.
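The setting described in the abstract can be explored with a minimal sketch such as the one below. It is not the paper's model or experimental protocol: the one-dimensional model f(x) = relu(w*x + b), the sine-shaped targets, the heavy-ball update, and the names `is_dead`, `train_single_relu`, and `dead_fraction` are all assumptions made for illustration. The sketch only provides a way to count how often a single ReLU ends up outputting zero on every training input for a chosen target standard deviation; it makes no claim about what those counts will be.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def is_dead(w, b, x):
    """True if the unit outputs zero (and has zero gradient) for every input in x."""
    return bool(np.all(w * x + b <= 0.0))

def train_single_relu(x, y, w, b, lr=0.1, momentum=0.9, steps=200):
    """Heavy-ball momentum gradient descent on MSE for f(x) = relu(w*x + b).

    Hypothetical setup for illustration, not the paper's exact model.
    """
    vw = vb = 0.0
    for _ in range(steps):
        z = w * x + b
        err = relu(z) - y
        active = (z > 0).astype(float)        # ReLU derivative: 1 where z > 0, else 0
        gw = np.mean(2.0 * err * active * x)  # dMSE/dw
        gb = np.mean(2.0 * err * active)      # dMSE/db
        vw = momentum * vw - lr * gw          # velocity accumulates past gradients
        vb = momentum * vb - lr * gb
        w, b = w + vw, b + vb
    return w, b

def dead_fraction(target_std, seeds=20, **train_kwargs):
    """Fraction of random initializations for which the unit is dead after training."""
    dead = 0
    for seed in range(seeds):
        rng = np.random.default_rng(seed)
        x = rng.uniform(-1.0, 1.0, size=256)
        raw = np.sin(3.0 * x)                 # assumed target function
        y = target_std * (raw - raw.mean()) / raw.std()  # zero mean, chosen std
        w0, b0 = rng.standard_normal(2)       # a unit may already be dead at initialization
        w, b = train_single_relu(x, y, w0, b0, **train_kwargs)
        dead += is_dead(w, b, x)
    return dead / seeds
```

Calling `dead_fraction(1.0)` and `dead_fraction(0.01)`, with and without `momentum=0.0`, gives a rough way to compare settings under these assumptions.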

Place, publisher, year, edition, pages
2020.
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-89069
OAI: oai:DiVA.org:oru-89069
DiVA, id: diva2:1523714
Conference
The 32nd annual workshop of the Swedish Artificial Intelligence Society (SAIS), Gothenburg, Sweden (Virtual), June 16–17, 2020
Available from: 2021-01-29 Created: 2021-01-29 Last updated: 2023-05-11 Bibliographically approved

Authority records

Stork, Johannes Andreas

By author/editor
Kragic, Danica; Stork, Johannes Andreas