1 min read · Jun 4, 2018
Hey, great work! It’s also worth checking against the last change rather than just the average change. Also, I believe it’s the exploding gradient problem that clipping solves, not the vanishing gradient problem.
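To illustrate the clipping point, here is a minimal sketch of clipping by L2 norm in plain Python. The function name `clip_by_norm`, the threshold of 5.0, and the example gradients are all made up for illustration and are not taken from the article's code.

```python
import math

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm does not exceed max_norm (hypothetical helper)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        # Small gradients pass through unchanged, so clipping cannot help vanishing gradients.
        return list(grad)
    scale = max_norm / norm
    # Large gradients keep their direction but are shrunk to length max_norm.
    return [g * scale for g in grad]

exploding = [30.0, 40.0]    # L2 norm is 50, well above the threshold
vanishing = [3e-4, 4e-4]    # L2 norm is tiny, well below the threshold

clipped_large = clip_by_norm(exploding, 5.0)
clipped_small = clip_by_norm(vanishing, 5.0)
```

Clipping only ever shrinks a gradient, which is why it addresses the exploding case and leaves a vanishing gradient exactly as small as it was.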
PhD in Machine Learning | Founder of DeepSchool.io