add cosine restart learning rate #2953

hellozhaoming wants to merge 3 commits into deepmodeling:master from
Conversation
Signed-off-by: hellozhaoming <747247642@qq.com>
Add cosine restart learning rate
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #2953      +/-   ##
==========================================
- Coverage   75.36%   75.07%   -0.30%
==========================================
  Files         245      220      -25
  Lines       24648    20297    -4351
  Branches     1582      903     -679
==========================================
- Hits        18577    15238    -3339
+ Misses       5140     4526     -614
+ Partials      931      533     -398
```

☔ View full report in Codecov by Sentry.
```python
        """Get the start lr."""
        return self.start_lr_

    def value(self, step: int) -> float:
```
You may not need to implement the `value` method if you do not print the learning rate information at the beginning of the training:
https://github.com/hellozhaoming/deepmd-kit/blob/05052c195308f61b63ce2bab130ce0e8cba60604/deepmd/train/trainer.py#L566
njzjz left a comment:
Please run pre-commit to format and lint the code: https://docs.deepmodeling.com/projects/deepmd/en/master/development/coding-conventions.html#run-scripts-to-check-the-code. Alternatively, you can submit from a non-protected branch and pre-commit.ci will do it for you.
Unit tests should be added for the two new learning rate classes.
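A minimal sketch of such a test, assuming a hypothetical `CosineLearningRate` class (the import path, class name, and constructor arguments are placeholders, not the PR's actual API) whose `value(step)` follows the cosine-decay pseudocode shown further down in this diff:

```python
import unittest

import numpy as np

# Hypothetical import; the PR's actual module and class names may differ.
from deepmd.utils.learning_rate import CosineLearningRate


class TestCosineLearningRate(unittest.TestCase):
    def test_value_matches_formula(self):
        start_lr, decay_steps, alpha = 1e-3, 1000, 0.01
        lr = CosineLearningRate(
            start_lr=start_lr, decay_steps=decay_steps, alpha=alpha
        )
        # Compare against an independent NumPy evaluation of the formula.
        for step in (0, 1, 500, 999, 1000, 2000):
            t = min(step, decay_steps)
            cosine_decay = 0.5 * (1.0 + np.cos(np.pi * t / decay_steps))
            expected = start_lr * ((1.0 - alpha) * cosine_decay + alpha)
            self.assertAlmostEqual(lr.value(step), expected, places=12)


if __name__ == "__main__":
    unittest.main()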
| "softplus": tf.nn.softplus, | ||
| "sigmoid": tf.sigmoid, | ||
| "tanh": tf.nn.tanh, | ||
| "swish": tf.nn.swish, |
It seems that it has been renamed to `silu`: tensorflow/tensorflow#41066
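One way to stay compatible across TensorFlow versions is to fall back to the old name when `tf.nn.silu` is absent; a sketch (the dictionary is abbreviated):

```python
import tensorflow as tf

ACTIVATION_FN_DICT = {
    "softplus": tf.nn.softplus,
    "sigmoid": tf.sigmoid,
    "tanh": tf.nn.tanh,
    # tf.nn.swish was renamed to tf.nn.silu (tensorflow/tensorflow#41066);
    # fall back to the old name on TensorFlow versions before the rename.
    "swish": getattr(tf.nn, "silu", tf.nn.swish),
}
```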
```python
            )
        else:
            for fitting_key in self.fitting:
                if self.lr_type == "exp":
```
It's not good behavior to switch on the learning-rate type in the Trainer. Instead, implement the method `LearningRate.log_start` (`LearningRate` should be an abstract base class inherited by all learning rate classes) and call `self.lr.log_start(self.sess)` here.
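A minimal sketch of that design; only the name `LearningRate.log_start` comes from the review comment, the other method names are illustrative:

```python
import logging
from abc import ABC, abstractmethod

log = logging.getLogger(__name__)


class LearningRate(ABC):
    """Base class for all learning-rate schedules (sketch)."""

    @abstractmethod
    def start_lr(self) -> float:
        """Return the learning rate at step 0."""

    @abstractmethod
    def build(self, global_step):
        """Return the learning-rate tensor used by the optimizer."""

    def log_start(self, sess) -> None:
        # Shared default logging; subclasses that must evaluate tensors can
        # override this and use `sess`. The Trainer then calls
        # self.lr.log_start(self.sess) with no per-schedule branching.
        log.info("start training at lr %.2e", self.start_lr())
```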
| [Argument("exp", dict, learning_rate_exp())], | ||
| [Argument("exp", dict, learning_rate_exp()), | ||
| Argument("cos", dict, learning_rate_cos()), | ||
| Argument("cosrestart", dict, learning_rate_cosrestarts())], |
You may need to add some documentation to the variants (`doc="xxx"`). Otherwise, no one knows what they are.
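A sketch of the suggested change; the wording of the doc strings is illustrative, and `learning_rate_exp`/`learning_rate_cos`/`learning_rate_cosrestarts` are the argument-spec helpers from the surrounding module:

```python
from dargs import Argument

# Fragment of the variant list shown in the diff above.
variants = [
    Argument("exp", dict, learning_rate_exp(),
             doc="Learning rate decays exponentially with the training step."),
    Argument("cos", dict, learning_rate_cos(),
             doc="Learning rate follows a single cosine decay curve."),
    Argument("cosrestart", dict, learning_rate_cosrestarts(),
             doc="Cosine decay with periodic warm restarts (SGDR)."),
]
```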
```python
global_step = min(global_step, decay_steps)
cosine_decay = 0.5 * (1 + cos(pi * global_step / decay_steps))
decayed = (1 - alpha) * cosine_decay + alpha
decayed_learning_rate = learning_rate * decayed
```
Please use this style: https://numpydoc.readthedocs.io/en/latest/format.html#other-points-to-keep-in-mind
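A sketch of that docstring style applied to the `value` method, assuming NumPy is imported as `np` and that the attributes `start_lr_`, `decay_steps_`, and `alpha_` exist on the class (these names are placeholders):

```python
def value(self, step: int) -> float:
    r"""Get the learning rate at the given step.

    The cosine decay is computed as

    .. math::

        \mathrm{lr}(t) = \mathrm{lr}_0 \left[ (1 - \alpha)\,
        \frac{1 + \cos\left(\pi \min(t, T) / T\right)}{2} + \alpha \right],

    where :math:`T` is the number of decay steps and :math:`\alpha`
    is the minimum learning-rate fraction.

    Parameters
    ----------
    step : int
        The current training step.

    Returns
    -------
    float
        The decayed learning rate at the given step.
    """
    t = min(step, self.decay_steps_)
    cosine_decay = 0.5 * (1.0 + np.cos(np.pi * t / self.decay_steps_))
    return self.start_lr_ * ((1.0 - self.alpha_) * cosine_decay + self.alpha_)
```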
The function returns the cosine-decayed learning rate while taking possible warm restarts into account.
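For the restart variant, TensorFlow already ships an SGDR-style schedule; a minimal sketch using the TF1-compat API (the hyperparameter values are illustrative, not the PR's defaults):

```python
import tensorflow.compat.v1 as tf

global_step = tf.train.get_or_create_global_step()
# Cosine decay with warm restarts (Loshchilov & Hutter, SGDR).
learning_rate = tf.train.cosine_decay_restarts(
    learning_rate=1e-3,       # peak learning rate at each restart
    global_step=global_step,
    first_decay_steps=10000,  # length of the first cosine period
    t_mul=2.0,                # each period is twice as long as the previous
    m_mul=1.0,                # restart at the same peak learning rate
    alpha=0.0,                # decay down to zero within each period
)
```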