optimizers module¶
- class optimizers.AdaGradOptimizer(lr_schedule, grad_clipper, epsilon=1e-06)¶
Bases:
optimizers.Optimizer
AdaGrad gradient descent optimizer.
- lr_schedule¶
The learning rate schedule of the optimizer.
- Type
- lr¶
The latest learning rate.
- Type
float
- first_call¶
True before the first call to apply_grads; used to decide whether the cache still needs to be built.
- Type
bool
- epsilon¶
Numerical stability constant.
- Type
float
- cache¶
A list of dictionaries caching per-parameter gradient statistics, such as the running mean of squared gradients, m.
- Type
list
- __init__()¶
Constructor.
- apply_lr_schedule()¶
Applies the learning rate schedule of the optimizer.
- get_lr()¶
Returns the latest learning rate of the optimizer’s learning rate schedule.
- apply_grads(trainable_params, grads)¶
Applies the gradient update rule to trainable params using gradients.
- build_cache(trainable_params, grads)¶
Builds the cache on the first call.
- update_cache(trainable_params, grads)¶
Updates the cache on each call.
- get_opt_grad(trainable_params, grads)¶
Gets the optimizer-specific gradients computed using the cache.
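The AdaGrad rule scales each gradient by the inverse square root of its accumulated squared-gradient cache before applying the learning rate. The following is a minimal standalone sketch of that rule on a single NumPy parameter, assuming a constant learning rate; the actual optimizer works on the per-layer parameter dictionaries described below, and the names w, m and grad are illustrative only:

    import numpy as np

    epsilon = 1e-6
    lr = 0.01                      # assumed constant learning rate for the sketch
    w = np.random.randn(3, 3)      # a single trainable parameter
    m = np.zeros_like(w)           # cache of squared gradients for w

    def adagrad_step(w, grad, m):
        m = m + grad ** 2                             # accumulate squared gradients
        opt_grad = grad / (np.sqrt(m) + epsilon)      # optimizer-specific gradient
        return w - lr * opt_grad, m

    grad = np.random.randn(3, 3)   # stand-in gradient
    w, m = adagrad_step(w, grad, m)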
- apply_grads(trainable_params, grads)¶
Applies the gradient update rule to trainable params using gradients.
- Parameters
trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.
grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.
- Returns
updated_trainable_params – The list of dictionaries of the updated trainable parameters of all layers of a model. At idx is the dictionary of the updated trainable parameters of layer idx in the Model.layers list.
- Return type
list
Notes
Iterates over layers in ascending order in the Model.layers list.
- Raises
AssertionError – If the lengths of the trainable_params and grads lists are not the same.
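A minimal sketch of how apply_grads could be structured, assuming each element of trainable_params and grads is a dictionary of NumPy arrays keyed by parameter name; the exact control flow around first_call and the cache helpers is an assumption, and gradient clipping via grad_clipper is omitted:

    def apply_grads(self, trainable_params, grads):
        assert len(trainable_params) == len(grads)
        if self.first_call:                 # build the cache once, on the first call
            self.build_cache(trainable_params, grads)
            self.first_call = False
        self.update_cache(trainable_params, grads)
        opt_grads = self.get_opt_grad(trainable_params, grads)
        updated_trainable_params = []
        for layer_params, layer_opt_grads in zip(trainable_params, opt_grads):
            updated_trainable_params.append(
                {key: param - self.lr * layer_opt_grads[key]
                 for key, param in layer_params.items()})
        return updated_trainable_params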
- build_cache(trainable_params, grads)¶
Builds the cache on the first call.
- Parameters
trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.
grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.
- Returns
None
- Return type
None
Notes
self.cache is built after calling.
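One plausible way to build the cache, assuming each cache entry is a zero array mirroring the shape of the corresponding parameter; only the end state (a populated self.cache) is documented above:

    import numpy as np

    def build_cache(self, trainable_params, grads):
        # One zero-initialised array per parameter, matching its shape.
        self.cache = [
            {key: np.zeros_like(param) for key, param in layer_params.items()}
            for layer_params in trainable_params
        ]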
- get_opt_grad(trainable_params, grads)¶
Gets the optimizer-specific gradients computed using the cache.
- Parameters
trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.
grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.
- Returns
opt_grads – The list of dictionaries of the optimizer-specific gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.
- Return type
list
Notes
None
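A sketch of the optimizer-specific gradient computation, assuming self.cache holds the accumulated squared gradients per parameter:

    import numpy as np

    def get_opt_grad(self, trainable_params, grads):
        opt_grads = []
        for layer_grads, layer_cache in zip(grads, self.cache):
            # Scale each gradient by the inverse square root of its cache entry.
            opt_grads.append(
                {key: g / (np.sqrt(layer_cache[key]) + self.epsilon)
                 for key, g in layer_grads.items()})
        return opt_grads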
- update_cache(trainable_params, grads)¶
Updates the cache on each call.
- Parameters
trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.
grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.
- Returns
None
- Return type
None
Notes
self.cache is updated after calling.
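A sketch of the cache update, assuming the AdaGrad accumulation of squared gradients; the docstring above only states that self.cache is updated:

    def update_cache(self, trainable_params, grads):
        for layer_cache, layer_grads in zip(self.cache, grads):
            for key, g in layer_grads.items():
                layer_cache[key] = layer_cache[key] + g ** 2   # accumulate squared gradients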
- class optimizers.Optimizer(lr_schedule, grad_clipper, repr_str)¶
Bases:
object
Optimizer parent class.
- lr_schedule¶
The learning rate schedule of the optimizer.
- Type
- lr¶
The latest learning rate.
- Type
float
- __init__()¶
Constructor.
- apply_lr_schedule()¶
Applies the learning rate schedule of the optimizer.
- get_lr()¶
Returns the latest learning rate of the optimizer’s learning rate schedule.
- apply_lr_schedule()¶
Applies the learning rate schedule of the optimizer.
- Parameters
None
- Returns
None
- Return type
None
Notes
Updates self.lr
- get_lr()¶
Returns the latest learning rate of the optimizer’s learning rate schedule.
- Parameters
None
- Returns
lr – The latest learning rate of the learning rate schedule of the optimizer.
- Return type
float
Notes
Updates self.lr
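A hypothetical usage of the two scheduling methods inside a training loop, assuming optimizer and n_epochs are defined elsewhere; only apply_lr_schedule and get_lr are part of the documented interface:

    for epoch in range(n_epochs):
        # ... forward/backward pass and optimizer.apply_grads(...) ...
        optimizer.apply_lr_schedule()    # advance the schedule; updates optimizer.lr
        print(f"epoch {epoch}: lr = {optimizer.get_lr()}")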
- class optimizers.SGDOptimizer(lr_schedule, grad_clipper)¶
Bases:
optimizers.Optimizer
Stochastic gradient descent optimizer.
- lr_schedule¶
The learning rate schedule of the optimizer.
- Type
- lr¶
The latest learning rate.
- Type
float
- __init__()¶
Constructor.
- apply_lr_schedule()¶
Applies the learning rate schedule of the optimizer.
- get_lr()¶
Returns the latest learning rate of the optimizer’s learning rate schedule.
- apply_grads(trainable_params, grads)¶
Applies the gradient update rule to trainable params using gradients.
- apply_grads(trainable_params, grads)¶
Applies the gradient update rule to trainable params using gradients.
- Parameters
trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.
grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.
- Returns
updated_trainable_params – The list of dictionaries of the updated trainable parameters of all layers of a model. At idx is the dictionary of the updated trainable parameters of layer idx in the Model.layers list.
- Return type
list
Notes
Iterates over layers in ascending order in the Model.layers list.
- Raises
AssertionError – If the lengths of the trainable_params and grads lists are not the same.
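A minimal sketch of the plain SGD update applied layer by layer, assuming the dictionary-of-NumPy-arrays structure described above; gradient clipping via grad_clipper is omitted:

    def apply_grads(self, trainable_params, grads):
        assert len(trainable_params) == len(grads)
        updated_trainable_params = []
        for layer_params, layer_grads in zip(trainable_params, grads):
            # w <- w - lr * grad for every parameter of every layer.
            updated_trainable_params.append(
                {key: param - self.lr * layer_grads[key]
                 for key, param in layer_params.items()})
        return updated_trainable_params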