optimizers module

class optimizers.AdaGradOptimizer(lr_schedule, grad_clipper, epsilon=1e-06)

Bases: optimizers.Optimizer

AdaGrad gradient descent optimizer.
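As a rough orientation (hedged: whether m is the accumulated sum of squared gradients, as in classic AdaGrad, or the running mean mentioned under cache below, is not fixed by this page), each parameter theta with gradient g is updated roughly as:

    m     <- m + g^2                               (squared-gradient statistic kept in the cache)
    theta <- theta - lr * g / (sqrt(m) + epsilon)  (scaled update; epsilon for numerical stability)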

lr_schedule

The learning rate schedule of the optimizer.

Type

LRSchedule

lr

The latest learning rate.

Type

float

first_call

Whether apply_grads is being called for the first time; the cache is built on the first call.

Type

bool

epsilon

Numerical stability constant.

Type

float

cache

A list of dictionaries caching statistics of the parameter gradients, such as the running mean of squared gradients, m.

Type

list

__init__()

Constructor.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

build_cache(trainable_params, grads)

Build cache on first call.

update_cache(trainable_params, grads)

Update cache on call.

get_opt_grad(trainable_params, grads)

Get the optimizer-specific grads computed using the cache.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

updated_trainable_params – The list of dictionaries of the updated trainable parameters of all layers of a model. At idx is the dictionary of the updated trainable parameters of layer idx in the Model.layers list.

Return type

list

Notes

Iterates over layers in ascending order in the Model.layers list.

Raises

AssertionError – If the lengths of the trainable_params and grads lists are not the same.
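As a hedged usage sketch (lr_schedule, grad_clipper, and the parameter arrays are assumed to already exist, and the "w"/"b" keys are illustrative only, not a documented naming convention), apply_grads consumes and returns the parallel list-of-dicts structure described above:

    >>> optimizer = optimizers.AdaGradOptimizer(lr_schedule, grad_clipper, epsilon=1e-6)
    >>> # one dict per layer, keyed by parameter name, in Model.layers order
    >>> trainable_params = [{"w": w0, "b": b0}, {"w": w1, "b": b1}]
    >>> grads = [{"w": dw0, "b": db0}, {"w": dw1, "b": db1}]
    >>> updated_trainable_params = optimizer.apply_grads(trainable_params, grads)
    >>> len(updated_trainable_params) == len(trainable_params)
    True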

build_cache(trainable_params, grads)

Build cache on first call.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

Return type

None

Notes

self.cache is built after this method is called.
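A minimal sketch of what the cache construction might look like, assuming each cache entry mirrors the per-layer gradient dicts and stores the squared-gradient statistic m initialized to zeros (function and key names here are illustrative, not the documented internals):

    import numpy as np

    def build_cache_sketch(trainable_params, grads):
        # One cache dict per layer, mirroring the structure of the grads dicts;
        # the squared-gradient statistic m starts at zero for every parameter.
        cache = []
        for layer_grads in grads:
            cache.append({name: {"m": np.zeros_like(g)} for name, g in layer_grads.items()})
        return cache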

get_opt_grad(trainable_params, grads)

Get the optimizer-specific grads computed using the cache.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

opt_grads – The list of dictionaries of the optimizer-specific gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Return type

list

Notes

None
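For AdaGrad, the optimizer-specific gradient is typically the raw gradient rescaled by the cached squared-gradient statistic. A sketch under that assumption (names are illustrative and follow the cache layout sketched above):

    import numpy as np

    def get_opt_grad_sketch(grads, cache, epsilon=1e-6):
        # Rescale each gradient by 1 / (sqrt(m) + epsilon), layer by layer.
        opt_grads = []
        for layer_grads, layer_cache in zip(grads, cache):
            opt_grads.append({
                name: g / (np.sqrt(layer_cache[name]["m"]) + epsilon)
                for name, g in layer_grads.items()
            })
        return opt_grads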

update_cache(trainable_params, grads)

Update cache on call.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

Return type

None

Notes

self.cache is updated after this method is called.
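Given the attribute description above (a statistic m of squared gradients), the cache update could look roughly like the sketch below; whether m is an accumulated sum (classic AdaGrad) or a running mean is an implementation detail not fixed by this page:

    def update_cache_sketch(cache, grads):
        # Accumulate squared gradients into the per-parameter statistic m, layer by layer.
        for layer_cache, layer_grads in zip(cache, grads):
            for name, g in layer_grads.items():
                layer_cache[name]["m"] = layer_cache[name]["m"] + g ** 2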

class optimizers.Optimizer(lr_schedule, grad_clipper, repr_str)

Bases: object

Optimizer parent class.

lr_schedule

The learning rate schedule of the optimizer.

Type

LRSchedule

lr

The latest learning rate.

Type

float

__init__()

Constructor.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

Parameters

None

Returns

Return type

None

Notes

Updates self.lr

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

Parameters

None

Returns

lr – The latest learning rate of the learning rate schedule of the optimizer.

Return type

float

Notes

Updates self.lr
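As a hedged sketch of how a training loop might interact with these two methods (batches, model, and the gradient computation are placeholders, and apply_grads belongs to the subclasses such as SGDOptimizer and AdaGradOptimizer, not to the base class itself):

    >>> for x_batch, y_batch in batches:                        # placeholder data iterator
    ...     grads = model.backward(x_batch, y_batch)            # placeholder gradient computation
    ...     params = optimizer.apply_grads(trainable_params, grads)
    ...     optimizer.apply_lr_schedule()                       # advance the schedule; updates self.lr
    ...     lr = optimizer.get_lr()                             # latest learning rate, e.g. for logging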

class optimizers.SGDOptimizer(lr_schedule, grad_clipper)

Bases: optimizers.Optimizer

Stochastic gradient descent optimizer.

lr_schedule

The learning rate schedule of the optimizer.

Type

LRSchedule

lr

The latest learning rate.

Type

float

__init__()

Constructor.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

updated_trainable_params – The list of dictionaries of the updated trainable parameters of all layers of a model. At idx is the dictionary of the updated trainable parameters of layer idx in the Model.layers list.

Return type

list

Notes

Iterates over layers in ascending order in the Model.layers list.

Raises

AssertionError – If the lengths of the trainable_params and grads lists are not the same.
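For plain SGD, the per-parameter update is the familiar theta <- theta - lr * g rule. A minimal sketch of the per-layer loop, assuming any gradient clipping has already been handled by the grad_clipper (function name is illustrative):

    def sgd_apply_grads_sketch(trainable_params, grads, lr):
        # theta <- theta - lr * g for every parameter of every layer,
        # iterating over layers in ascending order of the Model.layers list.
        assert len(trainable_params) == len(grads)
        updated_trainable_params = []
        for layer_params, layer_grads in zip(trainable_params, grads):
            updated_trainable_params.append(
                {name: p - lr * layer_grads[name] for name, p in layer_params.items()}
            )
        return updated_trainable_params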