optimizers module

class optimizers.AdaGradOptimizer(lr_schedule, grad_clipper, epsilon=1e-06)

Bases: optimizers.Optimizer

AdaGrad gradient descent optimizer.
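As a rough orientation (hedged: whether m is the accumulated sum of squared gradients, as in classic AdaGrad, or the running mean mentioned under cache below, is not fixed by this page), each parameter theta with gradient g is updated roughly as:

    m     <- m + g^2                               (squared-gradient statistic kept in the cache)
    theta <- theta - lr * g / (sqrt(m) + epsilon)  (scaled update; epsilon for numerical stability)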

lr_schedule

The learning rate schedule of the optimizer.

Type

LRSchedule

lr

The latest learning rate.

Type

float

first_call

Whether apply_grads is being called for the first time; the cache is built on the first call.

Type

bool

epsilon

Numerical stability constant.

Type

float

cache

A list of dictionaries caching statistics of the parameter gradients, such as the running mean of squared gradients, m.

Type

list

__init__()

Constructor.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

build_cache(trainable_params, grads)

Build cache on first call.

update_cache(trainable_params, grads)

Update cache on call.

get_opt_grad(trainable_params, grads)

Get the optimizer-specific grads computed using the cache.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

updated_trainable_params – The list of dictionaries of the updated trainable parameters of all layers of a model. At idx is the dictionary of the updated trainable parameters of layer idx in the Model.layers list.

Return type

list

Notes

Iterates over layers in ascending order in the Model.layers list.

Raises

AssertionError – If the lengths of the trainable_params and grads lists are not the same.
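As a hedged usage sketch (lr_schedule, grad_clipper, and the parameter arrays are assumed to already exist, and the "w"/"b" keys are illustrative only, not a documented naming convention), apply_grads consumes and returns the parallel list-of-dicts structure described above:

    >>> optimizer = optimizers.AdaGradOptimizer(lr_schedule, grad_clipper, epsilon=1e-6)
    >>> # one dict per layer, keyed by parameter name, in Model.layers order
    >>> trainable_params = [{"w": w0, "b": b0}, {"w": w1, "b": b1}]
    >>> grads = [{"w": dw0, "b": db0}, {"w": dw1, "b": db1}]
    >>> updated_trainable_params = optimizer.apply_grads(trainable_params, grads)
    >>> len(updated_trainable_params) == len(trainable_params)
    True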

build_cache(trainable_params, grads)

Build cache on first call.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

Return type

None

Notes

self.cache is built after this method is called.
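A minimal sketch of what the cache construction might look like, assuming each cache entry mirrors the per-layer gradient dicts and stores the squared-gradient statistic m initialized to zeros (function and key names here are illustrative, not the documented internals):

    import numpy as np

    def build_cache_sketch(trainable_params, grads):
        # One cache dict per layer, mirroring the structure of the grads dicts;
        # the squared-gradient statistic m starts at zero for every parameter.
        cache = []
        for layer_grads in grads:
            cache.append({name: {"m": np.zeros_like(g)} for name, g in layer_grads.items()})
        return cache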

get_opt_grad(trainable_params, grads)

Get the optimizer-specific grads computed using the cache.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

opt_grads – The list of dictionaries of the optimizer-specific gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Return type

list

Notes

None
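For AdaGrad, the optimizer-specific gradient is typically the raw gradient rescaled by the cached squared-gradient statistic. A sketch under that assumption (names are illustrative and follow the cache layout sketched above):

    import numpy as np

    def get_opt_grad_sketch(grads, cache, epsilon=1e-6):
        # Rescale each gradient by 1 / (sqrt(m) + epsilon), layer by layer.
        opt_grads = []
        for layer_grads, layer_cache in zip(grads, cache):
            opt_grads.append({
                name: g / (np.sqrt(layer_cache[name]["m"]) + epsilon)
                for name, g in layer_grads.items()
            })
        return opt_grads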

update_cache(trainable_params, grads)

Update cache on call.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

Return type

None

Notes

self.cache is updated after this method is called.
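Given the attribute description above (a statistic m of squared gradients), the cache update could look roughly like the sketch below; whether m is an accumulated sum (classic AdaGrad) or a running mean is an implementation detail not fixed by this page:

    def update_cache_sketch(cache, grads):
        # Accumulate squared gradients into the per-parameter statistic m, layer by layer.
        for layer_cache, layer_grads in zip(cache, grads):
            for name, g in layer_grads.items():
                layer_cache[name]["m"] = layer_cache[name]["m"] + g ** 2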

class optimizers.Optimizer(lr_schedule, grad_clipper, repr_str)

Bases: object

Optimizer parent class.

lr_schedule

The learning rate schedule of the optimizer.

Type

LRSchedule

lr

The latest learning rate.

Type

float

__init__()

Constructor.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

Parameters

None

Returns

Return type

None

Notes

Updates self.lr

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

Parameters

None

Returns

lr – The latest learning rate of the learning rate schedule of the optimizer.

Return type

float

Notes

Updates self.lr
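As a hedged sketch of how a training loop might interact with these two methods (batches, model, and the gradient computation are placeholders, and apply_grads belongs to the subclasses such as SGDOptimizer and AdaGradOptimizer, not to the base class itself):

    >>> for x_batch, y_batch in batches:                        # placeholder data iterator
    ...     grads = model.backward(x_batch, y_batch)            # placeholder gradient computation
    ...     params = optimizer.apply_grads(trainable_params, grads)
    ...     optimizer.apply_lr_schedule()                       # advance the schedule; updates self.lr
    ...     lr = optimizer.get_lr()                             # latest learning rate, e.g. for logging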

class optimizers.SGDOptimizer(lr_schedule, grad_clipper)

Bases: optimizers.Optimizer

Stochastic gradient descent optimizer.

lr_schedule

The learning rate schedule of the optimizer.

Type

LRSchedule

lr

The latest learning rate.

Type

float

__init__()

Constructor.

apply_lr_schedule()

Applies the learning rate schedule of the optimizer.

get_lr()

Returns the latest learning rate of the optimizer’s learning rate schedule.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

apply_grads(trainable_params, grads)

Applies the gradient update rule to trainable params using gradients.

Parameters
  • trainable_params (list) – The list of dictionaries of the trainable parameters of all layers of a model. At idx is the dictionary of trainable parameters of layer idx in the Model.layers list.

  • grads (list) – The list of dictionaries of gradients of all parameters of all layers of a model. At idx is the dictionary of gradients of layer idx in the Model.layers list.

Returns

updated_trainable_params – The list of dictionaries of the updated trainable parameters of all layers of a model. At idx is the dictionary of the updated trainable parameters of layer idx in the Model.layers list.

Return type

list

Notes

Iterates over layers in ascending order in the Model.layers list.

Raises

AssertionError – If the lengths of the trainable_params and grads lists are not the same.
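For plain SGD, the per-parameter update is the familiar theta <- theta - lr * g rule. A minimal sketch of the per-layer loop, assuming any gradient clipping has already been handled by the grad_clipper (function name is illustrative):

    def sgd_apply_grads_sketch(trainable_params, grads, lr):
        # theta <- theta - lr * g for every parameter of every layer,
        # iterating over layers in ascending order of the Model.layers list.
        assert len(trainable_params) == len(grads)
        updated_trainable_params = []
        for layer_params, layer_grads in zip(trainable_params, grads):
            updated_trainable_params.append(
                {name: p - lr * layer_grads[name] for name, p in layer_params.items()}
            )
        return updated_trainable_params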