An attempt at a base class for .py encoders #831

Open
wants to merge 1 commit into
base: master
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-v2.1.1
+v2.1.15
62 changes: 62 additions & 0 deletions py/htm/encoders/BaseEncoder.py
@@ -0,0 +1,62 @@
# ------------------------------------------------------------------------------
# HTM Community Edition of NuPIC
# Copyright (C) 2020, David Keeney
#
# This program is free software: you can redistribute it and/or modify it under
# the terms of the GNU Affero Public License version 3 as published by the Free
# Software Foundation.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Affero Public License for more details.
#
# You should have received a copy of the GNU Affero Public License along with
# this program. If not, see http://www.gnu.org/licenses.
# ------------------------------------------------------------------------------


# Base class for all encoders.
# An encoder converts a value to a sparse distributed representation.
#
# Subclasses must implement method encode and Serializable interface.
# Subclasses can optionally implement method reset.
#
# There are several critical properties which all encoders must have:
#
# 1) Semantic similarity: Similar inputs should have high overlap. Overlap
# decreases smoothly as inputs become less similar. Dissimilar inputs have
# very low overlap so that the output representations are not easily confused.
#
# 2) Stability: The representation for an input does not change during the
# lifetime of the encoder.
#
# 3) Sparsity: The output SDR should have a similar sparsity for all inputs and
# have enough active bits to handle noise and subsampling.
#
# Reference: https://arxiv.org/pdf/1602.05925.pdf


# Members dimensions and size describe the shape of the encoded output SDR.
# For example, a 6 by 4 SDR has dimensions (6, 4) and size 24.
# size is the total number of bits in the result.
# A subclass of BaseEncoder should pass the dimensions to the base-class constructor.
# All subclasses must contain an encode() method and a reset() method.

from abc import ABC, abstractmethod
import numpy as np
from htm.bindings.sdr import SDR

class BaseEncoder(ABC):
Review comment (Member):
We've touched on this. It's probably not worth creating a pybind interface for BaseEncoder.hpp; this file will be more flexible.


    @abstractmethod
    def __init__(self, dimensions):
Review comment (Member):
We might think of some common, useful params here: debug=0, inputType=None, ...

        self.dimensions = dimensions
        self.size = SDR(dimensions).size
Review comment (Member):
assert size > 0
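
One way to act on this, sketched without the real `htm.bindings.sdr.SDR` class: compute the size as the product of the dimensions and assert that it is positive. The helper name `size_from_dimensions` is hypothetical and not part of this PR; in the actual class the size would still come from `SDR(dimensions).size`.

```python
import math

def size_from_dimensions(dimensions):
    """Return the total number of bits for an output of the given shape.

    Hypothetical helper: stands in for SDR(dimensions).size so the check
    can be shown without the htm bindings.
    """
    dimensions = tuple(int(d) for d in dimensions)
    size = math.prod(dimensions)  # requires Python 3.8+
    # Reject empty shapes and shapes with a zero-length axis.
    assert len(dimensions) > 0 and size > 0, "encoder output needs at least one bit"
    return size

print(size_from_dimensions((6, 4)))  # 24
```

Failing early in the constructor keeps a zero-sized encoder from surfacing later as an obscure error inside `encode`.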


    @abstractmethod
    def reset(self):
        raise NotImplementedError()
Review comment (Member):
I mystified you about the raise: with @abstractmethod you should replace the raise with just pass.

Review comment (Member):
...and also remove the @abstractmethod: most encoders don't use reset, so they'll use this implementation:

def reset(self):
  pass
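
To illustrate the two comments above, here is a minimal sketch using only the standard library. It does not use `SDR`; the `ToyEncoder` subclass and its list-based output are made up for the example. `reset` is a plain method with a default no-op body, while `encode` stays abstract, so a subclass only has to provide `encode`.

```python
from abc import ABC, abstractmethod

class BaseEncoder(ABC):
    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)

    def reset(self):
        # Default no-op: most encoders keep no state between calls.
        pass

    @abstractmethod
    def encode(self, inp, output=None):
        # Abstract: every subclass must override this.
        pass

class ToyEncoder(BaseEncoder):
    # Hypothetical subclass, for illustration only.
    def encode(self, inp, output=None):
        return [int(inp)]

enc = ToyEncoder((1,))
enc.reset()                # inherited default, does nothing
print(enc.encode(3))       # [3]

try:
    BaseEncoder((1,))      # an abstract class cannot be instantiated
except TypeError as err:
    print("TypeError:", err)
```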


    @abstractmethod
    def encode(self, input, output):
Review comment (Member):
This one will remain abstract: we want all subclasses to reimplement it.
But we might add some common checks to save code elsewhere:

def encode(self, inp, output=None):
  if output is None:
    output = SDR(self.dimensions)
  else:
    assert( isinstance(output, SDR) )
    assert( all(x == y for x, y in zip( output.dimensions, self.dimensions )))

  if inp is None or (isinstance(inp, float) and math.isnan(inp)):
    output.zero()
    return output
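
A runnable sketch of this idea, under loud assumptions: `FakeSDR` below is a stand-in made up so the example does not need `htm.bindings`; it only mimics `dimensions`, `size`, `zero()` and a `sparse` list. `ScalarBitEncoder` and `is_missing` are likewise hypothetical. The point is that an `@abstractmethod` may still have a body, which subclasses reach through `super().encode(...)`.

```python
import math
from abc import ABC, abstractmethod

class FakeSDR:
    """Stand-in for htm.bindings.sdr.SDR (assumption: not the real API)."""
    def __init__(self, dimensions):
        self.dimensions = list(dimensions)
        self.size = math.prod(self.dimensions)
        self.sparse = []          # indices of active bits

    def zero(self):
        self.sparse = []

def is_missing(inp):
    """True for None or a float NaN, the two 'no value' cases."""
    return inp is None or (isinstance(inp, float) and math.isnan(inp))

class BaseEncoder(ABC):
    def __init__(self, dimensions):
        self.dimensions = list(dimensions)
        self.size = FakeSDR(dimensions).size
        assert self.size > 0

    @abstractmethod
    def encode(self, inp, output=None):
        # Shared checks; subclasses call this via super().encode(...).
        if output is None:
            output = FakeSDR(self.dimensions)
        else:
            assert isinstance(output, FakeSDR)
            assert list(output.dimensions) == list(self.dimensions)
        output.zero()             # start from an empty SDR in every case
        return output

class ScalarBitEncoder(BaseEncoder):
    """Hypothetical encoder: activates the single bit int(inp) % size."""
    def encode(self, inp, output=None):
        output = super().encode(inp, output)
        if not is_missing(inp):
            output.sparse = [int(inp) % self.size]
        return output

enc = ScalarBitEncoder((8,))
print(enc.encode(3).sparse)             # [3]
print(enc.encode(float("nan")).sparse)  # []
```

In this sketch the base method always returns a zeroed output and leaves the missing-value decision to the subclass through `is_missing`. That is one possible split of responsibilities, not necessarily the one intended in the comment above.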

        raise NotImplementedError()
15 changes: 8 additions & 7 deletions py/htm/encoders/grid_cell_encoder.py
@@ -20,8 +20,9 @@

from htm.bindings.sdr import SDR
from htm.bindings.math import Random
+from htm.bindings.encoder import BaseEncoder
Review comment (Member):
The above is not in htm.bindings.


-class GridCellEncoder:
+class GridCellEncoder(BaseEncoder):
"""
This Encoder converts a 2-D coordinate into plausible grid cell activity.
The output SDR is divided into modules. Each module is a distinct group of
@@ -50,8 +51,8 @@ def __init__(self,
encoder uses. This encoder produces deterministic output. The seed
zero is special, seed zero is replaced with a truly random seed.
"""
-self.size = size
-self.dimensions = (size,)
+super().__init__((size,))
Review comment (Member):
👍 This is a correct use of explicitly calling the superclass's methods. (Note: unlike in C++, you don't need to call super().__init__ as the first statement.)
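
A small standard-library sketch of that note. The class names `Base` and `Child` are placeholders, not the PR's classes. The subclass does some work of its own before calling `super().__init__`, which Python permits.

```python
class Base:
    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)

class Child(Base):
    def __init__(self, size, sparsity):
        # Validate and store subclass state first ...
        assert 0 <= sparsity <= 1
        self.sparsity = sparsity
        # ... then initialise the base class; the order is up to the author.
        super().__init__((size,))

c = Child(100, 0.25)
print(c.dimensions, c.sparsity)   # (100,) 0.25
```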


self.sparsity = sparsity
self.periods = tuple(sorted(float(p) for p in periods))
assert(len(self.periods) > 0)
@@ -60,14 +61,14 @@ def __init__(self,
assert(self.sparsity <= 1)

# Assign each module a range of cells in the output SDR.
-partitions = np.linspace(0, self.size, num=len(self.periods) + 1)
+partitions = np.linspace(0, super().size, num=len(self.periods) + 1)
Review comment (Member):
A subclass has all the members and methods of its superclass, so this could/should remain self.size, or even size.
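
A caveat worth checking here, shown with placeholder classes rather than the real encoder: in plain Python, `super()` resolves names on the parent class, so an attribute that the parent's `__init__` stores on the instance is reachable as `self.size` but not as `super().size`. The sketch assumes `size` is an ordinary instance attribute, as it is in the `BaseEncoder.py` of this PR.

```python
class Base:
    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)
        self.size = 1
        for d in self.dimensions:
            self.size *= d

class Child(Base):
    def __init__(self, size):
        super().__init__((size,))

    def size_via_self(self):
        return self.size          # instance attribute set by Base.__init__

    def size_via_super(self):
        return super().size       # looked up on the class only -> AttributeError

c = Child(100)
print(c.size_via_self())          # 100
try:
    c.size_via_super()
except AttributeError as err:
    print("AttributeError:", err)
```

If that holds for the real classes too, keeping `self.size` is the safer spelling.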

partitions = list(zip(partitions, partitions[1:]))
self.partitions_ = [(int(round(start)), int(round(stop)))
for start, stop in partitions]

# Assign each module a random offset and orientation.
rng = np.random.RandomState(seed = Random(seed).getUInt32())
-self.offsets_ = rng.uniform(0, max(self.periods)*9, size=(self.size, 2))
+self.offsets_ = rng.uniform(0, max(self.periods)*9, size=(super().size, 2))
self.angles_ = []
self.rot_mats_ = []
for period in self.periods:
@@ -95,10 +96,10 @@ def encode(self, location, grid_cells=None):
location = list(location)
Review comment (Member):
encode should call super().encode(location)

assert(len(location) == 2)
if grid_cells is None:
-grid_cells = SDR((self.size,))
+grid_cells = SDR(super().dimensions)
else:
assert(isinstance(grid_cells, SDR))
-assert(grid_cells.dimensions == [self.size])
+assert(grid_cells.dimensions == [super().size])
if any(math.isnan(x) for x in location):
grid_cells.zero()
return grid_cells