Add Samples distribution #1233
Conversation
arvoelke
left a comment
Sweet. Looks good! I realized halfway through that this was only meant for ndarray, but my feedback is written under the assumption that we could make this handle any iterable / array_like parameter.
```python
shape = (n,) if d is None else (n, d)
...
if d is None:
    samples = samples.squeeze()
```
What if d is not None and does not match the second dimension? Is this handled somewhere downstream?
Oh whoops, totally forgot to check that ;) Originally the check below checked the whole shape, but then I changed it and forgot to add the check for the second dimension. Will update tomorrow.
nengo/dists.py (outdated):
```python
self.samples = samples

def __repr__(self):
    return "Samples(samples=%r)" % self.samples
```
Use `... % (self.samples,)` instead; otherwise, if `self.samples` is a tuple, you will get a string-formatting `TypeError`.
As above, `samples` is always a NumPy array here, so we should be good. I considered the defensive tuple though, so I can slip it in there for kicks.
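For readers following along, the pitfall is easy to reproduce standalone (this is an illustrative snippet, not nengo code):

```python
samples = (1, 2, 3)

# Wrapping the value in a 1-tuple makes %-formatting treat it as a
# single argument:
safe = "Samples(samples=%r)" % (samples,)
assert safe == "Samples(samples=(1, 2, 3))"

# Passing the tuple directly makes each element a separate argument,
# so the single %r placeholder raises a TypeError:
caught = None
try:
    "Samples(samples=%r)" % samples
except TypeError as exc:
    caught = exc
assert caught is not None
```

This is why the 1-tuple form is the defensive choice even when the value happens to be an ndarray today.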
```python
def __init__(self, samples):
    super(Samples, self).__init__()
    self.samples = samples
```
Cast to ndarray here instead of in `sample`? That would make this work when given a generator, and would also indirectly resolve the comment on `__repr__`.
This is done automatically, since `samples` is an `NdarrayParam`.
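For context, the cast-on-assignment behavior being referred to can be sketched with a descriptor (simplified and illustrative; nengo's actual `NdarrayParam` also performs shape validation and more):

```python
import numpy as np

class NdarrayParam(object):
    # Simplified sketch of a descriptor-style parameter that casts
    # assigned values to an ndarray. Not nengo's real implementation.
    def __init__(self, name):
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if not isinstance(value, np.ndarray):
            # Materialize iterables (including generators) first;
            # np.asarray on a raw generator yields a 0-d object array.
            value = np.asarray(list(value))
        setattr(obj, self.name, value)

class Samples(object):
    samples = NdarrayParam("samples")

    def __init__(self, samples):
        self.samples = samples  # descriptor casts on assignment
```

With this, `Samples(x * 2 for x in range(3)).samples` is already an ndarray by the time `__repr__` or `sample` runs.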
nengo/networks/ensemblearray.py (outdated):
```python
super(EnsembleArray, self).__init__(label, seed, add_to_container)

for param in ens_kwargs:
    if isinstance(ens_kwargs[param], np.ndarray):
```
This could be generalized to handle any iterable?
Right, yeah. I'll use `nengo.utils.compat.is_array_like`.
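As a sketch of what such a check might look like (this is not the actual `nengo.utils.compat.is_array_like`, whose details may differ):

```python
import numpy as np

def is_array_like(obj):
    # Hypothetical stand-in for nengo.utils.compat.is_array_like:
    # treat anything NumPy can coerce to a float array as array-like.
    try:
        np.asarray(obj, dtype=np.float64)
        return True
    except (TypeError, ValueError):
        return False
```

Such a check accepts lists, tuples, and ndarrays alike, which is the generalization suggested above.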
```python
assert np.allclose(built.max_rates, max_rates)
assert np.allclose(built.intercepts, intercepts)
assert np.allclose(built.eval_points, eval_points)
assert built.eval_points.shape == eval_points.shape
```
Include test for mismatched shape?
Will do! 👍
For your discussion point, I also think a copy is fine. Better safe than sorry (the risk/reward is too high).
```python
for param in ens_kwargs:
    if isinstance(ens_kwargs[param], np.ndarray):
        ens_kwargs[param] = nengo.dists.Samples(ens_kwargs[param])
```
I think it is a bit inconsistent and magical to do this only for ensemble arrays. I would prefer that any array (or equivalent) assigned to a `DistOrArrayParam` be automatically cast to the `Samples` distribution. PR #1207 adds some functionality to make such casts require less boilerplate, particularly commit 7af0856.
Yeah, so like we talked about in the dev meeting, the issue there is that we then lose the model construction-time validation of shapes. This isn't a simple no-op: it means that if people are accessing `ensemble.encoders`, they will never get a NumPy array, which is almost always what they want if they're accessing it to do some math or some such. I see the benefit of this distribution for fixing this specific problem, but I think it's premature to switch a lot of core functionality over to it because of issues like this. Not saying we won't, but I'd prefer to do that in a separate PR. I'm happy to switch over other specific instances in which nothing changes for the user, but yeah, mostly I wanted to get this out quickly and haven't looked around for other places to do this.
> if people are accessing ensemble.encoders

How common is that? I think I have never set encoders to anything other than a distribution.

> I see the benefit of this distribution for fixing this specific problem

How do you feel about leaving this block out of the PR? It requires a bit more boilerplate on the user side, but it also makes explicit what's happening. Also, with this change, `ensemble_array.ensembles[i].encoders` will never return a NumPy array. So if we worry about that in general, why not in the case of ensemble arrays?
OK, pushed a fixup addressing @arvoelke's comments. I opted for testing mismatched shapes on the distribution itself, so added tests there.

Also, for future PRs, could you read over our reviewing guidelines, @arvoelke? We're trying to move away from reviews of the form "please change this" and instead have the reviewer make the change themselves. We're also not using GitHub's reviewing interface, opting for labels instead, though we might move to it in the new year.

Re @jgosmann's comments: setting encoders to numbers is why Xuan raised #691, so people do it. As for not including the block in EnsembleArray, one possible way around it would be for the [...]
This was not about setting, but accessing, encoders. So how common is it for people to set the encoders to numbers and access them afterwards to do math on them? (And even then, one could easily keep a variable with those encoders around, or query them from [...].)

Why was it that we're updating the config and do not pass ens_kwargs directly to the ensembles?
Sorry, I was replying to the comment "I think I never set encoders to something different than a distribution."
To save a few lines of code, and so that if you add more ensembles to the ensemble array after it's created, those parameters still apply. I brought this up as a possible resolution to these issues when we discussed it in the dev meeting, but no one had an opinion on it; so yeah, we could do that instead. Anyone else have thoughts?
Doesn't this add a line of code?
To me this seems to be a pretty weird use case... if no one is actually doing this or has any other arguments, I would favour passing ens_kwargs directly.
Added a small commit. Apart from the discussion about the usage in EnsembleArray, [...]
Seanny123
left a comment
You've addressed the concerns of @arvoelke and the code looks good, so I personally believe that this is ready to merge.
The commit looks fine to me.

Sorry, I should have been more explicit... could you take a look at the whole PR, and give it a review approval if it looks good?
jgosmann
left a comment
Haven't given this a close look, but it looked good to me before. I would still prefer to pass ens_kwargs directly to the ensembles instead of modifying the config, but I won't block this PR because of it.
This is essentially a no-op distribution that takes a set of samples and provides them when the `samples` method is called. It's intended to simplify situations in which a distribution or samples could be provided by converting samples to a distribution. The main place this happens is when passing `Ensemble` arguments to the `EnsembleArray` constructor. Previously, a distribution had to be provided, since the passed parameters are set as defaults on the `EnsembleArray` network. Now, any samples passed in are wrapped in the `Samples` distribution, which makes it possible to pass in samples. Tests have been added to verify this. Addresses #691 and #766.
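A minimal sketch of the idea, for readers skimming the PR (illustrative only; the real class in `nengo/dists.py` validates `samples` via `NdarrayParam`, per the review discussion above, and may differ in details):

```python
import numpy as np

class Samples(object):
    """No-op distribution that returns a fixed set of samples."""

    def __init__(self, samples):
        self.samples = np.asarray(samples)

    def __repr__(self):
        # Wrap in a 1-tuple so %-formatting is safe for any value.
        return "Samples(samples=%r)" % (self.samples,)

    def sample(self, n, d=None, rng=None):
        # A no-op: ignore rng, check the requested shape, and return
        # the fixed samples.
        shape = (n,) if d is None else (n, d)
        samples = self.samples
        if d is None and samples.ndim > 1:
            samples = samples.squeeze()
        if samples.shape != shape:
            raise ValueError(
                "expected shape %s, got %s" % (shape, samples.shape))
        return samples
```

Wrapping an array in this distribution lets code paths that expect a distribution (such as `EnsembleArray`'s per-ensemble defaults) accept concrete samples unchanged.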
Motivation and context:
Look, people want to pass in NumPy arrays to EnsembleArray. They just do! See #691 and #766 for proof; it's a thing. In #766, @arvoelke suggested a "no-op" distribution, which no one at the dev meeting disliked, so here it is.

Closes #691 and #766.
How has this been tested?
New EnsembleArray test included 👍

How long should this take to review?
Types of changes:
Checklist:
Still to do:
- [...] `Samples`. Should we?
- [...] `Samples` would simplify code?