Add Samples distribution #1233
Conversation
arvoelke
left a comment
Sweet. Looks good! I realized halfway through that this was only meant for ndarray, but my feedback is written under the assumption that we could make this handle any iterable / array_like parameter.
```python
shape = (n,) if d is None else (n, d)
...
if d is None:
    samples = samples.squeeze()
```
What if d is not None and does not match the second dimension? Is this handled somewhere downstream?
Oh whoops, totally forgot to check that ;) Originally the check below checked the whole shape, but then I changed it and forgot to add the check for the second dimension. Will update tomorrow.
nengo/dists.py (outdated):
```python
self.samples = samples

def __repr__(self):
    return "Samples(samples=%r)" % self.samples
```
Use `... % (self.samples,)` instead; otherwise, if `self.samples` is a tuple, you will get a string-formatting `TypeError`.
As above, `samples` is always a NumPy array here, so we should be good. I considered the defensive tuple though, so I can slip it in there for kicks.
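For readers following along, the pitfall is easy to reproduce standalone (this is an illustrative snippet, not nengo code):

```python
samples = (1, 2, 3)

# Wrapping the value in a 1-tuple makes %-formatting treat it as a
# single argument:
safe = "Samples(samples=%r)" % (samples,)
assert safe == "Samples(samples=(1, 2, 3))"

# Passing the tuple directly makes each element a separate argument,
# so the single %r placeholder raises a TypeError:
caught = None
try:
    "Samples(samples=%r)" % samples
except TypeError as exc:
    caught = exc
assert caught is not None
```

This is why the 1-tuple form is the defensive choice even when the value happens to be an ndarray today.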
```python
def __init__(self, samples):
    super(Samples, self).__init__()
    self.samples = samples
```
Cast to ndarray here instead of in `sample`? That would make this work when given a generator, and would also indirectly resolve the comment on `__repr__`.
This is done automatically, since `samples` is an `NdarrayParam`.
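For context, the cast-on-assignment behavior being referred to can be sketched with a descriptor (simplified and illustrative; nengo's actual `NdarrayParam` also performs shape validation and more):

```python
import numpy as np

class NdarrayParam(object):
    # Simplified sketch of a descriptor-style parameter that casts
    # assigned values to an ndarray. Not nengo's real implementation.
    def __init__(self, name):
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if not isinstance(value, np.ndarray):
            # Materialize iterables (including generators) first;
            # np.asarray on a raw generator yields a 0-d object array.
            value = np.asarray(list(value))
        setattr(obj, self.name, value)

class Samples(object):
    samples = NdarrayParam("samples")

    def __init__(self, samples):
        self.samples = samples  # descriptor casts on assignment
```

With this, `Samples(x * 2 for x in range(3)).samples` is already an ndarray by the time `__repr__` or `sample` runs.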
nengo/networks/ensemblearray.py (outdated):
```python
super(EnsembleArray, self).__init__(label, seed, add_to_container)

for param in ens_kwargs:
    if isinstance(ens_kwargs[param], np.ndarray):
```
This could be generalized to handle any iterable?
Right, yeah. I'll use `nengo.utils.compat.is_array_like`.
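As a sketch of what such a check might look like (this is not the actual `nengo.utils.compat.is_array_like`, whose details may differ):

```python
import numpy as np

def is_array_like(obj):
    # Hypothetical stand-in for nengo.utils.compat.is_array_like:
    # treat anything NumPy can coerce to a float array as array-like.
    try:
        np.asarray(obj, dtype=np.float64)
        return True
    except (TypeError, ValueError):
        return False
```

Such a check accepts lists, tuples, and ndarrays alike, which is the generalization suggested above.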
```python
assert np.allclose(built.max_rates, max_rates)
assert np.allclose(built.intercepts, intercepts)
assert np.allclose(built.eval_points, eval_points)
assert built.eval_points.shape == eval_points.shape
```
Include test for mismatched shape?
Will do! 👍
For your discussion point, I also think a copy is fine. Better safe than sorry (the risk/reward is too high).
```python
for param in ens_kwargs:
    if isinstance(ens_kwargs[param], np.ndarray):
        ens_kwargs[param] = nengo.dists.Samples(ens_kwargs[param])
```
I think it is a bit inconsistent and magical to do this only for ensemble arrays. I would prefer that any array (or equivalent) assigned to a `DistOrArrayParam` be automatically cast to the `Samples` distribution. PR #1207 adds some functionality to make such casts require less boilerplate, particularly commit 7af0856.
Yeah, so like we talked about in the dev meeting, the issue there is that we then lose the model construction-time validation of shapes. This isn't a simple no-op: it means that if people are accessing `ensemble.encoders`, they will never get a NumPy array, which is almost always what they want if they're accessing it to do some math or some such. I see the benefit of this distribution for fixing this specific problem, but I think it's premature to switch a lot of core functionality over to it because of issues like this. Not saying we won't, but I'd prefer to do that in a separate PR. I'm happy to switch over other specific instances in which nothing changes for the user, but yeah, mostly I wanted to get this out quickly and haven't looked around for other places to do this.
> if people are accessing ensemble.encoders

How common is that? I think I have never set encoders to anything other than a distribution.

> I see the benefit of this distribution for fixing this specific problem

How do you feel about leaving this block out of the PR? It requires a bit more boilerplate on the user side, but it also makes explicit what's happening. Also, with this change, `ensemble_array.ensembles[i].encoders` will never return a NumPy array. So if we worry about that in general, why not in the case of ensemble arrays?
OK, pushed a fixup addressing @arvoelke's comments. I opted for testing mismatched shapes on the distribution itself, so added tests there.

Also, for future PRs, could you read over our reviewing guidelines, @arvoelke? We're trying to move away from reviews of the form "please change this" and instead have the reviewer make the change themselves. We're also not using GitHub's reviewing interface, opting for labels instead, though we might move to it in the new year.

Re @jgosmann's comments: setting encoders to numbers is why Xuan raised #691, so people do it. As for not including the block in EnsembleArray, one possible way around it would be for the [...]
This was not about setting, but accessing, encoders. So how common is it for people to set the encoders to numbers and access them afterwards to do math on them? (And even then, one could easily keep a variable with those encoders around, or query them from [...].)

Why was it that we're updating the config and do not pass ens_kwargs directly to the ensembles?
Sorry, I was replying to the comment "I think I never set encoders to something different than a distribution."
To save a few lines of code, and so that if you add more ensembles to the ensemble array after it's created, those parameters still apply. I brought this up as a possible resolution to these issues when we discussed it in the dev meeting, but no one had an opinion on it; so yeah, we could do that instead. Anyone else have thoughts?
Doesn't this add a line of code?
To me this seems to be a pretty weird use case... if no one is actually doing this or has any other arguments, I would favour passing ens_kwargs directly.
Added a small commit. Apart from the discussion about the usage in EnsembleArray, [...]
Seanny123
left a comment
You've addressed the concerns of @arvoelke and the code looks good, so I personally believe that this is ready to merge.
The commit looks fine to me.

Sorry, I should have been more explicit... could you take a look at the whole PR, and give it a review approval if it looks good?
jgosmann
left a comment
Haven't given this a close look, but it looked good to me before. I would still prefer to pass ens_kwargs directly to the ensembles instead of modifying the config, but I won't block this PR because of it.
This is essentially a no-op distribution that takes a set of samples and provides them when the `samples` method is called. It's intended to simplify situations in which a distribution or samples could be provided by converting samples to a distribution. The main place this happens is when passing `Ensemble` arguments to the `EnsembleArray` constructor. Previously, a distribution had to be provided, since the passed parameters are set as defaults on the `EnsembleArray` network. Now, any samples passed in are wrapped in the `Samples` distribution, which makes it possible to pass in samples. Tests have been added to verify this. Addresses #691 and #766.
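A minimal sketch of the idea, for readers skimming the PR (illustrative only; the real class in `nengo/dists.py` validates `samples` via `NdarrayParam`, per the review discussion above, and may differ in details):

```python
import numpy as np

class Samples(object):
    """No-op distribution that returns a fixed set of samples."""

    def __init__(self, samples):
        self.samples = np.asarray(samples)

    def __repr__(self):
        # Wrap in a 1-tuple so %-formatting is safe for any value.
        return "Samples(samples=%r)" % (self.samples,)

    def sample(self, n, d=None, rng=None):
        # A no-op: ignore rng, check the requested shape, and return
        # the fixed samples.
        shape = (n,) if d is None else (n, d)
        samples = self.samples
        if d is None and samples.ndim > 1:
            samples = samples.squeeze()
        if samples.shape != shape:
            raise ValueError(
                "expected shape %s, got %s" % (shape, samples.shape))
        return samples
```

Wrapping an array in this distribution lets code paths that expect a distribution (such as `EnsembleArray`'s per-ensemble defaults) accept concrete samples unchanged.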
Motivation and context:
Look, people want to pass in NumPy arrays to EnsembleArray. They just do! See #691 and #766 for proof; it's a thing. In #766, @arvoelke suggested a "no-op" distribution, which no one at the dev meeting disliked, so here it is.

Closes #691 and #766.
How has this been tested?
New EnsembleArray test included 👍

How long should this take to review?
Types of changes:
Checklist:
Still to do:
- [...] `Samples`. Should we?
- [...] `Samples` would simplify code?