[MRG + 1] Do not shuffle by default for DBSCAN. #4066
Conversation
The generator used to shuffle the samples. Defaults to numpy.random.
The generator used to shuffle the samples, which affects the cluster
numbering and cluster assignments of points that are border points to
more than one cluster. Defaults to not shuffling (None).
This is not the convention used elsewhere; rather, a separate `shuffle` parameter is used.
So should I rename it to "shuffle" then instead of random_state?
No, I mean that, for good or bad, `random_state=None` means use an arbitrary random number generator, while an additional parameter controls whether randomness is used at all! See for instance the `cross_validation` module or `SGD*`.
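A hedged illustration of the convention being referenced, as it appears in `SGDClassifier` (the parameter values here are arbitrary, not from this PR):

```python
from sklearn.linear_model import SGDClassifier

# `shuffle` controls whether randomness is used at all;
# `random_state` only makes the shuffling reproducible
clf = SGDClassifier(shuffle=True, random_state=0)
clf_deterministic = SGDClassifier(shuffle=False)
```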
I'm not sure whether there are concerns about backwards compatibility regarding making shuffle False by default. As you say, it's mostly deterministic (and I had wondered whether it would make sense in the batch-computed approach to work from densest to sparsest core samples). Regarding the complexity issue: do you find the new implementation prohibitively costly for datasets that were fine under the previous implementation? This sort of trade-off seems to me quite common in interpreted numerical processing (where speed is obtained through vectorized, native-code bulk operations), so I wasn't concerned in making that change for the sake of a substantial speed-up (which can be further improved upon, mind you, but only if done in bulk). However, if you have a real concern, we might be able to find a compromise solution that works in batches, but the second-order lookup means the code will be messy. Or we might decide that the previous implementation, albeit somewhat slow, was fine.
It may well be acceptable. I have not benchmarked. How much speedup does vectorization give in `neighbors_model.radius_neighbors`, which is probably the only really costly part? I'd suggest to drop the `random_state` parameter completely, then. People may think that `random_state` has a similar impact as with k-means, but it doesn't matter much. If someone really wants to experiment with shuffled data, they can just shuffle the data prior to running DBSCAN.
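Shuffling prior to clustering would look something like this (a minimal sketch; `X`, `eps` and `min_samples` stand in for the user's data and parameters):

```python
from sklearn.cluster import DBSCAN
from sklearn.utils import shuffle

X_shuffled = shuffle(X, random_state=42)  # shuffle outside the estimator
labels = DBSCAN(eps=0.3, min_samples=20).fit_predict(X_shuffled)
```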
I think that's an interesting proposal, but we would need some kind of deprecation strategy. @robertlayton wdyt about removing randomisation from DBSCAN on the basis that it is deterministic except in rare edge cases?
Not a great lot, it seems, as we move asymptotic. Maybe I should reevaluate those changes in implementation. In the meantime, perhaps your note is apt.
I agree that the algorithm is "mostly" deterministic. However, the trend is to perform shuffling within the classifier rather than outside of it. For that reason, I would recommend leaving the random_state parameter intact, and providing an option
Is random_state=False a convention used elsewhere in the package?
I don't think so, and I don't really like it. Maybe people don't
I think shuffling inside estimators for stochastic algorithms is basically mandatory, as in SGDClassifier.
Let alone False, None and 0.
Note that since the changes I made the other week change what is being
What is the preferred way of warning of the removed parameter in scipy? I do not think we should add another option that does not help the user get better results. It at most changes a few border points; this will not increase the overall performance. Having the option will only make users assume this is another knob to tune. For compatibility, it makes sense to keep the parameter and either silently ignore it, or warn if it is set. Indeed, the changes by @jnothman already changed the shuffling compared to previous versions.
I would warn if shuffle is True or random_state is not None.
@GaelVaroquaux there is currently no
Ok then let's not introduce one, and if anyone sets random_state we raise a deprecation warning and don't shuffle?
Yes. That sounds good to me.
Ping @jnothman and @robertlayton. Do you have an idea?
Not a strong idea, but reasoning roughly: the algorithm calculates core samples depending only on neighborhood density, and assigns distinct labels to connected components of the distance < eps graph among core samples*. It is the non-core samples (which lie in areas of low density relative to the model parameters) that may be within eps of multiple core samples, which need to be > eps from each other in order for there to be label ambiguity. But presumably these points are relatively rare, in that they lie between two areas of sufficiently high density, but are not in one themselves. @kno10's reference to "except for rare border cases" implies this has been more robustly analysed somewhere, and I would be glad for a reference before making any rash decisions.

(*) This makes me now think the implementation can easily be made still faster - i.e. dropping any Python loops - with
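For readers unfamiliar with the connected-components formulation sketched above, here is a minimal illustration. This is not the PR's code; the function name `dbscan_cc` and all details are a hypothetical sketch built on `radius_neighbors_graph` and `scipy.sparse.csgraph.connected_components` (with `include_self=True` so that a point counts toward its own neighborhood, per #4073):

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.neighbors import radius_neighbors_graph

def dbscan_cc(X, eps=0.5, min_samples=5):
    # 0/1 adjacency: an edge joins any two points within eps of each other
    G = radius_neighbors_graph(X, radius=eps, include_self=True)
    core = np.asarray(G.sum(axis=1)).ravel() >= min_samples

    labels = np.full(X.shape[0], -1, dtype=int)  # -1 marks noise
    # clusters: connected components of the eps-graph restricted to cores
    _, core_labels = connected_components(G[core][:, core], directed=False)
    labels[np.flatnonzero(core)] = core_labels

    # border points adopt the label of an arbitrary core neighbour;
    # this is exactly the order-dependent step discussed in this thread
    for i in np.flatnonzero(~core):
        neighbours = G.indices[G.indptr[i]:G.indptr[i + 1]]
        core_neighbours = neighbours[core[neighbours]]
        if core_neighbours.size:
            labels[i] = labels[core_neighbours[0]]
    return core, labels
```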
I have such an implementation at https://github.com/jnothman/scikit-learn/tree/dbscan_vec2 which happens to assign peripheral points to the cluster of the nearest core sample rather than the first in a shuffled order.
Does it lead to computational speed-ups?
+1 |
The original DBSCAN publication specifies "it might happen that some point p belongs to both, C1 and C2. [...] In this case, point p will be assigned to the cluster discovered first. Except from these rare situations, the result of DBSCAN is independent of the order in which the points of the database are visited [...]"
With #4009 merged, the calculation of radius neighbors becomes parallelisable, which means that this can be sped up close to n_cores times. That's certainly something I'll want in my use of DBSCAN, and it is not possible when querying one point at a time (although conceivably we could parallelise over points in the visited sample's neighborhood, to much less gain per overhead). IMO, using connected components means that the code is much easier to read than looking at nested loops and trying to understand their invariants.
But I guess one can get the n_cores speed-up by calculating the complete pairwise distance matrix; the memory usage is much more concerning then.
But I aim to give you benchmarks of the improvement without parallelism on a real dataset I'm using.
Some benchmarks. I should note in advance that a major reason for rewriting the dbscan code is that iterating over rows of a sparse matrix is a lot slower than over a dense matrix. You will see this effect below. I'm comparing a version of the scikit-learn 0.15 implementation (old) against the new implementation (new).

My input is an array of (7737, 100) minhashes that I am comparing with hamming distance. They are weighted to avoid excess work for duplicate hashes. Note that this setup can't test the effect on sparse matrices. This is obviously not a very large dataset, but it is a realistic start to get some idea of what would be the best way to implement this.

Experiment 1: what I actually want to do

    dbscan_.dbscan(sketch_array, eps=.3, min_samples=20, sample_weight=np.array(weights), metric='hamming')

old: 55.2 s, new: 30.5 s

Experiment 2: the same, but with a precomputed distance matrix (which takes 13 s to compute)

    dbscan_.dbscan(dist, eps=.3, min_samples=20, sample_weight=np.array(weights), metric='precomputed')

old: 762 ms, new: 4.14 s ... clearly there's something a bit odd happening here that should be checked out. But this does show that the 25 s gain above comes from not querying one sample at a time.

Experiment 3: use Euclidean distance, even though it's nonsense over this dataset, because it has a fast implementation and works for sparse input

    dbscan_.dbscan(sketch_array, eps=1e6, min_samples=20, sample_weight=np.array(weights), metric='euclidean')

old: 10.1 s, new: 8.5 s

Experiment 4: the same with sparse input

    dbscan_.dbscan(sparse.csr_matrix(sketch_array), eps=1e6, min_samples=20, sample_weight=np.array(weights), metric='euclidean')

old: 3 min 18 s, new: 24.2 s

In summary, the main benefit of the new approach(es) is not extracting and querying individual rows from the input, as well as having a much more succinct implementation. The main disadvantages are extra memory usage and less direct comparability between the algorithm in the paper and the code. Clearly, the row extraction is very costly for sparse input. An alternative would be to special-case sparse input and compute the distance matrix first, or to suggest the use of 'precomputed' where memory allows it. I'm happy to revert much of #3994 and find another way to handle these slow cases if that is deemed appropriate, and better for uses where memory is an issue.
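For reference, the `dist` used in Experiment 2 would presumably be obtained along these lines (a sketch; `sketch_array` is the (7737, 100) minhash array described above):

```python
from sklearn.metrics import pairwise_distances

# the ~13 s precomputation step referred to in Experiment 2
dist = pairwise_distances(sketch_array, metric='hamming')
```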
@jnothman The current version in this branch still computes all neighborhoods in one pass via:

Since finding the neighbors is 99% of the cost in my experience, I do see potential for speedup there; and the code remains easy to map back to what is published as DBSCAN for a new reader. The iteration

The patch proposed in this branch contains:
Discussing #3994 now:
I realise you've made no substantive changes to the implementation. But you've highlighted a critique of #3994, which is why we're discussing it here. I've not benchmarked the #3994 code presently. I don't think the efficiency of the concatenation is an issue, but it can be benchmarked if we get there. The only question is whether the memory trade-offs that your note highlights are worthwhile. I now suspect they are not, but that we might make it easier for a user to request that the matrix be precomputed (either only those neighbors within eps, or all pairs, which seems a much faster operation) rather than iterate through the dataset itself.
On my test data sets (10k and 50k coordinates from Twitter, but "misusing" Euclidean distance), current head was not slower with a precomputed distance matrix, and 5x faster than 0.15.2. However, I was able to shave off another 20-40% with a different vectorization approach, which is in my patch-2 branch. Any ideas to further improve this version, before I do a pull request?
(Travis CI build failure is due to also including a fix from #4073: min_pts does include the query point in DBSCAN - it is in the database, and thus returned by a range query).
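To illustrate the #4073 point in isolation (a hedged sketch, not code from this PR): a range query around a point that is itself in the database returns that point, so it counts toward `min_samples`.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.array([[0.0], [0.1], [5.0]])
nn = NearestNeighbors(radius=0.5).fit(X)
# query around the database point X[0]
neighbours = nn.radius_neighbors(X[:1], return_distance=False)[0]
# neighbours contains index 0 (the query point itself) as well as index 1,
# so X[0] contributes to its own neighborhood count
```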
@@ -89,15 +95,15 @@ def dbscan(X, eps=0.5, min_samples=5, metric='minkowski',
    """
    if not eps > 0.0:
        raise ValueError("eps must be positive.")
    if random_state is not None:
This should be a deprecation warning and should say that it will be removed in 0.18, I think.
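Something along these lines, presumably (the exact message wording here is assumed, not taken from the patch):

```python
import warnings

if random_state is not None:
    warnings.warn("The parameter random_state is deprecated and will be "
                  "removed in 0.18, as DBSCAN no longer shuffles the input.",
                  DeprecationWarning)
```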
This comment needs to be addressed before merging.
Indeed this has not been addressed yet.
I think I'd like to propose reverting to the previous implementation (or some cleaned up variant thereof with
Basically, I think you're right that departing from the linear memory requirements for no great speed gains is a Bad Thing, given that passing a precomputed distance matrix is an option where memory permits.
Have we found it's a lot slower given precomputed input? If not, then the changes provide no real efficiency advantage.
Okay, given movements and discussions elsewhere, I think we shouldn't revert anything. Yes, we should probably add a note about higher space complexity than the traditional algorithm. And perhaps @larsmans as a DBSCAN user has some input on turning shuffling off by default.
I haven't seen it matter on real-world data yet, and I doubt it will. I have noticed that the batch distance computations can be problematic though, with machines locking up and all the assorted nastiness if the parameters are not set properly.
You mean if the radius is unreasonably big for the data? Maybe we should have an option in
Are we better off doing something that doesn't require batch computation, but allows the user to pass in a precomputed
I'm not sure I'm up to date on the DBSCAN reimplementation discussion. Is this PR still relevant? Or do we want to refactor anyhow?
This PR is still relevant, I think.
This PR still applies to current head. It may be best to merge the "remove shuffling, add warning" patch early if we want to eventually remove it altogether, even if a redesign will eventually happen. But I lost track of what was the latest/fastest version of DBSCAN without reengineering everything... my fastest pure-python version was e48ade5
Ok, so if we just merge the "remove shuffle, add warning" patch, could you please rebase? And it looks like Travis was not happy.
random_state : numpy.RandomState, optional
    The generator used to shuffle the samples. Defaults to numpy.random.
random_state: numpy.RandomState, optional
    Not supported (DBSCAN does not use random initialization).
Should probably just say "ignored"
#4151 was merged, #4157 is awaiting review.
This makes little difference, and original DBSCAN did not shuffle. Warn if `random_state` is used. As is, `random_state` encourages users to experiment with different randomization, as you would do with k-means. But in contrast to k-means, the output of DBSCAN is deterministic except for cluster enumeration and "rare" cases where a point is on the border of two clusters at the same time. As this affects single points only, the measurable performance difference will be close to zero. Also, incorporate fix for minpts including the query point.
I have rebased the patch. It already incorporated a fix for #4073 (DBSCAN includes the query point when counting neighbors), but not the updated unit test, which I cherry-picked from #4073. Travis CI is failing with "Unable to connect to www.rabbitmq.com:http:", which is down for me, too (but not my fault).
LGTM
@@ -89,15 +96,16 @@ def dbscan(X, eps=0.5, min_samples=5, metric='minkowski',
    """
    if not eps > 0.0:
        raise ValueError("eps must be positive.")
    if random_state is not None:
        warnings.warn("The parameter random_state is ignored " +
style: there is no need for the `+` sign here.
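That is, relying on implicit concatenation of adjacent string literals; a sketch of the fixed call (the message tail is assumed for illustration):

```python
import warnings

# adjacent string literals are concatenated at compile time; no `+` needed
warnings.warn("The parameter random_state is ignored "
              "(DBSCAN does not use random initialization).")
```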
Alright, this looks good to me as well. I will fix the style / deprecation warning issues when merging. Let's move the discussions on the algorithm considerations (space complexity, speed with sparse data, option to precompute distances, Cython version) on to dedicated PRs and/or issues.
I rebased, fixed deprecation messages, added a what's new entry and pushed to master. Thanks everyone.
Hi, I just bumped into this coming from here: #5275. A real-world example of where the higher memory complexity seems to matter is GPS traces.
Shuffling is not necessary; the effect on the result is usually nonexistent (except for permuted cluster numbering). DBSCAN is mostly deterministic except for rare border cases.
Add a note about the increased memory complexity of this implementation compared to original DBSCAN.