permutation inference performance using numba #116

ljwolf · 2020-04-29T16:40:48Z

This is a stub to work on performance in the permutational inference engine using numba.

ljwolf · 2020-05-02T14:59:41Z

At this phase, we've got an 8x speedup over the current implementation. From here, we have three tasks.

Conceptually, we want the final implementation to look like this:

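import multiprocessing

def crand_plus(*data, local_function=moran_i):
    # precompute everything that is shared across observations
    cardinalities, weight_pointers, *etc = setup_crand()
    with multiprocessing.Pool() as P:
        # Pool.map returns one (larger, random_locals) pair per observation
        results = P.map(_do_one_observation, stuff_needed_for_that_iteration)
    larger, random_locals = zip(*results)
    return larger, random_locals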

defining the inner loop & lifting it into a numba method.

We need to lift up the body of neighbors_perm_plus (e.g. l1655) into its own thing. This will entail solving a few issues:

- indexing into the flat weights array: is the start position for i given by numpy.cumsum(cardinalities)[i]?
- defining the signature of _do_one_observation. It'll probably need:
  - its weights, or a pointer into the flat weights array
  - the cardinality of i
  - the permutation matrix
  - data for z
  - the index of the observation
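
On the indexing question, here is a minimal sketch. It assumes the flat weights array stores each observation's weights contiguously, in observation order. The names `flat_weights`, `starts`, `ends` and `weights_of` are illustrative and not taken from the PR. Under that assumption, `numpy.cumsum(cardinalities)[i]` is the exclusive end of observation i's block, and the start is that value minus the cardinality of i.

```python
import numpy

# Hypothetical toy data: three observations with 2, 3 and 1 neighbors.
cardinalities = numpy.array([2, 3, 1])
flat_weights = numpy.array([0.5, 0.5, 0.2, 0.3, 0.5, 1.0])

ends = numpy.cumsum(cardinalities)  # exclusive end of each block
starts = ends - cardinalities       # first position of each block


def weights_of(i):
    """Return the slice of flat_weights that belongs to observation i."""
    return flat_weights[starts[i]:ends[i]]
```

Passing `starts[i]` and `cardinalities[i]` to the worker would be one way to hand it "a pointer into the flat weights array" without copying the weights.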

parallelizing the computation

If (and only if) there are gains to be made by parallelizing the simulations, we want to explore using joblib, which should be available on all platforms.
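
A rough sketch of what the joblib route could look like, under several assumptions. The worker body is a placeholder statistic rather than the real inner loop, `crand_parallel` is a made-up name, and the thread-based backend is chosen only to keep the example lightweight. It falls back to a serial loop if joblib is not importable.

```python
import numpy

try:
    from joblib import Parallel, delayed
except ImportError:  # fall back to a serial loop if joblib is missing
    Parallel = None


def _do_one_observation(i, z, n_permutations, seed):
    """Hypothetical worker: placeholder statistic for a single observation."""
    rng = numpy.random.default_rng(seed + i)  # reproducible stream per observation
    others = numpy.delete(z, i)
    observed = z[i] * others.mean()  # placeholder for the observed local statistic
    random_locals = numpy.empty(n_permutations)
    for p in range(n_permutations):
        # placeholder: average of two randomly drawn other observations
        random_locals[p] = z[i] * rng.permutation(others)[:2].mean()
    larger = int((random_locals >= observed).sum())
    return larger, random_locals


def crand_parallel(z, n_permutations=99, seed=0, n_jobs=2):
    n = z.shape[0]
    if Parallel is None:
        results = [_do_one_observation(i, z, n_permutations, seed) for i in range(n)]
    else:
        # threads keep this sketch cheap to start; joblib defaults to processes
        results = Parallel(n_jobs=n_jobs, prefer="threads")(
            delayed(_do_one_observation)(i, z, n_permutations, seed) for i in range(n)
        )
    # each worker returns a (larger, random_locals) pair; unzip and stack them
    larger = numpy.array([r[0] for r in results])
    random_locals = numpy.vstack([r[1] for r in results])
    return larger, random_locals
```

Seeding each observation separately keeps the results independent of how the work is split across workers, which makes it easier to compare serial and parallel timings before deciding whether the parallel path is worth keeping.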

defining local functions

Ideally, we'd like to implement the permutation logic once and then use it in all permutation-inference driven things. If we do that, we need to define something like a numba-fied local statistic for anything using the permutation inference engine. For example, a Moran version might be:

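import numpy
from numba import njit

@njit(fastmath=True)
def local_statistic(i, z, permutations, cardinality, weights_i):
    n = z.shape[0]
    scaling = (n - 1) / (z * z).sum()
    # boolean mask that drops observation i from the candidate pool
    mask = numpy.ones(n, dtype=numpy.bool_)
    mask[i] = False
    z_no_i = z[mask]
    # the first `cardinality` ids of each permutation act as random neighbors of i
    random_ids = permutations[:, :cardinality].ravel()
    flat_random_z = z_no_i[random_ids]
    # one spatial lag per permutation: (n_permutations, cardinality) @ (cardinality,)
    z_no_i_lag = flat_random_z.reshape(-1, cardinality) @ weights_i
    rlisa = z[i] * z_no_i_lag * scaling
    return rlisa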

This will enable us to pass arbitrary JIT-ted functions to the random permutation engine.
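
To make the "pass any local function" idea concrete, here is a plain-NumPy sketch of a driver that accepts a statistic with the signature above. Everything in it is an assumption for illustration: the name `crand_sketch`, the `observed` argument, and the flat weights layout (contiguous blocks in observation order) are not from the PR, and jitting is left out so the example runs without numba.

```python
import numpy


def moran_local(i, z, permutations, cardinality, weights_i):
    """Plain-NumPy Moran-style statistic with the same signature (not jitted)."""
    n = z.shape[0]
    scaling = (n - 1) / (z * z).sum()
    z_no_i = numpy.delete(z, i)
    random_z = z_no_i[permutations[:, :cardinality]]  # (n_permutations, cardinality)
    return z[i] * (random_z @ weights_i) * scaling


def crand_sketch(z, observed, cardinalities, flat_weights, local_function,
                 n_permutations=99, seed=0):
    """Hypothetical engine: one set of permutations, any local_function."""
    rng = numpy.random.default_rng(seed)
    n = z.shape[0]
    # each row is a random ordering of the n - 1 "other" positions
    permutations = numpy.array(
        [rng.permutation(n - 1) for _ in range(n_permutations)]
    )
    ends = numpy.cumsum(cardinalities)
    starts = ends - cardinalities
    larger = numpy.zeros(n, dtype=numpy.int64)
    random_locals = numpy.empty((n, n_permutations))
    for i in range(n):
        weights_i = flat_weights[starts[i]:ends[i]]
        random_locals[i] = local_function(
            i, z, permutations, cardinalities[i], weights_i
        )
        # count simulated values at least as large as the observed one
        larger[i] = (random_locals[i] >= observed[i]).sum()
    return larger, random_locals
```

Any other statistic that keeps the `(i, z, permutations, cardinality, weights_i)` signature could be swapped in for `moran_local` without touching the loop.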

sjsrey · 2020-05-02T16:24:01Z

defining the inner loop & lifting it into a numba method.

We need to lift up the body of neighbors_perm_plus (e.g. l1655) into its own thing. This will entail solving a few issues:

lifting is pending at ljwolf#4

ljwolf · 2020-07-03T22:00:36Z

The only failures now are the known failures for the chi-squared test in join counts (#123). This should be ready to merge 🥳

sjsrey · 2020-07-04T18:07:55Z

The only failures now are the known failures for the chi-squared test in join counts (#123). This should be ready to merge 🥳

I think once #124 is merged, this will also pass if rerun.

ljwolf added 6 commits April 24, 2020 16:30

WIP: check numba for moran

e6e9571

WIP: keep iterating on algorithmic correctness

3006f0e

inline for performance

2b4a464

add jitting

fadc325

explore ndim broadcasting for permutations

e5abd15

finalize the neighbors_perm strategy

d7e2186

ljwolf assigned sjsrey, darribas and ljwolf Apr 29, 2020

darribas and others added 8 commits April 29, 2020 17:11

Add neighbors_perm_plus and start of hooking up to Moran_Local. Unfinished

decc9b4

Add notebook to track benchmarks

af6c782

Merge conflicts

998ebd2

Hook up numba randomisation with Moran_Local; add illustration in notebook

fa18c5c

Merge branch 'moran-perf' of github.com:darribas/esda into moran-perf

0b59600

clean up perf optimizations and add dani terminology

42c664e

add @sjsrey optimizations

f573d27

rescale lisas

27ec37a

sjsrey mentioned this pull request May 2, 2020

Memory efficient conditional permutation for LISA #113

Closed

darribas added 2 commits May 2, 2020 14:20

Correct but slow numba implementation

f84b393

Correct and performant implementation using numba

831e6ac

Make neighbors_perm_plus run steps only to be sent to workers. Works on single process. Start of parallel implementation

02702fb

ljwolf mentioned this pull request May 6, 2020

[WIP] Parallel implementation of numba code ljwolf/esda#5

Merged

darribas added 6 commits May 6, 2020 11:58

Add numba-based parallel implementation + checks added on notebook

ad322ce

Patch parallel so it works. Add tests. No good

3856eef

Add comparison to pygeoda

c1f7290

Fix update on parallel loop

00da479

Debug parallel implementation w/ prange. Unfinished

70fd912

Add correct parallel implementation

e0bfc76

ljwolf and others added 23 commits July 1, 2020 20:56

Merge pull request #7 from darribas/moran-perf

8151e06

[WIP] Updating tests

add numba to requirements for testing

2129431

Merge branch 'master' of github.com:pysal/esda into moran-perf

fe89816

Merge branch 'moran-perf' of github.com:ljwolf/esda into moran-perf

923fa5d

TRAVIS: move numba into matrix, install from conda

0467f25

TRAVIS: add numba to can-fail cases

0fe9cde

TRAVIS: fix if/else for numba

60ff165

TRAVIS: missed the if/else terminator

a31cedf

TRAVIS: try strict channel preference for numba

a21896f

TRAVIS: try dropping conda-forge because we're using pip anyway

82ae56b

TRAVIS: forgot the --yes in conda install

5f0afc2

Add seed argument to be passed to crand and insert in tests

93a1445

Merge pull request #8 from darribas/moran-perf

46a415e

Add seed argument to be passed to crand and insert in tests

Make seed not optional in vec_permutations

6a446a3

Fix fixed-type syntax for seed

976cb41

Merge pull request #9 from darribas/moran-perf

f40ab56

Make seed not optional in vec_permutations

Adding further documentation

3118828

Clean up notebooks for performance/correctness

8c85b2c

Merge pull request #10 from darribas/moran-perf

439a684

Moran perf

handle the deprecations in unittests

c9b770d

fix false discovery rate filtering

1caa226

finalize edits for the new seed processing options

e17b8c8

fix up docscring

277eef4

Merge branch 'master' into moran-perf

f017903

sjsrey merged commit 96e2135 into pysal:master Jul 5, 2020

ljwolf mentioned this pull request Aug 3, 2020

document minimum numba version #143

Merged

MgeeeeK mentioned this pull request Aug 13, 2020

[WIP]: Numba-fied/multi-threaded weights builder MgeeeeK/libpysal#10

Closed

2 tasks

MgeeeeK mentioned this pull request Aug 28, 2020

[WIP]: Optimized raster-based weights builder pysal/libpysal#343

Merged
