Add alternative argument for determining one-tailed or two-tailed permutation tests #199

@JosiahParry

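It appears that when the p-value exceeds 0.5, Join_Counts_Local reports a value calculated by a different method than the simulation p-value formula (M + 1) / (R + 1), where M is the number of simulated values at least as large as the observed value and R is the number of permutations.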

The below example manually calculates the p-value for each observation.

Can the p-value calculation method be documented or corrected?

import libpysal
import numpy as np
import geopandas as gpd
from esda.join_counts_local import Join_Counts_Local

fp = libpysal.examples.root + "/guerry/" + "Guerry.shp" 

guerry_ds = gpd.read_file(fp)
guerry_ds['SELECTED'] = 0
guerry_ds.loc[(guerry_ds['Donatns'] > 10997), 'SELECTED'] = 1

w = libpysal.weights.Queen.from_dataframe(guerry_ds)

LJC_uni = Join_Counts_Local(connectivity=w, keep_simulations = True).fit(guerry_ds['SELECTED'])

# Calculate p-values manually

# Keep only the observations whose reported p-value is not NaN
index = np.where(~np.isnan(LJC_uni.p_sim))[0].tolist()
# I think these are the simulations
sims = LJC_uni.rjoins
sims_p = sims[index]
obs_p = LJC_uni.LJC[index]
ps = LJC_uni.p_sim[index]
nsim = LJC_uni.permutations
sims_p[0]

def p_sim_calc(obs, sims, nsim=999):
    # (M + 1) / (R + 1), where M counts the simulated values >= the observed value
    return (np.sum(obs <= sims) + 1) / (nsim + 1)

p_sim_calc(obs_p[0], sims_p[0], nsim)

p = list()
for i in range(len(ps)):
    x = p_sim_calc(obs_p[i], sims_p[i], nsim)
    p.append(x)

print(p, "\n", ps)

Note that when the manually calculated p-value (first list) exceeds 0.5, the reported simulated p-value (second list) is less than 0.5.

#> [0.409, 0.026, 0.026, 0.662, 0.652, 0.286, 0.662, 0.513,  0.03, 0.131, 0.051] 
#> [0.409, 0.026, 0.026, 0.339, 0.349, 0.286, 0.339, 0.488, 0.03, 0.131, 0.051]
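
One possible explanation, offered here as an assumption and not confirmed against the esda source: the reported value may "fold" the count, replacing M with R - M whenever R - M is smaller, before applying (M + 1) / (R + 1). The sketch below uses synthetic simulations and two hypothetical helper names (`p_sim_one_sided`, `p_sim_folded`) to show that such a rule would turn 0.662 into 0.339, as in the two lists above.

```python
import numpy as np

def p_sim_one_sided(obs, sims):
    """(M + 1) / (R + 1), with M = number of simulated values >= obs."""
    sims = np.asarray(sims)
    m = int(np.sum(sims >= obs))
    return (m + 1) / (sims.size + 1)

def p_sim_folded(obs, sims):
    """Hypothetical folded variant: use the smaller of M and R - M."""
    sims = np.asarray(sims)
    r = sims.size
    m = int(np.sum(sims >= obs))
    m = min(m, r - m)  # assumed folding step, not taken from the library
    return (m + 1) / (r + 1)

# Synthetic data: 999 distinct simulated values, 661 of them >= obs
sims = np.arange(999)
obs = 338

print(p_sim_one_sided(obs, sims))  # 0.662
print(p_sim_folded(obs, sims))     # 0.339
```

Under this assumed rule the two calculations agree whenever M is at most R / 2, which would match the entries where both lists report the same value.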
