Description
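It appears that when the p-value would exceed 0.5, Join_Counts_Local uses a different calculation rather than the simulation p-value formula (M + 1) / (R + 1), where M is the number of simulated join counts at least as large as the observed count and R is the number of permutations.
The example below manually calculates the p-value for each observation and compares it with the reported p_sim.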
Can the p-value calculation method be documented or corrected?
import libpysal
import numpy as np
import geopandas as gpd
from esda.join_counts_local import Join_Counts_Local
fp = libpysal.examples.root + "/guerry/" + "Guerry.shp"
guerry_ds = gpd.read_file(fp)
guerry_ds['SELECTED'] = 0
guerry_ds.loc[(guerry_ds['Donatns'] > 10997), 'SELECTED'] = 1
w = libpysal.weights.Queen.from_dataframe(guerry_ds)
LJC_uni = Join_Counts_Local(connectivity=w, keep_simulations = True).fit(guerry_ds['SELECTED'])
# Calculate p-values manually
# Keep only observations whose reported simulated p-value is not NaN
index = np.where(~np.isnan(LJC_uni.p_sim))[0].tolist()
# I think these are the simulations
sims = LJC_uni.rjoins
sims_p = sims[index]
obs_p = LJC_uni.LJC[index]
ps = LJC_uni.p_sim[index]
nsim = LJC_uni.permutations
sims_p[0]
def p_sim_calc(obs, sims, nsim=999):
    return (sum(obs <= sims) + 1) / (nsim + 1)
p_sim_calc(obs_p[0], sims_p[0])
p = list()
for i in range(len(ps)):
    x = p_sim_calc(obs_p[i], sims_p[i])
    p.append(x)
print(p, "\n", ps)
Note that when the calculated p-value exceeds 0.5, the reported simulated p-value is less than 0.5.
#> [0.409, 0.026, 0.026, 0.662, 0.652, 0.286, 0.662, 0.513, 0.03, 0.131, 0.051]
#> [0.409, 0.026, 0.026, 0.339, 0.349, 0.286, 0.339, 0.488, 0.03, 0.131, 0.051]
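The two output rows above look consistent with a "folding" step, in which M is replaced by R - M whenever M is larger than R - M, before (M + 1) / (R + 1) is applied. The sketch below only checks that guess against the numbers printed above. It is inferred from the output, not confirmed from the esda source, and `folded_p_sim` is a hypothetical helper name, not part of esda.

```python
import numpy as np


def folded_p_sim(manual_p, nsim=999):
    """Apply a hypothesised folding rule to (M + 1) / (R + 1) p-values.

    Assumption (not confirmed from the esda source): M is replaced by
    R - M whenever R - M is the smaller of the two counts.
    """
    manual_p = np.asarray(manual_p, dtype=float)
    # Recover the count M from p = (M + 1) / (R + 1).
    m = np.rint(manual_p * (nsim + 1) - 1).astype(int)
    # Hypothesised folding step: keep the smaller of M and R - M.
    m_folded = np.minimum(m, nsim - m)
    return (m_folded + 1) / (nsim + 1)


# Values copied from the output above.
manual = [0.409, 0.026, 0.026, 0.662, 0.652, 0.286, 0.662, 0.513, 0.03, 0.131, 0.051]
reported = [0.409, 0.026, 0.026, 0.339, 0.349, 0.286, 0.339, 0.488, 0.03, 0.131, 0.051]

print(np.allclose(folded_p_sim(manual), reported))
```

For example, 0.662 corresponds to M = 661, and 999 - 661 = 338 gives (338 + 1) / 1000 = 0.339, which matches the reported value. If this is the intended behaviour, it would be helpful to have it documented.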