This repository contains the code for our recent work, The neural basis of intelligence in fine-grained cortical topographies.
Feilong, M., Guntupalli, J. S., & Haxby, J. V. (2021). The neural basis of intelligence in fine-grained cortical topographies. eLife, 10, e64058. https://doi.org/10.7554/eLife.64058
In this work, we found that predictions of general intelligence based on fine-grained (vertex-by-vertex) connectivity patterns were markedly stronger than predictions based on coarse-grained (region-by-region) patterns, accounting for approximately twice as much variance. Fine-grained connectivity in the default and frontoparietal cortical systems best predicts intelligence.
Comparison of predictions based on fine-grained connectivity and coarse-grained connectivity. Adapted from Figure 3 of the original paper.
The code works with recent versions of Python 3 and of the packages listed below. Before running the code, you need to set up a Python environment using your favorite Python package manager. One way of doing this is to use conda-forge:
conda create -n IDM_pred 'python>=3.9'
conda activate IDM_pred
conda config --add channels conda-forge
conda config --set channel_priority strict
conda install numpy scipy scikit-learn pandas nibabel joblib ipython jupyter
Alternatively, the packages can be installed with pip, preferably inside a venv virtual environment:
pip install numpy scipy scikit-learn pandas nibabel joblib ipython jupyter
The code also uses a package named IDM_pred, which is included in the repository. Assuming you are in the root directory of a clone of this repository, you can install it in development mode by running:
cd package/
python setup.py develop
Two types of data are needed for the analysis. The first is subject measures, which can be downloaded from ConnectomeDB as CSV files. After downloading these files, you can combine them into one pickle file using pandas and add it to the IDM_pred package:
import pandas as pd
# Please replace {FILENAME_1} and {FILENAME_2} with the real file names.
df1 = pd.read_csv('{FILENAME_1}.csv', index_col='Subject')
df2 = pd.read_csv('{FILENAME_2}.csv', index_col='Subject')
df1.index = df1.index.astype(str)
df2.index = df2.index.astype(str)
df = pd.merge(df1, df2, 'outer', left_index=True, right_index=True)
df.to_pickle('package/IDM_pred/io/hcp_full_restricted.pkl')
More than two CSV files can be combined in a similar manner.
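As a minimal sketch of combining more than two tables, the same outer merge can be applied repeatedly with functools.reduce. The helper name combine_subject_tables, the toy subject IDs, and the column names below are hypothetical and only serve as illustration; real ConnectomeDB files would be loaded with pd.read_csv as shown above.

```python
from functools import reduce

import pandas as pd


def combine_subject_tables(tables):
    """Outer-join a list of DataFrames on their index, one after another.

    Hypothetical helper for illustration; it is not part of IDM_pred.
    Column names are assumed to be unique across tables.
    """
    tables = [t.copy() for t in tables]
    for t in tables:
        # Use string subject IDs so that all indices are comparable.
        t.index = t.index.astype(str)
    return reduce(
        lambda left, right: pd.merge(
            left, right, how='outer', left_index=True, right_index=True),
        tables,
    )


# Toy tables standing in for the downloaded CSV files (made-up values).
df1 = pd.DataFrame({'Age': [30, 25]},
                   index=pd.Index([1001, 1002], name='Subject'))
df2 = pd.DataFrame({'Score_A': [1.0, 2.0]},
                   index=pd.Index([1002, 1003], name='Subject'))
df3 = pd.DataFrame({'Score_B': [0.5]},
                   index=pd.Index([1001], name='Subject'))

combined = combine_subject_tables([df1, df2, df3])
```

Subjects missing from one of the tables are kept, with NaN in the columns that table would have provided.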
The other kind of data needed is connectivity profiles. We have condensed these data into individual differences matrices (subjects x subjects similarity/dissimilarity matrices; Gramian matrices in this case). These matrices can be used to compute the principal components of the connectivity profiles, but they take much less disk space (26 GB) than the original connectivity profiles (terabytes; see the figure below for how they were calculated).
Computing connectivity profiles. Adapted from Figure 1 of the original paper.
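To illustrate why a subjects x subjects Gramian matrix is enough to recover principal components, here is a minimal NumPy sketch. It assumes that the Gramian matrix holds inner products between subjects' connectivity profiles and that the profiles are centered across subjects before PCA; both are assumptions made for illustration, and pcs_from_gram is a hypothetical helper, not the implementation behind IDM_pred.io.get_connectivity_PCs.

```python
import numpy as np


def pcs_from_gram(gram, n_components=None):
    """Principal component scores from a subjects x subjects Gram matrix.

    Hypothetical helper for illustration; assumes gram = X @ X.T for some
    (subjects x features) data matrix X.
    """
    n = gram.shape[0]
    # Double-centering the Gram matrix is equivalent to removing the
    # across-subject mean from every feature of X.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram_c = centering @ gram @ centering
    eigvals, eigvecs = np.linalg.eigh(gram_c)
    # eigh returns ascending eigenvalues; sort them in descending order.
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negatives
    eigvecs = eigvecs[:, order]
    scores = eigvecs * np.sqrt(eigvals)
    if n_components is not None:
        scores = scores[:, :n_components]
    return scores


# Toy data: 6 "subjects" with 50 "features" each (random numbers).
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 50))
gram = X @ X.T
scores = pcs_from_gram(gram, n_components=3)

# Reference: PCA computed directly from the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, _ = np.linalg.svd(Xc, full_matrices=False)
reference = U[:, :3] * S[:3]
# Components are only defined up to sign, so compare absolute values.
matches = np.allclose(np.abs(scores), np.abs(reference))
```

The Gram matrix has only as many rows and columns as there are subjects, which is why it is so much smaller than the full profiles while yielding the same component scores.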
We are working on possibilities to share these data openly. In the meantime, you can contact me to get a copy, provided that you have been granted access to the original HCP dataset.
Workflow for the prediction analysis. Adapted from Figure 2 of the original paper.
The package IDM_pred includes functions that can be used to replicate this analysis. Specifically,
- IDM_pred.io.get_connectivity_PCs and IDM_pred.io.get_measure_info can be used to load the data.
- IDM_pred.cv.nested_cv_ridge implements the ridge-regularized principal components regression model with nested cross-validation.
- IDM_pred.cv.compute_ss0 computes the sum of squares for null models. It can be used to compute R2:
  - R2 = 1 - SSres / SSnull
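The general idea of nested cross-validation with ridge regression and a cross-validated R2 can be sketched with scikit-learn, which is already among the installed packages. This is not the code of IDM_pred.cv.nested_cv_ridge or IDM_pred.cv.compute_ss0, whose signatures are not shown here. The function nested_cv_r2 is hypothetical, the data are simulated, and the null model is assumed to predict the mean of the training subjects. Only the ridge penalty is tuned in this sketch.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold


def nested_cv_r2(X, y, alphas, n_outer=5, n_inner=5, seed=0):
    """Cross-validated R2 = 1 - SSres / SSnull with nested CV.

    Hypothetical helper for illustration. The inner loop chooses the ridge
    penalty using training data only; the outer loop evaluates predictions
    on held-out subjects.
    """
    outer = KFold(n_splits=n_outer, shuffle=True, random_state=seed)
    ss_res = 0.0
    ss_null = 0.0
    for train, test in outer.split(X):
        inner = KFold(n_splits=n_inner, shuffle=True, random_state=seed)
        search = GridSearchCV(
            Ridge(), {'alpha': alphas}, cv=inner,
            scoring='neg_mean_squared_error')
        search.fit(X[train], y[train])
        pred = search.predict(X[test])
        # Residual sum of squares of the model on held-out subjects.
        ss_res += np.sum((y[test] - pred) ** 2)
        # Assumed null model: predict the training-set mean for everyone.
        ss_null += np.sum((y[test] - y[train].mean()) ** 2)
    return 1.0 - ss_res / ss_null


# Simulated data standing in for principal components and a target score.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
w = rng.standard_normal(10)
y = X @ w + 0.1 * rng.standard_normal(100)

r2 = nested_cv_r2(X, y, alphas=[0.01, 0.1, 1.0, 10.0])
```

Because both sums of squares are computed on held-out subjects, this R2 can be negative when the model predicts worse than the null model.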
The script predict_g.py is an example of how to use these functions to replicate the prediction analysis of the paper. To run it:
python predict_g.py
It is highly recommended to run the analysis on a high-performance computing cluster, such as Dartmouth's Discovery.
This work has been inspired by many previous works, especially https://github.com/adolphslab/HCP_MRI-behavior and https://github.com/alexhuth/ridge. Please also consider citing these works.