Make Parcellated Dictionary Structure

Once fMRIPREP has produced its output, we will (1) parcellate the fMRI data that fMRIPREP has put into various spaces, and (2) organize the parcellated data, along with various nuisance regressors from fMRIPREP, into a format that is convenient for denoising. Later denoising scripts within this package rely on data being in this format.

Three functions are used to accomplish this: (1) generate_file_paths, (2) file_paths_exist, (3) populate_parc_dictionary.

generate_file_paths can take functional data stored in either gifti format, nifti format, or both (as may be the case when wanting to look at surface + subcortical data).

file_path_dictionary = generate_file_paths(lh_gii_path=None,
                    lh_parcellation_path=None,
                    nifti_ts_path=None,
                    nifti_parcellation_path=None,
                    aroma_included=True)

generate_file_paths does not have any explicitly required arguments, BUT it will only run if you specify lh_gii_path and lh_parcellation_path, or nifti_ts_path and nifti_parcellation_path (or alternatively all four).

  • lh_gii_path: The path to the desired left hemisphere gifti file containing func data that needs to be parcellated. To identify the path to the gifti file for the right hemisphere, ‘lh’ will be swapped with ‘rh’ in the file name. lh_gii_path can be defined in any surface space as long as lh_parcellation_path is also in that same space.
  • lh_parcellation_path: The path to the desired left hemisphere *.annot file (using FreeSurfer formatting) that will be used to parcellate lh_gii_path. It is assumed that the first entry in this parcellation is the medial wall, and this entry will not be used to generate the output. Similar to lh_gii_path, the lh_parcellation_path will be used to determine the rh_parcellation_path.
  • nifti_ts_path: The path to the desired volumetric nifti file containing func data that needs to be parcellated. This can be in any space (native, MNI, etc) but must be in the same space as nifti_parcellation_path.
  • nifti_parcellation_path: The path to the volumetric nifti file containing the parcellation to be applied to the volumetric data. All unique values other than zero will be used as parcels. Optionally, if a file exists in the same folder as nifti_parcellation_path with the same name as the parcellation file and the extension switched to .json, it will be used to propagate specific parcel names as labels, instead of having labels named by their value in the nifti file (an example of this file has not yet been added to this documentation).
  • aroma_included: If you did not generate aroma output from fMRIPREP, set this to False so that later functions won’t try to load any aroma files.
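As a rough illustration of the .json naming rule for nifti_parcellation_path, the sketch below derives the expected label file name from a parcellation path. The helper sidecar_json_path is hypothetical (it is not part of this package), and treating '.nii.gz' as a single extension is an assumption about how "the extension switched to .json" should be read.

```python
def sidecar_json_path(nifti_parcellation_path):
    """Return the path where a matching .json label file would be expected.

    Hypothetical helper for illustration only; not part of this package.
    """
    # Assumption: '.nii.gz' counts as one extension, so both
    # 'atlas.nii.gz' and 'atlas.nii' map to 'atlas.json'
    for ext in ('.nii.gz', '.nii'):
        if nifti_parcellation_path.endswith(ext):
            return nifti_parcellation_path[:-len(ext)] + '.json'
    raise ValueError('Expected a .nii or .nii.gz file: ' + nifti_parcellation_path)


print(sidecar_json_path('/path/to/parcellation.nii.gz'))
```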

The output of this function will contain relevant file paths for functional images, parcellations, nuisance regressors, etc. that will be loaded later on. Once you have file_path_dictionary, use the function file_paths_exist to see if all the file paths necessary for analyses exist:

file_paths_exist_true_false = file_paths_exist(file_path_dictionary)
  • file_path_dictionary: a dictionary containing relevant file paths for later analyses, that should have been generated by the function generate_file_paths.
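If the check comes back False, it can help to see which paths are missing. The following is a minimal sketch, not the package's own implementation: find_missing_paths is a hypothetical helper, and it assumes the dictionary stores paths as strings (possibly inside nested dictionaries). Any string value that is not a path would also be reported as missing.

```python
import os


def find_missing_paths(path_dict):
    """Return (key, path) pairs for string values that do not exist on disk.

    Hypothetical helper; assumes paths are stored as strings, possibly
    inside nested dictionaries.
    """
    missing = []
    for key, value in path_dict.items():
        if isinstance(value, dict):
            # Recurse into nested dictionaries of paths
            missing.extend(find_missing_paths(value))
        elif isinstance(value, str) and not os.path.exists(value):
            missing.append((key, value))
    return missing


# Toy dictionary standing in for the output of generate_file_paths
example_paths = {
    'existing': os.getcwd(),
    'absent': os.path.join(os.getcwd(), 'no_such_file_hopefully.nii.gz'),
}
for key, path in find_missing_paths(example_paths):
    print('Missing:', key, path)
```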

If file_paths_exist outputs True, then all the files necessary to proceed with analyses are present, so the function populate_parc_dictionary can be used to populate the dictionary with the data of interest (i.e. the parcellated timeseries, confounds, etc.):

parc_ts_dictionary = populate_parc_dictionary(file_path_dictionary, TR)
  • file_path_dictionary: a dictionary containing relevant file paths for later analyses, that should have been generated by the function generate_file_paths
  • TR: The repetition time of the functional scan in seconds (this may later be automatically drawn from metadata)

Running these commands will generate an output dictionary with a varying number of the following keys:

  • labels: this is a list containing the names of all parcel labels (either from surface data, volumetric data, or both, depending on what input data was provided)
  • time_series: this is a numpy array with shape <n_regions, t_timepoints> containing the functional data resampled to the input parcellations, ordered the same as labels. Each region will have a mean of approximately 1, as the timecourse for each vertex/voxel is divided by its temporal median, and then averaged across voxels/vertices within the region, excluding NaNs at any point in the calculation. This is done so that each vertex/voxel contributes roughly equal weight to the region.
  • median_ts_intensities: For each region this is the temporal median of each vertex/voxel timecourse, averaged across all vertices/voxels in the region.
  • surface_labels, nifti_labels, surface_time_series, nifti_timeseries, surface_median_ts_intensities, nifti_median_ts_intensities: These are all duplicates of the dictionary entries previously listed, only restricted to either the surface or nifti input (if present)
  • nifti_parcellation_info.json: if the nifti parcellation file has an accompanying .json file, this will be a copy of that json file.
  • file_path_dictionary.json: A dictionary with the file paths used to generate this parc_ts_dictionary.
  • general_info.json: A json file with some general information, including the TR, label names, mean_dvars, mean_fd, the number of volumes to skip at the beginning of the scan, the number of high motion and high dvars timepoints, the name of the session, and the name of the subject (some information here includes duplicates of what is present somewhere else)
  • TR: the repetition time of the scan in seconds
  • confounds: Includes confounds found in the run’s desc-confounds_regressors.tsv file along with some custom groupings including:
    • motion_regs_six - the six realignment parameters
    • motion_regs_twelve - the six realignment parameters + derivatives
    • motion_regs_twentyfour - the six realignment parameters + derivatives and both of their squares
    • aroma_clean_ics - all the independent components (ICs) not identified as noise by aroma
    • aroma_noise_ics - all the ICs identified as noise by aroma
    • five_acompcors - first five anatomical comp cor components
    • And wmcsf, wmcsfgsr, wmcsf_derivs, wmcsfgsr_derivs, with derivs indicating both the original timeseries for different regions plus their temporal derivatives
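The normalization described for time_series and median_ts_intensities can be sketched with NumPy as follows. This is an illustration on toy data under stated assumptions, not the package's actual code: the function parcellate, its arguments, and the use of 0 as the "no parcel" value are all hypothetical.

```python
import numpy as np


def parcellate(data, parcel_ids):
    """Sketch of median-normalized parcel averaging.

    data: array of shape (n_elements, n_timepoints), one row per vertex/voxel
    parcel_ids: integer array of shape (n_elements,), 0 meaning "no parcel"
    """
    ids = np.unique(parcel_ids)
    ids = ids[ids != 0]
    time_series = np.zeros((len(ids), data.shape[1]))
    median_intensities = np.zeros(len(ids))
    for i, parcel in enumerate(ids):
        region = data[parcel_ids == parcel]
        # Temporal median of every vertex/voxel, ignoring NaNs
        medians = np.nanmedian(region, axis=1, keepdims=True)
        # Divide each timecourse by its own median, then average across
        # the region so each vertex/voxel carries roughly equal weight
        time_series[i] = np.nanmean(region / medians, axis=0)
        median_intensities[i] = np.nanmean(medians)
    return time_series, median_intensities


# Toy data: 6 voxels, 50 timepoints, signal fluctuating around 1000
rng = np.random.default_rng(0)
data = 1000 + 10 * rng.standard_normal((6, 50))
parcel_ids = np.array([0, 1, 1, 2, 2, 2])
ts, med = parcellate(data, parcel_ids)
print(ts.shape)
print(ts.mean(axis=1))
```

Each row of the result should average to approximately 1, while med keeps the original intensity scale of each region.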

Because the function populate_parc_dictionary takes around a minute to run, after finishing this sequence of commands you will likely want to save the output dictionary as follows:

save_dictionary(parc_ts_dictionary, path_for_dictionary_dir, overwrite=False)
  • parc_ts_dictionary: the dictionary generated by populate_parc_dictionary
  • path_for_dictionary_dir: a string pointing to the path where a folder to contain the dictionary structure should be created (the dictionary object will be saved as a new directory and this should not exist yet unless you want to overwrite it)
  • overwrite: boolean declaring whether or not path_for_dictionary_dir should be overwritten

If you are looking for some template parcellations to use, the parcellations generated from Schaefer’s 2018 paper are available in a variety of formats and resolutions on Github <https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/brain_parcellation/Schaefer2018_LocalGlobal>. Alternatively, a variety of other parcellations are listed on the Lead-DBS webpage <https://www.lead-dbs.org/helpsupport/knowledge-base/atlasesresources/cortical-atlas-parcellations-mni-space/>.

Example Usage

from discovery_imaging_utils import parc_ts_dictionary
from discovery_imaging_utils import dictionary_utils

#Specify the path to lh gifti and annot files
lh_gii_path = '/path/to/sub-1234_ses-1234_task-REST_acq-AP_run-1_space-fsaverage_hemi-L_bold.func.gii'
lh_parcellation_path = '/path/to/lh.name_of_parc.annot'

#Specify the path to the nifti timeseries and
#parcellation (optional). Reminder: parcel names
#from a .json file can be incorporated if the .json
#file is named matching the parcellation file
nifti_ts_path = '/path/to/sub-1234_ses-1234_task-REST_acq-AP_run-1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz'
nifti_parcellation_path = '/path/to/parcellation.nii.gz'

#Create the file paths dictionary
file_path_dictionary = parc_ts_dictionary.generate_file_paths(lh_gii_path=lh_gii_path,
                    lh_parcellation_path=lh_parcellation_path,
                    nifti_ts_path=nifti_ts_path,
                    nifti_parcellation_path=nifti_parcellation_path,
                    aroma_included=True) #Assuming aroma was run

#Check that all file paths exist
paths_present = parc_ts_dictionary.file_paths_exist(file_path_dictionary)

#Populate the dictionary for the parcellation/nuisance metrics/etc
TR = 0.8 #TR is in seconds
parc_ts_dict = parc_ts_dictionary.populate_parc_dictionary(file_path_dictionary, TR)

#Save the dictionary for later use
path_for_dictionary_dir = '/path/to/dir/that/will/store/dictionary/structure'
dictionary_utils.save_dictionary(parc_ts_dict, path_for_dictionary_dir, overwrite=False)
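
Once populated, the result can be inspected like any other Python dictionary. The sketch below uses a small toy dictionary as a stand-in for real output, so the values are made up; only the key names labels, time_series, and TR come from the description above.

```python
import numpy as np

# Toy stand-in for a populated dictionary (values are illustrative only)
toy_parc_ts_dict = {
    'labels': ['region_a', 'region_b', 'region_c'],
    'time_series': np.ones((3, 100)),
    'TR': 0.8,
}

n_regions, n_timepoints = toy_parc_ts_dict['time_series'].shape

# Rows of time_series are expected to follow the order of labels
assert n_regions == len(toy_parc_ts_dict['labels'])

# Scan duration in seconds follows from the number of timepoints and the TR
scan_length_s = n_timepoints * toy_parc_ts_dict['TR']
print(f'{n_regions} regions, {n_timepoints} timepoints, {scan_length_s:.1f} s of data')
```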