Cleaning a Parcellated Dictionary

Using the tools in parc_ts_dictionary, a dictionary is generated containing all of the items that would conventionally be used in denoising the fMRI time signal. This dictionary can then be used to generate denoised data with the denoise function within denoise_ts_dict. The denoise function has the following signature:

denoise(parc_dict,
        hpf_before_regression,
        scrub_criteria_dictionary,
        interpolation_method,
        noise_comps_dict,
        clean_comps_dict,
        high_pass,
        low_pass)

parc_dict: this is a dictionary generated by parc_ts_dictionary that has been loaded into memory.
hpf_before_regression: This argument specifies whether the nuisance regressors and the time-signals of interest (i.e. the parcellated time-signals) should be high-pass filtered before the nuisance regressors are regressed from the parcellated time-signals. If you do not want to do this, set to False. Otherwise, set to the desired high-pass cutoff point.
scrub_criteria_dictionary: This argument allows the user to define how scrubbing should be conducted. If you do not want to do scrubbing, set this argument to False. If you want to do scrubbing, there are a few different configuration options for the dictionary:
1. {'std_dvars' : 1.2, 'framewise_displacement' : 0.5}
2. {'Uniform' : [0.8, ['std_dvars', 'framewise_displacement']]}

In the first example, any timepoints with std_dvars > 1.2, and framewise_displacement > 0.5 will be scrubbed. Any number of variables found under the confounds dictionary can be used to do scrubbing, with any cutoff point. In the second example, Uniform scrubbing is specified, with 0.8 meaning that the best 80% of timepoints should be kept. The sub-list with std_dvars and framewise_displacement says that std_dvars and framewise_displacement should be used to determine what the best volumes are. This means the two metrics will be demeaned and variance normalized and then added together, and the 80% of volumes with the lowest score on the combined metric will be kept. Any variables under the confounds dictionary (that are not groupings such as motion_regs_24) can be used to construct these dictionaries.
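The two scrubbing modes described above can be sketched with NumPy. This is a minimal illustration, not the library's implementation: the helper names threshold_scrub and uniform_scrub are hypothetical, the confounds are toy arrays, and the sketch assumes that a timepoint is dropped as soon as any listed metric exceeds its cutoff.

```python
import numpy as np

def threshold_scrub(confounds, criteria):
    """Return a boolean mask that is True for timepoints to keep.

    Assumption of this sketch: a timepoint is scrubbed when ANY
    listed metric exceeds its cutoff.
    """
    n_timepoints = len(next(iter(confounds.values())))
    keep = np.ones(n_timepoints, dtype=bool)
    for name, cutoff in criteria.items():
        values = np.asarray(confounds[name], dtype=float)
        keep &= ~(values > cutoff)
    return keep

def uniform_scrub(confounds, fraction, metric_names):
    """Keep the given fraction of timepoints with the lowest combined score."""
    metrics = np.column_stack(
        [np.asarray(confounds[name], dtype=float) for name in metric_names]
    )
    # Demean and variance-normalize each metric, then add them together
    z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)
    combined = z.sum(axis=1)
    n_keep = int(round(fraction * combined.size))
    keep = np.zeros(combined.size, dtype=bool)
    keep[np.argsort(combined)[:n_keep]] = True
    return keep

# Toy confounds (hypothetical values)
confounds = {
    'std_dvars': np.array([1.0, 0.9, 1.5, 1.1, 1.0, 0.8, 1.0, 1.1, 0.9, 1.0]),
    'framewise_displacement': np.array([0.1, 0.2, 0.9, 0.1, 0.6, 0.1, 0.2, 0.1, 0.1, 0.2]),
}

keep_threshold = threshold_scrub(
    confounds, {'std_dvars': 1.2, 'framewise_displacement': 0.5}
)
keep_uniform = uniform_scrub(
    confounds, 0.8, ['std_dvars', 'framewise_displacement']
)
```

Threshold scrubbing removes a variable number of timepoints per run, whereas uniform scrubbing always keeps the same proportion, which can be convenient when comparing runs of equal length.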

INSERT MORE DETAILS ABOUT THIS z-transform, confounds, etc.

interpolation_method: The method used to interpolate over scrubbed timepoints. Choose between 'linear', 'cubic_spline', and 'spectral'. Spectral interpolation takes the longest but is expected to perform the best (this is based on the technique presented in Power's 2014 NeuroImage paper/Anish Mitra's work).
noise_comps_dict: this dictionary configures what nuisance signals will be removed from the parcellated timeseries. Each element represents an entry to the confounds dictionary, where the key is the name of the confound (or group of confounds) to be regressed, and the entry is either False or an integer, which specifies whether the nuisance regressors should be reduced by PCA and if so how many principal components should be kept. Some examples are seen below:

#Include 24 motion parameters as regressors
noise_comps_dict = {'motion_regs_twentyfour' : False}

#Include 24 motion parameters as regressors, reduced through PCA to 10 regressors
noise_comps_dict = {'motion_regs_twentyfour' : 10}

#Include WM/CSF/GSR + motion parameters as regressors
noise_comps_dict = {'wmcsfgsr' : False, 'motion_regs_twentyfour' : False}

#Include WM/CSF/GSR + ICA-AROMA Noise Timeseries as regressors
noise_comps_dict = {'wmcsfgsr' : False, 'aroma_noise_ics' : False}

#Skip nuisance regression
noise_comps_dict = False
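To make the False-versus-integer convention concrete, the following sketch assembles a nuisance matrix from a toy confounds dictionary, reducing a group by PCA when an integer is given. The helpers reduce_with_pca and build_nuisance_matrix are hypothetical, and the PCA here is a plain SVD on demeaned regressors; the library may standardize or reduce the regressors differently.

```python
import numpy as np

def reduce_with_pca(regressors, n_components):
    """Reduce (n_regressors, n_timepoints) to (n_components, n_timepoints)."""
    centered = regressors - regressors.mean(axis=1, keepdims=True)
    # SVD of the (timepoints x regressors) matrix; the scaled columns
    # of u are the principal component time courses
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    return (u[:, :n_components] * s[:n_components]).T

def build_nuisance_matrix(confounds, comps_dict):
    """Stack the requested confound groups into one regressor matrix."""
    if comps_dict is False:
        return None  # nuisance regression is skipped
    blocks = []
    for name, n_pcs in comps_dict.items():
        block = np.atleast_2d(np.asarray(confounds[name], dtype=float))
        if n_pcs is not False:
            block = reduce_with_pca(block, n_pcs)
        blocks.append(block)
    return np.vstack(blocks)

# Toy confounds with random values (hypothetical data)
rng = np.random.default_rng(0)
confounds = {
    'motion_regs_twentyfour': rng.standard_normal((24, 100)),
    'wmcsfgsr': rng.standard_normal((3, 100)),
}

# 3 WM/CSF/GSR regressors + 10 principal components of the motion regressors
nuisance = build_nuisance_matrix(
    confounds, {'wmcsfgsr': False, 'motion_regs_twentyfour': 10}
)
```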

clean_comps_dict: The formatting of this dictionary is identical to the noise_comps_dict, but this dictionary is used for specifying components whose variance you do not want to be removed from the parcellated timeseries. During the denoising process a linear model will be fit to the parcellated time-series using both the signals specified by the noise_comps_dict and clean_comps_dict, but only the signal explained by the noise_comps_dict will be removed.
high_pass: The cutoff frequency for the high-pass filter to be used in denoising. If you want to skip the high-pass filter, set to False.
low_pass: The cutoff frequency for the low-pass filter to be used in denoising. If you want to skip the low-pass filter, set to False.
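The interplay between noise_comps_dict and clean_comps_dict can be illustrated with an ordinary least-squares sketch: both sets of signals enter the model, but only the fitted noise contribution is subtracted. The function regress_noise is hypothetical, and the intercept term is an assumption of this sketch rather than a documented detail of denoise.

```python
import numpy as np

def regress_noise(timeseries, noise, clean=None, good_inds=None):
    """Fit noise + clean regressors jointly, remove only the noise fit.

    timeseries: (n_regions, n_timepoints)
    noise:      (n_noise, n_timepoints)
    clean:      (n_clean, n_timepoints) or None
    good_inds:  indices of defined (not scrubbed) timepoints used for the fit
    """
    n_timepoints = timeseries.shape[1]
    if good_inds is None:
        good_inds = np.arange(n_timepoints)
    # Intercept column is an assumption of this sketch
    parts = [np.ones((1, n_timepoints)), noise]
    if clean is not None:
        parts.append(clean)
    design = np.vstack(parts).T  # (n_timepoints, n_regressors)
    betas, *_ = np.linalg.lstsq(
        design[good_inds], timeseries[:, good_inds].T, rcond=None
    )
    n_noise = noise.shape[0]
    # Only the part of the fit explained by the noise regressors is removed
    noise_fit = design[:, 1:1 + n_noise] @ betas[1:1 + n_noise]
    return timeseries - noise_fit.T

# Toy example: each parcel is a known mix of one noise and one clean signal
rng = np.random.default_rng(0)
noise = rng.standard_normal((1, 200))
clean = rng.standard_normal((1, 200))
timeseries = np.vstack([
    2.0 * noise[0] + 3.0 * clean[0] + 5.0,
    -1.0 * noise[0] + 0.5 * clean[0],
])
cleaned = regress_noise(timeseries, noise, clean)
expected = np.vstack([3.0 * clean[0] + 5.0, 0.5 * clean[0]])
```

Including the clean signals in the fit prevents variance shared between the two sets from being attributed entirely to the noise regressors.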

Running the function will output a dictionary containing the cleaned parcellated signal along with the settings used for denoising, other QC variables, and variables copied from the input dictionary. This includes:

cleaned_timeseries: The cleaned signal after denoising with shape <n_regions, n_timepoints>. Any scrubbed timepoints, or timepoints removed at beginning of the scan will be NaN
denoising_settings.json: The settings specified when using the denoise function
dvars_pre_cleaning: DVARS calculated pre-cleaning on all input parcels (timepoints skipped at the beginning of the run + the next timepoint after the initial skipped timepoints will have DVARS set to -0.001)
dvars_post_cleaning: DVARS calculated post-cleaning on all input parcels (scrubbed timepoints, timepoints at beginning of the run, and timepoints following scrubbed timepoints will have DVARS set to -0.001)
dvars_stats.json: Different statistics about DVARS including (removed TPs not included in any stats):
- mean_dvars_pre_cleaning: temporal average dvars before cleaning
- mean_dvars_post_cleaning: temporal average dvars after cleaning
- dvars_remaining_ratio: mean_dvars_post_cleaning/mean_dvars_pre_cleaning
- max_dvars_pre_cleaning: highest dvars value before cleaning
- max_dvars_post_cleaning: highest dvars value after cleaning
file_path_dictionary.json: copied from input, containing file paths involved in constructing the parcellated dictionary
general_info.json: copied from input, containing relevant info such as the name of the subject/session, parcel labels, number of high motion and fd timepoints (calculated from fMRIPREP), etc.
good_timepoint_inds: the indices for timepoints with defined signal (i.e. everything but the volumes dropped at the beginning of the scan and scrubbed timepoints)
labels: another copy of the parcel label names
mean_roi_signal_intensities.json: the mean signal intensities for raw fMRIPREP calculated csf, global_signal, and white_matter variables
median_ts_intensities: The spatial mean of the temporal median of all voxels/vertices within each parcel (calculated on fMRIPREP output)
num_good_timepoints: the total number of good timepoints left after scrubbing and removing initial volumes
std_after_regression: The temporal standard deviation of each parcel's timesignal after nuisance regression (this is calculated prior to the final filtering of the signal)
std_before_regression: The temporal standard deviation of each parcel’s timesignal prior to nuisance regression (if hpf_before_regression is used, this is calculated after that filtering step)
std_regression_statistics: Summary statistics comparing the two standard deviations above:
- mean_remaining_std_ratio: the average of std_before_regression/std_after_regression across all parcels
- least_remaining_std_ratio: the minimum of std_before_regression/std_after_regression across all parcels
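As a rough illustration of how the DVARS outputs and their summary statistics relate, the sketch below computes a parcel-wise DVARS, marks undefined entries with -0.001, and derives the entries of dvars_stats.json. It assumes DVARS is the root-mean-square across parcels of the backward temporal difference, which is the conventional definition; the exact scaling used by denoise is not stated here. For simplicity, the same set of good timepoints is used before and after cleaning. The helper names are hypothetical.

```python
import numpy as np

SENTINEL = -0.001  # value reported for undefined DVARS entries

def parcel_dvars(timeseries, good_inds):
    """RMS across parcels of the temporal difference (assumed definition)."""
    n_timepoints = timeseries.shape[1]
    good = np.zeros(n_timepoints, dtype=bool)
    good[good_inds] = True
    # DVARS at t needs both t and t-1 to be defined
    valid = np.flatnonzero(good[1:] & good[:-1]) + 1
    dvars = np.full(n_timepoints, SENTINEL)
    diffs = timeseries[:, valid] - timeseries[:, valid - 1]
    dvars[valid] = np.sqrt(np.mean(diffs ** 2, axis=0))
    return dvars, valid

def dvars_stats(pre, post, good_inds):
    """Summary statistics computed over defined timepoints only."""
    dvars_pre, valid = parcel_dvars(pre, good_inds)
    dvars_post, _ = parcel_dvars(post, good_inds)
    mean_pre = dvars_pre[valid].mean()
    mean_post = dvars_post[valid].mean()
    return {
        'mean_dvars_pre_cleaning': mean_pre,
        'mean_dvars_post_cleaning': mean_post,
        'dvars_remaining_ratio': mean_post / mean_pre,
        'max_dvars_pre_cleaning': dvars_pre[valid].max(),
        'max_dvars_post_cleaning': dvars_post[valid].max(),
    }

# Toy data: two skipped initial volumes and one scrubbed volume (index 10)
rng = np.random.default_rng(0)
pre = rng.standard_normal((4, 50))
post = 0.5 * pre  # pretend cleaning halved every fluctuation
good_inds = np.setdiff1d(np.arange(2, 50), [10])

dvars_pre, valid = parcel_dvars(pre, good_inds)
stats = dvars_stats(pre, post, good_inds)
```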

In total, processing follows the sequence below:
1. Calculate DVARS on the input time-series.
2. If hpf_before_regression is used, filter the parcellated time-series and the signals specified by clean_comps_dict and noise_comps_dict.
3. Calculate the temporal standard deviation for each parcel (for std_before_regression).
4. Fit the signals generated from clean_comps_dict and noise_comps_dict to the parcellated timeseries (using only defined, not scrubbed, timepoints) and remove the signal explained by the noise_comps_dict.
5. Calculate the temporal standard deviation for each parcel (for std_after_regression).
6. Interpolate over any scrubbed timepoints.
7. Apply a high-pass, low-pass, or band-pass filter if specified.
8. Set all undefined timepoints to NaN.
9. Calculate DVARS on the output time-series.
10. Calculate the remaining metadata.
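The interpolation and NaN-masking steps of this sequence can be sketched as follows, using linear interpolation only. The helpers interpolate_scrubbed and mask_undefined are hypothetical names; temporal filtering would run between the two calls and is omitted here. Interpolating first gives the filter a continuous signal to work on, and the interpolated values are discarded afterwards.

```python
import numpy as np

def interpolate_scrubbed(timeseries, good_inds):
    """Linearly interpolate each parcel over timepoints not in good_inds."""
    all_inds = np.arange(timeseries.shape[1])
    filled = np.empty_like(timeseries, dtype=float)
    for i, row in enumerate(timeseries):
        # np.interp holds the edge values constant outside the good range
        filled[i] = np.interp(all_inds, good_inds, row[good_inds])
    return filled

def mask_undefined(timeseries, good_inds):
    """Set every timepoint that is not in good_inds to NaN."""
    masked = np.full(timeseries.shape, np.nan)
    masked[:, good_inds] = timeseries[:, good_inds]
    return masked

# Toy data: two linear ramps with three corrupted (scrubbed) timepoints
truth = np.vstack([np.arange(20.0), 2.0 * np.arange(20.0) + 1.0])
bad_inds = np.array([5, 6, 12])
good_inds = np.setdiff1d(np.arange(20), bad_inds)
corrupted = truth.copy()
corrupted[:, bad_inds] = 100.0

filled = interpolate_scrubbed(corrupted, good_inds)
# (temporal filtering of `filled` would happen here)
final = mask_undefined(filled, good_inds)
```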

Example

from discovery_imaging_utils.denoise_ts_dict import denoise #denoise lives in denoise_ts_dict, as described above
from discovery_imaging_utils import dictionary_utils

#Path to saved dictionary directory structure
#created from parc_ts_dict
path_to_parc_ts_dict = '/insert/name/of/path'

#Load the parcellated dictionary
parc_dict = dictionary_utils.load_dictionary(path_to_parc_ts_dict)

#Set the parameters for denoising
hpf_before_regression = False #don't filter variables before regression
scrub_criteria_dictionary = {'std_dvars' : 1.3, 'framewise_displacement' : 0.5} #scrub high dvars and fd timepoints
interpolation_method = 'spectral'
noise_comps_dict = {'wmcsfgsr' : False, 'motion_regs_twentyfour' : False} #regress white matter, csf, and gsr signal + 24 motion regressors
clean_comps_dict = False #Skip including variables whose signal should be preserved in denoising
high_pass = 0.01 #High pass filter cutoff at 0.01Hz
low_pass = 0.08 #Low pass filter cutoff at 0.08Hz

denoised_func_dict = denoise(parc_dict,
                             hpf_before_regression,
                             scrub_criteria_dictionary,
                             interpolation_method,
                             noise_comps_dict,
                             clean_comps_dict,
                             high_pass,
                             low_pass)

#Save the output for later use
output_path = '/path/to/directory/to/be/created/for/output'
dictionary_utils.save_dictionary(denoised_func_dict, output_path)
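Because scrubbed and initial timepoints are NaN in cleaned_timeseries, downstream analyses typically need to select the defined timepoints first. The sketch below does this with good_timepoint_inds and computes a parcel-by-parcel correlation matrix. The dictionary here is a toy stand-in built by hand with the two output keys listed above, and connectivity_from_output is a hypothetical helper.

```python
import numpy as np

def connectivity_from_output(output_dict):
    """Correlate parcels using only the defined timepoints."""
    cleaned = np.asarray(output_dict['cleaned_timeseries'], dtype=float)
    good_inds = np.asarray(output_dict['good_timepoint_inds'], dtype=int)
    return np.corrcoef(cleaned[:, good_inds])

# Toy stand-in for the dictionary returned by denoise (hypothetical values)
rng = np.random.default_rng(0)
cleaned = rng.standard_normal((3, 40))
undefined = np.array([0, 1, 15, 16])
cleaned[:, undefined] = np.nan
toy_output = {
    'cleaned_timeseries': cleaned,
    'good_timepoint_inds': np.setdiff1d(np.arange(40), undefined),
}

conn = connectivity_from_output(toy_output)
```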