Gene expression data
This page contains descriptions and examples for fetching microarray expression data.
Fetch gene expression data
The ENIGMA TOOLBOX provides microarray expression data collected from six human donor brains and released by the Allen Human Brain Atlas. Microarray expression data were first generated using abagen, a toolbox that provides reproducible workflows for processing and preparing gene co-expression data according to previously established recommendations (Arnatkevic̆iūtė et al., 2019, NeuroImage).
Preprocessing steps included intensity-based filtering of microarray probes, selection of a representative probe for each gene across both hemispheres, matching of microarray samples to brain parcels from the Desikan-Killiany, Glasser, and Schaefer parcellations, normalization, and aggregation within parcels and across donors.
Moreover, genes whose similarity across donors fell below a threshold (r < 0.2) were removed, leaving a total of 12,668 genes for analysis (using the Desikan-Killiany atlas). To accommodate users, we also provide an unthresholded gene dataset, as well as datasets with varying stability thresholds (r ≥ 0.2, r ≥ 0.4, r ≥ 0.6, r ≥ 0.8), for every parcellation (https://github.com/saratheriver/enigma-extra).
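The stability filter described above can be illustrated with a minimal sketch. It assumes that a gene's "similarity across donors" is the mean pairwise correlation of its regional expression profile between donors; Pearson correlation is used here for brevity, and the exact measure applied during preprocessing may differ (for example, it may be rank-based). The function names and the toy values are hypothetical and are not part of the ENIGMA TOOLBOX or abagen.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def gene_stability(profiles):
    """Mean pairwise correlation of one gene's regional profile across donors."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def filter_stable_genes(expression, threshold=0.2):
    """Keep genes whose cross-donor stability is at least `threshold`.

    `expression` maps a gene name to a list of per-donor regional profiles.
    """
    return {g: p for g, p in expression.items() if gene_stability(p) >= threshold}


# Hypothetical toy data: 3 donors x 4 regions per gene (not real AHBA values).
toy = {
    "GENE_A": [[1.0, 2.0, 3.0, 4.0], [1.1, 2.2, 2.9, 4.1], [0.9, 1.8, 3.2, 3.9]],
    "GENE_B": [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [2.0, 1.0, 4.0, 3.0]],
}

kept = filter_stable_genes(toy, threshold=0.2)
print(sorted(kept))  # GENE_B is dropped: its donors disagree on the regional pattern
```

Raising `threshold` (for example to 0.4, 0.6, or 0.8) keeps fewer genes, each with a more consistent regional pattern across donors, which is the trade-off behind the thresholded datasets listed above.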
Wanna know where we got those genes? 👖
The Allen Human Brain Atlas microarray expression data loaded as part of the ENIGMA TOOLBOX were originally
fetched using the abagen toolbox. For more flexibility, check out their toolbox!
Got NaNs? 🥛
Please note that two regions (right frontal pole and right temporal pole) in the Desikan-Killiany atlas were not matched to any tissue sample and thus are filled with NaN values in the data matrix.
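Before correlating expression with other regional measures, it usually helps to identify and set aside the unmatched regions. The sketch below shows one way to do that using only the Python standard library. The table is a made-up stand-in with the same layout as the expression data (a label column followed by one column per gene); the region labels and values are hypothetical.

```python
import csv
import io
import math

# Hypothetical stand-in for the expression table: one row per region,
# a 'label' column first, then one column per gene (values are made up).
toy_csv = """label,GENE_A,GENE_B
L_frontalpole,0.41,0.57
R_frontalpole,nan,nan
R_temporalpole,nan,nan
L_insula,0.63,0.22
"""

rows = list(csv.DictReader(io.StringIO(toy_csv)))


def is_unmatched(row):
    """Return True when every gene value in the row is NaN."""
    return all(math.isnan(float(v)) for k, v in row.items() if k != "label")


# Regions without any matched tissue sample, and the rows that remain usable.
missing = [r["label"] for r in rows if is_unmatched(r)]
usable = [r for r in rows if not is_unmatched(r)]

print(missing)
print(len(usable))
```

When the expression data are held in a pandas DataFrame, `DataFrame.dropna()` achieves the same row filtering in a single call.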
Slow internet connection? 🐌
fetch_ahba() fetches a large (~24 MB) microarray dataset from the internet and may thus be
very slow to load over a poor connection. But don’t you worry: you can download the
relevant file from your terminal ahead of time
and then specify its local path in the
fetch_ahba() function.
>>> from enigmatoolbox.datasets import fetch_ahba
>>> # Fetch gene expression data
>>> genes = fetch_ahba()
>>> # Obtain region labels
>>> reglabels = genes['label']
>>> # Obtain gene labels (skip the leading 'label' column)
>>> genelabels = list(genes.columns)[1:]
% Fetch gene expression data
genes = fetch_ahba();
% Obtain region labels
reglabels = genes.label;
% Obtain gene labels
genelabels = genes.Properties.VariableNames(2:end);