p53 cistrome genes, or direct p53 target genes, are defined as genes at which p53 was bound at a p53 motif near the transcription start site and for which associated changes in expression were reported in the same study.
Genes classified as "transcriptome in 2+ data sets" are differentially expressed genes (DEGs) that appear in at least two independent data sets.
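The "2+ data sets" rule amounts to counting, for each gene, the number of independent data sets in which it was called a DEG. The sketch below is a minimal illustration of that counting, not the database's own code; the function name `genes_in_multiple_sets`, the study labels, and the gene identifiers are hypothetical placeholders.

```python
from collections import Counter


def genes_in_multiple_sets(deg_sets, min_sets=2):
    """Return the genes reported as DEGs in at least `min_sets` data sets.

    `deg_sets` maps a data set label to an iterable of gene identifiers.
    """
    # Use set() so a gene listed twice within one data set is counted once.
    counts = Counter(gene for degs in deg_sets.values() for gene in set(degs))
    return {gene for gene, n in counts.items() if n >= min_sets}


# Hypothetical DEG lists; labels and gene names are placeholders.
example = {
    "study_A": ["GENE_1", "GENE_2", "GENE_3"],
    "study_B": ["GENE_1", "GENE_2"],
    "study_C": ["GENE_1", "GENE_4"],
}
recurrent = genes_in_multiple_sets(example)
```

In this example `GENE_1` and `GENE_2` qualify, while `GENE_3` and `GENE_4` each appear in only one data set and are excluded.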
The colors for each classification are as follows:
| p53 cistrome target genes | |
| transcriptome in 2+ data sets | |
| upregulated DEG | |
| downregulated DEG | |
| no change in expression | |
Microarray analysis
Microarray perfect-match pixel intensity data from the Affymetrix raw CEL files of a given study (see Supplementary Table ST1) were preprocessed in Partek Genomics Suite v6.13 (Partek, St. Louis, Missouri) using the robust multichip average (RMA) approach, which includes log2 transformation, background correction, quantile normalization, and summarization by median polish to combine the probes in a probe set into a single value. The data were then modeled with an N-way analysis of variance (ANOVA), where N denotes the number of factors in the given study design. Fisher's least significant difference contrasts between the mean of the treated replicates and the mean of the respective control replicates (not treated [NT], mock/vehicle, DMSO, etc.) were performed to identify statistically significant DEGs, using a Benjamini-Hochberg false discovery rate (FDR) threshold of <0.01 based on two-sided nominal p-values and an absolute fold-change (FC) of >1.5.
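The final thresholding step can be sketched as follows. This is a minimal illustration, not the Partek workflow: it assumes the ANOVA contrasts have already produced a log2 fold change (treated minus control, on the RMA log2 scale) and a two-sided nominal p-value per gene, and it applies the standard Benjamini-Hochberg step-up adjustment before the FDR <0.01 and absolute FC >1.5 cutoffs. The function names and the input values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values are monotone non-decreasing in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(log2_fc, pvalues, fdr=0.01, min_fc=1.5):
    """Label each gene 'up', 'down', or 'none' from FDR and fold change."""
    adjusted = benjamini_hochberg(pvalues)
    cutoff = math.log2(min_fc)  # absolute FC > 1.5 is |log2 FC| > log2(1.5)
    calls = []
    for lfc, q in zip(log2_fc, adjusted):
        if q < fdr and abs(lfc) > cutoff:
            calls.append("up" if lfc > 0 else "down")
        else:
            calls.append("none")
    return calls


# Hypothetical contrast results for four genes.
log2_fc = [1.2, -0.9, 0.3, 2.0]
pvalues = [0.0005, 0.001, 0.0002, 0.5]
adjusted = benjamini_hochberg(pvalues)
calls = call_degs(log2_fc, pvalues)
```

Here the third gene passes the FDR cutoff but not the fold-change cutoff, and the fourth has a large fold change but fails the FDR cutoff, so neither is called a DEG.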
RNA-seq analysis of downloaded data sets
RNA-seq data from previous studies (GSE55727, GSE47042, GSE15780) (Supplementary Table ST1) were downloaded from the Gene Expression Omnibus (GEO). Data quality was assessed with FastQC, and adapter sequences, if detected, were removed prior to alignment. Preprocessed RNA-seq short reads were aligned with TopHat to the human genome (hg19), guided by a RefSeq-based gene model (time stamp December 15, 2014). The alignment .bam files were processed directly with HTSeq modules to produce count-level measurements based on the same RefSeq-based gene model. The count-level data were used for subsequent statistical analysis. To test biological hypotheses, pairwise tests with replicates were conducted with DESeq. DEGs for each comparison were identified with the negative binomial test at an absolute FC >2 and a p-value <0.01.
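The RNA-seq cutoff can be illustrated with a short filter. This sketch assumes that the absolute FC >2 criterion is symmetric on the linear scale, so a gene passes when its fold change is above 2 or below 1/2; that reading, the function name `is_deg`, and the example values are assumptions rather than details from the original analysis.

```python
def is_deg(fold_change, pvalue, min_fc=2.0, max_p=0.01):
    """Return True when a gene passes both the fold-change and p-value cutoffs.

    `fold_change` is a linear-scale ratio (treated / control).
    """
    if fold_change is None or pvalue is None or fold_change <= 0:
        return False  # undefined ratio, e.g. zero counts in one condition
    # Treat up- and downregulation symmetrically: 0.4 counts as 2.5-fold.
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude > min_fc and pvalue < max_p


# Hypothetical (fold change, p-value) pairs.
results = [(2.5, 0.001), (0.4, 0.001), (0.6, 0.001), (3.0, 0.05)]
flags = [is_deg(fc, p) for fc, p in results]
```

The third pair fails on fold change (about 1.67-fold down) and the fourth fails on p-value, so only the first two are flagged.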
https://www.niehs.nih.gov/research/resources/databases/p53/index.cfm