In vitro DNA-binding profile of transcription factors: methods and new insights

  1. Yingxun Liu
  1. The State Key Laboratory of Bioelectronics, Southeast University, Nanjing 210096, People's Republic of China
  1. (Correspondence should be addressed to J Wang; Email: wangjinke{at}seu.edu.cn)
  1. Figure 1

    Schematic description of the dsDNA microarray approach. The first step in studying in vitro DBPs of TFs is to prepare a high-density dsDNA microarray as described in the text. The second step is to incubate the TF protein of interest with the dsDNA microarray. For example, a purified TF protein tagged with an epitope (such as glutathione S-transferase, GST) is allowed to bind directly to the dsDNA microarray (Berger et al. 2006). The third step is to report the binding of the TF protein to all dsDNA probes on the microarray. For example, a fluorophore-conjugated antibody specific to the epitope (such as an Alexa488-conjugated antibody to GST) is allowed to bind to the dsDNA microarray after the TF protein binding reaction (Berger et al. 2006). Alternatively, a fluorescently labeled TF protein can be bound to the dsDNA microarray in a single step (Kim et al. 2009). The purified, epitope-tagged TF protein can also be replaced by a nuclear extract containing the TF protein of interest (Egener et al. 2005). In that case, the epitope-specific antibody has to be replaced with an antibody specific to the TF protein, and a fluorophore-conjugated secondary antibody is used to report the bound TF protein, which adds one step to the detection process. Finally, the dsDNA microarray is scanned with a microarray scanner, such as the GSI Lumonics ScanArray 5000. To confirm the reproducibility of dsDNA microarray detection, the experiments are performed in triplicate. To eliminate the influence of the dsDNA probe density in each feature on the TF binding signal, the probe density of each microarray used to detect TF binding must be measured with a second fluorescence signal. Options include SYBR Green I (a dsDNA-specific fluorescent dye) used to stain the dsDNA microarray (Mukherjee et al. 2004); Cyanine 3 (Cy3) linked to dUTP, which is incorporated into dsDNA during primer extension of an ssDNA microarray (Egener et al. 2005); and Cy5 coupled to ddATP, which is used to fluorescently label DNAs on the microarray using terminal transferase (Berger et al. 2006). The probe-density signal is subtracted from the TF binding signal as background. The averaged, background-subtracted, normalized signal intensities of all spots (features) are used as binding affinity data for subsequent bioinformatics studies, such as determining the specificity and relative binding affinity of the TF protein for various DNA sequences, deriving the DNA-binding motif, assessing the additivity and interdependence of nucleotides in a binding site, and predicting DNA-binding sites and target genes of the TF protein in the genome. Full colour version of this figure available via http://dx.doi.org/10.1530/JME-11-0010.
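
The signal processing described at the end of this caption (subtract the probe-density signal, normalize, average the triplicates) can be sketched roughly as follows. This is a minimal illustration and not the published pipeline: the function names are hypothetical, and because the caption does not say how signals are normalized, scaling each replicate to its own median is an assumption made here only to give the sketch a concrete form.

```python
from statistics import mean, median


def subtract_probe_density(tf_signal, dna_signal):
    """Subtract the per-spot probe-density signal from the TF binding signal."""
    if len(tf_signal) != len(dna_signal):
        raise ValueError("TF and DNA signals must cover the same spots")
    return [t - d for t, d in zip(tf_signal, dna_signal)]


def scale_to_median(values):
    """Normalize one replicate by its median (an assumed scheme, see lead-in)."""
    m = median(values)
    if m <= 0:
        raise ValueError("median must be positive to scale by it")
    return [v / m for v in values]


def average_replicates(replicates):
    """Average per-spot values across replicate arrays."""
    return [mean(spot) for spot in zip(*replicates)]


def process_triplicate(arrays):
    """arrays: list of (tf_signal, dna_signal) pairs, one per replicate."""
    normalized = [
        scale_to_median(subtract_probe_density(tf, dna)) for tf, dna in arrays
    ]
    return average_replicates(normalized)


# Toy data: three replicate arrays with three spots each.
toy_arrays = [
    ([10, 20, 30], [2, 4, 6]),
    ([20, 40, 60], [4, 8, 12]),
    ([12, 18, 36], [2, 2, 6]),
]
binding = process_triplicate(toy_arrays)
```

The resulting per-spot values are what would be passed on to motif or affinity analysis. A different normalization (for example, one based on control spots) would only require swapping `scale_to_median`.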

  2. Figure 2

    Schematic description of the SELEX-seq approach. The basic procedure of SELEX-seq has three steps. The first step is to design and synthesize an ssDNA library and then convert the ssDNA into dsDNA by primer extension. The ssDNA library contains all possible oligonucleotides with a random sequence of length k. The value of k can be chosen according to the number of nucleotides in the DNA-binding consensus bound by the TF protein of interest. The second step is to bind the TF protein to the randomized dsDNA library and isolate the protein-bound dsDNAs from the binding reaction using a gel mobility shift assay (also called electrophoretic mobility shift assay, EMSA) (Zykovich et al. 2009), affinity chromatography (Zykovich et al. 2009), or a TF protein-coupled microwell plate (Jolma et al. 2010). The isolated dsDNAs are amplified by PCR to prepare a new library for the next round of selection. Over several rounds of repeated selection, the TF protein-bound dsDNAs are enriched. To find all sequences that the TF protein can bind with various affinities, especially low-affinity sequences, the dsDNAs isolated in each round of selection can be collected separately for sequencing (Zykovich et al. 2009). The third step is to sequence the bound dsDNAs with a massively parallel DNA sequencing technique, such as Illumina Solexa. The sequencing reads are filtered to keep only those that contain only the letters A, C, G, and T, carry a valid bar code and the expected constant regions, and have a unique random region. If multiple DNA samples are sequenced simultaneously, the filtered reads are sorted according to bar code sequence. The qualified reads are then used for subsequent bioinformatics analysis, including finding motifs or position weight matrix (PWM) models with typical algorithms for this purpose, such as multiple EM for motif elicitation (MEME) (Bailey & Elkan 1994). A more detailed experimental and computational procedure for inferring parameters of TF-DNA interaction from SELEX experiments has been described (Djordjevic & Sengupta 2006). Full colour version of this figure available via http://dx.doi.org/10.1530/JME-11-0010.
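
The read filters listed in this caption (A/C/G/T only, valid bar code, constant regions, unique random region) can be sketched as below. The read layout assumed here (bar code, then a left constant region, then the k-nt random region, then a right constant region) and the function name `filter_reads` are illustrative assumptions, not details given in the caption. Treating uniqueness per bar code is likewise an assumption.

```python
def filter_reads(reads, barcodes, left_const, right_const, k):
    """Return (barcode, random_region) pairs for reads that pass all filters.

    Assumed read layout: barcode + left_const + random region (k nt) + right_const.
    All bar codes are assumed to have the same length.
    """
    valid_letters = set("ACGT")
    bc_len = len(next(iter(barcodes)))
    expected_len = bc_len + len(left_const) + k + len(right_const)
    seen = set()
    kept = []
    for read in reads:
        # Filter 1: correct length and only A, C, G, T (drops reads with N).
        if len(read) != expected_len or not set(read) <= valid_letters:
            continue
        # Filter 2: the bar code must be one of the known bar codes.
        barcode = read[:bc_len]
        if barcode not in barcodes:
            continue
        # Filter 3: both constant regions must be present and intact.
        body = read[bc_len:]
        if not (body.startswith(left_const) and body.endswith(right_const)):
            continue
        # Filter 4: keep each random region only once per bar code.
        random_region = body[len(left_const):len(left_const) + k]
        key = (barcode, random_region)
        if key in seen:
            continue
        seen.add(key)
        kept.append(key)
    return kept


toy_reads = [
    "ACGGGATGCCC",  # passes
    "ACGGGATGCCC",  # duplicate random region, dropped
    "ACGGGANGCCC",  # contains N, dropped
    "CCCGGATGCCC",  # unknown bar code, dropped
    "TTAGGATGCCC",  # passes (different bar code)
    "ACGTTATGCCC",  # left constant region damaged, dropped
]
qualified = filter_reads(toy_reads, {"ACG", "TTA"}, "GG", "CC", 4)
```

The surviving random regions are the input a motif finder would work on. Exact-match checks are used for simplicity; tolerating sequencing errors in the constant regions would need a mismatch threshold.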

  3. Figure 3

    Schematic description of parallel SELEX-seq analysis of the DNA binding of multiple TFs. The bar code technique was employed in SELEX-seq from the time it was developed (Zykovich et al. 2009). In this case, a bar code sequence consisting of a few nucleotides is added to each SELEX-seq oligonucleotide substrate. Differently bar-coded oligonucleotides are applied to different TF protein samples (Jolma et al. 2010) or to different rounds of selection of the same TF protein (Zykovich et al. 2009). After SELEX selection for each protein or round, the enriched DNA samples are mixed in equimolar amounts and sequenced as a single DNA sample. After sequencing and read filtering, the qualified reads are sorted according to bar code sequence and then analyzed independently. The bar code technique can greatly improve the detection throughput and lower the experimental cost of SELEX-seq. For example, up to 28 samples were analyzed simultaneously in one sequencing run using 3 nt bar-coded oligonucleotides (Zykovich et al. 2009). Using 256 oligonucleotide libraries with different bar code sequences, the binding specificities of 256 different TFs can be analyzed in a single sequencing run (Jolma et al. 2010). Full colour version of this figure available via http://dx.doi.org/10.1530/JME-11-0010.
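
Sorting pooled reads by bar code, as this caption describes, amounts to a simple demultiplexing step. The sketch below assumes the bar code sits at the 5' end of each read and that all bar codes have the same length. The function name and the sample labels are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by sample using a leading bar code of fixed length.

    Returns (reads_by_sample, unassigned_count). The bar code is stripped
    from each read before it is stored.
    """
    bc_len = len(next(iter(barcode_to_sample)))
    by_sample = defaultdict(list)
    unassigned = 0
    for read in reads:
        sample = barcode_to_sample.get(read[:bc_len])
        if sample is None:
            unassigned += 1  # bar code not in the lookup table
        else:
            by_sample[sample].append(read[bc_len:])
    return dict(by_sample), unassigned


# Toy pooled run: two TF samples tagged with 3 nt bar codes.
lookup = {"AAC": "TF_A", "GTT": "TF_B"}
pooled = ["AACTTGACA", "GTTCCGGAA", "AACGGGTTT", "CCCAAATTT"]
by_sample, n_unassigned = demultiplex(pooled, lookup)
```

Each per-sample list can then be analyzed independently. Note that an n-nt bar code can distinguish at most 4**n samples, so 3 nt allows up to 64 and at least 4 nt is needed for 256 libraries.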
