The functionality described in this topic is only available when you mark Show Advanced Options.


ChIP advanced wizard:

Add/Review Content

Select Probes from Agilent High Definition Database

 

This screen of the ChIP advanced wizard is for defining the genes or genomic regions of interest that you want covered in the design. When you submit the probe selection job, SureDesign searches the probes in Agilent's high definition (HD) probe database to locate the ones that cover your specified targets.

Screen 1 - Define Targets

Complete the fields and selections on this screen to define the desired genes or genomic regions.

Targets

In the Targets text area, enter the identifiers for the targets using either of the following approaches:

·        Type or paste the target identifiers directly into the text area. List one identifier per line.

·        Click Upload to browse to a text file (*.txt) that lists the target identifiers (one identifier per line).

The permitted identifiers are:

·        For target genes:

Gene name - enter the gene name (not case-sensitive) as it appears in one or more of the selected databases; example: brca1

Transcript ID - enter the transcript ID (not case-sensitive) as it appears in one or more of the selected databases; examples: NM_007294, OTTHUMT00000348798, or ENST00000357654

Gene ID - enter the numerical NCBI gene ID; example: 672

·        For target genomic intervals:

Genomic coordinates - enter the chromosome number and range of nucleotides using the UCSC browser format or BED format.

You can add a string of text, no spaces, after the target genomic interval to be used as the target ID (e.g. chr1:1-100 geneX). If you enter multiple target genomic intervals with the same target ID (e.g. chr1:1-100 geneX and chr1:201-300 geneX), SureDesign will treat the intervals as different regions within the same gene.
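Before pasting a long list into the Targets text area, it can help to check that each line looks like one of the permitted identifier types. The following Python sketch is one way to do that. The function name classify_target and the regular expressions are assumptions based only on the examples in this topic; they are not SureDesign's own validation rules, and the coordinate pattern covers only the UCSC-style chr:start-end form.

```python
import re

# Illustrative patterns only (assumptions drawn from the examples in this topic).
# UCSC-style coordinates with an optional target ID, e.g. "chr1:1-100 geneX".
COORD_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)-(\d+)(?:\s+(\S+))?$")
# Transcript ID prefixes shown in this topic: NM_, ENST, OTTHUMT.
TRANSCRIPT_RE = re.compile(r"^(NM_\d+|ENST\d+|OTTHUMT\d+)$", re.IGNORECASE)


def classify_target(line):
    """Return a (kind, value) tuple for one line of a target list."""
    text = line.strip()
    if not text:
        return ("empty", None)
    match = COORD_RE.match(text)
    if match:
        chrom, start, end, target_id = match.groups()
        return ("coordinates", (chrom, int(start), int(end), target_id))
    if text.isdecimal():
        # A purely numerical entry is treated as an NCBI gene ID.
        return ("gene_id", int(text))
    if TRANSCRIPT_RE.match(text):
        return ("transcript_id", text.upper())
    # Anything else is assumed to be a gene name (not case-sensitive).
    return ("gene_name", text.upper())
```

For example, classify_target("chr1:1-100 geneX") returns the coordinates together with the target ID geneX, while classify_target("672") is reported as a gene ID.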

Databases

Below the Databases heading, mark the genome databases that you want SureDesign to use to obtain genomic coordinate information for your specified targets. You can hover the cursor over a database name to see the date that Agilent most recently downloaded data from the database. For H. sapiens, the available database sources are:

RefSeq - US National Center for Biotechnology Information (NCBI)

Ensembl - European Bioinformatics Institute and the Wellcome Trust Sanger Institute

CCDS - Consensus Coding Sequence project (CCDS) of the US National Center for Biotechnology Information (NCBI)

Gencode - US National Human Genome Research Institute (NHGRI) and the Wellcome Trust Sanger Institute

VEGA - Vertebrate Genome Annotation project of the Human and Vertebrate Analysis and Annotation (HAVANA) group at the Wellcome Trust Sanger Institute

CytoBand - CytoBand file from the UCSC Genome Browser

Include Flanking Regions (5' and 3')

In the field, type the number of base pairs of flanking sequence (on the 3' and 5' ends) that you want SureDesign to include on each target. SureDesign does not include flanking bases for targets entered as genomic coordinates.
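The effect of the flanking setting can be pictured with a short Python sketch. The function add_flanks is hypothetical, and it assumes 1-based inclusive coordinates and a lower bound of position 1; how SureDesign handles chromosome ends internally is not described in this topic.

```python
def add_flanks(start, end, flank_bp, from_coordinates=False):
    """Extend a target interval by flank_bp bases on both the 5' and 3' ends.

    Assumes 1-based inclusive coordinates. Targets entered as genomic
    coordinates are returned unchanged, as described above.
    """
    if flank_bp < 0:
        raise ValueError("flank_bp must be 0 or greater")
    if from_coordinates:
        return (start, end)
    # Clamp at position 1 so the interval never runs off the chromosome start.
    return (max(1, start - flank_bp), end + flank_bp)
```

With a flank of 100 bp, a gene-derived target spanning 1000-2000 becomes 900-2100, whereas the same interval entered as genomic coordinates stays 1000-2000.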

Allow Synonyms

When this check box is marked, SureDesign compares the gene names you entered into the Targets area to a table of synonyms, and may use the synonym names to map the genes to a genomic location. For example, if you entered HER2 as a target, SureDesign would identify HER2 as a product of the gene ERBB2, and use ERBB2 to map the genomic location.

If the gene name for your target is also a synonym for another gene, SureDesign treats both genes as targets when Allow Synonyms is marked. For example, if you entered DSP as a target, SureDesign would identify your target as the official gene name for desmoplakin, but it would also identify it as a synonym for the gene encoding dentin sialophosphoprotein. Consequently, the program would map the target to two completely different genes, and in the next step of the wizard (Screen 2), you would see both genomic locations listed for the target.

When the Allow Synonyms check box is cleared, SureDesign maps your targets to genomic locations using only the entered gene names.
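The HER2 and DSP examples above can be modeled with a small lookup sketch in Python. The tables OFFICIAL_NAMES and SYNONYMS and the function resolve_gene_name are hypothetical stand-ins for illustration; SureDesign's actual synonym table and matching logic are not documented here.

```python
# Toy tables for illustration only (not SureDesign's real data).
OFFICIAL_NAMES = {"ERBB2", "DSP", "DSPP"}
SYNONYMS = {
    "HER2": {"ERBB2"},  # HER2 is a synonym for ERBB2
    "DSP": {"DSPP"},    # DSP is also a synonym for dentin sialophosphoprotein
}


def resolve_gene_name(name, allow_synonyms=True):
    """Return the sorted list of gene names that an entered name maps to."""
    key = name.upper()  # gene names are not case-sensitive
    matches = set()
    if key in OFFICIAL_NAMES:
        matches.add(key)
    if allow_synonyms:
        matches |= SYNONYMS.get(key, set())
    return sorted(matches)
```

In this model, resolve_gene_name("DSP") returns both DSP and DSPP, and resolve_gene_name("DSP", allow_synonyms=False) returns only DSP. The sketch also assumes that an entry such as HER2 finds no match when synonyms are not allowed.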

To fully control how SureDesign maps your targets to a genomic location, enter your targets using transcript IDs, gene IDs, or SNP IDs instead of gene names. Alternatively, after you advance to the next step of the wizard, click Download to download the Regions.bed file and then edit the genomic locations listed in the file so that they accurately match those of your targets. You can then go back to the previous step of the wizard and paste the genomic locations into the Targets input area.


Click Next to continue to Screen 2.

Screen 2 - Review Targets

This screen provides a chance for you to make sure that SureDesign successfully recognized all of the target identifiers that you entered on the previous screen. Review the Target Summary and Target Details before proceeding.

Target Summary

Near the top of the wizard window is a target summary with three bullet points that indicate:

·        1st bullet point: The number of target identifiers entered in the Define Targets step.

·        2nd bullet point: The number of target identifiers that SureDesign was able to resolve to a genomic location. (If any of the target identifiers mapped to more than one genomic location, the number of targets found is greater than the number entered. See SureDesign gene finder for more information on how SureDesign maps target IDs to targets.)

·        3rd bullet point: The number of target identifiers that SureDesign was not able to find in any of the databases you selected in the Define Targets step.
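The relationship between the three numbers can be sketched in Python as follows. The function summarize_targets and the shape of its input (a dictionary from each entered identifier to the list of locations found for it) are assumptions made for illustration, and the location labels are placeholders rather than real coordinates.

```python
def summarize_targets(resolved):
    """Return (entered, found, not_found) for a mapping of
    entered identifier -> list of genomic locations found for it."""
    entered = len(resolved)
    # Every location counts, so one identifier can contribute more than once.
    found = sum(len(locations) for locations in resolved.values())
    not_found = sum(1 for locations in resolved.values() if not locations)
    return entered, found, not_found


# Placeholder data: DSP resolves to two locations, NOTAGENE to none.
example = {
    "BRCA1": ["location A"],
    "DSP": ["location B", "location C"],
    "NOTAGENE": [],
}
print(summarize_targets(example))  # (3, 3, 1)
```

Here three identifiers were entered and one was not found, yet the number found is still three because DSP mapped to two locations.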

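If SureDesign did not accurately identify all of your target regions, go back to the previous step of the wizard and revise the entries in the Targets area, for example by entering transcript IDs, gene IDs, or genomic coordinates instead of gene names.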

Target Details

The Target Details table lists the following information for each of the target identifiers that SureDesign was able to locate:

·        Target ID - The target ID is the gene name, transcript ID, SNP ID, or genomic coordinates that you used to define the target.

·        # Regions - The # Regions column lists the number of target regions within the target.

·        Base Pairs - The Base Pairs column lists the total number of base pairs within the regions defined by the target identifier.

·        Position - The Position column lists the genomic coordinates identified for the target.

 NOTE  To perform a careful review of the individual regions, click View targets in UCSC to open the UCSC Genome Browser and see the genomic locations of the regions identified by SureDesign.
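As a rough check on the # Regions and Base Pairs columns, the values for a target can be recomputed from its regions. The Python sketch below uses a hypothetical function, target_details, and assumes 1-based inclusive coordinates and non-overlapping regions; the exact counting convention that SureDesign uses is not stated in this topic.

```python
def target_details(target_id, regions):
    """Summarize one target from a list of (chromosome, start, end) regions.

    Assumes 1-based inclusive coordinates and non-overlapping regions.
    """
    base_pairs = sum(end - start + 1 for _chrom, start, end in regions)
    return {
        "Target ID": target_id,
        "# Regions": len(regions),
        "Base Pairs": base_pairs,
    }


# The geneX example from the Targets section: two intervals, one target ID.
row = target_details("geneX", [("chr1", 1, 100), ("chr1", 201, 300)])
```

Under these assumptions, geneX has 2 regions covering 200 base pairs in total.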


Click Next to continue to Screen 3.

Screen 3 - Enter Parameters

This screen contains the parameters that control how SureDesign selects probes for the probegroup. When you are finished with your selections, submit the design to SureDesign to begin the probe selection process.

Probegroup Name

SureDesign automatically populates the Probegroup Name field with a default name. To change the name, type a new name into the field.

Selection Method

Use the radio buttons to specify how SureDesign selects probes from the HD database to cover your target regions of interest. The options are:

·        Total Probes - Select this option to specify the total number of unique HD probes that you want SureDesign to add to the probegroup.

·        Probes Per Interval - Select this option to specify the maximum number of HD probes that you want SureDesign to add to the probegroup for each target interval.

In the field below the radio buttons, type the desired total number of probes (if you selected Total Probes) or the desired maximum number of probes per interval (if you selected Probes Per Interval). You must enter a number greater than 0.
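To estimate how large a probegroup each option can produce, the two settings can be modeled as in the Python sketch below. The class SelectionMethod, the mode strings, and the function max_probes are hypothetical names for illustration; the sketch computes only an upper bound and does not model how SureDesign chooses individual probes.

```python
from dataclasses import dataclass

VALID_MODES = ("total_probes", "probes_per_interval")


@dataclass(frozen=True)
class SelectionMethod:
    mode: str   # "total_probes" or "probes_per_interval"
    count: int  # the number typed into the field

    def __post_init__(self):
        if self.mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")
        if not isinstance(self.count, int) or self.count <= 0:
            raise ValueError("count must be a whole number greater than 0")


def max_probes(method, n_intervals):
    """Upper bound on the number of probes the setting allows."""
    if method.mode == "total_probes":
        return method.count
    # Probes Per Interval is a per-interval maximum, so scale by interval count.
    return method.count * n_intervals
```

For a design with 40 target intervals, a Probes Per Interval value of 25 allows at most 1000 probes, while a Total Probes value of 5000 allows at most 5000 regardless of the number of intervals. The actual number selected may be lower if fewer suitable probes exist in the HD database.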

Similarity Filter

Use the similarity filter to set the specificity of the probes. Optimally, the HD probes that SureDesign selects hybridize to only one genomic location.

·        No Filter - Select this option if you do not want SureDesign to filter probes based on specificity. Keep in mind that data from probes with more than one matching sequence in the genome may be harder to interpret.

·        Similarity Score - Select this option to filter out all probes with significant similarity to multiple genomic locations. This is the most stringent filter option. For some of your target regions, SureDesign may not be able to identify any probes in the HD database that pass this filter.

·        Perfect Match - Select this option to filter out all probes that match more than one location in the genome. If you select this option, SureDesign will not select any probes in segmental duplication regions or pseudo-autosomal regions.
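The relative stringency of the three options can be illustrated with a toy model in Python. Everything in this sketch is an assumption: the probe fields perfect_matches and similar_locations, the mode strings, and the rule that a probe passes only when the relevant count equals 1. SureDesign's real similarity scoring is not described in this topic.

```python
def passes_filter(probe, mode):
    """Return True if a probe record passes the chosen similarity filter.

    probe["perfect_matches"]:   genomic locations the probe matches exactly
    probe["similar_locations"]: locations with significant similarity,
                                counting the intended location itself
    """
    if mode == "no_filter":
        return True
    if mode == "perfect_match":
        return probe["perfect_matches"] == 1
    if mode == "similarity_score":
        # Most stringent: unique by exact match and by similarity.
        return probe["perfect_matches"] == 1 and probe["similar_locations"] == 1
    raise ValueError(f"unknown filter mode: {mode!r}")


# Hypothetical probe records for illustration.
probes = [
    {"name": "unique", "perfect_matches": 1, "similar_locations": 1},
    {"name": "near_duplicate", "perfect_matches": 1, "similar_locations": 3},
    {"name": "duplicate", "perfect_matches": 2, "similar_locations": 2},
]
kept = {
    mode: [p["name"] for p in probes if passes_filter(p, mode)]
    for mode in ("no_filter", "perfect_match", "similarity_score")
}
```

In this model, No Filter keeps all three probes, Perfect Match drops only the exact duplicate, and Similarity Score keeps only the unique probe, which mirrors the ordering of stringency described above.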


To submit the design for probe selection:

  1. Click Begin Probe Selection.

    A message box opens indicating the e-mail address where Agilent will contact you when the probe selection job is complete. If desired, you can enter additional e-mail addresses into the provided field.

  2. Click OK in the notification message to submit the design to SureDesign.

    Your submission is placed in the SureDesign job queue to await probe selection.

    The wizard takes you to the Add/Review Content screen. The new probegroup appears in the Probegroup Summary table. The # Targets column of the table lists Processing until the probe selection job is complete. Click the refresh icon to see updates to this column.

    You receive an e-mail from Agilent SureDesign notifying you when your job is complete. In order to finalize the design, you must wait until your probe selection job is complete.