When you create a SureSelect design or probegroup using the advanced wizard, you have the option to upload a group of probes from a file.
Follow these requirements when preparing a probe upload file:
· The file must be a tab-delimited text file (*.txt or *.tdt) that has been saved within a compressed folder (*.zip).
· If your file includes comments, the comment lines must start with a "#" symbol.
· For target enrichment probes, the probe information in the file must be in a 4-column, 6-column, 7-column (RNA designs only), or 8-column format, with each column completed as described in the tables below. The 4-column file is a minimal format for use when you do not wish to enter the genomic coordinates of the probes, while the 6- and 8-column formats require genomic coordinates to be provided.
· For microarray probes, the probe information in the file must be in a 2-column or 6-column format, with each column completed as described in the tables below.
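The file requirements above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the `write_probe_zip` helper, the file name `probes.txt`, and the example record are hypothetical. The text above does not say whether a header row is expected, so the sketch writes none.

```python
import io
import zipfile

def write_probe_zip(zip_target, rows, txt_name="probes.txt", comment=None):
    """Write rows as tab-delimited text inside a ZIP archive and return the text."""
    lines = []
    if comment is not None:
        lines.append("#" + comment)  # comment lines must start with "#"
    for row in rows:
        lines.append("\t".join(str(field) for field in row))
    text = "\n".join(lines) + "\n"
    with zipfile.ZipFile(zip_target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(txt_name, text)
    return text

# Hypothetical 4-column record: TargetID, ProbeID, Sequence (120 nt), Replication
example_rows = [("BRCA1", "probe_0001", "ACGT" * 30, 1)]

# An in-memory buffer stands in for a path such as "probes.zip"
buffer = io.BytesIO()
written_text = write_probe_zip(buffer, example_rows, comment=" example upload")
```

Passing a file path instead of the buffer writes the archive to disk.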
The columns of a 2-column probe upload file for microarray designs are described in the table below.
NOTE: SureDesign does not download annotation information for probes uploaded from a file. Probes uploaded from a 2-column file do not have any annotation information associated with them.
Column # | Column header | Description of column content |
1 | ProbeID | The ProbeID is a unique identifier for the probe sequence. The ProbeID can be up to 100 characters long. |
2 | Sequence | This column contains the complete sequence of the probe in 5' to 3' orientation. The sequence can contain only A, T, G, and C characters. All sequences must be 20-60 nucleotides long. |
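Before uploading, the rules in this table can be checked with a short script. The following Python sketch is one possible approach; the function names are hypothetical, and it assumes uppercase letters only, because the table does not say whether lowercase bases are accepted.

```python
import re

SEQ_PATTERN = re.compile(r"[ATGC]+")

def check_microarray_probe(probe_id, sequence, min_len=20, max_len=60):
    """Return a list of problems found; an empty list means the row passes."""
    problems = []
    if not probe_id or len(probe_id) > 100:
        problems.append("ProbeID must be 1-100 characters long")
    if not SEQ_PATTERN.fullmatch(sequence):
        problems.append("Sequence may contain only A, T, G, and C")
    if not min_len <= len(sequence) <= max_len:
        problems.append(f"Sequence must be {min_len}-{max_len} nucleotides long")
    return problems

def find_duplicate_ids(probe_ids):
    """Return each ProbeID that appears more than once, in first-seen order."""
    seen, duplicates = set(), []
    for probe_id in probe_ids:
        if probe_id in seen and probe_id not in duplicates:
            duplicates.append(probe_id)
        seen.add(probe_id)
    return duplicates
```

For the 120-nucleotide SureSelect and HaloPlex formats, the same check can be called with `min_len=120, max_len=120`.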
The columns of a 4-column probe upload file for SureSelect or HaloPlex designs are described in the table below.
Column # | Column header | Description of column content |
1 | TargetID | The TargetID is an identifier that describes the target of the probe sequence. For example, the TargetID may be a gene name (e.g. BRCA1). You can have more than one probe with the same TargetID. The TargetID can be up to 100 characters long. |
2 | ProbeID | The ProbeID is a unique identifier for the probe sequence. The ProbeID can be up to 100 characters long. |
3 | Sequence | This column contains the complete sequence of the probe in 5' to 3' orientation. The sequence can contain only A, T, G, and C characters. All sequences must be 120 nucleotides long. |
4 | Replication | The number in this column indicates the number of times that the probe is replicated within the probegroup. This column allows you to control the replication number for each probe in the probegroup. Alternatively, you can enter 1 in this column for all probes and then override that entry by selecting Balanced, Max Performance, or Max Performance - XTHS/XT Low Input only in the wizard's Boosting setting. With any of these boosting options, SureDesign assigns a replication number to each probe based on its GC content. |
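As an illustration of the 4-column layout, the Python sketch below reads tab-delimited text, skips "#" comment lines, and reports the GC fraction of each probe. The function names are hypothetical, and the sketch assumes the file has no header row and that Replication is a whole number of at least 1. How SureDesign maps GC content to a replication number is not described here, so the sketch only computes the GC fraction.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

def parse_four_column(text):
    """Yield one dict per probe line, skipping blank lines and '#' comments."""
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 columns, got {len(fields)}")
        target_id, probe_id, sequence, replication = fields
        if len(sequence) != 120:
            raise ValueError(f"line {number}: sequence is {len(sequence)} nt, not 120")
        count = int(replication)  # raises ValueError if not a whole number
        if count < 1:
            raise ValueError(f"line {number}: replication must be at least 1")
        yield {
            "target_id": target_id,
            "probe_id": probe_id,
            "sequence": sequence,
            "replication": count,
            "gc": gc_fraction(sequence),
        }
```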
The columns of a 6-column probe upload file for microarray designs are described in the table below.
Column # | Column header | Description of column content |
1 | ProbeID | The ProbeID is a unique identifier for the probe sequence. The ProbeID can be up to 100 characters long. |
2 | Sequence | This column contains the complete sequence of the probe in 5' to 3' orientation. The sequence can contain only A, T, G, and C characters. All sequences must be 20-60 nucleotides long. |
3 | Coordinates | This column lists the genomic coordinates of the probe. SureDesign uses the probe coordinates to compute the total capture size of the design. Coordinates must be provided in standard browser format, e.g. chr1:1-100. |
4 | Accessions | This column lists any accession numbers for sequences that overlap the probe coordinates. Separate the database source and accession number with a "|" character (e.g. ref|NM_008837, for RefSeq accession numbers). The database source can be up to 5 characters long. The accession number can be up to 25 characters long. Separate multiple accessions with a "|" character. You can enter up to 20 accession numbers. |
5 | Gene Symbol | This column lists any genes that overlap the probe coordinates. Enter the gene as a name (e.g. "Ube1x") or as a database source followed by a gene symbol (e.g. "ref|Prkcabp"). Separate multiple genes with a "|" character. |
6 | Description | Use this column to add any description of your choosing. The description can be up to 200 characters long. |
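The length limits for the Accessions column can be checked with a sketch like the one below. Because the same "|" character separates both the source from the accession and one accession from the next, this sketch assumes the field alternates source, accession, source, accession. That reading is an assumption, and `parse_accessions` is a hypothetical helper name.

```python
def parse_accessions(field, max_pairs=20, max_source=5, max_accession=25):
    """Split an Accessions field into (source, accession) pairs and check limits."""
    tokens = field.split("|") if field else []
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating source|accession entries")
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    if len(pairs) > max_pairs:
        raise ValueError(f"more than {max_pairs} accession numbers")
    for source, accession in pairs:
        if len(source) > max_source:
            raise ValueError(f"database source too long: {source!r}")
        if len(accession) > max_accession:
            raise ValueError(f"accession number too long: {accession!r}")
    return pairs
```

For example, `parse_accessions("ref|NM_008837")` returns a single pair, `("ref", "NM_008837")`.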
The columns of a 6-column probe upload file for SureSelect or HaloPlex designs are described in the table below.
Column # | Column header | Description of column content |
1 | TargetID | The TargetID is an identifier that describes the target of the probe sequence. For example, the TargetID may be a gene name (e.g. BRCA1). You can have more than one probe with the same TargetID. The TargetID can be up to 100 characters long. |
2 | ProbeID | The ProbeID is a unique identifier for the probe sequence. The ProbeID can be up to 100 characters long. |
3 | Sequence | This column contains the complete sequence of the probe in 5' to 3' orientation. The sequence can contain only A, T, G, and C characters. All sequences must be 120 nucleotides long. |
4 | Replication | The number in this column indicates the number of times that the probe is replicated within the probegroup. This column allows you to control the replication number for each probe in the probegroup. Alternatively, you can enter 1 in this column for all probes and then override that entry by selecting Balanced, Max Performance, or Max Performance - XTHS/XT Low Input only in the wizard's Boosting setting. With any of these boosting options, SureDesign assigns a replication number to each probe based on its GC content. |
5 | Strand | An entry of "+" indicates that the probe is the sense strand and it captures the antisense strand of the target. An entry of "–" indicates that the probe is the antisense strand and it captures the sense strand of the target. For each probe, the Strand column must contain a single "+" or "–" character. |
6 | Coordinates | This column lists the genomic coordinates of the probe. SureDesign uses the probe coordinates to compute the total capture size of the design. Coordinates must be provided in standard browser format, e.g. chr1:1-100. |
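The Coordinates and Strand columns can be checked with a sketch such as the following. It treats browser-format coordinates as one-based and closed, so chr1:1-100 spans 100 bases. The function names are hypothetical. Accepting both an ASCII hyphen and the "–" character printed in the table for the minus strand is an assumption about what the upload accepts.

```python
import re

COORD_PATTERN = re.compile(r"(chr[^:\s]+):(\d+)-(\d+)", re.IGNORECASE)

def parse_browser_coords(text):
    """Parse 'chr1:1-100' into (chromosome, start, end), one-based and closed."""
    match = COORD_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in browser format: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {text!r}")
    return chrom, start, end

def span_length(start, end):
    """Number of bases in a one-based, closed interval."""
    return end - start + 1

def normalize_strand(value):
    """Map '+', '-', or the en dash shown in the table to '+' or '-'."""
    if value == "+":
        return "+"
    if value in ("-", "\u2013"):
        return "-"
    raise ValueError(f"Strand must be a single '+' or '-': {value!r}")
```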
The columns of a 7-column probe upload file (RNA designs only) are described in the table below.
Column # | Column header | Description of column content |
1 | TargetID | The TargetID is an identifier that describes the target of the probe sequence. For example, the TargetID may be a gene name (e.g. BRCA1). You can have more than one probe with the same TargetID. The TargetID can be up to 100 characters long. |
2 | ProbeID | The ProbeID is a unique identifier for the probe sequence. The ProbeID can be up to 100 characters long. |
3 | Sequence | This column contains the complete sequence of the probe in 5' to 3' orientation. The sequence can contain only A, T, G, and C characters. All sequences must be 120 nucleotides long. |
4 | Replication | The number in this column indicates the number of times that the probe is replicated within the probegroup. This column allows you to control the replication number for each probe in the probegroup. Alternatively, you can enter 1 in this column for all probes and then override that entry by selecting Balanced, Max Performance, or Max Performance - XTHS/XT Low Input only in the wizard's Boosting setting. With any of these boosting options, SureDesign assigns a replication number to each probe based on its GC content. |
5 | TranscriptLocation | This column lists the location of the probe within the target transcript sequence. The required format is similar to that for a genomic location (i.e., standard browser format, one-based and closed), but the transcript ID replaces the chromosome number. |
6 | Strand | An entry of "+" indicates that the probe captures the sense strand of the target. An entry of "–" indicates that the probe captures the antisense strand of the target. For each probe, the Strand column must contain a single "+" or "–" character. |
7 | Chromosome | This column lists the chromosome number of the probe. SureDesign uses the probe coordinates (in columns 6, 7, and 8) to compute the total capture size of the design. The chromosome number must be provided in standard BED format, e.g. "Chr1". |
The columns of an 8-column probe upload file are described in the table below.
Column # | Column header | Description of column content |
1 | TargetID | The TargetID is an identifier that describes the target of the probe sequence. For example, the TargetID may be a gene name (e.g. BRCA1). You can have more than one probe with the same TargetID. The TargetID can be up to 100 characters long. |
2 | ProbeID | The ProbeID is a unique identifier for the probe sequence. The ProbeID can be up to 100 characters long. |
3 | Sequence | This column contains the complete sequence of the probe in 5' to 3' orientation. The sequence can contain only A, T, G, and C characters. All sequences must be 120 nucleotides long. |
4 | Replication | The number in this column indicates the number of times that the probe is replicated within the probegroup. This column allows you to control the replication number for each probe in the probegroup. Alternatively, you can enter 1 in this column for all probes and then override that entry by selecting Balanced, Max Performance, or Max Performance - XTHS/XT Low Input only in the wizard's Boosting setting. With any of these boosting options, SureDesign assigns a replication number to each probe based on its GC content. |
5 | Strand | An entry of "+" indicates that the probe captures the sense strand of the target. An entry of "–" indicates that the probe captures the antisense strand of the target. For each probe, the Strand column must contain a single "+" or "–" character. |
6 | Chromosome | This column lists the chromosome number of the probe. SureDesign uses the probe coordinates (in columns 6, 7, and 8) to compute the total capture size of the design. The chromosome number must be provided in standard BED format, e.g. "Chr1". |
7 | Start | This column lists the nucleotide start position of the probe. SureDesign uses the probe coordinates (in columns 6, 7, and 8) to compute the total capture size of the design. The nucleotide number must be provided in standard BED format (0-based, half-open). |
8 | Stop | This column lists the nucleotide stop position of the probe. SureDesign uses the probe coordinates (in columns 6, 7, and 8) to compute the total capture size of the design. The nucleotide number must be provided in standard BED format (0-based, half-open). |
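Because the 8-column format uses BED-style positions (0-based, half-open) while the 6-column format uses browser-style coordinates (one-based, closed), converting between the two is a common source of off-by-one errors. The Python sketch below shows the conversion and one plausible way to total the bases covered by a set of probes, counting overlapping regions once. The function names are hypothetical, and the exact method SureDesign uses to compute capture size is not described here.

```python
from collections import defaultdict

def browser_to_bed(start, end):
    """Convert a one-based, closed interval to 0-based, half-open (BED)."""
    return start - 1, end

def total_covered_bases(bed_rows):
    """Sum the bases covered by (chromosome, start, stop) BED intervals,
    counting overlapping or touching intervals only once."""
    by_chrom = defaultdict(list)
    for chrom, start, stop in bed_rows:
        by_chrom[chrom].append((start, stop))
    total = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_stop = intervals[0]
        for start, stop in intervals[1:]:
            if start <= cur_stop:  # overlaps or touches the current run
                cur_stop = max(cur_stop, stop)
            else:
                total += cur_stop - cur_start
                cur_start, cur_stop = start, stop
        total += cur_stop - cur_start
    return total
```

A 120-nucleotide probe at browser coordinates 1-120 becomes Start 0 and Stop 120 in BED terms, and its length is Stop minus Start.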
See Also
Overview of the SureDesign advanced options
View and search for probegroups