Target enrichment design files available for download

Different types of target enrichment designs have different sets of design files available for download. Use the links below to find more information on the files available for a particular design type.

SureSelect designs

Files for custom SureSelect DNA designs

Files for advanced SureSelect DNA designs

Files for custom SureSelect RNA designs

Files for advanced SureSelect RNA designs

Files for custom SureSelect Cancer CGP designs

Files for OneSeq designs

Files for SureSelect designs transferred from eArray

Files for SureSelect catalog designs

HaloPlex designs

Files for custom and catalog HaloPlex and HaloPlexHS designs

Files for advanced HaloPlex and HaloPlexHS designs

 


Files for custom SureSelect DNA designs

SureSelect DNA designs created in SureDesign with the standard wizard have PDF, BED, and text design files available for download.

PDF Report file

The PDF report file has the file name [design ID]_Report.pdf.

This file contains summary information on the submitted targets, probe selection parameters, coverage statistics, recommended minimum sequencing, and overall success of the design in covering the targets. This same information, and additional information, is provided in spreadsheet format in the Report text file.

BED files

The BED-format track files that SureDesign creates for custom SureSelect DNA designs are described below. You can import these files into a compatible genome browser to graphically view the locations of the tracks in the genome. For detailed information on the tracks and how they can help you analyze your design, see Design analysis using tracks.

[design ID]_Regions.bed - This BED file contains a single track of the target regions of interest that SureDesign used to select the probes. You can use this track to see the exact regions that the program was attempting to cover when selecting the probes.

[design ID]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

[design ID]_AllTracks.bed - This multitrack BED file includes the following tracks:

·        The Target Regions track is identical to the track in the Regions BED file.

·        The Covered probes track is identical to the track in the Covered BED file.

·        The Padded Covered track contains the covered regions (from either the Covered or Covered_partial BED file) with 50 bp of padding added on each side of all regions.

·        The No Probes track contains any regions from the Target Regions track that are not included in the Covered probes track.
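The relationship between these tracks can be sketched with plain interval arithmetic. The following is a minimal illustration, not SureDesign's implementation: the helper names (merge, pad, subtract) and the sample coordinates are hypothetical, all intervals are assumed to lie on one chromosome, and merging overlapping padded regions is an assumption rather than documented behavior.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on one chromosome, as in BED


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def pad(intervals: List[Interval], padding: int = 50) -> List[Interval]:
    """Extend each interval by `padding` bp on both sides (clamped at 0), then merge."""
    return merge([(max(0, s - padding), e + padding) for s, e in intervals])


def subtract(targets: List[Interval], covered: List[Interval]) -> List[Interval]:
    """Return the parts of `targets` that no interval in `covered` overlaps."""
    covered = merge(covered)
    remaining: List[Interval] = []
    for t_start, t_end in merge(targets):
        cursor = t_start
        for c_start, c_end in covered:
            if c_end <= cursor:
                continue  # covered interval lies entirely before the cursor
            if c_start >= t_end:
                break  # covered interval lies entirely after this target
            if c_start > cursor:
                remaining.append((cursor, c_start))
            cursor = max(cursor, c_end)
        if cursor < t_end:
            remaining.append((cursor, t_end))
    return remaining


# Hypothetical coordinates for illustration only.
target_regions = [(1000, 1500), (3000, 3200)]
covered = [(1100, 1400)]

padded_covered = pad(covered)                  # analogous to the Padded Covered track
no_probes = subtract(target_regions, covered)  # analogous to the No Probes track
```

With these sample values, the padded region is (1050, 1450), and the uncovered pieces are the two flanks of the first target plus the whole second target.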

Text files

The text files for a custom SureSelect DNA design are described below. You can view these files in any text editor program (e.g., NotePad) or spreadsheet program (e.g., Excel). Any tables embedded in the text files are tab-delimited and contain column headers. Lines of text that start with a # character are comment lines.

[design ID]_Targets.txt - This file contains a list of the target identifiers that you entered when creating the design.

[design ID]_Report.txt - This file contains summary information on the design, the probes, the targets, and the parameters used to create the design.
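If you prefer to read these text files in a script rather than a spreadsheet, the conventions above (tab-delimited tables with column headers, comment lines starting with #) map directly onto a standard CSV reader. The sketch below assumes a file that contains a single table; the function name read_table, the column names, and the sample rows are hypothetical and do not reflect the actual layout of any SureDesign file.

```python
import csv
import io
from typing import Dict, List


def read_table(text: str) -> List[Dict[str, str]]:
    """Parse one tab-delimited table, skipping '#' comment lines and blank lines."""
    data_lines = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    # The first remaining line is treated as the column header row.
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    return list(reader)


# Hypothetical content; real files may use different comments and columns.
sample = (
    "# Example comment line\n"
    "# Another comment line\n"
    "TargetID\tRegions\tSize\n"
    "GENE_A\t12\t3400\n"
    "GENE_B\t7\t1850\n"
)

rows = read_table(sample)
for row in rows:
    print(row["TargetID"], row["Size"])
```

A file that embeds several tables would need to be split into sections first, since this reader treats everything after the comments as one table.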

 


Files for advanced SureSelect DNA designs

For custom SureSelect DNA designs created with the SureDesign advanced wizard, the program generates a set of download files that pertain to the whole design as well as a set of files for each probegroup in the design. The compressed folder contains a subfolder for each probegroup containing the appropriate set of files.

Files for the whole design

The files that pertain to the design are a PDF report file, a text file, and a BED-format track file. These files are described below.

·        [design ID]_Report.pdf - This file contains summary information on the submitted targets, probe selection parameters, coverage statistics, recommended minimum sequencing, and overall success of the design in covering the targets.

·        [design ID]_Report.txt - This file contains the same information as the Report PDF file along with some additional information.

·        [design ID]_Covered.bed OR [design ID]_Covered_partial.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

·        If all of the probegroups in the design have their own Covered BED file, then SureDesign generates a Covered BED file for the full design. This design-level Covered BED file contains the merged regions from all of the Covered BED files from the individual probegroups.

·        If some of the probegroups in the design do not have a Covered BED file (for example, if one of the probegroups was created with a 4-column probe upload file), then SureDesign generates a Covered_partial BED file. The Covered_partial BED file contains the merged regions from the probegroups that have Covered BED files.

·        If none of the probegroups in the design have a Covered BED file, then SureDesign does not generate a design-level Covered or Covered_partial BED file. This is the case for designs in which all of the probegroups in the design were created with a probe upload file.

·        [design ID]_Regions.bed OR [design ID]_Regions_partial.bed - This BED file contains a single track of the target regions of interest. It contains the merged regions from all of the Regions BED files from the individual probegroups. Note that regions are merged per target; that is, if different targets have overlapping regions, those regions are not merged. For probegroups created by tiling, the Regions BED file contains the regions that SureDesign was attempting to cover when selecting the probes for that probegroup. Probegroups created from a probe upload file only include a Regions BED file if one was provided by the user during creation of the probegroup.

·        If all of the probegroups in the design have their own Regions BED file, then SureDesign generates a Regions BED file for the full design. This design-level Regions BED file contains the merged regions from all of the Regions BED files from the individual probegroups.

·        If some of the probegroups in the design do not have a Regions BED file (for example, if one of the probegroups was created with a probe upload file and no Regions BED was provided by the user), then SureDesign generates a Regions_partial BED file. The Regions_partial BED file contains the merged regions from the probegroups that have Regions BED files.

·        If none of the probegroups in the design have a Regions BED file, then SureDesign does not generate a design-level Regions or Regions_partial BED file. This is the case for designs in which all of the probegroups in the design were created with a probe upload file with no Regions BED file provided.

·        [design ID]_AllTracks.bed - SureDesign generates this multitrack BED file only for advanced designs that have a complete Regions BED file and a complete Covered BED file. It includes the following tracks:

·        The Target Regions track is identical to the track in the Regions BED file.

·        The Covered probes track is identical to the track in the Covered BED file.

·        The Padded Covered track contains the covered regions (from either the Covered or Covered_partial BED file) with 50 bp of padding added on each side of all regions.

·        The No Probes track contains any regions from the Target Regions track that are not included in the Covered probes track.

·        The Missed Regions track contains any regions from the Target Regions track that are not included in the Padded Covered track.
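The rules above for which design-level BED files are generated can be summarized as a small decision function. This is a sketch of the documented rules only, not SureDesign code: the Probegroup type, the function names, and the sample design ID are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Probegroup(NamedTuple):
    name: str
    has_covered_bed: bool
    has_regions_bed: bool


def bed_name(design_id: str, kind: str, flags: List[bool]) -> Optional[str]:
    """Return the complete or _partial file name, or None if no probegroup has the file."""
    if not any(flags):
        return None
    suffix = "" if all(flags) else "_partial"
    return f"{design_id}_{kind}{suffix}.bed"


def design_level_bed_files(design_id: str, groups: List[Probegroup]) -> List[str]:
    """List the design-level BED files expected for an advanced SureSelect DNA design."""
    covered_flags = [g.has_covered_bed for g in groups]
    regions_flags = [g.has_regions_bed for g in groups]
    names = [
        bed_name(design_id, "Covered", covered_flags),
        bed_name(design_id, "Regions", regions_flags),
    ]
    files = [n for n in names if n is not None]
    # AllTracks requires a complete Regions file and a complete Covered file.
    if groups and all(covered_flags) and all(regions_flags):
        files.append(f"{design_id}_AllTracks.bed")
    return files


# Hypothetical design: one tiled probegroup and one 4-column probe upload.
groups = [
    Probegroup("tiled", has_covered_bed=True, has_regions_bed=True),
    Probegroup("upload_4col", has_covered_bed=False, has_regions_bed=False),
]
print(design_level_bed_files("1234567", groups))
```

For this mixed design, the function returns the two _partial file names and no AllTracks file.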

Files for the individual probegroups

SureSelect DNA probegroups generated by selecting optimized probes, tiling, or selecting probes from an existing design or probegroup have the same files as a custom SureSelect design created with the standard SureDesign wizard. See Files for custom SureSelect DNA designs for descriptions of these files.

For SureSelect DNA probegroups generated from a probe upload job, the file set includes a Report text file. If the probe upload file included the coordinates of the probes (i.e., a 6-column or 8-column file), then SureDesign generates a Covered BED file for the probegroup. If a user-provided Regions BED file was uploaded during creation of the probegroup, then SureDesign generates a Regions BED file for the probegroup.

 


Files for custom SureSelect RNA designs

SureSelect RNA designs created in SureDesign with the standard wizard have PDF, BED, and text design files available for download.

PDF Report file

The PDF report file has the file name [design ID]_Report.pdf.

This file contains summary information on the submitted targets, probe selection parameters, coverage statistics, recommended minimum sequencing, and overall success of the design in covering the targets. This same information, and additional information, is provided in spreadsheet format in the Report text file.

BED files

The BED-format track files that SureDesign creates for custom SureSelect RNA designs are described below. You can import these files into a compatible genome browser to graphically view the locations of the tracks in the genome. For detailed information on the tracks and how they can help you analyze your design, see Design analysis using tracks.

[design ID]_Covered.bed OR [design ID]_Covered_partial.bed - Both of these BED files contain genomic intervals that correspond to the RNA transcript intervals that are covered by one or more probes in the SureSelect RNA design.

·        When SureDesign is able to successfully map all of the RNA transcripts covered by the design into the equivalent genomic coordinates, then a Covered BED file is generated.

·        When SureDesign is not able to translate all of the RNA transcripts covered by the design into equivalent genomic coordinates, then a Covered_partial BED file is generated. The Covered_partial BED file does not include the genomic intervals for the transcript intervals that SureDesign could not map to genomic coordinates.

Text files

The text files for a custom SureSelect RNA design are described below. You can view these files in any text editor program (e.g., NotePad) or spreadsheet program (e.g., Excel). Any tables embedded in the text files are tab-delimited and contain column headers. Lines of text that start with a # character are comment lines.

[design ID]_Targets.txt - This file contains a list of the target identifiers that you entered when creating the design.

[design ID]_Report.txt - This file contains summary information on the design, the probes, the targets, and the parameters used to create the design.

[design ID]_CoveredTranscript.txt - This text file lists the transcript intervals that are covered by one or more probes in the design. Each line of the file is a separate interval. For each interval, the file lists the transcript ID that applies to that interval, as well as the start and stop positions (within the transcript ID) of the interval. Importantly, when target identifiers for SureSelect RNA designs are entered as gene names and/or gene IDs, SureDesign identifies all of the RNA transcripts for those genes that are present in the transcriptome, and uses those transcript sequences for probe selection.

[design ID]_UnmappedTranscripts.txt - This file is available for designs that have a Covered_partial BED file (described above). It contains a list of the transcript IDs that could not be mapped to genomic coordinates and the reason that each is unmapped. The two possible reasons are: 1) the transcript sequence was not found in the genome ("Transcript not mapped on genome"), or 2) the transcript maps to the genome but the sequence length does not match the sum of the exon lengths predicted by the genome sequence ("Transcript length mismatch"). The mismatch in length may be the result of an insertion or deletion or the presence of a poly-A tail.
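The two unmapped reasons, and the resulting choice between a Covered and a Covered_partial BED file, can be illustrated as follows. This is a simplified sketch under stated assumptions and not SureDesign's mapping algorithm: the function names, the transcript IDs, and the exon coordinates are hypothetical, and a transcript with no exon mapping is taken to mean that it was not found in the genome.

```python
from typing import Dict, List, Optional, Tuple

Exon = Tuple[int, int]  # half-open genomic (start, end) of one mapped exon


def unmapped_reason(transcript_length: int, exons: Optional[List[Exon]]) -> Optional[str]:
    """Return the reason a transcript cannot be mapped, or None if it maps cleanly."""
    if not exons:
        return "Transcript not mapped on genome"
    exon_total = sum(end - start for start, end in exons)
    if exon_total != transcript_length:
        return "Transcript length mismatch"
    return None


def summarize(
    design_id: str,
    transcripts: Dict[str, Tuple[int, Optional[List[Exon]]]],
) -> Tuple[str, Dict[str, str]]:
    """Pick the BED file name and collect the unmapped transcripts with reasons."""
    unmapped: Dict[str, str] = {}
    for transcript_id, (length, exons) in transcripts.items():
        reason = unmapped_reason(length, exons)
        if reason is not None:
            unmapped[transcript_id] = reason
    suffix = "_partial" if unmapped else ""
    return f"{design_id}_Covered{suffix}.bed", unmapped


# Hypothetical transcripts: (transcript length, mapped exons or None).
transcripts = {
    "TX_A": (1200, [(100, 700), (900, 1500)]),  # exon lengths sum to 1200
    "TX_B": (1250, [(100, 700), (900, 1500)]),  # 50 extra bases, e.g. a poly-A tail
    "TX_C": (800, None),                        # no genomic mapping found
}

bed_file, unmapped = summarize("1234567", transcripts)
```

The sketch does not model the case in which no transcript at all can be mapped, because the text above does not describe it.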

 


Files for advanced SureSelect RNA designs

For custom SureSelect RNA designs created with the SureDesign advanced wizard, the program generates a set of download files that pertain to the whole design as well as a set of files for each probegroup in the design. The compressed folder contains a subfolder for each probegroup containing the appropriate set of files.

Files for the whole design

The files that pertain to the design are a PDF report file and a text file. These files are described below.

·        [design ID]_Report.pdf - This file contains summary information on the submitted targets, probe selection parameters, coverage statistics, recommended minimum sequencing, and overall success of the design in covering the targets.

·        [design ID]_Report.txt - This file contains the same information as the Report PDF file along with some additional information.

Files for probegroups created with transcript tiling

SureSelect RNA probegroups generated by transcript tiling have the same files as a custom SureSelect RNA design created with the standard SureDesign wizard. See Files for custom SureSelect RNA designs for descriptions of these files.

Files for probegroups created with genomic tiling

For SureSelect RNA probegroups created with genomic tiling, the SureDesign algorithm finds probes by tiling across the targeted DNA sequences at the indicated density, much as it does when finding probes for a standard SureSelect DNA design. For this type of probegroup, SureDesign generates a set of BED files and text files.

You can import the BED files into a compatible genome browser to graphically view the locations of the tracks in the genome. You can view the text files in any text editor program (e.g., NotePad) or spreadsheet program (e.g., Excel). Any tables embedded in the text files are tab-delimited and contain column headers. Lines of text that start with a # character are comment lines.

The BED and text files for a SureSelect RNA probegroup created with genomic tiling are described below.

·        [design ID]_Regions.bed - This BED file contains a single track of the target regions of interest that SureDesign used to select the probes. You can use this track to see the exact regions that the program was attempting to cover when selecting the probes.

·        [design ID]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics, but note that the coverage applies to the genomic DNA sequences and not to the RNA transcriptome selected for the design.

·        [design ID]_AllTracks.bed - This multitrack BED file includes the following tracks:

·        The Target Regions track is identical to the track in the Regions BED file.

·        The Covered probes track is identical to the track in the Covered BED file.

·        The Missed Regions track contains any regions from the Target Regions track that are not included in the Covered probes track.

·        [design ID]_Targets.txt - This file contains a list of the target identifiers that you entered when creating the probegroup.

·        [design ID]_Report.txt - This file contains summary information on the probes, targets, and the parameters used to create the probegroup.

Files for probegroups created by file upload

For SureSelect RNA probegroups generated from a probe upload job, SureDesign generates a set of text files.

·        [design ID]_Report.txt - This file contains summary information pertaining to the probegroup and the probe upload job.

·        [design ID]_Input_Probes.txt - This is the same file that you uploaded into the design wizard to create the probegroup.

·        [design ID]_Probes.txt - This file is a text file listing the probes that were uploaded into the probegroup.

Files for probegroups added from an existing design or probegroup

For SureSelect RNA probegroups added from an existing design or probegroup, the file set depends on which design/probegroup was selected as the source. If the source is a SureSelect DNA design/probegroup, see Files for custom SureSelect DNA designs. If the source is a SureSelect RNA design/probegroup, see Files for custom SureSelect RNA designs.

 


Files for custom SureSelect Cancer CGP designs

For custom SureSelect Cancer CGP designs created with the SureSelect Cancer CGP wizard or the SureSelect Cancer CGP Combined Design wizard, the program generates a set of download files that pertain to the whole design as well as a set of files for each probegroup in the design. The compressed folder contains a subfolder for each probegroup containing the appropriate set of files.

Files for the whole design

The files that pertain to the design are a PDF report file, a text file, and a BED-format track file. These files are described below.

·        [design ID]_Report.pdf - This file contains summary information on the submitted targets, exonic size, probe selection parameters, coverage statistics, recommended minimum sequencing, and overall success of the design in covering the targets.

·        [design ID]_Report.txt - This file contains the same information as the Report PDF file along with some additional information.

·        [design ID]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

Files for the individual probegroups

SureSelect Cancer CGP probegroups have the following download files available:

·        [probegroup name]_Regions_SNV.bed - This BED file contains a single track of the coding regions of the target genes of interest that were used as input for SNV probe selection.

·        [probegroup name]_Regions_TL.bed - This BED file contains a single track of the target regions of interest that were used as input for translocation probe selection.

·        [probegroup name]_Regions_MSI.bed - This BED file is included in the MSI probegroup. It contains a single track of the target regions in the MSI probegroup.

·        [probegroup name]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the probegroup. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

·        [probegroup name]_Covered_SNV.bed - This BED file contains a single track of the genomic regions that are covered by one or more SNV probes in the probegroup.

·        [probegroup name]_Covered_CNV.bed - This BED file contains a single track of the genomic regions that are covered by one or more CNV probes in the probegroup.

·        [probegroup name]_Covered_TL.bed - This BED file contains a single track of the target regions of interest that are covered by one or more translocation probes in the probegroup.

·        [probegroup name]_Covered_MSI.bed - This BED file is included in the MSI probegroup. It contains a single track of the genomic regions that are covered by one or more probes in the MSI probegroup.

·        [probegroup name]_Covered_GenomicBackbone.bed - This BED file is included in the Agilent CGP CNV Backbone 100Kb probegroup. It contains a single track of the genomic regions that are covered by one or more probes in the probegroup.

·        [probegroup name]_AllTracks.bed - This multitrack BED file includes the following tracks:

·        The Target Regions-SNV track contains the coding regions of the target genes of interest that were used as input for SNV probe selection.

·        The Target Regions-TL track contains the full regions of the target genes of interest that were used as input for translocation probe selection.

·        The Covered track contains all of the genomic regions that are covered by one or more probes in the probegroup.

·        The Covered-CNV track contains all of the genomic regions that are covered by one or more CNV probes in the probegroup.

·        The Covered-SNV track contains all of the genomic regions that are covered by one or more SNV probes in the probegroup.

·        The Covered-TL track contains all of the genomic regions that are covered by one or more translocation probes in the probegroup.

·        The Missed-SNV track contains any regions from the Target Regions-SNV track that are not included in the Covered track.

·        The Missed-TL track contains any regions from the Target Regions-TL track that are not included in the Covered track.

·        [probegroup name]_Targets.txt - This file contains a list of the target identifiers that you entered when creating the probegroup.

·        [probegroup name]_Report.txt - This file contains summary information on the probes, the targets, and the parameters used to create the probegroup.

 


Files for OneSeq designs

For OneSeq designs, the program generates a set of download files that pertain to the whole design as well as a set of files for each probegroup in the design. The compressed folder contains a subfolder for each probegroup containing the appropriate set of files.

Files for the whole OneSeq design

The only file that pertains to the whole design is a BED-format track file called [design ID]_Covered.bed. This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.
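As one example of a coverage metric, the total number of covered bases can be computed directly from a Covered BED file. The sketch below relies on the standard BED convention that coordinates are 0-based and half-open, so each region spans end minus start bases. The function name covered_summary, the track line, and the annotation strings are hypothetical; the actual annotation format in the fourth column may differ.

```python
from collections import defaultdict
from typing import Dict, Tuple


def covered_summary(bed_text: str) -> Tuple[int, Dict[str, int]]:
    """Return total covered bases and covered bases per fourth-column annotation."""
    total = 0
    per_annotation: Dict[str, int] = defaultdict(int)
    for line in bed_text.splitlines():
        # Skip blank lines and BED header lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        annotation = fields[3] if len(fields) > 3 else ""
        length = end - start  # BED intervals are 0-based, half-open
        total += length
        per_annotation[annotation] += length
    return total, dict(per_annotation)


# Hypothetical file content for illustration only.
sample_bed = (
    'track name="Covered" description="hypothetical example"\n'
    "chr1\t100\t220\tGENE_A\n"
    "chr1\t500\t620\tGENE_A\n"
    "chr2\t1000\t1120\tGENE_B\n"
)

total_bases, by_annotation = covered_summary(sample_bed)
print(total_bases, by_annotation)
```

The totals are only meaningful if the regions in the file do not overlap one another; overlapping regions would need to be merged first.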

Files for the backbone design

·        [design ID]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the backbone design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

·        [design ID]_AllTracks.bed - This multitrack BED file includes the following tracks:

·        The Target Regions track is identical to the Covered BED file for the backbone design.

·        The Covered probes track contains the genomic regions covered by the backbone design.

Files for the spike-in design

The files available for the spike-in design are the same as those available for a custom or advanced SureSelect design.

 


Files for SureSelect designs transferred from eArray

For SureSelect custom designs that were originally created in Agilent eArray and then migrated into SureDesign, the only available design file is a BED-format track file named [design ID]_Covered.bed. This file contains a single track of the genomic regions that are covered by one or more probes in the design.

 


Files for SureSelect catalog designs

SureDesign provides BED-format track files and a text file for each SureSelect catalog design.

BED files

You can import these files into a compatible genome browser to graphically view the locations of the tracks in the genome.

[design ID]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

[design ID]_Padded.bed - This BED file is provided for catalog SureSelect DNA designs. It contains a single track of the genomic regions that you can expect to sequence when using the design for target enrichment. To determine these regions, the program extends the regions in the Covered BED file by 100 bp on each side.

[design ID]_Regions.bed - This BED file is provided for catalog SureSelect DNA designs. It contains a single track of the target regions of the design. For catalog designs, the track in this file is identical to the track in the Covered BED file.

NOTE: Some SureSelect catalog designs have not been annotated to gene databases. For these designs, the gene annotation information is missing from the Covered BED file.

Text files

You can view the text files in any text editor program (e.g., NotePad) or spreadsheet program (e.g., Excel).

[design ID]_Targets.txt - This file is provided for catalog SureSelect DNA designs. It contains a list of the database identifiers to which the probes in this design were annotated.

 


Files for custom and catalog HaloPlex and HaloPlexHS designs

Catalog HaloPlex designs and designs created with the standard HaloPlex wizard have seven design files available for download: 1 PDF report, 4 BED files, and 2 text files.

PDF Report file

The PDF report file has the file name [design ID]_Report.pdf.

This file contains summary information on the submitted targets, probe selection parameters, coverage statistics, recommended minimum sequencing, and overall success of the design in covering the targets. This same information, and additional information, is provided in spreadsheet format in the Report text file.

BED files

The four BED-format track files that SureDesign creates for each catalog or standard custom HaloPlex design are described below. You can import these files into a compatible genome browser to graphically view the locations of the tracks in the genome. For detailed information on the tracks and how they can help you analyze your design, see Design analysis using tracks.

[design ID]_Regions.bed - This BED file contains a single track of the target regions that SureDesign used to select the probes. You can use this track to see the exact regions that the program was attempting to cover when selecting the probes.

[design ID]_Covered.bed - This BED file contains a single track of the genomic regions that are covered by one or more probes in the design. The fourth column of the file contains annotation information. You can use this file for assessing coverage metrics.

[design ID]_Amplicons.bed - This BED file contains a single track of the genomic regions of the expected PCR amplicons for all probes in the design. It also lists the Amplicon ID and strand for each amplicon.

[design ID]_AllTracks.bed - This multitrack BED file includes the following tracks:

·        The Target Regions track is identical to the track in the Regions BED file.

·        The Covered track is identical to the track in the Covered BED file.

·        The Missed Regions track contains any regions from the Target Regions track that are not included in the Covered track.

·        The Amplicons track is identical to the track in the Amplicons BED file.
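To work with the individual tracks of a multitrack BED file in a script, the file can be split at its track header lines. The sketch below assumes that each track begins with a standard BED "track" line carrying a name attribute; the function name split_tracks, the exact header text, and the sample rows (including the strand in the sixth column of the Amplicons rows) are assumptions, not a description of the actual AllTracks file.

```python
import re
from typing import Dict, List, Optional


def split_tracks(bed_text: str) -> Dict[str, List[List[str]]]:
    """Group the data rows of a multitrack BED file by track name."""
    tracks: Dict[str, List[List[str]]] = {}
    current: Optional[str] = None
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("browser", "#")):
            continue
        if line.startswith("track"):
            # Accept both name="Quoted Name" and name=Unquoted forms.
            match = re.search(r'name="([^"]*)"|name=(\S+)', line)
            if match:
                current = match.group(1) if match.group(1) is not None else match.group(2)
            else:
                current = f"track_{len(tracks) + 1}"
            tracks.setdefault(current, [])
            continue
        if current is None:
            # Rows that appear before any track line go into a default bucket.
            current = "untitled"
            tracks[current] = []
        tracks[current].append(line.split("\t"))
    return tracks


# Hypothetical multitrack content for illustration only.
sample = (
    'track name="Target Regions" description="hypothetical"\n'
    "chr1\t100\t400\n"
    'track name="Covered"\n'
    "chr1\t120\t380\tGENE_A\n"
    'track name="Missed Regions"\n'
    "chr1\t100\t120\n"
    "chr1\t380\t400\n"
    'track name="Amplicons"\n'
    "chr1\t110\t390\tAMP_1\t0\t+\n"
)

tracks = split_tracks(sample)
for name, rows in tracks.items():
    print(name, len(rows))
```

Each value is a list of rows split on tabs, so the coordinates remain strings until you convert them.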

Text files

The two text files for HaloPlex designs are described below. You can view these files in any text editor program (e.g., NotePad) or spreadsheet program (e.g., Excel). Any tables embedded in the text files are tab-delimited and contain column headers. Lines of text that start with a # character are comment lines.

[design ID]_Targets.txt - This file contains a list of the target identifiers that you entered when creating the design.

[design ID]_Report.txt - This file contains summary information on the design, the probes, the targets, and the parameters used to create the design.

 


Files for advanced HaloPlex and HaloPlexHS designs

For custom HaloPlex designs created with the advanced wizard, the program generates a set of download files that pertain to the whole design as well as a set of files for each probegroup in the design. The compressed folder contains a subfolder for each probegroup containing the appropriate set of files.

The files that SureDesign generates for the advanced design as a whole are the same seven files that it generates for a standard custom HaloPlex design (1 PDF report, 4 BED files, and 2 text files). These same files, minus the PDF report, are also generated for each individual probegroup in the design. See Files for custom and catalog HaloPlex and HaloPlexHS designs for descriptions.

 

 

See Also

Impact of parameters on probe selection

Design analysis using tracks

Download microarray design files