Upload baits
You can prepare a file that contains bait data and upload it to your account in eArray. You can also create a new bait group based on the baits in the uploaded file. Later, you can use the baits to create a bait library. If you like, you can use a wizard to have eArray guide you through the creation and submission of a bait library with uploaded baits — see Create library from bait upload (wizard).
This help topic contains two sections:
Preparing a file of baits for upload
To upload baits to eArray, you must first create a file of baits in a format that eArray can interpret.
eArray supports these file types for bait uploads:
Microsoft Excel files (*.xls) – Note: If you use Microsoft Excel 2007 to create the file, save the file as an Excel 97-2003 workbook. This saves the file in the required *.xls format.
Tab-delimited text files (*.tdt or *.txt) – Put data for one bait on each line. Separate all values in each line with tab characters, even if the value is blank. Use new line characters at the ends of lines.
eArray supports these file formats. The list below the name of each format shows the columns that are supported.
Minimal:
Bait ID – A unique identifier for the bait sequence, containing up to 15 characters. Bait ID cannot be blank.
Bait sequence – The base sequence of the bait, in 5' to 3' orientation. The sequence must be 120 nucleotides in length, and must only contain the capital characters A, C, G, and T. All baits in the file must have the same length. Sequence cannot be blank.
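The Minimal-format rules above can be checked before upload with a short script. The following is a minimal sketch in Python, not part of eArray; the function name check_minimal_record and the constant names are hypothetical, and the limits are taken from the column descriptions in this topic.

```python
import re

BAIT_LENGTH = 120      # bait length stated in this topic
MAX_ID_LENGTH = 15     # maximum Bait ID length stated in this topic
_SEQUENCE_RE = re.compile(r"[ACGT]+")  # capital A, C, G, T only


def check_minimal_record(bait_id: str, sequence: str) -> list[str]:
    """Return a list of problems found in one Minimal-format record.

    Hypothetical helper: eArray performs its own validation on upload;
    this only mirrors the rules described in the help text.
    """
    problems = []
    if not bait_id:
        problems.append("Bait ID is blank")
    elif len(bait_id) > MAX_ID_LENGTH:
        problems.append(f"Bait ID is longer than {MAX_ID_LENGTH} characters")
    if not sequence:
        problems.append("Bait sequence is blank")
    else:
        if len(sequence) != BAIT_LENGTH:
            problems.append(
                f"Bait sequence is {len(sequence)} nt, expected {BAIT_LENGTH}"
            )
        if not _SEQUENCE_RE.fullmatch(sequence):
            problems.append("Bait sequence contains characters other than A, C, G, T")
    return problems
```

An empty list means the record passed these checks. Uniqueness of Bait IDs across the whole file is not covered here and would need a separate pass.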
Complete:
Bait ID – A unique identifier for the bait sequence, containing up to 15 characters. Bait ID cannot be blank.
Bait sequence – The base sequence of the bait, in 5' to 3' orientation. The sequence must be 120 nucleotides in length, and must only contain the capital characters A, C, G, and T. All baits in the file must have the same length. Sequence cannot be blank.
Genomic interval – The segment of the genome associated with the bait, for example chr1:1-10000. This column can be blank.
Bait genomic location – The exact position of the bait in the genome, for example chr1:1-169. This column can be blank.
Accessions – Unique identifier(s) that refer to a nucleotide sequence that is a target for the associated bait and/or a protein sequence that is a product of the target. Accessions are represented as <source>|<ID> pairs, where <source> is the symbol of the database from which the accession was derived and <ID> is the unique identifier within that database. For example, ref|NM_015752 is a <source>|<ID> pair where ref (NCBI RefSeq) is the source and NM_015752 is the unique identifier for that source. The Accessions field can contain multiple <source>|<ID> pairs, delimited by pipe "|" characters. For example, gi|7657630|ref|NM_015752 is an allowable accession that gives both an NCBI GI number (gi) and a RefSeq identifier (ref) for the same bait sequence. Accessions can be blank.
GeneSymbols – A unique abbreviation for a gene name. GeneSymbols can be blank.
Description – A description of a phenotype, gene product, or its function. Description can be blank.
Strand – (SureSelect Target Enrichment application type only) The orientation of the bait, which can be + (for sense orientation) or – (for antisense orientation). If an orientation is not specified for a bait, eArray assumes sense orientation. Your bait file can contain both sense and antisense baits.
Note:
• Currently, eArray supports enrichment libraries with a bait length of 120 nucleotides.
• (SureSelect Target Enrichment application type only) If you want to upload antisense baits, or a combination of sense and antisense baits, use the Complete file format. This format includes the Strand column, in which you enter the sense (+) or antisense (–) orientation of each bait.
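The <source>|<ID> pair format described for the Accessions column of the Complete format can be illustrated with a small parser. This is a sketch in Python under the assumption that the field always alternates source and ID; the function name parse_accessions is hypothetical and is not an eArray API.

```python
def parse_accessions(field: str) -> list[tuple[str, str]]:
    """Split an Accessions value into (source, ID) pairs.

    Hypothetical helper: assumes the value alternates source and ID,
    separated by pipe characters, as described in this topic.
    """
    if not field:
        return []  # the Accessions column may be blank
    parts = field.split("|")
    if len(parts) % 2 != 0:
        raise ValueError(f"expected <source>|<ID> pairs, got {field!r}")
    # Even positions are sources, odd positions are the matching IDs.
    return list(zip(parts[0::2], parts[1::2]))
```

For the example value in this topic, parse_accessions("gi|7657630|ref|NM_015752") yields the pairs ("gi", "7657630") and ("ref", "NM_015752").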
In uploaded files, eArray:
Accepts columns in any order – You label columns as part of the upload process.
Accepts extra columns – When you label columns during the upload process, be sure to label any extra columns as Ignore.
Accepts, but does not interpret, column headings – Be sure to mark My uploaded file contains column headings when you label columns during the upload process.
Does not accept double or single quotation marks, angle brackets, or forward or backward slashes.
Ignores blank lines.
Expects all entries within a row to be separated by tabs, even if the actual entry is blank.
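Two of the rules above, the disallowed characters and the tab separation of blank entries, lend themselves to a quick check when a bait file is generated by script. The sketch below is in Python; the names FORBIDDEN_CHARS, find_forbidden, and to_tdt_line are hypothetical, and the character set is taken from the list above.

```python
# Characters this topic says eArray does not accept in uploaded files:
# double and single quotation marks, angle brackets, forward and backward slashes.
FORBIDDEN_CHARS = set("\"'<>/\\")


def find_forbidden(value: str) -> set[str]:
    """Return the disallowed characters that occur in one field value."""
    return FORBIDDEN_CHARS.intersection(value)


def to_tdt_line(values: list[str]) -> str:
    """Join one row's values with tabs, keeping blank values as empty fields."""
    bad = set()
    for value in values:
        bad |= find_forbidden(value)
    if bad:
        raise ValueError(f"disallowed characters in row: {sorted(bad)}")
    return "\t".join(values) + "\n"
```

Because blank values are joined like any other value, a row with an empty third column still contains two consecutive tab characters at that position, which keeps the column count constant across rows.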
Note: Upload no more than 200,000 baits at a time to eArray. Baits from very large uploaded files may not appear in your account for an extended period of time.
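If a bait set exceeds the limit in the note above, one option is to split it into several files before upload. The following Python sketch shows one way to batch rows; the name batches and the constant MAX_BAITS_PER_UPLOAD are hypothetical, and writing each batch to its own file is left to the caller.

```python
from itertools import islice
from typing import Iterable, Iterator

MAX_BAITS_PER_UPLOAD = 200_000  # limit stated in the note above


def batches(rows: Iterable, size: int = MAX_BAITS_PER_UPLOAD) -> Iterator[list]:
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Each yielded list can then be written to a separate tab-delimited file and uploaded on its own.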
Set the application type to SureSelect Target Enrichment or SureSelect RNA Enrichment.
Click the Workspace tab, or enter a collaboration.
Click Baits > Upload.
The Bait Parameter and File Details page appears.
Enter the following parameters. All are required.
Parameter | Instructions/Details
Library Category | (Read-only, SureSelect RNA Enrichment application type only) eArray supports RNA Capture libraries, which retrieve specific RNA species from a pool of RNAs.
Bait Parameter Details |
Species | Select the desired species from the list.
Remove replicate baits from upload | A replicate bait has the same Bait ID as another bait in the file. If you mark this check box, eArray uploads the first bait in each set of replicate baits in your file, and ignores the others.
Baits Precedence | These options specify what eArray does if it finds baits in your uploaded file that match (have the same Bait ID and sequence as) baits that already exist in the system. Select one of these options: Overwrite matching baits – The annotation of the matching uploaded bait replaces the annotation of the existing bait. You can use this option to re-annotate existing baits. Skip matching baits – eArray ignores the matching uploaded bait. Cancel upload if any baits already exist – eArray cancels the entire upload process if it finds a matching uploaded bait.
Length | Currently, eArray supports a bait length of 120 nucleotides.
Bait Upload File Details |
Upload Type | Select one of these options: Upload Baits only – Creates baits from the data in the uploaded file, and makes them available to you in eArray as individual baits. Create New Bait Group – Creates baits from the data in the uploaded file, and puts all of the baits into a bait group. Type a name for the bait group. For the SureSelect RNA Enrichment application type, if you select this option, you must also select the tiling frequency that applies to the baits in your uploaded file. eArray uses this value to calculate base coverage.
Upload File |
File Format | Select MINIMAL or COMPLETE. The file format defines the specific kinds of data available in the uploaded file. See Preparing a file of baits for upload, above.
File Type | Select the appropriate file type from the drop-down list. The file type defines how the data items in the file are specified and separated. eArray accepts tab-delimited text (*.tdt and *.txt) and Microsoft Excel (*.xls) files.
Tiling Frequency | (SureSelect RNA Enrichment application type only, appears if you select Create New Bait Group in Upload Type) Select the tiling frequency that applies to the baits in your uploaded file. Use the following as a guide:
Click Next.
The Define Uploaded File Columns pane appears at the bottom of the page, with a preview of the first few lines of data in your file.
eArray does not interpret any existing column heading data in a bait file. If the first row of your bait file is actually a row of column headings, mark My uploaded file contains column headings to prevent eArray from interpreting the column headings in the file as a set of bait data.
From the list below each column, select the label that best matches the data in the specific column above it.
Bait ID and Bait Sequence are required columns. If you want the upload process to ignore a specific column, select Ignore.
Use each label exactly once, except Ignore, which you can use any number of times.
Click Upload.
eArray processes the uploaded file, and creates the baits. If you selected Create New Bait Group, it also creates a bait group.
A message tells you that your file was successfully submitted to the upload queue, and that eArray will send you an e-mail when the upload process is finished.
Click Close.
The Bait Upload page appears, where you can set up another bait file upload.
You can monitor the progress of the upload in the Pending Jobs pane on the workspace home page. After eArray finishes your upload, you can search for your new baits and bait group, and use them to construct a library.
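Two parts of the procedure above can be previewed locally before a file is uploaded: the effect of Remove replicate baits from upload, and the column labeling rules. The Python sketch below is illustrative only and is not how eArray is implemented. The names drop_replicates and check_labels are hypothetical, rows are assumed to carry the Bait ID in their first position, and "use each label exactly once" is read here as "no label other than Ignore may repeat".

```python
from collections import Counter

REQUIRED_LABELS = {"Bait ID", "Bait Sequence"}  # required columns per this topic


def drop_replicates(rows: list[tuple]) -> list[tuple]:
    """Keep only the first row for each Bait ID (assumed to be row[0])."""
    seen = set()
    kept = []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            kept.append(row)
    return kept


def check_labels(labels: list[str]) -> list[str]:
    """Return problems with a proposed set of column labels."""
    # Ignore may be used any number of times, so it is not counted.
    counts = Counter(label for label in labels if label != "Ignore")
    problems = [
        f"label used more than once: {label}"
        for label, n in counts.items()
        if n > 1
    ]
    for label in sorted(REQUIRED_LABELS - set(counts)):
        problems.append(f"required label missing: {label}")
    return problems
```

Running drop_replicates on the rows of a file shows which baits would survive if the check box is marked, and an empty result from check_labels indicates that the chosen labels satisfy the rules as interpreted here.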
See also
SureSelect Target Enrichment libraries
SureSelect RNA Enrichment libraries