Upload baits


You can prepare a file that contains bait data and upload it to your account in eArray. You can also create a new bait group based on the baits in the uploaded file. Later, you can use the baits to create a bait library. If you like, you can use a wizard to have eArray guide you through the creation and submission of a bait library with uploaded baits — see Create library from bait upload (wizard).

This help topic contains two sections:

Preparing a file of baits for upload

To upload a bait file

Preparing a file of baits for upload

To upload baits to eArray, you must first create a file of baits in a format that eArray can interpret.

eArray supports these file types for bait uploads:

  • Tab-delimited text (*.tdt and *.txt)

  • Microsoft Excel (*.xls)

eArray supports these file formats. The list below the name of each format shows the columns that are supported.

MINIMAL

Bait ID – A unique identifier for the bait sequence, containing up to 15 characters. Bait ID cannot be blank.

Bait sequence – The base sequence of the bait, in 5' to 3' orientation. The sequence must be 120 nucleotides in length, and must only contain the capital characters A, C, G, and T. All baits in the file must have the same length. Sequence cannot be blank.
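As a rough illustration of the Bait ID and Bait sequence rules above, the following Python sketch checks a set of baits and writes them to a two-column, tab-delimited text file. The function names (check_bait, write_bait_file) and the column order are assumptions made for this example and are not part of eArray; you still assign the column labels yourself during the upload.

```python
import csv
import re

BAIT_LENGTH = 120      # bait length stated in this topic
MAX_ID_LENGTH = 15     # maximum Bait ID length stated in this topic
VALID_SEQUENCE = re.compile(r"[ACGT]+")  # capital A, C, G, and T only


def check_bait(bait_id, sequence):
    """Return a list of problems found in one bait (empty if none)."""
    problems = []
    if not bait_id:
        problems.append("Bait ID is blank")
    elif len(bait_id) > MAX_ID_LENGTH:
        problems.append("Bait ID is longer than 15 characters")
    if not sequence:
        problems.append("sequence is blank")
    else:
        if len(sequence) != BAIT_LENGTH:
            problems.append("sequence is not 120 nucleotides long")
        if not VALID_SEQUENCE.fullmatch(sequence):
            problems.append("sequence contains characters other than A, C, G, T")
    return problems


def write_bait_file(baits, path):
    """Write (bait_id, sequence) pairs as tab-delimited text.

    All baits are checked first; a ValueError is raised (and nothing is
    written) if any bait breaks a rule or repeats an earlier Bait ID.
    """
    baits = list(baits)
    seen_ids = set()
    for bait_id, sequence in baits:
        problems = check_bait(bait_id, sequence)
        if bait_id in seen_ids:
            problems.append("Bait ID is not unique")
        seen_ids.add(bait_id)
        if problems:
            raise ValueError(f"{bait_id!r}: {'; '.join(problems)}")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(baits)
```

For example, write_bait_file([("bait_0001", "ACGT" * 30)], "my_baits.txt") would write a one-line file with the Bait ID in the first column and the 120-nucleotide sequence in the second.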

COMPLETE

Bait ID – A unique identifier for the bait sequence, containing up to 15 characters. Bait ID cannot be blank.

Bait sequence – The base sequence of the bait, in 5' to 3' orientation. The sequence must be 120 nucleotides in length, and must only contain the capital characters A, C, G, and T. All baits in the file must have the same length. Sequence cannot be blank.

Genomic interval – The segment of the genome associated with the bait, for example chr1:1-10000. This column can be blank.

Bait genomic location – The exact position of the bait in the genome, for example chr1:1-169. This column can be blank.

Accessions – Unique identifier(s) that refer to a nucleotide sequence that is a target for the associated bait, a protein sequence that is a product of the target, or both. Accessions are written as <source>|<ID> pairs, where <source> is the symbol of the database from which the accession was derived and <ID> is the unique identifier within that database. For example, in ref|NM_015752, ref (NCBI RefSeq) is the source and NM_015752 is the identifier. The Accessions field can contain multiple <source>|<ID> pairs, delimited by pipe "|" characters. For example, gi|7657630|ref|NM_015752 is an allowable accession that gives both an NCBI GenInfo Identifier (gi) and a RefSeq identifier (ref) for the same bait sequence. Accessions can be blank.

GeneSymbols – A unique abbreviation for a gene name. GeneSymbols can be blank.

Description – A description of a phenotype, gene product, or its function. Description can be blank.

Strand – (SureSelect Target Enrichment application type only) The orientation of the bait, which can be + (for sense orientation) or - (for antisense orientation). If an orientation is not specified for a bait, eArray assumes sense orientation. Your bait file can contain both sense and antisense baits.
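The <source>|<ID> pairing used in the Accessions column can be illustrated with a short Python sketch. The function name parse_accessions is hypothetical, and the sketch assumes that a well-formed field always holds an even number of pipe-delimited items; it is not eArray's own parser.

```python
def parse_accessions(field):
    """Split an Accessions value into (source, ID) pairs.

    Assumes items alternate source, ID, source, ID, ... separated by "|".
    A blank field yields an empty list, because Accessions can be blank.
    """
    field = field.strip()
    if not field:
        return []
    items = field.split("|")
    if len(items) % 2 != 0:
        raise ValueError(f"unpaired item in accession field: {field!r}")
    # Pair each even-indexed item (source) with the item that follows it (ID).
    return list(zip(items[0::2], items[1::2]))


print(parse_accessions("gi|7657630|ref|NM_015752"))
# [('gi', '7657630'), ('ref', 'NM_015752')]
```

A check like this can catch a missing source or ID in a bait file before you upload it.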

In uploaded files, eArray:


To upload a bait file

  1. Set the application type to SureSelect Target Enrichment or SureSelect RNA Enrichment.

  2. Click the Workspace tab, or enter a collaboration.

  3. Click Baits > Upload.

    The Bait Parameter and File Details page appears.

  4. Enter the following parameters. All are required.

Parameter

Instructions/Details

Library Category

(Read-only, SureSelect RNA Enrichment application type only) eArray supports RNA Capture libraries, which retrieve specific RNA species from a pool of RNAs.

Bait Parameter Details

Species

Select the desired species from the list.

Remove replicate baits from upload

A replicate bait has the same Bait ID as another bait in the file. If you mark this check box, eArray uploads the first bait in each set of replicate baits in your file, and ignores the others.

  • Note: If your bait file contains replicate baits, and you do not mark Remove replicate baits from upload, eArray displays an error message after you begin the upload, and does not upload your file.

Baits Precedence


These options specify what eArray does if it finds baits in your uploaded file that match (have the same Bait ID and sequence as) baits that already exist in the system. Select one of these options:

Overwrite matching baits – The annotation of the matching uploaded bait replaces the annotation of the existing bait. You can use this option to re-annotate existing baits.

Skip matching baits – eArray ignores the matching uploaded bait.

Cancel upload if any baits already exist – eArray cancels the entire upload process if it finds a matching uploaded bait.

Length

Currently, eArray supports a bait length of 120 nucleotides.

Bait Upload File Details

Upload Type

Select one of these options:

Upload Baits only – Creates baits from the data in the uploaded file, and makes them available to you in eArray as individual baits.

Create New Bait Group – Creates baits from the data in the uploaded file, and puts all of the baits into a bait group. Type a name for the bait group. For the SureSelect RNA Enrichment application type, if you select this option, you must also select the tiling frequency that applies to the baits in your uploaded file. eArray uses this value to calculate base coverage.

Upload File

  1. Click Browse.

  2. Select the desired file for upload, then click Open.
    The location of the file appears in Upload File.

File Format

Select MINIMAL or COMPLETE. The file format defines the specific kinds of data available in the uploaded file. See Preparing a file of baits for upload, above.

File Type

Select the appropriate file type from the drop-down list. The file type defines how the data items in the file are specified and separated. eArray accepts tab-delimited text (*.tdt and *.txt) and Microsoft Excel (*.xls) files.

  • Note: If you use Microsoft Excel 2007 to create the bait file, save the file as an Excel 97-2003 workbook. This saves the file in the required *.xls format.

Tiling Frequency

(SureSelect RNA Enrichment application type only, appears if you select Create New Bait Group in Upload Type) Select the tiling frequency that applies to the baits in your uploaded file. Use the following as a guide:

  • 1x – Baits are tiled end-to-end over each target in a single layer.

  • 2x – Baits overlap each other by 50% so that two baits cover each base in each target.

  • 3x – Baits overlap so that three baits cover each base in each target.

  • 4x – Baits overlap so that four baits cover each base in each target.
    ...

  • 10x – Baits overlap so that ten baits cover each base in each target.

  5. Click Next.

    The Define Uploaded File Columns pane appears at the bottom of the page, with a preview of the first few lines of data in your file.

  6. eArray does not recognize column headings in a bait file automatically. If the first row of your bait file contains column headings, mark My uploaded file contains column headings so that eArray does not treat that row as bait data.

  7. From the list below each column, select the label that best matches the data in the specific column above it.

    Bait ID and Bait Sequence are required columns. If you want the upload process to ignore a specific column, select Ignore. Use each label exactly once, except Ignore, which you can use any number of times.

  8. Click Upload.

    eArray processes the uploaded file, and creates the baits. If you selected Create New Bait Group, it also creates a bait group.

    A message tells you that your file was successfully submitted to the upload queue, and that eArray will send you an e-mail when the upload process is finished.

  9. Click Close.

    The Bait Upload page appears, where you can set up another bait file upload.

    You can monitor the progress of the upload in the Pending Jobs pane on the workspace home page. After eArray finishes your upload, you can search for your new baits and bait group, and use them to construct a library.
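The tiling frequencies described under Tiling Frequency can be made concrete with a small calculation. The Python sketch below assumes 120-nucleotide baits, 1-based coordinates, and evenly spaced bait starts (bait i starts i × 120 / frequency bases after the first, rounded down). The function names tile_starts and coverage are hypothetical, and the baits in your own file may be placed differently.

```python
BAIT_LENGTH = 120  # bait length stated in this topic


def tile_starts(target_start, target_end, frequency):
    """Return 1-based start positions of baits tiled over a target.

    Assumption: starts are spaced BAIT_LENGTH / frequency apart (rounded
    down), and only baits that fit entirely inside the target are kept.
    """
    if frequency < 1:
        raise ValueError("frequency must be 1 or greater")
    last_start = target_end - BAIT_LENGTH + 1
    starts = []
    i = 0
    while True:
        start = target_start + (i * BAIT_LENGTH) // frequency
        if start > last_start:
            break
        starts.append(start)
        i += 1
    return starts


def coverage(position, starts):
    """Count how many baits cover a given base position."""
    return sum(start <= position < start + BAIT_LENGTH for start in starts)


print(tile_starts(1, 480, 1))  # [1, 121, 241, 361]
print(tile_starts(1, 480, 2))  # [1, 61, 121, 181, 241, 301, 361]
print(coverage(200, tile_starts(1, 480, 2)))  # 2
```

In this sketch, bases near the ends of a target are covered by fewer baits than bases in the interior, because no bait extends past the target boundary.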

See also

Baits

SureSelect Target Enrichment libraries

SureSelect RNA Enrichment libraries