Searching for probes

The basic elements of a microarray are probes, and eArray provides many ways for you to define the probes that you want. One of the main ways is to search for them. In eArray, you can retrieve probes of interest with a variety of search methods. In addition, a probe search provides an easy way to construct a probe group based upon your own criteria. You then use one or more probe groups to create a microarray design. See Why create probe groups.

You can customize eArray to pre-set certain search parameters. You can also customize the content and column order of search results. See Set user preferences.

This help topic contains the following sections:

eArray probe search tools

Choosing a search tool

Additional guidance from Agilent

eArray probe search tools

You can search the probes in the eArray probe database in several ways:

Type of Search | What Is | How To | Example
Search Your Probes | What is | How to | Examples
GO Search (Expression application type only) | What is | How to | Example
HD Probe Search (CGH and ChIP application types only) | What is | How to | Example
SNP Probe Search (CGH application type only) | What is | How to | Example
Exon Probe Search (Expression application type only) | What is | How to | Example

Choosing a search tool

For probes that are already designed, the search methodology that you use to find probes depends upon your level of familiarity with the target sequences, and the specific kind of study that you want to do. For example, if you want to do a differential gene expression study, but you do not know specific transcript or gene identifiers, you can use GO Search to return probes for genes based upon biological process or molecular function.

You could also use Search Your Probes to take a similar, but less structured approach. When you enter one or more keywords as search criteria, you can search probe annotation, which often describes a target gene's function. A search can return all probes whose annotation contains the word or phrase you typed. If you know the specific targets for which you want probes, you can also use Search Your Probes to advantage. You can upload lists of GenBank IDs or gene symbols, and return all probes that match the criteria. This allows you to be very specific about your search criteria.

Each probe search tool has distinct advantages, as described in the table below.

Search tool

Comments

Search Your Probes

This tool gives you many options to retrieve the probes that you want. You can type a single search term, and the search can return all probes that contain the term anywhere in their annotation. This is a good search methodology to use when you first explore the content of the database and want to see what types of probes exist in the system. For example, you can type the term kinase, and the search returns all probes that have this word in their annotation, including probes with annotation such as “protein kinase C, delta” and “hexokinase 3”.

You can also select a specific type of annotation or accession, and enter one or many search terms of that type. This is most useful when you want to use identifiers such as probe IDs, GenBank IDs, or gene symbols. You can simply upload a list, and the search returns all probes that match.

GO Search

A Gene Ontology (GO) Search is a good way to identify probes for genes and gene products that are associated with biological processes, molecular functions, and/or cellular components that you want to investigate. You enter a standard GO term, and the search retrieves all of the probes that are associated with the term. eArray can also help you find relevant GO terms to use in the search. This type of search is available for standard Expression probes.

HD Probe Search

You can search the HD CGH and ChIP databases to return probes within specified genomic regions, at a very high density. You can then use these probes to create a microarray that has higher resolution than is seen for catalog array offerings for these applications. To do an HD search, you define the desired genomic regions, and the density or number of probes that you want.

SNP Probe Search

This specialized search is the only way to retrieve Agilent SNP probes, which are probes that are designed specifically for Agilent CGH+SNP microarrays. You can use CGH+SNP microarrays to deduce the genotypes of SNP sites, calculate allele-specific SNP copy numbers, and find regions of loss of heterozygosity (LOH). This search is available in the CGH application type. For more information, see CGH+SNP Microarrays.

Exon Probe Search

This specialized search is the only way to retrieve Agilent Exon probes, which are probes that are designed specifically for Gene Expression Exon microarrays. You can use these microarrays to study differential splicing between tissues, between disease and non-disease states (e.g., cancerous vs. non-cancerous), and between different forms of the same disease. This search is available in the Expression application type. For more information, see Exon microarrays.

Additional guidance from Agilent

How do I create a complex CGH search that yields probes at different resolutions for different genomic regions and also with different filtering criteria for different regions?

This needs to be done through separate, iterative searches. Each search creates a different Probe Group, and you then design the microarray by combining the Probe Groups. Once all of the Probe Groups are created, an array calculator on the Microarray tab helps you determine which array format makes the most sense for the given number of probes. Utilities available under the Probe Group page let you compare Probe Groups and remove duplicates across different Probe Groups.
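The bookkeeping behind this workflow can be sketched outside eArray. The following Python example is only an illustration, not eArray's own utility: it assumes each Probe Group has been exported as a plain list of probe IDs, and the function name and the probe IDs shown are hypothetical. It combines several groups, drops duplicates, and reports the probe count that you would then compare against the available array formats.

```python
def combine_probe_groups(*groups):
    """Merge probe ID lists, keeping the first occurrence of each ID.

    Hypothetical helper: eArray performs the equivalent step with its own
    Probe Group utilities; this only illustrates the idea.
    """
    seen = set()
    combined = []
    for group in groups:
        for probe_id in group:
            if probe_id not in seen:
                seen.add(probe_id)
                combined.append(probe_id)
    return combined


# Made-up probe IDs standing in for the results of two separate searches.
high_density_region = ["P1", "P2", "P3"]
backbone_region = ["P3", "P4"]

combined = combine_probe_groups(high_density_region, backbone_region)
print(combined)       # ['P1', 'P2', 'P3', 'P4']
print(len(combined))  # 4
```

Keeping the first occurrence preserves the order of the earlier search, which may matter if one Probe Group should take priority over another.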

For the CGH application, what are the advantages and disadvantages of using the ‘Genomic Tiling’ function (under the “Probe” tab) compared to using Agilent HD-CGH Database Probes?

Overall, probes generated using the ‘Genomic Tiling’ function will perform more poorly than probes found in the Agilent HD-CGH Probe Database. Agilent very strongly recommends using the Agilent HD-CGH Probe Database and not the ‘Genomic Tiling’ option. Only for regions where there are not enough HD probes available in the database should ‘Genomic Tiling’ be considered.

All HD probes in the database (except for probes in regions in which no optimal Tm probes exist) have been Tm matched and have a predicted performance score (based on Tm, GC content, hairpin ΔG, sequence complexity, and metrics that measure homology with the rest of the reference genome). The eArray pair-wise reduction algorithm picks the best HD probes based on the user-selected average HD probe spacing per interval or the total number of HD probes.

Additionally, the HD probes have passed a Tm filter during design and are annotated so that you can choose between different similarity filtering options (non-unique probe filter, perfect match filter, or similarity score filter). If catalog probes are present in the search results, they can be preferentially selected. In contrast, eArray’s ‘Genomic Tiling’ feature picks probes at a fixed spacing, with no opportunity to Tm-balance the probes or to select them by predicted performance. Probes created this way should perform no better than those picked at random. The only options to improve probe performance are probe trimming and skipping of repeat-masked regions.

See the figure below for an example of CGH data from Agilent HD-CGH probes compared to ‘Genomic Tiling’ probes. The median log2 ratios of the HD-CGH probes are closer to the expected value of -1 with a smaller spread when compared to the ‘Genomic Tiling’ probes.

When designing CGH microarrays, how can I avoid GC-rich, high-Tm or repeat regions?

In the CGH HD Probe Search, which ‘Similarity Filter’ will work for my design? What are the consequences of using or not using the filters?

When ‘No Filter’ is selected, any probe may be selected, regardless of similarity to other genomic sites. Keep in mind that data from “non-unique” probes will be harder to interpret, and it can be beneficial to limit the maximum number of perfect genomic hits by using the Non-Unique Probe Filter option.

When the Perfect Match Filter is selected (or when the maximum number of perfect genomic hits is set to 1 in the Non-Unique Probe Filter), probes with more than one perfect match to the genome are excluded. As a result, it will not be possible to find probes in Segmental Duplication or Pseudo-Autosomal Regions (PAR).

The Similarity Score Filter is the most stringent filter. This filter excludes probes with significant similarity to other sites in the genome, so there will be genomic regions where no probes can be found.
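The difference between the first three options can be sketched in a few lines of Python. This is a simplified model, not eArray's implementation: it assumes each probe carries a count of perfect genomic matches, and the `Probe` class, the `perfect_hits` field, and the mode names are all hypothetical. The Similarity Score Filter is left out because this topic does not define how the score is calculated.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    probe_id: str      # hypothetical identifier
    perfect_hits: int  # assumed: number of perfect matches in the genome


def apply_filter(probes, mode="none", max_perfect_hits=1):
    """Return the probes that pass the chosen filter (illustrative only)."""
    if mode == "none":
        # No Filter: every probe is eligible.
        return list(probes)
    if mode == "perfect_match":
        # Perfect Match Filter behaves like a non-unique limit of 1.
        max_perfect_hits = 1
    elif mode != "non_unique":
        raise ValueError(f"unknown filter mode: {mode}")
    return [p for p in probes if p.perfect_hits <= max_perfect_hits]


probes = [Probe("A", 1), Probe("B", 2), Probe("C", 5)]

no_filter = [p.probe_id for p in apply_filter(probes)]
non_unique = [p.probe_id for p in apply_filter(probes, "non_unique", 2)]
perfect = [p.probe_id for p in apply_filter(probes, "perfect_match")]

print(no_filter)   # ['A', 'B', 'C']
print(non_unique)  # ['A', 'B']
print(perfect)     # ['A']
```

In this model, a probe in a segmental duplication would have at least two perfect hits, which is why the Perfect Match Filter returns no probes for such regions.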

In the CGH HD Probe Search, how do I target exons only, not just genes?

In the Table Browser utility of the UCSC genome browser, make sure the proper species and genome build are selected from the drop-down lists. It is important that the build matches the one used in eArray. For “track”, select the desired gene definition track; Agilent recommends “CCDS”, “RefSeq Genes”, or “UCSC genes”. Follow the UCSC instructions on how to restrict the search. Options include filtering on genomic regions or track identifiers (names/accessions). Make sure the output contains chrom, exonStarts and exonEnds. This will give you all of the exon coordinates for the regions you defined.

To input these locations into an eArray HD Probe Search, split the line into pieces (one for each exon) and add 1 to each start coordinate (the UCSC output uses 0-based start coordinates, whereas eArray expects 1-based coordinates). The end coordinates stay unchanged. The orientation (i.e., strand information) does not matter in an eArray probe search. For example, the following line defining three exons (chrom; exon count; exon starts; exon ends):

chr1; 3; 2450045, 2451144, 2451409; 2451048, 2451310, 2451544

can be converted to the format needed in eArray:

chr1:2450046-2451048
chr1:2451145-2451310
chr1:2451410-2451544
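For longer exon lists, the conversion above can be scripted. The following Python sketch is one possible approach, not an Agilent tool: it assumes the semicolon-separated layout shown in the example line (chrom; exon count; exon starts; exon ends), and the function name is hypothetical. It also tolerates the trailing comma that UCSC often places at the end of the exonStarts and exonEnds fields.

```python
def ucsc_exons_to_earray(line):
    """Convert 'chrom; exonCount; exonStarts; exonEnds' into a list of
    'chrom:start-end' intervals with 1-based start coordinates."""
    chrom, count, starts, ends = [field.strip() for field in line.split(";")]
    # Skip empty items so a trailing comma does not cause an error.
    exon_starts = [int(s) for s in starts.split(",") if s.strip()]
    exon_ends = [int(e) for e in ends.split(",") if e.strip()]
    if not (len(exon_starts) == len(exon_ends) == int(count)):
        raise ValueError("exon count does not match the start/end lists")
    # Starts are 0-based in the UCSC output, so add 1; ends stay as they are.
    return [f"{chrom}:{s + 1}-{e}" for s, e in zip(exon_starts, exon_ends)]


line = "chr1; 3; 2450045, 2451144, 2451409; 2451048, 2451310, 2451544"
intervals = ucsc_exons_to_earray(line)
print("\n".join(intervals))
# chr1:2450046-2451048
# chr1:2451145-2451310
# chr1:2451410-2451544
```

If your Table Browser output is tab-separated or has additional columns, adjust the field splitting accordingly before you use the result in an HD Probe Search.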

See also

Working with probes in eArray