Custom Microarray Design Guidelines
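Agilent provides the following application-specific guidelines to help you create optimal microarray designs:
For more guidelines, see the eArray Frequently Asked Questions (FAQs).
Agilent recommends that you design and include the following types of probe sets when generating custom Gene Expression microarrays: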
Agilent negative control probes for optimal background subtraction with the Agilent Feature Extraction software. Agilent negative control probes are included in Agilent’s QC grid. If you want to use customized negative control probes, we recommend that you designate them as non-control probes for the purposes of Agilent’s Feature Extraction software.
Replicated non-control probes for use in the Multiplicative Detrending step of the Agilent Feature Extraction software. The Multiplicative Detrending step detects and corrects for trends in array uniformity and uses replicated non-control probes as a default. A minimum of 15 probes should be replicated 5-10 times on each microarray design. If replicated probes are not used, the default settings should be adjusted in the Feature Extraction protocol.
Probe set representing non-differentially expressed genes for use in accurate normalization of microarray experiments, where typical normalization assumptions about differential expression are not met due to a relatively low probe count or strong bias in differential expression. These probes should span the full range of signal intensity for optimal normalization. Agilent recommends that a minimum of 1% of the non-control probes be part of this list for each custom design. If prior knowledge of non-differentially expressed genes is unavailable, we recommend that these probes be selected to randomly cover the dynamic range of the experiment. For these custom Gene Expression designs, microarray data should be normalized using data from these control probes. For custom “whole genome” type arrays, inclusion of a normalization gene list is generally not necessary.
The process for generation and use of a DyeNorm Gene List for two-color microarray data analysis is described in the Feature Extraction Software User Guide. The non-differentially expressed gene list can also be used for one-color microarray data normalization in downstream applications such as GeneSpring GX, as described in the GeneSpring GX Software User Manual.
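The two numeric recommendations above (at least 15 probes replicated at least 5 times, and at least 1% of non-control probes in the normalization list) can be checked before a design is submitted. The following is a minimal sketch, not part of eArray: the function name, the input layout (one probe ID per feature), and the choice to count the 1% over distinct probes rather than over features are all assumptions made for illustration.

```python
from collections import Counter


def check_expression_design(features, normalization_ids,
                            min_replicated_probes=15, min_copies=5,
                            min_normalization_fraction=0.01):
    """Check a custom Gene Expression design against the guidance above.

    features: list of non-control probe IDs, one entry per feature on the
        array (hypothetical input layout; a probe that is replicated five
        times appears five times in the list).
    normalization_ids: IDs of probes for non-differentially expressed genes.
    """
    copies = Counter(features)
    distinct = set(copies)

    # Probes with enough copies to serve Multiplicative Detrending.
    replicated = [probe for probe, n in copies.items() if n >= min_copies]

    # Assumption: the 1% recommendation is measured over distinct probes.
    on_array = distinct & set(normalization_ids)
    fraction = len(on_array) / len(distinct) if distinct else 0.0

    return {
        "replicated_probe_count": len(replicated),
        "replication_ok": len(replicated) >= min_replicated_probes,
        "normalization_fraction": fraction,
        "normalization_ok": fraction >= min_normalization_fraction,
    }


if __name__ == "__main__":
    # Toy design: 20 probes replicated 5 times plus 1,980 single-copy probes.
    replicated = [f"rep_{i}" for i in range(20) for _ in range(5)]
    singles = [f"probe_{i}" for i in range(1980)]
    report = check_expression_design(replicated + singles, singles[:40])
    print(report)
```

If a design fails the replication check, the guidance above still allows it, provided the default Multiplicative Detrending settings are adjusted in the Feature Extraction protocol.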
For more guidelines on Gene Expression microarrays, see the eArray Frequently Asked Questions (FAQs).
Agilent’s High-Definition Comparative Genomic Hybridization (HD-CGH) database provides users the flexibility to create custom microarray designs for analysis of genome regions of interest to them at the resolution of their choosing. Agilent recommends including the following types of probe sets when generating custom CGH microarray designs:
Normalization control probes that represent non-aberrant/non-variant regions for the purpose of accurate normalization of the data using Agilent’s Feature Extraction software. Agilent provides a control probe group which can be used or the user can select a minimum of 1000 probes. The normalization probe group needs to be specified in Agilent’s Feature Extraction software for the purpose of proper normalization.
Replicate probes for determining the Reproducibility QC metric in Agilent’s Feature Extraction and DNA Analytics software. The Reproducibility metric is calculated as the Median % CV of background-subtracted signal for these replicate probes after outlier rejection. If you choose not to use this probe group, it is important to include a set of replicate probes for calculating reproducibility. Agilent recommends a minimum of 1,000 probes replicated five times.
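To make the Reproducibility metric concrete, the sketch below computes a median % CV over replicate probes. It is an illustration under stated assumptions, not the Feature Extraction implementation: it assumes outliers have already been rejected upstream (the rejection method is not described here), it uses the sample standard deviation, and the function name and input layout are hypothetical.

```python
from statistics import mean, median, stdev


def median_percent_cv(replicate_signals):
    """Median % CV across replicate probe groups.

    replicate_signals: dict mapping a probe ID to the list of
        background-subtracted signals of its replicate features,
        with outliers already removed (assumption).
    """
    cvs = []
    for signals in replicate_signals.values():
        if len(signals) < 2:
            continue  # a CV needs at least two replicates
        m = mean(signals)
        if m <= 0:
            continue  # % CV is not meaningful for non-positive means
        cvs.append(100.0 * stdev(signals) / m)
    if not cvs:
        raise ValueError("no probe group with usable replicates")
    return median(cvs)


if __name__ == "__main__":
    example = {
        "probe_a": [100, 100, 100],
        "probe_b": [90, 100, 110],
        "probe_c": [80, 100, 120],
    }
    print(median_percent_cv(example))
```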
For more guidelines on custom CGH microarrays, see the eArray Frequently Asked Questions (FAQs).
Agilent recommends that you take the following guidelines into consideration when designing your array-based DNA capture microarrays:
Use as tight a probe spacing as your application allows. Agilent recommends a probe spacing of 3-bp (between probe starts). This is because Agilent has done the most extensive testing on 3-bp probe-spacing, which enables targeting of 700kb-800kb of the genome, depending on the specific number of independent regions that you target. Agilent has also obtained successful capture results through limited testing of larger probe spacing (including 15 bp and 20 bp), but has observed lower average read depths. In these limited tests, the drop in read-depth has been a little better than expected (only 75% of the expected drop, on average). Depending on both your application and on your sequencing throughput, the lower read-depth may or may not be adequate for your needs.
Pad (extend) your intervals by 100bp-200bp on either side. Although large numbers of reads are typically observed at interval endpoints, Agilent has observed that optimal capture depths are achieved 100 bp – 200 bp inside the interval boundaries. We therefore recommend that you extend your intervals by 100bp-200bp, depending on your needs. For example, if you are targeting exons, we recommend not to use exon endpoints directly, but rather to use a set of coordinates that start 100bp before each exon’s start and 100bp after each exon’s end. If you use eArray’s Genomic Tiling utility to select your probes, you may use it to extend your intervals.
Be aware of duplicated and "repeat" regions. Agilent does not check that all probes target unique regions of the genome. The only exceptions are probes that contain known repeat regions (as identified by Repeat Masker), which are omitted from the design if you use eArray’s Genomic Tiling utility to design your probes. Note that probes that target duplicate regions may produce spurious results or they may produce reads that are too ambiguous to map by your sequencing software. In practice, we observe that such probes are relatively rare. To an extent, this issue is a natural consequence of the duplication that occurs in complex genomes.
For more information about design considerations for array-based DNA capture, please consult Agilent’s application note Complementing Next Generation Sequencing Technologies: Capture and Release Assay Using Agilent DNA Microarrays (#5989-8700EN).
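The padding and probe-spacing guidance above can be sketched in a few lines. This example is an illustration only and does not reproduce eArray’s Genomic Tiling utility. It assumes half-open (start, end) coordinates on a single chromosome, it merges padded intervals that overlap (a choice made here, not one stated in the guidance), and the probe length is a caller-supplied parameter because no capture probe length is given above.

```python
def pad_and_merge(intervals, pad=100):
    """Extend each interval by `pad` bp on both sides, then merge overlaps.

    intervals: list of (start, end) tuples on one chromosome,
        half-open coordinates (assumption).
    """
    padded = sorted((max(0, start - pad), end + pad) for start, end in intervals)
    merged = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def probe_starts(interval, probe_length, spacing=3):
    """Start positions of probes tiled every `spacing` bp inside an interval.

    probe_length is hypothetical input; probes are kept fully inside
    the interval.
    """
    start, end = interval
    return list(range(start, end - probe_length + 1, spacing))


if __name__ == "__main__":
    exons = [(1000, 1200), (1350, 1500)]      # toy exon coordinates
    targets = pad_and_merge(exons, pad=100)   # pad by 100 bp per side
    for target in targets:
        starts = probe_starts(target, probe_length=60, spacing=3)
        print(target, len(starts), "probes")
```

Counting the probes per padded target in this way gives a quick estimate of how much of the array a set of regions will consume at a given spacing.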
Agilent’s microRNA microarray solution provides a robust and sensitive method for detection of microRNAs from total RNA. Agilent has introduced Human, Mouse and Rat catalog arrays, which have been designed and empirically tested to provide sensitive and specific measurements of all microRNAs from the Sanger miRBase database for these species.
eArray enables the design of custom microRNA arrays: researchers may design microarrays measuring the microRNAs of their choosing from the Sanger miRBase database. The design principles used in the design of our catalog arrays, outlined below, are also applied for these custom arrays. This approach reduces uncertainty around the design of custom arrays while continuing to provide the most sensitive and robust assay to meet the needs of researchers. eArray allows the flexibility for researchers to study the microRNAs of their choice on the 8x15K format.
Agilent microRNA array design principles
Before designing a custom microRNA array, it is useful to understand some of the underlying principles of the Agilent microRNA platform:
Probe design and labeling methods that are linked. The mature microRNAs are labeled via the ligation of a Cy3-conjugated pCp molecule to the 3’ end of the microRNA. This labeling reaction introduces an additional “C” base to the 3’ end of all of the labeled RNA molecules. During probe design, we take advantage of this “C” base by adding an additional “G” to the 5’ end of the active probe sequence. The addition of this G:C base pair to the probe:microRNA interaction helps stabilize the interaction, and provides some additional selectivity for labeled mature microRNAs.
Multiple probes and probe replicates for each microRNA. Each microRNA represented on an Agilent microRNA array is measured by multiple probes. In addition, each probe sequence is replicated multiple times. This replication allows for both improved robustness, as outlier features are removed during data summarization in Feature Extraction, and improved sensitivity, as the presence of the probe replicates helps drive the hybridization reaction towards equilibrium.
Robust data summarization. The data summarization procedures used in the Agilent Feature Extraction software allow for the summarization of the multiple probes and probe replicates into a robust measurement for each microRNA. This measurement, the “TotalGeneSignal,” is found in both of the Feature Extraction output files: the full text and “GeneView” files. Details as to how the data summarization is completed can be found in the Feature Extraction Reference Guide.
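The link between labeling and probe design described above can be illustrated with a short sketch. It shows only the base-pairing logic: the labeled microRNA carries an extra 3’ C, so a fully complementary probe carries an extra 5’ G. It is not Agilent’s probe design algorithm, which may include further elements not described here, and the function names are hypothetical.

```python
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def reverse_complement(rna):
    """DNA reverse complement of an RNA sequence written 5' to 3'."""
    rna = rna.upper()
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected bases: {sorted(unexpected)}")
    return rna.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


def active_probe_sequence(mature_mirna):
    """Complement of the mature microRNA with an extra 5' G.

    The extra G pairs with the C that pCp ligation adds to the 3' end
    of the labeled microRNA.
    """
    return "G" + reverse_complement(mature_mirna)


if __name__ == "__main__":
    toy_mirna = "UGAGGUAG"          # toy sequence, not a full microRNA
    labeled = toy_mirna + "C"       # labeling adds a 3' C
    probe = active_probe_sequence(toy_mirna)
    print(probe)
    # The probe is exactly the reverse complement of the labeled molecule.
    print(probe == reverse_complement(labeled))
```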
Guidance for design of microRNA custom microarrays
Agilent has predesigned probes to all the mature microRNAs for all species* in the most recent Sanger miRBase release. Both the identification of the appropriate probe sequences and the methods implemented for custom microRNA microarray design are intended to provide a robust, sensitive, and specific measurement of your microRNAs of interest. All designs are based on the 8x15K format: in any microRNA design, customers have the ability to create “sets” to accommodate designs requiring more than 15,000 features.
[*NOTE: Although Agilent has designed probes for all species in the Sanger miRBase database, users should be aware that the Agilent labeling protocol requires an accessible hydroxyl group on the 3’ end of the microRNA for ligation of a dye-conjugated pCp molecule. microRNAs from some species (mostly plants) have a 3’ modification which may interfere with the Agilent labeling method. Researchers interested in studying 3’ modified microRNAs likely will need to use alternative labeling methods and should plan to conduct experiments to optimize the assay conditions.]
General Design Guidance
Probe search: Searching for probes in eArray is based on the premise that each microRNA is represented by multiple probes. Probes are therefore returned in groups, based on the microRNA to which they are designed. Searches for probes can be performed against multiple miRBase builds, but probes for a microRNA present in the latest build will be returned only when including that build in the search. See below for more information on multiple miRBase build designs.
Microarray Formats: At the present time, only the 8x15K format is enabled in eArray, because microRNA probe and assay design, including probe replicates, RNA input requirements, and labeling reaction conditions, are based on that format. For users wishing to measure more microRNAs than can be accommodated on this format, a “Microarray Set” can be created. More information on creating microarray sets can be found below.
Microarray Layout: Selected probes will be laid out based on the design principles outlined above. Users have the option of representing each microRNA by 16 or 20 features. Twenty features will result in slightly more robust data, while 16 features will allow for the inclusion of more microRNAs per array. All probes will be randomly distributed on the array, which results in probe replicates being spread across the array, yielding the most robust downstream data. Any space not used for the selected probes will be filled by a structural control and considered “blank.” These “blank” features will be ignored in the downstream analysis.
Guidance for specific cases
Creating an array from multiple miRBase build designs
The eArray probe database contains probes designed to the current Sanger miRBase release. It also contains probes to microRNAs that were present in selected former database builds that may have been removed or changed in the intervening periods. For the earliest Sanger miRBase builds, eArray only contains probes for selected species (9.1, human only; 10.1 and 11.0, human, mouse and rat).
It is possible to design an array containing probes to microRNAs from multiple database builds. The systematic ID given to the “old-build specific” microRNAs will be appended with the database version (e.g., hsa-miR-139_v9.1). Note that not all microRNA sequence changes will result in new probes being designed (e.g., addition of one base to the 5’ end). If the probe sequences did not change, the microRNA is considered unchanged.
Because of the data processing steps in Feature Extraction, each microRNA sequence being measured must have a unique systematic name to enable proper TotalGeneSignal calculation. Because some microRNAs may be present in multiple database builds with the same name but different sequences, the names from previous builds must be altered.
microRNA sequences from previous builds that are unchanged in the current build are searchable in the most recent build only. microRNAs with name changes between builds are searchable in both builds, but the primary annotation is based on the newest build.
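The naming rule for multi-build designs can be sketched as follows. This is a simplified illustration, not eArray’s logic: it treats a microRNA as changed whenever its sequence differs from every current-build sequence, whereas the text above keys the decision on whether the probe sequences changed. The function name, the input layout, and the toy sequences are hypothetical.

```python
def systematic_names(current_build, older_builds):
    """Assign a unique systematic name to each distinct microRNA sequence.

    current_build: dict mapping microRNA name -> sequence (newest build).
    older_builds: dict mapping build version (e.g. "9.1") to a
        dict of microRNA name -> sequence for that build.
    """
    names = dict(current_build)          # current names are kept as-is
    seen = set(current_build.values())
    for version, mirnas in older_builds.items():
        for name, sequence in mirnas.items():
            if sequence in seen:
                # Unchanged (or only renamed): annotated from the newest build.
                continue
            seen.add(sequence)
            # Old-build specific entry: append the build version to the ID.
            names[f"{name}_v{version}"] = sequence
    return names


if __name__ == "__main__":
    current = {"hsa-miR-139": "AAAACCCC", "hsa-miR-1": "GGGGUUUU"}   # toy sequences
    older = {"9.1": {"hsa-miR-139": "AAAACCCU", "hsa-miR-1": "GGGGUUUU"}}
    for name, sequence in systematic_names(current, older).items():
        print(name, sequence)
```

In this toy input the older hsa-miR-139 sequence differs from the current one, so it is kept under a version-suffixed name, while the unchanged hsa-miR-1 appears only once.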
Creating multi-species arrays – It is possible to design a multi-species array using eArray. Due to the similarity of certain cross-species microRNAs and the design of the microRNA assay, the layout of such designs needs to be done very carefully. This system is designed to allow you as much flexibility as possible to design multi-species arrays while maintaining the Agilent microRNA microarray design principles.
Primary species must be identified. When microRNAs are selected for a given design from multiple species and those microRNAs are identical, probes to only one of those microRNAs will be incorporated in the design. A species priority order will be used to determine which probes will be incorporated in the design, with the primary species getting the first priority. Users must choose the primary species when creating a microarray. The remaining species are prioritized according to Agilent’s predefined species priority order.
Probes for all microRNAs of the first priority species will be incorporated in the design.
Probes for all microRNAs for the 2nd priority species that are not already measured by the existing probes are added to the design.
Probes for the microRNAs for the remaining species are added to the design according to the species priority order.
Annotation considerations. Probes will be annotated on the array in line with the aforementioned priority order.
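The priority-based selection described above can be sketched as a de-duplication pass over the selected species. This is an illustration under assumptions, not eArray’s implementation: “identical” is taken to mean identical mature sequences, species missing from the priority list are placed last, and the function name, input layout, and toy sequences are hypothetical.

```python
def select_multispecies(selected, primary, priority_order):
    """Pick one microRNA per distinct sequence, following species priority.

    selected: dict mapping species name -> dict of microRNA name -> sequence.
    primary: the user-chosen primary species (gets first priority).
    priority_order: remaining species in predefined priority order.
    """
    ranked = [s for s in priority_order if s != primary and s in selected]
    unranked = [s for s in selected if s != primary and s not in priority_order]
    order = [primary] + ranked + unranked   # assumption: unranked species go last

    chosen = {}
    seen = set()
    for species in order:
        for name, sequence in selected.get(species, {}).items():
            if sequence in seen:
                continue  # already measured by a higher-priority species
            seen.add(sequence)
            chosen[name] = sequence
    return chosen


if __name__ == "__main__":
    priority = ["Homo sapiens", "Mus musculus", "Rattus norvegicus"]
    picks = {
        "Homo sapiens": {"hsa-toy-1": "UGAGGUAG"},                      # toy data
        "Mus musculus": {"mmu-toy-1": "UGAGGUAG", "mmu-toy-2": "CCCCAAAA"},
    }
    print(select_multispecies(picks, "Mus musculus", priority))
    print(select_multispecies(picks, "Homo sapiens", priority))
```

With mouse as the primary species, the shared sequence is annotated with the mouse name; with human as primary, it carries the human name and only the mouse-specific microRNA is added from mouse.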
Creating a “Microarray Set.” – For users who wish to measure more microRNAs than can be accommodated on one design, eArray can generate a microRNA “Microarray Set.” In this instance, probes for the different microRNAs selected will be incorporated across the number of arrays needed to include all the selected microRNAs. Microarray designs that are part of sets cannot be split into individual designs within eArray.
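A rough estimate of how many arrays a set will need follows from the number of features per microRNA (16 or 20) and the array capacity. The sketch below uses the nominal 15,000 features of the 8x15K format mentioned earlier; the number of features reserved for controls is not stated here, so `reserved_features` is a hypothetical parameter that defaults to zero, and the true usable capacity will be smaller.

```python
import math


def arrays_needed(n_mirnas, features_per_mirna,
                  features_per_array=15000, reserved_features=0):
    """Estimate the number of arrays in a microRNA Microarray Set.

    features_per_array: nominal capacity of one 8x15K array.
    reserved_features: features taken by controls (hypothetical value;
        set it from your actual design).
    """
    if features_per_mirna not in (16, 20):
        raise ValueError("each microRNA is represented by 16 or 20 features")
    usable = features_per_array - reserved_features
    if usable < features_per_mirna:
        raise ValueError("no room for even one microRNA")
    mirnas_per_array = usable // features_per_mirna
    return math.ceil(n_mirnas / mirnas_per_array)


if __name__ == "__main__":
    for features in (16, 20):
        print(features, "features per microRNA:",
              arrays_needed(900, features), "array(s)")
```

The example makes the trade-off between the two layouts visible: the same 900 microRNAs fit on one array at 16 features each but need a set of two at 20 features each, under the zero-reserved-features assumption.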
Agilent has established the following species priority list for multi-species microRNA microarray designs. For details, see Creating multi-species arrays above.
Priority order | Species name | Common name
1 | Homo sapiens | human
2 | Mus musculus | mouse
3 | Rattus norvegicus | rat
4 | Pan troglodytes | chimp
5 | Macaca mulatta | rhesus monkey
6 | Gallus gallus | chicken
7 | Oryza sativa | rice
8 | Ornithorhynchus anatinus | platypus
9 | Physcomitrella patens | moss
10 | Populus trichocarpa | poplar
11 | Arabidopsis thaliana | arabidopsis
12 | Danio rerio | zebrafish
13 | Canis familiaris | dog
14 | Caenorhabditis elegans | worm
15 | Xenopus tropicalis | frog
16 | Drosophila melanogaster | fruit fly
17 | Vitis vinifera | grape
18 | Bos taurus | cow
19 | Monodelphis domestica | opossum
20 | Fugu rubripes | fugu (Japanese pufferfish)
21 | Tetraodon nigroviridis | pufferfish
22 | Zea mays | corn
23 | Caenorhabditis briggsae | worm
24 | Pan paniscus | bonobo
25 | Gorilla gorilla | gorilla
26 | Pongo pygmaeus | orangutan
27 | Drosophila erecta | fruit fly
28 | Chlamydomonas reinhardtii | alga
29 | Drosophila ananassae | fruit fly
30 | Drosophila sechellia | fruit fly
31 | Sorghum bicolor | sorghum
32 | Drosophila yakuba | fruit fly
33 | Drosophila virilis | fruit fly
34 | Glycine max | soybean
35 | Macaca nemestrina | pig-tailed macaque
36 | Drosophila pseudoobscura | fruit fly
37 | Drosophila grimshawi | fruit fly
38 | Drosophila mojavensis | fruit fly
39 | Drosophila willistoni | fruit fly
40 | Drosophila persimilis | fruit fly
41 | Drosophila simulans | fruit fly
42 | Selaginella moellendorffii | spikemoss
43 | Oikopleura dioica | tunicate
44 | Sus scrofa | boar
45 | Schmidtea mediterranea | flatworm
46 | Ateles geoffroyi | spider monkey
47 | Bombyx mori | silkworm
48 | Anopheles gambiae | mosquito
49 | Apis mellifera | honey bee
50 | Brassica napus | rapeseed
51 | Lagothrix lagotricha | brown woolly monkey
52 | Tribolium castaneum | beetle
53 | Saguinus labiatus | tamarin
54 | Pinus taeda | pine
55 | Ciona intestinalis | tunicate
56 | Triticum aestivum | wheat
57 | Epstein Barr | human virus
58 | Medicago truncatula | clover
59 | Solanum lycopersicum | tomato
60 | Ciona savignyi | tunicate
61 | Mouse cytomegalovirus | mouse virus
62 | Rhesus lymphocryptovirus | rhesus virus
63 | Mareks disease | chicken virus
64 | Mareks disease | chicken virus
65 | Saccharum officinarum | sugarcane
66 | Kaposi sarcoma-associated | human virus
67 | Lemur catta | ring-tailed lemur
68 | Human cytomegalovirus | human virus
69 | Gossypium hirsutum | cotton
70 | Mouse gammaherpesvirus | mouse virus
71 | Symphalangus syndactylus | gibbon
72 | Pygathrix bieti | black snub-nosed monkey
73 | Herpes Simplex | human virus
74 | Rhesus monkey | rhesus monkey
75 | Xenopus laevis | frog
76 | Human immunodeficiency | human virus
77 | Ovis aries | sheep
78 | BK polyomavirus | human virus
79 | Dictyostelium discoideum | amoeba
80 | Gossypium raimondii | cotton
81 | JC polyomavirus | human virus
82 | Simian virus | monkey virus
83 | Brassica oleracea | wild mustard
84 | Brassica rapa | cabbages
85 | Cricetulus griseus | Chinese hamster
86 | Carica papaya | papaya
87 | Gossypium herbaceum | cotton