Create a SureGuide gRNA design

Use the SureGuide design wizard to design guide RNA (gRNA) sequences, for use with CRISPR/Cas9, that target a defined DNA region. The wizard takes you through the steps of the design creation process, which are described in detail below. The end result is a set of design files that includes the sequences and positions of the gRNAs, the secondary hits for each gRNA, the corresponding single guide RNA (sgRNA) sequences, and the sequences of the DNA templates used to synthesize the sgRNAs by in vitro transcription.

To open the SureGuide design wizard:

·         At the top of the screen, click Create Designs > SureGuide.

The wizard window opens to Step 1.

Steps of the SureGuide wizard

Step 1: Define Design

In this step, complete the fields described below to define the design.

Design Name

Type a name for your design into the field. Alphanumeric characters, hyphens, underscores, and spaces are permitted. The name must be unique within your workgroup.
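The naming rule above can be pre-checked before you open the wizard. The following is a minimal sketch, not part of SureDesign: the function name is hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the case-insensitive uniqueness comparison is also an assumption.

```python
import re

# Letters, digits, hyphens, underscores, and spaces only (ASCII assumed).
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_\- ]+")


def is_valid_design_name(name, existing_names=()):
    """Return True if the name uses only permitted characters and is unused."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    # Uniqueness within the workgroup; ignoring case here is an assumption.
    taken = {existing.lower() for existing in existing_names}
    return name.lower() not in taken
```

For example, is_valid_design_name("BRCA1 knockout_v2-test") passes the check, while a name containing a slash or a name already used in the workgroup does not.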

Species

Specify the species of the targets. The default selection is H. sapiens.

To change the selection, click Select. The Select Species dialog box opens, allowing you to select from a list of available species.

Build

If multiple genome builds are available for the selected species, select the desired genome build in the provided drop-down list.

If no drop-down list for build selection is provided, that indicates that only one genome build is currently available for the selected species. That build is indicated below the Species field.

 NOTE  SureDesign allows you to upload custom genomes to use when creating SureGuide designs. Click Manage Genomes to view the available custom genomes and upload new genome files.

CRISPR Application

Select the appropriate CRISPR application for the design.

·        CRISPRa - CRISPR activation (for up-regulation of a target)

·        CRISPRi - CRISPR interference (for down-regulation of a target)

·        CRISPRko - CRISPR gene knockout (for knockout of a target)

Category

Specify whether you want the gRNA library amplified or unamplified. Amplified libraries are composed of linear DNA fragments with end-sequences that make the oligos suitable for cloning into Agilent's pSGLenti vector backbone. Unamplified libraries are also composed of linear DNA fragments, but the end-sequences flanking the gRNA-encoding region are fully customizable. To create a plasmid library from an unamplified gRNA library, you must first PCR-amplify the gRNA library and then clone the amplicons into a suitable vector.

If you select Unamplified, then the Category setting includes an adjacent check box labeled IVT. Mark the check box if you want SureDesign to include T7 promoter elements in the flanking sequences of the gRNAs.

Create In

Specify the folder in which you want to save this design. The default selection is the top-level folder for your workgroup.

To change the selection, click Select to open the Select Folder dialog box, and mark the folder in which you want to save the new design. This dialog box lists the available folders within your workgroup and, if you are a member of any collaborations, lists the collaboration folders to which you have access. (If you later decide you want to change the folder location of the design, you can move it to another folder.)


Click Next to advance to Step 2.


Step 2: Input Target

In this step, define the DNA regions of interest for the gRNAs (referred to in the wizard as "targets") by providing the following information:

Targets

In the Targets text area, enter one or more identifiers for the target using either of the following approaches:

·        Type or paste the target identifiers.

·        Click Upload to browse to a text file (*.txt) that contains the target IDs.

The permitted identifiers are:

·        For target genes:

Gene name - enter the gene name (not case-sensitive) as it appears in one or more of the selected databases; example: brca1; see SureDesign gene finder for information on how SureDesign maps a gene name to a specific genomic location

Transcript ID - enter the transcript ID (not case-sensitive) as it appears in one or more of the selected databases; examples: NM_007294, OTTHUMT00000348798, or ENST00000357654; note that SureDesign ignores version numbers included in the transcript ID

Gene ID - enter the numerical NCBI gene ID; example: 672

GO ID - enter the GO ID; example: GO:0048040

·        For target genomic intervals:

Genomic coordinates - enter the chromosome number and range of nucleotides using the UCSC browser format or BED format.

You can add a string of text, no spaces, after the target genomic interval to be used as the target ID (e.g. chr1:1-100 geneX).
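When preparing a target list for upload, it can help to check which kind of identifier each line will be read as. The sketch below is illustrative only: the function name and the regular expressions are assumptions based on the examples above, not SureDesign's actual parser, and the transcript prefixes it recognizes are limited to a few common ones. Coordinates are returned as entered, with no conversion between UCSC and BED conventions.

```python
import re


def classify_target(entry):
    """Return a (kind, value) tuple for one line of the Targets text area."""
    fields = entry.split()
    if not fields:
        raise ValueError("empty target entry")
    first = fields[0]
    # UCSC browser format, e.g. chr1:1-100, optionally followed by a target ID.
    ucsc = re.fullmatch(r"(chr\w+):([\d,]+)-([\d,]+)", first, re.IGNORECASE)
    if ucsc:
        start = int(ucsc.group(2).replace(",", ""))
        end = int(ucsc.group(3).replace(",", ""))
        label = fields[1] if len(fields) > 1 else None
        return ("interval", (ucsc.group(1), start, end, label))
    # BED format: chromosome, start, end, and an optional name column.
    if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
        label = fields[3] if len(fields) > 3 else None
        return ("interval", (first, int(fields[1]), int(fields[2]), label))
    if re.fullmatch(r"GO:\d{7}", first, re.IGNORECASE):
        return ("go_id", first.upper())
    if first.isdigit():
        return ("gene_id", int(first))
    if re.fullmatch(r"(NM_|NR_|XM_|XR_|ENST|OTTHUMT)\d+(\.\d+)?", first, re.IGNORECASE):
        # Version suffixes are dropped because version numbers are ignored.
        return ("transcript_id", first.split(".")[0].upper())
    # Anything else is treated as a gene name (not case-sensitive).
    return ("gene_name", first.upper())
```

For example, classify_target("chr1:1-100 geneX") yields an interval labeled geneX, and classify_target("NM_007294.4") yields the transcript ID without its version number.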

Databases

Below the Databases heading, mark the genome annotation databases that you want SureDesign to use to obtain genomic coordinate information for your specified targets. The available databases depend on the species you selected in the Define Design step. For H. sapiens, the available database sources are:

RefSeq - US National Center for Biotechnology Information (NCBI)

Ensembl - European Bioinformatics Institute and the Wellcome Trust Sanger Institute

CCDS - Consensus Coding Sequence project (CCDS) of the US National Center for Biotechnology Information (NCBI)

Gencode - US National Human Genome Research Institute (NHGRI) and the Wellcome Trust Sanger Institute

VEGA - Vertebrate Genome Annotation project of the Human and Vertebrate Analysis and Annotation (HAVANA) group at the Wellcome Trust Sanger Institute

SNP - dbSNP database from the National Institutes of Health (NIH)

CytoBand - CytoBand file from the UCSC Genome Browser

Regions of Interest

Specify the regions within the targets for which you want to select gRNAs. Use the options below the Regions of Interest heading:

·        Entire Transcribed Region - Select this option to include gRNAs for the entire genomic sequence (exons, introns, and UTRs) of your target genes.

·        Coding Exons - Select this option to include gRNAs only for the translated regions of the target genes. If you want to include only the first coding exon and exclude all other coding exons, make sure the check box labeled First Exon is marked. (If the First Exon check box is not marked, then none of the coding exons for the target genes are excluded.)

·        Transcription Start Site - For the Transcription Start Site (TSS) option, the gene finder provides the first base at which transcription begins. A gene may have multiple TSSs, depending on its number of transcripts. The start site also depends on the orientation of the transcript.

 NOTE  For target genomic intervals (i.e., targets entered as genomic coordinates), SureDesign always includes the entire genomic sequence when selecting sequences for the design, regardless of your selection for the Regions of Interest.
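The dependence of the TSS on transcript orientation can be illustrated with a short sketch. The function names and the (start, end, strand) tuples are hypothetical; coordinates are assumed to be 1-based and inclusive on the + strand. This is not the gene finder's implementation.

```python
def transcription_start_site(tx_start, tx_end, strand):
    """Return the position of the first transcribed base.

    tx_start and tx_end are 1-based inclusive coordinates on the + strand,
    with tx_start <= tx_end.
    """
    if strand == "+":
        return tx_start  # transcription proceeds left to right
    if strand == "-":
        return tx_end  # transcription proceeds right to left
    raise ValueError("strand must be '+' or '-'")


def unique_tss(transcripts):
    """Collect the distinct TSS positions across a gene's transcripts."""
    return sorted({transcription_start_site(s, e, strand) for s, e, strand in transcripts})
```

A gene with three transcripts that begin at two different positions therefore has two TSSs, which is why a single gene can contribute more than one start site to a design.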

Include Flanking Bases

In the 3' and 5' drop-down lists, select how many base pairs of flanking sequence (on the 3' and 5' ends, respectively) you want SureDesign to include with the target region when designing the gRNAs.

 NOTE  SureDesign does not include flanking bases for targets entered as genomic coordinates.


Allow Synonyms

When this check box is marked, SureDesign compares the gene names you entered into the Targets area to a table of synonyms, and may use the synonym names to map the genes to a genomic location. For example, if you entered HER2 as a target, SureDesign would identify HER2 as a product of the gene ERBB2, and use ERBB2 to map the genomic location.

In cases in which the gene name for your target is also a synonym for another gene, SureDesign treats both genes as targets when Allow Synonyms is marked. For example, if you entered DSP as a target, SureDesign would identify your target as the official gene name for desmoplakin, but it would also identify it as a synonym for the gene encoding dentin sialophosphoprotein. Consequently, the program would map the target to two completely different genes, and in the next step of the wizard (Step 3: Target Details), you would see both genomic locations listed for the target.

When the Allow Synonyms check box is cleared, SureDesign maps your targets to genomic locations using only the entered gene names.
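The effect of the Allow Synonyms setting can be sketched as a two-table lookup. The tables below are hypothetical and contain only the two cases mentioned above (DSPP is the official symbol for dentin sialophosphoprotein); SureDesign's actual synonym table and matching logic are not shown here.

```python
# Hypothetical lookup tables holding only the examples from the text.
OFFICIAL_GENES = {"ERBB2", "DSP", "DSPP"}
SYNONYMS = {"HER2": {"ERBB2"}, "DSP": {"DSPP"}}


def resolve_gene_name(name, allow_synonyms):
    """Return the sorted official gene names that an entered name maps to."""
    key = name.upper()
    matches = set()
    if key in OFFICIAL_GENES:
        matches.add(key)
    if allow_synonyms:
        # A name can be both an official symbol and a synonym for another gene.
        matches |= SYNONYMS.get(key, set())
    return sorted(matches)
```

With synonyms allowed, DSP resolves to two genes and HER2 resolves to ERBB2. With synonyms disallowed, DSP resolves only to itself and HER2 is not found.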

To fully control how SureDesign maps your targets to a genomic location, enter your targets using transcript IDs, gene IDs, or SNP IDs instead of gene names. Alternatively, after you advance to the Target Details step of the wizard, click Download to download the Regions.bed file and then edit the genomic locations listed in the file so that they accurately match those of your targets. You can then go back to the Input Target step of the wizard and paste the genomic locations into the Targets input area.

 

Click Next to advance to Step 3.


Step 3: Target Details

In this step, verify that SureDesign correctly recognized the target identifiers that you entered in the Input Target step.

Target Details

The Target Details table lists the following information for each of the target identifiers that SureDesign was able to locate:

·        Target - The Target column lists the gene name, transcript ID, SNP ID, or genomic coordinates that you used to define the target.

·        # Regions - The # Regions column lists the number of target regions within the target.

·        Base Pairs - The Base Pairs column lists the total number of base pairs within the regions defined by the target identifier.

·        Position - The Position column lists the genomic coordinates identified for the target.

·        Group IDs - The ID(s) used to define the target (e.g., GO or KEGG ID).

 NOTE  If you entered the target sequence using a gene name, accession number, or similar identifier, click View targets in UCSC to open the UCSC Genome Browser and see the genomic location of the target identified by SureDesign.

 

Click Next to advance to Step 4.


Step 4: Enter Parameters

At this step, enter the parameters for the gRNA selection process. When you are finished making your selections, submit the design to SureDesign to begin gRNA selection.

Filters

·        PAM Sequence - Select which PAM sequence you want SureDesign to use when selecting gRNAs for the target window. You can select between NGG and NGGNG, or select Custom to enter a PAM sequence of your choice. If you select Custom, a field appears next to the PAM Sequence drop-down list where you can enter the desired custom PAM sequence.

·        Search In Strand - Select which strand of the target window sequence you want SureDesign to use when selecting gRNAs. You can select Forward (for the + strand), Reverse (for the – strand), or Both (for both the + and – strands).

·        gRNA with GG at the 5' end - Mark this check box if you want SureDesign to only select gRNAs that have an endogenous GG at the 5' end of their sequence.
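The three filters above can be pictured as a simple sequence scan. The following is a minimal sketch under stated assumptions, not SureDesign's search algorithm: the function names are hypothetical, the 20-nt guide length is an assumption, the guide is taken to lie immediately 5' of the PAM, and only the letters A, C, G, T, and N are supported in a PAM.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def pam_to_regex(pam):
    """Translate a PAM such as NGG into a regex, treating N as any base."""
    pam = pam.upper()
    if not pam or set(pam) - set("ACGTN"):
        raise ValueError("this sketch supports only A, C, G, T, and N in a PAM")
    return "".join("[ACGT]" if base == "N" else base for base in pam)


def find_grnas(target, pam="NGG", strand="Both", guide_len=20, require_5p_gg=False):
    """Yield (strand, offset, guide) for each guide immediately 5' of a PAM.

    offset is the 0-based index of the guide's first base in the sequence
    being scanned (the target for '+', its reverse complement for '-').
    """
    target = target.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})%s)" % (guide_len, pam_to_regex(pam)))
    scans = []
    if strand in ("Forward", "Both"):
        scans.append(("+", target))
    if strand in ("Reverse", "Both"):
        scans.append(("-", reverse_complement(target)))
    for label, seq in scans:
        for match in pattern.finditer(seq):
            guide = match.group(1)
            # Optional filter: keep only guides with an endogenous 5' GG.
            if require_5p_gg and not guide.startswith("GG"):
                continue
            yield (label, match.start(), guide)
```

Calling list(find_grnas(sequence, pam="NGGNG", strand="Forward")) restricts the scan to the + strand and the longer PAM. Because the sketch reports overlapping sites, neighboring guides can share most of their bases.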

Scoring Algorithm

Select one or both of the available scoring algorithms.

·        Doench Score - Calculates an on-target score using the method described in Doench et al. (Nature Biotechnology 34, 184-191 [2016]).

·        Zhang Score - Calculates an off-target score using the method described in Zhang et al. (Nature Biotechnology 31, 827-832 [2013]).

Set the parameters for the selection algorithms.

·        Cut Off - This parameter sets the minimum score threshold for the gRNA sequences. gRNAs are discarded if their score is below the specified cut off. You can enter a value from 0.00 to 1.00. Note that each scoring algorithm has its own Cut Off parameter.

·        Weight - The Weight parameters set the relative weights of the two scoring algorithms, which SureDesign then uses when calculating the final scores for the gRNAs. For example, if both algorithms are selected, and both weights are set to the default value of 0.5, then the two algorithms are equally weighted when calculating the final score for a gRNA. The formula used for the final weighted score is:
Final score = (Weight_Doench × Score_Doench) + (Weight_Zhang × Score_Zhang)
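As a worked illustration of how the Cut Off and Weight parameters interact, the sketch below applies each algorithm's cut-off first and then computes the weighted sum. The function name is hypothetical, the cut-off defaults of 0.0 are placeholders rather than SureDesign defaults, and the order of operations is an assumption.

```python
def final_score(doench, zhang, w_doench=0.5, w_zhang=0.5,
                cut_doench=0.0, cut_zhang=0.0):
    """Return the weighted final score, or None if a cut-off discards the gRNA."""
    # Each scoring algorithm has its own cut-off; failing either discards the gRNA.
    if doench < cut_doench or zhang < cut_zhang:
        return None
    return w_doench * doench + w_zhang * zhang
```

With the default weights of 0.5, a gRNA with a Doench score of 0.8 and a Zhang score of 0.6 receives a final score of (0.5 × 0.8) + (0.5 × 0.6) = 0.7.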

 

To submit the design for gRNA searching:

When you are finished entering the parameters, submit the design to the SureDesign job queue and the SureDesign algorithms will search for gRNA sequences for your design.

  1. Click Find gRNAs.

    A message box opens indicating the e-mail address where Agilent will contact you when the gRNA selection job is complete. If desired, you can enter additional e-mail addresses into the provided field.

  2. Click OK in the notification message to submit the design to SureDesign.

    Your submission is placed in the SureDesign job queue.

    The wizard automatically advances to Step 5.


Step 5: Find gRNAs

At this point in the design creation process, SureDesign is processing your gRNA search job. The length of time required for SureDesign to complete the job depends on the number of jobs waiting in the queue and the parameters for finding secondary hits for the gRNAs.

Click Close Design Wizard. When you receive an e-mail from Agilent SureDesign notifying you that your gRNA search job was successfully completed, relaunch the wizard and continue creating the design:

  1. Open the SureDesign Home screen.

  2. Locate the design under Designs: In Progress, and click the Continue icon.

    The wizard window opens to Step 6.

 NOTE  You can monitor the status of your gRNA search jobs from the SureDesign Home screen.


Step 6a: Summary

The Summary step displays a summary of the gRNA search job with the following information.

·        # Regions - number of contiguous genomic regions submitted to the search job

·        # Covered Regions - number of contiguous genomic regions for which gRNAs were found (these regions are listed in the table at the bottom of the screen)

·        # Missed Regions - number of contiguous genomic regions submitted to the search job for which no gRNAs were found

·        # Selected gRNAs - number of gRNAs currently selected for inclusion in the design

·        # Total gRNAs - total number of gRNAs identified in the search job

The table below the summary information lists the covered regions and displays, for each region, the total number of gRNAs identified (Total gRNAs column) as well as the number of gRNAs currently selected for inclusion in the design (Selected gRNAs column).
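The relationship between the summary counts can be expressed in a few lines. In this sketch the input structure (a mapping from region name to a list of gRNA name and selection flag pairs) and the function name are hypothetical; it only shows how the five values relate to one another.

```python
def summarize(regions):
    """regions maps a region name to a list of (grna_name, is_selected) pairs."""
    # A region is covered when at least one gRNA was found for it.
    covered = {name: grnas for name, grnas in regions.items() if grnas}
    return {
        "# Regions": len(regions),
        "# Covered Regions": len(covered),
        "# Missed Regions": len(regions) - len(covered),
        "# Selected gRNAs": sum(sel for grnas in covered.values() for _, sel in grnas),
        "# Total gRNAs": sum(len(grnas) for grnas in covered.values()),
    }
```

The number of covered regions plus the number of missed regions always equals the number of regions submitted.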

 

Click Review in the Action column for a region to open the Select gRNAs step. Note that you will need to access the Select gRNAs step for each region listed on the Summary step.

Step 6b: Select gRNAs

This step includes a genome browser, like the one below, that depicts the locations of the selected gRNAs within a contiguous target region.

Navigation tools

Use these tools to change the genomic location displayed in the browser.

Filter tool

Click Filter to open the Configure Filters dialog box where you can filter the selection of gRNAs based on the Zhang score.

In the Zhang Score field, type the minimum Zhang score for the gRNAs that you want to select, then click Apply. SureDesign de-selects any currently-selected gRNAs that do not meet the specified minimum. Note that lowering the Zhang Score below the default value does not cause SureDesign to select additional gRNAs.

To set the selected gRNAs back to their original settings, click Reset gRNA Selection.
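The one-way behavior of the filter (it only de-selects, and never adds gRNAs) can be sketched as follows. The function name and data structures are hypothetical and stand in for the selection state held by the wizard.

```python
def apply_zhang_filter(zhang_scores, selected, min_zhang):
    """Return the selection after de-selecting gRNAs below min_zhang.

    zhang_scores maps a gRNA name to its Zhang score; selected is the set
    of currently selected gRNA names. Unselected gRNAs are never added.
    """
    return {name for name in selected if zhang_scores[name] >= min_zhang}
```

Raising the minimum shrinks the selection, while lowering it leaves the current selection unchanged, because gRNAs outside the selection are never considered.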

Locations and ranks of gRNAs

The red and green boxes indicate the locations of the gRNAs. The red boxes are gRNAs on the (+) strand of the target, and the green boxes are gRNAs on the (–) strand of the target. Because gRNAs may overlap each other, the boxes are semi-transparent to allow for showing overlapping regions.

For each gRNA shown, there is a blue or gray box that displays the rank number of that gRNA. Blue boxes are used for gRNAs that are currently marked in the table. Gray boxes are for gRNAs that are not marked. Click here for information on how SureDesign ranks gRNAs.

When gRNAs overlap, the boxes displaying the ranks are stacked. For gRNAs on the (+) strand, the bottom number in the stack indicates the left-most gRNA. For gRNAs on the (–) strand, the top number in the stack indicates the

Click directly on a red or green box or a ranking number to select the row for that gRNA in the table at the bottom of the wizard window.

gRNAs table

The table at the bottom of the window lists the following information about each gRNA that SureDesign was able to locate for the target. For those gRNAs that you want to include in the final design, mark the check box in column 1.

·        Check box - Mark the check box to include the gRNA in the final design. Any gRNAs not marked in this column are excluded from the final design; however, they are still listed in the AllgRNA text file that is available for download after finalizing the design. See SureGuide gRNA design files available for download.

·        Name - The Agilent-assigned name for the gRNA. The naming convention is gRNA#N, where N is a number based on the distance of the gRNA from the start of the target window.

·        gRNA Sequence - The nucleotide sequence of the gRNA.

·        Doench - The on-target Doench score calculated for the gRNA.

·        Zhang - The off-target Zhang score calculated for the gRNA.

·        #Hits - The number of sequence matches for the gRNA within the genomic region currently displayed in the browser.

·        Position - The genomic coordinates of the gRNA.

After you review the gRNAs, mark the ones that you want to include in the final design. Then, click Back to return to the Summary step and click the Review link for the next region in the list.

Once you have reviewed and selected gRNAs for each region, click Finalize gRNAs from the Summary step to advance to Step 7.


Step 7: Finalize gRNAs

At this point in the design creation process, SureDesign is processing your final gRNA selections and saving them to the design. The length of time required for SureDesign to finish saving your selected gRNAs depends on the number of jobs waiting in the queue and the number of selected gRNAs.

Click Close Design Wizard. When you receive an e-mail from Agilent SureDesign notifying you that the selected gRNAs were successfully saved to the design, relaunch the wizard and continue creating the design:

  1. Open the SureDesign Home screen.

  2. Locate the design under Designs: In Progress, and click the Continue icon.

    The wizard window opens to Step 8.

 NOTE  You can monitor the status of your design from the SureDesign Home screen.


Step 8: Finalize

At this step, review the flanking sequences and IVT options and make any desired edits.

IVT Synthesis

If you want SureDesign to add GG to the 5' end of each gRNA sequence, mark the check box labeled Append GG at 5' end.

Flanking Sequence

The flanking sequences to be added to the 3' end and 5' end of each gRNA are displayed in the fields in the Flanking Sequences section. Review the sequences and edit them as needed.
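The way these two options combine can be sketched as simple string assembly. This is an illustration only: the function name is hypothetical, the flanking sequences are caller-supplied placeholders, and placing the appended GG between the 5' flanking sequence and the gRNA is an assumption about the final oligo layout.

```python
def assemble_oligo(grna, flank_5p, flank_3p, append_gg=False):
    """Return the 5' flank + optional GG + gRNA + 3' flank, written 5' to 3'."""
    core = ("GG" + grna) if append_gg else grna
    return flank_5p + core + flank_3p
```

For example, assemble_oligo("ACGT", "TTT", "CCC", append_gg=True) returns "TTTGGACGTCCC", where TTT and CCC stand in for real flanking sequences.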

 

When ready, click Finalize Design.


Step 9: Design Complete

After clicking Finalize Design, the wizard window updates to the Design Complete step.

Use the action buttons at the bottom of the Finalize Design window to take further action on the design:

·        Click Mark as Favorite to add the design to your list of favorites. The design will appear in the Designs: Recent and Favorites dashboard on the Home screen.

·        Click Download to download one or more design files, including a formatted PDF report that summarizes key information on the design.

These action buttons are also available from the design details window.