HIV Databases
HIV sequence database



QuickAlign Explanation

Purpose

QuickAlign will align epitopes, functional domains, primers, binding sites, or any region of interest, to sequences from the HIV Sequence Database, or to your own alignment.

"Conventional" versus "Discontinuous" QuickAlign

Conventional QuickAlign: Input one or more sequences, and each sequence will be aligned (separately) to the chosen type of alignment.

Discontinuous QuickAlign: Input start/stop coordinates for any alignment of any organism. If multiple coordinate ranges are provided, the output will be concatenated into a single spliced alignment.
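As a rough illustration of the splicing step, the Python sketch below cuts several coordinate ranges out of each aligned sequence and concatenates the pieces. It assumes the coordinates are 1-based, inclusive alignment columns; the function name `splice_alignment` and the toy alignment are hypothetical and are not taken from QuickAlign itself.

```python
def splice_alignment(alignment, ranges):
    """Concatenate the given column ranges from every aligned sequence.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    ranges: list of (start, stop) pairs, assumed 1-based and inclusive
    """
    spliced = {}
    for name, seq in alignment.items():
        # Convert each 1-based inclusive range to a Python slice and join the pieces.
        spliced[name] = "".join(seq[start - 1:stop] for start, stop in ranges)
    return spliced


# Hypothetical toy alignment, for illustration only.
toy = {"REF": "QVAR-LHPKW", "seq2": "QVAR-LFPRW"}
print(splice_alignment(toy, [(2, 4), (8, 10)]))
```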

Query sequence(s) (Conventional QuickAlign only)

Positions of interest (Discontinuous QuickAlign only)

Organism: other

QuickAlign can be used for any query organism. For a non-HIV organism, you will need to provide your own alignment, and no map image will be presented in the results.

LANL database alignment type

See Alignments page for details on premade alignment types offered.

User alignment

You may choose to provide your own alignment of any organism.

User alignment reference sequence

If you are providing your own alignment, the program needs a reference sequence in order to show a gene map with your query. The required reference sequence for HIV-1 is HXB2 (accn K03455); the required reference for HIV-2 and SIV is Mac239 (accn M33262).

If the first sequence in your file is the reference sequence, please check the box.

This option is irrelevant if your organism is non-HIV.

Summarize results by

QuickAlign results include summaries of residue frequencies by subtype, or by any other grouping you specify. You can specify the groupings automatically, based on your sequence names, either by the first n characters, or by delimited groupings within the sequence names.

You can also choose to provide your own groupings of sequence names. In the example below, three groups (B, CRF01, and other) will be used for the analysis of residue frequencies.

Example:

B:
BUS_90.5_
BCA_90.2_
BDE_90.3_

CRF01:
01AEUS_81.7_
01AECA_81.5_
01DE_81.2_
CRF01HA_88.6_

other:
02AG_US_55.2
C_US_55.8
D_BE_77.9
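The two automatic grouping modes described above (first n characters, or a delimited field of the sequence name) could be sketched in Python as follows. The function names and the choice of "_" as the default delimiter are assumptions made for illustration; QuickAlign's own implementation may differ.

```python
from collections import defaultdict


def group_by_prefix(names, n):
    """Group sequence names by their first n characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:n]].append(name)
    return dict(groups)


def group_by_field(names, delimiter="_", field=0):
    """Group sequence names by one delimited field (0-based index)."""
    groups = defaultdict(list)
    for name in names:
        groups[name.split(delimiter)[field]].append(name)
    return dict(groups)


# Sequence names taken from the example above.
print(group_by_prefix(["BUS_90.5_", "BCA_90.2_", "BDE_90.3_"], 1))
print(group_by_field(["02AG_US_55.2", "C_US_55.8", "D_BE_77.9"]))
```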

Delete Gaps and Shift

If the "Delete gaps and shift" option is selected, gaps placed to bring sequences into alignment will be squeezed out and the alignment shifted rightward (toward the C-terminal end). For example, if your query has a one-amino-acid insertion relative to most other sequences, then the following alignment:

QUERY  VARELHP
REF    VAR-LHP
seq2   VAR-LHP
seq3   VAR-LFP
seq4   VAR-LMP

would be presented like this with gaps deleted:

QUERY  VARELHP
REF    QVARLHP
seq2   QVARLHP
seq3   QVARLFP
seq4   QVARLMP

Q is the amino acid one position to the left of the V. As a result of squeezing gaps and shifting characters rightward, alignments in gappy regions will look "bad."

NOTE: The delete gaps option is useful for aligning immunologically reactive epitopes, because in such cases it is particularly important to maintain the alignment of the C-terminal anchor residues.
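The gap-squeezing behavior can be sketched in Python as below. This is a minimal illustration, not QuickAlign's code: it assumes each aligned row also contains the residues upstream of the query window (here a leading Q), and the function name `delete_gaps_and_shift` and its parameters are hypothetical.

```python
def delete_gaps_and_shift(aligned_row, end_col, width, gap="-"):
    """Return the last `width` ungapped residues ending at column `end_col`.

    aligned_row: one row of the alignment, including upstream residues
    end_col: exclusive 0-based column where the query window ends
    width: number of residues to display (the query length)
    """
    # Squeeze out gaps from everything up to the right edge of the window.
    ungapped = aligned_row[:end_col].replace(gap, "")
    # Keep the right (C-terminal) edge fixed; pad on the left if too short.
    return ungapped[-width:].rjust(width, gap)


rows = {
    "QUERY": "QVARELHP",
    "REF": "QVAR-LHP",
    "seq3": "QVAR-LFP",
}
for name, row in rows.items():
    print(f"{name:<6} {delete_gaps_and_shift(row, end_col=8, width=7)}")
```

Anchoring on the right edge is what keeps the C-terminal residues in the same column for every sequence.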

Wide Output

For ease of reading, QuickAlign presents its alignment result in groups of 10 characters, with a maximum line width of 50 characters. If the query is longer than 50 characters, it will be continued below. However, you can force lines to be longer than 50 characters, by checking "Yes".
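The layout described above can be approximated with a short Python sketch. It assumes the 50-character limit counts residues rather than the separating spaces, which the text does not state explicitly; `format_blocks` is a hypothetical helper name.

```python
def format_blocks(seq, group=10, line_width=50):
    """Split a sequence into lines of `line_width` residues, in groups of `group`."""
    lines = []
    for i in range(0, len(seq), line_width):
        chunk = seq[i:i + line_width]
        # Separate each run of `group` residues with a single space.
        lines.append(" ".join(chunk[j:j + group] for j in range(0, len(chunk), group)))
    return lines


sequence = "A" * 60
print("\n".join(format_blocks(sequence)))                 # default width
print("\n".join(format_blocks(sequence, line_width=60)))  # "wide" output
```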

See sample output on Explanation of QuickAlign Results.

Calculate frequency by position and show logos

On the Results page, you will see buttons for "Summarize All" and "Summarize by Subtype". If you have selected "Calculate frequency by position", these summaries will include data showing the frequency of each nucleotide or amino acid at each position.

If "Calculate frequency by position" is selected, the Summarize pages will also contain links to "See full raw counts", which will show you the full residue counts without applying any cutoff.
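For readers who want to reproduce the raw per-position counts on their own alignment, a minimal Python sketch follows. It simply counts every character in each alignment column, gaps included; the function name `column_counts` is hypothetical and the treatment of gaps is an assumption.

```python
from collections import Counter


def column_counts(seqs):
    """Return one Counter of residue counts per alignment column."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*seqs) yields the alignment one column at a time.
    return [Counter(column) for column in zip(*seqs)]


# Rows from the gapped example alignment shown earlier on this page.
counts = column_counts(["VAR-LHP", "VAR-LHP", "VAR-LFP", "VAR-LMP"])
for position, counter in enumerate(counts, start=1):
    print(position, dict(counter))
```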

Below the frequency table is a Sequence Logo (frequency graph) that shows a visual representation of the frequency of each residue at each position. The height of each letter indicates the relative frequency of that residue at that position. The width of a stack of letters is proportional to the fraction of valid residues in that position, i.e., columns with many gaps or unknown residues are narrow. These graphs are produced by WebLogo 3.

See sample output on Explanation of QuickAlign Results.

Cut-off for calculating frequency by position

The frequency table will show only the residues with the highest representation(s), as determined by a cutoff. If the cutoff is 100%, all residue frequencies will be shown. If the cutoff is 95%, the most frequent residues will be shown, up to a cumulative total of 95%, then all others will be presented as "other". Lumping together the infrequent residues can be a useful simplification, particularly in the case of protein sequences.
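One way to read the cutoff rule is sketched below in Python. It assumes residues are listed from most to least frequent until the cumulative total reaches the cutoff, so the residue that crosses the cutoff is still shown; the text does not settle that detail. The function name `apply_cutoff` and the counts are hypothetical.

```python
def apply_cutoff(counts, cutoff=95):
    """Return (residue, percent) pairs, lumping the tail into "other"."""
    total = sum(counts.values())
    shown = []
    cumulative = 0
    # Most frequent first; ties broken alphabetically for a stable order.
    for residue, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if cumulative * 100 >= cutoff * total:
            break  # the cutoff has been reached; the rest become "other"
        shown.append((residue, 100 * count / total))
        cumulative += count
    remainder = total - cumulative
    if remainder:
        shown.append(("other", 100 * remainder / total))
    return shown


# Hypothetical counts for a single alignment column.
column = {"K": 80, "R": 12, "Q": 5, "E": 2, "T": 1}
print(apply_cutoff(column, cutoff=95))
print(apply_cutoff(column, cutoff=100))
```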

Include surrounding region

If checked, this option will display an additional 15 residues on each side of the query. These residues are taken from the appropriate reference sequence.
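In code terms, this amounts to slicing the reference on either side of the query. The Python sketch below assumes 0-based, end-exclusive coordinates on an ungapped reference string; `with_flanks` and the example sequence are hypothetical.

```python
def with_flanks(reference, start, end, flank=15):
    """Return (left flank, query region, right flank) from the reference.

    start, end: 0-based, end-exclusive coordinates of the query region
    """
    left_start = max(0, start - flank)  # do not run off the 5'/N-terminal end
    return reference[left_start:start], reference[start:end], reference[end:end + flank]


# Hypothetical reference string, for illustration only.
reference = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" * 2
left, query, right = with_flanks(reference, start=20, end=30)
print(left, query, right)
```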

Remove residues from logo

This option allows you to change the appearance of the sequence logo image(s) in the 'Summarize' results.

[Image: logo, show all]

show all
If the default 'show all' is chosen, the logo is presented as usual, showing the abundance of each residue in the alignment. At right, the epitope RPNNNTRKSI was aligned to the LANL HIV-1 filtered web alignment.

 
[Image: logo, remove consensus]

remove: consensus of alignment
If 'remove consensus' is chosen, the images omit the most common residue at each position of the alignment. At right, the epitope RPNNNTRKSI was aligned to the LANL HIV-1 filtered web alignment.

remove: consensus of seq group
This option is similar, but removes consensus of each sequence group, rather than the consensus of the whole alignment.

 
[Image: logo, remove residues of first seq]

remove: residues of 1st sequence
If 'remove 1st' is chosen, the images omit the residue that occurs in the first sequence of the alignment. For the LANL database alignments, the first sequence is always the reference sequence. For user alignments, the first sequence might be a vaccine sequence, for example, being compared to the alignment it was derived from. At right, the epitope RPNNNTRKSI was aligned to the LANL HIV-1 filtered web alignment.

remove: residues of 1st seq of seq group
This option is similar, but removes the residues of the first sequence of each sequence group, rather than the first sequence of the whole alignment.
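The effect of the whole-alignment "remove" options on the counts behind a logo can be sketched in Python as follows. This is an illustration under assumptions, not the tool's code: the function name `logo_counts`, the `remove` values, and the alphabetical tie-break for the consensus are all hypothetical. The per-group variants would apply the same logic to each sequence group separately.

```python
from collections import Counter


def logo_counts(seqs, remove=None):
    """Per-column residue counts, optionally dropping one residue per column.

    remove: None (show all), "consensus", or "first" (first sequence)
    """
    columns = [Counter(column) for column in zip(*seqs)]
    for i, counts in enumerate(columns):
        if remove == "consensus":
            # Most common residue; ties resolved alphabetically (an assumption).
            drop = max(sorted(counts), key=counts.get)
        elif remove == "first":
            drop = seqs[0][i]
        else:
            continue
        del counts[drop]
    return columns


# Hypothetical alignment whose first sequence differs from the consensus.
seqs = ["VARLFP", "VARLHP", "VARLHP", "VARLMP"]
print(dict(logo_counts(seqs, remove="consensus")[4]))
print(dict(logo_counts(seqs, remove="first")[4]))
```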

 

Mark potential N-linked glycosylation sites

When 'yes' is chosen, any asparagine (N) occurring in the pattern NxS or NxT (x = any amino acid except proline) will appear as "O". For more information about N-linked glycosylation, see N-GlycoSite.
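The NxS/NxT rule can be expressed as a regular expression. The Python sketch below uses a lookahead so that overlapping motifs are all marked; it assumes an ungapped, uppercase protein sequence, and `mark_glycosylation_sites` is a hypothetical name rather than part of QuickAlign.

```python
import re

# N followed by any character except P, then S or T. The lookahead keeps
# the following residues unconsumed so that overlapping motifs still match.
SEQUON = re.compile(r"N(?=[^P][ST])")


def mark_glycosylation_sites(protein):
    """Replace each N that starts an NxS/NxT motif (x != P) with "O"."""
    return SEQUON.sub("O", protein)


# The epitope used in the logo examples above.
print(mark_glycosylation_sites("RPNNNTRKSI"))
```

In RPNNNTRKSI only the second N is followed by a non-proline residue and then T, so only that N is marked.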


last modified: Mon Mar 23 11:32 2015


Questions or comments? Contact us at seq-info@lanl.gov.

 
Operated by Triad National Security, LLC for the U.S. Department of Energy's National Nuclear Security Administration
© Copyright Triad National Security, LLC. All Rights Reserved | Disclaimer/Privacy

Dept of Health & Human Services Los Alamos National Institutes of Health