HIV Databases
HIV sequence database

Explanation of QuickAlign Options

Provide query sequence(s)

Provide one or more nucleotide or protein sequences in any standard format, and the tool will align each one to an alignment. This tool can be used to align epitopes, functional domains, primers, or any region of interest. If multiple input sequences are used, they do not need to be aligned or from the same region. The output will be trimmed to the region(s) of the provided sequences. If nucleotide sequences are provided, the reverse complement of the sequences will also be considered when making the best match. If a reverse-complement sequence has a better match score than the original query sequence, then the aligned position of the reverse-complement sequence will be used to retrieve the alignment, instead of the direct sequence match.
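The strand choice described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the score is a simple ungapped identity count against a single reference string, which is an assumption and not QuickAlign's actual matching algorithm.

```python
# Illustrative sketch only. The ungapped identity score below is an assumed
# stand-in for whatever match score QuickAlign really computes.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def best_ungapped_hit(query, reference):
    """Return (score, start) for the best ungapped placement of query."""
    best = (-1, -1)
    for start in range(len(reference) - len(query) + 1):
        window = reference[start:start + len(query)]
        score = sum(a == b for a, b in zip(query, window))
        if score > best[0]:
            best = (score, start)
    return best

def choose_strand(query, reference):
    """Pick the reverse complement only if it scores strictly better."""
    forward = best_ungapped_hit(query, reference)
    reverse = best_ungapped_hit(reverse_complement(query), reference)
    if reverse[0] > forward[0]:
        return ("reverse", reverse[1], reverse[0])
    return ("forward", forward[1], forward[0])

reference = "TTGACCATGCAAGG"
print(choose_strand("CCATGC", reference))
print(choose_strand("GCATGG", reference))
```

In this toy case the second query matches the reference better as its reverse complement, so the position of the reverse-complement match would be the one used to retrieve the alignment.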

Choose region coordinates

Instead of providing a query sequence, QuickAlign can be used to simply show the alignment of any particular region from our premade alignments. Enter coordinate values (either coordinates relative to the complete HIV sequence, or relative to any gene) and the tool will extract an alignment encompassing just the region of these coordinates. To show an entire gene or region, enter the start value as "1" and the end value as "end".
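As a rough sketch of how a coordinate range might be mapped onto alignment columns, the snippet below treats coordinates as 1-based positions in an aligned reference row and accepts "end" as the final position. The function names, the dictionary layout, and the toy alignment are assumptions made for illustration; they do not describe the tool's internals.

```python
def parse_range(start, end, ref_length):
    """Turn 1-based inclusive inputs (end may be "end") into a 0-based slice."""
    first = int(start)
    last = ref_length if str(end).lower() == "end" else int(end)
    if not 1 <= first <= last <= ref_length:
        raise ValueError("coordinates outside the reference sequence")
    return first - 1, last

def extract_region(alignment, ref_name, start, end, gap="-"):
    """Clip every row to the columns spanning the requested reference range."""
    ref_row = alignment[ref_name]
    # Alignment columns in which the reference has a residue rather than a gap.
    residue_columns = [i for i, ch in enumerate(ref_row) if ch != gap]
    lo, hi = parse_range(start, end, len(residue_columns))
    col_start = residue_columns[lo]
    col_end = residue_columns[hi - 1] + 1
    return {name: row[col_start:col_end] for name, row in alignment.items()}

# Toy alignment; "REF" stands in for the reference sequence row.
toy = {"REF": "AC-GTA", "s1": "ACTGTA", "s2": "A--GTT"}
print(extract_region(toy, "REF", "2", "3"))
print(extract_region(toy, "REF", "1", "end"))
```

Because reference positions 2 and 3 are separated by a gap column in this toy alignment, the extracted region is three columns wide.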

LANL database alignment type

Several choices are provided. See the Alignments page for additional details on the premade alignment types offered.

User alignment

You may choose to provide your own alignment of HIV, SIV, or any organism.

Organism: other

QuickAlign can be used for any query organism. For organisms other than HIV or SIV, you will need to provide your own alignment, and no map image will be presented in the results.

My alignment includes reference sequence

If you are providing your own HIV-1, HIV-2, or SIV alignment, the program will add a reference sequence in order to align and clip your query properly. The required reference sequence for HIV-1 is HXB2 (accession# K03455); the required reference for HIV-2 and SIV is Mac239 (accession# M33262).

If the first sequence in your file is the reference sequence, please check the box. If the box is unchecked, the appropriate reference sequence will be added.

The option is irrelevant if your organism is non-HIV/SIV.

Summarize results by

QuickAlign results include summaries of residue frequencies by subtype, or by any other grouping you specify. You can specify the groupings automatically, based on your sequence names, either by the first n characters, or by delimited groupings within the sequence names.
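The two automatic grouping modes could look something like the sketch below. The function names and the sample sequence names are hypothetical; the dot-delimited names simply assume that the grouping label is the first field.

```python
from collections import defaultdict

def group_by_prefix(names, n):
    """Group sequence names by their first n characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:n]].append(name)
    return dict(groups)

def group_by_field(names, delimiter, field_index):
    """Group sequence names by one delimited field (0-based index)."""
    groups = defaultdict(list)
    for name in names:
        groups[name.split(delimiter)[field_index]].append(name)
    return dict(groups)

# Made-up names for illustration only.
names = ["B.FR.83.seqA", "B.US.90.seqB", "01_AE.TH.90.seqC", "C.ZA.04.seqD"]
print(group_by_prefix(names, 1))
print(group_by_field(names, ".", 0))
```

With these names, grouping by the first character would split off "0" rather than "01_AE", which is one reason a delimited field can be the better choice when group labels differ in length.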

You can also choose to provide your own groupings of sequence names. For example, three groups (B, CRF01, and other) could be used for the analysis of residue frequencies.





Delete Gaps and Shift

If the "Delete gaps and shift" option is selected, gaps that were inserted to bring sequences into alignment are squeezed out and the remaining characters are shifted rightward (toward the C-terminal end). For example, suppose your query has a one-amino-acid insertion relative to most other sequences. Then the following alignment:

seq2   VAR-LHP
seq3   VAR-LFP
seq4   VAR-LMP

would be presented like this with gaps deleted:

seq2   QVARLHP
seq3   QVARLFP
seq4   QVARLMP

Q is the amino acid one position to the left of the V. As a result of squeezing gaps and shifting characters rightward, alignments in gappy regions will look "bad."

NOTE: The delete gaps option is useful for aligning immunologically reactive epitopes, because in such cases it is particularly important to maintain the alignment of the C-terminal anchor residues.
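The squeeze-and-shift behavior can be sketched in a few lines. The function name and the `upstream` argument (the ungapped residues immediately to the left of the displayed region) are assumptions made for this illustration, not part of the tool's interface.

```python
def delete_gaps_and_shift(aligned, upstream, gap="-"):
    """Remove gaps from one aligned row and refill from the left.

    Each deleted gap is replaced by one residue taken from `upstream`,
    so the right-hand (C-terminal) end of the row stays in place.
    """
    squeezed = aligned.replace(gap, "")
    missing = len(aligned) - len(squeezed)
    # Guard against upstream[-0:], which would return the whole string.
    fill = upstream[-missing:] if missing else ""
    # Pad with spaces if fewer upstream residues are available than needed.
    return (fill + squeezed).rjust(len(aligned))

# Rows from the example above, assuming "Q" immediately precedes the "V".
for row in ("VAR-LHP", "VAR-LFP", "VAR-LMP"):
    print(delete_gaps_and_shift(row, "Q"))
```

A row that has no gaps is returned unchanged, so only rows that needed gaps for alignment are shifted.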

Wide Output

For ease of reading, QuickAlign presents its alignment result in groups of 10 characters, with a maximum line width of 50 characters. If the query is longer than 50 characters, it will be continued below. However, you can force lines to be longer than 50 characters, by checking "Yes".
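The default layout can be approximated as below. The function name and its parameters are hypothetical, and keeping the 10-character grouping in wide mode is an assumption.

```python
def format_row(seq, group=10, line_width=50, wide=False):
    """Split a sequence into display lines made of space-separated groups."""
    # In wide mode the whole sequence goes on one line.
    width = len(seq) if wide else line_width
    lines = []
    for start in range(0, len(seq), width or 1):
        chunk = seq[start:start + width]
        groups = [chunk[i:i + group] for i in range(0, len(chunk), group)]
        lines.append(" ".join(groups))
    return lines

seq = "ACGT" * 15  # 60 characters
for line in format_row(seq):
    print(line)
for line in format_row(seq, wide=True):
    print(line)
```

With the default settings a 60-character query wraps onto a second line after 50 characters; with `wide=True` it stays on a single line.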

See sample output on Explanation of QuickAlign Results.

Calculate frequency by position

On the Results page, you will see buttons for "Summarize All" and "Summarize by Subtype". If you have selected "Calculate frequency by position", these summaries will include data showing the frequency of each nucleotide or amino acid at each position.

If "Calculate frequency by position" is selected, the Summarize pages will also contain links to "See full raw counts", which will show you the full residue counts without applying any cutoff.

Below the frequency table is a Sequence Logo (frequency graph) that shows a visual representation of the frequency of each residue at each position. The height of each letter indicates the relative frequency of that residue at that position. The width of a stack of letters is proportional to the fraction of valid residues in that position, i.e., columns with many gaps or unknown residues are narrow. These graphs are produced by WebLogo 3.

See sample output on Explanation of QuickAlign Results.

Cut-off for calculating frequency by position

The frequency table will show only the residues with the highest representation(s), as determined by a cutoff. If the cutoff is 100%, all residue frequencies will be shown. If the cutoff is 95%, the most frequent residues will be shown, up to a cumulative total of 95%, then all others will be presented as "other". Lumping together the infrequent residues can be a useful simplification, particularly in the case of protein sequences.
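One plausible reading of the cutoff rule is sketched below: residues are taken in order of decreasing frequency until the cumulative total reaches the cutoff, and everything left over is lumped into "other". The function name is hypothetical, and the exact handling at the boundary (whether the residue that crosses the cutoff is shown) is an assumption.

```python
from collections import Counter

def summarize_column(residues, cutoff=95.0):
    """Return (residue, percent) pairs for one alignment column."""
    counts = Counter(residues)
    total = sum(counts.values())
    shown = []
    cumulative = 0  # count of residues already listed
    other = 0       # count of residues lumped together
    for residue, count in counts.most_common():
        # Compare counts rather than summed percentages to avoid rounding drift.
        if cumulative * 100 < cutoff * total:
            shown.append((residue, 100 * count / total))
            cumulative += count
        else:
            other += count
    if other:
        shown.append(("other", 100 * other / total))
    return shown

column = "L" * 90 + "M" * 6 + "F" * 3 + "I" * 1
print(summarize_column(column, cutoff=95.0))
print(summarize_column(column, cutoff=100.0))
```

In this made-up column, a 95% cutoff lists L and M and lumps F and I into "other", while a 100% cutoff lists all four residues.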

Include surrounding region

If checked, this option will display an additional 15 residues on each side of the query. These residues are taken from the appropriate reference sequence.



last modified: Wed Jul 10 11:01 2013

Operated by Los Alamos National Security, LLC, for the U.S. Department of Energy's National Nuclear Security Administration
Copyright © 2005-2017 LANS, LLC All rights reserved | Disclaimer/Privacy