Viral Epidemiology Signature Pattern Analysis

VESPA calculates the frequency of each amino acid (or nucleotide) at each position (column) of an alignment, separately for the query set and the background set, and selects the positions at which the most common character in the query set differs from that in the background set.

This program can be used to quickly detect amino acids that characterize differences between two groups of sequences. It compares the two groups and looks for a "signature" pattern: the set of amino acids that are conserved within each set but differ between the sets. It picks out those distinguishing amino acids and calculates their frequencies in each set.
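The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VESPA's actual code: the function names column_frequencies and signature_positions are hypothetical, every character (including gaps) is counted as-is, and ties for the most common character are broken alphabetically.

```python
from collections import Counter


def column_frequencies(seqs):
    """Return one dict per alignment column mapping character -> frequency."""
    if not seqs:
        raise ValueError("need at least one sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    freqs = []
    for column in zip(*seqs):
        counts = Counter(column)
        freqs.append({ch: n / len(column) for ch, n in counts.items()})
    return freqs


def signature_positions(query, background):
    """List positions where the most common character differs between sets.

    Each entry is (position, query_char, query_freq, background_char,
    background_freq), with positions numbered from 1.
    """
    q_freqs = column_frequencies(query)
    b_freqs = column_frequencies(background)
    if len(q_freqs) != len(b_freqs):
        raise ValueError("query and background must share one alignment")
    result = []
    for pos, (qf, bf) in enumerate(zip(q_freqs, b_freqs), start=1):
        # Assumption: ties are broken alphabetically (first max of sorted keys).
        q_top = max(sorted(qf), key=qf.get)
        b_top = max(sorted(bf), key=bf.get)
        if q_top != b_top:
            result.append((pos, q_top, qf[q_top], b_top, bf[b_top]))
    return result


# Made-up toy data, for illustration only.
query = ["MKTA", "MKSA", "MRTA"]
background = ["MRTG", "MRTG", "MKTG"]
signature = signature_positions(query, background)
for pos, q_ch, q_f, b_ch, b_f in signature:
    print(f"position {pos}: query {q_ch} ({q_f:.2f}), background {b_ch} ({b_f:.2f})")
```

In this toy input, positions 2 and 4 are reported, because the most common character there differs between the two sets; position 3 is not, since T is the most common character in both.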

[Figure: sequence alignment of alphabet-OK and mutant-ABCs]

In this example alignment, the two sequences are named alphabet-OK and mutant-ABCs. No sequence information was available for the last two positions of the second sequence. The "I" in the first sequence is deleted in the second, U has "mutated" to V, and A has "mutated" to Z. Hence the signature pattern for mutant-ABCs relative to alphabet-OK is:

signature   Z.......-...........V...**, or 3/24 characters.

The periods (.) in the above signature indicate that the two sequences agree in those positions. The Z, -, and V show where the sequences disagree, defining a signature for the "mutant-ABCs" sequence. The asterisks (*) mark the last two positions, for which no sequence information was available. The denominator for the three-character signature is therefore 24, not 26.
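The two-sequence example can be reproduced with a short Python sketch. The sequence strings below are reconstructed from the description in the text, and the "?" marker for positions without sequence information is an assumption made for illustration, as is the function name pairwise_signature; none of this is taken from VESPA itself.

```python
def pairwise_signature(reference, mutant, missing="?"):
    """Compare two aligned sequences position by position.

    Returns (signature, differences, compared), where the signature uses
    "." for agreement, the mutant's character for disagreement, and "*"
    for positions with no sequence information.
    """
    if len(reference) != len(mutant):
        raise ValueError("sequences must be aligned to the same length")
    chars = []
    differences = 0
    compared = 0
    for ref_ch, mut_ch in zip(reference, mutant):
        if missing in (ref_ch, mut_ch):
            chars.append("*")  # excluded from the denominator
            continue
        compared += 1
        if ref_ch == mut_ch:
            chars.append(".")
        else:
            chars.append(mut_ch)
            differences += 1
    return "".join(chars), differences, compared


# Assumed reconstruction of the example: "?" marks missing data (hypothetical).
alphabet_ok = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
mutant_abcs = "ZBCDEFGH-JKLMNOPQRSTVVWX??"

sig, differences, compared = pairwise_signature(alphabet_ok, mutant_abcs)
print(f"signature   {sig}, or {differences}/{compared} characters.")
```

Positions flagged as missing are skipped before counting, which is why the denominator comes out as 24 rather than the full alignment length of 26.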



When citing VESPA, please cite Korber and Myers (1992).

  1. Ou C-Y, Ciesielski CA, Myers G, Bandea CI, Luo C-C, Korber BTM, Mullins JI, Schochetman G, Berkelman RL, Economou AN, Witte J, Furman LJ, Satten GA, MacInnes KA, Curran JW, and Jaffe HW: Molecular epidemiology of HIV transmission in a dental practice. Science, 1992 May; 256(5060):1165-71.

  2. Korber B and Myers G: Signature pattern analysis: a method for assessing viral sequence relatedness. AIDS Research and Human Retroviruses, 1992 Sep; 8(9):1549-60.
