HIV sequence database

Sequence similarity in 6 regions of the HIV genome

These histograms are based on pairwise similarity scores computed from alignments of a selected set of sequences in the HIV database. All sequences are from different individuals. The values should be read as an upper bound on the similarity that can be considered 'reasonable' between sequences from different individuals, because:

  1. the sequences were gap-stripped before the calculations, which reduces the differences;
  2. many were derived from cultured virus ('lab strains'); and
  3. most of them are from old isolates (1984-85).
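
As a rough illustration of the calculation described above, the following sketch gap-strips a toy alignment and scores every pair of sequences. It assumes that "similarity" means the fraction of identical positions and that gap-stripping removes every column containing a gap in any sequence; the page does not state the exact scoring used, and the function names and example sequences are hypothetical.

```python
from itertools import combinations


def gap_strip(alignment, gap_chars="-."):
    """Drop every alignment column in which any sequence has a gap."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i for i in range(length)
        if all(seq[i] not in gap_chars for seq in alignment)
    ]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def similarity(a, b):
    """Fraction of positions at which two equal-length sequences match
    (assumed definition; the source does not specify its scoring)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pairwise_similarities(alignment):
    """Gap-strip the alignment, then score every unordered pair."""
    stripped = gap_strip(alignment)
    return [similarity(a, b) for a, b in combinations(stripped, 2)]


# Hypothetical toy alignment, not real HIV data.
toy_alignment = ["ACGT-ACC", "ACGTTACG", "AC-TTACG"]
toy_stripped = gap_strip(toy_alignment)
toy_scores = pairwise_similarities(toy_alignment)
```

In this toy case the two gapped columns are removed from all three sequences, so any differences located in those columns no longer count. This is why gap-stripping tends to push similarity scores upward.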

Similarities greater than 99% were never observed, even in the protease alignment. Similarities over 98.5% were seen with the following frequencies: gp120 0%, gp41 0.05%, p17 0.05%, p24 0.5%, RT 2%, protease 7%.
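
Frequencies like those quoted above can be obtained by counting the share of pairwise scores that exceed a threshold. The sketch below shows one way to do this, assuming scores are stored as fractions between 0 and 1; the function name and the score list are hypothetical and are not taken from the database.

```python
def fraction_above(scores, threshold):
    """Return the fraction of scores strictly greater than threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(s > threshold for s in scores) / len(scores)


# Hypothetical pairwise similarity scores for one genome region.
example_scores = [0.80, 0.91, 0.95, 0.987, 0.99, 0.972, 0.88, 0.93]

over_985 = fraction_above(example_scores, 0.985)  # share of pairs above 98.5%
over_99 = fraction_above(example_scores, 0.99)    # share of pairs above 99%

print(f"over 98.5%: {over_985:.1%}")
print(f"over 99%:   {over_99:.1%}")
```

The comparison is strict, so a score of exactly 0.99 is not counted as greater than 0.99.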

last modified: Mon Oct 15 14:05 2007

Operated by Triad National Security, LLC for the U.S. Department of Energy's National Nuclear Security Administration