These histograms are based on pairwise similarity scores from alignments of a selected number of sequences in the HIV database. All sequences are from different individuals. These numbers form an upper bound on similarities that can be considered 'reasonable', because:
Similarities greater than 99% were never observed, even in the protease alignment. Similarities over 98.5% were seen with the following frequencies: gp120 0%, gp41 0.05%, p17 0.05%, p24 0.5%, RT 2%, protease 7%.
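As a rough illustration of how frequencies like these could be tallied, the sketch below scores every pair of sequences in an alignment and reports the fraction of pairs above a threshold. It is not the database's own procedure: the scoring rule (fraction of identical positions, skipping any column where either sequence has a gap), the function names, and the toy alignment are all assumptions made for the example.

```python
from itertools import combinations


def pairwise_similarity(a, b, gap="-"):
    """Fraction of identical positions between two aligned sequences.

    Assumption: columns where either sequence has a gap are skipped.
    The source does not say how its similarity scores treat gaps.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def fraction_above(alignment, threshold):
    """Fraction of all sequence pairs whose similarity exceeds threshold."""
    scores = [pairwise_similarity(a, b) for a, b in combinations(alignment, 2)]
    if not scores:
        return 0.0
    return sum(s > threshold for s in scores) / len(scores)


# Made-up toy alignment, for illustration only (not real HIV data).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "AC-TACGAAC",
]

if __name__ == "__main__":
    share = fraction_above(toy_alignment, 0.985)
    print(f"pairs above 98.5% similarity: {share:.1%}")
```

Running the same tally separately on each gene's alignment would give one percentage per region, comparable in form to the list above.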