The Molecules of HIV

Note: this site last updated in 2006

HIV genome

An article from "The Molecules of HIV" (c) Dan Stowell

The full HIV genome is encoded on one long strand of RNA. (A free virus particle actually carries two separate strands of RNA, but both are copies of the same genome.)

RNA is the form the genome takes in a free virus particle. When the virus is integrated into the host's DNA genome (as a provirus), its information is encoded in DNA as well.
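As a rough illustration of "the same information in RNA or in DNA", the short sketch below rewrites an RNA sequence in DNA letters by swapping U for T. The function name and the toy sequence are made up for this example, and the sketch deliberately ignores how reverse transcription actually works in the cell.

```python
def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence in DNA letters (U becomes T).

    This only changes the alphabet; it is not a model of
    reverse transcription itself.
    """
    return rna.upper().replace("U", "T")


# Toy sequence for illustration only, not a real stretch of the HIV genome.
toy_rna = "AUGGGUGCGAGA"
print(rna_to_dna(toy_rna))  # ATGGGTGCGAGA
```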

The following image shows roughly how the genes are laid out in HIV (remember that HIV-1 and HIV-2 are quite different).

A rough map of the genomic layout of HIV

This diagram is based on a fantastic map of the HIV-1, HIV-2, and SIV genomes, available at


The genes in HIV's genome are as follows:

  • gag (coding for the viral capsid proteins)
  • pol (notably, coding for reverse transcriptase)
  • (NB. gag and pol together can be expressed in one long strand called "gag-pol")
  • env (coding for HIV's envelope-associated proteins)
  • And the regulatory genes:
      • tat
      • rev
      • nef
      • vif
      • vpr
      • vpu (N.B. not present in HIV-2)
      • vpx (N.B. not present in HIV-1)
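The gene list above can be written down as a small lookup table, which makes the HIV-1 / HIV-2 difference easy to query. This is only a sketch: the dictionary layout and the helper names (GENES, genes_in, only_in) are invented for this example, and it assumes that every gene without an "N.B." in the list is present in both viruses.

```python
# Which virus carries which gene, following the list above.
# Assumption: genes listed without an "N.B." are present in both viruses.
BOTH = {"HIV-1", "HIV-2"}
GENES = {
    "gag": BOTH,
    "pol": BOTH,
    "env": BOTH,
    "tat": BOTH,
    "rev": BOTH,
    "nef": BOTH,
    "vif": BOTH,
    "vpr": BOTH,
    "vpu": {"HIV-1"},  # not present in HIV-2
    "vpx": {"HIV-2"},  # not present in HIV-1
}


def genes_in(virus: str) -> list[str]:
    """Return the genes listed for the given virus, in list order."""
    return [gene for gene, viruses in GENES.items() if virus in viruses]


def only_in(virus: str) -> list[str]:
    """Return the genes listed for this virus and no other."""
    return [gene for gene, viruses in GENES.items() if viruses == {virus}]


print(only_in("HIV-1"))  # ['vpu']
print(only_in("HIV-2"))  # ['vpx']
```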

The HIV genome also has a "Long Terminal Repeat" (LTR) at each end. The LTR is not quite a gene, but a sequence of RNA/DNA that is the same at either end and serves structural and regulatory purposes (for example, in the provirus the LTR contains the promoter from which the viral genes are transcribed).
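To make "the same at either end" concrete, here is a minimal sketch that looks for the longest stretch appearing at both the start and the end of a sequence. The function name and the toy sequence are invented for illustration; a real LTR is hundreds of bases long, and real analysis would use proper sequence tools rather than exact string matching.

```python
def terminal_repeat(seq: str) -> str:
    """Return the longest non-overlapping stretch found at both ends of seq.

    Returns an empty string if the two ends share nothing.
    """
    # Try the longest candidate first; half the sequence is the most
    # that can sit at both ends without overlapping.
    for length in range(len(seq) // 2, 0, -1):
        if seq[:length] == seq[-length:]:
            return seq[:length]
    return ""


# Toy sequence for illustration only: a made-up "repeat" around a made-up middle.
toy_genome = "GGTCTC" + "AAACCCGGGTTT" + "GGTCTC"
print(terminal_repeat(toy_genome))  # GGTCTC
```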

Written by Dan Stowell

Creative Commons License