HIV
JBrowse
Click on the following button to browse the HIV genome using JBrowse
*Note that the Hum.wig track is a plot of conservation scores between all 4 retroviruses
Starting sites/databases
The following sites and databases are excellent starting points for investigating the HIV genome:
- Wikipedia
Everybody's favorite online encyclopedia. Here you will find an excellent overview of HIV along with external links and references.
- LANL HIV Database
A database dedicated to HIV, hosted by the Los Alamos National Laboratory. This is one of the most comprehensive online databases available for HIV. You can browse sequence, immunology, resistance, and vaccine trial information. The database also offers readily available tools for performing bioinformatics analysis on HIV sequences. For instance, the site offers a pre-constructed HMM for HIV and automatically configures alignment using MAFFT. All you have to do is input the query sequence! There are also a lot of other nifty tools on this website, like gene cutters and tree builders. Definitely worth exploring for anyone interested in HIV.
- Stanford University HIV Drug Resistance Database
This is a curated database of information on HIV drug resistance. It is hosted by Stanford University, where there is very active research on HIV antiretroviral therapy. The database offers genotype-phenotype information as well as genotype-clinical correlations. There are also several programs available for browsing HIV mutations and the results of antiretroviral therapy treatments (data from real patients!).
- RDI: HIV Resistance Response Database Initiative
A smaller database hosted by a research group in the UK along with international advisors and collaborators. The database contains information related to clinical applications in HIV treatment.
- AEGIS
AIDS Education Global Information System. This database focuses on the global health and educational aspects of HIV rather than the molecular/bioinformatic aspects. It contains a wealth of information on HIV education and HIV developments, aimed at a general audience. The site is updated daily with news regarding HIV from around the world.
- AIDS/HIV Surveillance Database
Hosted by the U.S. Census Bureau. This database contains census and survey information regarding AIDS in the United States and several other countries.
Biological Overview of HIV
*Overview information taken and summarized from Wikipedia and from the NCBI genome database for HIV-1 and HIV-2
- Classification
Family: Retroviridae
Genus: Lentivirus
Species: HIV-1 and HIV-2
Of the two known species, HIV-1 was discovered first. It is more virulent and is the cause of most HIV infections today. HIV-2 is less virulent and largely confined to West Africa.
- Genomic Overview
A cross-referenced genomic overview is available at NCBI for HIV-1 and HIV-2. To summarize: the HIV-1 genome has a length of 9181 nucleotides, with a GC content of 42%. There are 9 proteins/protein-coding genes in the genome. These are given the names
Gag,
Pol,
Vif,
Vpr,
Tat,
Rev,
Vpu,
Env, and
Nef.
The HIV-2 genome has a length of 10359 nucleotides, with a GC content of 45%. There are 9 proteins/protein coding genes in the genome. These are given the names
Gag,
Pol,
Vif,
Vpx,
Vpr,
Tat,
Rev,
Env, and
Nef.
These proteins may form dimers or be cleaved into subunits to serve different functions. Note that multiple names have also been used in the literature to refer to these proteins or to variations/mutations of them. This makes curation of information regarding these proteins a very difficult task. For instance, the encoded Env precursor protein is cleaved into gp41 and gp120 depending on the viral stage. There are many (an estimated 19) different HIV proteins if we account for the subsequent cleavage of the aforementioned precursor proteins.
However, it is worth noting that many portions of the NCBI cross-referenced HIV genome information are several years old. Previously, proteins were often referred to by their gene names. In more recent years, research has revealed that there are several sub-proteins within each gene, which had previously been thought to encode a single protein. For example, the Pol protein is a precursor to many important enzymatic proteins such as reverse transcriptase (RT). A schematic overview of the more recently described HIV proteins is shown as follows:
*image taken from Felix Voigts-Hoffman's HIV structural biology lecture
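The length and GC content figures quoted in this genomic overview can be recomputed from any downloaded genome sequence. The following is a minimal Python sketch, assuming the sequence is available as a single-record FASTA string. The function names `gc_content` and `read_fasta_sequence` and the toy record are illustrative only and are not part of the original analysis.

```python
def read_fasta_sequence(text: str) -> str:
    """Concatenate the sequence lines of a single-record FASTA string."""
    return "".join(
        line.strip()
        for line in text.splitlines()
        if line and not line.startswith(">")
    )


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the result.
    counted = [base for base in seq if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T/U bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Toy record (NOT a real HIV sequence), used only to show the calculation.
toy_fasta = ">toy_record\nGGCCAATT\nGCAT\n"
toy_seq = read_fasta_sequence(toy_fasta)
print(len(toy_seq), gc_content(toy_seq))
```

Running the same two functions on the downloaded HIV-1 or HIV-2 record should give values close to the ones listed above, although the exact numbers depend on which reference record is used.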
- Morphology
HIV has a roughly spherical shape with a diameter of around 120 nm. It contains two copies of positive-sense single-stranded RNA, which code for the virus's nine genes and are enclosed by a conical protein capsid. Surrounding the capsid is a matrix that protects the integrity of the viral RNA. Beyond the matrix is a viral envelope containing the Env viral protein (which is made up of a cap consisting of 3 gp120 subunits and a stem consisting of 3 gp41 subunits). A scanning electron micrograph of the virus can be seen above. The colored picture shows HIV virions as green buds emerging from a lymphocyte. The grayscale picture shows a close-up of 2 HIV virions.
- Infection Mechanism
HIV belongs to the genus of lentiviruses. Like other lentiviruses, it is transmitted as a single-stranded, positive-sense, enveloped RNA virus. HIV enters macrophages by the adsorption of its surface glycoproteins onto the target cell. After HIV has bound to the target cell, the HIV RNA and various enzymes, including reverse transcriptase, are injected into the cell. The single-stranded viral genome is then reverse-transcribed into double-stranded DNA by reverse transcriptase, and this DNA is incorporated into the host genome by the viral integrase. From there, the viral genetic material is able to propagate through the host's mechanisms of duplication, transcription, and translation. There is some variation in the infection process between HIV-1 and HIV-2. Interestingly, HIV-1 and HIV-2 appear to package their RNA differently: HIV-1 will bind to any appropriate RNA, whereas HIV-2 will preferentially bind to the mRNA that was used to create the Gag protein.
- Pathology
HIV is primarily transmitted by one of three routes: sexual contact, blood, or mother to child. The majority of HIV cases result from sexual transmission. This kind of transmission can occur when the sexual secretions (e.g. semen) of one partner come into contact with the genital, oral, or rectal mucous membranes of another partner. A NIAID study indicates that correct condom usage can reduce the risk of such transmission by over 85%. HIV may also be transmitted through blood, such as when HIV-infected blood comes into contact with an open wound or when a transfusion is performed with HIV-infected blood. Although this is now very rare in most developed countries, where all transfusion blood is screened for HIV, it often remains a problem in developing countries, where resources are not readily available for widespread screening. A more common source of blood transmission is the use of unsanitary needles for injections. This most often occurs among intravenous drug users and remains an issue in developed countries. The third route is mother-to-child transmission, which can occur during pregnancy, at childbirth, or through breastfeeding. Coovian, MD (PubMed) estimates that in the absence of treatment, the rate of transmission from mother to child is roughly 25%. However, medical interventions such as antiretroviral drugs and C-section delivery can drastically lower the transmission rate, to as low as 1%.
- Engineering Applications
In terms of bioengineering retroviruses, lentiviruses such as HIV, SIV, and BIV have recently been considered as gene therapy vectors for non-dividing or slow-growing cells as well as prolific cells. Notably, lentiviruses have been used to induce pluripotency in human cells by viral transduction of Oct4, Sox2, Klf4, and c-Myc.
- Interesting note regarding HIV and SIV
Due to the significant similarities between SIV and HIV, some scientists consider AIDS a zoonosis; that is, at some point SIV was transferred to a human. There are a number of zoonosis theories, but none is proven and all are still questioned. One of them is the Hunter theory, in which the blood of killed or eaten chimps enters the wounds of a hunter, and SIV by chance is able to adapt and develop into HIV. Other theories include the contaminated needle theory, the colonialism theory, and the oral polio vaccine theory. In part, this uncertainty is what drives research into understanding SIV. Moreover, the simian AIDS phenotype looks very similar to the human AIDS phenotype, so efforts to cure SIV may have implications for an HIV cure.
There are numerous strains of SIV, which affect different species of primates. The earliest discovered strain was found in macaques; strains discovered later infect gorillas, chimpanzees, sooty mangabeys, mandrills, and other primates. Likewise, the two strains of HIV, the worldwide HIV-1 and the predominantly West African HIV-2, are broken down into subgroups from A to K. This has led to work on SIV and HIV phylogeny and on understanding the evolution of each strain of each virus, in hopes of finding the link between the two, and possibly the root of both viruses.
Interestingly, a research group has merged the SIV and HIV viruses to form a chimeric SHIV virus. In 1992, Shibata and Adachi successfully created an HIV-1/SIV chimera, called the NM-3 virus, which was found to infect a macaque, in hopes of finding a vaccine for HIV-1. Building on that technology, a group more recently transfected stem cells with a “937-bp antisense HIV-1 envelope sequence” to inhibit the replication of SIV/HIV-1 chimera viruses with HIV-1 envelopes in rhesus bone marrow cells. And in 2003, another group modified HIV Env by replacing some of its sequence with HIV-1/SIV chimera sequence in order to increase trimerization of the HIV-1 gp140 protein, implicitly in hopes of using this new virus to induce the creation of better antibodies in a host for a possible HIV-1 vaccine.
Protein Alignments
For more information on protein alignments, you can check out my protein alignment tutorial, available on my BioWiki space.
The gag, pol, and envelope polyproteins appear to be highly conserved within the four immunodeficiency viruses. Below are multiple sequence alignments with conservation scores for the gag, pol, and envelope proteins of BovineIV, FelineIV, HIV-1, HIV-2, and SimianIV. The multiple sequence alignments were generated using MAFFT and viewed in JalView. A tree was also constructed for each protein using an average distance method between the sequences. Please refer to the individual sections of the viruses to browse their genomes and find information specific to each virus.
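To make the idea of a per-column conservation score concrete, here is a minimal Python sketch. It is an assumption-laden stand-in: it scores each column by the fraction of sequences that share the most common residue, which is simpler than the physicochemical conservation score JalView displays. The function name `column_conservation` and the toy alignment are hypothetical and are not taken from the alignments on this page.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each column by the fraction of sequences sharing its most common residue.

    Gap characters ("-") never count as the shared residue, so gappy columns score low.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for col in range(length):
        residues = [row[col] for row in alignment if row[col] != "-"]
        if not residues:
            scores.append(0.0)  # column made up entirely of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(alignment))
    return scores


# Toy alignment (NOT real viral sequence), four rows of equal length.
toy_alignment = [
    "MKV-LA",
    "MKVQLA",
    "MRV-LG",
    "MKIQLA",
]
print(column_conservation(toy_alignment))
```

For the toy input, the fully identical columns score 1.0 and the half-gapped column scores 0.5, which is the kind of profile plotted under an alignment.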
MSA and tree for the Gag polyprotein
Multiple sequence alignment
MSA and tree for the Pol polyprotein
Multiple sequence alignment
MSA and tree for the Env polyprotein
Multiple sequence alignment and trees
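For readers curious how an average distance tree is built from an alignment, the following Python sketch shows one common formulation, UPGMA, using the fraction of mismatched positions as the distance. This is an illustration under stated assumptions rather than the exact procedure JalView runs: the distance measure, the function names `p_distance` and `upgma`, and the toy sequences are all hypothetical, and the output is a topology only, without branch lengths.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions, skipping columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(names, seqs):
    """Cluster aligned sequences by average distance and return a Newick-style topology.

    Sequence names must be unique because they are used as cluster labels.
    """
    # Each cluster is keyed by its label and stores the indices of its member sequences.
    clusters = {name: [i] for i, name in enumerate(names)}
    leaf_dist = {
        (i, j): p_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }

    def average_distance(c1, c2):
        # Mean of all leaf-to-leaf distances between the two clusters.
        total = sum(
            leaf_dist[(min(i, j), max(i, j))]
            for i in clusters[c1]
            for j in clusters[c2]
        )
        return total / (len(clusters[c1]) * len(clusters[c2]))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        clusters[f"({c1},{c2})"] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters)) + ";"


# Toy aligned sequences (NOT real viral proteins).
toy_names = ["A", "B", "C", "D"]
toy_seqs = ["MKVLAT", "MKVLGT", "MRILAS", "MRILSA"]
print(upgma(toy_names, toy_seqs))
```

In the toy data, A and B differ at one position and C and D at two, so those pairs are joined first and the two pairs are joined last.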
From the analysis of these three proteins alone, we cannot conclude the evolutionary relationship between these four retroviruses. More genes must be brought in to make more reliable conclusions about the evolutionary relationship at the species level. Jenn Brophy (who is working on the annotation of Rhinovirus) suggests an evolutionary species tree in the context of more viruses. Her suggested tree can be accessed at:
http://biowiki.org/view/Fall09/Fall09VirusPhylogeneticTree
SATCHMO alignment for HIV Reverse Transcriptase
SATCHMO is a progressive alignment program that uses a different algorithm from MAFFT, which combines progressive and iterative alignment approaches. I have chosen the Reverse Transcriptase protein for further investigation using MSA and HMM approaches.
HMM Construction for HIV Reverse Transcriptase
The HMM was constructed by first gathering homologs of HIV reverse transcriptase using PSI-BLAST. The SAM w0.5 program was then used to construct an HMM from the gathered homologs using 3 iterations. The exact commands to do so are covered in my protein alignment tutorial. After HMM construction, the reverse transcriptases of all 4 of our lentiviruses were aligned against the HMM and visualized using Belvu.
As one can see, the two approaches to alignment produced different results. SATCHMO uses a progressive alignment and constructs an HMM at each node of the tree, whereas in the second approach an HMM was manually constructed from the homologs and the four sequences of interest were aligned against it. Nevertheless, the two methods appear to agree as a whole. That is, both results indicate that there are strong patterns of conservation between the reverse transcriptases of the 4 different viruses. In particular, the region from residue 80 to residue 130 is especially conserved. There might be a functionally important domain at this location. Further bioinformatics analysis and ultimately experimental validation must be performed to support such a hypothesis.
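A conserved stretch such as the one noted above can also be located programmatically instead of by eye. The sketch below is a minimal Python example, assuming a list of per-column conservation scores has already been computed by some means. The function name `most_conserved_window` and the toy scores are hypothetical and do not come from the reverse transcriptase alignments.

```python
def most_conserved_window(scores, width):
    """Return (start, end, mean_score) for the highest-scoring window.

    Coordinates are 1-based and inclusive, matching how residues are usually numbered.
    If several windows tie, the leftmost one is returned.
    """
    if width <= 0 or width > len(scores):
        raise ValueError("width must be between 1 and len(scores)")

    window_sum = sum(scores[:width])
    best_sum, best_start = window_sum, 0
    for start in range(1, len(scores) - width + 1):
        # Slide the window one column to the right: add the new column, drop the old one.
        window_sum += scores[start + width - 1] - scores[start - 1]
        if window_sum > best_sum:
            best_sum, best_start = window_sum, start
    return best_start + 1, best_start + width, best_sum / width


# Toy per-column scores (NOT derived from real data).
toy_scores = [0.2, 0.4, 0.9, 1.0, 0.8, 0.3, 0.1]
start, end, mean_score = most_conserved_window(toy_scores, 3)
print(start, end, round(mean_score, 3))
```

With a window width of about 50 columns, the same function could be used to check whether the region picked out visually is also the top-scoring one, keeping in mind that alignment columns and residue numbers only coincide when there are no gaps.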
Protein structures visualization
I have made links from my JBrowse to several HIV proteins. Here, I will post PyMOL visualizations of some important HIV proteins for which I was able to find a PDB structure. The visualizations are all linked to their PDB database entries. Note that many protein structures are not solved for the isolated protein. Rather, they are solved for the protein in complex with other molecules. This may be due to difficulties in crystallizing or isolating the protein. It may also be that the complex is intentionally solved so that researchers can examine the interactions between the HIV protein and other proteins or molecules (e.g. antiretroviral drugs).
Molecular structure for the HIV-1 gp120 trimer in the unliganded state
PDB ID: 3DNN
HIV GP41 CORE STRUCTURE
PDB ID: 1AIK
HIV-1 CAPSID PROTEIN (P24) COMPLEX WITH FAB25.3
PDB ID: 1AFV
crystal structure of HIV reverse transcriptase in complex with inhibitor 3
PDB ID: 3IOR
HELICAL STRUCTURE OF POLYPEPTIDES FROM THE C-TERMINAL HALF OF HIV-1 VPR, NMR, 20 STRUCTURES
PDB ID: 1BDE
*Note that here, the protein fragment is visualized as sticks rather than a cartoon schematic. This is because the cartoon schematic looks rather dull since it is only a single alpha helix.
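The views above can be approximated with a few PyMOL commands. As a rough sketch, the Python helper below only generates the text of a small PyMOL script for a given PDB ID, so it runs without PyMOL installed. The helper name `pymol_script` is hypothetical, and the exact settings used for the images on this page are not recorded here, so the representation choices are assumptions.

```python
def pymol_script(pdb_id, representation="cartoon"):
    """Return PyMOL command lines that fetch a PDB entry and show it in one representation."""
    allowed = {"cartoon", "sticks", "lines", "spheres", "surface"}
    if representation not in allowed:
        raise ValueError(f"representation must be one of {sorted(allowed)}")
    return "\n".join(
        [
            f"fetch {pdb_id}",          # download the entry from the PDB
            "hide everything",          # clear the default display
            f"show {representation}",   # draw the chosen representation
            "orient",                   # center and align the view on the molecule
        ]
    )


# Cartoon for the larger structures, sticks for the single-helix Vpr fragment.
for pdb_id, style in [("3DNN", "cartoon"), ("1AIK", "cartoon"), ("1BDE", "sticks")]:
    print(pymol_script(pdb_id, style))
    print()
```

Each printed block can be saved to a `.pml` file and opened with PyMOL, or the commands can be typed directly at the PyMOL prompt.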
Community
My contributions
- Constructed team.retro website
- Wrote protein alignment tutorial
- Helped with comparisons of lentiviruses between team members through tree construction and alignments
- Contributed to community resource pools
Acknowledgements
Many thanks to Kamran Ali for getting JBrowse up and running. He provided excellent help with annotating and debugging in JBrowse. Thanks to OMGBrowse for the webspace to host our website and for their awesome communal JBrowse. Thanks in general to the excellent students of BioE131, who were very open and helpful in contributing and distributing community resources such as tutorials.