Bioinformatics - Part 3

The Human Genome Project

Discovering and detecting genetic diseases, as well as treating them with gene therapy, requires information from our DNA. This field of research expects a lot from the Human Genome Project (HGP), an international collaboration that started in the late 1980s.

This project reached its objective in 2001: sequencing all of the 3·10⁹ base pairs of the human genome. It is now possible to find the sequence of any human gene in a databank.

Although this represents enormous progress for research, it is only the tip of the iceberg. The next step, identifying the genes and assigning a function to each of them, will take many more years. Numerous laboratories are now working on this challenge. You can compare the current situation to learning a new language: you know the words, but not the grammar. Or you can compare it to cooking: you have all your ingredients, but you still lack a recipe.

The HGP pays special attention to the natural variation in DNA sequence between the genomes of different individuals. On average, two people differ in 1 out of every 1000 nucleotides. These differences, which lead to diversity in the population, are called "snips" (SNP: Single Nucleotide Polymorphism). They determine, for example, the color of your hair, your eyes or your skin.
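To make the "1 out of 1000" figure concrete, here is a minimal Python sketch. It assumes the two sequences are already aligned and of equal length; the example sequences and the function name `snp_positions` are made up for illustration, and real SNP detection works on far larger, properly aligned data.

```python
def snp_positions(seq_a, seq_b):
    """Return the positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Two short, invented DNA fragments from two hypothetical individuals.
person_1 = "ACGTTAGCCATG"
person_2 = "ACGTTAGCGATG"
snps = snp_positions(person_1, person_2)  # positions are counted from 0

# Back-of-the-envelope estimate for the whole genome:
# 3·10^9 base pairs, with about 1 difference per 1000 nucleotides.
GENOME_SIZE = 3 * 10**9
expected_differences = GENOME_SIZE // 1000

print("SNP positions:", snps)
print("Expected differences between two people:", expected_differences)
```

With the numbers given in the text, two people would be expected to differ at roughly three million positions.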

In some cases, the variation of a single nucleotide in a gene can lead to malfunction of the protein it codes for, thus causing a disease. In that case, we talk about a real mutation, which can give rise to diseases such as diabetes, cancer, vascular diseases and, unfortunately, many more. The databank OMIM contains information about all known mutations in humans. Almost every day, new DNA tests are developed to easily detect such diseases.
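The effect of a single-nucleotide change on a protein can be sketched in a few lines of Python. The sketch below uses the standard genetic code; the nine-letter "gene" and the helper names `translate` and `point_mutation` are invented for illustration and do not correspond to a real human gene.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def point_mutation(dna, position, new_base):
    """Return a copy of dna with one nucleotide replaced (position counts from 0)."""
    return dna[:position] + new_base + dna[position + 1:]


normal_gene = "ATGGAGCTT"                           # invented toy sequence
mutated_gene = point_mutation(normal_gene, 4, "T")  # GAG becomes GTG

print(translate(normal_gene))   # protein from the normal gene
print(translate(mutated_gene))  # protein from the mutated gene
```

Here one changed letter turns the codon GAG (glutamic acid, E) into GTG (valine, V), so the protein itself changes. Whether such a change causes a disease depends on the role of that amino acid in the protein.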

Variations in DNA sequences are probably also involved in how people respond differently to the same treatment. Understanding how this works may greatly improve current efforts in drug development.

Homology - How different are we?

To this day, over 50 animal genomes have been sequenced completely. These include the genomes of the human, the chimpanzee, the mouse, the rat, the dog, the cow, the opossum, and the chicken. In addition, hundreds of bacterial genomes are known. Because these are much smaller, it is quicker and easier to sequence them.

All these genome sequencing projects gave rise to some surprising results: the chimpanzee and human genomes differ by only 1%, and the human and dog genomes differ by only 7.5%.

People look a lot like dogs

Bioinformatics relies heavily on two concepts: homology and evolution. We all descend from one common ancestor. This ancestor had specific proteins, which have evolved over time and across species. Because of that common ancestry, our genes must be similar to those of other species. We sequence many genomes in order to compare genes and proteins. This gives us information about how species evolved, but it also enables us to transfer information from one species to another.

Imagine that scientists sequence a human gene (gene X), coding for protein X, with an unknown function. In the monkey, a similar gene (gene Y) codes for protein Y, the function of which is known.
If the amino acid sequence of protein X is sufficiently similar (homologous) to that of protein Y, a prediction can be made about the function of protein X: the function, and probably also the 3D structure, will be the same as those of protein Y. This kind of information transfer is used frequently in bioinformatics.
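The idea of transferring a function from protein Y to protein X can be sketched in Python as follows. This is a simplified illustration, not how real homology searches work: it assumes the two sequences are already aligned and of equal length, whereas real tools first compute an alignment. The sequences, the function label and the 40% identity threshold are all assumptions chosen for the example.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def predict_function(unknown_seq, known_seq, known_function, threshold=40.0):
    """Transfer the known function if the sequences are similar enough.

    The threshold is an illustrative assumption, not an established cutoff.
    Returns None when the similarity is too low to make a prediction.
    """
    if percent_identity(unknown_seq, known_seq) >= threshold:
        return known_function
    return None


# Invented amino acid sequences for the example in the text.
protein_x = "MKTAYIAKQR"  # human protein X, function unknown
protein_y = "MKTAYLAKQR"  # monkey protein Y, function known

identity = percent_identity(protein_x, protein_y)
prediction = predict_function(protein_x, protein_y, "oxygen transport")

print(f"Identity: {identity:.1f}%")
print("Predicted function of protein X:", prediction)
```

The two toy proteins differ at one position out of ten, so the known function of protein Y is transferred to protein X. With a very dissimilar pair, the function returns None and no prediction is made.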

In conclusion

You now know why bioinformaticians use computers and you have seen some key concepts of bioinformatics. The easiest way to get a real feel for what bioinformatics can do for you is to use the tools it provides. You can search databanks to solve a crime and look at protein structures to better understand diseases. You can even design drugs in 3D. It all happens in the "In practice" section of this site.
