
Discovering and detecting genetic diseases, as well as treating them with gene therapy, requires information from
our DNA. This field of research expects a lot from the Human Genome Project (HGP), an international collaboration that started in the
late 1980s.
This project reached its objective in 2001: sequencing all of the 3·10^9 base pairs of the human genome. It is now possible to find the sequence of any human gene in a
databank.
Although this represents enormous progress for research, it is only the tip of the iceberg. The next step, identifying the genes and assigning a function to each of them, will take many more years.
Numerous laboratories are now working on this challenge.
You can compare the current situation to learning a new language: you know the words, but no grammar... Or you can compare it
to cooking: you have all your ingredients, but you still lack a recipe.
The HGP pays special attention to the natural variation in DNA sequence between the genomes of different individuals. On average, two people differ in about 1 out of every 1000
nucleotides. These differences, which lead to diversity in the population, are called SNPs (Single Nucleotide Polymorphisms, pronounced "snips"). They help determine, for example, the color of your hair, your eyes or your skin.
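To make the "1 out of 1000" figure concrete, here is a minimal Python sketch. It assumes two DNA fragments that are already aligned and of equal length; the function name find_snp_positions and both fragments are made up for illustration and do not come from any real individual or databank.

```python
def find_snp_positions(seq_a, seq_b):
    """Return the 0-based positions where two aligned DNA sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Hypothetical 20-nucleotide fragments from two individuals (illustration only).
person_1 = "ATGCCGTAACGTTAGCCTAG"
person_2 = "ATGCCGTAACGTCAGCCTAG"

snp_positions = find_snp_positions(person_1, person_2)
print("Differing positions:", snp_positions)

# Back-of-the-envelope estimate using the figures from the text:
# a genome of 3 * 10**9 base pairs, differing at 1 in 1000 nucleotides.
GENOME_SIZE = 3 * 10**9
expected_differences = GENOME_SIZE // 1000
print("Expected differences between two people:", expected_differences)
```

With the numbers given in the text, two people would differ at roughly 3 million positions across the whole genome.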
In some cases, the variation of a single nucleotide in a gene can lead
to malfunction of the protein it codes for, thus causing a disease. In
that case, we talk about a real mutation, which gives rise to diseases
such as diabetes, cancer, vascular diseases and, unfortunately, many
more. The OMIM databank contains information about all known mutations in humans. Almost every day, new DNA tests are developed to detect such diseases easily.
Variations in DNA sequences are probably also involved in how people
respond differently to the same treatment. Understanding how this works
may greatly improve the current efforts in drug development.
To this day, over 50 animal genomes have been sequenced completely. These
include human, chimpanzee, mouse, rat, dog, cow, opossum,
and the chicken. In addition, hundreds of bacterial genomes are known. Because these are much smaller, it is quicker and easier to sequence them.
All these genome sequencing projects produced some surprising results: the difference between the
chimpanzee and human genomes is only 1%, and the human and dog genomes
differ by only 7.5%.
Bioinformatics relies
heavily on two concepts: homology and evolution. We all descend from one common ancestor. This ancestor had specific proteins, which have evolved over time and across species.
Because of that common ancestry, our genes must be similar to those of
other species. We sequence many genomes in order to compare genes and
proteins. This gives us information about how species evolved, but it
also enables us to transfer information from one species to another...
Imagine that scientists sequence a human gene (gene X), coding for
protein X, with an unknown function. In the monkey, a similar gene (gene
Y) codes for protein Y, the function of which is known.
If the amino acid sequence of protein X is sufficiently similar (homologous)
to that of protein Y, a prediction can be made about the function of protein X:
the function, and probably also the 3D structure, is likely to be the same as
that of protein Y. This kind of information transfer is used frequently in
bioinformatics.
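The idea of transferring a function from protein Y to protein X can be sketched in a few lines of Python. This is only an illustration under several assumptions: the two sequences are already aligned (gaps written as "-"), similarity is measured as simple percent identity, and the 40% threshold is an arbitrary value chosen for this example. The sequences, function names and annotation text are all hypothetical. Real tools use more refined alignment and scoring methods.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical residues between two aligned sequences.

    Positions where either sequence has a gap ("-") are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def transfer_annotation(query, reference, reference_function, threshold=40.0):
    """Return the reference's function if the query is similar enough, else None.

    The threshold is an assumption made for this sketch, not an established cutoff.
    """
    if percent_identity(query, reference) >= threshold:
        return reference_function
    return None


# Hypothetical aligned amino acid fragments (illustration only).
protein_x = "MKTAYIAKQR"  # human protein X, function unknown
protein_y = "MKTAYLAKQR"  # monkey protein Y, function known

print("Identity (%):", percent_identity(protein_x, protein_y))
print("Predicted function of X:",
      transfer_annotation(protein_x, protein_y, "hypothetical function of protein Y"))
```

Here the two fragments differ at one position out of ten, so the predicted function of protein X is simply copied from protein Y. A prediction like this remains a hypothesis until it is confirmed in the laboratory.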
You now know why bioinformaticians use computers and you have seen some key concepts of bioinformatics. The easiest way to get a real feel for what bioinformatics can do for you is to use the tools it provides. You can search databanks to solve a crime and look at protein structures to better understand diseases. You can even design drugs in 3D. It all happens in the "in practice" section of this site.