Islamic Educational, Scientific and Cultural Organization - ISESCO -


 

 Chapter 4

 Methodology of Sequencing the Human Genome

 

 

The techniques used to read the whole Human Genome (3 billion letters of code) are very sophisticated and time-consuming. The main tools these techniques rely on are:

1- Genetically modified bacteria (GM bacteria) that harbour pieces of the Human Genome.

2- Huge computers that can perform millions of calculations.

3- Robots that organise and collate information generated by the computers.

The concept of sequencing Human DNA is nevertheless the same as that pioneered in 1977 by the double Nobel Laureate Fred Sanger, who developed it to read the 5,375 letters of the genetic code of a simple virus. As Sanger remarked, “there is a lot of robotics now; we had to measure out things with pipettes and test-tubes”.

The DNA molecule is too big to be read in one step, and the scanning tunnelling microscope, which can take pictures of DNA, is not yet good enough for the task. The method adopted to read the whole Genome is therefore to break it down into manageable, readable pieces containing a limited number of letters (“base pairs”). Once these small pieces, each about 500 letters long, have been sequenced and read, the remaining problem is how to put them together again in the right positions, which is done by means of overlapping sequences.
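The idea of rejoining pieces through overlapping sequences can be illustrated with a minimal Python sketch. The function names, the toy fragments and the `min_overlap` threshold are assumptions made for illustration only; real reads are far longer and contain errors.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that is also a prefix of `right`."""
    longest_possible = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the best one.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two pieces on their overlap, or return None if they do not overlap enough."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# Two hypothetical pieces that share the six letters "CGTTAG".
piece_a = "ATGCGTACGTTAG"
piece_b = "CGTTAGGCTAAC"

print(overlap_length(piece_a, piece_b))  # 6
print(merge(piece_a, piece_b))           # ATGCGTACGTTAGGCTAAC
```

The longer the required overlap, the less likely it is that two pieces are joined by chance, which is why a minimum overlap is imposed here.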

There are two approaches to sequencing the whole Genome:

A- The approach first adopted by the Human Genome Project consortium, which is funded by public money. Its strategy is based on two steps and is called the two-step shotgun process. The rationale is that the Human genetic code is so huge that an intermediate step is needed to obtain a rough map of the Genome, as illustrated in Fig (18) and explained below:

* The 3-billion-letter Genome is broken down by shotgun into pieces of DNA whose lengths vary between 40,000 and 200,000 letters. Each of these fragments is tagged with a unique identification tag, which helps to identify the order of the fragments.

* The tagged fragment is ligated into a bacterial artificial chromosome (BAC).

* The BAC is then cloned into a bacterium to make more copies of that fragment. Whenever the bacterium divides, it multiplies not only its own genetic message but also the foreign piece of DNA that has been inserted. As a result, millions of copies of the BAC can be made and studied in the further detail needed for the genetic map.

* Landmarks on these big pieces are identified so that overlapping pieces can be recognised and the Genome put back together, resulting in what they call a map. As Dr. Sulston, the director of the Sanger Centre at Cambridge and a major contributor to the Genome project, said, “you construct the Genome as a jigsaw puzzle, at the level of 40000-200000 bases”. Another analogy was drawn by Steve Jones (1993): “The positions of the cuts (Like those of the words and, but and banana) provide a set of landmarks along the DNA. Once we know where they are we have made a first step to making a physical map of the book itself based on the order of the letters and words it contains. The process is close to that carried out by the students who stormed the American Embassy in Tehran after the fall of the Shah. With extraordinary labour they pieced together secret documents which had been put through a shredding machine. By seeing how the individual fragments fitted together the students reconstituted a long, complicated and compromising message”.

* This preliminary chromosomal map is used to locate the smaller pieces, which are sequenced (“read”) from one end of the segment to the other.
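The mapping step above, in which big pieces are ordered by the landmarks they share, can be sketched as follows. This is only an illustration under strong assumptions: the clone names and landmark labels are hypothetical, and each clone is assumed to share landmarks only with its immediate neighbours, so that the clones form one simple chain. Real mapping data is noisier and needs more robust methods.

```python
from typing import Dict, List, Set


def order_clones(clones: Dict[str, Set[str]]) -> List[str]:
    """Order clones along the chromosome using the landmarks they share.

    Assumes every clone shares landmarks only with its immediate neighbours,
    so the clones form a single chain with exactly two end clones.
    """
    names = sorted(clones)
    # Two clones are neighbours when they have at least one landmark in common.
    neighbours = {
        name: [other for other in names
               if other != name and clones[name] & clones[other]]
        for name in names
    }
    ends = [name for name in names if len(neighbours[name]) == 1]
    if len(ends) != 2:
        raise ValueError("clones do not form a single simple chain")

    # Walk from one end of the chain to the other.
    order = [ends[0]]
    while len(order) < len(names):
        candidates = [n for n in neighbours[order[-1]] if n not in order]
        if len(candidates) != 1:
            raise ValueError("ambiguous or broken chain of clones")
        order.append(candidates[0])
    return order


# Hypothetical clones, each described by the set of landmarks found on it.
clones = {
    "clone_1": {"m3", "m4", "m5"},
    "clone_2": {"m1", "m2", "m3"},
    "clone_3": {"m5", "m6"},
    "clone_4": {"m6", "m7", "m8"},
}

print(order_clones(clones))  # ['clone_2', 'clone_1', 'clone_3', 'clone_4']
```

The direction of the resulting order is arbitrary: the same map read from the other end is equally valid.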

The second step of this strategy is as follows:

* Each fragment that was ligated into a BAC in the previous step is further broken down randomly into smaller pieces.

* Each of these smaller pieces is ligated again into a ring of DNA called a plasmid, or gene taxi, which is capable of transferring that piece of DNA into a bacterium, where it is replicated into millions of copies (refer to the outline of the genetic engineering technique).

* The sequencing technique pioneered by Sanger in 1977 can then sequence these pieces from both ends of the fragment. The principles of the sequencing technique are:

- The technique depends on the ability of the DNA molecule to copy itself when a special enzyme is provided along with a mixture of the A, T, C and G bases.

- The reaction involves growing gradually lengthening, radio-labelled copies of a DNA strand, starting from a primer at one end and extending towards the other.

- Four separate experiments (each using a different base) are started at the same time. Each begins the process at the same place in the DNA. By chemical trickery, some of the growing strands are stopped each time that experiment's base is added.

- This produces a set of DNA pieces of different lengths, each stopped at a specific base.

- Electrophoresis of the four mixtures side by side on the same gel gives four parallel lanes of DNA fragments in order of increasing length.

- Reading across and down the gel gives the order of the bases. Refer to the illustration in Fig (19) (Harms and Damen, 1998). In the Human Genome Project, however, the reading process is computerised: the labelled letters glow in a laser beam, and a great deal of robotics is involved, which is beyond the scope of this book.
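The logic of reading a gel can be mimicked with a short, idealised Python sketch. It assumes that every possible stopped strand is produced and detected, which a real experiment only approximates; the function names and the toy sequence are hypothetical. Note that the sequence read is that of the newly copied strand.

```python
from typing import Dict, List

BASES = "ATCG"


def run_four_reactions(new_strand: str) -> Dict[str, List[int]]:
    """Simulate the four experiments: one lane per base.

    Each lane lists the lengths of the copies that were stopped at that base.
    """
    lanes: Dict[str, List[int]] = {base: [] for base in BASES}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: Dict[str, List[int]]) -> str:
    """Read the gel from the shortest fragment to the longest.

    The lane in which each successive fragment appears gives the next base.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[length] for length in sorted(base_at_length))


lanes = run_four_reactions("GATTACA")
print(lanes)            # {'A': [2, 5, 7], 'T': [3, 4], 'C': [6], 'G': [1]}
print(read_gel(lanes))  # GATTACA
```

Because each fragment length occurs in exactly one lane, sorting the fragments by length and noting the lane of each one recovers the order of the bases.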

B- The second approach, adopted by Celera Genomics, is faster and uses a huge supercomputer that performs millions of calculations, reducing the time significantly compared with the publicly funded project. The man behind this private company is Dr. Craig Venter, who devised a way to blow the whole Genome apart in a single step, the “whole Genome shotgun”.

* The Genome ends up as many small pieces ranging from 2,000 to 10,000 base pairs (letters) in length.

* These small pieces are then sequenced using large computerised sequencing machines, regardless of their position on the chromosomes.

* A supercomputer and clever computer programmes are used to compare the 3 billion letters of sequenced code and to find the overlapping regions. Once these regions are found, the whole Genome is reassembled.
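The reassembly step can be pictured with a toy greedy assembler in Python. This is a simplified stand-in, not the programme Celera actually used: it repeatedly joins the two pieces with the longest overlap, ignores sequencing errors and repeated regions, and uses a made-up 25-letter "genome" whose reads are supplied in scrambled order.

```python
from typing import List


def overlap(left: str, right: str, min_overlap: int) -> int:
    """Longest suffix of `left` matching a prefix of `right` (0 if below the minimum)."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: List[str], min_overlap: int = 4) -> List[str]:
    """Repeatedly merge the pair of pieces with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical genome and four 10-letter reads, given regardless of position.
genome = "ATGCGTACCTTAGGCAATCGGACTG"
reads = ["TAGGCAATCG", "ATGCGTACCT", "AATCGGACTG", "TACCTTAGGC"]

print(greedy_assemble(reads))  # ['ATGCGTACCTTAGGCAATCGGACTG']
```

Comparing every piece with every other piece is what makes the task so demanding at the scale of 3 billion letters, and is one reason a supercomputer was needed.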

Celera admitted, though, that they relied on the map produced by the publicly funded project, which is accessible through the Internet.

Dr. Sulston stated that the public project's data is there to help everybody in this field.

The combination of these two complementary approaches to Genome sequencing and assembly has reduced the time necessary to finish the sequence of the whole Human Genome by 5 years. The time originally proposed for the whole project was 15 years, but the new techniques pioneered by Celera (capillary electrophoresis and supercomputers) reduced that time significantly. Details on the methodology of the Human Genome Project can be found in the web sites mentioned in the references section.

 

 