The chromosome structures are at the same time instrumental
in bringing about the development they foreshadow. They are
law-code and executive power—or, to use another simile, they
are architect’s plan and builder’s craft—in one.
Erwin Schrödinger [220, p. 22]
Since the discovery of chromosomal inheritance in the early 20th century, biologists
have been fascinated that, in contrast to, say, rocks or clocks, living organisms carry
with them their own miniaturized parts list, the genome. Its DNA sequence specifies the primary structure of all proteins that make up an organism as well as the primary structure of catalytically active RNA molecules. From 1995 onward, a rapidly
growing number of genomes of free-living organisms have been sequenced in their
entirety. This means that the complete succession of bases, or base pairs (bp) since DNA is
double-stranded, is known for the organisms concerned. Figure 1.1 shows a sample
of 106 organisms whose genomes have been sequenced. It contains representatives
of all three domains of life, archebacteria (archaea), bacteria, and eucaryotes, which
include humans. Organisms that are mentioned elsewhere in this book are marked by
an arrow. There is a heavy bias toward bacteria, which is partly due to their medical
importance. Another reason is that bacterial genomes are small, e.g. roughly 1.8 × 10^6 bp for
the human pathogen Haemophilus influenzae, and hence easier to sequence than, for
example, the human genome, which is more than 1,000 times larger (roughly 3 × 10^9 bp).
Reading and writing of the genome form the molecular basis of life.
1.1 Reading and Writing
In every organism, constant reading of the genome by the molecular genetic machinery of the cell is fundamental to sustaining the organism's vital processes. The concomitant flow
of information from DNA to proteins, but never in the reverse direction, is summarized by the well-known central dogma of molecular biology (Fig. 1.2).
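
The one-way flow from DNA to protein can be illustrated with a minimal Python sketch. It is a toy model, not a description of the cellular machinery. The function names `transcribe` and `translate` and the example sequence are chosen here purely for illustration. The sketch assumes the standard genetic code, takes the coding strand as input (so transcription reduces to replacing T by U), and translates from the first AUG to the first stop codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (simplification: T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG up to, but excluding, the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, hence no protein
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Illustrative sequence only; it is not taken from a real genome.
dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)
print(protein)
```

Note that the sketch contains no function leading from protein back to DNA. Because several codons encode the same amino acid, such a function could not recover the original sequence uniquely.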
Computational biology is traditionally concerned with two questions that build
directly on the central dogma: where are the genes and what are their functions?
In fact, these questions can be understood as an attempt to reproduce in silico the
molecular genetic machinery of a cell. The enzymes making up this machinery locate
specific genes with great precision. The subsequent expression of a particular gene
takes place in a context of hundreds or even thousands of other genes active in that