a1 Max-Delbrück-Center for Molecular Medicine, 13125 Berlin-Buch, Germany and European Molecular Biology Laboratory, 69117 Heidelberg, Germany
a2 Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, UK
a3 Groupe de Cancerogenese, IBMC de CNRS, 67084 Strasbourg, France
It has become standard practice to compare new amino-acid and nucleotide sequences with existing ones in the rapidly growing sequence databases. This has led to the recurring identification of certain sequence patterns, usually fewer than 300 amino acids in length. Many of these identifiable sequence regions have been shown to fold into a ‘domain’ structure; they are often called protein ‘modules’ (see definitions below). Proteins that contain such modules are widely distributed in biology, but the modules are particularly common in extracellular proteins.