Overview

The Paris-Sud yeast structural genomics pilot-project: from structure to function
We present here the outline and results of our yeast structural genomics (YSG) pilot-project. A lab-scale platform for systematic protein production and structure determination is presented. In order to validate this approach, 250 non-membrane proteins of unknown structure were targeted. Strategies and final statistics are evaluated. Finally, we discuss how structural genomics programs can contribute to functional biochemical annotation.

1. Introduction

As systematic genomic sequencing provides a wealth of sequences that code for proteins of unknown structure and function, the gap between our knowledge of one- and three-dimensional structures becomes wider every day [1]. However, we also know that the space of protein folds is much smaller than the space of sequences, and reliable 3D models can be obtained by homology modeling or other theoretical approaches [2] and [3] when the sequence identity is of the order of 30-40%. To do that for all proteins in a genome, we must have representative structures for all protein families. At the end of the nineties, it was estimated that 10,000 new structures were needed to cover the vast majority of the protein families if properly selected [4]. This was nearly one order of magnitude more structures than we had at the time, but could be achieved in less than 10 years by a systematic, worldwide concerted effort of protein crystallographers and NMR scientists.

The term structural genomics (or, more correctly, structural proteomics) was coined to describe the set of technologies needed to accelerate structure determination and achieve that goal. Large-scale structural genomics initiatives were launched in 1998-2000 in the US by the National Institutes of Health (NIH), and in Japan by the Riken Laboratory [5] and [6]. In Europe, several local initiatives [7], [8] and [9] preceded the approval of the Structural Proteomics in Europe (SPINE) project under the Fifth Framework Program. For our part, we designed a pilot-project that could be completed in the limited time of 3 years by a single research group on the Orsay campus of Université Paris-Sud and the neighboring CNRS campus in Gif-sur-Yvette [10]. We wanted a eukaryote as our model organism, given that prokaryotes, especially hyperthermophilic bacteria and Archaea, were already largely covered by the US and Japanese initiatives. The compact genome, almost free of introns, of the yeast Saccharomyces cerevisiae made it the obvious choice, and in September 2000 we embarked on the yeast structural genomics (YSG) pilot-project, an adventure that effectively reached its completion at the beginning of 2004.

We gave YSG two major goals. First, we wanted to construct an efficient pipeline for protein production and structure determination, and test it on a limited number of yeast open reading frames (ORFs). We selected as targets some 250 ORFs coding for proteins that had less than 25% sequence identity to each other and to proteins of known structure. At that stage, we also had to eliminate very large proteins and membrane proteins, and rely on existing technology as much as possible, given our limited manpower. Nevertheless, we wanted to develop novel methods and concentrated our efforts on two points: developing a laboratory information management system (LIMS), and a strategy for efficient protein refolding from inclusion bodies [11].
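The selection rule described above (less than 25% identity to each other and to proteins of known structure) can be illustrated with a minimal sketch. It assumes sequences that are already aligned to equal length and uses a simple greedy pass; the function names, the toy sequences and the greedy order are illustrative assumptions, not the procedure actually used in YSG, which would rely on a proper alignment tool.

```python
def percent_identity(a, b):
    """Percent identity between two pre-aligned, equal-length sequences ('-' = gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    aligned = sum(x != "-" and y != "-" for x, y in zip(a, b))
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / aligned if aligned else 0.0


def select_targets(candidates, known, threshold=25.0):
    """Greedily keep candidates below the identity threshold to every known
    structure and to every candidate already selected."""
    selected = {}
    for name, seq in candidates.items():
        references = list(known.values()) + list(selected.values())
        if all(percent_identity(seq, ref) < threshold for ref in references):
            selected[name] = seq
    return list(selected)


# Toy, hypothetical data: not real yeast or PDB sequences.
known_structures = {"pdb_x": "AAAAAAAA"}
candidate_orfs = {
    "orf_a": "AAAAAAAC",  # too close to pdb_x
    "orf_b": "CDEFGHIK",
    "orf_c": "CDEFGHIL",  # too close to orf_b
    "orf_d": "LMNPQRST",
}
targets = select_targets(candidate_orfs, known_structures)
print(targets)
```

A greedy pass depends on the order in which candidates are examined; a real target list would also apply the size and membrane-protein filters mentioned in the text.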

The second goal of YSG was more general in nature: we believed that determining the 3D structure of a protein is an efficient approach to its function at the molecular level [12], and hoped to provide information on the function of at least some of our targets, about half of which had little or no functional annotation. We could do that by observing similarities to functionally known proteins that sequence analysis did not detect, and by identifying unexpected ligands in the crystalline protein [13]. Four years later, we have examples of each of these situations [14] and [15], and confirm that structural studies help prepare, but in no way replace, the detailed biochemical study that is needed to complete the functional assignment of a protein.

We present here the general outline of the project and of the results we achieved. Additional information on YSG can be found on our web site (http://www.genomics.eu.org); information on other programs and on structural genomics in general is available on the web site of the Protein Data Bank (http://www.rcsb.org/pdb/strucgen.html).

2. Automation and the structural genomics platform

Automation is a major tool in all structural genomics programs that aim at high throughput. Although our ambitions were initially very limited by budget and staff, we did build a structural genomics platform for protein production and crystallization which, although incomplete, efficiently performs the more difficult steps of the process. From the start, we adopted a protein production protocol as simple and robust as possible that could be used for all ORFs (see below). Except for a small fermentation unit, no specific equipment was used, and purification remained a manual process until a robot for automatic cloning, transformation and expression was introduced at the end of the pilot-project.

In contrast, we developed robotics for protein crystallization early in the project. We now use two liquid-dispensing robots to prepare Greiner® micro-crystallization plates that have 96 wells and three cups for protein drops per well. A Tecan® Genesis robot dispenses 100-150 μl of mother liquor solution into the wells; it can also prepare protein drops of volume larger than 500 nl. Each of the eight needles aspirates a volume of protein followed by an equal volume of precipitant solution, and deposits both into a cup of the crystallization plate. The protein and precipitant can either be homogenized by successive aspiration/distribution or deposited without further mixing. The process is repeated 12 times to fill the plate. As three different crystallization cups are available, three different protein solutions can be tested in parallel: usually three different protein concentrations, or protein solutions differing in composition (purification, buffer, pH, ionic strength or ligands).
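The plate geometry described above (eight needles, 12 passes, three cups per well) can be made concrete with a short sketch. It assumes that the eight needles serve rows A to H and that each pass fills one column, which is the usual layout of a 96-well plate but is not stated in the text; the well labels are purely illustrative.

```python
ROWS = "ABCDEFGH"        # assumption: one row per needle (eight needles)
COLUMNS = range(1, 13)   # twelve dispensing passes fill the plate
CUPS = (1, 2, 3)         # three drop cups per well


def dispensing_passes(cup):
    """Yield, for one cup position, the eight drops deposited in each pass."""
    for column in COLUMNS:
        yield [f"{row}{column}-cup{cup}" for row in ROWS]


passes = list(dispensing_passes(cup=1))
total_drops = len(ROWS) * len(COLUMNS) * len(CUPS)

print(len(passes), "passes of", len(passes[0]), "drops for one cup position")
print(total_drops, "drops per plate when all three cups are used")
```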

Low-volume (100-200 nl) crystallization drops are prepared by a Cartesian® Microsys robot in plates prefilled by the Tecan® robot [16]. The precipitant is deposited first in all the cups, followed by line dispensing of the protein solution in the 96 wells. The protein and precipitant are mixed only by the impact of the protein drop projected into the precipitant drop. The apparatus is enclosed in a humidity-controlled chamber to limit drop evaporation.

This combination of two robots can prepare about 10 plates of three times 96 protein drops each, that is, about 3000 crystallization experiments per day. This large number of samples must be observed repeatedly over the following days to detect crystals, or at least hits: promising conditions that must be explored further. Together with BioTom® in Evry, we developed the Biostore storage and visualization robot for this purpose. This robot, now connected to our LIMS database (see below), can store up to 600 96-well plates and take video images of all drops in a plate in about 90 s. Visual inspection is time-consuming, so image analysis for the automatic detection of crystals is an important issue, but existing software is still far from reliable, and false negatives mean that some crystals can be missed.
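The orders of magnitude quoted above follow from simple arithmetic, sketched here. The calculation assumes that all three cups are used on every plate and that the imaging robot scans plates one after another without overhead; both are simplifications, and the variable names are illustrative.

```python
PLATES_PER_DAY = 10
WELLS_PER_PLATE = 96
CUPS_PER_WELL = 3
STORE_CAPACITY_PLATES = 600
IMAGING_SECONDS_PER_PLATE = 90

drops_per_plate = WELLS_PER_PLATE * CUPS_PER_WELL
drops_per_day = PLATES_PER_DAY * drops_per_plate

# Upper bounds for a completely full storage unit.
drops_in_full_store = STORE_CAPACITY_PLATES * drops_per_plate
full_scan_hours = STORE_CAPACITY_PLATES * IMAGING_SECONDS_PER_PLATE / 3600

print(f"{drops_per_day} drops per day")
print(f"{drops_in_full_store} drops in a full store, {full_scan_hours:.0f} h to image once")
```

Under these assumptions the daily output is 2880 drops, consistent with the figure of about 3000, and one imaging pass over a full store takes about 15 h.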

3. A strategy for gene cloning, protein production and purification

Initially, all procedures in these steps were designed to be performed manually, but in a way that could be implemented on a cloning robot (Quevillon-Cheruel et al., in press). The experimental flowchart consists of five steps (Fig. 1): (i) target ORFs are cloned in a bacterial expression vector; (ii) the expression level and solubility of the protein are estimated; (iii) when inclusion bodies are produced, a multiple-layer decision strategy is implemented; (iv) the recombinant proteins are produced on a larger scale after optimizing a synthetic culture medium and introducing appropriate labels for X-ray or NMR; (v) the proteins are purified and characterized. Except for (iii) and specific features mentioned below, this flowchart is that of many other structural genomics projects.

Figure 1
General flowchart of the south Paris YSG pilot-project.

Target ORFs are cloned into an E. coli expression vector derived from the pET vector and designed to add a six-histidine tag directly to the C-terminus of the protein. As none of our target ORFs contain introns, genomic DNA of the S288C S. cerevisiae strain is used as a template for cloning. Expression is under the control of the T7 promoter and inducible by IPTG (Novagen). In a first phase, parameters for recombinant protein expression are optimized to yield highly expressed soluble proteins in 5 ml cultures (manually) or 0.5 ml cultures in 96-well plates (automated). Optimization includes testing: (i) different E. coli strains: BL21(DE3), the Rosetta(DE3) strain, which co-expresses rare tRNAs, and the Gold(DE3) strain with or without a pLysS plasmid; (ii) induction at different temperatures: 37, 25 or 15 °C, the lower temperature reducing the production of inclusion bodies; (iii) co-expression of chaperone proteins [17]; (iv) the duration of induction: 3 h or overnight; (v) cell lysis by freezing/thawing or chemical lysis. The entire procedure has now been implemented on a cloning-purification robot based on the RoboSeq® 4204S robot from MWG AG Biotech. This robot can also perform a test purification on a Ni-NTA resin, useful to estimate the solubility of the purified protein or to screen solubilization buffers.
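To give a sense of the size of this optimization space, the sketch below enumerates every combination of the five parameters listed above. Treating chaperone co-expression as a simple yes/no choice and testing the full factorial design are assumptions made for illustration; the text does not say that every combination was tried for every ORF.

```python
from itertools import product

strains = ["BL21(DE3)", "Rosetta(DE3)", "Gold(DE3)", "Gold(DE3) pLysS"]
temperatures_c = [37, 25, 15]
chaperone_coexpression = [False, True]   # assumption: simple yes/no choice
induction_times = ["3 h", "overnight"]
lysis_methods = ["freeze/thaw", "chemical"]

conditions = [
    {
        "strain": strain,
        "temperature_c": temperature,
        "chaperones": chaperones,
        "induction": induction,
        "lysis": lysis,
    }
    for strain, temperature, chaperones, induction, lysis in product(
        strains, temperatures_c, chaperone_coexpression, induction_times, lysis_methods
    )
]

print(len(conditions), "expression conditions per ORF")
```

With these choices the full design has 96 conditions, which happens to match one 96-well culture plate per ORF.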

All soluble expressed proteins are purified in a matter of hours under the same standard protocol consisting of two chromatographic steps: a Ni-NTA resin exploiting the six-histidine tag, and a size-exclusion column which also provides information on the oligomeric state of the protein. The purity and integrity of the sample are then checked by SDS-PAGE and mass spectrometry.

The histogram of Fig. 2 shows that the cloning step of our standard protocol is highly efficient: more than 90% of our target ORFs have been successfully, and easily, cloned. Small-scale expression in BL21(DE3) and derivative strains is also efficient, with 88% of our constructs yielding between 0.1 and 100 mg of recombinant protein per liter of culture. Early on, we had tried the ligase-free “Topo-TA cloning” of Invitrogen, and found expression to be generally less efficient than with our protocol (Quevillon-Cheruel et al., in press). On the other hand, protein production could be scaled up in only 40% of the cases to yield the quantity required for crystallization screenings: typically 100 μl of stable homogeneous protein at 5-10 mg/ml. Moreover, a large fraction of the highly expressed proteins ended up in inclusion bodies, about 60% in early trials and 35% after optimization of the conditions. We have, therefore, developed a strategy to recover proteins from inclusion bodies that includes in vitro refolding, co-expression of chaperones, and in vitro expression using cell-free extracts. Co-expression of chaperones proved to be the most efficient of these procedures: in a test set of 29 ORFs that gave inclusion bodies, it increased the solubility of 17 proteins by 10% to 90% [11]. This strategy has become standard in our project.

Figure 2
Statistics of the YSG pilot-project.

4. Crystallogenesis

All structures in YSG were determined by X-ray crystallography. This made crystallization both an essential step and a major bottleneck in structure determination, and explains why we invested in automation early in the project. A variety of chemicals and physicochemical conditions can cause proteins to crystallize, but none is known a priori and many conditions must be screened for each new protein: precipitants, pH, additives. The manual setup of crystallization drops is very costly in manpower, and crystallization robots are one of the major breakthroughs of structural genomics. In addition to accelerating the experiments, robots use less protein per test, so that more tests can be done on each sample and the success rate of the crystallization screening increases. Results from our project show that it can be rather high, with nearly 50% of proteins entering crystal trials giving crystals of some sort. However, the first hits are often not single crystals suitable for diffraction experiments, but crystalline precipitates, needles, thin plates, micro- or poly-crystals.

Optimizing crystallization conditions is a time-consuming process that involves manually reproducing the conditions found by the robot and making systematic variations around them. We have developed a high-throughput optimization protocol that uses exclusively the liquid handling robots to perform all these steps (Leulliot et al., submitted). It involves setting up a series of screens exploring one, two, three or four parameters (1D, 2D, 3D and 4D grid-screens), testing hundreds of additives, salts, buffers (additive-, salt- and pH-screens) and potential cryo-protectants (cryo-screen). The robot is supplied with a few separately prepared stock solutions, which it combines to achieve the appropriate gradients, dilutions and mixes. Large-scale crystal production can also be set up by the robots. Fig. 3 shows examples of crystals of ORF66 (top) and ORF228 (bottom) that improved from the first crystalline hit (left) through two successive optimization steps (middle), to yield harvestable crystals (right). This strategy also improves reproducibility and reduces protein consumption, thanks to the nanoliter drop dispensing technology. Moreover, several hits on one or more proteins can be optimized in parallel, and a more extensive search of optimization conditions can be performed.
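The idea of a grid-screen around an initial hit can be sketched as follows. The function builds every combination of offsets applied to the varied parameters while holding the others at their hit values. The hit condition, the parameter names and the step sizes are hypothetical values chosen for illustration, not conditions from the project.

```python
from itertools import product


def grid_screen(hit, offsets):
    """Return all conditions obtained by shifting the varied parameters of a
    hit by every combination of the given offsets; other parameters are kept."""
    names = list(offsets)
    axes = [[round(hit[name] + delta, 3) for delta in offsets[name]] for name in names]
    return [{**hit, **dict(zip(names, combo))} for combo in product(*axes)]


# Hypothetical initial hit: 20% PEG, pH 7.0, 0.2 M salt.
hit = {"peg_percent": 20.0, "ph": 7.0, "salt_molar": 0.2}

# 2D grid-screen: 12 precipitant concentrations x 8 pH values.
screen_2d = grid_screen(
    hit,
    {
        "peg_percent": list(range(-6, 6)),
        "ph": [-0.8, -0.6, -0.4, -0.2, 0.0, 0.2, 0.4, 0.6],
    },
)

print(len(screen_2d), "conditions, one per well of a 96-well plate")
```

Adding a third or fourth entry to the offsets dictionary gives the corresponding 3D or 4D screen; the number of conditions is the product of the axis lengths, so step counts must be chosen to fit the available plates.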

Figure 3a
Images of successfully optimized crystals.
Figure 3b
Structures solved as part of our pilot-project.

5. Solving protein structures

Of a total of 60 proteins that were purified to homogeneity in our pilot-project, 22 crystallized and 14 yielded structures, an overall yield of about 25%. Solving a crystal structure relied on synchrotron radiation for data collection and the anomalous dispersion method for phase determination. The process proved very efficient once diffracting crystals were obtained, and we are grateful to the European Synchrotron Radiation Facility (ESRF) and its staff in Grenoble, for providing excellent facilities.
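The attrition along the pipeline can be tabulated from the counts given in the text. The sketch below takes the approximate figure of 250 targets from the introduction as the starting point, which is an assumption since the exact number of ORFs entering the pipeline is only given as "some 250".

```python
funnel = [
    ("targets", 250),          # approximate figure from the introduction
    ("purified", 60),
    ("crystallized", 22),
    ("structures", 14),
]

# Yield of each step relative to the previous one.
step_yields = {
    f"{previous_name} -> {name}": count / previous_count
    for (previous_name, previous_count), (name, count) in zip(funnel, funnel[1:])
}
yield_from_purified = funnel[-1][1] / funnel[1][1]

for step, fraction in step_yields.items():
    print(f"{step}: {fraction:.0%}")
print(f"purified -> structures: {yield_from_purified:.0%}")
```

On these counts, 14 structures out of 60 purified proteins is about 23%, in line with the rounded overall yield quoted above.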

With two exceptions, we used selenium anomalous dispersion to solve the phase problem [18] on crystals grown from seleno-methionine-labeled proteins and diffraction patterns measured near the selenium X-ray absorption edge. Two of our target proteins (YDR533c and YER010c) had too few methionines in their sequence to yield a good anomalous dispersion signal, but we could introduce additional ones by site-directed mutagenesis [15]. The substituted positions were selected by comparing the sequence to close homologs with more methionine residues. The first exception was the His6 protein (one methionine out of 261 residues). Crystals of this protein could not be obtained from the methionine mutant, and we failed to phase by multiple isomorphous replacement (MIR) on crystals soaked in heavy metal salts, or by molecular replacement using possible structural homologs as models. Eventually, we solved this structure by exploiting the anomalous signal of the nine sulfur atoms (one methionine and eight cysteines) present in the protein, on accurate data collected on beam line BM14 at ESRF. The second exception was the D-ribose-5-phosphate isomerase enzyme, whose structure was solved by the molecular replacement method, using the structure of the E. coli orthologue (solved by another structural genomics program) as search model [19].
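A quick check of the methionine and cysteine content of a sequence indicates whether seleno-methionine phasing is likely to be practical. The sketch below is a hypothetical helper: the threshold of one methionine per 100 residues is an illustrative rule of thumb, not a value taken from the text, and the toy sequence only mimics the composition reported for His6 (261 residues, one methionine, eight cysteines); it is not the real sequence.

```python
def anomalous_scatterer_report(sequence, max_residues_per_met=100):
    """Count Met and Cys residues and flag whether the Met density meets an
    illustrative threshold for seleno-methionine phasing."""
    seq = sequence.upper()
    n_met = seq.count("M")
    n_cys = seq.count("C")
    suitable = n_met > 0 and len(seq) / n_met <= max_residues_per_met
    return {
        "length": len(seq),
        "met": n_met,
        "cys": n_cys,
        "sulfur_atoms": n_met + n_cys,
        "semet_suitable": suitable,
    }


# Synthetic sequence with the composition reported for His6, for illustration only.
toy_sequence = "M" + "A" * 252 + "C" * 8
report = anomalous_scatterer_report(toy_sequence)
print(report)
```

For such a protein the helper flags seleno-methionine labeling as unfavourable, which matches the two options described above: adding methionines by mutagenesis or falling back on the sulfur anomalous signal.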

Of the 14 structures that were solved within our pilot-project, only one has a totally new fold (YGL148w). Eight have a fold that had been already observed in other proteins, but could not be predicted from the sequence. Five more proteins have folds that could be predicted by threading methods, although sequence similarity was too weak to build a reliable model. A similar ratio of about one new fold per 10 novel protein structures has been observed in other structural genomics programs. Even though our target ORFs had been selected to have no structural homolog, almost all of our structures proved similar to at least one already in the Protein Data Bank. The root-mean-square deviations were around 2 Å over half of the main chain or more, which suggests that the structures could have been solved by molecular replacement. We made extensive attempts to do that on those proteins for which we had fold predictions with good scores, with no success. Improved methods to detect structural homologs based on sequence, and more powerful molecular replacement tools, would make structure resolution faster and simpler by avoiding the time- and money-consuming seleno-methionine labeling step.

6. HalX: a laboratory information management system (LIMS) for structural genomics projects

Structural genomics projects create a vast amount of very diverse data. The large number and variety of the target proteins, the diversity of the experimental steps leading to the final structures, and the abundance of new lab practices and new protocols make it very hard to follow and analyze all the experiments without a powerful informatics tool. All structural genomics projects very soon found that they needed a high-level information storage and tracking system. Moreover, such a system allows extensive data mining, which can be used to improve protocols by learning from the experimental observations. For example, the Japanese Riken project could show by mining its own data that the expression system can influence the crystallization of a protein [20] and, in the US, the NIH-funded North-East Structural Genomics center derived a decision tree from its experimental data that helps to choose targets with a greater chance of yielding soluble and stable proteins [21].

Commercial LIMS are expensive, lack flexibility and are designed for industrial use. They are poorly adapted to an academic research environment. Thus, we developed our own system, which we named HalX. HalX is designed to record all types of experiments, from cloning to structure determination, in an ordered manner that allows extensive data mining. Any experimental protocol can be introduced in its database through a “brick” system: experiments are regarded as bricks that users pile up to construct a protocol. The experimental parameters (temperature, volume, concentration, etc.) need to be defined only once. Whole protocols may be used as defaults, or they may be edited to make new, slightly different versions. Rather than linear successions of experiments, the protocols are graphs, which represent the reality of experimentation much better.
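The “brick” idea, with protocols stored as graphs rather than linear lists, can be illustrated with a minimal sketch. This is not the HalX data model: the class names, methods and the example protocol are hypothetical, and the sketch only shows how experiments with parameters can be assembled into a directed graph, ordered for execution, and copied into a slightly different version. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Brick:
    """One experiment with its parameters (hypothetical structure)."""
    name: str
    parameters: dict = field(default_factory=dict)


class Protocol:
    """A protocol as a directed acyclic graph of bricks."""

    def __init__(self):
        self.bricks = {}
        self.depends_on = {}

    def add(self, brick, after=()):
        self.bricks[brick.name] = brick
        self.depends_on[brick.name] = set(after)
        return self

    def order(self):
        """Return one valid execution order (predecessors first)."""
        return list(TopologicalSorter(self.depends_on).static_order())

    def variant(self, brick_name, **changes):
        """Copy the protocol, editing the parameters of a single brick."""
        new = Protocol()
        for brick in self.bricks.values():
            params = dict(brick.parameters)
            if brick.name == brick_name:
                params.update(changes)
            new.add(Brick(brick.name, params), after=self.depends_on[brick.name])
        return new


standard = (
    Protocol()
    .add(Brick("cloning"))
    .add(Brick("expression", {"temperature_c": 37}), after=["cloning"])
    .add(Brick("ni_nta"), after=["expression"])
    .add(Brick("size_exclusion"), after=["ni_nta"])
    .add(Brick("sds_page"), after=["size_exclusion"])
    .add(Brick("mass_spec"), after=["size_exclusion"])
    .add(Brick("crystallization"), after=["sds_page", "mass_spec"])
)
cold = standard.variant("expression", temperature_c=15)
print(standard.order())
```

Here the two quality checks branch from the same purification step and rejoin before crystallization, which a linear list of steps could not express.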

HalX is the basis of an active international collaboration with the European Bioinformatics Institute in the UK and the Weizmann Institute in Israel, which concerns both the data model and the user interface. The LIMS is already in use in other structural genomics centers, which provide valuable user feedback for further development. Moreover, it is not tailored only to the needs of structural genomics, but aims to serve structural biology labs in general.

HalX is free, open-source software distributed under the GPL licence. Documentation, a demo version and the complete source code are available on our web site (http://www.halx.genomics.eu.org). We are now working on connections to other tools, especially sequence analysis tools, to be able to update, either automatically or on the fly, the general information concerning the targets.

7. Conclusion and perspectives

The Paris-Sud YSG pilot-project illustrates that structural genomics can be done on the scale of a medium-sized structural biology group. Whereas the number of targets was tailored to the size of our group, the yield in terms of proteins purified and structures solved is comparable to that of other structural genomics initiatives. As we obtained structures, we paid increasing attention to function and could infer at least some functional properties for several of our targets that had no functional annotation. Table 1 summarizes our findings. In most cases, the inference was based on fold similarity with other proteins and/or the observation that a ligand was bound in the crystal. In a few cases, we performed additional experiments and were able to demonstrate an activity in vitro. It may or may not correspond to the protein function in vivo, but it constitutes a testable hypothesis and, hopefully, a useful contribution to our knowledge of the structure-function relationship in proteins.

Table 1: ORFs whose structures were determined in the pilot-project

ORF | Annotation | Fold | Resolution (Å) | Remarks (a) | PDB code(s) | Reference
Known function
YGL148w | Chorismate synthase | New fold | 2.2 | Homotetramer | 1R52 | [22]
YIR029w | Allantoicase | Two jelly roll β-sheet motifs | 2.6 | Homohexamer | 1SG3 | [23]
YDR435c | PPM1: carboxy methyl transferase for protein phosphatase 2A catalytic subunit | SAM-dependent methyl transferase fold | 1.8 | Free form and complexes with SAM and AdoHCy | 1RJD | [24]
YOR357c | Sorting nexin GRD19 | Phox homology domain | 2.0 | Free form and complex with D-myo-phosphatidyl-inositol 3-phosphate | 1OCS | [25]
YIL020c | His6: phosphoribosyl-aminoimidazole-carboxamide ribotide isomerase | β/α barrel | 1.3 | Phased from sulfur anomalous signal (nine S atoms for 261 residues) | None | To be published
YOR095c | Ribose-5-phosphate ketol-isomerase | α/β fold | 2.1 | Two α/β domains | None | To be published
Function derived from structure
YGR205w | ATP-dependent kinase | Nucleotide binding fold | 2.25 | Resemblance to E. coli pantothenate kinase. Binds ATP. Substrate remains to be identified | 1ODF | [26]
YLR011w | NAD(P)H-dependent FMN reductase | Flavodoxin fold | 2.0 | Homodimer. Binds FMN. Ferricyanide reductase activity | 1T0I | [14]
YML079w | None | Jelly roll fold from cupin superfamily | 1.75 | Homodimer. Binds adenine and guanine | None | Submitted
YDR533c | Class II of the Hsp31 family | α/β hydrolase fold | 1.85 | Homodimer. Catalytic Cys-His-Glu triad | 1QVV | [15]
YFL030w | Alanine:glyoxylate aminotransferase | Class V PLP-dependent enzyme with fold-type I | 2.6 | Homodimer. Binds PLP. Enzymatic activity | None | Submitted
Unknown function
YHR029c | None | Kinked double hotdog fold | 2.2 | Domain duplication. Domain swapping | None | Meyer et al., Proteins, accepted
YER010c | None | Three-layer β/β/α structure | 1.7 | Homotrimer | None | To be published
YHR049w | None | α/β hydrolase fold | 1.7 | Catalytic serine | None | To be published

(a) Ligand binding and catalytic activities were determined by our group after solving the X-ray structure.

Acknowledgments

This work was supported by research grants from the Réseau National des Genopoles, the SPINE program, and the Association de Recherche contre le Cancer (grant M. Graille). We are grateful to Dr. Roger Fourme for continuous interest in the project.

[1] S.K. Burley, S.C. Almo, J.B. Bonanno, M. Capel, M.R. Chance and T. Gaasterland et al., Nat. Genet. 23 (1999), pp. 151-157.

[2] D. Vitkup, E. Melamud, J. Moult and C. Sander, Nat. Struct. Biol. 8 (2001), pp. 559-566.

[3] E.N. Baker, V.L. Arcus and J.S. Lott, Appl. Bioinformatics 2 (2003), pp. S3-S10.

[4] R. Sanchez, U. Pieper, F. Melo, N. Eswar, M.A. Marti-Renom, M.S. Madhusudhan, N. Mirkovic and A. Sali, Nat. Struct. Biol. 7 (2000) (Suppl.), pp. 986-990.

[5] E. Lattman, Proteins 54 (2004), pp. 611-615.

[6] S. Yokoyama, H. Hirota, T. Kigawa, T. Yabuki, M. Shirouzu, T. Terada, Y. Ito, Y. Matsuo, Y. Kuroda, Y. Nishimura, Y. Kyogoku, K. Miki, R. Masui and S. Kuramitsu, Nat. Struct. Biol. 7 (2000) (Suppl.), pp. 943-945.

[7] R. Vincentelli, C. Bignon, A. Gruez, S. Canaan, G. Sulzenbacher, M. Tegoni, V. Campanacci and C. Cambillau, Acc. Chem. Res. 36 (2003), pp. 165-172.

[8] C. Abergel, B. Coutard, D. Byrne, S. Chenivesse, J.B. Claude, C. Deregnaucourt, T. Fricaux, C. Gianesini-Boutreux, S. Jeudy, R. Lebrun, C. Maza, C. Notredame, O. Poirot, K. Suhre, M. Varagnol and J.M. Claverie, J. Struct. Funct. Genomics 4 (2003), pp. 141-157.

[9] U. Heinemann, Ernst Schering Res Found Workshop (2001), pp. 101-121.

[10] S. Quevillon-Cheruel, B. Collinet, C.Z. Zhou, P. Minard, K. Blondeau, G. Henckes, R. Aufrere, J. Coutant, E. Guittet, A. Lewit-Bentley, N. Leulliot, I. Ascone, I. Sorel, P. Savarin, I.L. de La Sierra-Gallay, F. De la Torre, A. Poupon, R. Fourme, J. Janin and H. Van Tilbeurgh, J. Synchrotron Radiat. 10 (2003), pp. 4-8.

[11] L. Tresaugues, B. Collinet, P. Minard, G. Henckes, R. Aufrere, K. Blondeau, D. Liger, C.Z. Zhou, J. Janin, H. Van Tilbeurgh and S. Quevillon-Cheruel, J. Struct. Funct. Genomics 5 (2004), pp. 195-204.

[12] C. Zhang and S.H. Kim, Curr. Opin. Chem. Biol. 7 (2003), pp. 28-32.

[13] T.I. Zarembinski, L.W. Hung, H.J. Mueller-Dieckmann, K.K. Kim, H. Yokota and R. Kim et al., Proc. Natl. Acad. Sci. USA 95 (1998), pp. 15189-15193.

[14] D. Liger, M. Graille, C.Z. Zhou, N. Leulliot, S. Quevillon-Cheruel, K. Blondeau, J. Janin and H. Van Tilbeurgh, J. Biol. Chem. 279 (2004), pp. 34890-34897.

[15] M. Graille, S. Quevillon-Cheruel, N. Leulliot, C.Z. Zhou, I.L. de La Sierra-Gallay, L. Jacquamet, J.L. Ferrer, D. Liger, A. Poupon, J. Janin and H. Van Tilbeurgh, Structure (Camb) 12 (2004), pp. 839-847.

[16] G. Sulzenbacher, A. Gruez, V. Roig-Zamboni, S. Spinelli, C. Valencia, F. Pagot, R. Vincentelli, C. Bignon, A. Salomoni, S. Grisel, D. Maurin, C. Huyghe, K. Johansson, A. Grassick, A. Roussel, Y. Bourne, S. Perrier, L. Miallau, P. Cantau, E. Blanc, M. Genevois, A. Grossi, A. Zenatti, V. Campanacci and C. Cambillau, Acta Crystallogr. D Biol. Crystallogr. 58 (2002), pp. 2109-2115.

[17] K. Nishihara, M. Kanemori, H. Yanagi and T. Yura, Appl. Environ. Microbiol. 66 (2000), pp. 884-889.

[18] W.A. Hendrickson and M.M. Teeter, Nature 290 (1981), pp. 107-113.

[19] E.S. Rangarajan, J. Sivaraman, A. Matte and M. Cygler, Proteins 48 (2002), pp. 737-740.

[20] S. Yokoyama, Curr. Opin. Chem. Biol. 7 (2003), pp. 39-43.

[21] C.S. Goh, N. Lan, S.M. Douglas, B. Wu, N. Echols, A. Smith, D. Milburn, G.T. Montelione, H. Zhao and M. Gerstein, J. Mol. Biol. 336 (2004), pp. 115-130.

[22] S. Quevillon-Cheruel, N. Leulliot, P. Meyer, M. Graille, M. Bremang, K. Blondeau, I. Sorel, A. Poupon, J. Janin and H. Van Tilbeurgh, J. Biol. Chem. 279 (2004), pp. 619-625.

[23] N. Leulliot, S. Quevillon-Cheruel, I. Sorel, M. Graille, P. Meyer, D. Liger, K. Blondeau, J. Janin and H. Van Tilbeurgh, J. Biol. Chem. 279 (2004), pp. 23447-23452.

[24] N. Leulliot, S. Quevillon-Cheruel, I. Sorel, I.L. de La Sierra-Gallay, B. Collinet, M. Graille, K. Blondeau, N. Bettache, A. Poupon, J. Janin and H. Van Tilbeurgh, J. Biol. Chem. 279 (2004), pp. 8351-8358.

[25] C.Z. Zhou, I.L. de La Sierra-Gallay, S. Quevillon-Cheruel, B. Collinet, P. Minard, K. Blondeau, G. Henckes, R. Aufrere, N. Leulliot, M. Graille, I. Sorel, P. Savarin, F. De la Torre, A. Poupon, J. Janin and H. Van Tilbeurgh, J. Biol. Chem. 278 (2003), pp. 50371-50376.

[26] I.L. de La Sierra-Gallay, B. Collinet, M. Graille, S. Quevillon-Cheruel, D. Liger, P. Minard, K. Blondeau, G. Henckes, R. Aufrere, N. Leulliot, C.Z. Zhou, I. Sorel, J.L. Ferrer, A. Poupon, J. Janin and H. Van Tilbeurgh, Proteins 54 (2004), pp. 776-783.