BIOINFORMATICS APPLICATIONS NOTE
Vol. 20 no. 13 2004, pages 2138–2139
doi:10.1093/bioinformatics/bth195

The DISOPRED server for the prediction of protein disorder

Jonathan J. Ward, Liam J. McGuffin, Kevin Bryson, Bernard F. Buxton and David T. Jones

Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK

Received on February 19, 2004; revised and accepted on March 20, 2004
Advance Access publication March 25, 2004

ABSTRACT

Summary: Dynamically disordered regions appear to be relatively abundant in eukaryotic proteomes. The DISOPRED server allows users to submit a protein sequence, and returns a probability estimate of each residue in the sequence being disordered. The results are sent in both plain text and graphical formats, and the server can also supply predictions of secondary structure to provide further structural information.

Availability: The server can be accessed by non-commercial users at http://bioinf.cs.ucl.ac.uk/disopred/

Contact: [email protected]

INTRODUCTION

Most efforts in structural bioinformatics have been directed at the prediction of globular protein structure but there is an increasing appreciation of the importance of disordered regions in the function of many proteins (Iakoucheva et al., 2002; Dunker and Obradovic, 2001). Disordered regions are dynamically flexible and are distinct from irregular loop secondary structures, which are static in solution. Disorder prediction is also likely to be a valuable tool for identifying flexible regions that may hinder successful protein crystallization.

The DISOPRED server uses a knowledge-based method to predict dynamically disordered regions from the amino acid sequence. The method is developed from the original DISOPRED predictor (Jones and Ward, 2003), which was assessed in the most recent CASP5 experiment (5th Critical Assessment of techniques for Structure Prediction).

PREDICTION OF DISORDERED REGIONS WITH DISOPRED

Single letter amino acid sequences can be submitted to the DISOPRED server with the results delivered to the user by email. The server uses the DISOPRED2 dynamic disorder prediction method (Ward et al., 2004), which is trained on high resolution X-ray crystal structures. Disorder was identified with those residues that appear in the sequence records but with coordinates missing from the electron density map. This is an imperfect means for identifying disordered residues, since missing coordinates can also arise as an artifact of the crystallization process, although this has the benefit of being a simple automatic procedure that does not require further experimental study of the protein.

DISOPRED2 initially runs a PSI-BLAST search (Altschul et al., 1997) over a filtered sequence database. Each residue is then encoded by the profile for a window of 15 positions in the sequence and classified using a neural network. The classifier is trained using a support vector machine learning algorithm and outputs a probability estimate of the residue being disordered.

The server makes several options available to the user, including the option of returning the hits and/or the alignments from the PSI-BLAST search.

The server also provides a facility for setting the estimated false positive (FP) rate of the classifier. This allows the user to alter the precision and recall characteristics of the prediction, which are defined as

$$\text{Precision} = \frac{TP}{TP + FP},$$

$$\text{Recall} = \frac{TP}{TP + FN},$$

where TP is the number of disordered examples correctly classified. FP and FN are the numbers of over- and under-predictions of disorder, respectively. Receiver operating characteristic curves and precision/recall tables are included in the help section.

The classifier's performance was benchmarked on targets from CASP5. This gave accuracy (Q) estimates of ∼93.1% with a Matthews correlation coefficient of 0.51 for the FP rate threshold of 5%.

The results are sent in plain text format along with hypertext links to postscript, portable document format and jpeg images. These graphics show plots of the sequence disorder profile (Fig. 1), which allow the user to set arbitrary decision thresholds by visual inspection.

Fig. 1. Disorder profile of the intracellular loop region of gliotactin from Drosophila. The plot shows position in the sequence against probability of disorder; 140 of the residues are classified as disordered at the default threshold.

PSIPRED secondary structure predictions are also included to provide further structural information on the protein (Jones, 1999). PSIPRED predictions use identical inputs to DISOPRED and can be included with little computational overhead. If this option is checked, links to graphical representations of the predictions are provided using the PSIPREDView Java application (McGuffin et al., 2000).

REFERENCES

Altschul,S.F., Madden,T.L., Schäffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

Dunker,A.K. and Obradovic,Z. (2001) The protein trinity-linking function and disorder. Nat. Biotech., 19, 805–806.

Iakoucheva,L.M., Brown,C.J., Lawson,J.D., Obradovic,Z. and Dunker,A.K. (2002) Intrinsic disorder in cell-signalling and cancer-associated proteins. J. Mol. Biol., 323, 573–584.

Jones,D.T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol., 292, 195–202.

Jones,D.T. and Ward,J.J. (2003) Prediction of disordered regions in proteins from position-specific scoring matrices. Proteins, 53, 573–578.

McGuffin,L.J., Bryson,K. and Jones,D.T. (2000) The PSIPRED protein structure prediction server. Bioinformatics, 16, 404–405.

Ward,J.J., Sodhi,J.S., McGuffin,L.J., Buxton,B.F. and Jones,D.T. (2004) Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J. Mol. Biol., 337, 635–645.

To whom correspondence should be addressed.

Bioinformatics 20(13) © Oxford University Press 2004; all rights reserved.
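
As an illustration of the precision and recall definitions given in the text, and of the Matthews correlation coefficient quoted for the CASP5 benchmark, the following minimal Python sketch scores a set of per-residue disorder probabilities against known labels. It is not the server's code: the function names, the example probabilities and the 0.5 decision threshold are all hypothetical, and the way the server maps a chosen FP rate to a probability threshold is not described in this note.

```python
import math


def confusion_counts(probabilities, is_disordered, threshold):
    """Count TP, FP, TN, FN for per-residue disorder probabilities.

    A residue is predicted disordered when its probability is >= threshold.
    The threshold is an illustrative assumption, not the server's default.
    """
    tp = fp = tn = fn = 0
    for prob, truth in zip(probabilities, is_disordered):
        predicted = prob >= threshold
        if predicted and truth:
            tp += 1          # disordered residue correctly classified
        elif predicted:
            fp += 1          # over-prediction of disorder
        elif truth:
            fn += 1          # under-prediction of disorder
        else:
            tn += 1          # ordered residue correctly classified
    return tp, fp, tn, fn


def precision(tp, fp):
    """Precision = TP / (TP + FP); returns 0.0 if nothing was predicted."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Recall = TP / (TP + FN); returns 0.0 if there are no positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient from the four confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical six-residue example (values invented for illustration).
probs = [0.9, 0.8, 0.3, 0.6, 0.1, 0.2]
truth = [True, True, True, False, False, False]

tp, fp, tn, fn = confusion_counts(probs, truth, threshold=0.5)
print(precision(tp, fp), recall(tp, fn), matthews_cc(tp, fp, tn, fn))
```

Raising the threshold in this sketch lowers the number of over-predictions at the cost of more under-predictions, which is the precision/recall trade-off that the adjustable FP rate exposes to the user.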
Bioinformatics – Oxford University Press
Published: Mar 25, 2004