SELECTpro Solo: Model Selection and Sidechain Prediction

################################################################################
Overview of SELECTpro Solo
################################################################################

SELECTpro is a purely structure-based method for scoring models and selecting
the most native-like model(s) from model sets of any size and diversity.

This downloadable version of SELECTpro does not include the feature predictors
used by the SELECTpro server to assess models. The full version of SELECTpro
(SELECTpro 1.0) is also available for download as a stand alone application
from the IGB download page: http://download.igb.uci.edu/

What this version contains:
  * The SELECTpro executable and some high level scripts to run it on model
    sets.
  * Scripts for saving side-chain conformations predicted by SELECTpro.
  * The same rotamer library as SELECTpro 1.0.

What this version DOES NOT contain:
  * The feature predictors SSpro, ACCpro, CMAPpro.

Why would I want SELECTpro Solo and not SELECTpro 1.0?
  1. Disk Space:
     * SELECTpro Solo requires only 13 Mb of space
     * SELECTpro 1.0 requires 1.6 Gb of space
  2. Flexible Feature Sources:
     * If you plan to use methods other than those utilized by the SELECTpro
       server for predicting secondary structure, solvent accessibility, and
       residue-residue contacts then you don't need the feature predictors
       installed locally.

Contact:
  Dr. Pierre Baldi
  School of Information and Computer Sciences
  University of California Irvine
  pfbaldi@ics.uci.edu

################################################################################
SELECTpro Reference:
################################################################################

Randall A, Baldi P: SELECTpro: effective protein model selection using a
structure-based energy function resistant to BLUNDERS. BMC Structural Biology
2008, in press.
################################################################################
System Requirements
################################################################################

Platform:   Linux
Software:   Perl
Disk Space: 13 Mb

################################################################################
Install SELECTpro for Linux
################################################################################

1) Unpack the tarball.
   e.g. tar xzf selectpro_solo.tar.gz

2) Change to the unpacked selectpro_solo directory.
   e.g. cd selectpro_solo

3) Open configure.pl and set $install_dir to the selectpro_solo installation
   directory.
   e.g. /home/your_home_dir/selectpro_solo/

4) Execute configure.pl to configure and install the package.
   e.g. ./configure.pl

Installation is done!

###################################################################
Test SELECTpro
###################################################################

1) Test scoring all models in a directory with SELECTpro:

   cd $install_dir/test/
   ../bin/selectpro_score_dir.sh ./pxml/T0288.no_coords.pxml $install_dir/test/models/ T0288.test.results

   The scores for all complete models in the directory given as the second
   argument should appear in the output file: T0288.test.results.
   Compare this file with: ./results/T0288.individual.results
   The two files should be identical.

2) Test scoring a single model already loaded into the pxml file and saving
   the resulting model with side-chains predicted by selectpro:

   cd $install_dir/test/
   ../bin/selectpro_score_model_save_sc.sh ./pxml/T0288.Zhang-Server_TS1.pxml T0288.ZS_TS1.test.results T0288.ZS_TS1.test.pdb

   T0288.ZS_TS1.test.results contains individual energy term scores.
   Compare to: ./results/T0288.ZS_TS1.results

   T0288.ZS_TS1.test.pdb contains the predicted side-chains.
   Compare to: ./sidechains/T0288.ZS_TS1.pdb

##################################################################
Descriptions of sub directories
##################################################################

bin:    shell scripts and SELECTpro executable
script: perl scripts for prediction and file processing
test:   files for testing selectpro

###################################################################
Example File Formats
###################################################################

###################################################################
SELECTpro Executable Usage
###################################################################

The SELECTpro executable is: $install_dir/bin/selectpro

It takes two input parameters:
  [0] pxml file
  [1] rotamer library

Example usage:
  ./selectpro prot.pxml rotamer_library_file

The executable returns two files:
  prot.pxml.en  : tab-delimited energy terms
  prot.pxml.pdb : model with side-chains predicted by selectpro

The default rotamer library for the high-level command scripts is:
  $install_dir/data/rotamers/rotamer_library.txt

###################################################################
SELECTpro High Level Command Scripts
###################################################################

To reformat a feature prediction file into the pxml format required by the
executable:

  /selectpro_solo/script/feature2pxml.pl prot.features prot.pxml

  [0] input file: file containing predicted features in the following format
        line 1:   name
        line 2:   primary sequence
        line 3:   predicted secondary structure
                  (3-class: 'H' helix, 'E' beta, 'C' other)
        line 4:   predicted solvent accessibility
                  (2-class: 'e' exposed, '-' buried)
        lines 5+: binary contact map
      Example .features file format: /test/features
  [1] output file: pxml file
      Example .pxml file format: /test/pxml/

  NOTE: The contact map is optional. If it is omitted, selectpro will just
  return 0.0 for E_PRED-CM_fn and E_PRED-CM_fp.
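The .features layout described above can be sketched with a short shell script. This is only an illustration: the protein name, sequence, and predictions are made up, the file name toy.features is hypothetical, and the contact map is left out (which the NOTE above permits) because its exact row format is not described here; see the files under /test/features for real examples. The checks at the end are an added sanity step, not something SELECTpro itself is documented to perform.

```shell
#!/bin/sh
# Write a toy 4-line .features file (contact map omitted).
# The name, sequence, and predictions below are made up for illustration.
features_file="toy.features"
cat > "$features_file" <<'EOF'
toy_protein
MKTAYIAKQR
CCHHHHHHCC
ee--e--eee
EOF

# Read back line 2 (sequence), line 3 (secondary structure), line 4 (accessibility).
seq=$(sed -n '2p' "$features_file")
ss=$(sed -n '3p' "$features_file")
acc=$(sed -n '4p' "$features_file")

# Sanity checks: one prediction per residue, and only the documented symbols.
status="ok"
[ "${#seq}" -eq "${#ss}" ]  || status="length mismatch: secondary structure"
[ "${#seq}" -eq "${#acc}" ] || status="length mismatch: solvent accessibility"
case "$ss"  in *[!HEC]*) status="bad secondary structure character" ;; esac
case "$acc" in *[!e-]*)  status="bad solvent accessibility character" ;; esac
echo "$features_file: $status"
```

A file that passes these checks could then be given to feature2pxml.pl as its first argument, with the desired .pxml file name as the second.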
To score a single model:

  /selectpro_solo/bin/selectpro_score_model.sh prot.model.pxml prot.model.results

  [0] input file: pxml file with model coordinates
  [1] output file: tab-delimited energy terms

To score a single model and save the predicted sidechains:

  /selectpro_solo/bin/selectpro_score_model_save_sc.sh prot.model.pxml prot.model.results prot.model.pdb

  [0] input file: pxml file with model coordinates
  [1] output file: tab-delimited energy terms
  [2] output file: .pdb file with side-chains predicted by selectpro

To score all of the models in a directory:

  /selectpro_solo/bin/selectpro_score_dir.sh prot.pxml models_dir/ prot.results

  [0] input file: pxml file with no coordinates
  [1] models directory: directory containing models to be scored
  [2] output file: tab-delimited energy terms, one model per line

  selectpro_score_dir_sum.sh is equivalent to selectpro_score_dir.sh, but
  returns the sum only.

To score all of the models in a list:

  /selectpro_solo/bin/selectpro_score_list.sh prot.pxml models.list prot.results

  [0] input file: pxml file with no coordinates
  [1] input file: list of model files to be scored
  [2] output file: tab-delimited energy terms, one model per line

  selectpro_score_list_sum.sh is equivalent to selectpro_score_list.sh, but
  returns the sum only.

###################################################################
SELECTpro Output
###################################################################

The summary output from the high-level scripts run on a list of files or a
directory contains the model name in the first column and individual energy
terms in additional columns. The output file from the selectpro executable
(.en file) contains the data in the same order, but without the header.

The short name for each term, followed by a more detailed description, is
presented here:

Reduced Representation Energy Terms

  PRED-SS_h: Residues predicted as helical by SSpro are penalized if they are
      not helical in the model.
  PRED-SS_s: Residues predicted as beta by SSpro are penalized if they are not
      beta in the model.
  PRED-ACC: Residues predicted as buried by ACCpro are penalized if they are
      exposed in the model, and residues predicted as exposed are penalized if
      they are buried.
  PRED-CM_fn: Pairs of residues predicted to be in contact by CMAPpro are
      penalized if they are not in contact in the model.
  PRED-CM_fp: Pairs of residues predicted not to be in contact by CMAPpro are
      penalized if they are in contact in the model.
  BETA: Residues of beta-strands predicted by SSpro are penalized if they do
      not pair with other beta residues.
  BB-REP: Repulsive term for explicitly represented atoms in model.
  CT-REP: Repulsive term for side-chain centroids.
  STAT-ENV: Statistical term for burial/exposure of residue side-chains.
  STAT-PW-CI: Context independent statistical term for pairwise interactions.
  STAT-PW-CD: Context dependent statistical term for pairwise interactions.
  ROG: Models with radius of gyration higher than the value estimated from the
      sequence length are penalized.

All-Heavy Atom Representation Energy Terms

  SC-HB: Side-chain donor and acceptor atoms that are at least partially buried
      are penalized if they fail to make hydrogen bonds.
  LEN-JONES: van der Waals forces with a damped repulsive effect.
  SOLVATION: Implicit solvation model.
  ELECTRO: Repulsion and attraction of charged groups.

#####################################################################
Release notes
#####################################################################

SELECTpro Solo 1.0: released on 11/17/2008
  First released version
-----------------------------------------------------------------------
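
Appendix example for the SELECTpro Output section: a minimal sketch of ranking models from a per-term results file, assuming the layout described there (a header row, then one model per line with the model name first and tab-delimited energy terms after it). The file name toy.results, the model names, the header labels, and all numbers are made up. The unweighted sum computed here is an assumption for illustration and may not match what selectpro_score_dir_sum.sh reports; treating a lower total as better is likewise an assumption, based on the terms being described as penalties.

```shell
#!/bin/sh
# Toy results file: header row, then model name plus tab-delimited energy terms.
# Model names, the subset of terms, and every number are made up for illustration.
results_file="toy.results"
printf 'model\tPRED-SS_h\tPRED-ACC\tBB-REP\n'  > "$results_file"
printf 'model_A.pdb\t1.5\t2.0\t0.5\n'         >> "$results_file"
printf 'model_B.pdb\t0.5\t1.0\t0.25\n'        >> "$results_file"
printf 'model_C.pdb\t3.0\t0.5\t1.0\n'         >> "$results_file"

# Skip the header, add up columns 2..NF for each model, and list the models
# from lowest to highest unweighted total.
ranked=$(awk -F '\t' 'NR > 1 {
    total = 0
    for (i = 2; i <= NF; i++) total += $i
    printf "%s\t%.2f\n", $1, total
}' "$results_file" | sort -t "$(printf '\t')" -k2,2n)
echo "$ranked"
```

For a .en file written directly by the selectpro executable, which has no header, the NR > 1 condition would need to be dropped.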