################################################################################
Name:     PPVED - Plant Protein Variation Effect Detector
Version:  1.0
Author:   Xiangjian Gou (Sichuan Agricultural University)
Date:     2021/01/20
Function: PPVED predicts the functional effect of single amino acid
          substitutions in plants using an ensemble of XGBoost models.
Contact:  Xiangjian Gou
          Sichuan Agricultural University
          xjgou@stu.sicau.edu.cn

          Yanli Lu
          Sichuan Agricultural University
          luyanli@sicau.edu.cn
################################################################################

This document explains how to download, install and use PPVED.

1. Download

You can download PPVED from http://www.ppved.org.cn/packages/PPVED-1.0.tar.gz

2. Install

PPVED was installed and tested on a 64-bit CentOS Linux server. You need at
least 60 GB of free disk space to complete the installation.

The installation of PPVED is divided into the following five steps:

(1) Decompress PPVED.

    tar zxvf PPVED-1.0.tar.gz
    cd PPVED-1.0

(2) Check that Perl and R are installed.

    perl -v
    Rscript --version

(3) Install the seven dependent prediction programs.

    Read the file 'install/install_dependent_software.txt' to learn how to
    install the seven prediction programs. When you have finished, edit the
    file 'required_software.txt' so that its content matches your system.

(4) Install the dependent R package 'xgboost'.

    The commands below assume that the R packages are installed in the path
    '/datadisk/PPVED-1.0/rpackage'; replace it with the path on your system:

    cd rpackage
    echo "export R_LIBS=/datadisk/PPVED-1.0/rpackage:\$R_LIBS" >> ~/.bashrc
    source ~/.bashrc

    Before installing 'xgboost', you may need to install several packages
    that it depends on (e.g. data.table, magrittr and stringi). These
    packages may already exist on your system, in which case you can skip
    them:

    R CMD INSTALL -l /datadisk/PPVED-1.0/rpackage data.table_1.13.6.tar.gz
    R CMD INSTALL -l /datadisk/PPVED-1.0/rpackage magrittr_2.0.1.tar.gz
    tar zxvf stringi.tar.gz    # note: 'stringi' is unpacked directly, not installed from source code

    Finally, install the package 'xgboost':

    R CMD INSTALL -l /datadisk/PPVED-1.0/rpackage xgboost_0.90.0.2.tar.gz

(5) Set the search path for shared library files.

    echo "export LD_LIBRARY_PATH=.:\$LD_LIBRARY_PATH" >> ~/.bashrc
    source ~/.bashrc

Congratulations, the installation is complete.
Please do not move any PPVED files after installation!

3. Test

(1) Change the current working directory to the subdirectory 'example', and
    then run:

    perl ../PPVED.pl -i example.input -o example.test

    The output file 'example.test' should be identical to the provided file
    'example.output'.

(2) To see the format of the input file, run the following command from the
    top-level PPVED-1.0 directory:

    perl PPVED.pl -e

(3) To see the help document of PPVED, run the following command from the
    same directory:

    perl PPVED.pl -h

4. Format of output

The local version of PPVED supports batch calculation. The output file
contains five columns:

    col1: the number of the single amino acid substitution
    col2: the name of the protein
    col3: the single amino acid substitution
    col4: the predicted probability score, a value between 0 and 1
    col5: the predicted binary classification
          (functional: score >= 0.5; neutral: score < 0.5)
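
For a large batch run, it can be handy to tally how many substitutions fall on
each side of the 0.5 threshold described above. The following is only a minimal
sketch and not part of PPVED: the function name 'summarize_ppved' is
hypothetical, and the sketch assumes that the output columns are separated by
whitespace, that protein names contain no spaces, and that col4 is written as a
plain number. Please check these assumptions against your own output file.

```shell
#!/bin/sh
# Count functional vs. neutral predictions in a PPVED output file.
# Assumptions (not stated in this README): columns are separated by
# whitespace, names contain no spaces, and col4 is a plain number.
# summarize_ppved is a hypothetical helper, not part of PPVED.
summarize_ppved() {
    awk '
        # Skip lines that lack five columns or a numeric score (e.g. a header).
        NF < 5 || $4 !~ /^[0-9.eE+-]+$/ { next }
        # Apply the documented threshold: score >= 0.5 is functional.
        $4 + 0 >= 0.5 { functional++; next }
        { neutral++ }
        END { printf "functional=%d neutral=%d\n", functional, neutral }
    ' "$1"
}
```

After defining the function in your shell, you could run it on the test output,
e.g. 'summarize_ppved example.test'. The tally is recomputed from the score in
col4 rather than read from col5, because this README does not specify the
exact label text that PPVED writes in col5.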