Program Name: GeneStitch
Version: 1.2
Developer: Yu-Wei Wu, Mina Rho, Thomas Doak, Yuzhen Ye
Affiliation: School of Informatics and Computing, Indiana University

>> INTRODUCTION
GeneStitch is a tool that assembles genes using a network matching algorithm. Given an already-assembled dataset, it joins contigs together to form more complete genes with the help of a reference gene set.

The only assembler that GeneStitch currently supports is SOAPdenovo. Users must first assemble the input dataset with SOAPdenovo and then feed the assembled data into GeneStitch.

GeneStitch's home on the web:
http://omics.informatics.indiana.edu/hmp/GeneStitch/

>> 3rd-party Software
GeneStitch incorporates the following 3rd-party software (licensed under the GNU General Public License):
- RAPSearch2: an even faster RAPSearch that supports multi-threading
  (http://omics.informatics.indiana.edu/mg/RAPSearch2/)
- FragGeneScan: an application for finding (fragmented) genes in short reads
  (http://omics.informatics.indiana.edu/FragGeneScan/)

>> INSTALLATION
After decompressing the tarball, type './compile_GeneStitch' in the decompressed directory. This compiles GeneStitch and the bundled 3rd-party software. The GeneStitch executable is then placed in the decompressed GeneStitch directory.

>> RUNNING GeneStitch
Usage:
  GeneStitch -input [seq filename] -db [database fasta filename] -kmer_len [kmer length] [-thread (thread num, default 1)]

Parameters:
  -input: the SOAPdenovo output prefix
  -db: the reference gene set (in nucleotide FASTA format)
  -kmer_len: the kmer length used for the SOAPdenovo assembly
  (optional parameters)
  -thread: the number of threads, which is passed on to RAPSearch (default 1). GeneStitch itself currently does not support multi-threading.
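
For scripted runs, the command line described above can be assembled programmatically. The following Python snippet is only a minimal sketch and is not part of GeneStitch: the function name build_genestitch_command, the default executable path ./GeneStitch, and the input checks are assumptions made for illustration. Only the flags -input, -db, -kmer_len, and -thread come from the usage text above.

```python
from typing import List


def build_genestitch_command(
    prefix: str,
    db_fasta: str,
    kmer_len: int,
    threads: int = 1,
    executable: str = "./GeneStitch",  # assumed location of the compiled binary
) -> List[str]:
    """Build the GeneStitch argument list from the documented flags.

    Hypothetical helper: it only constructs the command and does not run it.
    """
    if kmer_len <= 0:
        raise ValueError("kmer_len must be a positive integer")
    if threads < 1:
        raise ValueError("threads must be at least 1")

    cmd = [
        executable,
        "-input", prefix,            # SOAPdenovo output prefix
        "-db", db_fasta,             # reference gene set, nucleotide FASTA
        "-kmer_len", str(kmer_len),  # kmer length used for the SOAPdenovo assembly
    ]
    # -thread is optional; omit it when the documented default of 1 is wanted.
    if threads != 1:
        cmd += ["-thread", str(threads)]
    return cmd


if __name__ == "__main__":
    # Print the command line for illustrative inputs.
    print(" ".join(build_genestitch_command("sra_data", "db_nuc.fa", 31)))
```

The returned list can be handed to subprocess.run(cmd, check=True) from the Python standard library. Building a list rather than a single string avoids shell quoting problems when file names contain spaces.
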
Example (assuming the SOAPdenovo output prefix is sra_data and the kmer length is 31):
  GeneStitch -input sra_data -db db_nuc.fa -kmer_len 31

>> Output
Given a SOAPdenovo assembly with any given header, GeneStitch produces three files: header.assem, header.path, and header.log.
  header.assem: the assembled gene sequences
  header.path: the contigs that contributed to the assembled genes
  header.log: the log file

>> License
GeneStitch is free software licensed under the GNU General Public License.

>> ACKNOWLEDGEMENTS
Development was supported by NIH 1R01HG004908 and NSF DBI-0845685 to YY.