Program Name: GeneStitch
Version: 1.0
Developer: Yu-Wei Wu, Mina Rho, Thomas Doak, Yuzhen Ye
Affiliation: School of Informatics and Computing, Indiana University

>> INTRODUCTION

GeneStitch is a tool for assembling genes using a network matching algorithm. Given an already-assembled dataset, it joins contigs together to form more complete genes with the help of a reference gene set. Currently the only assembly software that GeneStitch supports is SOAPdenovo.

GeneStitch's home on the web:
http://omics.informatics.indiana.edu/hmp/GeneStitch/

>> INSTALLATION

To build the executable, simply type "make". The executable file "GeneStitch" will be created and placed in the same directory.

>> RUNNING GeneStitch

Usage:
  GeneStitch -input [seq filename] -db [db filename] -kmer_len [kmer length] -merge_graph [yes(default)/no] [-db_protein (protein db filename)] [-BLAST_path (full path of blastall program)] [-use_FragGeneScan (full path of run_FragGeneScan.pl)]

Parameters:
  -input: the SOAPdenovo output prefix
  -db: the reference gene set (nucleotide sequences)
  -kmer_len: the kmer length that was used for the SOAPdenovo assembly
  -merge_graph: whether graphs will be merged or not (default: yes)

Optional parameters:
  -db_protein: the reference gene set (amino acid sequences). This option forces the BLAST search to run BLASTX instead of BLASTN. Note that the FASTA headers of the protein gene set must be EXACTLY THE SAME as those of the nucleotide gene set.
  -BLAST_path: the full path of the blastall program. Use this option if the BLAST directory is not listed in the $PATH environment variable.
  -use_FragGeneScan: the full path of run_FragGeneScan.pl. When this option is given, GeneStitch first predicts genes using FragGeneScan and then does further analysis.
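Because -db_protein requires the FASTA headers of the protein gene set to match those of the nucleotide gene set exactly, it can help to check the two files before starting a run. The script below is a minimal sketch of such a check and is not part of GeneStitch. The function names (fasta_headers, compare_headers) are hypothetical, and the script assumes that "the same" means the same set of header lines; whether GeneStitch also requires the same record order is not stated here.

```python
import sys
from typing import Iterable, List, Set, Tuple


def fasta_headers(lines: Iterable[str]) -> List[str]:
    """Return the header lines of a FASTA stream, without the leading '>'."""
    return [line[1:].rstrip("\r\n") for line in lines if line.startswith(">")]


def compare_headers(
    nuc_lines: Iterable[str], pro_lines: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (headers only in the nucleotide set, headers only in the protein set).

    Assumption: only the set of headers is compared, not the record order.
    """
    nuc = set(fasta_headers(nuc_lines))
    pro = set(fasta_headers(pro_lines))
    return nuc - pro, pro - nuc


def main(nuc_path: str, pro_path: str) -> int:
    with open(nuc_path) as nuc_file, open(pro_path) as pro_file:
        only_nuc, only_pro = compare_headers(nuc_file, pro_file)
    for header in sorted(only_nuc):
        print("only in nucleotide set: " + header)
    for header in sorted(only_pro):
        print("only in protein set: " + header)
    # Exit status 0 means the two header sets are identical.
    return 0 if not (only_nuc or only_pro) else 1


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

If the script is saved as, say, check_headers.py (a hypothetical name), it could be run as "python check_headers.py db_nuc.fa db_pro.fa". It prints nothing and exits with status 0 when the header sets agree.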

Examples (assuming the SOAPdenovo output prefix is sra_data and the kmer length is 31):

  (use only the nucleotide database)
  GeneStitch -input sra_data -db db_nuc.fa -kmer_len 31

  (use both the nucleotide and protein databases)
  GeneStitch -input sra_data -db db_nuc.fa -db_protein db_pro.fa -kmer_len 31

  (use both the nucleotide and protein databases and set the BLAST path)
  GeneStitch -input sra_data -db db_nuc.fa -db_protein db_pro.fa -kmer_len 31 -BLAST_path /bin/ncbi/blast/bin/blastall

  (use only the nucleotide database and use FragGeneScan)
  GeneStitch -input sra_data -db db_nuc.fa -kmer_len 31 -use_FragGeneScan /bin/FragGeneScan/run_FragGeneScan.pl

>> ACKNOWLEDGEMENTS

Development was supported by NIH 1R01HG004908 and NSF DBI-0845685 to YY.