<table><tr><td width=88><img src="%PUBURL%/%WEB%/SharedImages/wip.gif"></td> <td valign="center"><b><font color="red" size=+2>Warning!</font><br><font size="+1">This page is under construction. The instructions are currently not complete!</font></b></td></tr></table>

---+!! Mutual Information Support for RNA Secondary Structure Models

Predicting base interactions in RNA structures from the phylogenetic significance of mutual information between columns.

%TOC%

---++ Overview: What do these programs do?

Reference.

---++ Installation

---+++ Install a modified version of rate4site

The program <b>[[http://www.tau.ac.il/~itaymay/cp/rate4site.html][rate4site]]</b> is used to infer a phylogenetic tree with per-site substitution rates from the observed sequence alignment. To function properly on RNA alignments, <b>rate4site</b> requires a minor source code modification so that gap characters are treated as a separate state. For convenience, a modified version of the complete source release is available for download here.

| [[%ATTACHURL%/rate4site.tgz][%ICON{download}% Download <b>rate4site</b>]] | version 2.01 (Nov06) modified | 18 December 2008 |

The included Makefile is for compiling under Windows. On other systems, compile from within the source directory using this command instead:

<code><div style="border-color: grey; border-style: solid; border-width: 1px; padding:1px;"> g++ -Dunix -DDOUBLEREP -o rate4site -O3 *.cpp </div></code>

Finally, copy the new <b>rate4site</b> executable to a bin directory (e.g. =/usr/local/bin=) or add its location to your $PATH so that the mutual information scripts can find it.

---+++ Install the esl-weight program from Infernal

Chances are that you already have Infernal installed if you routinely work with RNA alignments. If not, download Infernal, then compile and install it according to the included instructions.
| [[http://infernal.janelia.org/][%ICON{external}% Official Infernal Site]] |

These scripts only need the <b>esl-weight</b> utility program that is included in the easel subpackage. This program will be compiled by default, but may NOT be installed by default. For infernal-1.0rc5, you can find the binary at =infernal-1.0rc5/easel/miniapps/esl-weight=. Manually move this program into your path (e.g. to =/usr/local/bin=) or add its location to your $PATH so that the mutual information scripts can find it.

---+++ Install <nop>BioPerl

These scripts use modules for handling phylogenetic trees from [[http://www.bioperl.org/][BioPerl]]. Download and install BioPerl according to its instructions. Be sure to add BioPerl to your Perl library path (e.g. by setting $PERL5LIB).

| [[http://www.bioperl.org/][%ICON{external}% Official BioPerl Site]] |

---+++ Install mutual information scripts

Finally, download the scripts themselves.

| [[%ATTACHURL%/mi.tgz][%ICON{download}% Download <b>MI Scripts</b>]] | version 1 | 21 December 2008 |

You should be able to run these Perl scripts from their current location or add them to your $PATH.

---++ Usage

<code><div style="border-color: grey; border-style: solid; border-width: 1px; padding:1px;"> Usage: mutual_information_significance.pl -i stockholm.stk -o stockholm.mi.stk [-r 200 -n 300] </div></code>

The input alignment is processed before MI is calculated. First, all columns that are >50% gaps (taking into account relative sequence weights) are removed. Second, identical sequences in the alignment are removed. Third, sequences that share the most identity with other sequences are removed until fewer than a certain number of sequences remain in the alignment. These steps reduce the number of columns and sequences that must be considered in further calculations and usually do not affect the calculated MI significance scores.
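
The first two preprocessing steps described above can be illustrated with a minimal Python sketch. This is not the code used by the Perl scripts: the function names, the in-memory dictionary representation of the alignment, and the choice of =-= and =.= as gap characters are all assumptions made for illustration. The third step (pruning the most similar sequences) is omitted.

```python
# Illustrative sketch only; not taken from mutual_information_significance.pl.
# Assumption: "-" and "." are the gap characters in the alignment.
GAP_CHARS = frozenset("-.")


def drop_gappy_columns(seqs, weights, max_gap_fraction=0.5):
    """Remove columns whose weighted gap fraction exceeds max_gap_fraction.

    seqs    : dict mapping sequence name -> aligned sequence string
    weights : dict mapping sequence name -> relative weight (hypothetical
              stand-in for the weights reported by esl-weight)
    Returns (filtered alignment, list of kept column indices).
    """
    names = list(seqs)
    length = len(seqs[names[0]])
    total_weight = sum(weights[n] for n in names)
    kept_columns = []
    for col in range(length):
        gap_weight = sum(weights[n] for n in names if seqs[n][col] in GAP_CHARS)
        # Keep the column unless MORE than the threshold fraction is gaps.
        if gap_weight / total_weight <= max_gap_fraction:
            kept_columns.append(col)
    filtered = {n: "".join(seqs[n][c] for c in kept_columns) for n in names}
    return filtered, kept_columns


def drop_identical_sequences(seqs):
    """Keep only the first occurrence of each distinct aligned sequence."""
    seen = set()
    unique = {}
    for name, seq in seqs.items():
        if seq not in seen:
            seen.add(seq)
            unique[name] = seq
    return unique


if __name__ == "__main__":
    alignment = {"a": "AC-G", "b": "AC-G", "c": "A--U"}
    weights = {"a": 1.0, "b": 1.0, "c": 2.0}
    filtered, kept = drop_gappy_columns(alignment, weights)
    print(kept)                                # [0, 1, 3]
    print(drop_identical_sequences(filtered))  # {'a': 'ACG', 'c': 'A-U'}
```

In this toy alignment, the second column is exactly 50% gaps by weight and is therefore kept, because only columns with more than 50% gaps are removed.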
The two parameters that you may want to adjust are =-r=, which sets how many random alignments are generated to estimate the _p_-value significance of the actual MI score between each pair of columns (default = 200), and =-n=, which sets the maximum number of sequences allowed after pruning to the most diverse.

The input Stockholm alignment *MUST* have an RF line. See the included example alignments and the Infernal documentation for a description of this line. It is generated by default if the Infernal program =cmalign= was used to create the Stockholm file. You may want to alter or construct this line yourself. Only columns that contain non-gap characters in the RF line will be considered when removing redundant sequences.

Be forewarned that, depending on (1) the number of sequences, (2) the number of columns in your alignment, and (3) the number of resamplings requested for estimating _p_-values, this procedure can be *extremely slow*, and the intermediate resampled alignments can require *large amounts of free disk space*. If you must interrupt the script, you can usually run it again later from within the same working directory, and execution will pick up where it left off rather than restarting. The calculation of mutual information from each resampled alignment can be parallelized: each alignment in the =resampled-tree= directory must be used to generate an MI file in the =resampled-mi= directory.

---++ Example

Two example Stockholm alignments are provided: =FMN.stk= and =SAM-I.stk=. For testing, run:

<code><div style="border-color: grey; border-style: solid; border-width: 1px; padding:1px;"> mutual_information_significance.pl -i SAM-I.stk -o SAM-I.mi.stk </div></code>

Several intermediate files will be created. Open the resulting file =SAM-I.mi.stk= in a text editor.
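
Because the scripts require an RF line, it can be useful to check an alignment for one before starting a long run. The following Python sketch is not part of the MI scripts. It assumes that the RF annotation appears on =#=GC RF= lines, as in the Stockholm format, and that =.= and =-= mark gap columns in that line; the function names and the tiny sample alignment are hypothetical.

```python
# Illustrative pre-flight check; not part of the MI scripts.
# Assumption: "." and "-" are the gap characters used in the RF line.
GAP_CHARS = frozenset("-.")


def rf_annotation(stockholm_text):
    """Return the concatenated '#=GC RF' annotation, or '' if absent.

    Interleaved (multi-block) Stockholm files repeat the RF line once
    per block, so the pieces are joined in order of appearance.
    """
    parts = []
    for line in stockholm_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "#=GC" and fields[1] == "RF":
            parts.append(fields[2])
    return "".join(parts)


def rf_columns(stockholm_text):
    """Return indices of columns with a non-gap character in the RF line."""
    rf = rf_annotation(stockholm_text)
    return [i for i, ch in enumerate(rf) if ch not in GAP_CHARS]


if __name__ == "__main__":
    # Hypothetical miniature alignment used only to exercise the functions.
    sample = (
        "# STOCKHOLM 1.0\n"
        "seq1         ACGU.A\n"
        "seq2         AC-UgA\n"
        "#=GC SS_cons <..>..\n"
        "#=GC RF      ACGU.a\n"
        "//\n"
    )
    if not rf_annotation(sample):
        raise SystemExit("No RF line found; add one before running the scripts.")
    print(rf_columns(sample))  # [0, 1, 2, 3, 5]
```

To check a real file, read its contents (for example =open("SAM-I.stk").read()=) and pass the text to =rf_annotation=. An empty result means the alignment has no RF line.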
Topic revision: r3 - 2008-12-22 - 03:55:41 - Main.JeffreyBarrick