用户名: 密码: 忘记密码? 注册

[原创]又抓了bioperl中codeml.pm 模块的一个bug

作者:  时间: 2010-10-18
以下是运行codelml计算branch model的perl代码

use strict;
    use warnings;
    use Bio::Tools::Run::Phylo::PAML::Codeml;
    use Bio::AlignIO;
    use Data::Dumper;
    use Bio::TreeIO;

# my ( $treefile, $setModel ) = @ARGV; #脚本的第一个参数是树文件名,第二个是控制omega的模型


    #print "omega\tlnL\n";

    my $alignio = Bio::AlignIO->new(
        -format => 'fasta',
        -file => 'seqfile.fa',
        -dir => '.'
    );

    my $aln = $alignio->next_aln;

    my $treeio = Bio::TreeIO->new(
        '-format' => 'newick',
        '-file' => 'treeH2.txt'
    );
    my $tree = $treeio->next_tree;
    my $codeml = Bio::Tools::Run::Phylo::PAML::Codeml->new();
    $codeml->alignment($aln);
    $codeml->tree($tree);
    #$codeml->save_tempfiles(1);

    $codeml->set_parameter( "runmode", 0 );
    $codeml->set_parameter( "model", 2 );

    my ( $rc, $parser ) = $codeml->run();
    my $result = $parser->next_result;
foreach my $model ( $result->get_NSSite_results ) {
    print $model->{'likelihood'},"\n";
}
;


以下是树文件

(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C #1,(X04752Mus #1,U07177Rat #1)#1)#1,(U95378Sus #1,U13680Hom #1)#1)#1,(X53828OG1,U28410OG2)))))

出现如下错误:
Can't call method "get_NSSite_results" on an undefined value at /media/Study/evolution/paml/ex3/ex3.pl line 33, <GEN3> line 256.
 at /media/Study/evolution/paml/ex3/ex3.pl line 33

单步执行调试发现,codelml.pm模块先将在tmp目录下生成,配置文件mlc,树文件,序列文件,然后运行codeml,一番调试发现原来问题出在树文件上,bioperl生成的树文件变成了这样

(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),((("AF070995C #1",("X04752Mus #1","U07177Rat #1")#1)#1,("U95378Sus #1","U13680Hom #1")#1)#1,(X53828OG1,U28410OG2)))))

待bioperl生成配置文件,序列文件,树文件完毕后,暂停程序,从终端切换到这个tmp目录,运行codeml,得到错误如下:
sizeof(size_t) = 4 bytes
CODONML in paml version 4.4, January 2010
1 ambiguous codons are seen in the data:
 ---


      528 bytes for distance
   313296 bytes for conP
        0 bytes for fhK
  5000000 bytes for space

Seq #6 (AF070995C) is missing in the tree
原来是生成的树文件多了"(冒号)导致codeml不能识别树文件.打开vim去掉"(冒号),运行ok.
原来如此,定位到codeml.pm模块生成树文件处,488行处

487       $treeout->close();
488       close($temptreeFH);

添加如下代码

     rename $temptreefile,$temptreefile.".bak";
     open MYTREEFHOLD,$temptreefile.".bak" or die "can't open $temptreefile";
     open MYTREEFHNEW,">$temptreefile" or die "can't open $temptreefile";
     while (<MYTREEFHOLD>) {
          s/"//g;
         print MYTREEFHNEW $_;
     }
     close MYTREEFHOLD;
     close MYTREEFHNEW;

再次运行脚本 ok,结果如下:
-5985.635237
附件是运行的序列,树,配置文件,和修改后的codeml.pm
文件:codelml.tar.gz
大小:13KB
下载:下载