Facile, rapide et gratuit

Créez votre site maintenant

Je crée mon site
Site gratuit créé sur
Lorem Ipsum

ABC_GWH

A cookbook to study Genome Wide Heterogeneity in introgression rates

Utensils

The following archive contains the three scripts I used in the Ciona instestinalis study.

 


ABC tools
760 ko
ABC_tools.zip

Priorgen

Developped by Jeffrey Ross-Ibarra in C++, and somewhat modified to investigate GWH.

This script requires two C++ libraries Boost and GSL.

You can compile it by typing in the priorgen folder something like :

g++ -o priorgen priorgen.cc -I /home/roux/Programmes/boost_1_47_0/ -L /home/roux/Programmes/boost_1_47_0/libs -lboost_program_options-mt -lgsl -lgslcblas -lm -static -O3  -I /home/roux/Programmes/gsl-1.15/

 

I know, it's frustrating, I had the same feeling the first time. You just have to adapt this line according the username of your session (mine is roux), according to the path to Boost.

If it doesn't work, you can try to use the compiled file I provided by :

chmod +x priorgen

./priorgen --help or ./prior --version

If the released priorgen works, please create a symbolic link into your bin :

sudo ln -s WholeCurrentPathToPriorgen /usr/bin/priorgen

 

In the archive you will find a folder named example_files. To run priorgen, for example, you can use the following command line :

priorgen --arg-file argfile_halleri_lyrata.txt --bpfile bpfile_halleri_lyrata.txt --seed 143 --prior parameters.txt

 

Please, play with the different arguments in the argfile :

model = iso, sym, allo or island

mig = homo or hetero

dir = sym or asym

 

The created parameters.txt contains multilocus parameters while the standard output will be read by msnsam

Msnsam

The lifesaving version by Ross-Ibarra of the lifesaving program MS by Hudson.

To compile it, just write :

./clms

You can simply test it :

./msnsam 10 1 -t 5


If it works :

sudo ln -s WholeCurrentPathToMsnsam /usr/bin/msnsam

priorgen | msnsam

If you have selected the 'iso' model in argfile:

priorgen --arg-file argfile_halleri_lyrata.txt --bpfile bpfile_halleri_lyrata.txt --seed 143 --prior parameters.txt | msnsam tbs 200 -t tbs -r tbs tbs -I 2 tbs tbs 0 -m 1 2 tbs -m 2 1 tbs -n 1 tbs -n 2 tbs -ej tbs 2 1 -eN tbs tbs

 

If you have selected the 'island' model in argfile:

priorgen --arg-file argfile_halleri_lyrata.txt --bpfile bpfile_halleri_lyrata.txt --seed 143 --prior parameters.txt | msnsam tbs 200 -t tbs -r tbs tbs -I 2 tbs tbs 0 -m 1 2 tbs -m 2 1 tbs -n 1 tbs -n 2 tbs -ej tbs 2 1 -eN tbs tbs


If you have selected the 'sym' model in argfile:

priorgen --arg-file argfile_halleri_lyrata.txt --bpfile bpfile_halleri_lyrata.txt --seed 143 --prior parameters.txt | msnsam tbs 200 -t tbs -r tbs tbs -I 2 tbs tbs 0 -m 1 2 0 -m 2 1 0 -n 1 tbs -n 2 tbs -ema tbs 2 0 tbs tbs 0 -ej tbs 2 1 -eN tbs tbs


If you have selected the 'allo' model in argfile:

priorgen --arg-file argfile_halleri_lyrata.txt --bpfile bpfile_halleri_lyrata.txt --seed 143 --prior parameters.txt | msnsam tbs 200 -t tbs -r tbs tbs -I 2 tbs tbs 0 -m 1 2 tbs -m 2 1 tbs -n 1 tbs -n 2 tbs -eM tbs 0 -ej tbs 2 1 -eN tbs tbs

 

Why 200 ? Because 10 multilocus simulations of 20 loci.


mscalc

To compile it, just write :

gcc *.c -o mscalc -lm

 

Please, don't pay any attention to the warning messages :

spinout.c: In function ‘get_initial_conditions_dynamics’:
spinout.c:448: warning: format ‘%ld’ expects type ‘long int’, but argument 4 has type ‘int’
spinout.c:547: warning: format ‘%ld’ expects type ‘long int’, but argument 4 has type ‘int’

 

If you succesfully compiled mscalc :

sudo ln -s WholeCurrentPathToMscalc /usr/bin/mscalc

 

It requires two input files : spinput.txt + the output from ms/msnsam

 

The released version compute summary statistics without orientate mutations by using an outgroup. If you want to take into account an outgroup, just send me an e-mail

The whole stuff

mknod myfifo p

mscalc <myfifo &

./priorgen --arg-file argfile_halleri_lyrata.txt --bpfile bpfile_halleri_lyrata.txt --seed 143 --prior parameters.txt | msnsam tbs 200 -t tbs -r tbs tbs -I 2 tbs tbs 0 -m 1 2 tbs -m 2 1 tbs -n 1 tbs -n 2 tbs -ej tbs 2 1 -eN tbs tbs >myfifo &

 

The output containing the summary statistics is ABCstat.txt

Reminder

If you change the number of multilocus simulations, change the argfile, the spinput.txt and the second argument of msnsam.


Personnally, I run on a cluster of CPUs hundreds of jobs of 100,000 multilocus iterations. So, If you plan to do the same thing, don't forget to change the seed value of priorgen among different jobs  ;)

 

The R package 'abc' is essential for the rest of the analyses [link]