Text S1 - University of Cape Town

Complementary coevolution between paired nucleotides

The scripts for the complementary coevolution analysis were written in Python, C and R. They test for associations between sites predicted to be base-paired and sites that show significant degrees of complementary coevolution.

System and software requirements:

- This analysis requires a computer cluster and MPI libraries

- Python () must be installed

- The HYPHY package should be installed on the system

Step 1: Preparing the input files

Source:

1) Detecting recombination:

Use RDP4 (available: ) to detect recombination within your sequence alignment. After the analysis completes, choose the option “Save distributed alignment (with recombinant regions separated)”. This removes the recombinant regions from the sequences in which they were detected and places them at the bottom of the alignment as separate sequences.

2) Renaming sequences:

Place the “Editing_Seq_Names.py” script in a folder together with all the distributed alignments obtained in (1) and run the script. It renames all sequences in each alignment and renames the alignment files by replacing “.fas” with “E.fas”. This prevents sequence names from containing special characters that would cause PhyML and HYPHY to crash.
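The renaming step above can be sketched as follows. This is a minimal illustration, not the contents of “Editing_Seq_Names.py”: the naming scheme (an index prefix followed by the original name with every non-alphanumeric character replaced by an underscore) and the function names sanitize_name, rename_fasta and rename_all are assumptions made for the example.

```python
import re
from pathlib import Path


def sanitize_name(name, index):
    """Return a safe name: an index prefix plus letters, digits and underscores only.

    The "S<index>_" prefix is an assumed scheme that keeps names unique.
    """
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
    return f"S{index}_{cleaned}"


def rename_fasta(path):
    """Write a copy of a FASTA file with sanitized names, replacing '.fas' with 'E.fas'."""
    path = Path(path)
    out_path = path.with_name(path.stem + "E.fas")
    index = 0
    with path.open() as src, out_path.open("w") as dst:
        for line in src:
            if line.startswith(">"):
                index += 1
                dst.write(">" + sanitize_name(line[1:], index) + "\n")
            else:
                dst.write(line)
    return out_path


def rename_all(folder="."):
    """Rename every '.fas' alignment in a folder, skipping files that end in 'E.fas'."""
    return [
        rename_fasta(fasta)
        for fasta in sorted(Path(folder).glob("*.fas"))
        if not fasta.name.endswith("E.fas")
    ]


if __name__ == "__main__":
    for written in rename_all("."):
        print(written)
```

Note that this sketch treats any file ending in “E.fas” as already renamed, so an input file whose original name happens to end in “E” would be skipped.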

3) Generating recombination-free alignments:

For each renamed distributed alignment, edit the “Split_Alignment_Draw_ML_Trees.py” script to specify the name of the distributed alignment, the number of sequences in the original alignment (before recombination detection) and the length of the alignment. Then run the script, which splits the alignment into recombination-free sub-alignments (in PHYLIP format) and draws a maximum likelihood (ML) tree for each sub-alignment.

N.B. Process one distributed alignment at a time, and set aside the generated sub-alignments and trees after each run.
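The splitting step can be illustrated with a short sketch. It is not the logic of “Split_Alignment_Draw_ML_Trees.py”: how that script locates the boundaries of recombinant regions is not described here, so the breakpoint positions below are hypothetical inputs, and the function names split_alignment and to_phylip are assumptions. Tree inference is left out.

```python
def split_alignment(seqs, breakpoints):
    """Split an alignment (dict of name -> aligned sequence) at 0-based column breakpoints.

    Breakpoints outside the open interval (0, length) are ignored.
    """
    length = len(next(iter(seqs.values())))
    if any(len(s) != length for s in seqs.values()):
        raise ValueError("all sequences must have the same aligned length")
    edges = [0] + sorted(b for b in set(breakpoints) if 0 < b < length) + [length]
    return [
        {name: seq[start:end] for name, seq in seqs.items()}
        for start, end in zip(edges, edges[1:])
    ]


def to_phylip(seqs):
    """Format an alignment as sequential PHYLIP text (relaxed: names of any length)."""
    n_sites = len(next(iter(seqs.values())))
    lines = [f"{len(seqs)} {n_sites}"]
    lines += [f"{name} {seq}" for name, seq in seqs.items()]
    return "\n".join(lines) + "\n"


# Toy alignment with one assumed breakpoint at column 3.
alignment = {"S1_a": "ACGTACGT", "S2_b": "ACGAACGA", "S3_c": "ACGTTCGT"}
for number, sub in enumerate(split_alignment(alignment, [3]), start=1):
    print(f"sub-alignment {number}")
    print(to_phylip(sub), end="")
```

Each sub-alignment keeps every sequence name, so the PHYLIP files produced from one distributed alignment remain directly comparable.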

Step 2: Run the coevolution script in HYPHY

Source:

Here, the sub-alignments and ML trees obtained from each distributed alignment are used separately.

1) Place these sub-alignments and their corresponding ML trees in the directory that contains “Coevolution_script.c”, then run the “mk_submission_sh.py” script. It generates indexed Python scripts (Submission{x}.py) and a shell script, “array.sh”, which is used to run all of the Submission{x}.py scripts as an array.
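A rough sketch of how such indexed submission scripts could be generated is given below. It does not reproduce “mk_submission_sh.py”: the HYPHY command line is not given in this text, so COMMAND_TEMPLATE is a hypothetical placeholder (an echo command) that must be replaced with the real invocation, and the generated “array.sh” simply takes the job index as its first argument instead of using the directives of any particular cluster scheduler.

```python
from pathlib import Path

# Hypothetical placeholder: substitute the real HYPHY invocation for your system.
COMMAND_TEMPLATE = ["echo", "run Coevolution_script.c on", "{alignment}", "{tree}"]

SUBMISSION_TEMPLATE = '''\
import subprocess

# Generated job {index}: one sub-alignment and its ML tree.
subprocess.run({command!r}, check=True)
'''

ARRAY_TEMPLATE = '''\
#!/bin/sh
# Run one indexed submission script; the index is passed as the first argument.
# A cluster scheduler would normally supply it (scheduler directives omitted).
python "Submission$1.py"
'''


def make_submissions(pairs, out_dir="."):
    """Write Submission{x}.py for each (alignment, tree) pair, plus array.sh."""
    out_dir = Path(out_dir)
    written = []
    for index, (alignment, tree) in enumerate(pairs, start=1):
        # Fill the placeholder command with this job's input files.
        command = [
            part.format(alignment=alignment, tree=tree) for part in COMMAND_TEMPLATE
        ]
        script = out_dir / f"Submission{index}.py"
        script.write_text(SUBMISSION_TEMPLATE.format(index=index, command=command))
        written.append(script)
    (out_dir / "array.sh").write_text(ARRAY_TEMPLATE)
    return written
```

Calling make_submissions with a list of (sub-alignment, tree) file-name pairs writes one Submission{x}.py per pair, numbered from 1, so each index in the array corresponds to exactly one sub-alignment and its tree.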

2) Run the shell script “array.sh” to submit all of the generated submission scripts. For each sub-alignment and its corresponding tree, “Coevolution_script.c” generates output files containing p-values and λ (λ>1 indicates a tendency toward complementary coevolution while λ ................