Making and Evaluating Multiple Sequence Alignments
BAliBASE is a database of manually refined multiple sequence alignments (MSAs) used as a benchmark for evaluating multiple sequence alignment algorithms. In this exercise we will use MSAs from BAliBASE to evaluate the performance of several alignment programs.
1. Select a protein sequence file from the folder on the “compartir” disk. Note that the sequences are unaligned and are in FASTA format.
2. Make a ClustalW alignment. Navigate to the ClustalW server at EBI (http://www.ebi.ac.uk/Tools/msa/clustalw2/). Copy and paste your protein sequences into the box and click the Submit button. When the program has finished, you will see the aligned sequences. Click the ‘Download Alignment File’ button to save your aligned sequences to the computer.
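3. Make a multiple sequence alignment with Muscle (http://www.ebi.ac.uk/Tools/muscle/index.html). Submit the same unaligned sequences that you used in step 2. For the Output Format parameter, select ‘ClustalW (strict)’. Save the resulting alignment to the computer as you did in step 2.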
4. Perform a T-COFFEE CORE score evaluation of your ClustalW and Muscle alignments. Go to the T-COFFEE homepage (http://tcoffee.vital-it.ch/cgi-bin/Tcoffee/tcoffee_cgi/index.cgi). Click on the Regular CORE Evaluation button. Copy and paste one set of aligned sequences into the form and click Submit. When the evaluation is finished, click the score_html link to view the evaluation. Note the overall score for this alignment (SCORE=). Repeat the evaluation for the other alignment.
5. Import the sequence file from step 1 into Geneious. Align the sequences using the Geneious alignment method. Export the alignment from Geneious in Clustal format and then perform a T-COFFEE CORE score evaluation as you did in step 4.
6. You should now have three T-COFFEE CORE scores, one for each alignment program (ClustalW, Muscle, and Geneious). Which program produced the highest score?
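
Optional check: before pasting an alignment into the T-COFFEE form, it can help to confirm that the saved Clustal file still contains the same sequences as the FASTA file from step 1. The Python sketch below is not part of the exercise and uses only the standard library. The function names (parse_fasta, parse_clustal, check_alignment) and the two toy sequences are made up for illustration. It assumes that the first non-blank line of the alignment file is the program header, that conservation lines start with a space, and that sequence names are written identically in both files (some programs shorten or alter names, in which case the name comparison would need adjusting).

```python
def parse_fasta(text):
    """Return {name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word after '>' as the sequence name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def parse_clustal(text):
    """Return {name: aligned sequence} from Clustal-formatted text."""
    aligned = {}
    header_seen = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if not header_seen:
            # Assumption: the first non-blank line is the program header.
            header_seen = True
            continue
        if line[0].isspace():
            # Conservation line ('*', ':', '.'), not a sequence row.
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        # parts[0] is the name, parts[1] the aligned chunk; any trailing
        # residue count in parts[2] is ignored.
        aligned[parts[0]] = aligned.get(parts[0], "") + parts[1]
    return aligned


def check_alignment(unaligned, aligned):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if len({len(seq) for seq in aligned.values()}) > 1:
        problems.append("aligned rows differ in length")
    for name, seq in unaligned.items():
        if name not in aligned:
            problems.append(f"{name}: missing from alignment")
        elif aligned[name].replace("-", "").upper() != seq.upper():
            problems.append(f"{name}: residues differ from the input")
    return problems


# Toy data for illustration only (not taken from BAliBASE).
example_fasta = ">seqA\nMKTAYIAK\n>seqB\nMKAYIK\n"
example_clustal = (
    "CLUSTAL W (1.83) multiple sequence alignment\n"
    "\n"
    "seqA      MKTAYIAK\n"
    "seqB      MK-AYI-K\n"
    "          ** *** *\n"
)

print(check_alignment(parse_fasta(example_fasta), parse_clustal(example_clustal)))
```

To use this on your own files, read each file with open(path).read() and pass the text to the two parsers. A non-empty result suggests that the alignment was truncated or altered when it was saved or exported, so it should be regenerated before it is scored.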