Protein Multiple Alignments: Sequence-based vs Structure-based Programs

Mathilde Carpentier (MNHN)
Thursday, January 30, 2020 - 10:30 to 12:00
Room Aurigny
Talk abstract: 

Motivation: Multiple sequence alignment programs have proved to be very useful and have already been evaluated in the literature, but alignment programs based on structure, or on both sequence and structure, have not. In the present article we wish to evaluate the added value provided by considering structures.

Results: We compared the multiple alignments produced by 25 programs, based on sequence, structure, or both, to reference alignments deposited in five databases (BALIBASE 2 and 3, HOMSTRAD, OXBENCH and SISYPHUS). On the whole, the structure-based methods compute more reliable alignments than the sequence-based ones, and even than the sequence+structure-based programs, regardless of the database. Two programs lead, MAMMOTH and MATRAS; nevertheless, MUSTANG, MATT, 3DCOMB, TCOFFEE+TM ALIGN and TCOFFEE+SAP perform better for some alignments. The advantage of structure-based methods increases at low levels of sequence identity, and for residues that are buried or located in regular secondary structures. Concerning gap management, sequence-based programs introduce fewer gaps than structure-based programs. Concerning the databases, the alignments of the manually built databases are more challenging for the programs.