BMC Bioinformatics 2021

This site accompanies the publication "Using sound to understand protein sequence data: new sonification algorithms for protein sequences and multiple sequence alignments" by Martin et al. (2021) in BMC Bioinformatics.

The paper presents five algorithms for the sonification of protein sequence and multiple sequence alignment (MSA) data. Algorithms I, II, and III sonify protein sequences. Algorithms IV and V sonify multiple sequence alignments.

Some details of the algorithms are included here; for a full explanation of the sonifications, please see the paper.

Code and documentation are available from GitHub.


We used a questionnaire to research the effectiveness of these methods. We set tasks for our participants to complete using two of the sonification algorithms (Algorithms I and IV). You are welcome to access this questionnaire via the link above and try the tasks; it should take about 15 minutes to complete. Download the PDF for clickable links to sound examples. Please note that we are not collecting responses.

Protein Examples

Major Prion Protein - Homo sapiens (Human)

>sp|P04156|PRIO_HUMAN Major prion protein OS=Homo sapiens OX=9606 GN=PRNP PE=1 SV=1

The human major prion protein example demonstrates how effective sonification is at identifying amino acid repeats (AARs).

Transmembrane protein 14C - Homo sapiens (Human)

This example applies the protein algorithms to a transmembrane protein. Information about the protein can be found on the UniProt website here:

>sp|Q9P0S9|TM14C_HUMAN Transmembrane protein 14C OS=Homo sapiens OX=9606 GN=TMEM14C PE=1 SV=1

Algorithm I

Algorithm II

Algorithm III


Insulin (globular protein)

>sp|P01308|INS_HUMAN Insulin OS=Homo sapiens OX=9606 GN=INS PE=1 SV=1


Histone (Intrinsically Disordered Protein)

>sp|P62805|H4_HUMAN Histone H4 OS=Homo sapiens OX=9606 GN=H4C1 PE=1 SV=2
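
The header lines shown for the protein examples follow the UniProt FASTA header layout (database, accession, entry name, protein name, then OS, OX, GN, PE and SV fields). As a rough illustration, and not part of the published code, the sketch below splits such a header into named fields. The function name parse_uniprot_header and the dictionary keys are assumptions chosen for this example.

```python
import re
from typing import Dict

# Pattern for UniProt-style FASTA headers such as the ones listed above.
# The group names (db, accession, ...) are illustrative choices.
HEADER_RE = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein_name>.+?)\s+OS=(?P<organism>.+?)\s+OX=(?P<taxon_id>\d+)"
    r"(?:\s+GN=(?P<gene>\S+))?\s+PE=(?P<existence>\d)\s+SV=(?P<version>\d+)\s*$"
)


def parse_uniprot_header(header: str) -> Dict[str, str]:
    """Split a UniProt FASTA header line into its named fields.

    Hypothetical helper for illustration; raises ValueError when the
    line does not match the expected layout.
    """
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError("not a UniProt-style FASTA header: " + header)
    # The GN field is optional, so drop any group that did not match.
    return {key: value for key, value in match.groupdict().items() if value is not None}


fields = parse_uniprot_header(
    ">sp|P04156|PRIO_HUMAN Major prion protein OS=Homo sapiens OX=9606 GN=PRNP PE=1 SV=1"
)
print(fields["accession"], fields["gene"], fields["organism"])
```

The accession (for example P04156) is the part that identifies the entry on the UniProt website.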


Multiple Sequence Alignment Examples

For each of the examples below, both a gappy and a compact MSA are given for comparison. The two differ in the gap-open setting used to build the alignment: gappy MSAs were generated using MUSCLE 3.8.31 (-gapopen -3), and compact MSAs were generated using MUSCLE 3.8.31 (-gapopen 1). Each pair of gappy and compact MSAs was built from the same unaligned sequences.
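
As a minimal sketch of how two such runs might be scripted, the snippet below builds one MUSCLE 3.8.31 command line per setting and includes a small helper for comparing how gappy the results are. The file names (unaligned.fasta, gappy.afa, compact.afa) and both function names are assumptions for illustration; only the -gapopen values come from the text above, and the exact commands used for the paper may have included other options.

```python
from typing import List


def muscle_command(in_fasta: str, out_fasta: str, gapopen: float) -> List[str]:
    """Build a MUSCLE v3-style argument list (hypothetical helper).

    Uses the -in, -out and -gapopen options of MUSCLE 3.8.31.
    """
    return ["muscle", "-in", in_fasta, "-out", out_fasta, "-gapopen", str(gapopen)]


def gap_fraction(aligned: List[str]) -> float:
    """Return the fraction of alignment cells that are gap characters ('-')."""
    total = sum(len(row) for row in aligned)
    if total == 0:
        return 0.0
    return sum(row.count("-") for row in aligned) / total


# Same unaligned input for both runs; only the gap-open setting differs.
gappy_cmd = muscle_command("unaligned.fasta", "gappy.afa", -3)
compact_cmd = muscle_command("unaligned.fasta", "compact.afa", 1)

print(" ".join(gappy_cmd))
print(" ".join(compact_cmd))
```

With MUSCLE 3.8.31 installed and on the PATH, each list could be passed to subprocess.run(cmd, check=True). Reading the aligned rows back and calling gap_fraction on them gives a single number for comparing the two alignments.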


Compact Visualisation (AliView)

Gappy Visualisation (AliView)

Algorithm IV


Algorithm V


Compact Visualisation (AliView)

Gappy Visualisation (AliView)

Algorithm IV


Algorithm V