A Self-Consistent Sonification Method to Translate Amino Acid Sequences into Musical Compositions and Application in Protein Design Using Artificial Intelligence

Chi Hua Yu; Zhao Qin; Francisco J. Martin-Martinez; Markus J. Buehler

doi:10.1021/acsnano.9b02180

A Self-Consistent Sonification Method to Translate Amino Acid Sequences into Musical Compositions and Application in Protein Design Using Artificial Intelligence

Chi Hua Yu, Zhao Qin, Francisco J. Martin-Martinez, Markus J. Buehler^*

^*Corresponding author for this work

Chemistry

Massachusetts Institute of Technology

Research output: Contribution to journal › Article › peer-review

88 Citations (Scopus)

Abstract

We report a self-consistent method to translate amino acid sequences into audible sound, use the representation in the musical space to train a neural network, and then apply it to generate protein designs using artificial intelligence (AI). The sonification method proposed here uses the normal mode vibrations of the amino acid building blocks of proteins to compute an audible representation of each of the 20 natural amino acids, which is fully defined by the overlay of its respective natural vibrations. The vibrational frequencies are transposed to the audible spectrum following the musical concept of transpositional equivalence, playing or writing music in a way that makes it sound higher or lower in pitch while retaining the relationships between tones or chords played. This transposition method ensures that the relative values of the vibrational frequencies within each amino acid and among different amino acids are retained. The characteristic frequency spectrum and sound associated with each of the amino acids represents a type of musical scale that consists of 20 tones, the “amino acid scale”. To create a playable instrument, each tone associated with the amino acids is assigned to a specific key on a piano roll, which allows us to map the sequence of amino acids in proteins into a musical score. To reflect higher-order structural details of proteins, the volume and duration of the notes associated with each amino acid are defined by the secondary structure of proteins, computed using DSSP and thereby introducing musical rhythm. We then train a recurrent neural network based on a large set of musical scores generated by this sonification method and use AI to generate musical compositions, capturing the innate relationships between amino acid sequence and protein structure. We then translate the de novo musical data generated by AI into protein sequences, thereby obtaining de novo protein designs that feature specific design characteristics. We illustrate the approach in several examples that reflect the sonification of protein sequences, including multihour audible representations of natural proteins and protein-based musical compositions solely generated by AI. The approach proposed here may provide an avenue for understanding sequence patterns, variations, and mutations and offers an outreach mechanism to explain the significance of protein sequences. The method may also offer insight into protein folding and understanding the context of the amino acid sequence in defining the secondary and higher-order folded structure of proteins and could hence be used to detect the effects of mutations through sound.

Original language	English
Pages (from-to)	7471-7482
Number of pages	12
Journal	ACS Nano
Volume	13
Issue number	7
DOIs	https://doi.org/10.1021/acsnano.9b02180
Publication status	Published - 23 Jul 2019

Keywords

artificial intelligence
molecular mechanics
protein
recurrent neural networks
sonification
structural analysis

Access to Document

10.1021/acsnano.9b02180

Cite this

@article{61038fccec6a4f418b9d3af4785e70b4,

title = "A Self-Consistent Sonification Method to Translate Amino Acid Sequences into Musical Compositions and Application in Protein Design Using Artificial Intelligence",

abstract = "We report a self-consistent method to translate amino acid sequences into audible sound, use the representation in the musical space to train a neural network, and then apply it to generate protein designs using artificial intelligence (AI). The sonification method proposed here uses the normal mode vibrations of the amino acid building blocks of proteins to compute an audible representation of each of the 20 natural amino acids, which is fully defined by the overlay of its respective natural vibrations. The vibrational frequencies are transposed to the audible spectrum following the musical concept of transpositional equivalence, playing or writing music in a way that makes it sound higher or lower in pitch while retaining the relationships between tones or chords played. This transposition method ensures that the relative values of the vibrational frequencies within each amino acid and among different amino acids are retained. The characteristic frequency spectrum and sound associated with each of the amino acids represents a type of musical scale that consists of 20 tones, the “amino acid scale”. To create a playable instrument, each tone associated with the amino acids is assigned to a specific key on a piano roll, which allows us to map the sequence of amino acids in proteins into a musical score. To reflect higher-order structural details of proteins, the volume and duration of the notes associated with each amino acid are defined by the secondary structure of proteins, computed using DSSP and thereby introducing musical rhythm. We then train a recurrent neural network based on a large set of musical scores generated by this sonification method and use AI to generate musical compositions, capturing the innate relationships between amino acid sequence and protein structure. We then translate the de novo musical data generated by AI into protein sequences, thereby obtaining de novo protein designs that feature specific design characteristics. We illustrate the approach in several examples that reflect the sonification of protein sequences, including multihour audible representations of natural proteins and protein-based musical compositions solely generated by AI. The approach proposed here may provide an avenue for understanding sequence patterns, variations, and mutations and offers an outreach mechanism to explain the significance of protein sequences. The method may also offer insight into protein folding and understanding the context of the amino acid sequence in defining the secondary and higher-order folded structure of proteins and could hence be used to detect the effects of mutations through sound.",

keywords = "artificial intelligence, molecular mechanics, protein, recurrent neural networks, sonification, structural analysis",

author = "Yu, {Chi Hua} and Zhao Qin and Martin-Martinez, {Francisco J.} and Buehler, {Markus J.}",

note = "Publisher Copyright: {\textcopyright} 2019 American Chemical Society.",

year = "2019",

month = jul,

day = "23",

doi = "10.1021/acsnano.9b02180",

language = "English",

volume = "13",

pages = "7471--7482",

journal = "ACS Nano",

issn = "1936-0851",

publisher = "American Chemical Society",

number = "7",

}

TY - JOUR

T1 - A Self-Consistent Sonification Method to Translate Amino Acid Sequences into Musical Compositions and Application in Protein Design Using Artificial Intelligence

AU - Yu, Chi Hua

AU - Qin, Zhao

AU - Martin-Martinez, Francisco J.

AU - Buehler, Markus J.

PY - 2019/7/23

Y1 - 2019/7/23

N2 - We report a self-consistent method to translate amino acid sequences into audible sound, use the representation in the musical space to train a neural network, and then apply it to generate protein designs using artificial intelligence (AI). The sonification method proposed here uses the normal mode vibrations of the amino acid building blocks of proteins to compute an audible representation of each of the 20 natural amino acids, which is fully defined by the overlay of its respective natural vibrations. The vibrational frequencies are transposed to the audible spectrum following the musical concept of transpositional equivalence, playing or writing music in a way that makes it sound higher or lower in pitch while retaining the relationships between tones or chords played. This transposition method ensures that the relative values of the vibrational frequencies within each amino acid and among different amino acids are retained. The characteristic frequency spectrum and sound associated with each of the amino acids represents a type of musical scale that consists of 20 tones, the “amino acid scale”. To create a playable instrument, each tone associated with the amino acids is assigned to a specific key on a piano roll, which allows us to map the sequence of amino acids in proteins into a musical score. To reflect higher-order structural details of proteins, the volume and duration of the notes associated with each amino acid are defined by the secondary structure of proteins, computed using DSSP and thereby introducing musical rhythm. We then train a recurrent neural network based on a large set of musical scores generated by this sonification method and use AI to generate musical compositions, capturing the innate relationships between amino acid sequence and protein structure. We then translate the de novo musical data generated by AI into protein sequences, thereby obtaining de novo protein designs that feature specific design characteristics. We illustrate the approach in several examples that reflect the sonification of protein sequences, including multihour audible representations of natural proteins and protein-based musical compositions solely generated by AI. The approach proposed here may provide an avenue for understanding sequence patterns, variations, and mutations and offers an outreach mechanism to explain the significance of protein sequences. The method may also offer insight into protein folding and understanding the context of the amino acid sequence in defining the secondary and higher-order folded structure of proteins and could hence be used to detect the effects of mutations through sound.

AB - We report a self-consistent method to translate amino acid sequences into audible sound, use the representation in the musical space to train a neural network, and then apply it to generate protein designs using artificial intelligence (AI). The sonification method proposed here uses the normal mode vibrations of the amino acid building blocks of proteins to compute an audible representation of each of the 20 natural amino acids, which is fully defined by the overlay of its respective natural vibrations. The vibrational frequencies are transposed to the audible spectrum following the musical concept of transpositional equivalence, playing or writing music in a way that makes it sound higher or lower in pitch while retaining the relationships between tones or chords played. This transposition method ensures that the relative values of the vibrational frequencies within each amino acid and among different amino acids are retained. The characteristic frequency spectrum and sound associated with each of the amino acids represents a type of musical scale that consists of 20 tones, the “amino acid scale”. To create a playable instrument, each tone associated with the amino acids is assigned to a specific key on a piano roll, which allows us to map the sequence of amino acids in proteins into a musical score. To reflect higher-order structural details of proteins, the volume and duration of the notes associated with each amino acid are defined by the secondary structure of proteins, computed using DSSP and thereby introducing musical rhythm. We then train a recurrent neural network based on a large set of musical scores generated by this sonification method and use AI to generate musical compositions, capturing the innate relationships between amino acid sequence and protein structure. We then translate the de novo musical data generated by AI into protein sequences, thereby obtaining de novo protein designs that feature specific design characteristics. We illustrate the approach in several examples that reflect the sonification of protein sequences, including multihour audible representations of natural proteins and protein-based musical compositions solely generated by AI. The approach proposed here may provide an avenue for understanding sequence patterns, variations, and mutations and offers an outreach mechanism to explain the significance of protein sequences. The method may also offer insight into protein folding and understanding the context of the amino acid sequence in defining the secondary and higher-order folded structure of proteins and could hence be used to detect the effects of mutations through sound.

KW - artificial intelligence

KW - molecular mechanics

KW - protein

KW - recurrent neural networks

KW - sonification

KW - structural analysis

UR - http://www.scopus.com/inward/record.url?scp=85068460153&partnerID=8YFLogxK

U2 - 10.1021/acsnano.9b02180

DO - 10.1021/acsnano.9b02180

M3 - Article

C2 - 31240912

AN - SCOPUS:85068460153

SN - 1936-0851

VL - 13

SP - 7471

EP - 7482

JO - ACS Nano

JF - ACS Nano

IS - 7

ER -

A Self-Consistent Sonification Method to Translate Amino Acid Sequences into Musical Compositions and Application in Protein Design Using Artificial Intelligence

Abstract

Keywords

Access to Document

Other files and links

Fingerprint

Cite this