Music Generation for Visual Art through Emotion

Figure: Model architecture outline.

by Xiaodong Tan and Mathis Antony.

Abstract

We explore methods of generating music from images, using emotion as the bridge between the visual and auditory domains. Our primary goal is to express visual art in the auditory domain: the resulting music can enrich visual art, or serve as a form of translation that makes visual art enjoyable without reliance on the visual system. We use pre-trained image representations and explore two music modelling methods, based on RNN and Transformer architectures, to build models that generate music given an image as input. To evaluate these methods, we conduct preliminary human and machine evaluations. The results suggest that both music generators can produce music with an emotional connection to the input image.
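The sketch below shows how such a conditional generator might be wired up, assuming PyTorch: a frozen torchvision ResNet-18 stands in for the pre-trained image representation, and an LSTM over music event tokens stands in for the RNN-based generator. All class names, dimensions, and the token vocabulary are illustrative assumptions rather than the authors' implementation; in particular, the paper's explicit emotion bridge is collapsed into the image embedding here.

```python
# Illustrative sketch of an image-conditioned music generator.
# All names and sizes are assumptions, not the authors' actual code.
import torch
import torch.nn as nn
import torchvision.models as models


class ImageConditionedMusicRNN(nn.Module):
    """Generate music event tokens conditioned on a pre-trained image embedding."""

    def __init__(self, vocab_size=512, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Frozen pre-trained image representation, as in the abstract.
        resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.image_encoder = nn.Sequential(*list(resnet.children())[:-1])
        for p in self.image_encoder.parameters():
            p.requires_grad = False
        # Project the 512-d ResNet feature to the RNN's initial hidden state.
        self.image_proj = nn.Linear(512, hidden_dim)
        self.token_embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image, music_tokens):
        # image: (B, 3, H, W); music_tokens: (B, T) integer event tokens
        feat = self.image_encoder(image).flatten(1)          # (B, 512)
        h0 = torch.tanh(self.image_proj(feat)).unsqueeze(0)  # (1, B, hidden)
        c0 = torch.zeros_like(h0)
        x = self.token_embed(music_tokens)                   # (B, T, embed)
        out, _ = self.rnn(x, (h0, c0))
        return self.head(out)                                # next-token logits


# Usage: one teacher-forced step on a dummy (image, music-token) batch.
model = ImageConditionedMusicRNN()
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 512, (2, 64)))
```

At inference time one would sample tokens autoregressively from the logits and decode them to MIDI; the Transformer variant mentioned in the abstract would replace the LSTM with a decoder that attends to the same image embedding.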

The source code and assets for the project can be found on GitHub.

This work was published at the International Conference on Computational Creativity (ICCC) 2020. The paper can be downloaded via the link below.

Links

https://github.com/sudongtan/synesthesia

http://computationalcreativity.net/iccc20/papers/137-iccc20.pdf