AugmentedNet: A Convolutional Recurrent Neural Network for Automatic Roman Numeral Analysis with Improved Data Augmentation

Figure 1. AugmentedNet. Diagram taken from the paper.

Abstract

One of the most common ways to analyze a piece of tonal music is through Roman numeral analysis. In this paper, we present the AugmentedNet, a convolutional recurrent neural network that improves the automatic prediction of Roman numeral analysis labels. The network is characterized by a novel representation of pitch spelling, a separation of bass and chroma inputs into independent convolutional blocks, and the layout of the convolutional layers in each block (see Figure 1). The network is enhanced by a greater number of tonal tasks to solve simultaneously and synthetic training examples for data augmentation. The additional tonal tasks (bottom-right side of Figure 1) strengthen the shared representation learned through multitask learning. The synthetic training examples consist of “new” scores, which are artificially generated from the chord annotations and texturized with simple patterns, such as an Alberti bass figure.
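
To make two ideas from the abstract concrete, the sketch below texturizes chord annotations with an Alberti bass figure (lowest, highest, middle, highest) and encodes a chord as separate bass and chroma vectors. It is a minimal illustration under stated assumptions, not the paper's implementation: every function name is hypothetical, chords are assumed to be three notes listed from low to high, and plain 12-dimensional pitch-class vectors stand in for the paper's pitch-spelling representation, which is not reproduced here.

```python
# Illustrative sketch only: every name below is hypothetical and the encoding
# is a simplified stand-in, not the representation used by AugmentedNet.

NATURAL_PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTAL_SHIFTS = {"#": 1, "b": -1}


def pitch_class(note_name):
    """Map a spelled note such as 'F#' or 'Bb' to its pitch class (0-11)."""
    letter, accidentals = note_name[0], note_name[1:]
    shift = sum(ACCIDENTAL_SHIFTS[a] for a in accidentals)
    return (NATURAL_PITCH_CLASSES[letter] + shift) % 12


def alberti_bass(chord_notes):
    """Order a three-note chord (given low to high) as low, high, middle, high."""
    if len(chord_notes) != 3:
        raise ValueError("this sketch only texturizes three-note chords")
    low, middle, high = chord_notes
    return [low, high, middle, high]


def texturize(chord_annotations):
    """Expand each annotated chord into an Alberti figure, one after another."""
    notes = []
    for chord_notes in chord_annotations:
        notes.extend(alberti_bass(chord_notes))
    return notes


def bass_and_chroma(chord_notes):
    """Encode a chord as two separate vectors: one-hot bass and multi-hot chroma."""
    bass = [0] * 12
    chroma = [0] * 12
    bass[pitch_class(chord_notes[0])] = 1  # lowest note only
    for note in chord_notes:
        chroma[pitch_class(note)] = 1  # every sounding pitch class
    return bass, chroma


if __name__ == "__main__":
    progression = [["C", "E", "G"], ["B", "D", "G"]]  # I, then V6, in C major
    print(texturize(progression))
    print(bass_and_chroma(progression[1]))
```

Keeping the bass and chroma vectors apart mirrors the abstract's point that the two inputs feed independent convolutional blocks. In a full model, each vector sequence would go to its own block before the shared layers.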

Publication
In Proceedings of the 14th International Workshop on Machine Learning and Music
Néstor Nápoles López
PhD in Music Technology

I earned my PhD researching deep learning models for automatic Roman numeral analysis. Nowadays, I am one of the developers of the Sibelius music notation software.