Master 2015-2016
Internships of the SAR specialization
Learning at multiple time granularities for automatic musical generation

Location : Ircam
Supervisors : Philippe Esling, Jérôme Nika
Dates : from 08/02/16 to 15/07/16
Remuneration : internship stipend at the rate in force
Keywords : ATIAM track : Computer music


- Abstract : In many applications, time series can be scrutinized at variable time scales, which requires investigating motif mining and knowledge extraction at multiple temporal granularities. These questions are particularly critical in musical contexts, where any time instant is defined by, and depends on, multiple temporal contexts, spanning from a few notes to the global structure of a piece. This requires embedding notions of different memory spans into current models of musical creativity. Furthermore, models should be flexible in their generation in order to produce interesting and non-deterministic outputs. These three aspects, namely memory, temporal representations and generativity, will be investigated through the prism of manifold and variational learning, by trying to build bridges between two previously developed models. On the one hand, a generative graphical probabilistic model, currently applied to automated orchestral generation, should be extended to variational learning and multi-scale temporal aspects. On the other hand, the Factor Oracle, an automaton used to represent and generate music in machine improvisation contexts, provides a unique framework for assessing these new approaches, which aim to fill the temporal gap in current machine learning approaches.

- Research context : Data mining and knowledge discovery have provided solutions to a wide array of problems across a variety of scientific fields. Many of these problems involve temporal aspects, which have been increasingly scrutinized through time series data [1]. However, the last decade of advances in computing power and storage capabilities, together with increasingly complex sources of information, has led to a critical data avalanche. The relatively young fields of data mining and knowledge discovery have to be rebuilt to withstand these extreme constraints. Furthermore, the inherent high dimensionality of time series data makes most mining techniques fall prey to the curse of dimensionality, even at medium scales such as processing a million time series. Machine learning techniques have been mostly overlooked in the time series mining field because of this property [2]. However, recent advances in deep learning [3] can provide a way to alleviate this problem. Conversely, research in deep learning seems to suffer mostly from a lack of attention towards temporal concepts (lack of memory in neural networks [4]). This project aims at developing new learning methods for temporal knowledge, by focusing on multi-scale temporal aspects and evaluating the proposed approach in a musical generation setup [5].

As an application of these studies, the problem of musical arrangement is considered. Critical to this problem is the notion of memory within the models used. Indeed, musical improvisation offers a unique framework of study, as a system driven solely by local successive frame-level decisions would be bound to lack continuity in its generation. Several generative representations dealing with the sequential structure of an input stream have already been studied : N-grams, within the SOMax project [6], “Flow-machines” [7] or Factor Oracles [8, 9]. These tools aim at musical generation, but only through a local sequential structure. Hence, they still exhibit weaknesses that could be addressed, such as adding global structure or combining machines built from several different sources (thus empowering the systems to gather extensive knowledge through learning). As a basis for this study, we will rely on the Factor Oracle, a temporal automaton used to represent and generate music, which is currently being extended with global and reactive structures and constraints. The problem of adding global constraints to Factor Oracles is still open and various proposals have been made : the use of an example sequence as a guide for the generation of the new sequence [10], the use of automata products to introduce external constraints to the models [11], or the application of logics and model-checking techniques [12].
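To make the Factor Oracle mentioned above more concrete, the following is a minimal Python sketch of its incremental construction, following the online algorithm described in [8], together with a naive random-walk generator loosely in the spirit of [9]. The function names (`build_factor_oracle`, `accepts`, `improvise`), the continuation probability `p_continue` and the use of plain characters as musical symbols are assumptions made for illustration only; they do not describe the tools actually used at Ircam.

```python
import random


def build_factor_oracle(seq):
    """Online construction of the Factor Oracle of `seq`.

    Returns (trans, sfx): trans[i] maps a symbol to the target state of the
    transition leaving state i; sfx[i] is the suffix link (supply function)
    of state i, with sfx[0] == -1 by convention.
    """
    n = len(seq)
    trans = [dict() for _ in range(n + 1)]
    sfx = [-1] * (n + 1)
    for i, sym in enumerate(seq, start=1):
        trans[i - 1][sym] = i              # internal transition i-1 -> i
        k = sfx[i - 1]
        # Follow suffix links, adding external transitions where missing.
        while k > -1 and sym not in trans[k]:
            trans[k][sym] = i
            k = sfx[k]
        sfx[i] = 0 if k == -1 else trans[k][sym]
    return trans, sfx


def accepts(trans, word):
    """True if `word` labels a path starting from the initial state."""
    state = 0
    for sym in word:
        if sym not in trans[state]:
            return False
        state = trans[state][sym]
    return True


def improvise(seq, sfx, length, p_continue=0.8, seed=0):
    """Hypothetical generator: replay `seq`, sometimes jumping back
    through a suffix link to recombine previously seen contexts."""
    if not seq:
        return []
    rng = random.Random(seed)
    n, state, out = len(seq), 0, []
    while len(out) < length:
        # Jump when the end is reached, or at random with prob. 1 - p_continue.
        if state == n or (state > 0 and rng.random() > p_continue):
            state = sfx[state]             # always lands on a state < n
        out.append(seq[state])             # emit the symbol labelling state+1
        state += 1
    return out


if __name__ == "__main__":
    trans, sfx = build_factor_oracle("abbbaab")
    print(sfx)
    print("".join(improvise("abbbaab", sfx, 16)))
```

On the word "abbbaab", the oracle accepts every factor of the word, but also the word "abba", which is not a factor. This slight over-acceptance is part of what makes the structure usable for recombination, and the purely local nature of the random walk illustrates why global structure has to be added by other means.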

All these systems share a common interest in memory and in maintaining a creative context. Applying machine learning techniques to the models presented therefore seems a promising way of compensating for the difficulty of dealing with memory when using machine learning tools. Conversely, machine learning could help make up for the difficulty of aggregating various models and paradigms within a given generative system. The main objective of this internship is therefore to investigate the extension of deep learning techniques and temporal automata to automatically extract knowledge and memory structure from time series. This work will rely on variational learning and temporal attention mechanisms, so as to generate new sequences based on a corpus of musical contexts.

- Expected contributions : The main application targeted by this internship is automated “input-fitted” harmonized accompaniment, which offers a direct way to evaluate the proposed methods and algorithms. Nonetheless, any machine learning task requires a sizeable dataset of labelled examples, so the chosen applications will also depend on the public availability of such datasets. Based on the dataset devised at the onset of the study, the following work is expected from the intern :
1. Evaluate the state of the art in both oracle-type representations and statistical machine learning, in order to define a major direction to follow during the internship (choosing between research on the temporal, memory or generative aspects, and delegating the other aspects to existing tools).
2. Propose new machine learning approaches to infer information and exchange it with a music generation system (e.g. the Factor Oracle). As a first step towards this, existing works on time series data mining will be considered [1, 14].
3. Extend those approaches towards deep temporal granularity learning, in order to deal correctly with multiple time scales.
4. Investigate complete extraction algorithms to find temporal motifs at variable granularities in multivariate time series, and connect those motifs to a generative system.
5. Evaluate the results with an appropriate objective quality measure (e.g. sequence likelihood).
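As an illustration of the last item, sequence likelihood can be sketched as the mean log-probability that a reference model assigns to a generated sequence. The minimal Python example below assumes a smoothed bigram model as the reference; the model choice, the function names (`train_bigram`, `mean_log_likelihood`), the smoothing constant `alpha` and the toy note alphabet are all hypothetical and only meant to show the shape of such a measure.

```python
import math
from collections import Counter


def train_bigram(corpus, alphabet, alpha=1.0):
    """Fit an additively smoothed bigram model on a list of sequences.

    Returns a function prob(prev, cur) giving P(cur | prev).
    """
    pair_counts = Counter()
    context_counts = Counter()
    for seq in corpus:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab_size = len(alphabet)

    def prob(prev, cur):
        # Additive smoothing keeps unseen transitions at non-zero probability.
        return (pair_counts[(prev, cur)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )

    return prob


def mean_log_likelihood(seq, prob):
    """Average log-probability per transition of `seq` under the model."""
    pairs = list(zip(seq, seq[1:]))
    if not pairs:
        raise ValueError("need at least two symbols")
    return sum(math.log(prob(p, c)) for p, c in pairs) / len(pairs)


if __name__ == "__main__":
    alphabet = "CEG"                       # toy pitch alphabet (assumption)
    prob = train_bigram(["CEGCEGCEG"], alphabet)
    print(mean_log_likelihood("CEGC", prob))   # follows the corpus pattern
    print(mean_log_likelihood("CGEC", prob))   # goes against it
```

A sequence that follows the transitions seen in the corpus receives a higher score than one that contradicts them. Such a measure only captures local statistics, so in practice it would presumably be complemented by measures sensitive to longer time scales.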


[1] Philippe Esling and Carlos Agon. “Time-series Data Mining”. In : ACM Comput. Surv. 45.1 (Dec. 2012), 12:1–12:34. issn : 0360-0300. doi : 10.1145/2379776.2379788.

[2] Christos Faloutsos and Vasileios Megalooikonomou. “On data mining, compression, and Kolmogorov complexity”. In : Data Min. Knowl. Discov. 15.1 (Sept. 14, 2007), pp. 3–20.

[3] Yoshua Bengio. “Learning Deep Architectures for AI”. In : Found. Trends Mach. Learn. 2.1 (Jan. 2009), pp. 1–127. issn : 1935-8237. doi : 10.1561/2200000006.

[4] Vincent Michalski, Roland Memisevic, and Kishore Konda. “Modeling Deep Temporal Dependencies with Recurrent Grammar Cells”. In : Advances in Neural Information Processing Systems 27. Ed. by Z. Ghahramani et al. Curran Associates, Inc., 2014, pp. 1925–1933.

[5] Fabian Morchen. Algorithms For Time Series Knowledge Mining. 2006.

[6] Laurent Bonnasse-Gahot. An update on the SOMax project. Tech. rep. Internal report ANR project Sample Orchestrator 2, ANR-10-CORD-0018. Ircam - STMS, 2014.

[7] Fiammetta Ghedini, François Pachet, and Pierre Roy. “Creating music and texts with flow machines”. In : Multidisciplinary Contributions to the Science of Creative Thinking. Springer, 2016, pp. 325–343.

[8] Cyril Allauzen, Maxime Crochemore, and Mathieu Raffinot. “Factor oracle : a new structure for pattern matching”. In : 26th Seminar on Current Trends in Theory and Practice of Informatics (SOFSEM’99). Ed. by Jan Pavelka, Gerard Tel, and Miroslav Bartosek. Vol. 1725. LNCS. Milovy, Czech Republic : Springer-Verlag, Nov. 1999, pp. 291–306.

[9] Gérard Assayag and Shlomo Dubnov. “Using Factor Oracles for Machine Improvisation”. In : Soft Comput. 8.9 (Sept. 2004), pp. 604–610. issn : 1432-7643. doi : 10.1007/s00500-004-0385-4.

[10] Cheng-i Wang and Shlomo Dubnov. “Guided Music Synthesis with Variable Markov Oracle” (2014).

[11] Alexandre Donze et al. Control Improvisation with Application to Music. Tech. rep. UCB/EECS-2013-183. EECS Department, University of California, Berkeley, Nov. 2013.

[12] Théis Bazin and Shlomo Dubnov. “Model-Checking the VMO : A logical approach to machine improvisation”. Currently in submission process.

[13] Jérôme Nika, Marc Chemillier, and Gérard Assayag. “ImproteK : introducing scenarios into human-computer music improvisation”. In : ACM Computers in Entertainment, special issue on Musical metacreation (2016).

[14] Pierre Talbot and Philippe Esling. “Multivariate time series knowledge inference”. MA thesis. IRCAM, 2014.