Herbie Hancock's Chameleon's BPM graph from the Android app 'liveBPM' (v. 1.2.0) by Daniel Bach

Listening to music seems easy.

Posted on November 16, 2018 by Mehmet Vurkaç

Listening to music seems easy; it even appears like a passive task.

Listening, however, is not the same as hearing. In listening, i.e., attending, we add cognition to perception. The cognition of musical structures, cultural meanings, conventions, and even of the most fundamental elements themselves such as pitch or rhythm turns out to be a complex cognitive task. We know this is so because getting our cutting-edge technology to understand music with all its subtleties and its cultural contexts has proven, so far, to be impossible.

Within small fractions of a second, humans can reach conclusions about musical audio that are beyond the abilities of the most advanced algorithms.

For example, a trained or experienced musician (or even a non-musician listener) can differentiate computer-generated and human-performed instruments in almost any musical input, even in the presence of dozens of other instruments sounding simultaneously.

In a rather different case, humans can maintain time-organizational internal representations of music while the tempo of a recording or performance continuously changes. A classic example is the jazz standard Chameleon by Herbie Hancock off the album ‘HEADHUNTERS’. The recording never retains any one tempo, following an up-and-down contour and mostly getting faster. Because tempo recognition is a prerequisite to other music-perception tasks like meter induction and onset detection, this type of behavior presents a significant challenge to signal-processing and machine-learning algorithms but generally poses no difficulty to human perception.
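To make the idea of a continuously changing tempo concrete, here is a minimal sketch that turns a list of beat timestamps into a local-tempo curve, one BPM value per inter-beat interval. The timestamps are hypothetical tapped values, not measurements of any recording, and the function name is an illustrative choice.

```python
def local_tempo_bpm(beat_times):
    """Return one BPM estimate per consecutive pair of beat timestamps (in seconds)."""
    bpms = []
    for earlier, later in zip(beat_times, beat_times[1:]):
        interval = later - earlier
        if interval <= 0:
            raise ValueError("beat times must be strictly increasing")
        bpms.append(60.0 / interval)  # beats per minute for this one interval
    return bpms


# Hypothetical tapped beat times for a performance that gradually speeds up.
taps = [0.0, 0.65, 1.28, 1.89, 2.48]
tempo_curve = local_tempo_bpm(taps)
print([round(bpm, 1) for bpm in tempo_curve])
```

A tracker that assumes a single global tempo would have to summarize this curve with one number, which is exactly what a drifting performance resists.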

Another example is the recognition of vastly different cover versions of songs: A person familiar with a song can recognize within a few notes a cover version of that song done in another genre, at a different tempo, by another singer, and with different instrumentation.

Each of these tasks is well beyond the machine-learning techniques that are exhibiting remarkable successes in visual recognition, where the main challenge, invariance, is less of an obstacle than the abstractness of music and its seemingly arbitrary meanings and structures.

Consider the following aspects of music cognition.

inferring a key (or a change of key) from very few notes
identifying a latent underlying pulse when it is completely obscured by syncopation [Tal et al., Missing Pulse]
effortlessly tracking key changes, tempo changes, and meter changes
instantly separating and identifying instruments even in performances with many-voice polyphony (as in Dixieland Jazz, Big-Band Jazz, Baroque and Classical European court music, Progressive Rock, folkloric Rumba, and Hindustani and Carnatic classical music)

These and many other forms of highly polyphonic, polyrhythmic, or cross-rhythmic music continue to present challenges to automated algorithms. Successful examples of automated tempo or meter induction, onset detection, source separation, key detection, and the like all work only under tight limitations on the types of inputs. Even for a single task such as source separation, a universally applicable algorithm does not seem to exist. (Some commercial software appears to do these tasks universally, but because proprietary programs do not provide sufficiently detailed outputs, it is uncertain whether they really can perform all these functions or whether they perform one function in enough detail to suffice for studio uses. One such suite can identify and separate every individual note from any recording, but it does not perform source separation into streams-per-instrument, and it presents its output in a form that is neither conducive to analysis in rhythmic, harmonic, melodic, or formal terms nor analogous to human cognitive processing of music.)

Not only does universal music analysis remain an unsolved problem, but also most of the world’s technological effort goes toward European folk music, European classical music, and (international) popular music. The goal of my research and my lab (Lab BBBB: Beats, Beats, Bayes, and the Brain) is to develop systems for culturally sensitive and culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition, and to do so for popular music styles from the Global South that are not on the industry’s radar.

Since the human nervous system is able to complete musical-analysis tasks under almost any set of circumstances, in multiple cultural and cross-cultural settings, with varying levels of noise and interference, the human brain is still superior to the highest-level technology we have developed. Hence, Lab BBBB takes inspiration and direct insight from human neural processing of audio and music to solve culturally specific cognitive problems in music analysis, and to use this context to further our understanding of neuroscience and machine learning.

The long-term goal of our research effort is a feedback cycle:

Neuroscience (in simulation and with human subjects at our collaborators’ sites) informs both music information retrieval and research into neural-network structures (machine learning). We are initially doing this by investigating the role of rhythm priming in Parkinson’s (rhythm–motor interaction) and in grammar-learning performance (rhythm–language interaction) in the basal ganglia. We hope to then replicate in simulation the effects that have been observed with people, verify our models, and use our modeling experience on other tasks that have not yet been demonstrated in human cases or that are too invasive or otherwise unacceptable.
Work on machine learning informs neuroscience by narrowing down the range of investigation.
Deep learning is also used to analyze musical audio using structures closer to those in the human brain than the filter-bank and matrix-decomposition methods typically used to analyze music.
Music analysis informs cognitive neuroscience, we conjecture, as has been done in certain cases in the literature with nonlinear dynamics.
Phenomena like entrainment and neural resonance in neurodynamics further inform the development of neural-network structures and data-subspace methods.
These developments in machine learning move music information retrieval closer to human-like performance for culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition for multicultural intelligent music systems.

NIPS 2015: Thoughts about SoundCloud, genres, clave tagging, clave gamification, multi-label classification, and perceptual manifolds

Posted on December 10, 2015 (updated December 25, 2015) by Mehmet Vurkaç

On December 9th, at NIPS 2015, I met two engineers from SoundCloud, which is not only providing unsigned artists a venue to get their music heard (and commented on), and providing recommendation and music-oriented social networking, but also, if I understand correctly, is interested in content analysis for various purposes. Some of those have to do with identifying work that may not be original, which can range from quotation to plagiarism (the latter being an important issue in my line of work: education), but also involve the creation of derivative content, like remixing, to which they seem to have a healthy approach. (At the same event, the IBM Watson program director also suggested that they could conceivably be interested in generative tools based on music analysis.)

I got interested in clave-direction recognition to help musicians, because I was one, and I was struggling—clave didn’t make sense. Why were two completely different patterns in the same clave direction, and two very similar patterns not? To make matters worse, in samba batucada, there was a pattern said to be in 3-2, but with two notes in the first half, followed by three notes in the second half. There had to be a consistent explanation. I set out to find it. (If you’re curious, I explained the solution thoroughly in my Current Musicology paper.)

Top: Surdo de terceira. Bottom: The 3-2 partido-alto for cuíca and agogô. Note that playing the partido-alto omitting the first and third crotchet’s worth of onsets results in the terceira.

However, clave is relevant not just to music-makers, but to informed listeners and dancers as well. A big part of music-in-society is the communities it forms, and that has a lot to do with expertise and identity in listeners. Automated recognition of clave-direction in sections of music (or entire pieces) can lead to automated tagging of these sections or pieces, increasing listener identification (which can be gamified) or helping music-making.

My clave-recognition scheme (which is an information-theoretically aided neural network) recognizes four output classes (outside, inside, neutral, and incoherent). In my musicological research, I also developed three teacher models, but only from a single cultural perspective. More recently, I submitted a work-in-progress and accompanying abstract to AAWM 2016 (Analytical Approaches to World Music) about what would happen if I looked at clave direction from different cultural perspectives (which I have encoded as phase shifts), and graphed the results in the complex plane (just like phase shift in electric circuits).

Another motivating idea came from today’s talk Computational Principles for Deep Neuronal Architectures by Haim Sompolinsky: perceptual manifolds. The simplest manifold proposed was line segments. This is pertinent to clave recognition because among my initial goals was extending my results to non-idealized onset vectors: [0.83, 0.58, 0.06, 0.78] instead of [1101], for example. The line-segment manifold would encode this as onset strengths (“velocity” in MIDI terminology) ranging from 0 (no onset) to 1 (127 in MIDI). This will let me look inside the onset-vector hypercube.
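As a small illustration of the difference between idealized and non-idealized onset vectors, the sketch below maps MIDI velocities (0 to 127) to onset strengths between 0 and 1, and collapses a graded vector back to a binary one. The 0.5 threshold and the function names are assumptions made for this example; they are not part of the clave-recognition scheme itself.

```python
def velocities_to_strengths(velocities):
    """Map MIDI velocities (0-127) to onset strengths in [0, 1]."""
    for v in velocities:
        if not 0 <= v <= 127:
            raise ValueError(f"MIDI velocity out of range: {v}")
    return [v / 127 for v in velocities]


def idealize(strengths, threshold=0.5):
    """Collapse graded onset strengths to a binary onset vector.

    The threshold is an arbitrary illustrative choice.
    """
    return [1 if s >= threshold else 0 for s in strengths]


strengths = [0.83, 0.58, 0.06, 0.78]   # the graded example from the text
binary = idealize(strengths)           # the corner of the hypercube it rounds to
print(binary)
```

The binary vector is a corner of the onset-vector hypercube; the graded vector is a point in its interior, which is the region the idealized representation cannot reach.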

Another tie-in from NIPS conversations is employing Pareto frontiers with my clave data for a version of multi-label learning. Since I can approach each pattern from two phase perspectives, and up to three teacher models (vigilance levels), a good multi-label classifier would have to provide up to 6 correct outputs, and in the case that a classifier cannot be that good, the Pareto frontier would determine which classifiers are undominated.
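The notion of an undominated classifier can be sketched in a few lines. Each classifier below gets a vector of six scores, one per combination of two phase perspectives and three teacher models. The classifier names and all numbers are hypothetical placeholders, not results.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the names of classifiers that no other classifier dominates."""
    return sorted(
        name
        for name, vec in scores.items()
        if not any(dominates(other, vec)
                   for other_name, other in scores.items()
                   if other_name != name)
    )


# Hypothetical accuracies on the 6 outputs (2 phase perspectives x 3 teacher models).
scores = {
    "clf_a": (0.90, 0.85, 0.80, 0.88, 0.82, 0.79),
    "clf_b": (0.92, 0.80, 0.81, 0.85, 0.84, 0.75),
    "clf_c": (0.89, 0.80, 0.78, 0.85, 0.80, 0.75),
}
front = pareto_front(scores)
print(front)
```

Here clf_c is matched or beaten on every output by clf_a, so it drops out, while clf_a and clf_b each win on some outputs and both stay on the frontier.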

Would all this be interesting to musicians? Yes, I think so. Even without going into building clave-trainer software into various percussion gear or automated-accompaniment keyboards, this could allow clave direction to be gamified. Considering all the clave debates that rage in Latin-music-ian circles (such as the “four great clave debates” and “clave schism” issues like those around Giovanni Hidalgo’s labeling scheme quoted in Modern Drummer*), a multi-perspective clave-identification game could be quite a hit.

So, how does a Turkish math nerd get to be obsessed by this? I learned about clave—the Afro-Latin (or even African-Diasporan) concept of rhythmic harmony that many people mistake for the family of fewer than a dozen patterns, or for a purely Cuban or “Latin” organizational principle—around 1992 from the musicians of Bochinche and Sonando, two Seattle bands. I had also grown up listening to Brazilian (and Indian, Norwegian, US, and German) jazz in Turkey. (My first live concert by a foreign band was Hermeto Pascoal e Grupo, featuring former CBC faculty Jovino Santos Neto.) So, I knew that I wanted to learn about Brazilian music. (At the time, most of what I listened to was Brazilian jazz, like Dom Um Romao and Airto, and I had no idea that they mostly drew from nordestino music, like baião, xote, côco, and frevo**―not samba).

Fortunately, I soon moved to Portland, where Brian Davis and Derek Reith of Pink Martini had respectively founded and sustained a bloco called Lions of Batucada. Soon, Brian introduced us to Jorge Alabê, and then to California Brazil Camp, with its dozens of amazing Brazilian teachers. . . But let’s get back to clave.

I said above that clave is “the Afro-Latin (or even African-Diasporan) concept of rhythmic harmony that many people mistake for the family of fewer than a dozen patterns, or for a purely Cuban or ‘Latin’ organizational principle.” What’s wrong with that?

Well, clave certainly is an organizational principle: It tells the skilled musician, dancer, or listener how the rhythm (the temporal organization, or timing) of notes in all the instruments may and may not go during any stretch of the music (as long as the music is from a tradition that has this property, of course).

And clave certainly is a Spanish-language word that took on its current meaning in Cuba, as explained wonderfully in Ned Sublette’s book.

However, the transatlantic slave trade did not only move people (forcefully) to Cuba. The Yorùbá (of today’s southwest Nigeria and southeast Benin), the Malinka (a misnomer, according to Mamady Keïta, for people from Mali, Ivory Coast, Burkina Faso, Gambia, Guinea, and Senegal), and the various Angolan peoples were brought to many of today’s South American, Caribbean, and North American countries, where they culturally and otherwise interacted with Iberians and the natives of the Americas.

Certain musicological interpretations of Rolando Antonio Pérez Fernández’s book La Binarización de los Ritmos Ternarios Africanos en América Latina have argued that the organizational principles of Yoruba 12/8 music, primarily the standard West African timeline (X.X.XX.X.X.X)

Bembé ("Short bell") or the standard West African timeline, along with its major-scale analog

and the Malinka/Manding timelines met the 4/4 time signatures of Angolan and Iberian music, and morphed into the organizational timelines of today’s rumba, salsa, (Uruguayan) candombe, maracatu, samba, and other musics of the Americas.

Some of those timelines we all refer to as clave, but for others, like the partido-alto in Brazil***, it is sometimes culturally better not to refer to them as clave patterns. (This is understandable, in that Brazilians speak Portuguese, and do not always like to be mistaken for Spanish-speakers.)

Conceptually, however, partido-alto in samba plays the same organizational role that clave plays in rumba and salsa, or the gongue pattern plays in maracatu: It immediately tells knowledgeable musicians how not to play.

In my research, I found multiple ways to look at the idiomatic appropriateness of arbitrary timing patterns (more than 10,000 of them, only about a hundred of which are “traditional” [accepted, commonly used] patterns). I identified three “teacher” models, which are just levels of strictness. I also identified four clave-direction categories. (Really, these were taught to me by my teacher-informers, whose reactions to certain patterns informed some of the categories.)

Some patterns are in 3-2 (which I call “outside”). While the 3-2 clave son (X..X..X…X.X…):

3-2 (outside) clave son, in northern and TUBS notation

is obvious to anyone who has attempted to play anything remotely Latin, it is not so obvious why the following version of the partido-alto pattern is also in the 3-2 direction****: .X..X.X.X.X..X.X

The plain 3-2 partido-alto pattern. (The pitches are approximate and can vary with cuíca intonation or the agogô maker’s accuracy.) "Bossa clave" in 3-2 and 2-3 are added in TUBS notation to show the degree of match and mismatch with 3-2 and 2-3 patterns, respectively.
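One way to see why the direction of the partido-alto is not obvious is to count onsets in each half of the cycle. The sketch below parses the box-notation strings used in this post (X or x for an onset, a dot for a rest, a typographic ellipsis standing for three rests) and reports the counts per half. The helper names are illustrative, and this is only a counting aid, not my clave-direction grammar.

```python
def parse_tubs(pattern):
    """Turn a box-notation string into a list of 0/1 onsets."""
    expanded = pattern.replace("\u2026", "...")  # an ellipsis character is three rests
    onsets = []
    for ch in expanded:
        if ch in "Xx":
            onsets.append(1)
        elif ch == ".":
            onsets.append(0)
        else:
            raise ValueError(f"unexpected symbol {ch!r}")
    return onsets


def onsets_per_half(pattern):
    """Return (onsets in first half, onsets in second half) of the cycle."""
    onsets = parse_tubs(pattern)
    if len(onsets) % 2:
        raise ValueError("cycle length must be even")
    half = len(onsets) // 2
    return sum(onsets[:half]), sum(onsets[half:])


son_3_2 = "X..X..X...X.X..."
partido_alto_3_2 = ".X..X.X.X.X..X.X"
print(onsets_per_half(son_3_2))           # (3, 2)
print(onsets_per_half(partido_alto_3_2))  # (3, 4)
```

The clave son literally has three onsets followed by two, but the partido-alto has three followed by four, so a simple onset count cannot be what puts both in the 3-2 direction.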

Some patterns are in 2-3 (which I call “inside”). Many patterns that are heard throughout all Latin American musics are clave-neutral: They provide the same amount of relative offbeatness no matter which way you slice them. The common Brazilian hand-clapping pattern in pagode, X..X..X.X..X..X. is one such pattern:

The clave-neutral hand-clapping pattern in pagode, AKA, tresillo (a Cuban name for a rhythm found in Haitian konpa, Jamaican dancehall, and Brazilian xaxado)

It is actually found throughout the world, from India and Turkey, to Japan and Finland, and throughout Africa; from Breakbeats to Bollywood to Metal. (It is very common in Metal.) The parts played by the güiro in salsa and by the first and second surdos in samba have the same role: They are steady ostinati of half-cycle length. They are foundational. They set the tempo, provide a reference, and go a long way towards making the music danceable. (Offbeatness without respite, as Merriam said*****, would make music undanceable.)

Here are some neutral patterns: X…X…X…X… (four on the floor, which, with some pitch variation, can be interpreted as the first and second surdos):

Four quarter notes, clave-neutral (from Web, no source available)

….X.X…..X.X. (from ijexá):

surdo part for ijexá (from http://www.batera.com.br/Artigos/dia-do-folclore)

and XxxXXxxXXxxXXxxX. (This is a terrible way to represent swung samba 16ths. Below are Jake “Barbudo” Pegg’s diagrams, which work much better.)

Jake "Barbudo" Pegg's samba-sixteenths accent and timing diagrams (along with the same for "Western" music)
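The neutral examples listed above share one property that is easy to check mechanically: each is an ostinato of half-cycle length, so its two halves are identical. The sketch below tests only that property. It is an illustration of these particular examples, under the assumption that a 16-symbol string represents one full cycle; it is not a test for clave neutrality in general, and the function name is made up for this example.

```python
def is_half_cycle_ostinato(pattern):
    """True if the second half of the cycle repeats the first half exactly."""
    expanded = pattern.replace("\u2026", "...")  # an ellipsis character is three rests
    if len(expanded) % 2:
        return False
    half = len(expanded) // 2
    return expanded[:half] == expanded[half:]


examples = {
    "pagode hand claps": "X..X..X.X..X..X.",
    "four on the floor": "X...X...X...X...",
    "ijexa surdo": "....X.X.....X.X.",
    "samba sixteenths": "XxxXXxxXXxxXXxxX",
    "3-2 clave son": "X..X..X...X.X...",
}
for name, pattern in examples.items():
    print(name, is_half_cycle_ostinato(pattern))
```

Every pattern except the clave son comes out True, which matches the observation that the son pattern distinguishes its two halves while the foundational parts do not.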

The fourth category is incoherent patterns. These are patterns that are not neutral, yet do not conform to either clave direction. (One of my informers gave me the idea of a fourth category when he reacted to one such pattern by making a disgusted face and a sound like bleaaahh.)

A pattern that has the clave property immediately tells all who can sense it that only patterns in that clave direction and patterns that are clave-neutral are okay to play while that pattern (that direction) is present. (We can weaken this sentence to apply only to prominent or repeated patterns. Quietly passing licks that cross clave may be acceptable, depending on the vigilance level of the teacher model.)

So, why mention all this right now? (After all, I’ve published these thoughts in peer-reviewed venues like Current Musicology, Bridges, and the Journal of Music, Technology and Education.)

For one thing, those are not the typical resources most musicians turn to. Until I can write up a short, highly graphical version of my clave-direction grammar for PAS, I will need to make some of these ideas available here. Secondly, the connections to gamification and to musical-social-networking sites like SoundCloud are new ideas I got from talking to people at the NIPS reception, and I wanted to put this out there right away.

FOOTNOTES

* Mattingly, R., Modern Drummer, Modern Drummer Publications, Inc., Cedar Grove, NJ, “Giovanni Hidalgo-Conga Virtuoso,” p. 86, November 1998.

** While talking to Mr. Fereira of SoundCloud this evening at NIPS, he naturally mentioned genre recognition, which is the topic of my second-to-last post. (I argued about the need for expert listeners from many cultural backgrounds, which could be augmented with a sufficiently good implementation of crowd-sourcing.) I think he was telling me about embolada, or at least that’s how I interpreted his description of this MC-battle-type of improvised nordeste music. How many genre-recognition researchers even know where to start in telling a street-improvisation embolada from even, say, a pagode-influenced axé song like ‘Entre na Roda’ by Bom Balanço? (Really good swing detection might help, I suppose.)

*** This term has multiple meanings; I’m not referring to the genre partido-alto, but the pattern, which is one of the three primary ingredients of samba, along with the strong surdo beat on 2 (and 4) and the swung samba 16ths.

**** in the sense that, in the idiom, it goes with the so-called 3-2 “bossa clave” (a delightful misnomer): X..X..X…X..X..,

The "bossa clave" is a bit like an English horn; it's neither.

as well as with the rather confusing (to some) third-surdo pattern ….X.X…..XX.X,

Top: Surdo de terceira. Bottom: The 3-2 partido-alto for cuíca and agogô. Note that playing the partido-alto omitting the first and third crotchet’s worth of onsets results in the terceira.

which has two notes in its first half, and three notes in its second half. (Yes, it’s in 3-2. My grammar for clave direction explains this thoroughly. [http://academiccommons.columbia.edu/catalog/ac:180566])

***** See Merriam: “continual use of off-beating without respite would cause a readjustment on the part of the listener, resulting in a loss of the total effect; thus off-beating [with respite] is a device whereby the listeners’ orientation to a basic rhythmic pulse is threatened but never quite destroyed” (Merriam, Alan P. “Characteristics of African Music.” Journal of the International Folk Music Council 11 (1959): 13–19.)

ALSO, I use the term “offbeatness” instead of ‘syncopation’ because the former is not norm-based, whereas the latter turns out to be so:

Coined by Toussaint as a mathematically measurable rhythmic quantity [1], offbeatness has proven invaluable to the preliminary work of understanding Afro-Brazilian (partido-alto) clave direction. It is interpreted here as a more precise term for rhythmic purposes than ‘syncopation’, which has a formal definition that is culturally rooted: Syncopation is the placement of accents on normally unaccented notes, or the lack of accent on normally accented notes. It may be assumed that the norm in question is that of the genre, style or cultural/national origin of the music under consideration. However, in all usage around the world (except mine), normal accent placement is taken to be normal European accent placement [2, 3, 4].

For example, according to Kauffman [3, p. 394], syncopation “implies a deviation from the norm of regularly spaced accents or beats.” Various definitions by leading sources cited by Novotney also involve the concepts of “normal position” and “normally weak beat” [2, pp. 104, 108]. Thus, syncopation is seen to be norm-referenced, whereas offbeatness is less contextual as it depends solely on the tactus.
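A minimal sketch of an off-beatness count in the spirit of Toussaint [1] follows. It assumes the measure is the number of onsets that fall on pulses sharing no common factor with the cycle length, that is, pulses that no even subdivision of the cycle ever lands on. That reading of [1] and the function name are assumptions of this sketch, and the relative offbeatness used in the clave-direction work may be computed differently.

```python
from math import gcd


def offbeatness(pattern):
    """Count onsets on pulses that are coprime to the cycle length."""
    expanded = pattern.replace("\u2026", "...")  # an ellipsis character is three rests
    n = len(expanded)
    return sum(
        1
        for i, ch in enumerate(expanded)
        if ch in "Xx" and gcd(i, n) == 1  # gcd(0, n) == n, so pulse 0 never counts
    )


standard_timeline = "X.X.XX.X.X.X"   # 12-pulse standard West African timeline
son_3_2 = "X..X..X...X.X..."         # 16-pulse 3-2 clave son
print(offbeatness(standard_timeline))  # 3
print(offbeatness(son_3_2))            # 1
```

Nothing in this computation refers to where accents are "normally" placed in any tradition; it needs only the pulse grid, which is the sense in which the measure is not norm-based.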

Kerman, too, posits that syncopation involves “accents in a foreground rhythm away from their normal places in the background meter. This is called syncopation. For example, the accents in duple meter can be displaced so that the accents go on one two, one two, one two instead of the normal one two, one two” [4, p. 20; all emphasis in the original, as written]. Similarly, on p. 18, Kerman reinforces that “[t]he natural way to beat time is to alternate accented (“strong”) and unaccented (“weak”) beats in a simple pattern such as one two, one two, one two or one two three, one two three, one two three.” [4, p. 18]

Hence, placing a greater accent on the second rather than on the first quarter note of a bar may be sufficient to invoke the notion of syncopation. By this definition, the polka is syncopated, and since it is considered the epitome of “straight rhythm” to many performers of Afro-Brazilian music, syncopation clearly is not the correct term for what the concept of clave direction is concerned with. Offbeatness avoids all such cultural referencing because it is defined solely with respect to a pulse, regardless of cultural norms. (Granted, what a pulse is may also be culturally defined, but there is a point at which caveat upon caveat becomes counterproductive.)

Furthermore, in jazz, samba, and reggae (to name just a few examples) this would not qualify as syncopation (in the sense of accents in abnormal or unusual places) because beats other than “the one” are regularly accented in those genres as a matter of course. In the case of folkloric samba, even the placement of accents on the second eighth note, therefore, is not syncopation because at certain places in the rhythmic cycle, that is the normal—expected—pattern of accents for samba, part of the definition of the style. Hence, it does not constitute syncopation if we are to accept the definition of the term as used and cited by Kauffman, Kerman, and Novotney. In other words, “syncopation” is not necessarily the correct term for the phenomenon of accents off the downbeat when it comes to non-European music.

Moreover, in Meter in Music, Houle observes that “[a]ccent, defined as dynamic stress by seventeenth- and eighteenth-century writers, was one of the means of enhancing the perception of meter, but it became predominant only in the last half of the eighteenth century [emphasis added]. The idea that the measure is a pattern of accents is so widely held today that it is difficult to imagine that notation that looks modern does not have regular accentual patterns. Quite a number of serious scholarly studies of this music [European art music of 1600–1800] make this assumption almost unconsciously by translating the (sometimes difficult) early descriptions of meter into equivalent descriptions of the modern accentual measure” [5, p. viii]. Thus, it turns out that the current view of rhythm and meter is not natural, or even traditional, let alone global. In fact, the Essential Dictionary of MUSIC NOTATION: The most practical and concise source for music notation is perfect for all musicians—amateur to professional (the actual book title) states that “the preferred/recommended beaming for the 9/8 compound meter is given as three groups of three eighth notes” [6, p. 73]. This goes against the accent pattern implied by the 9/8 meter in Turkish (and other Balkan) music, which is executed as 4+5, 5+4, 2+2+2+3, etc., but rarely 3+3+3. The 9/8 is one of the most common and typical meters in Turkish music, not an atypical curiosity. This passage is included here to demonstrate the dangers in applying western European norms to other musics (as indicated by the phrase “perfect for all musicians”).

[1]   Toussaint, G., 2005. Mathematical Features for Recognizing Preference in Sub-Saharan African Traditional Rhythm Timelines. Lecture Notes in Computer Science 3686:18-27. Springer Berlin/Heidelberg, 2005.
[2]   Novotney, E. D. “The 3-2 Relationship as the Foundation of Timelines in West African Musics,” University of Illinois at Urbana-Champaign (Ph.D. dissertation), Urbana-Champaign, Illinois, 1998.
[3]   Kauffman, R. 1980. African Rhythm: A Reassessment. Ethnomusicology 24 (3):393–415.
[4]   Kerman, J., LISTEN: Brief Edition, New York, NY: Worth Publishers, Inc., 1987, p. 20.
[5]   Houle, G., Meter in Music, 1600–1800: Performance, Perception, and Notation, Bloomington, IN: Indiana University Press, 1999.
[6]   Gerou, T., and Lusk, L., Essential Dictionary of MUSIC NOTATION: The most practical and concise source for music notation is perfect for all musicians—amateur to professional, Van Nuys, CA: Alfred Publishing Co., Inc., 1996.

Culturally Situated and Image-based Genre Attribution

Posted on November 30, 2015 (updated December 1, 2015) by Mehmet Vurkaç

Genre recognition has become the holy grail of music information retrieval. What concerns me, before we worry about machine recognition of musical genre, is whether people can agree at all on what genre means, and what the various genres are. Wikipedia, Echonest, and many other sites (some now defunct) have put forth excellent information on various musical genres and their relationships to one another. My critique of the genre discussions I have encountered to date falls into two categories. One is the (necessarily, and not surprisingly) culturally narrow perspective of most work on musical genres. The other is the role non-aural, non-audio features play in the determination of genre. (These can be metadata, like release dates, or even more [sub]culturally determined information such as the clothing style of the artists.)

Let’s take the problem of narrowly culturally situated efforts first. There have been a variety of impressive resources on the Internet about the sub-sub-sub-genres of electronic music and of extreme metal. There is a wonderful degree of detail provided in these Web resources. However, the effort put into very subtle distinctions among “northern-based” musics (anything we typically understand as pop music, plus the folk and court musics of northern Europe and North America**) is rarely, if ever, matched by the knowledge available, perhaps, in English, on musics from other countries. We typically find some half a dozen genres listed for Brazil, Mexico, Japan, or Cuba, and far fewer for China, Turkey, Belize, Honduras, or Mali. This is a typical case of out-group bias, which is easy to understand; all people are subject to out-group bias. The importance of understanding biases lies in the effort to move beyond them. Are the differences among Xote, Brukdown, Özgün Müzik, and Guarapachangeo less significant than the differences between Goa Trance and Happy Hardcore, or Grindcore and Power Violence?*** Of course not, but who can know every little detail about the impossibly rich musical landscape of every culture? (That’s why we need multi-cultural teams to work on genre recognition and classification.)

The other issue is one I am only aware of in terms of “northern” (western) popular forms of music, and it is the issue of image-based, fashion-based, temporal, and geographical genre attribution. In many cases, the clothes worn by rock and pop artists seem to determine their musical genre more than the sounds created and organized into musical works by those artists. For example, Billy Idol and Avril Lavigne are thought of as Punk Rock artists. Yet, and even without appealing to DIY ethics and political content, we can tell from the aural experience that these artists make (or have made) something sufficiently aurally distant from the music of CRASS, pragVEC, Buzzcocks, X-Ray Spex, or BAD RELIGION, and that theirs are genres well removed from Punk Rock. (The artists listed do not all sound the same, but they share the elements of disaffected vocals, a lack of polish, and an overall dark despair with one another and with bands as far removed from them as Joy Division, The Paper Chase, Depeche Mode, and Sleater-Kinney, all of which have more sonic elements in common than they do with Idol or Lavigne.)

What makes the problem further difficult is that genre names are rarely descriptive, and all too often temporally and geographically limiting. Consider the genres NWOBHM (New Wave of British Heavy Metal), New Wave, Nü (new) Metal, Grunge, and Old-School Hip Hop.

Quite apart from the problem that “New Wave” actually has at least three different meanings, it is sonically possible (and common) for an artist making music thirty years after the end of the era attributed to one of these genres to make music with the same structure, affectation, instruments, sounds, and production. Which should we consider in determining genre: the year of release or the way the music sounds? New-millennial bands like Titanium Black and The Haunted, and even punk-rockers like Saviours, often play a flavor of Metal that sounds just like NWOBHM, but we are not supposed to call them that if they are from a different time, and especially, a different place. Likewise, ACCEPT and SCORPIONS (from Germany) sometimes played the same type of music, stylistically speaking, as Judas Priest, DIO, and IRON MAIDEN, but since they’re not British, we cannot refer to their music as NWOBHM. Or can we? Is it not the sounds and how they are organized that matters in determining music? (I think so.) Can anyone really tell, in a blinded listening test, whether a rhythm guitarist is German or British?

The Union Underground was a Metal band that had some success during the Nü Metal years. They had the look and the album art to be part of that era and that genre. However, listening to their music in 2008, I could not help but notice that the singing style really had little to do with Nü Metal, and quite a lot to do with Grunge, which was declared over by that time. As far as I can tell, no one talked of TUU as a late Grunge band.

An interesting pair that got me thinking further about image- and time-based genre attribution is Corrosion of Conformity and VOIVOD. Originally starting out in very disparate genres, in the farthest reaches of Hardcore Punk and Prog Metal, these two converged in their music until their releases of the albums ‘KATORZ’ (by VOIVOD) and ‘CORROSiON OF CONFORMITY’ (by CoC, of course) in the late aughts. I find it nearly impossible to tell these two albums apart stylistically (though each is quite distant from the bands’ earlier output). When I saw CoC perform at Dante’s in Portland, they presented a marvellous synthesis of Prog agility and Punk attitude. (These two were not meant to go together, but it’s happening more and more.) Meanwhile, VOIVOD apparently drifted further and further into Punk Rock, and lost most of their Prog intricacies. Yet, if I were to stick to “what we know those bands to be,” I would be forced to attach opposite labels to songs from those two albums, which, even when I’m looking right at the readout on my display and know what I’m hearing, sound the same to me.

I mentioned Old-School Hip Hop above as well. Every now and then, you hear a new song, and it has that early, innocent flow we associate with everyone from The Jungle Brothers to MC Hammer. It’s old-school in that the time extents of rhythmic phrases in the vocals and the time extents of semantic phrases in the lyrics delivered by the same vocals coincide****.

Yet, maybe it was released in 2013. Conversely, De La Soul was putting out music in 1989 that did not sound old-school; it was like what was going to happen ten years later. (I feel the same way about Fu-Schnickens’ 1992 album.) Some of the music in those old-school days was well ahead of its time, and some music that gets released even today brings back the old-school style. It’s the sound that counts, not the metadata.
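To make the old-school criterion above a little more concrete, here is a rough sketch of how one might score the coincidence of rhythmic and semantic phrase endings. Everything in it is my own assumption for illustration: the function name, the hand-annotated (start, end) times in seconds, and the 0.1-second tolerance. It is not an established MIR measure, just one way to put a number on the idea.

```python
def boundary_coincidence(rhythmic, semantic, tolerance=0.1):
    """Fraction of semantic-phrase endings that land on a rhythmic-phrase ending.

    rhythmic, semantic: lists of (start, end) times in seconds (hypothetical,
    hand-annotated). tolerance: assumed slack, in seconds, for calling two
    endings "the same".
    """
    if not semantic:
        return 0.0
    rhythmic_ends = [end for _, end in rhythmic]
    hits = sum(
        1
        for _, s_end in semantic
        if any(abs(s_end - r_end) <= tolerance for r_end in rhythmic_ends)
    )
    return hits / len(semantic)


# Made-up annotations: four two-second rhythmic phrases.
bars = [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0), (6.0, 8.0)]

# Old-school flow: each sentence ends exactly where its rhythmic phrase ends.
old_school = bars

# Later flow: sentences run across the rhythmic stopping points.
later = [(0.0, 3.0), (3.0, 5.0), (5.0, 8.0)]

old_school_score = boundary_coincidence(bars, old_school)
later_score = boundary_coincidence(bars, later)
```

With these invented numbers, the old-school example scores 1.0 and the later example scores one in three, since only its final sentence ends on a rhythmic stopping point. A score like this depends only on how the vocals are organized, not on the release year.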

There are many more examples, and perhaps better ones that I will add as I think of them, or hear them, but for now, I will conclude that genre studies and genre R&D need multi-cultural teams so that the level of attention to detail that is possible for Deep Psytrance vs. Gabber vs. New Romantic vs. New Wave will also be possible for ‘Bulgarian Rock’, ‘Hungarian Rock’, ‘Russian Pop’, and ‘Turkish Pop’. Sure, I’m glad someone in America even cares enough to put those on the map, but given the several hundred varieties of Electronica, Metal, and Hip Hop each, can we really believe there is only one variety of ‘Russian Pop’? (I know for a fact there are quite a few styles and genres within Turkish Pop.*****)

NOTES

* Yet, no matter how much detail each scholar, researcher, developer, or enthusiast goes into, it is likely to prove insufficient for the aficionado of that sub-sub-genre. The Echonest blog (at http://blog.echonest.com/post/52385283599/how-we-understand-music-genres) recently included the following comment: “. . . somebody, somewhere might care about (e.g. “gothic metal” vs. “symphonic metal” vs. “gothic symphonic metal”).” Yes, somebody right here not only does, but finds those to be rather obvious and relevant distinctions, which are further complicated by Nightwish’s recent experiments combining symphonic, power, and folk metal.

** Why I use the term “northern” rather than “western” will be the topic of another post.

*** Yes, these are real genre names. To a connoisseur who lives Grindcore, Power Violence may sound completely different, while to an outsider neither would be distinguishable from the earliest ’80s Thrash.

**** As Hip Hop matured, it became less and less common for the sentence and its rhythmic phrase to start and end together. This seems to be a result of the recognition that rhythm allows one to rhyme with any syllable in a word, not just the last syllable. So, an MC who wants to rhyme, for instance, ‘crime’ with ‘time’ does not need each sentence to end with one of those words; s/he can rhyme ‘crime’ with ‘time’ in a sentence that might go “It was that time [break here] I went off to the east coast” where ‘time’, due to rhythmic phrasing, took care of the rhyme, and the rest of the sentence could still be uttered. In much old-school Hip Hop, the semantic phrases had to end at the same rhythmic stopping point.

***** As for style versus genre, let me try a quick explanation. A guitarist can play the blues in a Be-Bop context, a Psych context, or a Funk context (to name a few), and an MC/toaster can rap on a Hip Hop song, a Reggaeton song, a Rock or Metal song, or even in a piece of modern “classical” music. Similarly, a drummer can play funk, swing, or shuffle in a Jazz band, Rock band, Pop group, or an experimental combo. The elements these musicians bring in are styles (blues, rapping, funk, swing, shuffle), while the complete package of the musical experience will likely fall into a genre or subgenre, like Electro Swing, Funk Metal, Be-Bop, or Chorinho.


Relative Offbeatness: Bits, Beats, and Bayes

language, logic, science, music, technology, education
