Herbie Hancock's Chameleon's BPM graph from the Android app 'liveBPM' (v. 1.2.0) by Daniel Bach

Listening to music seems easy.

Listening to music seems easy; it even appears like a passive task.

Listening, however, is not the same as hearing. In listening, i.e., attending, we add cognition to perception. The cognition of musical structures, cultural meanings, conventions, and even of the most fundamental elements themselves such as pitch or rhythm turns out to be a complex cognitive task. We know this is so because getting our cutting-edge technology to understand music with all its subtleties and its cultural contexts has proven, so far, to be impossible.

Within small fractions of a second, humans can reach conclusions about musical audio that are beyond the abilities of the most advanced algorithms.

For example, a trained or experienced musician (or even non-musician listener) can differentiate computer-generated and human-performed instruments in almost any musical input, even in the presence of dozens of other instruments sounding simultaneously.

In a rather different case, humans can maintain time-organizational internal representations of music while the tempo of a recording or performance continuously changes. A classic example is the jazz standard Chameleon by Herbie Hancock off the album ‘HEADHUNTERS’. The recording never retains any one tempo, following an up-and-down contour and mostly getting faster. Because tempo recognition is a prerequisite to other music-perception tasks like meter induction and onset detection, this type of behavior presents a significant challenge to signal-processing and machine-learning algorithms but generally poses no difficulty to human perception.

Another example is the recognition of vastly different cover versions of songs: A person familiar with a song can recognize within a few notes a cover version of that song done in another genre, at a different tempo, by another singer, and with different instrumentation.

Each of these is a task that is well beyond machine-learning techniques that are exhibiting remarkable successes with visual recognition where the main challenge, invariance, is less of an obstacle than the abstractness of music and its seemingly arbitrary meanings and structures.

Consider the following aspects of music cognition.

  • inferring a key (or a change of key) from very few notes
  • identifying a latent underlying pulse when it is completely obscured by syncopation [Tal et al., Missing Pulse]
  • effortlessly tracking key changes, tempo changes, and meter changes
  • instantly separating and identifying instruments even in performances with many-voice polyphony (as in Dixieland Jazz, Big-Band Jazz, Baroque and Classical European court music, Progressive Rock, folkloric Rumba, and Hindustani and Carnatic classical music)

These and many other forms of highly polyphonic, polyrhythmic, or cross-rhythmic music continue to present challenges to automated algorithms. Successful examples of automated tempo or meter induction, onset detection, source separation, key detection, and the like all work only under tight restrictions on the types of input. Even for a single task such as source separation, a universally applicable algorithm does not seem to exist. (There is some commercial software that appears to do these tasks universally, but because proprietary programs do not provide sufficiently detailed outputs, whether they really can perform all these functions or whether they perform one function in enough detail to suffice for studio uses is uncertain. One such suite can identify and separate every individual note from any recording, but does not perform source separation into streams-per-instrument and presents its output in a form not conducive to analysis in rhythmic, harmonic, melodic, or formal terms, and not in a form analogous to human cognitive processing of music.)

Not only does universal music analysis remain an unsolved problem, but also most of the world’s technological effort goes toward European folk music, European classical music, and (international) popular music. The goal of my research and my lab (Lab BBBB: Beats, Beats, Bayes, and the Brain) is to develop systems for culturally sensitive and culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition, and to do so for popular music styles from the Global South that are not on the industry’s radar.

Since the human nervous system is able to complete musical-analysis tasks under almost any set of circumstances, in multiple cultural and cross-cultural settings, with varying levels of noise and interference, the human brain is still superior to the highest-level technology we have developed. Hence, Lab BBBB takes inspiration and direct insight from human neural processing of audio and music to solve culturally specific cognitive problems in music analysis, and to use this context to further our understanding of neuroscience and machine learning.

The long-term goal of our research effort is a feedback cycle:

  1. Neuroscience (in simulation and with human subjects at our collaborators’ sites) informs both music information retrieval and research into neural-network structures (machine learning). We are initially doing this by investigating the role of rhythm priming in Parkinson’s (rhythm–motor interaction) and in grammar-learning performance (rhythm–language interaction) in the basal ganglia. We hope to then replicate in simulation the effects that have been observed with people, verify our models, and use our modeling experience on other tasks that have not yet been demonstrated in human cases or that are too invasive or otherwise unacceptable.
  2. Work on machine learning informs neuroscience by narrowing down the range of investigation.
  3. Deep learning is also used to analyze musical audio using structures closer to those in the human brain than the filter-bank and matrix-decomposition methods typically used to analyze music.
  4. Music analysis informs cognitive neuroscience, we conjecture, as has been done in certain cases in the literature with nonlinear dynamics.
  5. Phenomena like entrainment and neural resonance in neurodynamics further inform the development of neural-network structures and data-subspace methods.
  6. These developments in machine learning move music information retrieval closer to human-like performance for culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition for multicultural intelligent music systems.


Happy 65th birthday, Deacy!

Thank you for Back Chat, Cool Cat, You And I, Spread Your Wings, In Only Seven Days, Misfire, Who Needs You, Under Pressure, You’re My Best Friend, One Year Of Love, If You Can’t Beat Them, Need Your Loving Tonight, and Pain Is So Close To Pleasure, not to mention AOBTD, FWBF & IWTBF. (And, really, every Queen song!).

You really nailed it whenever you wrote one. By the end of my life, I would like to have written one song that’s like one of yours.

Making sentences. . .

Winter term has been crazy, although in a good way. I’m teaching six classes this term, all of which are going great (I have awesome students), though one’s a new prep, so I haven’t had time to post here, but I recently found some silliness on my phone that I had come up with while waiting for a train: Making sentences by combining band names.

Here they are:

As Blood Runs Black The Refused Pierce The Veil

Tower of Power Was (Not Was) Built To Spill Ashes

Barenaked Ladies Poison The Well From First To Last

Bring Me The Horizon Within Temptation At The Drive-in

Blonde Redhead Of Montreal Cursed The Only Ones

(And, of course, a band name, all by itself: I Love You But I’ve Chosen Darkness)

Science, Clave, and Understanding

When Dr. Eben Alexander defended, in one of the major news magazines, his book (“proof[i]”) [1] about a spiritual non-physical afterlife realm, part of his argument was that he is a surgeon, and therefore a scientist. Surgeons are highly trained, highly specialized people who perform a very difficult and critically important service. It would be absurd not to recognize their value. Their work is without a doubt science-based, but does that make it “science”? (There are, of course, surgeons who publish scholarly work (although, I’ve noticed that in some cases, it’s not about surgery, but on fields as distant as music), and thus function as scholars, and therefore scientists.)

A scientist is not anyone who functions as a professional practitioner of a difficult and science-based field; a scientist is someone who sets up, tests, and evaluates (mostly via statistical data analysis) testable hypotheses (about anything, including the afterlife and spiritual realms, if necessary), and more importantly, does so within the guidelines of rigor, accuracy, objectivity, skepticism, and open-mindedness [2][ii]. It is worrisome to imagine that surgeons are setting up double-blinded clinical trials of surgical practices as part of their work, choosing to apply a known good technique on one patient and an as-yet-unsupported one on another patient. (In other words, I really hope surgeons do not act as scientists.) Maybe they do; I’d like to know, so please give me feedback on this question.

Assuming, though, that they don’t endanger patients’ lives for the sake of science, as we tend not to do anymore, it seems safe to assume that surgeons are highly trained specialists who practice state-of-the-art medicine. In this sense, they are not scientists. They use the findings and results of science in their practical, applied work (medicine). They must, then, fall somewhere between applied scientists and technologists (inclusive).

To say that someone who practices a specialty that is based on scientific findings is therefore a scientist is like saying a sandwich-shop employee is a farmer because they use bacon, lettuce, and tomato in their work. (The fact that surgery is far more specialized does not invalidate the argument.)

The professions that discover, invent, develop, and apply are all different. The roles can overlap: scientists do develop and build new equipment to perform their experiments, but these are not mass-produced. Anything we can purchase repeatedly on Amazon or at Best Buy, say, was not made by scientists. It was designed, developed, tested, and manufactured by engineers, technologists, technicians, and other professionals, not by scientists, even if scientists were involved in the early stages. As for applied scientists, including those who work at laboratories, characterizing soil samples, say, or performing tests, they are also highly trained specialists of scientific background who are not doing science at that point. As one XKCD comic suggested [3], you can simply order a lab coat from a catalog; no one will check your publication record. Science is not solely about what you’re wearing or what degrees you have; it’s about what, exactly, you’re doing.

The public’s idea of what science is seems to be “mathy and difficult, preferably done in a lab coat while uttering multisyllabic words you don’t want to see in your cereal’s list of ingredients.” This may be a decent shortcut for pop-culture purposes, but it is not what science really is. I will not go into the inductive-method-vs-hypothetico-deductive-method-vs-what-have-you debate here because there are people who do that professionally, and do it very well. (I have been enjoying Salmon’s The Foundations of Scientific Inference [4] immensely.) What I do want to do is draw two parallels in succession, first from the preceding discussion to explanation and understanding, and from those concepts, to explanations and understanding of clave (in music).

The former has been done quite successfully in Paul Dirac-medal-and-prize-winning physicist Deutsch’s earlier book The Fabric of Reality [5]. I am not concerned here with the bulk and main point of his book, but only with his opening argument about the role of science (explanation) and what it means to understand. Deutsch criticizes instrumentalism because of its emphasis on prediction at the cost of explanation (pp. 3–7). He gives rather good examples of situations in which no scientist (or layperson, for that matter) would be satisfied with good predictions without explanations (p. 6, for example). He does not deny the role and importance of predictions, but argues that “[t]o say that prediction is the purpose of scientific theory is [. . .] like saying that the purpose of a spaceship is to burn fuel” (similar to another author’s argument that the purpose of a car is not to make vrooom–vrooom noises; they just happen to do that as part of their operation[iii]). Deutsch states that just like spaceships have to burn fuel to do what they’re really meant to do, theories have to pass experimental tests in order “to achieve the real purpose of science, which is to explain the world.” (Think about it: Why did we all, as children, get excited about science? To understand the world!)

He then moves on to explain that theories with greater explanatory power than the ones they’ve replaced are not necessarily more difficult to understand, and certainly do not necessarily add to the list of theories one has to understand to be a scientist (or an enthusiast). Theories with better explanatory power can be simpler. Furthermore, not everything that could be learned and understood needs to be: See his example of multiplication with Roman numerals (pp. 9–10). It might be fun, and occasionally necessary to have some source in which to look it up (for purposes of the history of mathematics, say), but it’s not something anyone today needs a working knowledge of; it has been superseded. His example for this is how the Copernican system superseded the Ptolemaic system, and made astronomy simpler in the process (p. 9). All of this is discussed in order to make the point that there is a distinction between “understanding and ‘mere’ knowing” (p. 10), which is where my interest in clave comes into play.

Several “explanations” of clave (sometimes even with that word in the title) that were published in recent years have been of the “mere knowing” type in which clave patterns are listed, without any explanation as to how and why they indicate what other patterns are allowed or disallowed in the idiom. Telling someone that x..x..x...x.x... is 3-2, and ..x.x...x..x..x. is its opposite, so 2-3, and (essentially) “there you go, you now know clave” does nothing towards explaining why a certain piano pattern played over one is “sick” (good) and over the other, sickening (bad) within the idiom.

Imagine if the natural sciences went about education the way we musicians do with clave. A chapter in a high-school biology book would contain a diagram of the Krebs cycle, with all the inputs, outputs (sorry for the electrical-engineer language), and enzymes given by name and formula, followed by “and now you know biochemical pathways,” without any explanation as to how it has anything to do with an organism being alive. I’m flabbergasted that musicians and music scholars find mere listings of clave son, clave rumba, [and . . . you know, the other one that won’t be named[iv]] sufficient as so-called explanations[v].

All of this reminds me of an argument I once had with a very intelligent person. I had said, in my talk at Tuesday Talks, that science is concerned with ‘why’ and ‘how’, not just ‘how’. He disagreed, which I think is because he thought of a different type of ‘why’: the theological ‘why’. I, instead, had in mind Deutsch’s type of ‘why’: “about what must be so, rather than what merely happens to be so; about laws of nature rather than rules of thumb” (p. 11). I would add, about consistency (even given Goedel, because I’m Bayesian like that, and not so solipsistic), which Deutsch mentions immediately afterwards, calling it ‘coherence’.[vi]

I understand that Hume, Goedel, and others have shown us that our confidence in science, or even math, ought not to be infinite. It isn’t. Even in a book like The God Delusion, even Richard Dawkins makes it clear that he is not absolutely certain. Scientific honesty requires that we not be absolutely certain. But we can examine degrees of (un)certainty, and specifically because of the solipsists, we have to ignore them[vii], and be imperfect pursuers of an imperfect truth, improving our understanding, all the while knowing that it could all be wrong.

To that end, I continue to test my clave hypothesis under different genres. Even if it’s wrong, it definitely is elegant.

[1] Alexander, M.D., E., Proof of Heaven: A Neurosurgeon’s Near-Death Experience and Journey into the Afterlife, Simon & Schuster, 2012.

[2] Baron, R. A., and Kalsher, M. J., Essentials of Psychology, Needham, MA: Allyn & Bacon, A Pearson Education Company, 2002.

[3] http://xkcd.com/699/ (last accessed 12/25/2015).

[4] Salmon, W. C., The Foundations of Scientific Inference, Pittsburgh, Pennsylvania: University of Pittsburgh Press, 1966.

[5] Deutsch, D., The Fabric of Reality: A leading scientist interweaves evolution, theoretical physics, and computer science to offer a new understanding of reality, New York: Penguin Books, 1997.

[i] Scientists do not speak of proof; they deal with evidence. Proofs are limited to the realm of mathematics. There are no scientific proofs; there are just statistically significant results, which are presented to laypersons as ‘proof’ because even scientists have quite a lot of difficulty interpreting measures of statistical significance, and the average person has no patience for or interest in the details of philosophy of science.

[ii] The authors of [2] give the following excellent definitions for these precise terms. Accuracy: “gathering and evaluating information in as careful, precise, and error-free a manner as possible”; objectivity: “obtaining and evaluating such information in a manner as free from bias as possible” [Ibid.]. ‘Bias’ in this case refers to the cognitive biases that are natural to human thinking and judgment, such as confirmation bias, Hawthorne effect[ii], selection bias, etc.; skepticism: the willingness to accept findings “only after they have been verified over and over”; and open-mindedness: not resisting changing one’s own views—even those that are strongly held—in the face of evidence that they are inaccurate [2]. To these we can add principles like transferability and falsifiability, and the key tools of double-blinding, randomization, blocking, and the like. Together, all these techniques and principles constitute science. Simply being trained in science and carrying out science-based work is not sufficient.

[iii] I think it was Philips in The Undercover Philosopher, but I’m not sure.

[iv] If you’ve read my post about running into cool people from SoundCloud at NIPS ’15, you’ll know what pattern I’m talking about: the English-horn-like-named pattern.

[v] Fortunately, we do have work from the likes of Mauleón and Lehmann that shows causal relationships between individual notes or phrases in different instrumental lines, but since their work and mine, the trend has reverted to listing three patterns, and calling that an explanation.

[vi] Perhaps this paragraph needs its own blog post. . .

[vii] Because, according to them, they don’t exist.

NIPS 2015: Thoughts about SoundCloud, genres, clave tagging, clave gamification, multi-label classification, and perceptual manifolds

On December 9th, at NIPS 2015, I met two engineers from SoundCloud, which is not only providing unsigned artists a venue to get their music heard (and commented on), and providing recommendation and music-oriented social networking, but also, if I understand correctly, is interested in content analysis for various purposes. Some of those have to do with identifying work that may not be original, which can range from quotation to plagiarism (the latter being an important issue in my line of work: education), but also involve the creation of derivative content, like remixing, to which they seem to have a healthy approach. (At the same event, the IBM Watson program director also suggested that they could conceivably be interested in generative tools based on music analysis.)

I got interested in clave-direction recognition to help musicians, because I was one, and I was struggling—clave didn’t make sense. Why were two completely different patterns in the same clave direction, and two very similar patterns not? To make matters worse, in samba batucada, there was a pattern said to be in 3-2, but with two notes in the first half, followed by three notes in the second half. There had to be a consistent explanation. I set out to find it. (If you’re curious, I explained the solution thoroughly in my Current Musicology paper.)


Top: Surdo de terceira. Bottom: The 3-2 partido-alto for cuíca and agogô. Note that playing the partido-alto omitting the first and third crotchet’s worth of onsets results in the terceira.

However, clave is relevant not just to music-makers, but to informed listeners and dancers as well. A big part of music-in-society is the communities it forms, and that has a lot to do with expertise and identity in listeners. Automated recognition of clave-direction in sections of music (or entire pieces) can lead to automated tagging of these sections or pieces, increasing listener identification (which can be gamified) or helping music-making.

My clave-recognition scheme (which is an information-theoretically aided neural network) recognizes four output classes (outside, inside, neutral, and incoherent). In my musicological research, I also developed three teacher models, but only from a single cultural perspective. I have since submitted a work-in-progress and accompanying abstract to AAWM 2016 (Analytical Approaches to World Music) about what would happen if I looked at clave direction from different cultural perspectives (which I have encoded as phase shifts), and graphed the results in the complex plane (just like phase shift in electric circuits).

Another motivating idea came from today’s talk Computational Principles for Deep Neuronal Architectures by Haim Sompolinsky: perceptual manifolds. The simplest manifold proposed was line segments. This is pertinent to clave recognition because among my initial goals was extending my results to non-idealized onset vectors: [0.83, 0.58, 0.06, 0.78] instead of [1101], for example. The line-segment manifold would encode this as onset strengths (“velocity” in MIDI terminology) ranging from 0 (no onset) to 1 (127 in MIDI). This will let me look inside the onset-vector hypercube.
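To make the idealized-versus-graded distinction concrete, here is a minimal Python sketch. It assumes a plain linear mapping between MIDI velocity and onset strength, and the function names and the 0.5 threshold are illustrative assumptions only; none of this is the clave-recognition scheme itself.

```python
def velocities_to_strengths(velocities):
    """Map MIDI velocities (0-127) linearly onto onset strengths in [0, 1]."""
    return [v / 127 for v in velocities]


def idealize(strengths, threshold=0.5):
    """Collapse graded onset strengths to a binary (idealized) onset vector.

    The threshold is a hypothetical choice made for this illustration.
    """
    return [1 if s >= threshold else 0 for s in strengths]


graded = [0.83, 0.58, 0.06, 0.78]          # the non-idealized example from the text
print(idealize(graded))                     # [1, 1, 0, 1]
print(velocities_to_strengths([0, 127]))    # [0.0, 1.0]
```

The binary vector is a corner of the hypercube; the graded vector is a point inside it.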

Another tie-in from NIPS conversations is employing Pareto frontiers with my clave data for a version of multi-label learning. Since I can approach each pattern from two phase perspectives, and up to three teacher models (vigilance levels), a good multi-label classifier would have to provide up to 6 correct outputs, and in the case that a classifier cannot be that good, the Pareto frontier would determine which classifiers are undominated.
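As a rough illustration of what "undominated" means here, the following Python sketch picks the Pareto frontier from a handful of classifiers scored on six outputs (two phase perspectives times three teacher models). The classifier names and all accuracy figures are invented for the example, and higher is assumed to be better on every output.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the names of classifiers that no other classifier dominates."""
    return sorted(
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    )


# Hypothetical per-output accuracies; the numbers are made up for illustration.
scores = {
    "A": (0.9, 0.8, 0.7, 0.9, 0.8, 0.7),
    "B": (0.8, 0.8, 0.6, 0.9, 0.7, 0.7),  # never better than A, sometimes worse
    "C": (0.7, 0.9, 0.8, 0.6, 0.9, 0.8),
}
print(pareto_front(scores))  # ['A', 'C']
```

Neither A nor C beats the other on all six outputs, so both stay on the frontier, while B drops out.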

Would all this be interesting to musicians? Yes, I think so. Even without building clave-trainer software into various percussion gear or automated-accompaniment keyboards, this could allow clave direction to be gamified. Considering all the clave debates that rage in Latin-music-ian circles (such as the “four great clave debates” and the “clave schism” issues such as those around Giovanni Hidalgo’s labeling scheme quoted in Modern Drummer*), a multi-perspective clave-identification game could be quite a hit.

So, how does a Turkish math nerd get to be obsessed by this? I learned about clave—the Afro-Latin (or even African-Diasporan) concept of rhythmic harmony that many people mistake for the family of fewer than a dozen patterns, or for a purely Cuban or “Latin” organizational principle—around 1992 from the musicians of Bochinche and Sonando, two Seattle bands. I had also grown up listening to Brazilian (and Indian, Norwegian, US, and German) jazz in Turkey. (My first live concert by a foreign band was Hermeto Pascoal e Grupo, featuring former CBC faculty Jovino Santos Neto.) So, I knew that I wanted to learn about Brazilian music. (At the time, most of what I listened to was Brazilian jazz, like Dom Um Romao and Airto, and I had no idea that they mostly drew from nordestino music, like baião, xote, côco, and frevo**―not samba).

Fortunately, I soon moved to Portland, where Brian Davis and Derek Reith of Pink Martini had respectively founded and sustained a bloco called Lions of Batucada. Soon, Brian introduced us to Jorge Alabê, and then to California Brazil Camp, with its dozens of amazing Brazilian teachers. . . But let’s get back to clave.

I said above that clave is “the Afro-Latin (or even African-Diasporan) concept of rhythmic harmony that many people mistake for the family of fewer than a dozen patterns, or for a purely Cuban or ‘Latin’ organizational principle.” What’s wrong with that?

Well, clave certainly is an organizational principle: It tells the skilled musician, dancer, or listener how the rhythm (the temporal organization, or timing) of notes in all the instruments may and may not go during any stretch of the music (as long as the music is from a tradition that has this property, of course).

And clave certainly is a Spanish-language word that took on its current meaning in Cuba, as explained wonderfully in Ned Sublette’s book.

However, the transatlantic slave trade did not only move people (forcefully) to Cuba. The Yorùbá (of today’s southwest Nigeria and southeast Benin), the Malinka (a misnomer, according to Mamady Keïta, for people from Mali, Ivory Coast, Burkina Faso, Gambia, Guinea, and Senegal), and the various Angolan peoples were brought to many of today’s South American, Caribbean, and North American countries, where they culturally and otherwise interacted with Iberians and the natives of the Americas.

Certain musicological interpretations of Rolando Antonio Pérez Fernández’s book La Binarización de los Ritmos Ternarios Africanos en América Latina have argued that the organizational principles of Yoruba 12/8 music, primarily the standard West African timeline (X.X.XX.X.X.X)

Bembé ("Short bell") or the standard West African timeline, along with its major-scale analog

and the Malinka/Manding timelines met the 4/4 time signatures of Angolan and Iberian music, and morphed into the organizational timelines of today’s rumba, salsa, (Uruguayan) candombe, maracatu, samba, and other musics of the Americas.
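The "major-scale analog" mentioned in the caption above can be checked in a few lines of Python. This is only a sketch: it assumes the twelve-pulse spelling X.X.XX.X.X.X for the standard timeline, and the helper names are made up for the illustration.

```python
def onsets(tubs):
    """Indices of onsets in a TUBS-style string ('X' or 'x' = onset, '.' = rest)."""
    return [i for i, c in enumerate(tubs) if c.upper() == "X"]


def inter_onset_intervals(tubs):
    """Gaps between successive onsets, wrapping around the end of the cycle."""
    pos = onsets(tubs)
    n = len(tubs)
    return [(pos[(k + 1) % len(pos)] - p) % n for k, p in enumerate(pos)]


timeline = "X.X.XX.X.X.X"                    # twelve-pulse spelling assumed here
major_scale_steps = [2, 2, 1, 2, 2, 2, 1]    # whole and half steps, in semitones

print(inter_onset_intervals(timeline))                        # [2, 2, 1, 2, 2, 2, 1]
print(inter_onset_intervals(timeline) == major_scale_steps)   # True
```

Seven onsets in twelve pulses line up with seven scale degrees in twelve semitones.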

Some of those timelines we all refer to as clave, but for others, like the partido-alto in Brazil***, it is sometimes culturally better not to refer to them as clave patterns. (This is understandable, in that Brazilians speak Portuguese, and do not always like to be mistaken for Spanish-speakers.)

Conceptually, however, partido-alto in samba plays the same organizational role that clave plays in rumba and salsa, or the gongue pattern plays in maracatu: It immediately tells knowledgeable musicians how not to play.

In my research, I found multiple ways to look at the idiomatic appropriateness of arbitrary timing patterns (more than 10,000 of them, only about a hundred of which are “traditional” [accepted, commonly used] patterns). I identified three “teacher” models, which are just levels of strictness. I also identified four clave-direction categories. (Really, these were taught to me by my teacher-informers, whose reactions to certain patterns informed some of the categories.)

Some patterns are in 3-2 (which I call “outside”). While the 3-2 clave son (X..X..X...X.X...):

3-2 (outside) clave son, in northern and TUBS notation

is obvious to anyone who has attempted to play anything remotely Latin, it is not so obvious why the following version of the partido-alto pattern is also in the 3-2 direction****: .X..X.X.X.X..X.X

The plain 3-2 partido-alto pattern. (The pitches are approximate and can vary with cuíca intonation or the agogô maker’s accuracy.) "Bossa clave" in 3-2 and 2-3 are added in TUBS notation to show the degree of match and mismatch with 3-2 and 2-3 patterns, respectively.
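The "degree of match and mismatch" that the figure shows can be approximated by counting coinciding onsets. The Python sketch below does only that: it is a crude overlap count, not the clave-direction grammar, and the function names are invented for the illustration. The 2-3 "bossa clave" is obtained by swapping the halves of the 3-2 one.

```python
def onset_set(tubs):
    """Set of pulse indices that carry an onset ('X') in a TUBS-style string."""
    return {i for i, c in enumerate(tubs) if c == "X"}


def rotate_half(tubs):
    """Swap the two halves of the cycle (turns a 3-2 pattern into its 2-3 form)."""
    half = len(tubs) // 2
    return tubs[half:] + tubs[:half]


def shared_onsets(a, b):
    """Number of pulses on which both patterns have an onset."""
    return len(onset_set(a) & onset_set(b))


partido_alto_32 = ".X..X.X.X.X..X.X"
bossa_32 = "X..X..X...X..X.."
bossa_23 = rotate_half(bossa_32)            # "..X..X..X..X..X."

print(shared_onsets(partido_alto_32, bossa_32))  # 3
print(shared_onsets(partido_alto_32, bossa_23))  # 1
```

Three shared onsets against one agrees with the 3-2 reading, although the count alone does not explain why.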


Some patterns are in 2-3 (which I call “inside”). Many patterns that are heard throughout all Latin American musics are clave-neutral: They provide the same amount of relative offbeatness no matter which way you slice them. The common Brazilian hand-clapping pattern in pagode, X..X..X.X..X..X. is one such pattern:

The clave-neutral hand-clapping pattern in pagode, AKA, tresillo (a Cuban name for a rhythm found in Haitian konpa, Jamaican dancehall, and Brazilian xaxado)

It is actually found throughout the world, from India and Turkey, to Japan and Finland, and throughout Africa; from Breakbeats to Bollywood to Metal. (It is very common in Metal.) The parts played by the güiro in salsa and by the first and second surdos in samba have the same role: They are steady ostinati of half-cycle length. They are foundational. They set the tempo, provide a reference, and go a long way towards making the music danceable. (Offbeatness without respite, as Merriam said*****, would make music undanceable.)

Here are some neutral patterns: X...X...X...X... (four on the floor, which, with some pitch variation, can be interpreted as the first and second surdos):

Four quarter notes, clave-neutral (from Web, no source available)

....X.X.....X.X. (from ijexá):

surdo part for ijexá (from http://www.batera.com.br/Artigos/dia-do-folclore)


and XxxXXxxXXxxXXxxX. (This is a terrible way to represent swung samba 16ths. Below are Jake “Barbudo” Pegg’s diagrams, which work much better.)

Jake "Barbudo" Pegg's samba-sixteenths accent and timing diagrams (along with the same for "Western" music)
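One thing the neutral examples above share is that each is an ostinato of half-cycle length: the second half repeats the first. The Python sketch below tests only for that property. Treat it as a sufficient sign of neutrality under that assumption, not as the full test used in the research; the function name is made up.

```python
def is_half_cycle_ostinato(tubs):
    """True if the second half of the cycle repeats the first half exactly."""
    half = len(tubs) // 2
    return len(tubs) % 2 == 0 and tubs[:half] == tubs[half:]


patterns = {
    "pagode hand-clapping": "X..X..X.X..X..X.",
    "four on the floor": "X...X...X...X...",
    "ijexá surdo": "....X.X.....X.X.",
    "samba sixteenths": "XxxXXxxXXxxXXxxX",
    "3-2 clave son": "X..X..X...X.X...",   # directional, included for contrast
}
for name, pattern in patterns.items():
    print(f"{name}: {is_half_cycle_ostinato(pattern)}")
```

Only the clave son comes out False, since its two halves differ.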

The fourth category is incoherent patterns. These are patterns that are not neutral, yet do not conform to either clave direction. (One of my informers gave me the idea of a fourth category when he reacted to one such pattern by making a disgusted face and a sound like bleaaahh.)

A pattern that has the clave property immediately tells all who can sense it that only patterns in that clave direction and patterns that are clave-neutral are okay to play while that pattern (that direction) is present. (We can weaken this sentence to apply only to prominent or repeated patterns. Quietly passing licks that cross clave may be acceptable, depending on the vigilance level of the teacher model.)
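The rule in the preceding paragraph can be written down as a small Python sketch. The names are hypothetical, and two simplifications are assumed: vigilance is reduced to a single lenient-or-strict flag rather than three teacher models, and a lick that "crosses clave" is taken to mean one in the opposite direction, so incoherent patterns are never excused.

```python
from enum import Enum


class ClaveSense(Enum):
    OUTSIDE = "3-2"
    INSIDE = "2-3"
    NEUTRAL = "neutral"
    INCOHERENT = "incoherent"


def ok_to_play(prevailing, candidate, quiet_passing_lick=False, lenient_teacher=False):
    """Decide whether a candidate pattern fits over the prevailing clave direction."""
    if prevailing not in (ClaveSense.OUTSIDE, ClaveSense.INSIDE):
        raise ValueError("the prevailing pattern must carry a clave direction")
    # Same direction or clave-neutral: always acceptable.
    if candidate is ClaveSense.NEUTRAL or candidate is prevailing:
        return True
    # Assumed relaxation: a quiet passing lick in the opposite direction
    # is tolerated only by a lenient teacher model.
    crosses_clave = candidate in (ClaveSense.OUTSIDE, ClaveSense.INSIDE)
    return crosses_clave and quiet_passing_lick and lenient_teacher


print(ok_to_play(ClaveSense.OUTSIDE, ClaveSense.NEUTRAL))   # True
print(ok_to_play(ClaveSense.OUTSIDE, ClaveSense.INSIDE))    # False
print(ok_to_play(ClaveSense.OUTSIDE, ClaveSense.INSIDE,
                 quiet_passing_lick=True, lenient_teacher=True))  # True
```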

So, why mention all this right now? (After all, I’ve published these thoughts in peer-reviewed venues like Current Musicology, Bridges, and the Journal of Music, Technology and Education.)

For one thing, those are not the typical resources most musicians turn to. Until I can write up a short, highly graphical version of my clave-direction grammar for PAS, I will need to make some of these ideas available here. Secondly, the connections to gamification and to musical-social-networking sites like SoundCloud are new ideas I got from talking to people at the NIPS reception, and I wanted to put this out there right away.



* Mattingly, R., Modern Drummer, Modern Drummer Publications, Inc., Cedar Grove, NJ, “Giovanni Hidalgo-Conga Virtuoso,” p. 86, November 1998.

** While talking to Mr. Fereira of SoundCloud this evening at NIPS, he naturally mentioned genre recognition, which is the topic of my second-to-last post. (I argued about the need for expert listeners from many cultural backgrounds, which could be augmented with a sufficiently good implementation of crowd-sourcing.) I think he was telling me about embolada, or at least that’s how I interpreted his description of this MC-battle-type of improvised nordeste music. How many genre-recognition researchers even know where to start in telling a street-improvisation embolada from even, say, a pagode-influenced axé song like ‘Entre na Roda’ by Bom Balanço? (Really good swing detection might help, I suppose.)

*** This term has multiple meanings; I’m not referring to the genre partido-alto, but the pattern, which is one of the three primary ingredients of samba, along with the strong surdo beat on 2 (and 4) and the swung samba 16ths.

**** in the sense that, in the idiom, it goes with the so-called 3-2 “bossa clave” (a delightful misnomer): X..X..X...X..X.., as well as with the rather confusing (to some) third-surdo pattern ....X.X.....XX.X, which has two notes in its first half, and three notes in its second half. (Yes, it’s in 3-2. My grammar for clave direction explains this thoroughly. [http://academiccommons.columbia.edu/catalog/ac:180566])

[Figure caption: The "bossa clave" is a bit like an English horn; it's neither.]

[Figure caption: Top: Surdo de terceira. Bottom: The 3-2 partido-alto for cuíca and agogô. Note that playing the partido-alto omitting the first and third crotchet’s worth of onsets results in the terceira.]

***** See Merriam: “continual use of off-beating without respite would cause a readjustment on the part of the listener, resulting in a loss of the total effect; thus off-beating [with respite] is a device whereby the listeners’ orientation to a basic rhythmic pulse is threatened but never quite destroyed” (Merriam, Alan P. “Characteristics of African Music.” Journal of the International Folk Music Council 11 (1959): 13–19.)

ALSO, I use the term “offbeatness” instead of ‘syncopation’ because the former is not norm-based, whereas the latter turns out to be so:

Coined by Toussaint as a mathematically measurable rhythmic quantity [1], offbeatness has proven invaluable to the preliminary work of understanding Afro-Brazilian (partido-alto) clave direction. It is interpreted here as a more precise term for rhythmic purposes than ‘syncopation’, which has a formal definition that is culturally rooted: Syncopation is the placement of accents on normally unaccented notes, or the lack of accent on normally accented notes. It may be assumed that the norm in question is that of the genre, style or cultural/national origin of the music under consideration. However, in all usage around the world (except mine), normal accent placement is taken to be normal European accent placement [2, 3, 4].

For example, according to Kauffman [3, p. 394], syncopation “implies a deviation from the norm of regularly spaced accents or beats.” Various definitions by leading sources cited by Novotney also involve the concepts of “normal position” and “normally weak beat” [2, pp. 104, 108]. Thus, syncopation is seen to be norm-referenced, whereas offbeatness is less contextual as it depends solely on the tactus.

Kerman, too, posits that syncopation involves “accents in a foreground rhythm away from their normal places in the background meter. This is called syncopation. For example, the accents in duple meter can be displaced so that the accents go on one two, one two, one two instead of the normal one two, one two” [4, p. 20; all emphasis in the original, as written]. Similarly, on p. 18, Kerman reinforces that “[t]he natural way to beat time is to alternate accented (“strong”) and unaccented (“weak”) beats in a simple pattern such as one two, one two, one two or one two three, one two three, one two three.” [4, p. 18]

Hence, placing a greater accent on the second rather than on the first quarter note of a bar may be sufficient to invoke the notion of syncopation. By this definition, the polka is syncopated, and since it is considered the epitome of “straight rhythm” to many performers of Afro-Brazilian music, syncopation clearly is not the correct term for what the concept of clave direction is concerned with. Offbeatness avoids all such cultural referencing because it is defined solely with respect to a pulse, regardless of cultural norms. (Granted, what a pulse is may also be culturally defined, but there is a point at which caveat upon caveat becomes counterproductive.)

Furthermore, in jazz, samba, and reggae (to name just a few examples) this would not qualify as syncopation (in the sense of accents in abnormal or unusual places) because beats other than “the one” are regularly accented in those genres as a matter of course. In the case of folkloric samba, even the placement of accents on the second eighth note, therefore, is not syncopation because at certain places in the rhythmic cycle, that is the normal—expected—pattern of accents for samba, part of the definition of the style. Hence, it does not constitute syncopation if we are to accept the definition of the term as used and cited by Kauffman, Kerman, and Novotney. In other words, “syncopation” is not necessarily the correct term for the phenomenon of accents off the downbeat when it comes to non-European music.

Moreover, in Meter in Music, Houle observes that “[a]ccent, defined as dynamic stress by seventeenth- and eighteenth-century writers, was one of the means of enhancing the perception of meter, but it became predominant only in the last half of the eighteenth century [emphasis added]. The idea that the measure is a pattern of accents is so widely held today that it is difficult to imagine that notation that looks modern does not have regular accentual patterns. Quite a number of serious scholarly studies of this music [European art music of 1600–1800] make this assumption almost unconsciously by translating the (sometimes difficult) early descriptions of meter into equivalent descriptions of the modern accentual measure” [5, p. viii]. Thus, it turns out that the current view of rhythm and meter is not natural, or even traditional, let alone global. In fact, Essential Dictionary of MUSIC NOTATION: The most practical and concise source for music notation is perfect for all musicians—amateur to professional (the actual book title) states that “the preferred/recommended beaming for the 9/8 compound meter is given as three groups of three eighth notes” [6, p. 73]. This goes against the accent pattern implied by the 9/8 meter in Turkish (and other Balkan) music, which is executed as 4+5, 5+4, 2+2+2+3, etc., but rarely 3+3+3. The 9/8 is one of the most common and typical meters in Turkish music, not an atypical curiosity. This passage is included here to demonstrate the dangers in applying western European norms to other musics (as indicated by the phrase “perfect for all musicians”).
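For readers who want to try counting offbeatness themselves, here is a rough sketch. It assumes one reading of Toussaint's measure [1]: in a cycle of n pulses, a pulse counts as off-beat when its index (counting from 0) shares no common factor with n, which for a 16-pulse cycle means the odd-numbered pulses. The function names and the box-notation strings (X for an onset, a dot for a rest) are illustrative choices of this sketch, not anything taken from the cited work. The two strings are the 16-pulse bossa-clave and third-surdo patterns quoted in the footnotes above, typed out with plain dots.

```python
from math import gcd


def onsets(pattern):
    """Return the pulse indices marked 'X' in a box-notation string."""
    return [i for i, ch in enumerate(pattern) if ch == "X"]


def offbeatness(pattern):
    """Count onsets that fall on pulses whose index is coprime with the cycle length.

    Assumption of this sketch: a pulse is 'off-beat' when gcd(index, n) == 1,
    i.e. it is not hit by any even subdivision of the n-pulse cycle.
    """
    n = len(pattern)
    return sum(1 for i in onsets(pattern) if gcd(i, n) == 1)


# Hypothetical labels for the two 16-pulse patterns quoted in the footnotes.
BOSSA_32 = "X..X..X...X..X.."
TERCEIRA = "....X.X.....XX.X"

for name, pattern in [("3-2 bossa clave", BOSSA_32), ("third surdo", TERCEIRA)]:
    print(name, onsets(pattern), offbeatness(pattern))
```

Nothing in this count refers to where accents "normally" go in any tradition; it needs only the pulse grid, which is the property that makes the term attractive here.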

[1]    Toussaint, G., 2005. Mathematical Features for Recognizing Preference in Sub-Saharan African Traditional Rhythm Timelines. Lecture Notes in Computer Science 3686:18-27. Springer Berlin/Heidelberg.
[2]    Novotney, E. D. “The 3-2 Relationship as the Foundation of Timelines in West African Musics,” University of Illinois at Urbana-Champaign (Ph.D. dissertation), Urbana-Champaign, Illinois, 1998.
[3]    Kauffman, R. 1980. African Rhythm: A Reassessment. Ethnomusicology 24 (3):393–415.
[4]    Kerman, J., LISTEN: Brief Edition, New York, NY: Worth Publishers, Inc., 1987, p. 20.
[5]    Houle, G., Meter in Music, 1600–1800: Performance, Perception, and Notation, Bloomington, IN: Indiana University Press, 1999.
[6]    Gerou, T., and Lusk, L., Essential Dictionary of MUSIC NOTATION: The most practical and concise source for music notation is perfect for all musicians—amateur to professional, Van Nuys, CA: Alfred Publishing Co., Inc., 1996.

It’s not only the rent: Old, new, and middle Portland

There has been much ferment, uproar, and outcry against the gentrification of “the old Portland” in the weeklies of Portland, Oregon, and in conversations around town lately. The skyrocketing of rent is a well-known and much discussed issue, as is the second big migration of a certain underrepresented minority out of what have become the new standard boundaries of hip Portland. Another, somewhat less publicized reason to be concerned (except for the recent WW article) is the squeezing out of Portland’s artists, the very people who took a grimy drug-troubled city no one outside the Pacific Northwest had heard of, and turned it into the modern designer clean-living mecca of the United States. To understand this process, and what I think is going wrong, I must clarify a point of definition: Most people talking about “the old Portland” are not actually talking about old Portland; they’re referring to what I will call “middle Portland.” “The old Portland” is what you can see in the movie ‘Drugstore Cowboy’: Crime, drugs, rain, empty streets, and little to do.

I moved to Portland between the old and middle periods, in 1995. My first visit, a few years prior, had me entering the city on a Greyhound bus through the NW Industrial Zone (not exactly a pretty sight, but a necessary one), and staying at a hostel on Hawthorne just to see a famous Senegalese band before I headed back to the small town where I was going to college.

Portland was legendary: It had La Luna and Satyricon. Bands like Dead Moon, WIPERS, and Poison Idea were rumoured to play there. I could only imagine what they were like. I later found out I was pretty far off. In any case, I did eventually move to Portland in 1995 to go to grad school, preferring PSU to higher-ranking universities because I wanted to be in a city, no matter how small.

And it was small. Traffic was virtually nonexistent. People wore sweatpants everywhere, unless they cared even less and wore pajama bottoms, or cared more and wore outrageously awesome punk outfits. High-heeled shoes were unknown, unless they were worn by occasional glam holdovers. It was nothing like the Portland of 2005, what I call middle Portland, or the Portland of today, 2015, the new Portland.

In 2005, you could still get from any part of town to any other in 45 minutes by bus (Tri-Met) and 15 minutes if you drove. Downtown to Hillsboro took 20 minutes. A few years prior to that, I lived in the Brooklyn neighborhood, close-in SE, and worked in Hillsboro. My commute took about 25 minutes.

I am not listing these travel times for purposes of complaining, but only for comparison. After all, I grew up in a city of 12 million, and to this day, I’m not especially bothered by even a two-hour commute. My point is that the Portland of 1995 was different from today’s, and the Portland of 2005 was different from both.

What I experienced was the development of Portland into an arts mecca, the next Seattle or Austin (from whom we stole our Music Millennium slogan), and the city collectors traveled to from as far as Japan to buy vinyl records. In 2008, when I attended a conference in Philly, a Drexel student asked me how long I had lived in Portland. When I told her I’d been there since ’95, she said “Oh, so you’ve been there since before it was cool.”

Yes, I was a small part of making it that way—I’m one of the thousands of musicians and maybe tens of thousands of artists overall, that helped turn Portland into the place to be if you wanted to be cool. . . not one of the significant ones who made it big, but I was there, playing behind a few of the big names whenever I could, all because I happened to talk to everyone I met about being a drummer. There weren’t that many around, and I eventually met some awesome people who taught me, encouraged me, and occasionally called me up for something pretty awesome.

But, this story is not about me; it’s about those who are still trying to make music, make art, make films, and maybe even make it in Portland. (I was going to say “make it big” but these days, people are just trying to get by.)

And here’s the rub. When the people moving into old east-side neighborhoods start lobbying to end late-night live music or pressure their neighbors to stop practicing in their basements, they are trying to turn Portland-proper into a suburb. They moved to Portland because it’s “cool,” part of which is that it has interesting jobs and beautiful houses to live in within walking distance of bars, restaurants, and coffee shops. Many of those establishments are staffed by musicians, painters, graphic designers, theater actors, comedians, and writers. What made Portland cool in the first place was the artists! The musicians and graffiti artists are foremost among the people who made Portland visible to the rest of the world (though I should not forget the graphic artists, some of whose work reached me in my crazy third-world hometown back in the ’80s). And everyone contributed to the liberal, progressive, sometimes-so-woo-as-to-be-regressive, but always artistic culture of Portland. These people are being driven away by the rapidly rising cost of living, and also being told to stop making all that noise and mess.

This post was inspired by the entry ‘Manufactured Spaces’ (specifically pp. 47–49) in the book ‘Portlandness: A Cultural Atlas’. Created by a big team of cartographers, designers, students, and teachers, this is both a beautiful and a substantial book. The discussion of the official interpretation of quality of life drove me to add my voice to the uproar over the new Portland. True, I wasn’t born or raised there, but I spent more of my life there than anywhere else. I don’t exactly miss the old Portland, and I don’t mind many of the improvements of the new Portland. But I do worry about the destruction of middle Portland, which to me is all about the arts.

Culturally Situated and Image-based Genre Attribution

Genre recognition has become the holy grail of music information retrieval. What concerns me, before we worry about machine recognition of musical genre, is whether people can agree at all on what genre means, and what the various genres are. Wikipedia, Echonest, and many other sites (some now defunct) have put forth excellent information on various musical genres and their relationships to one another. My critique of the genre discussions I have encountered to date falls into two categories. One is the (necessarily, and not surprisingly) culturally narrow perspective of most work on musical genres. The other is the role non-aural, non-audio features play in the determination of genre. (These can be metadata, like release dates, or even more [sub]culturally determined information such as the clothing style of the artists.)

Let’s take the problem of narrowly culturally situated efforts first. There have been a variety of impressive resources on the Internet about the sub-sub-sub-genres of electronic music and of extreme metal. There is a wonderful degree of detail provided in these Web resources. However, the effort put into very subtle distinctions among “northern-based” musics (anything we typically understand as pop music, plus the folk and court musics of northern Europe and North America**) is rarely, if ever, matched by the knowledge available (in English, at least) on musics from other countries. We typically find some half a dozen genres listed for Brazil, Mexico, Japan, or Cuba, and far fewer for China, Turkey, Belize, Honduras, or Mali. This is a typical case of out-group bias, which is easy to understand; all people are subject to out-group bias. The importance of understanding biases lies in the effort to move beyond them. Are the differences among Xote, Brukdown, Özgün Müzik, and Guarapachangeo less significant than the differences between Goa Trance and Happy Hardcore, or Grindcore and Power Violence?*** Of course not, but who can know every little detail about the impossibly rich musical landscape of every culture? (That’s why we need multi-cultural teams to work on genre recognition and classification.)

The other issue is one I am only aware of in terms of “northern” (western) popular forms of music, and it is the issue of image-based, fashion-based, temporal, and geographical genre attribution. In many cases, the clothes worn by rock and pop artists seem to determine their musical genre more than the sounds created and organized into musical works by those artists. For example, Billy Idol and Avril Lavigne are thought of as Punk Rock artists. Yet, and even without appealing to DIY ethics and political content, we can tell from the aural experience that these artists make (or have made) something sufficiently aurally distant from the music of CRASS, pragVEC, Buzzcocks, X-Ray Spex, or BAD RELIGION, and that theirs are genres well removed from Punk Rock. (The artists listed do not all sound the same, but they share the elements of disaffected vocals, a lack of polish, and an overall dark despair with one another and with bands as far removed from them as Joy Division, The Paper Chase, Depeche Mode, and Sleater-Kinney, all of which have more sonic elements in common than they do with Idol or Lavigne.)

What makes the problem even more difficult is that genre names are rarely descriptive, and all too often temporally and geographically limiting. Consider the genres NWOBHM (New Wave of British Heavy Metal), New Wave, Nü (new) Metal, Grunge, and Old-School Hip Hop.

Quite apart from the problem that “New Wave” actually has at least three different meanings, it is sonically possible (and common) for an artist making music thirty years after the end of the era attributed to one of these genres to make music with the same structure, affectation, instruments, sounds, and production. Which should we consider in determining genre: the year of release or the way the music sounds? New-millennial bands like Titanium Black and The Haunted, and even punk-rockers like Saviours, often play a flavor of Metal that sounds just like NWOBHM, but we are not supposed to call them that if they are from a different time, and especially, a different place. Likewise, ACCEPT and SCORPIONS (from Germany) sometimes played the same type of music, stylistically speaking, as Judas Priest, DIO, and IRON MAIDEN, but since they’re not British, we cannot refer to their music as NWOBHM. Or can we? Is it not the sounds and how they are organized that matters in determining music? (I think so.) Can anyone really tell, in a blinded listening test, whether a rhythm guitarist is German or British?

The Union Underground was a Metal band that had some success during the Nü Metal years. They had the look and the album art to be part of that era and that genre. However, listening to their music in 2008, I could not help but notice that the singing style really had little to do with Nü Metal, and quite a lot to do with Grunge, which was declared over by that time. As far as I can tell, no one talked of TUU as a late Grunge band.

An interesting pair that got me thinking further about image- and time-based genre attribution are Corrosion of Conformity and VOIVOD. Originally starting out in very disparate genres, in the farthest reaches of Hardcore Punk and Prog Metal, these two diverged in their music until their releases of the albums ‘KATORZ’ (by VOIVOD) and ‘CORROSiON OF CONFORMITY’ (by CoC, of course) in the late aughts. I find it nearly impossible to tell these two albums apart stylistically (though each is quite distant from the bands’ earlier output). When I saw CoC perform at Dante’s in Portland, they presented a marvellous synthesis of Prog agility and Punk attitude. (These two were not meant to go together, but it’s happening more and more.) Meanwhile, VOIVOD apparently drifted further and further into Punk Rock, and lost most of their Prog intricacies. Yet, if I were to stick to “what we know those bands to be,” I would be forced to attach opposite labels to songs from those two albums, which, even when I’m looking right at the readout on my display and know what I’m hearing, sound the same to me.

I mentioned Old-School Hip Hop above as well. Every now and then, you hear a new song, and it has that early, innocent flow we associate with everyone from The Jungle Brothers to MC Hammer. It’s old-school in that the time extents of rhythmic phrases in the vocals and the time extent of semantic phrases in the lyrics delivered by the same vocals coincide****.

Yet, maybe it was released in 2013. Conversely, De La Soul was putting out music in 1989 that did not sound old-school; it was like what was going to happen ten years later. (I feel the same way about fu-schnickens’ 1992 album.) Some of the music in those old-school days was well ahead of its time, and some music that gets released even today brings back the old-school style. It’s the sound that counts, not the metadata.

There are many more examples, and perhaps better ones that I will add as I think of them, or hear them, but for now, I will conclude that, 1) genre studies and genre R&D need multi-cultural teams so that the level of attention to detail that is possible for Deep Psytrance vs. Gabber vs. New Romantic vs. New Wave will also be possible for ‘Bulgarian Rock’, ‘Hungarian Rock’, ‘Russian Pop’, and ‘Turkish Pop’. Sure, I’m glad someone in America even cares enough to put those on the map, but given the several hundred varieties of Electronica, Metal, and Hip Hop each, can we really believe there is only one variety of ‘Russian Pop’? (I know for a fact there are quite a few styles and genres within Turkish Pop.*****)


* Yet, no matter how much detail each scholar, researcher, developer, or enthusiast goes into, it is likely to prove insufficient for the aficionado of that sub-sub-genre. The Echonest blog (at http://blog.echonest.com/post/52385283599/how-we-understand-music-genres) recently included the following comment: “. . . somebody, somewhere might care about (e.g. “gothic metal” vs. “symphonic metal” vs. “gothic symphonic metal”).” Yes, somebody right here not only does, but finds those to be rather obvious and relevant distinctions, which are further complicated by Nightwish’s recent experiments combining symphonic, power, and folk metal.

** Why I use the term “northern” rather than “western” will be the topic of another post.

*** Yes, these are real genre names. To a connoisseur who lives Grindcore, Power Violence may sound completely different, while to an outsider neither would be distinguishable from the earliest ’80s Thrash.

**** As Hip Hop matured, it became less and less common for the sentence and its rhythmic phrase to start and end together. This seems to be a result of the recognition that rhythm allows one to rhyme with any syllable in a word, not just the last syllable. So, an MC who wants to rhyme, for instance, ‘crime’ with ‘time’ does not need each sentence to end with one of those words; s/he can rhyme ‘crime’ with ‘time’ in a sentence that might go “It was that time [break here] I went off to the east coast” where ‘time’, due to rhythmic phrasing, took care of the rhyme, and the rest of the sentence could still be uttered. In much old-school Hip Hop, the semantic phrases had to end at the same rhythmic stopping point.

***** As for style versus genre, let me try a quick explanation. A guitarist can play the blues in a Be-Bop context, a Psych context, or a Funk context (to name a few), and an MC/toaster can rap on a Hip Hop song, a Reggaeton, a Rock or Metal song, or even in a piece of modern “classical” music. Similarly, a drummer can play funk, swing, or shuffle in a Jazz band, Rock band, Pop group, or an experimental combo. The elements these musicians bring in are styles (blues, rapping, funk, swing, shuffle), while the complete package of the musical experience will likely fall into a genre or subgenre, like Electro Swing, Funk Metal, Be-Bop, or Chorinho.

Impulse Response: Mendelssohn vs. Monobloco

Mathematicians and engineers gain insight into a system by examining its behavior at the extremes. Given a mathematical expression, we take limits as a variable approaches zero and infinity. This gives us insight that is helpful in between as well.

If it’s a filter (an electrical circuit), we get insights into the behavior of such a system, even one that may never be subject to extreme conditions, by calculating (or simulating) its impulse response[1] (among other techniques).
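To make the idea concrete, here is a minimal sketch in plain Python. It uses the discrete-time stand-in for an impulse (a single 1 followed by zeros) and a simple one-pole low-pass filter; the filter, its coefficient, and all the names are illustrative assumptions, not anything specific to the systems discussed here. The first part checks that the impulse has equal magnitude at every frequency bin, and the second part feeds it through the filter to read off the impulse response.

```python
import cmath


def dft_magnitudes(x):
    """Magnitude of each bin of a naive discrete Fourier transform of x."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def one_pole_lowpass(x, a=0.5):
    """Illustrative filter: y[n] = a * x[n] + (1 - a) * y[n - 1]."""
    y, prev = [], 0.0
    for sample in x:
        prev = a * sample + (1.0 - a) * prev
        y.append(prev)
    return y


N = 8
impulse = [1.0] + [0.0] * (N - 1)  # discrete unit impulse

# The impulse excites every frequency bin equally.
impulse_spectrum = dft_magnitudes(impulse)

# What comes out of the filter is, by definition, its impulse response.
h = one_pole_lowpass(impulse)

print(impulse_spectrum)
print(h)
```

Because the input contains every frequency in equal measure, whatever shape the output spectrum takes is due to the filter alone, which is why a single extreme test signal tells us so much about ordinary operation.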

We can also gain insights into social or cultural systems by exercising them with questions at the extremes. Here is one that can help in thinking about a cultural issue that I have been pondering for about thirty years, and reading and writing about since 2002.

Consider Bach, Beethoven, Mozart, Mendelssohn, and Rachmaninoff. You may not be a trained musician or a music professor; most people aren’t. You may not even know the music of these composers very well. However, I am willing to bet that you, the reader, hold at least some vague notion to the effect that these people have created the greatest music on Earth[2]. Everyone seems to agree that they cannot be topped. Oddly enough, people who never listen to the music of these composers seem to hold that opinion rather more strongly.

Now consider Badenya[3], Babatunde[4], Muñequitos[5], Monobloco[6], and Rose[7]. Do you believe that there is any measure by which not just you, but anyone in the world truly believes this group of five is comparable to the previous group of great Germanic and Russian composers?

If I had been nicer, and asked the question using The Beatles, Britney Spears, Michael Jackson, Madonna, and Rush, say, the politically correct instinct for diversity would likely kick in, and most people, at least in my collegiate, liberal, urban environment, would place the two groups on an equal footing. But I want to exercise the system of thought regarding “quality of music” to the extreme. Are you uncomfortable yet? Do you believe that the Afro-Latin B2M2R is really on par with the dead white European B2M2R? Do you want to, but cannot actually make yourself think or feel that way?

I think that is where most people are, or at least would be if they were interested in this question. I must admit this is much more of an old-world concern than a typical American one. Having been brought up in the old world, at the confluence of Asia and Europe, this question still matters to me after 26 years of American living. Perhaps it is my background that has given me this impression: Any Turkish person, even if they never listen to this type of music, will tell you that Bach, Beethoven, and Mozart made the greatest music in the world (closely followed by Queen . . . and who is this Mendel-something?), and that it is certainly of much better quality than what they listen to every day. This is the idea behind the variously attributed quotation, “Wagner’s music is better than it sounds.”[8]

So, what am I doing about all this? As hinted at above, I have been compiling and researching scholarly material on value and judgment in music since 2002, and writing an article, a very early and embryonic version of which can be dug up by those of a worldy (wide web)-sleuth-like persuasion.

The article, in its current form, examines numerous music textbooks and reference books for qualitative and quantitative measures of the value attached to musics from different cultures, following a broad review of the musicology literature on quality, value, and sophistication. It establishes that there seems to be a cross-cultural baseline of expectation that the sophisticated cultivation of certain aspects of music is valued more highly than equally sophisticated cultivation of other aspects of music.

[1] “Impulse” sounds harmless, but it is a function that attains infinite magnitude in infinitesimal time, and as a direct result, contains all frequencies. (And yes, we can make use of such an abstract concept.)

[2] Perhaps it isn’t as popular as Beyoncé, The Beatles, Mariah Carey, or Lady Gaga, but we still, somehow, consider it the greatest.

[3] Badenya: les frères Coulibaly, a group of musicians from Burkina Faso

[4] Babatunde Olatunji, Nigerian (Yoruba) drummer influential on jazz and rock music of the last four decades

[5] Los Muñequitos de Matanzas, a famous rumba ensemble from Cuba

[6] Brazilian supergroup that pioneered a popular fusion of many traditional and popular styles

[7] Doudou N’Diaye Rose, Senegalese (Wolof) master drummer and ensemble leader

[8] For example, see http://www.quotationspage.com/quote/555.html .