Polymetric Crossrhythm as Aliasing

or Why Carriage Wheels in Westerns Appear to Rotate Backwards and What That Has to Do with Global-Southern Music and Digital Signal Processing

There is a great deal of debate in music-theory circles about what the term ‘polymeter’ means. I’m going to ignore this and use my (and a few other people’s) definition. Polymeter, to me, is when two musical streams have the same smallest subdivision in common but have different numbers of that subdivision making up one cycle.

I was just listening to Fodé Seydou Bangoura’s music when I heard yet another delightful polymeter, which I’ll call crossrhythm so that fewer people stop reading this post in anger.

In this crossrhythm, the background meter is some multiple of 3. (Whether it’s exactly 3, 6, or 12 doesn’t matter for the present purpose; it only changes the arithmetic a little, with essentially the same result.)

Over this background is a solo that, given the tones and accents, has phrases in groupings of four subdivisions. I call this a polymeter; I think many people call it one type of crossrhythm. More importantly, it is what we call aliasing in digital signal processing (DSP).

Here’s how (and why):

If one stream has groupings of three, like so:

1___2___3___1___2___3___1___2___3___1___2___3___1___2___3___1

and the other has groupings of four:

1___2___3___4___1___2___3___4___1___2___3___4___1___2___3___4

and the former only “looks up” to check where the latter is each time it’s back at 1, then here’s what it sees of the second line:

1___2___3___1___2___3___1___2___3___1___2___3___1___2___3___1

1___2___3___4___1___2___3___4___1___2___3___4___1___2___3___4

which, sampled only at the moments when the base rhythm “looks up” at the other, is like this:

1___2___3___1___2___3___1___2___3___1___2___3___1___2___3___1

1___________4___________3___________2___________1___________4

The act of “looking up” each time the base rhythm gets to one is equivalent to the human eye’s (probably, the neural circuitry’s, rather) rate of perception (sampling). Since we can’t see as fast as the spokes of the wheels on a horse-drawn carriage spin, we see the spokes at each time point we’re able to take a visual sample, just like the first rhythm taking a sort of “downbeat sample” and seeing the second rhythm going slowly backwards: 1, 4, 3, 2, 1, 4, …

If you know DSP, you must have noticed that my example is the opposite of actual aliasing. In signal processing, when the signal we are trying to capture has frequency components higher than half our sampling rate, we see those components folded down to lower frequencies. To match this, the music example should look more like the following.

1___2___3___4___5___1___2___3___4___5___1___2___3___4___5___1

1___2___3___4___1___2___3___4___1___2___3___4___1___2___3___4

However, in this case, the perceived second stream is simply going very slowly, not going backwards the way the spokes do in the westerns. For that, we just need a bigger difference between the reference rhythm and the overlaid one.

1___2___3___4___5___6___7___1___2___3___4___5___6___7___1

1___2___3___4___1___2___3___4___1___2___3___4___1___2___3___4

Now we see the “spokes” slowly turning in the opposite direction: 1, 4, 3, …
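In fact, the whole “looking up” procedure fits in a few lines of Python (a minimal sketch I wrote to check the arithmetic; the group sizes are the ones from the examples above):

```python
# Read a 1-based cyclic counter of length signal_period (the overlaid
# rhythm) once per cycle of the base rhythm, i.e., every sampler_period
# subdivisions -- the "downbeat sample" described above.
def perceived_sequence(sampler_period, signal_period, n_samples):
    return [(k * sampler_period) % signal_period + 1 for k in range(n_samples)]

print(perceived_sequence(3, 4, 6))  # [1, 4, 3, 2, 1, 4] -- backwards
print(perceived_sequence(5, 4, 6))  # [1, 2, 3, 4, 1, 2] -- slowly forwards
print(perceived_sequence(7, 4, 6))  # [1, 4, 3, 2, 1, 4] -- backwards again
```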

When I set out to write this post, I didn’t realize that there would be some cases in which the slow (aliased) rhythmic stream would meet the reference meter without the reversal of direction. Does this happen in aliasing in DSP? It does: Aliased components show up in both positive- and negative-frequency basebands. But that doesn’t seem to answer my question, because they combine to form one real-valued low-frequency signal.

I think the answer is that whenever signals (rhythms, turning wheels) are periodic and there is a phase relationship between two such entities, that phase relationship is modulo-2π: If you switch where you look up or down the cyclic waveform, you’ll see the phase shift moving forward or moving backward.

A better answer, perhaps, is that the crossrhythm examples are akin to passband (bandpass) sampling, where a communications signal is modulated up to a band with a lower and an upper band limit, and the Nyquist requirement is a lot nicer than the usual more-than-double-the-highest-frequency: it is simply more than double the bandwidth. In that case, I expect to see the wagon wheels going forward as well as backward, depending on which end of the band we are close to.
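To connect this to the usual folding arithmetic: the apparent (signed) frequency of a sampled tone is the true frequency pulled into the baseband, and its sign (the wheel’s direction) flips as the tone crosses odd multiples of half the sampling rate. A quick numeric check (standard DSP relations, nothing specific to the musical examples; the numbers are made up):

```python
# Signed baseband (apparent) frequency of a tone at f Hz sampled at fs Hz;
# the result lands in (-fs/2, fs/2]. Negative means "appears to go backwards".
def apparent_frequency(f, fs):
    return f - round(f / fs) * fs

fs = 24.0  # say, frames per second of film
for f in (10.0, 14.0, 23.0, 25.0):  # spoke-passing rates in Hz
    print(f, "->", apparent_frequency(f, fs))
# 10.0 -> 10.0 (forward), 14.0 -> -10.0 (backward),
# 23.0 -> -1.0 (slowly backward), 25.0 -> 1.0 (slowly forward)
```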

Mini Book Review: Doc or Quack

I saw the 2025 book Doc or Quack: Science and Anti-Science in Modern Medicine (by Sander L. Gilman, from Reaktion Books, UK) and thought I had to read it right away.

It seems like a well-researched work, but 70 pages in, I think there needs to be a re-edited or rewritten edition because, so far, it is a collection of historical information the point of which is unclear. In fact, the point the author is trying to make—at least, the impression I have of what point the author is trying to make—keeps shifting. The reason for this, it seems to me, is that there are few to no connecting phrases between sentences: no ‘therefore’, ‘nonetheless’, ‘despite’, ‘moreover’, ‘yet’, ‘furthermore’, ‘similarly’, ‘in contrast’, ‘unlike’, ‘having said that’, etc. The sentences that seem to be taking us in different directions follow one another (at least for the first 70 pages) without any indication of what the author wants us to make of them.

What I’ve been able to work out is that the definition of ‘quack’ was at best arbitrary and at worst entirely based on power. I was aided in reaching this conclusion by having read half a dozen other books on the history of medicine. Those other books claimed quite clearly that so-called mainstream medicine did more harm than good prior to the twentieth century. This helps me try to nail down what the author is getting at, but I cannot be certain I’m not misunderstanding because every two or three sentences, he seems to be heading off in a new direction.

For instance, the passages discussing how the Third Reich planned to make homeopathy part of their “legitimate” institution of medicine (and then got distracted) have come up a couple of times without any clear explanation of what the author is getting at by bringing up that history.

Reaktion Books needs to take another editorial pass through this work, which promises to be very enlightening but, so far, fails to be decipherable.

My Seven Eras of Music Delivery, Your Qualia, and My Certain Lack of Free Will

I was reading someone’s post recently about what they called their seven eras of music delivery and noticed that I also have seven, although mine are different. They are:

  1. the 8-track-tape-cartridge era
  2. the era of the 45-RPM 7-inch vinyl single
  3. the cassette-and-12-inch-LP era
  4. the CD era
  5. the early days of downloading, before the MP3, with UNIX-format .au files
  6. the peak era of downloading: MP3s
  7. the era of streaming and ultra-expensive vinyl

Era 1: My earliest memories of recorded music are of 8-track cartridges that my parents and I listened to in the car. It was on these strange storage devices, the primary feature of which, I was told, was that they could skip from the middle of a song to the middle of another song with the push of a button, that I first heard Santana (off this compilation), Demis Roussos (this exact release, from which the songs I remember best are My Only Fascination and Lovely Lady of Arcadia), The Beatles’ She Loves You, Simon & Garfunkel’s El Condor Pasa and Cecilia, Fredrick Davies & Lewis Anton’s Astrology Rap, The Emotions’ I Don’t Wanna Lose Your Love, and a few selections from the musical Cabaret. I didn’t know any English at that time—I still remember the gobbledygook phonetic lyrics I sang along to She Loves You. (Şilagzu, if you can read Turkish.)

Each of those songs is still magical to me.

Era 2: At the age of 3, I’m told, I wanted the household record player to be placed in my room. At first, I mostly listened to the records of children’s stories my parents had bought for me. (Here’s an image of one, at least until it is sold and the page is no longer there.)

Then, sometime between the ages of 6 and 10, my mother and I once stayed on my aunt and uncle’s boat, which was moored at the marina of a hotel on the Mediterranean, outside Antalya. Almost every night, the open-air disco would, at some point, feature the song LE FREAK by Chic. Each night, if it had not already been played while we were sitting at the outdoor disco’s dinner area, I would refuse to go to bed until I heard it. I also had gobbledygook phonetic syllables attached to that song (Frigav). That word happened to be mildly reminiscent of the name of the ice-cream bars you could only get during the intermission at a movie theater: the Alaska Frigo. (The aluminum wrapper would send a terrible shock through your teeth.)

I had also taken over my parents’ 7-inch 45-RPM singles at this point: I had two singles by my favorite singer (favorite, that is, until I heard Chic) and one by my second-favorite singer. (I have yet to finish finding all six songs on MP3 or something rippable to MP3. As of early 2025, there’s one to go.)

Era 3: Then, sometime around age 11 and shortly after my father died of cancer, my mother must have bought some compilation LPs of pop songs, because I was suddenly listening to New Order, Shannon, The Romantics, Local Boy, Gary Low, Kajagoogoo, Indeep, Denise Edwards, Fox The Fox, Icehouse, Natasha King, Matthew Wilder, Gazebo, Chris de Burgh, Alphaville, and Madonna.

Some time after that, I went out and bought a few single-artist albums: Hot Dog by Shakin’ Stevens, the eponymous debut of (the band) Nena, and Michael Jackson’s Thriller, with Flashdance following soon afterwards. (I used to longingly stare at Purple Rain and Victory at stores, but those were always too expensive.)

In ninth grade, one of our teachers arranged a Xmas/NYE gift exchange. A classmate gave me the vinyl record of Brothers In Arms. To this day, I am very touched and grateful.

The other recording medium of this era was the cassette tape. And given how difficult it was to afford to buy LPs, most “record stores” primarily relied on another business model. They held only one copy of each record and they would make mix tapes for you from their collection. You would choose the length (and type: Chrome, etc.) of blank tape and pay a little more than the cost of the tape. You could specify everything you wanted to go onto the tape; you could specify some of it and leave the rest to them; or leave it all up to them (preferably after they’d gotten to know your taste). It was one of the ways we found out about music that was new to us (and typically, older than us).

You could also buy albums this way. I paid to have POWERSLAVE recorded onto a cheap cassette and my favorite birthday gift of my whole life is still the dubbed tape of Alphaville’s Forever Young my mom got me.

Era 4: CDs came a few years later to my part of the world than they did to the US. My first one was a Glenn Miller compilation, a present from my aunt. I was mesmerized by this shiny object with precise writing printed right on it. I wanted to eat it and worship it. I also remember sniffing it. (It smelled very good.)

My first five CDs still feel miraculous to me—fantastical otherworldly objects. And it continues to make me sad that so many of my friends would, in later decades, say things like “CDs are no good except as coasters.”

I still think they’re beautiful.

Era 5: Before anyone else (that I know of) had heard of downloading songs, people doing work on UNIX systems started exchanging songs at some point in the mid-’90s. This is how I became familiar with Björk (oodles of whose albums I’ve legally purchased since), Young MC, and the brilliant Ode to My Car by Adam Sandler, as well as some unidentified recordings of gamelan. I had to put them on a cassette to be able to listen to them at home. The resulting sound quality was atrocious; whether due to the tape or the .au encoding, I don’t know. By the way, I’m no audiophile. The best and most enjoyable music listening of my life was done on a mono handheld journalist-style cassette player. So, when _I_ think it sounds atrocious, it’s gotta be pretty bad.

Still, the music comes through. The brain makes sure of that. (I read on a discussion board that a music lover uses electronic equipment to listen to music while an audiophile uses music to listen to electronic equipment.)

Era 6: I have never downloaded off the original Napster, but one of the bands I was in, back when I was in three to six bands at any time, was being pirated on Napster (and getting royalty checks, too, from elsewhere—for an Australian movie). We felt pretty good about the Napster part.

Also, because I did not own an Apple device until around 2017, all my MP3s came from ripping my CD collection (and from a friend’s Nomad Zen, fully stocked with music from Jamaica and DRC, that he gave me one day, for reasons unknown). I spent most of 2002 and 2003, especially, ripping my CDs to MP3. Then, my favorite band started putting out MP3-only songs, so I started purchasing them, first at 7digital, with whom I’ve since had an unpleasant experience, and then on Qobuz (named after a Central Asian Turkic instrument, the kopuz).

Back in the early aughts, I also legally downloaded comedy from laugh.com, which seems to have disappeared long ago.

I’m still ripping (and still buying) CDs, though at a much lower rate than 20 years ago. These days, it’s because I prefer to do my own tagging. This not only avoids the typical “correcting” most database companies like to do to Portuguese song titles (which they Spanishize), but also, with the help of Discogs, lets me locate the earliest pressing in the leading format of the time in the artist’s country of origin and thus know the definitive stylization of the album name and the song titles.

Era 7: I found out about Spotify in early 2008 from a friend who worked for CDBaby. I immediately looked it up. I was getting ready to make an account, but in those days I was still reading every word of every contract, agreement, document of terms and conditions, privacy statement, and such. I don’t remember now what I didn’t like about their policies but I did not make a Spotify account—and it turns out I was right. (Who could have guessed so much evil could come out of Sweden!)

I did stream from Tidal for a bit and then Qobuz briefly as well. I suppose streaming is fine if you’re not obsessed with music in the particular way that I am. For me, streaming was an anxiety-producing experience: What if I missed the name of an artist because I was busy with something else? I would have to go back. That would take away from whatever work I was trying to do and concentration I had managed to have. And what if I liked something so much that I wanted to make sure no merger, IPO, hostile takeover, or license change could take it away from me?

What if there’s so much stuff I like that I can’t function as a proper adult?

And that’s why I still buy physical media and then rip it (or purchase digital files on Bandcamp and Qobuz). I need to control the possibility of what I could listen to at any time. I want that full control right up to the moment I die.

Anyway, how about them vinyl releases? (Or, as I heard some people say, “vinyls” [yuck]… I just received marketing from Laufey that said “vinyls”… Needless to say, I’m not buying that product.)

For a while, I was really happy with the upsurge in vinyl availability. That was back when they still came with download cards.

Have you noticed? The download cards are gone. Now, they really just expect you to pay US$35–65 for a single LP and pay more if you want it digital too. Or maybe they assume everyone is streaming, so this is a non-issue. But even streaming technology is going to be replaced by something else eventually. Maybe it won’t be era 8—maybe it’ll be more like era 15—but the streaming player will soon be replaced by an implant. And maybe it won’t stream from a server at all. Maybe we’ll all get an AI that makes up the music, the biographies and histories, the band and artist names, and everything else based on its reading of our brain waves. If it also controls our input and output ports, we could “interface” with friends who have “the same taste” in music and be discussing two entirely different fake songs, fake bands, or fake genres and not even know it, because the two AIs handling our “quality time” together would censor, filter, transform, and augment whatever the other person is saying to match what each of us would most enjoy hearing.

If this sounds dystopian, think about our subjective sensory and affective experiences and how none of us has any idea (and could never have any idea) whether the experiences we bond over are, to the other person, what we think they are. The philosophical term ‘qualia’ is a handle for asking whether that question even has meaning. How could anyone—you, me, or a third party—ever know whether what I experience as the color blue is what you experience as the color blue? Maybe it’s your yellowish orange; maybe it’s your ‘sour taste’ or ‘dull pain’. Probably it’s neither. In any case, where in the chain of my sensory nerves would you have to insert yours to find out?

I don’t think there is any point of insertion at which this would work.

And I think this is similar to—in the frustration it can cause and in the realization it leads to—Gazzaniga’s point about free will: At which point along the chains of cause and effect would you want there to be this fully arbitrary freedom?

Let’s say you decide, as an exercise in free will, to refrain from urinating for as long as possible, resulting in urinating while making a presentation to the board of trustees or at a conference. Now, that would be an act of free will. How many of us are willing to do this?

I’ve always known my answer to the trolley problem. If I were to find myself in the circumstances of the trolley problem, I would be proof against free will. There are two possibilities: I pull the lever; I don’t pull the lever. (If you want to make it more general: I do the thing; I don’t do the thing—I intervene; I don’t intervene.)

If I do the one I believe I would, where’s the free will in that? That action was determined well in advance by the combination of my nature and nurture. After all, I’ve known about it for over a decade. And if I do the opposite, how would that be free will, if some subconscious part of me suddenly acts in the opposite way of what I think I would do?

Either way, I clearly end up not having free will. So, maybe let’s not worry so much about the coming era of implants and sensory cocoons, with AIs repainting all our interactions with other entities to make us think we are connecting at a deep level while experiencing arbitrary AI hallucinations. We won’t know the difference.

Teaching machine learning within different fields

Everyone is talking about machine learning (ML) these days. They usually call it “machine learning and artificial intelligence” and I keep wondering what exactly they mean by each term.

It seems the term “artificial intelligence” has shaken off its negative connotations from back when it meant top-down systems (as opposed to the superior bottom-up “computational intelligence” that most of today’s so-called AI actually uses) and has come to mean what cybernetics once was: robotics, machine learning, embedded systems, decision-making, visualization, control, etc., all in one.

Now that ML is important to so many industries, application areas, and fields, it is taught in many types of academic departments. We approach machine learning differently in ECE, in CS, in business schools, in mechanical engineering, and in math and statistics programs. The granularity of focus varies, with math and CS taking the most detailed view, followed by ECE and ME departments, followed by the highest-level applied version in business schools, and with Statistics covering both ends.

In management, students need to be able to understand the potential of machine learning and be able to use it toward management or business goals, but do not have to know how it works under the hood, how to implement it themselves, or how to prove the theorems behind it.

In computer science, students need to know the performance measures (and results) of different ways to implement end-to-end machine learning, and they need to be able to do so on their own with a thorough understanding of the technical infrastructure. (If what I have observed is generalizable, they also tend to be more interested in virtual and augmented reality, artificial life, and other visualization and user-experience aspects of AI.)

In math, students and graduates really need to understand what’s under the hood. They need to be able to prove the theorems and develop new ones. It is the theorems that lead to powerful new techniques.

In computer engineering, students also need to know how it all works under the hood, and have some experience implementing some of it, but don’t have to be able to develop the most efficient implementations unless they are targeting embedded systems. In either case, though, it is important to understand the concepts, the limitations, and the pros and cons, as well as to be able to carry out applications. Engineers have to understand why there is such a thing as PAC (probably approximately correct) learning, what the curse of dimensionality is and what it implies for how one does and does not approach a problem, what the NFL (no-free-lunch) theorem is and how that should condition one’s responses to claims of a single greatest algorithm, and what the history and background of this family of techniques are really like. These things matter because engineers should not expect to be plugging-and-playing cookie-cutter algorithms from ready-made libraries. That’s being an operator of an app, not being an engineer. The engineer should be able to see the trade-offs, plan for them, and take them into account when designing the optimal approach to solving each problem. That requires understanding parameters and structures, and again the history.
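To make one of those concrete: here is the curse of dimensionality as a short numerical experiment (my own quick sketch, with made-up sizes). As the dimension grows, the nearest and farthest neighbors of a point become nearly equidistant, which quietly undermines any method that reasons from distances.

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    points = rng.random((500, d))  # 500 uniform points in the d-dimensional unit cube
    dists = np.linalg.norm(points[1:] - points[0], axis=1)
    print(d, round(dists.min() / dists.max(), 3))
# The nearest-to-farthest distance ratio creeps toward 1 as d grows.
```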

Today, the field of ‘Neural Networks’ is popular and powerful. That was not always the case. It has been the case two other times in the past. Each time, perhaps like an overextended empire, the edifice of artificial neurons came down (though only to come up stronger some years later).

When I entered the field, with an almost religious belief in neural networks, they were quite uncool. The wisdom among graduate students seemed to be that neural nets were outdated, that we had SVMs now, and that with the latter machine learning was solved forever. (This reminds me of the famous, and probably apocryphal, patent-office declaration in the late 1800s that everything that could be invented had been invented.) Fortunately, I have always benefited from doing whatever was unpopular, so I stuck to my neural nets, fuzzy systems, evolutionary algorithms, and an obsession with Bayes’ rule while others whizzed by on their SVM dissertations. (SVMs are still awesome, but the thing that has set the world on fire is neural nets again.)

One of the other debates raging, at least in my academic environment at the time, was about “ways of knowing.” I have since come to think that science is not a way of knowing. It never was, though societies thought so at first (and many still think so). Science is a way of incrementally increasing confidence in the face of uncertainty.

I bring this up because machine learning, likewise, never promised to have the right answer every time. Machine learning is all about uncertainty; it thrives on uncertainty. It’s built on the promise of PAC learning; i.e., it promises to be only slightly wrong and to be so only most of the time. The hype today is making ML seem like some magical panacea for all business, scientific, medical, and social problems. For better or worse, it’s only another technological breakthrough in our centuries-long adventure of making our lives safer and easier. (I’m not saying we haven’t done plenty of wrongs in that process—we have—but no one who owns a pair of glasses, a laptop, a ball-point pen, a digital piano, a smart phone, or a home-security system should be able to fail to see the good that technology has done for humankind.)
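For the record, the PAC promise can be written compactly; this is the standard textbook form, not anything specific to one author: with probability at least 1 − δ over the training sample, the learned hypothesis h is at most ε wrong.

```latex
% Probably (confidence at least 1 - \delta) approximately (within \epsilon) correct:
\Pr\big[\,\operatorname{error}(h) \le \epsilon\,\big] \ge 1 - \delta
```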

I left the place of the field of Statistics in machine learning until the end. They are the true owners of machine learning. We engineering, business, and CS people are leasing property on their philosophical (not real) estate.

 

Herbie Hancock's Chameleon's BPM graph from the Android app 'liveBPM' (v. 1.2.0) by Daniel Bach

Listening to music seems easy.

Listening to music seems easy; it even appears like a passive task.

Listening, however, is not the same as hearing. In listening, i.e., attending, we add cognition to perception. The cognition of musical structures, cultural meanings, conventions, and even of the most fundamental elements themselves, such as pitch or rhythm, turns out to be a complex cognitive task. We know this is so because getting our cutting-edge technology to understand music, with all its subtleties and its cultural contexts, has proven, so far, to be impossible.

Within small fractions of a second, humans can reach conclusions about musical audio that are beyond the abilities of the most advanced algorithms.

For example, a trained or experienced musician (or even non-musician listener) can differentiate computer-generated and human-performed instruments in almost any musical input, even in the presence of dozens of other instruments sounding simultaneously.

In a rather different case, humans can maintain time-organizational internal representations of music while the tempo of a recording or performance continuously changes. A classic example is the jazz standard Chameleon by Herbie Hancock off the album ‘HEADHUNTERS’. The recording never retains any one tempo, following an up-and-down contour and mostly getting faster. Because tempo recognition is a prerequisite to other music-perception tasks like meter induction and onset detection, this type of behavior presents a significant challenge to signal-processing and machine-learning algorithms but generally poses no difficulty to human perception.

Another example is the recognition of vastly different cover versions of songs: A person familiar with a song can recognize within a few notes a cover version of that song done in another genre, at a different tempo, by another singer, and with different instrumentation.

Each of these is a task that is well beyond the machine-learning techniques now exhibiting remarkable successes with visual recognition, where the main challenge, invariance, is less of an obstacle than the abstractness of music and its seemingly arbitrary meanings and structures.

Consider the following aspects of music cognition.

  • inferring a key (or a change of key) from very few notes
  • identifying a latent underlying pulse when it is completely obscured by syncopation [Tal et al., Missing Pulse]
  • effortlessly tracking key changes, tempo changes, and meter changes
  • instantly separating and identifying instruments even in performances with many-voice polyphony (as in Dixieland Jazz, Big-Band Jazz, Baroque and Classical European court music, Progressive Rock, folkloric Rumba, and Hindustani and Carnatic classical music)

These and many other forms of highly polyphonic, polyrhythmic, or cross-rhythmic music continue to present challenges to automated algorithms. Successful examples of automated tempo or meter induction, onset detection, source separation, key detection, and the like all work under the requirement of tight limitations on the types of inputs. Even for a single such task, such as source separation, a universally applicable algorithm does not seem to exist. (There is some commercial software that appears to do these tasks universally, but because proprietary programs do not provide sufficiently detailed outputs, whether they really can perform all these functions or whether they perform one function in enough detail to suffice for studio uses is uncertain. One such suite can identify and separate every individual note from any recording, but it does not perform source separation into per-instrument streams, and it presents its output in a form not conducive to analysis in rhythmic, harmonic, melodic, or formal terms, and not in a form analogous to human cognitive processing of music.)

Not only does universal music analysis remain an unsolved problem, but also most of the world’s technological effort goes toward European folk music, European classical music, and (international) popular music. The goal of my research and my lab (Lab BBBB: Beats, Beats, Bayes, and the Brain) is to develop systems for culturally sensitive and culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition, and to do so for popular music styles from the Global South that are not on the industry’s radar.

Since the human nervous system is able to complete musical-analysis tasks under almost any set of circumstances, in multiple cultural and cross-cultural settings, with varying levels of noise and interference, the human brain is still superior to the highest-level technology we have developed. Hence, Lab BBBB takes inspiration and direct insight from human neural processing of audio and music to solve culturally specific cognitive problems in music analysis, and to use this context to further our understanding of neuroscience and machine learning.

The long-term goal of our research effort is a feedback cycle:

  1. Neuroscience (in simulation and with human subjects at our collaborators’ sites) informs both music information retrieval and research into neural-network structures (machine learning). We are initially doing this by investigating the role of rhythm priming in Parkinson’s (rhythm–motor interaction) and in grammar-learning performance (rhythm–language interaction) in the basal ganglia. We hope to then replicate in simulation the effects that have been observed with people, verify our models, and use our modeling experience on other tasks that have not yet been demonstrated in human cases or that are too invasive or otherwise unacceptable.
  2. Work on machine learning informs neuroscience by narrowing down the range of investigation.
  3. Deep learning is also used to analyze musical audio using structures closer to those in the human brain than the filter-bank and matrix-decomposition methods typically used to analyze music.
  4. Music analysis informs cognitive neuroscience, we conjecture, as has been done in certain cases in the literature with nonlinear dynamics.
  5. Phenomena like entrainment and neural resonance in neurodynamics further inform the development of neural-network structures and data-subspace methods.
  6. These developments in machine learning move music information retrieval closer to human-like performance for culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition for multicultural intelligent music systems.

 

You are not disinterested.

Everyone: Stop saying ‘disinterested’. You apparently don’t know what it means. It doesn’t mean ‘uninterested’.

In fact, it means you’re truly interested. ‘Disinterested’ is when you care so deeply as to want to treat the situation objectively. It is a scientific term describing the effort to rid a study of the effects of subconscious biases.

Also, please don’t say ‘substantive’ when all you mean is ‘substantial’. They’re not the same thing. Thanks. (‘Substantial’ is a good word. You’re making it feel abandoned.)

Microsoft: Fix your use of the word ‘both’.
When comparing only two files, Windows says something like “Would you like to compare both files?” As opposed to what, just compare one, all by itself? (like the sound of one hand clapping?)
The word ‘both’ is used when the default is not two of something. It emphasizes the two-ness to show that the two-ness is special, unusual. But when the default is two, you say “the two” (as in “Would you like to compare the two files?”), not ‘both’, and DEFINITELY NOT ‘the both’. (It was cute when that one famous person said it once. It’s not cute anymore. Stop saying it.)
Back to ‘both’: A comparison has to involve two things, so ‘both’ (the special-case version of the word ‘two’) only makes sense if the two things are being compared to a third.
English is full of cool, meaningful nuances. I hope we stop getting rid of them.

Seriously, everyone: English is wonderful. Why are you destroying it?

 

PS: Same with “on the one hand”… We used to say “on one hand” (which makes sense: either one, any one, not a definite hand with a definite article).

Overfitting, Confirmation Bias, Strong AI, and Teaching

I was asked recently by a student about how machine learning could happen. I started out by talking about human learning: how we don’t consider mere parroting of received information to be the same as learning, but that we can make the leap from some examples we have seen to a new situation or problem that we haven’t seen before. Granted, there need to be some similarities (shared structure or domain of discourse—we don’t become experts on European Union economics as a result only of learning to distinguish different types of wine), but what makes learning meaningful and fun for us is the ability to make a leap, to solve a previously inaccessible problem or deduce (really, it’s ‘induce’) a new categorization.

In response, the student asked how machines could do that. I replied that not only do we give them many examples to learn from, but we also give them algorithms (ways to deal with examples) that are inspired by how natural systems work: inspired by ants or honeybees, genetics, the immune system, evolution, languages, social networks and ideas (memes), and even just the mammalian brain. (One difference is that, so far, we are not trying to make general-purpose consciousness in machines; we are only trying to get them to solve well-defined problems very well and, increasingly these days, not-so-well-defined problems as well.)

So, then the student asked how machines could make the leap just like we can. This led me to bring up overfitting and how to avoid it. I explained that if a machine learns the examples it is given all too well, it will not be able to see the forest for the trees—it will be overly rigid, and will want to make all novel experiences fit the examples in its training. For new examples that do not fit, it will reject them (if we build that ability into it), or it will make justifiable-seeming but wrong choices. It will ‘overfit’, in the language of machine learning.
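A toy demonstration of the idea (my sketch, on synthetic data): a polynomial with as many parameters as training points can memorize them, and then does worse on new points than a modest one.

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 12)
y_train = np.sin(2 * np.pi * x_train) + 0.2 * rng.standard_normal(12)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (3, 11):  # a modest model vs. one that can memorize all 12 points
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_mse, 4), round(test_mse, 4))
# Degree 11 drives training error to nearly zero but typically does much
# worse on the test points: it has overfit the training set.
```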

Then it occurred to me that humans do this, too. We’ve all probably heard the argument that stereotypes are there for a reason. In my opinion, they are there because of the power of confirmation bias (not to mention, sometimes selection bias as well—consider the humorous example of the psychiatrist who believes everyone is psychotic).

Just as a machine-learning algorithm that has been presented with a set of data will learn the idiosyncrasies of that data set if not kept from overfitting by early stopping, prestructuring, or some other measure, people also overfit to their early-life experiences. However, we have one other pitfall compared to machines: We continue to experience new situations, which we filter through confirmation bias to make ourselves think that we have verification of the validity of our misinformed or under-informed early notions. Confirmation bias conserves good feelings about oneself. Machines so far do not have this weakness, so they are only limited by what data we give them; they cannot filter out inconvenient data the way we do.

Another aspect of this conversation turned out to be pertinent to what I do every day. Not learning the example set so well is advantageous not only for machines but for people as well, specifically for people who teach.

I have been teaching at the college level since January 1994, and continuously since probably 2004, and full-time since 2010, three or four quarters per year, anywhere from two to five courses per quarter. I listed all this because I need to point out, for the sake of my next argument, that I seem to be a good teacher. (I got tenured at a teaching institution that has no research requirement but very high teaching standards.) So, let’s assume that I can teach well.

I was, for the most part, not a good student. Even today, I’m not the fastest at catching on, whether it’s a joke, an insult, or a mathematical derivation. (I’m nowhere near the slowest, but I’m definitely not among the geniuses.) I think this is a big part of why I’m a good teacher: I know what it’s like not to get it, and I know what I have had to do to get it. Hence, I know how to present anything to those who don’t get it, because, chances are, I didn’t get it right away either.

But there is more to this than speed. I generate analogies like crazy, both for myself and for teaching. Unlike people who can operate solely at the abstract level, I make connections to other domains—that’s how I learn; I don’t overfit my training set. I can take it in a new direction more easily, perhaps, than many super-fast thinkers. They’re right there, at a 100% match to the training set. I wobble around the training set, and maybe even map it to n+1 dimensions when it was given in only n.

Overfitting is not only harmful to machines. In people, it causes undeserved confidence in prejudices and stereotypes, and makes us less able to relate to others or think outside the box.

One last thought engendered by my earlier conversation with this student: The majority of machine-learning applications, at least until about 2010 or maybe 2015, were for well-defined, narrow problems. What happens when machines that are capable of generalizing well from examples in one domain, and in another, and in another, achieve meta-generalization from entire domains to new ones we have not presented them with? Will they attain strong AI as a consequence of this development (after some time)? If so, will they, because they’ve never experienced the evolutionary struggle for survival, never develop the violent streak that is the bane of humankind? Or will they come to despise us puny humans?

 

Auto-(in)correct: Emergent Laziness?

Is it Google? Is it LG? Or is it emergence?

I am leasing an LG tablet running Android to go with my phone service. I thought the large screen and consequently larger keyboard would make my life easier. The first several days of use, however, have been unreasonably annoying. The salesperson had said that this device would be slaved to my LG Android cell phone, but my settings did not seem to carry over. What’s worse, no matter how much I dig through menu trees to get to certain settings I’m looking for, I can’t find them. For example, I may want autocorrect off, or I may not want the latest e-mail in my inbox to be previewed. (I prefer to see a bird’s-eye view of all the recent e-mails, packed as tightly as possible, and I can usually set this very quickly and easily, but not on this tablet.) The reason might be that I’m about to go to class and teach in a few minutes and don’t want to think about the committee issue in the e-mail that just arrived; I don’t want Gmail to parade it in front of me.

So, the settings seem to be very well hidden, or maybe not even available to the user anymore (because that has been the trend in computer-and-Internet technology: Make the user think less, and have less control; so-called intelligent software will decide all your preferences for you).

And perhaps the software can deduce (or, more likely, induce) your preferences as they were at a certain time under a certain set of circumstances, but human beings expect the freedom to change their minds. Software doesn’t seem to allow this.

Furthermore, crowd-sourcing is considered the ultimate intelligence. I know and understand the algorithms behind most of these ideas, and totally agree that they are beautiful and awesome (and really fun). However, engineers, programmers, mathematicians, and other nerds (like me) finding something super-fun should not be how life is redesigned. The crowd-sourcing of spelling and automatic correction is leading us from artificial intelligence to natural laziness. My device wants to change “I’m” to “imma”. (Before you object that I’m also ignorant and don’t know to put a period inside the quotation marks, read my disclaimer about switching to British/logical punctuation.) Am I now forced to appear to have abandoned capitalization and to have picked up an excessively colloquial form of spelling? And if I had, then fine, but I haven’t.

It gets worse. The learning algorithm is not learning, at least not from me. The following has now happened with several phrases and words on this new tablet, and I’ve looked further into altering this setting, to no avail.

When I type “I will”, it automatically replaces it with “I silk”. If I backspace and type “I will” again, it replaces it again. And it doesn’t learn from my actions; I have patiently (and later on, a further dozen or so times, impatiently) retyped “I will” more than 30 times, only to watch Gmail running on my Android LG device switch it back to “I silk” immediately.[1]

Where did this come from? Is there a band called “I silk”? Is this a new phrase that’s in these days, and I haven’t been overhearing my students enough to know about it?

Or is it because earlier that day, I tried to write “I seek to …” where the ‘seek’ was autocorrected to ‘silk’? (for who knows what reason)

And what happens when this behavior is pushed beyond e-mail on a tablet, and I’m not able (or allowed) to write either “I will” or “I seek” as I type a blog entry such as this on my laptop, or as I try to type an e-mail to explain what’s going wrong to Google’s tech support, or someone else’s tech support?

This really doesn’t make sense. Shouldn’t machine learning give us results that make sense? (That used to be the idea.) Now, perhaps, it’s just supposed to give us results that are popular or common. It seems we’re not building artificial intelligence; we’re building artificial commonality.

This is not a rant for elitism (which, anyway, is also used in machine learning, in evolutionary algorithms). It’s about the loss of freedom of speech, to be able to say what one is trying to say the exact way one wants to say it. The ability for clear, unequivocal communication is not something to be eliminated from the human experience; it is something to continue to strive for. Likewise, convenience over freedom (or over accuracy) is not a good choice of values. In the end, the person pushing little buttons with letters marked on them will be held responsible for the content. Shouldn’t that person be in charge of what words appear when they push the little buttons? Shouldn’t we at least be able to turn off auto-correct, or have some control over when it applies?

This is being taken away, little by little. “Oh, it’s just a tablet.” … “Oh, it’s just an e-mail. Nobody expects it to be spelled correctly.” Pretty soon, no one will be able to spell anything correctly, even if they know how to, because their devices won’t allow them to have that little bit of control.

 

[1] Also, do not even try to write in a foreign language, or mix English and Turkish in one sentence. In an early e-mail I wrote on this device, I had to repeat the letter ‘i’ (which appeared only once in the actual word) five times (for a total of six ‘i’s) for it to stop auto-(in)correcting “geliyorum” (Turkish for ‘I am coming’) to something like “Selma”. I had to type “geliiiiiiyorum”.

How to Reason in Circuit Analysis

The following conversation played out in my head as I was grading an exam problem that had a supernode composed of two neighboring supernodes. Many students (in introductory circuit analysis) had difficulties with this problem, so here’s what I plan to present when I explain it.

HOW TO REASON IN NODAL ANALYSIS

Q: What is the main type of equation involved in performing nodal analysis?

A: KCL equation

Q: What electrical quantity is represented in each term of a KCL equation?

A: current

Q: Are there any elements for which, if the current is not stated, we do not have a way (a defining equation[1]) to know and express the current?

A: yes

Q: What are these elements?

A: voltage sources of any type[2] that are directly between two non-reference essential nodes (NRENs)

Q: Why is that a problem?

A: There is no defining equation (like Ohm’s Law) for a source, and if it’s directly between two NRENs, then there is no other element in series with it.

Q: So what if there is no other element in series with it?

A: If there were a resistor in series with it, we could use Ohm’s Law on the resistor.

Q: Why not use Ohm’s Law on the source?

A: Ohm’s Law does not apply to sources, does not deal with sources; it’s only for resistors[3].

Q: Fine… What’s with the non-reference thing?

A: If a voltage source (of any kind) has one terminal attached to the reference node (ground), then we automatically know the voltage at the other end (with respect to ground).

 

Conclusion: If there is a voltage source between two NRENs, circle it to make a (super)node, and write KCL out of that node, without going inside it (until later, when you need another equation, at which point you use KVL).
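To make the recipe concrete, here is a toy example worked in SymPy. (The circuit is hypothetical, invented only to exercise the recipe; it is not the exam problem.) A 10 V source sits directly between two NRENs with voltages v1 and v2; a 2 Ω resistor connects v1 to ground, a 4 Ω resistor connects v2 to ground, and a 3 A source pushes current into v1.

```python
from sympy import Eq, solve, symbols

v1, v2 = symbols("v1 v2")

# One KCL equation written out of the whole supernode (current leaving
# through the two resistors equals the current entering from the 3 A
# source), without going inside the supernode:
kcl = Eq(v1 / 2 + v2 / 4, 3)

# Going inside only for the one extra equation we still need: KVL across
# the voltage source enclosed by the supernode.
kvl = Eq(v1 - v2, 10)

print(solve([kcl, kvl], [v1, v2]))  # {v1: 22/3, v2: -8/3}
```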

 

[1] A defining equation is an expression that relates current through a two-terminal element to the voltage across a two-terminal element by means of the inertial aspect of the element (its capacitance, resistance, inductance, and I suppose, pretty soon, its memristance) and the passive sign convention (PSC).

[2] i.e., independent voltage source, current-dependent voltage source, voltage-dependent voltage source: It’s about the voltage aspect, not about the dependence aspect.

[3] two-terminal elements with a linear current–voltage relationship; note the en dash : )

NIPS 2015: Thoughts about SoundCloud, genres, clave tagging, clave gamification, multi-label classification, and perceptual manifolds

On December 9th, at NIPS 2015, I met two engineers from SoundCloud, which not only provides unsigned artists a venue to get their music heard (and commented on), along with recommendation and music-oriented social networking, but also, if I understand correctly, is interested in content analysis for various purposes. Some of those have to do with identifying work that may not be original, which can range from quotation to plagiarism (the latter being an important issue in my line of work: education), but they also involve the creation of derivative content, like remixing, to which they seem to have a healthy approach. (At the same event, the IBM Watson program director also suggested that they could conceivably be interested in generative tools based on music analysis.)

I got interested in clave-direction recognition to help musicians, because I was one, and I was struggling—clave didn’t make sense. Why were two completely different patterns in the same clave direction, and two very similar patterns not? To make matters worse, in samba batucada, there was a pattern said to be in 3-2, but with two notes in the first half, followed by three notes in the second half. There had to be a consistent explanation. I set out to find it. (If you’re curious, I explained the solution thoroughly in my Current Musicology paper.)

 

Top: Surdo de terceira. Bottom: The 3-2 partido-alto for cuíca and agogô. Note that playing the partido-alto omitting the first and third crotchet’s worth of onsets results in the terceira.

However, clave is relevant not just to music-makers, but to informed listeners and dancers as well. A big part of music-in-society is the communities it forms, and that has a lot to do with expertise and identity in listeners. Automated recognition of clave direction in sections of music (or entire pieces) can lead to automated tagging of these sections or pieces, increasing listener identification (which can be gamified) or helping music-making.

My clave-recognition scheme (which is an information-theoretically aided neural network) recognizes four output classes (outside, inside, neutral, and incoherent). In my musicological research, I also developed three teacher models, but only from a single cultural perspective. I have since submitted a work-in-progress and accompanying abstract to AAWM 2016 (Analytical Approaches to World Music) about what would happen if I looked at clave direction from different cultural perspectives (which I have encoded as phase shifts) and graphed the results in the complex plane (just like phase shift in electric circuits).

Another motivating idea came from today’s talk Computational Principles for Deep Neuronal Architectures by Haim Sompolinsky: perceptual manifolds. The simplest manifold proposed was line segments. This is pertinent to clave recognition because among my initial goals was extending my results to non-idealized onset vectors: [0.83, 0.58, 0.06, 0.78] instead of [1, 1, 0, 1], for example. The line-segment manifold would encode this as onset strengths (“velocity” in MIDI terminology) ranging from 0 (no onset) to 1 (127 in MIDI). This will let me look inside the onset-vector hypercube.
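As a first stab at what that could look like (my own reading of the line-segment idea, not anything from the talk): treat each idealized binary pattern as the endpoint of a segment starting at silence, and ask where an observed strength vector falls along, and how far it sits from, that segment.

```python
import numpy as np

ideal = np.array([1.0, 1.0, 0.0, 1.0])         # idealized onset vector
observed = np.array([0.83, 0.58, 0.06, 0.78])  # measured onset strengths in [0, 1]

# Project the observed vector onto the segment t * ideal, 0 <= t <= 1:
t = float(np.clip(observed @ ideal / (ideal @ ideal), 0.0, 1.0))
residual = float(np.linalg.norm(observed - t * ideal))
print(round(t, 3), round(residual, 3))  # 0.73 along the segment, 0.196 away from it
```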

Another tie-in from NIPS conversations is employing Pareto frontiers with my clave data for a version of multi-label learning. Since I can approach each pattern from two phase perspectives, and up to three teacher models (vigilance levels), a good multi-label classifier would have to provide up to 6 correct outputs, and in the case that a classifier cannot be that good, the Pareto frontier would determine which classifiers are undominated.
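In code, the dominance test is tiny. Here is a sketch with made-up accuracies; the six numbers per classifier stand for the six (phase perspective, teacher model) labels:

```python
# Classifier a dominates classifier b if a is at least as accurate on every
# label and strictly more accurate on at least one.
def dominates(a, b):
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

scores = {
    "net1": (0.9, 0.8, 0.7, 0.9, 0.6, 0.8),
    "net2": (0.8, 0.9, 0.9, 0.7, 0.7, 0.7),
    "net3": (0.7, 0.7, 0.6, 0.8, 0.5, 0.7),  # dominated by net1
}
frontier = [name for name, a in scores.items()
            if not any(dominates(b, a) for other, b in scores.items() if other != name)]
print(frontier)  # ['net1', 'net2'] -- the undominated classifiers
```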

Would all this be interesting to musicians? Yes, I think so. Even without going into building a clave-trainer software into various percussion gear or automated-accompaniment keyboards, this could allow clave direction to be gamified. Considering all the clave debates that rage in Latin-music-ian circles (such as the “four great clave debates” and “clave schism” issues like the one around Giovanni Hidalgo’s labeling scheme quoted in Modern Drummer*), a multi-perspective clave-identification game could be quite a hit.

So, how does a Turkish math nerd get to be obsessed by this? I learned about clave—the Afro-Latin (or even African-Diasporan) concept of rhythmic harmony that many people mistake for the family of fewer than a dozen patterns, or for a purely Cuban or “Latin” organizational principle—around 1992 from the musicians of Bochinche and Sonando, two Seattle bands. I had also grown up listening to Brazilian (and Indian, Norwegian, US, and German) jazz in Turkey. (My first live concert by a foreign band was Hermeto Pascoal e Grupo, featuring former CBC faculty Jovino Santos Neto.) So, I knew that I wanted to learn about Brazilian music. (At the time, most of what I listened to was Brazilian jazz, like Dom Um Romao and Airto, and I had no idea that they mostly drew from nordestino music, like baião, xote, côco, and frevo**—not samba.)

Fortunately, I soon moved to Portland, where Brian Davis and Derek Reith of Pink Martini had respectively founded and sustained a bloco called Lions of Batucada. Soon, Brian introduced us to Jorge Alabê, and then to California Brazil Camp, with its dozens of amazing Brazilian teachers… But let’s get back to clave.

I said above that clave is “the Afro-Latin (or even African-Diasporan) concept of rhythmic harmony that many people mistake for the family of fewer than a dozen patterns, or for a purely Cuban or ‘Latin’ organizational principle.” What’s wrong with that?

Well, clave certainly is an organizational principle: It tells the skilled musician, dancer, or listener how the rhythm (the temporal organization, or timing) of notes in all the instruments may and may not go during any stretch of the music (as long as the music is from a tradition that has this property, of course).

And clave certainly is a Spanish-language word that took on its current meaning in Cuba, as explained wonderfully in Ned Sublette’s book.

However, the transatlantic slave trade did not only move people (forcefully) to Cuba. The Yorùbá (of today’s southwest Nigeria and southeast Benin), the Malinka (a misnomer, according to Mamady Keïta, for people from Mali, Ivory Coast, Burkina Faso, Gambia, Guinea, and Senegal), and the various Angolan peoples were brought to many of today’s South American, Caribbean, and North American countries, where they culturally and otherwise interacted with Iberians and the natives of the Americas.

Certain musicological interpretations of Rolando Antonio Pérez Fernández’s book La Binarización de los Ritmos Ternarios Africanos en América Latina have argued that the organizational principles of Yoruba 12/8 music, primarily the standard West African timeline (X.X.XX.X.X.X)

Bembé ("Short bell") or the standard West African timeline, along with its major-scale analog

and the Malinka/Manding timelines met the 4/4 time signatures of Angolan and Iberian music, and morphed into the organizational timelines of today’s rumba, salsa, (Uruguayan) candombe, maracatu, samba, and other musics of the Americas.

Some of those timelines we all refer to as clave, but for others, like the partido-alto in Brazil***, it is sometimes culturally better not to refer to them as clave patterns. (This is understandable, in that Brazilians speak Portuguese, and do not always like to be mistaken for Spanish-speakers.)

Conceptually, however, partido-alto in samba plays the same organizational role that clave plays in rumba and salsa, or the gongue pattern plays in maracatu: It immediately tells knowledgeable musicians how not to play.

In my research, I found multiple ways to look at the idiomatic appropriateness of arbitrary timing patterns (more than 10,000 of them, only about a hundred of which are “traditional” [accepted, commonly used] patterns). I identified three “teacher” models, which are just levels of strictness. I also identified four clave-direction categories. (Really, these were taught to me by my teacher-informers, whose reactions to certain patterns informed some of the categories.)

Some patterns are in 3-2 (which I call “outside”). While the 3-2 clave son (X..X..X...X.X...):

3-2 (outside) clave son, in northern and TUBS notation

is obvious to anyone who has attempted to play anything remotely Latin, it is not so obvious why the following version of the partido-alto pattern is also in the 3-2 direction****: .X..X.X.X.X..X.X

The plain 3-2 partido-alto pattern. (The pitches are approximate and can vary with cuíca intonation or the agogô maker’s accuracy.) "Bossa clave" in 3-2 and 2-3 are added in TUBS notation to show the degree of match and mismatch with 3-2 and 2-3 patterns, respectively.

 

Some patterns are in 2-3 (which I call “inside”). Many patterns that are heard throughout all Latin American musics are clave-neutral: They provide the same amount of relative offbeatness no matter which way you slice them. The common Brazilian hand-clapping pattern in pagode, X..X..X.X..X..X. is one such pattern:

The clave-neutral hand-clapping pattern in pagode, AKA tresillo (a Cuban name for a rhythm found in Haitian konpa, Jamaican dancehall, and Brazilian xaxado)

It is actually found throughout the world, from India and Turkey, to Japan and Finland, and throughout Africa; from Breakbeats to Bollywood to Metal. (It is very common in Metal.) The parts played by the güiro in salsa and by the first and second surdos in samba have the same role: They are steady ostinati of half-cycle length. They are foundational. They set the tempo, provide a reference, and go a long way towards making the music danceable. (Offbeatness without respite, as Merriam said*****, would make music undanceable.)
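Here is a quick way to check that kind of neutrality with Toussaint’s offbeatness count (my simplified reading of the measure: in an n-step cycle, the positions that no even subdivision of the cycle can land on are exactly the ones coprime to n, and offbeatness counts the onsets sitting there):

```python
from math import gcd

def offbeatness(onsets, n):
    # Onsets on positions coprime to the cycle length are "off-beat":
    # no regular subdivision of the cycle (by 2, 4, ...) can reach them.
    return sum(1 for i in onsets if gcd(i, n) == 1)

# The pagode hand-clapping pattern X..X..X.X..X..X. as 16-step onset indices:
pattern = [0, 3, 6, 8, 11, 14]
first_half = [i for i in pattern if i < 8]
second_half = [i - 8 for i in pattern if i >= 8]
print(offbeatness(first_half, 8), offbeatness(second_half, 8))  # 1 1 -- equal
```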

Here are some neutral patterns: X...X...X...X... (four on the floor, which, with some pitch variation, can be interpreted as the first and second surdos):

Four quarter notes, clave-neutral (from Web, no source available)

....X.X.....X.X. (from ijexá):

surdo part for ijexá (from http://www.batera.com.br/Artigos/dia-do-folclore)

 

and XxxXXxxXXxxXXxxX. (This is a terrible way to represent swung samba 16ths. Below are Jake “Barbudo” Pegg’s diagrams, which work much better.)

Jake "Barbudo" Pegg's samba-sixteenths accent and timing diagrams (along with the same for "Western" music)

The fourth category is incoherent patterns. These are patterns that are not neutral yet do not conform to either clave direction. (One of my informers gave me the idea of a fourth category when he reacted to one such pattern by making a disgusted face and a sound like bleaaahh.)

A pattern that has the clave property immediately tells all who can sense it that only patterns in that clave direction and patterns that are clave-neutral are okay to play while that pattern (that direction) is present. (We can weaken this sentence to apply only to prominent or repeated patterns. Quietly passing licks that cross clave may be acceptable, depending on the vigilance level of the teacher model.)
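As code, the rule in the previous paragraph is almost trivial. The sketch below assumes each pattern has already been assigned one of the four categories (by whatever means), and the strict_teacher flag stands in for the vigilance levels of the teacher models:

def admissible(category, current_direction, prominent=True, strict_teacher=True):
    # A pattern may sound over a given clave direction if it is neutral or
    # agrees with that direction; anything else crosses clave, which only a
    # lenient teacher model tolerates, and only for quiet, passing material.
    if category == "neutral" or category == current_direction:
        return True
    return (not prominent) and (not strict_teacher)

print(admissible("neutral", "3-2"))  # True: neutral fits anywhere
print(admissible("2-3", "3-2"))      # False: crosses clave
print(admissible("2-3", "3-2", prominent=False, strict_teacher=False))  # True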

So, why mention all this right now? (After all, I’ve published these thoughts in peer-reviewed venues like Current Musicology, Bridges, and the Journal of Music, Technology and Education.)

For one thing, those are not the typical resources most musicians turn to. Until I can write up a short, highly graphical version of my clave-direction grammar for PAS, I will need to make some of these ideas available here. Secondly, the connections to gamification and to musical-social-networking sites like SoundCloud are new ideas I got from talking to people at the NIPS reception, and I wanted to put this out there right away.


FOOTNOTES

* Mattingly, R., “Giovanni Hidalgo-Conga Virtuoso,” Modern Drummer, Modern Drummer Publications, Inc., Cedar Grove, NJ, November 1998, p. 86.

** While I was talking to Mr. Fereira of SoundCloud this evening at NIPS, he naturally mentioned genre recognition, which is the topic of my second-to-last post. (I argued for the need for expert listeners from many cultural backgrounds, which could be augmented with a sufficiently good implementation of crowd-sourcing.) I think he was telling me about embolada, or at least that’s how I interpreted his description of this MC-battle type of improvised nordeste music. How many genre-recognition researchers even know where to start in telling a street-improvisation embolada from, say, even a pagode-influenced axé song like ‘Entre na Roda’ by Bom Balanço? (Really good swing detection might help, I suppose.)

*** This term has multiple meanings; I’m not referring to the genre partido-alto, but the pattern, which is one of the three primary ingredients of samba, along with the strong surdo beat on 2 (and 4) and the swung samba 16ths.

**** In the sense that, in the idiom, it goes with the so-called 3-2 “bossa clave” (a delightful misnomer), X..X..X...X..X..:

The “bossa clave” is a bit like an English horn; it’s neither.

as well as with the rather confusing (to some) third-surdo pattern ....X.X.....XX.X:

Top: Surdo de terceira. Bottom: The 3-2 partido-alto for cuíca and agogô. Note that playing the partido-alto omitting the first and third crotchet’s worth of onsets results in the terceira.

which has two notes in its first half and three notes in its second half. (Yes, it’s in 3-2. My grammar for clave direction explains this thoroughly. [http://academiccommons.columbia.edu/catalog/ac:180566])

***** See Merriam: “continual use of off-beating without respite would cause a readjustment on the part of the listener, resulting in a loss of the total effect; thus off-beating [with respite] is a device whereby the listeners’ orientation to a basic rhythmic pulse is threatened but never quite destroyed” (Merriam, Alan P. “Characteristics of African Music.” Journal of the International Folk Music Council 11 (1959): 13–19.)

ALSO, I use the term “offbeatness” instead of ‘syncopation’ because the former is not norm-based, whereas the latter turns out to be so:

Coined by Toussaint as a mathematically measurable rhythmic quantity [1], offbeatness has proven invaluable to the preliminary work of understanding Afro-Brazilian (partido-alto) clave direction. It is interpreted here as a more precise term for rhythmic purposes than ‘syncopation’, which has a formal definition that is culturally rooted: Syncopation is the placement of accents on normally unaccented notes, or the lack of accent on normally accented notes. It may be assumed that the norm in question is that of the genre, style or cultural/national origin of the music under consideration. However, in all usage around the world (except mine), normal accent placement is taken to be normal European accent placement [2, 3, 4].

For example, according to Kauffman [3, p. 394], syncopation “implies a deviation from the norm of regularly spaced accents or beats.” Various definitions by leading sources cited by Novotney also involve the concepts of “normal position” and “normally weak beat” [2, pp. 104, 108]. Thus, syncopation is seen to be norm-referenced, whereas offbeatness is less contextual as it depends solely on the tactus.

Kerman, too, posits that syncopation involves “accents in a foreground rhythm away from their normal places in the background meter. This is called syncopation. For example, the accents in duple meter can be displaced so that the accents go on one TWO, one TWO, one TWO instead of the normal ONE two, ONE two” [4, p. 20; emphasis in the original, shown here in capitals]. Similarly, Kerman reinforces that “[t]he natural way to beat time is to alternate accented (“strong”) and unaccented (“weak”) beats in a simple pattern such as one two, one two, one two or one two three, one two three, one two three” [4, p. 18].

Hence, placing a greater accent on the second rather than on the first quarter note of a bar may be sufficient to invoke the notion of syncopation. By this definition, the polka is syncopated, and since it is considered the epitome of “straight rhythm” to many performers of Afro-Brazilian music, syncopation clearly is not the correct term for what the concept of clave direction is concerned with. Offbeatness avoids all such cultural referencing because it is defined solely with respect to a pulse, regardless of cultural norms. (Granted, what a pulse is may also be culturally defined, but there is a point at which caveat upon caveat becomes counterproductive.)

Furthermore, in jazz, samba, and reggae (to name just a few examples) this would not qualify as syncopation (in the sense of accents in abnormal or unusual places) because beats other than “the one” are regularly accented in those genres as a matter of course. In the case of folkloric samba, even the placement of accents on the second eighth note, therefore, is not syncopation because at certain places in the rhythmic cycle, that is the normal—expected—pattern of accents for samba, part of the definition of the style. Hence, it does not constitute syncopation if we are to accept the definition of the term as used and cited by Kauffman, Kerman, and Novotney. In other words, “syncopation” is not necessarily the correct term for the phenomenon of accents off the downbeat when it comes to non-European music.
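To make the contrast concrete, offbeatness can be computed with no reference to any norm at all. Below is a minimal Python sketch in the spirit of Toussaint’s measure [1]; carrying his twelve-pulse definition over to a sixteen-pulse cycle, where the off-beat positions come out as the positions coprime to sixteen (the odd ones), is my adaptation here:

from math import gcd

def offbeatness(tubs):
    # Count onsets at positions no regular sub-cycle of the pulse grid
    # (period 2, 4, or 8, starting on the downbeat) ever lands on,
    # i.e., positions sharing no common factor with the cycle length.
    n = len(tubs)
    return sum(1 for i, c in enumerate(tubs) if c == 'X' and gcd(i, n) == 1)

print(offbeatness("X..X..X.X..X..X."))  # tresillo clap: 2 (onsets at 3 and 11)
print(offbeatness(".X..X.X.X.X..X.X"))  # partido-alto: 3 (onsets at 1, 13, 15)
print(offbeatness("X...X...X...X..."))  # four on the floor: 0

Only the pulse enters the computation; nothing about “normal” accent placement does.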

Moreover, in Meter in Music, Houle observes that “[a]ccent, defined as dynamic stress by seventeenth- and eighteenth-century writers, was one of the means of enhancing the perception of meter, but it became predominant only in the last half of the eighteenth century [emphasis added]. The idea that the measure is a pattern of accents is so widely held today that it is difficult to imagine that notation that looks modern does not have regular accentual patterns. Quite a number of serious scholarly studies of this music [European art music of 1600–1800] make this assumption almost unconsciously by translating the (sometimes difficult) early descriptions of meter into equivalent descriptions of the modern accentual measure” [5, p. viii]. Thus, it turns out that the current view of rhythm and meter is not natural, or even traditional, let alone global.

In fact, Essential Dictionary of MUSIC NOTATION: The most practical and concise source for music notation is perfect for all musicians—amateur to professional (that is the actual book title) gives the preferred/recommended beaming for the 9/8 compound meter as three groups of three eighth notes [6, p. 73]. This goes against the accent pattern implied by the 9/8 meter in Turkish (and other Balkan) music, which is executed as 4+5, 5+4, 2+2+2+3, etc., but rarely 3+3+3. The 9/8 is one of the most common and typical meters in Turkish music, not an atypical curiosity. This passage is included here to demonstrate the dangers of applying western European norms to other musics (as indicated by the phrase “perfect for all musicians”).

[1]    Toussaint, G., “Mathematical Features for Recognizing Preference in Sub-Saharan African Traditional Rhythm Timelines,” Lecture Notes in Computer Science 3686:18–27, Springer Berlin/Heidelberg, 2005.
[2]    Novotney, E. D., “The 3-2 Relationship as the Foundation of Timelines in West African Musics,” Ph.D. dissertation, University of Illinois at Urbana-Champaign, Urbana-Champaign, Illinois, 1998.
[3]    Kauffman, R. 1980. African Rhythm: A Reassessment. Ethnomusicology 24 (3):393–415.
[4]    Kerman, J., LISTEN: Brief Edition, New York, NY: Worth Publishers, Inc., 1987, p. 20.
[5]    Houle, G., Meter in Music, 1600–1800: Performance, Perception, and Notation, Bloomington, IN: Indiana University Press, 1999.
[6]    Gerou, T., and Lusk, L., Essential Dictionary of MUSIC NOTATION: The most practical and concise source for music notation is perfect for all musicians—amateur to professional, Van Nuys, CA: Alfred Publishing Co., Inc., 1996.