Teaching machine learning within different fields

Everyone is talking about machine learning (ML) these days. They usually call it “machine learning and artificial intelligence” and I keep wondering what exactly they mean by each term.

It seems the term “artificial intelligence” has shaken off its negative connotations from back when it meant top-down systems (as opposed to the superior bottom-up “computational intelligence” that most of today’s so-called AI actually uses) and has come to mean what cybernetics once was: robotics, machine learning, embedded systems, decision-making, visualization, control, etc., all in one.

Now that ML is important to so many industries, application areas, and fields, it is taught in many types of academic departments. We approach machine learning differently in ECE, in CS, in business schools, in mechanical engineering, and in math and statistics programs. The granularity of focus varies, with math and CS taking the most detailed view, followed by ECE and ME departments, followed by the highest-level applied version in business schools, and with Statistics covering both ends.

In management, students need to be able to understand the potential of machine learning and be able to use it toward management or business goals, but do not have to know how it works under the hood, how to implement it themselves, or how to prove the theorems behind it.

In computer science, students need to know the performance measures (and results) of different ways to implement end-to-end machine learning, and they need to be able to do so on their own with a thorough understanding of the technical infrastructure. (If what I have observed is generalizable, they also tend to be more interested in virtual and augmented reality, artificial life, and other visualization and user-experience aspects of AI.)

In math, students and graduates really need to understand what’s under the hood. They need to be able to prove the theorems and develop new ones. It is the theorems that lead to powerful new techniques.

In computer engineering, students also need to know how it all works under the hood, and have some experience implementing some of it, but don’t have to be able to develop the most efficient implementations unless they are targeting embedded systems. In either case, though, it is important to understand the concepts, the limitations, and the pros and cons as well as to be able to carry out applications. Engineers have to understand why there is such a thing as PAC (probably approximately correct) learning, what the curse of dimensionality is and what it implies for how one does and does not approach a problem, what the no-free-lunch (NFL) theorem is and how that should condition one’s responses to claims of a single greatest algorithm, and what the history and background of this family of techniques are really like. These things matter because engineers should not expect to be plugging-and-playing cookie-cutter algorithms from ready-made libraries. That’s being an operator of an app, not being an engineer. The engineer should be able to see the trade-offs, plan for them, and take them into account when designing the optimal approach to solving each problem. That requires understanding parameters and structures, and again the history.
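To make the curse of dimensionality concrete, here is a minimal illustrative sketch (my own, not from any course or library mentioned here) showing how pairwise distances concentrate as the number of dimensions grows, which is one reason naive nearest-neighbor reasoning degrades in high-dimensional feature spaces:

```python
# Illustrative sketch: distance concentration in high dimensions.
# Requires only NumPy; all numbers and names here are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
n_points = 500

for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(n_points, d))      # points in the unit hypercube
    q = rng.uniform(size=d)                  # one query point
    dists = np.linalg.norm(X - q, axis=1)    # Euclidean distances to the query
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:5d}  relative distance contrast = {contrast:.3f}")

# The contrast shrinks toward zero as d grows: "nearest" and "farthest"
# neighbors become nearly indistinguishable.
```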

Today, the field of ‘Neural Networks’ is popular and powerful. That was not always so; it has been so only twice before. Each time, perhaps like an overextended empire, the edifice of artificial neurons came down (though only to rise again, stronger, some years later).

When I entered the field, with an almost religious belief in neural networks, they were quite uncool. The wisdom among graduate students seemed to be that neural nets were outdated, that we had SVMs now, and that with the latter machine learning was solved forever. (This reminds me of the famous, and probably apocryphal, patent-office declaration in the late 1800s that everything that could be invented had been invented.) Fortunately, I have always benefited from doing whatever was unpopular, so I stuck to my neural nets, fuzzy systems, evolutionary algorithms, and an obsession with Bayes’ rule while others whizzed by on their SVM dissertations. (SVMs are still awesome, but the thing that has set the world on fire is neural nets again.)

One of the other debates raging, at least in my academic environment at the time, was about “ways of knowing.” I have since come to think that science is not a way of knowing. It never was, though societies thought so at first (and many still think so). Science is a way of incrementally increasing confidence in the face of uncertainty.

I bring this up because machine learning, likewise, never promised to have the right answer every time. Machine learning is all about uncertainty; it thrives on uncertainty. It’s built on the promise of PAC learning; i.e., it promises to be only slightly wrong and to be so only most of the time. The hype today is making ML seem like some magical panacea for all business, scientific, medical, and social problems. For better or worse, it’s only another technological breakthrough in our centuries-long adventure of making our lives safer and easier. (I’m not saying we haven’t done plenty of wrongs in that process—we have—but no one who owns a pair of glasses, a laptop, a ball-point pen, a digital piano, a smart phone, or a home-security system can fail to see the good that technology has done for humankind.)
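For readers who want the formal version of that promise, here is the standard (ε, δ) statement of PAC learning, together with the textbook sample-complexity bound for a finite hypothesis class in the realizable case (included only as an illustration, not something this post derives):

```latex
% PAC promise: with probability at least 1 - \delta over the draw of the
% training sample, the learned hypothesis h is at most \epsilon wrong.
\Pr\!\left[\operatorname{err}(h) \le \epsilon\right] \;\ge\; 1 - \delta
% Textbook sample-complexity bound (finite hypothesis class H, realizable case):
m \;\ge\; \frac{1}{\epsilon}\left(\ln\lvert H\rvert + \ln\frac{1}{\delta}\right)
```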

I have saved the place of the field of Statistics in machine learning for the end. Statisticians are the true owners of machine learning. We engineering, business, and CS people are leasing property on their philosophical (not real) estate.

 

[Figure: BPM-over-time graph of Herbie Hancock’s ‘Chameleon’, captured from the Android app liveBPM (v. 1.2.0) by Daniel Bach]

Listening to music seems easy.

Listening to music seems easy; it even appears like a passive task.

Listening, however, is not the same as hearing. In listening, i.e., attending, we add cognition to perception. The cognition of musical structures, cultural meanings, conventions, and even of the most fundamental elements themselves such as pitch or rhythm turns out to be a complex cognitive task. We know this is so because getting our cutting-edge technology to understand music with all its subtleties and its cultural contexts has proven, so far, to be impossible.

Within small fractions of a second, humans can reach conclusions about musical audio that are beyond the abilities of the most advanced algorithms.

For example, a trained or experienced musician (or even a non-musician listener) can distinguish computer-generated from human-performed instruments in almost any musical input, even in the presence of dozens of other instruments sounding simultaneously.

In a rather different case, humans can maintain time-organizational internal representations of music while the tempo of a recording or performance continuously changes. A classic example is the jazz standard Chameleon by Herbie Hancock off the album ‘HEADHUNTERS’. The recording never retains any one tempo, following an up-and-down contour and mostly getting faster. Because tempo recognition is a prerequisite to other music-perception tasks like meter induction and onset detection, this type of behavior presents a significant challenge to signal-processing and machine-learning algorithms but generally poses no difficulty to human perception.
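As a rough illustration of what a machine is up against here, the following sketch shows one way to plot the kind of tempo-over-time contour that the liveBPM screenshot above displays. It assumes the librosa library, and "chameleon.wav" is a placeholder filename rather than a file that accompanies this post; a tracker that assumes a near-constant tempo will struggle with exactly the continuous drift described here.

```python
# Sketch: estimate a beat-by-beat tempo contour from an audio file.
# Assumes librosa; "chameleon.wav" is a placeholder filename.
import numpy as np
import librosa

y, sr = librosa.load("chameleon.wav")                     # audio signal and sample rate
global_tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)   # beat positions in seconds

inter_beat = np.diff(beat_times)          # seconds between consecutive beats
instantaneous_bpm = 60.0 / inter_beat     # local tempo estimate at each beat

print(f"single global estimate: {float(global_tempo):.1f} BPM")
for t, bpm in zip(beat_times[1:], instantaneous_bpm):
    print(f"{t:7.2f} s  {bpm:6.1f} BPM")
```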

Another example is the recognition of vastly different cover versions of songs: A person familiar with a song can recognize within a few notes a cover version of that song done in another genre, at a different tempo, by another singer, and with different instrumentation.

Each of these is a task well beyond the machine-learning techniques that are exhibiting remarkable successes in visual recognition, where the main challenge, invariance, is a smaller obstacle than the abstractness of music and its seemingly arbitrary meanings and structures.

Consider the following aspects of music cognition.

  • inferring a key (or a change of key) from very few notes
  • identifying a latent underlying pulse when it is completely obscured by syncopation [Tal et al., Missing Pulse]
  • effortlessly tracking key changes, tempo changes, and meter changes
  • instantly separating and identifying instruments even in performances with many-voice polyphony (as in Dixieland Jazz, Big-Band Jazz, Baroque and Classical European court music, Progressive Rock, folkloric Rumba, and Hindustani and Carnatic classical music)

These and many other forms of highly polyphonic, polyrhythmic, or cross-rhythmic music continue to present challenges to automated algorithms. Successful examples of automated tempo or meter induction, onset detection, source separation, key detection, and the like all work only under tight limitations on the types of inputs they accept. Even for a single such task, such as source separation, a universally applicable algorithm does not seem to exist. (There is some commercial software that appears to do these tasks universally, but because proprietary programs do not provide sufficiently detailed outputs, it is uncertain whether they really can perform all these functions or whether they perform one function in enough detail to suffice for studio use. One such suite can identify and separate every individual note from any recording, but it does not perform source separation into per-instrument streams, and it presents its output in a form not conducive to analysis in rhythmic, harmonic, melodic, or formal terms, and not in a form analogous to human cognitive processing of music.)

Not only does universal music analysis remain an unsolved problem, but also most of the world’s technological effort goes toward European folk music, European classical music, and (international) popular music. The goal of my research and my lab (Lab BBBB: Beats, Beats, Bayes, and the Brain) is to develop systems for culturally sensitive and culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition, and to do so for popular music styles from the Global South that are not on the industry’s radar.

Since the human nervous system is able to complete musical-analysis tasks under almost any set of circumstances, in multiple cultural and cross-cultural settings, with varying levels of noise and interference, the human brain is still superior to the highest-level technology we have developed. Hence, Lab BBBB takes inspiration and direct insight from human neural processing of audio and music to solve culturally specific cognitive problems in music analysis, and to use this context to further our understanding of neuroscience and machine learning.

The long-term goal of our research effort is a feedback cycle:

  1. Neuroscience (in simulation and with human subjects at our collaborators’ sites) informs both music information retrieval and research into neural-network structures (machine learning). We are initially doing this by investigating the role of rhythm priming in Parkinson’s disease (rhythm–motor interaction) and in grammar-learning performance (rhythm–language interaction) in the basal ganglia. We hope then to replicate in simulation the effects that have been observed in people, verify our models, and apply our modeling experience to other tasks that have not yet been demonstrated in humans or that would be too invasive or otherwise unacceptable to test in humans.
  2. Work on machine learning informs neuroscience by narrowing down the range of investigation.
  3. Deep learning is also used to analyze musical audio using structures closer to those in the human brain than the filter-bank and matrix-decomposition methods typically used to analyze music (a minimal sketch of the latter follows this list).
  4. Music analysis informs cognitive neuroscience, we conjecture, as has been done in certain cases in the literature with nonlinear dynamics.
  5. Phenomena like entrainment and neural resonance in neurodynamics further inform the development of neural-network structures and data-subspace methods.
  6. These developments in machine learning move music information retrieval closer to human-like performance for culturally informed music analysis, music coaching, automated accompaniment, music recommendation, and algorithmic composition for multicultural intelligent music systems.
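Here is the minimal sketch promised in item 3: the kind of matrix-decomposition analysis that deep models are being compared against, factoring a magnitude spectrogram with non-negative matrix factorization (NMF). It assumes librosa and scikit-learn, "input.wav" is a placeholder filename, and nothing here is the lab’s actual pipeline.

```python
# Sketch: NMF on a magnitude spectrogram, the classic matrix-decomposition
# baseline for music analysis. Assumes librosa and scikit-learn.
import numpy as np
import librosa
from sklearn.decomposition import NMF

y, sr = librosa.load("input.wav")        # placeholder audio file
S = np.abs(librosa.stft(y))              # magnitude spectrogram (freq x time)

model = NMF(n_components=8, init="nndsvd", max_iter=400)
W = model.fit_transform(S)               # spectral templates (freq x components)
H = model.components_                    # activations over time (components x time)

# W[:, k] is the k-th spectral template; H[k, :] says when it is active.
# Grouping components into per-source streams is where such decompositions
# usually run into the "tight limitations on inputs" mentioned above.
print(W.shape, H.shape)
```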

 

The subjunctive is scientific thinking built into the language.

The subjunctive draws a distinction between fact and possibility, between truths and wishes. The expression “if he were” (not “if he was”) is subjunctive; it intentionally sounds wrong (unless you’re used to it) to indicate that we’re talking about something hypothetical as opposed to something actual.
This is scientific thinking built into the language (a distinction the Romance languages mark even more systematically).

This is beautiful. Let’s hold onto it.

You are not disinterested.

Everyone: Stop saying ‘disinterested’. You apparently don’t know what it means. It doesn’t mean ‘uninterested’.

In fact, it means you’re truly interested. ‘Disinterested’ is when you care so deeply as to want to treat the situation objectively. It is a scientific term describing the effort to rid a study of the effects of subconscious biases.

Also, please don’t say ‘substantive’ when all you mean is ‘substantial’. They’re not the same thing. Thanks. (‘Substantial’ is a good word. You’re making it feel abandoned.)

Microsoft: Fix your use of the word ‘both’.
When comparing only two files, Windows says something like “Would you like to compare both files?” As opposed to what, just compare one, all by itself? (like the sound of one hand clapping?)
The word ‘both’ is used when the default is not two things. It emphasizes the two-ness to show that the two-ness is special, unusual. But when the default is two, you say “the two” (as in “Would you like to compare the two files?”), not ‘both’, and DEFINITELY NOT ‘the both’. (It was cute when that one famous person said it once. It’s not cute anymore. Stop saying it.)
Back to ‘both’: A comparison has to involve two things, so ‘both’ (the special-case version of the word ‘two’) only makes sense if the two things are being compared to a third.
English is full of cool, meaningful nuances. I hope we stop getting rid of them.

Seriously, everyone: English is wonderful. Why are you destroying it?

 

PS: same with “on the one hand”… We used to say “on one hand” (which makes sense… either one, any one, not a definite hand with a definite article)

Science-doing

There are (at least) two types of scientists: scientist-scientists and science-doers.

Both groups do essential, difficult, demanding, and crucial work that everyone, including the scientist-scientists, needs. The latter group (like the former) includes people who work in research hospitals, water-quality labs, soil-quality labs, linear accelerators, R-&-D labs of all kinds, and thousands of other places. They carry out the daily work of science with precision, care, and a lot of hard work. Yet, at the same time, in the process of doing the doing of science, they typically do not get the luxury of stepping back, moving away from the details, starting over, and discovering the less mechanical, less operational connections among the physical sciences, the social sciences, the humanities, technology, business, mathematics, and statistics… especially the humanities and statistics.

I am not a good scientist, and that has given me the opportunity to step back, start over, do some things right this time, and more importantly, through a series of delightful coincidences, learn more about the meaning of science than about the day-to-day doing of it.[1] This began to happen during my Ph.D., but only some of the components of this experience were due to my Ph.D. studies. The others just happened to be there for me to stumble upon.

The sources of these discoveries took the form of two electrical-engineering professors, three philosophy professors, one music professor, one computer-science professor, some linguistics graduate students, and numerous philosophy, math, pre-med, and other undergrads. All of these people exposed me to ideas, ways of thinking, ways of questioning, and ways of teaching that were new to me.

As a result of their collective influence, my studies, and all my academic jobs from that period, I have come to think of science not merely as the wearing of lab coats and the carrying out of mathematically, mechanically, or otherwise challenging complex tasks. I have come to think of science as the following of, for lack of a better expression, the scientific method, although by that I do not necessarily mean the grade-school inductive method with its half-dozen simple steps. I mean all the factors one has to take into account in order to investigate anything rigorously. These include:

  • double-blinding (whether clinical or otherwise) to deal with confounding variables, experimenter effects, and other biases
  • setting up idiot checks in experimental protocols
  • varying one unknown at a time (or varying all unknowns with a factorial design)
  • not assuming unjustified, convenient probability distributions
  • using the right statistics and statistical tests for the problem and data types at hand
  • correctly interpreting results, tests, and statistics
  • not chasing significance
  • setting power targets or determining sample sizes in advance (a small worked example follows below)
  • using randomization and blocking in setting up an experiment, or the appropriate level of random or stratified sampling in collecting data [See Box, Hunter, and Hunter’s Statistics for Experimenters for easy-to-understand examples.]
  • the principles of accuracy, objectivity, skepticism, open-mindedness, and critical thinking

The latter set of principles is given on p. 17 and p. 20 of Essentials of Psychology [third edition, Robert A. Baron and Michael J. Kalsher, Needham, MA: Allyn & Bacon, 2002].
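As one tiny worked example of the “determining sample sizes in advance” item (my own illustration, not taken from the books cited here), the following sketch uses the statsmodels package to ask how many subjects per group a two-sample t-test needs in order to detect a medium effect with 80% power:

```python
# Sketch: prospective sample-size calculation for a two-sample t-test.
# Assumes the statsmodels package; the effect size, alpha, and power are
# conventional example values, not numbers from any study discussed here.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,  # Cohen's d, "medium" effect
                                   alpha=0.05,        # significance level
                                   power=0.8)         # desired power
print(f"Roughly {n_per_group:.0f} subjects per group")  # about 64 per group
```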

These two books, along with Hastie, Tibshirani, and Friedman’s The Elements of Statistical Learning and a few other sources (mostly heavily cited papers on the misuses of statistics), have formed the basis of my view of science. This is why I think science-doing is not necessarily the same thing as being a scientist. In a section called ‘On being a scientist’ in a chapter titled ‘Methodology Wars’, the neuroscientist Fost explains how it’s possible, although not necessarily common, to be on “scientific autopilot” (p. 209) because of the way undergraduate education focuses on science facts and methods[2] over scientific thinking and the way graduate training and faculty life emphasize administration, supervision, managerial oversight, grant-writing, and so on (pp. 208–9). All this leaves a brief graduate or post-doc period in most careers for deep thinking and direct hands-on design of experiments before the mechanical execution and the overwhelming burdens of administration kick in. I am not writing this to criticize those who do what they have to do to further scientific inquiry but to celebrate those who, in the midst of that, find the mental space to continue to be critical, skeptical questioners of methods, research questions, hypotheses, and experimental designs. (And there are many of those. It is just not as automatic as the public seems to think it is, i.e., as simple as getting a degree and putting on a white coat.)

 

Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building, George E. P. Box, William G. Hunter, and J. Stuart Hunter, New York, NY: John Wiley & Sons, Inc., 1978 (0-471-09315-7)

Essentials of Psychology, third edition, Robert A. Baron and Michael J. Kalsher, Needham, MA: Allyn & Bacon, A Pearson Education Company, 2002

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, second edition, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, New York, NY: Springer-Verlag, 2009 (978-0-387-84858-7 and 978-0-387-84857-0)

If Not God, Then What?: Neuroscience, Aesthetics, and the Origins of the Transcendent, Joshua Fost, Clearhead Studios, Inc., 2007 (978-0-6151-6106-8)

[1] Granted, a better path would be the more typical one of working as a science-doer scientist for thirty years, accumulating a visceral set of insights, and moving into the fancier stuff due to an accumulation of experience and wisdom. However, as an educator, I did not have another thirty years to spend working on getting a gut feeling for why it is not such a good idea to (always) rely on a gut feeling. I paid a price, too. I realize I often fail to follow the unwritten rules of social and technical success in research when working on my own research, and I spend more time than I perhaps should on understanding what others have done. Still, I am also glad that I found so much meaning so early on.

[2] In one of my previous academic positions, I was on a very active subcommittee that designed critical-thinking assessments for science, math, and engineering classes with faculty from chemistry, biology, math, and engineering backgrounds. We talked often about the difference between teaching scientific facts and teaching scientific thinking. Among other things, we ended up having the university remove a medical-terminology class from the list of courses that counted as satisfying a science requirement in general studies.

This is a sci-fi story.

The icers were out in force that night. Joey really didn’t like running into them. It wouldn’t be dangerous if he didn’t wear his lukers leather jacket when he went out alone, but unless he was a full-time luker—and flaunting it—he felt like a traitor. He felt like he wasn’t “real” or wasn’t noteworthy enough to be out on the streets. He also didn’t want to run into any girls while not visually declaring allegiance to his chosen subculture, so he wore the identifiers of his subculture even though it meant he’d likely get beaten bloody by icers or somebody else.

The icers were truly extreme, Joey thought. They claimed they would rather die than drink any but the most extreme ice-cold tap water. Lukers, like Joey, weren’t so picky, and indeed preferred avoiding brain freeze.

Water was no longer the simple commodity previous generations took for granted. It wasn’t exactly unobtainable, but most people had to save to get their monthly allotment, which was a very small amount, or perform community service for extra water which would then be delivered automatically to their approved smarthomes. Now that games, movies, music, fashion, food, cars, drones, jetpacks, body alterations, and everything else a youth could want was readily available through picofabrication, automation, and biotech, it was ironically the most basic substance of life, water, that became scarce—because desalination remained expensive—and thus became the marker of one’s identity as a young person: their subcultural in-group.

Aside from the icers and lukers, there were half a dozen other fine-grained varieties of tap-water subcultures (like the boilers and the JCs—”just cold”).

Mostly high-school kids, and mostly idle due to the abundance of free scavenged (recycled) energy for their picoautomation and their neural implants, these youths roamed the streets in their picofabricated faux-leather jackets emblazoned with their subcultural affiliation, and picked fights with members of other groups.

After the first few years of water shortage, this expression of identity through the one scarce resource that was critical for survival began to expand. Through a naturally stochastic clustering process, some hairstyles and preferences for clothes or shoes became associated with particular groups.

It just happened the way it did. There is no reason icers should prefer fur-lined boots and Christmas sweaters. If anything, one would expect the opposite. Yet, they wear them even in the summer… in the 130° globally warmed summers of Cascadia. That’s how you know you’ve got a genuine subculture: The clothing has got to be uncomfortable; it’s gotta require sacrifice.

The lukers likewise somehow ended up all having to wear havaianas, 20th-century motorcycle helmets over long green hair, tank tops (what the British call “vests”), bandannas tied right at the elbow, one on each arm, and pajama pants with teddy bears sewn onto them. (None of them knew that this last little detail originated with a bassist in a combo of ancient “rock” music from back when music was made by people playing instruments rather than by autonomous conscious AI units that wrote every kind of music straight into digital encoding.) The more teddy bears one’s pants had on them, the greater one’s status as a luker.

Joey had found the time to get his automation to sew 37 onto his favorite pajama pants and another 24 on a different pair. The fact that he consequently couldn’t run was a big part of why the icers picked on him so much. They, on the other hand, spent most of their time getting their picobots to learn to assemble themselves into fists and feet for delivering punches and kicks from a distance.

So, Joey called up a game in his neural implant as he and his 37 teddy bears set out onto the streets of Seaportouver, bracing themselves—not so much the teddy bears but Joey’s bio-body and all his affiliated picobots and neurally linked semi-autonomous genetic floaters—against the onslaught of icer attacks and against old people who look disdainfully at his awkward teddy-bear-encumbered gait and transmit unsolicited neuro-advice that clogs up his game for an entire interminable microsecond, in search of a thimbleful of lukewarm water.

 

This mini sci-fi story is an attempt to draw a parallel between how ridiculous and unlikely such tap-water-based subcultures of street-fighting youth might seem to us, and how the music-based subcultures of my youth in the ’80s must seem to today’s youth.

Music, after all, is like water now: You turn the tap, and it pours out—out of YouTube, Spotify, Pandora, Slacker, or SoundCloud, and in a sense, also out of GarageBand, FruityLoops, Acid, and myriad other tools for generating music from loops. A few dozen people in a few offices in LA may make those loops—they’re like the people working the dams and the people who run the municipal water bureau or whatever. They supply the water that we take for granted, and it just flows out of the tap, not requiring any thought or effort on our part about how it got there or how much of it there might be. Music today works the same way. You exchange memory cards or streaming playlists; you download free software that allows you to drag and drop loops and which makes sure they are in the same key and tempo. It’s about as complicated as making lemonade. Why would such a thing have any relation to one’s identity and individuality?

In contrast, when I was young, I had to save money for a year and still beg my parents for a long-playing record. I could also occasionally buy some cheap tapes or record songs off the radio (almost always with the beginning cut off and with a DJ talking over the end) onto cheap low-fi cassettes that had more hiss than hi-hat. My first compact disc, a birthday present from a wealthy relative, was like an alien artifact. It still looks a bit magical to me… so small and shiny. Today, I hear they’re referred to as “coasters” because… why bother putting music on a recording medium when it’s free and ubiquitous? 

Subculture-as-identity-marker has disappeared except among the old. (How old is Iggy today, or the guys from The Clash?) Young people today dress in combinations of the “uniforms” of ‘50s, ‘60s, ‘70s, ‘80s, and ‘90s subcultures without having any interest in the sociopolitics or music of those subcultures. The last three times I talked to a (seemingly) fellow goth or punk rocker, they reacted with mild repulsion at the suggestion that they might listen to such music.

Expressing allegiance to a musical subculture must seem as silly to today’s youth (say, through age 30 or so) as expressing allegiance to a temperature of water would seem to anyone.