Teaching machine learning within different fields

Everyone is talking about machine learning (ML) these days. They usually call it “machine learning and artificial intelligence” and I keep wondering what exactly they mean by each term.

It seems the term “artificial intelligence” has shaken off its negative connotations from back when it meant top-down systems (as opposed to the superior bottom-up “computational intelligence” that most of today’s so-called AI actually uses) and has come to mean what cybernetics once was: robotics, machine learning, embedded systems, decision-making, visualization, control, etc., all in one.

Now that ML is important to so many industries, application areas, and fields, it is taught in many types of academic departments. We approach machine learning differently in ECE, in CS, in business schools, in mechanical engineering, and in math and statistics programs. The granularity of focus varies, with math and CS taking the most detailed view, followed by ECE and ME departments, followed by the highest-level applied version in business schools, and with Statistics covering both ends.

In management, students need to be able to understand the potential of machine learning and be able to use it toward management or business goals, but do not have to know how it works under the hood, how to implement it themselves, or how to prove the theorems behind it.

In computer science, students need to know the performance measures (and results) of different ways to implement end-to-end machine learning, and they need to be able to do so on their own with a thorough understanding of the technical infrastructure. (If what I have observed is generalizable, they also tend to be more interested in virtual and augmented reality, artificial life, and other visualization and user-experience aspects of AI.)

In math, students and graduates really need to understand what’s under the hood. They need to be able to prove the theorems and develop new ones. It is the theorems that lead to powerful new techniques.

In computer engineering, students also need to know how it all works under the hood, and have some experience implementing some of it, but don’t have to be able to develop the most efficient implementations unless they are targeting embedded systems. In either case, though, it is important to understand the concepts, the limitations, and the pros and cons as well as to be able to carry out applications. Engineers have to understand why there is such a thing as PAC (probably approximately correct) learning, what the curse of dimensionality is and what it implies for how one does and does not approach a problem, what the NFL (no-free-lunch theorem) is and how that should condition one’s responses to claims of a single greatest algorithm, and what the history and background of this family of techniques are really like. These things matter because engineers should not expect to be plugging-and-playing cookie-cutter algorithms from ready-made libraries. That’s being an operator of an app, not being an engineer. The engineer should be able to see the trade-offs, plan for them, and take them into account when designing the optimal approach to solving each problem. That requires understanding parameters and structures, and again the history.

Today, the field of ‘Neural Networks’ is popular and powerful. That was not always the case. It has been the case two other times in the past. Each time, perhaps like an overextended empire, the edifice of artificial neurons came down (though only to come up stronger some years later).

When I entered the field, with an almost religious belief in neural networks, they were quite uncool. The wisdom among graduate students seemed to be that neural nets were outdated, that we had SVMs now, and with the latter machine learning was solved forever. (This reminds me of the famous, if probably apocryphal, patent-office declaration in the late 1800s that everything that could be invented had been invented.) Fortunately, I have always benefited from doing whatever was unpopular, so I stuck to my neural nets, fuzzy systems, evolutionary algorithms, and an obsession with Bayes’ rule while others whizzed by on their SVM dissertations. (SVMs are still awesome, but the thing that has set the world on fire is neural nets again.)

One of the other debates raging, at least in my academic environment at the time, was about “ways of knowing.” I have since come to think that science is not a way of knowing. It never was, though societies thought so at first (and many still think so). Science is a way of incrementally increasing confidence in the face of uncertainty.

I bring this up because machine learning, likewise, never promised to have the right answer every time. Machine learning is all about uncertainty; it thrives on uncertainty. It’s built on the promise of PAC learning; i.e., it promises to be only slightly wrong, and to keep even that promise only most of the time. The hype today is making ML seem like some magical panacea to all business, scientific, medical, and social problems. For better or worse, it’s only another technological breakthrough in our centuries-long adventure of making our lives safer and easier. (I’m not saying we haven’t done plenty of wrongs in that process; we have. But no one who owns a pair of glasses, a laptop, a ball-point pen, a digital piano, a smart phone, or a home-security system should be able to fail to see the good that technology has done for humankind.)
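To put a number on that promise, here is a minimal sketch in Python. It rests on assumptions that are not part of the discussion above: a finite set of candidate hypotheses, one of which is exactly right, and a learner that returns any hypothesis consistent with all of its training examples. Under those textbook assumptions, the standard PAC (probably approximately correct) bound says that m >= (1/epsilon) * (ln|H| + ln(1/delta)) examples suffice for the learner to be at most epsilon wrong with probability at least 1 - delta. The function name and the example values are made up for illustration.

```python
import math

def pac_sample_size(num_hypotheses, epsilon, delta):
    """Number of examples that suffices, under the assumptions stated above,
    for a consistent learner to have true error <= epsilon
    with probability >= 1 - delta."""
    return math.ceil((math.log(num_hypotheses) + math.log(1 / delta)) / epsilon)

# "Only slightly wrong" = at most 10% error; "most of the time" = 95% of the time.
print(pac_sample_size(1000, epsilon=0.1, delta=0.05))   # 100

# Asking to be wrong only half as often roughly doubles the data needed.
print(pac_sample_size(1000, epsilon=0.05, delta=0.05))  # 199
```

Note that the required number of examples grows without limit as epsilon or delta approaches zero: this bound never offers a guarantee of being exactly right every time.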

I have left the place of Statistics in machine learning until the end. Statisticians are the true owners of machine learning. We engineering, business, and CS people are leasing property on their philosophical (not real) estate.



Overfitting, Confirmation Bias, Strong AI, and Teaching

I was asked recently by a student about how machine learning could happen. I started out by talking about human learning: how we don’t consider mere parroting of received information to be the same as learning, but that we can make the leap from some examples we have seen to a new situation or problem that we haven’t seen before. Granted, there need to be some similarities (shared structure or domain of discourse; we don’t become experts on European Union economics as a result only of learning to distinguish different types of wine), but what makes learning meaningful and fun for us is the ability to make a leap, to solve a previously inaccessible problem or deduce (really it’s ‘induce’) a new categorization.

In response, the student asked how machines could do that. I replied that not only do we give them many examples to learn from, but we also give them algorithms (ways to deal with examples) that are inspired by how natural systems work: inspired by ants or honeybees, genetics, the immune system, evolution, languages, social networks and ideas (memes), and even just the mammalian brain. (One difference is that, so far, we are not trying to make general-purpose consciousness in machines; we are only trying to get them to solve well-defined problems very well, and increasingly these days, not-so-well-defined problems also.)

So, then the student asked how machines could make the leap just like we can. This led me to bring up overfitting and how to avoid it. I explained that if a machine learns the examples it is given all too well, it will not be able to see the forest for the trees: it will be overly rigid, and will want to make all novel experiences fit the examples in its training. For new examples that do not fit, it will reject them (if we build that ability into it), or it will make justifiable but wrong choices. It will ‘overfit’, in the language of machine learning.

Then it occurred to me that humans do this, too. We’ve all probably heard the argument that stereotypes are there for a reason. In my opinion, they are there because of the power of confirmation bias (not to mention, sometimes selection bias as well—consider the humorous example of the psychiatrist who believes everyone is psychotic).

Just as a machine-learning algorithm that has been presented with a set of data will learn the idiosyncrasies of that data set if not kept from overfitting by early stopping, prestructuring, or some other measure, people also overfit to their early-life experiences. However, we have one other pitfall compared to machines: We continue to experience new situations which we filter through confirmation bias to make ourselves think that we have verification of the validity of our misinformed or under-informed early notions. Confirmation bias conserves good feelings about oneself. Machines so far do not have this weakness, so they are only limited by what data we give them; they cannot filter out inconvenient data the way we do.
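The contrast between learning the examples too well and learning the pattern behind them can be sketched with a toy experiment in Python. Everything here is made up for illustration: the data (numbers labeled by whether they are at least 50, with four training labels deliberately flipped), a "memorizer" that copies the label of the nearest training example, and a rule restricted to a single cut-off, which can be read as a crude form of prestructuring.

```python
# Toy illustration of overfitting; the data and both learners are hypothetical.

def true_label(x):
    """The underlying rule the learner is supposed to discover."""
    return 1 if x >= 50 else 0

# Training set: even numbers 0..98, with four deliberately mislabeled examples.
NOISY = {10, 30, 70, 90}
train = [(x, 1 - true_label(x) if x in NOISY else true_label(x))
         for x in range(0, 100, 2)]

# Test set: odd numbers 1..99 with correct labels (never seen in training).
test = [(x, true_label(x)) for x in range(1, 100, 2)]

def memorizer(x):
    """1-nearest-neighbour: copy the label of the closest training example
    (ties go to the smaller training value)."""
    _, label = min(train, key=lambda pair: (abs(pair[0] - x), pair[0]))
    return label

def fit_threshold():
    """Pick the single cut-off t that makes the fewest training mistakes."""
    def mistakes(t):
        return sum((1 if x >= t else 0) != y for x, y in train)
    return min((x for x, _ in train), key=mistakes)

t = fit_threshold()

def threshold_rule(x):
    """Prestructured learner: only one cut-off is allowed."""
    return 1 if x >= t else 0

def count_errors(predict, data):
    return sum(predict(x) != y for x, y in data)

for name, model in [("memorizer", memorizer), ("threshold rule", threshold_rule)]:
    print(name, "-> training errors:", count_errors(model, train),
          "| test errors:", count_errors(model, test))
# memorizer -> training errors: 0 | test errors: 4
# threshold rule -> training errors: 4 | test errors: 0
```

The memorizer reproduces every training label, including the four wrong ones, and then repeats those mistakes on new examples. The threshold rule cannot represent the idiosyncrasies, so it "fails" on the four bad training labels and gets every new example right.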

Another aspect of this conversation turned out to be pertinent to what I do every day. Not learning the example set so well is advantageous not only for machines but for people as well, specifically for people who teach.

I have been teaching at the college level since January 1994, and continuously since probably 2004, and full-time since 2010, three or four quarters per year, anywhere from two to five courses per quarter. I listed all this because I need to point out, for the sake of my next argument, that I seem to be a good teacher. (I got tenured at a teaching institution that has no research requirement but very high teaching standards.) So, let’s assume that I can teach well.

I was, for the most part, not a good student. Even today, I’m not the fastest at catching on, whether it’s a joke, an insult, or a mathematical derivation. (I’m nowhere near the slowest, but I’m definitely not among the geniuses.) I think this is a big part of why I’m a good teacher: I know what it’s like not to get it, and I know what I have had to do to get it. Hence, I know how to present anything to those who don’t get it, because, chances are, I didn’t get it right away either.

But there is more to this than speed. I generate analogies like crazy, both for myself and for teaching. Unlike people who can operate solely at the abstract level, I make connections to other domains—that’s how I learn; I don’t overfit my training set. I can take it in a new direction more easily, perhaps, than many super-fast thinkers. They’re right there, at a 100% match to the training set. I wobble around the training set, and maybe even map it to n+1 dimensions when it was given in only n.

Overfitting is not only harmful to machines. In people, it causes undeserved confidence in prejudices and stereotypes, and makes us less able to relate to others or think outside the box.

One last thought engendered by my earlier conversation with this student: The majority of machine-learning applications, at least until about 2010 or maybe 2015, were for well-defined, narrow problems. What happens when machines that are capable of generalizing well from examples in one domain, and in another, and in another, achieve meta-generalization from entire domains to new ones we have not presented them with? Will they attain strong AI as a consequence of this development (after some time)? If so, will they, because they’ve never experienced the evolutionary struggle for survival, never develop the violent streak that is the bane of humankind? Or will they come to despise us puny humans?


Four simple tricks to solve many of your grammar questions without having to search online

REMOVE A WORD: “Me and Mike like ice cream.” becomes “Me like ice cream.” Apparently, that’s not it, so the original sentence should have been “I and Mike (both) like ice cream.”

ANSWER A QUESTION: “Who should I ask?” The answer could be: “You should ask hiM.” Therefore, the first sentence should have been “Whom should I ask?”—the ‘m’s match.

ASK A QUESTION: “industrial music group”: What kind of group? The industrial kind (as well as music kind)

as opposed to

“industrial-music group”: What kind of group? The industrial-music kind

CHANGE THE ORDER: The expression “music industrial group” fails in a different way than “red big house” does in comparison to “big red house”, and it also means something very different (so a hyphen was needed).

“Big red house” is both correct and proper, and a hyphen would be wrong between ‘big’ and ‘red’. The two modifiers ‘big’ and ‘red’ are independent of each other; they act separately. The house could also have been small and red, or big and green.

The other two modifiers, ‘industrial’ and ‘music’ (the latter a noun that tells what type of group) are not independent when what we mean is Einstuerzende Neubauten or Cabaret Voltaire. The opposite is true when we are talking about Roland, Yamaha, Korg, and Nord, for example.

“Verbing weirds language.”

“Verbing Weirds Language” (not so fast)

I have heard this term quite a lot lately, and it certainly gets its point across. However, as a speaker of languages from more than one family (language family, that is), I find that it’s shortsighted.

In languages that use helping verbs to make verbs out of nouns and adjectives (Japanese: suru; Turkish: etmek), there is no such problem, and, it seems, very little accompanying debate of descriptivism versus prescriptivism. For a multiculturally informed version of this aphorism, I suggest saying something along the lines of “Verbing weirds English.” (or “Verbing weirds Indo-European languages.”) because we Turks have been doing it without any weirding for quite some time, not to mention the Japanese and others—recall the infamous ‘bushusuru’.

Furthermore, verbing is not necessary. Take the new “fail” as a noun. The word ‘fail’ is a verb. The noun form is ‘failure’. I am familiar with the contention between descriptivism and prescriptivism. Descriptivists usually argue that language evolves. “It always has, so let it continue to do so.” However, we do not support this line of reasoning in other matters. Human beings necessarily form judgments and opinions based on confirmation bias, regression to the mean, and various well-documented heuristics and biases. Even those who are aware of these mistakes continue to make them much of the time. This, then, is how people behave. According to the descriptivist approach, we should let biases rule, and let informed thinking go by the wayside. But the property of being something that occurs very often as part of human behavior does not bestow a sacred status on verbing or biases. Language evolves, as it perhaps ought to, but mindful people can strive for a direction of linguistic evolution that does not reduce clarity, increase redundancy, or encourage laziness of mind. I am all for positive evolution in the English language: changes that will make it more consistent, more rational, easier to understand, less redundant, and more elegant. Many of the changes brought about by the Internet and smart phones do not have this effect. To illustrate my point, I will include many examples of how brilliantly awesome English actually is below, but first, a bit more about verbing (and nouning).

I was reading the membership conditions for a retail chain, and realized that perhaps nouning[i] bothers me more than verbing. But why should either one? English is a language that has many words that serve as a verb and a noun with no change in spelling or pronunciation (‘park’, for example). Yet, as a native speaker of a non-Indo-European language, which has its moments of logical consistency, and which uses helping verbs so that neither verbing nor nouning ever needs to happen, and further, as a native-level English speaker of 33 years[ii], I value the superior logical consistency of English, and don’t like seeing it eroded. Of course, I realize that English pronunciation and spelling are not logical or consistent (there is a wonderful demonstration here: https://www.youtube.com/watch?v=2_dc65V7DV8), but trust me, and read on: English grammar, syntax, and punctuation are superbly, surprisingly, wonderfully logical. I will start small.

It may not be worth it, because I decided to give everyone full credit. (This means it ain’t worth it.)

It may not be worth it because I decided to give everyone full credit. (This means it may be worth it for some other reason.)

From a student paper: “Humans are over populating the world.” (This seems to indicate that humans have stopped reproducing, that they are no longer interested in populating the world. What the student really meant was “humans are overpopulating the world.”)

Next, I would like to discuss hyphenation. Aniruddh Patel, in one of his talks, describes the hierarchical nature of language as follows: “If you know English, and I say the following sentence, ‘The girl who kissed the boy opened the door.’, … there is a sequence of words in that sentence: ‘the boy opened the door’ … But, if you speak English and understand English, you know it’s not the boy that opened the door; it’s the girl that opened the door. In other words, you don’t just interpret language in a left-to-right fashion. …” He goes on to explain that the phrases are hierarchically related such that ‘girl’ is linked to ‘opened’, not ‘boy’ which is right next to ‘opened’.

Hyphens help us with another hierarchical aspect of language. An arithmetic analogy will help demonstrate this. 4 + 2 × 3 = 10 because precedence tells us to multiply 2 and 3 first. In arithmetic, we can use parentheses to change the hierarchy and override precedence: (4 + 2) × 3 = 18. This is exactly what hyphens do: They group words into concepts. Notice that the hyphen, typographically speaking, is rather short. It’s shorter than any of the letters in a monospaced font. That’s because, unlike dashes, hyphens serve to combine, not separate. Dashes, which are either ‘n’ long or ‘m’ long, serve to push words apart. The en dash, for example, is used for ranges, like 9–5 and Seattle–Atlanta. But let’s get back to hyphens.

Hyphens join two words into one concept, as in two-car garage, one-man band, and land-grant university. ‘Two’ and ‘car’ started life as separate concepts. In ‘two-car garage’, they are combined into a single concept, just as (4 + 2) got combined into a single number, 6. Again, just as 6 acted as one entire and single number on 3 in that multiplication above, ‘two-car’ acts as a single concept of garage size, when neither ‘two’ nor ‘car’ ordinarily signifies size or width.
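The analogy can be made concrete with a few lines of Python. This is only a rough sketch: the function name is invented, and it assumes a simple noun phrase in which the last word is the noun and every space-separated token before it is one modifier, so a hyphenated token stays together as a single compound modifier.

```python
# Arithmetic: parentheses override precedence.
print(4 + 2 * 3)     # 10
print((4 + 2) * 3)   # 18

def parse_noun_phrase(phrase):
    """Split a simple noun phrase into (modifiers, head noun).

    Each space-separated token before the last word counts as one modifier;
    a hyphenated token is kept whole, the way parentheses keep (4 + 2) whole.
    """
    *modifiers, head = phrase.split()
    return modifiers, head

# Hyphenated: one compound modifier acting on 'garage'.
print(parse_noun_phrase("two-car garage"))   # (['two-car'], 'garage')

# Unhyphenated: two separate modifiers, each acting on 'garage' by itself.
print(parse_noun_phrase("two car garage"))   # (['two', 'car'], 'garage')
```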

A friend once asked (on Facebook): [. . .] purple people eaters: Are they purple people who eat people or people who eat purple people? While this may have been posted in jest, the logic applies to serious cases where the intended meaning matters. I responded: “Purple-people eaters eat people who are purple, while purple people eaters are purple in color themselves. In English, the modifiers gang up on the noun at the end unless you hyphenate.” I followed this up with two examples. “A community college association is a group of college-related people from the community whereas a community-college association is a group of colleges. In other languages, nouns have cases, so you don’t have these problems.” (Just as verbing doesn’t weird all languages, this type of problem also completely fails to occur in Turkish because nouns in noun strings get modified with suffixes that place them in their proper cases, doing the job of hyphens in English. The difficulty for native speakers of English is that they grow up speaking English, wherein it’s difficult to hear the sound of the hyphen. In Turkish, there is no mistaking the suffix; you hear it from the time you’re a little baby.)

My second, and perhaps better, example was the following. “The phrase ‘lake of fire Christians’ implies a lake consisting of ‘fire Christians’, whatever that would mean. What is typically meant is ‘lake-of-fire Christians’. English is ‘endian’ unless you override that with hyphens. In spoken language, we use inflection to make these things clear (which, again, is why many native speakers have a harder time than some ESL-speakers).”

The subject line of an e-mail I received said, “Security for the Cloud Lunch in Portland”

This seems to indicate that someone is setting up security for a lunch event in Portland. What they meant, of course, was “Security-for-the-Cloud Lunch in Portland.” This type of mistake can very easily be avoided by rejecting the temptation to produce noun strings, and using the three little powerful words that make English work: ‘of’, ‘for’, and ‘from’. Calling it “Lunch Meeting about the Security of the Cloud, to take place in Portland” would remove all ambiguity.

Likewise, on the back of a 45-RPM record I recently bought[iii], the recording location is identified as “the crazy cat lady house” (another noun chain). It is clear, of course, that what is intended is “the crazy-cat-lady house” where “crazy cat lady” is a compound modifier, a unified concept and single descriptor for the house. Without proper hyphenation, the meaning is open to interpretations such as “the crazy-cat lady-house” (with the last hyphen not strictly necessary, but placed because this blog post is written, not spoken).

My first car was used, so I was a new car owner. I recently bought my third car, and it was brand new, so I am now a new-car owner (but not a new car owner).

The Coursera privacy policy states, “If you participate in an online course, we may collect from you certain student-generated content, such as assignments you submit to instructors, peer-graded assignments and peer grading student feedback.”

Note that the first (correct) expression “peer-graded assignments” and the later (incorrect) expression “peer grading student feedback” ought to be based on the same reasoning, so how could they end up different? It seems, based not only on this example, that people use other mental processes, not reasoning, for determining whether to hyphenate or not. These processes could be memorization, template-matching, or aesthetics.

There is support for this possibility. American English, as opposed to British English, is template-based in its treatment of punctuation with respect to quotation marks: Punctuation marks always have to go inside the quotation marks, unless they’re large characters like ‘?’. This is a purely aesthetic choice, and is not logical. The IT industry has, in recent years, protested this and switched to logical/British punctuation. I saw this reflected in Microsoft Word grammar-correction recommendations as early as 2012. (Way to go, Microsoft!)

“An Early Bird Sound Collage” is the title of a work by an experimental-music band. Do they mean it’s an early bird-sound collage, or an early-bird sound collage? (They are experimental, so the intended meaning could easily have been either.)

“Introducing the Möbius-Twisted Turk’s Head Knot” is a paper title from the Bridges 2015 conference. They got the first one right. Now, is it a twisted Turk and his head knot, or is it the Turk’s-head knot that’s twisted?

Compare the following. “Small plane crash” where we don’t know how big a plane it was, but the crash was a minor concern, and “small-plane crash” where we know that the plane was small, but the crash could still have been quite a big deal. In most cases, it is better not to be stingy with words; something like “a big crash involving a small plane” would be much clearer.

My next example is from course packs. In one case, one might be able to get a refund: “All packets are not refundable.” (unclear) vs. “All packets are non-refundable.” (quite clear!)

Here are some more examples from academia. There is a big difference between the “higher-ed budget” and a “higher ed budget” (we all want the latter). An “online learning report” is a report that gets posted online, while an “online-learning report” does not have to be posted, but it is about online learning. How about “main session outline” versus “main-session outline”?

What is the opposite of the right-hand rule? And what is the opposite of the right hand rule?[iv] The opposite of the former would be the left-hand rule, whereas the opposite of the latter would be the wrong hand rule.

Compare the expressions “proof of concept viruses for Linux” with “proof-of-concept viruses for Linux.” Again, from the tech fields: “no load gain” could be the opposite of “no-load gain”!

Even if it’s a stretch, one of the following could be about athletic performance whereas the other is clearly about academic performance: “college grade-point average” and “college-grade point average”

Here’s one from the field I teach in: Which technical term does not limit the model size: “small-signal model” or “small signal model”?

“Portland’s first clean air cab” was meant to indicate a regular car that doesn’t pollute, but is written like a flying car that is not dirty.

I saw this on the web as well: THE HAITIAN TERRACING FOR HOPE PROJECT. One wonders if Hope Project will get some Haitian terracing, or if there is a Haitian project called ‘Terracing for Hope’. Since hyphenation cannot be imposed on proper nouns (such as the official name of a project), using one of the magic little words or changing word order could have helped this case: “A Project in Haiti: Terracing for Hope” or “The Haitian Project of Terracing for Hope” or “Terracing for Hope, a Haitian Project,” etc.

And how about all this free stuff we’re being sold all the time? This was seen on a billboard: “NEW TRANS-FAT FREE” with the word “free” on a separate line. It appears to imply that the new product has trans-fat, and is free. Likewise with all these products that sport the expression “gluten free”: apparently, there is gluten in them, but we’re not paying for the gluten.

And then, there is verb hyphenation, which confuses many people I know even more. Verbs are not hyphenated when used as verbs. However, when non-verbs are used with verbs as a compound verb, they do get hyphenated (and this is common sense). For example, “how to fly-fish” is very different from “how to fly fish.” In the latter, one tries to make fish fly. Likewise, “moonbathing” is about a person enjoying moonlight, whereas an expression like “moon bathing” suggests it’s the moon doing the bathing (assuming the rest of a proper sentence surrounds that expression). Similarly, “battle ready” (the battle is ready?) is very different from “battle-ready” (a compound modifier that shows that some person, equipment, or army is ready for battle).

Even after verbing turns a new word like ‘blog’ into a verb as well as a noun, combining it with the noun/adjective ‘video’ requires hyphenation: “learning to video blog” makes no sense, while “learning to video-blog” does.

Alright, perhaps we do not need to be so vigilant about compound modifiers all the time. Here is one I have seen where even I have to admit that context and common sense are quite sufficient to know what is meant even in the absence of a compulsively placed hyphen. It is “sexual abuse hysteria.” Perhaps, no hyphen is needed when an adjective becomes an adverb. I think this one is clear without the need for a hyphen.

Before leaving hyphenation, I must address adverbs. Adverbs are not hyphenated (although many well-meaning and thoughtful individuals do hyphenate them).

You only need hyphenation when the target of a modifier is ambiguous, which is why adverbs are not entered into hyphenated compounds. Recall that in one of the examples above, ‘purple’ had the option of referring to the people being eaten or to the creature doing the eating, so we had to specify which by knowing when to and when not to use a hyphen. In the case of adverbs, as in “culturally sensitive employer,” for instance, there is no question about which word ‘culturally’ is attached to; there is no such thing as a “culturally employer”; so there is no ambiguity, and no need to waste time with hyphens.

Commas are another matter disproportionately consequential in comparison to the size of the punctuation mark involved. The following examples come from a variety of sources, but mostly from a delightfully brilliant book, to which I was introduced[v] during University Studies teacher training at Portland State University: Maxwell Nurnberg’s Questions You Always Wanted to Ask about English.

  1. Which statement clearly shows that not all bacteria are sphere-shaped?
     a) Christian A. T. Billroth called bacteria which had the shape of tiny spheres ‘cocci’. (In this case, it is implied that only some bacteria are spherical.)
     b) Christian A. T. Billroth called bacteria, which had the shape of tiny spheres, ‘cocci’. (In this case, it is implied that all bacteria are spherical.)
  2. Which sentence shows extraordinary powers of persuasion?
     a) I left him convinced he was a fool. (He is convinced, not I.)
     b) I left him, convinced he was a fool. (I am convinced.)
  3. Which is the dedication of a self-confessed polygamist?
     a) I dedicate this book to my wife, Edith, for telling me what to leave out. (In this case, he has one wife, whose name is Edith.)
     b) I dedicate this book to my wife Edith for telling me what to leave out. (In this case, we are led to believe he has at least one wife other than Edith. If it’s not clear why this is the case, see # 5 or # 6 below. The comma starts an explanation of who is being referred to.)
  4. In which sentence are you sure that “somatic” and “bodily” mean the same?
     a) Radioactive materials that cause somatic, or bodily, damage are to be limited in their use. (In this case, ‘bodily’ is offered as a more familiar synonym for ‘somatic’.)
     b) Radioactive materials that cause somatic or bodily damage are to be limited in their use. (In this case, the implication is that ‘somatic’ and ‘bodily’ are mutually exclusive, hence mean different things.)

Nurnberg’s examples were my introduction to the power of the comma. They went beyond the boilerplate rules I had been taught, like “Never place a comma before ‘because’!” and “Always put a comma before ‘too’!”

Soon, I was noticing commas where they should not have been, and a lack of commas where they were badly needed.

I read the following at http://www.riskshield.com.au/Glossary.aspx [2]: “Technique, procedure and rule used by risk manager to identify asses and examine the risks.” The missing comma could really have helped with the change in meaning caused by the missing ‘s’ in ‘assess’. On the other hand, perhaps the risk manager’s job really is to identify asses. If so, this is one particularly frank document.

In the book MIDI Systems and Control [3], I came across the following, “RS422 … is a standard for balanced communications over long lines devised by the EIA (Electronics Industries Association)” (p. 23). Without a comma right after ‘long lines’, the sentence is open to the interpretation that it was long lines that were devised by the EIA, as opposed to RS422.

Here is some correct comma use (as one would expect from Stanford University). In The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Hastie, Tibshirani, and Friedman, there is a sentence “. . . in a study to try to predict whether the email was junk email, or ‘spam’” (p. 2). The comma is used, appropriately, in its explanation-signifying role. Later on, however, the authors say “In the handwritten digit example the output is one of 10 different digit classes . . .” (p. 9). This clearly needs a hyphen connecting ‘handwritten’ and ‘digit’. Currently (without the hyphen), it is the example that is handwritten, not the digits. This difference could be meaningful if one were referring to a solutions manual, for example, where examples are often handwritten.

Here’s an example I must have gotten from someone else or a book: “King Charles walked and talked; half an hour after, his head was cut off.” Let’s try it without the comma and the semicolon: “King Charles walked and talked half an hour after his head was cut off.” This is reminiscent of the English-class exercise that was making the rounds on Facebook at one point: “A woman without her man is nothing.” which can be punctuated either as “A woman: without her, man is nothing.” or as “A woman, without her man, is nothing.” [4]

Grammarly also posted this headline about Peter Ustinov’s travels: “Highlights of his global tour include encounters with Nelson Mandela, an 800-year-old demigod and a dildo collector.” The absence of the Oxford comma turns what was meant as a list of three entities into one entity (Mandela) and a description of him (an 800-year-old demigod who collects dildos). As I mention elsewhere, the structure of other languages (such as Turkish) may be such that the Oxford comma is useless, but it clearly makes a difference to the meaning of a sentence in English.

Here’s an instance from an e-mail I once wrote. I had asked my friend, “Did I go too far into unnecessary details by way of explaining what I’m doing and why?” This question addresses my explanation of what I was doing and why I was doing it. It is different from “Did I go too far into unnecessary details by way of explaining what I’m doing, and why?” which asks why my explanation is considered to be excessive.

Nurnberg does a much better job of revealing the power and importance of the comma in his book than my examples here do. I think everyone who writes in English should own a copy and read it.

I also want to address redundancy a little. Someone once said this to me during a conversation: “. . . considering they’re both not at the same time. . .” This would make sense if the two events spoken of were not coincident with a third event, but, in this case, there was no third event. All that was meant was “considering they’re not at the same time . . .”

I also frequently hear “continue on …” and “return back …” (This is frustrating.) To continue is to go on. Therefore, to continue on becomes ‘to go on on’. (We’re all familiar with “ATM machine” and “PIN number”…)

Redundancy is necessary when life-threatening situations are handled by electronic or electro-mechanical systems. Let’s leave the redundancy to those cases, and stop wasting our breath with it.

And then there is the centipede sentence: “I’d take MLK Boulevard would be my suggestion.”

My boss at my old job, fifteen years ago, was amazing at these. I wonder if he composed any regular sentences; they all seemed to be along the lines of “The reason is is that there is an address conflict was what happened.” (Otherwise, he was clearly a genius. I don’t know why he talked like that.)

I have now firmly established myself, in this post, as the worst possible killjoy nerd geek compulsive so-called “grammar N**i” ever, so let me close on a positive note.

I wrote this ridiculously long post because I love the English language, and I want people, especially its native speakers, to treat it well. I also love Turkish, Portuguese, German, and Japanese, and if you have ever read engineering material written in English by Japanese engineers, you know that the structure of non-Indo-European languages must be very different. The agglutinative use of cases in Turkish makes most of the discussion of this post unnecessary, but there is one thing Spanish, Portuguese, English, etc. have that I’m quite envious of: THE SUBJUNCTIIIIIVE (Cue dark, scary music.)

The subjunctive, which is still going strong in Spanish, but has only a few surviving occasions of use in English, draws a distinction between fact and possibility, or between wishes and truths: “if he were” makes it sound wrong to indicate something hypothetical as being actual. (Note how it does not go “if he was” but switches to the awkward subjunctive, the non-reality case.) This is scientific thinking built into the language. The speaker is required to differentiate between factual cases and wishing or wondering. (If only we could get the health-care industry to differentiate between factual evidence- and mechanism-based care and wishful-thinking-based care.)

In conclusion: Languages are awesome. Let’s not stand by and watch them get eroded into redundancy and lack of clarity by mental (and technologically aided) sloth. If languages change, fine; let them change well, preserving the characteristics that allow humans to communicate with precision and subtlety. Good communication saves lives.

And what is the deal with “the both” in 2016? The expression ‘both of’ is intended for a different meaning than the expression ‘the two’. There was no reason to make a hybrid that goes ‘the both’. Please, everyone, stop saying this.

“The two of them went away, but we stayed.” (There were more than two people involved.)

“We both went away.” (There were only two people involved.)

English has set up this great way to incorporate set theory into the language. Why are we messing it up? Consider the following:

“Are you ready to compare both files?” (I would be, if I wanted to compare two files each with a third. However, if the comparison were simply between two files, it’s “Are you ready to compare the two files?”)

PS: I need to get this one off my chest, too: “Computation methods” would be methods of computation, whereas “computational methods” would be methods that make use of computation. And don’t get me started on ‘methodologies’. How many people actually study methods? (I’m glad to have witnessed this being brought up at a discussion during the AAWM 2016 conference.)


[1] Nurnberg, M., Questions You Always Wanted to Ask about English (but were afraid to raise your hand), New York: Washington Square Press, 1972.

[2] http://www.riskshield.com.au/Glossary.aspx (not there anymore)

[3] Rumsey, F., MIDI Systems and Control (Second Edition), Oxford, UK: Focal Press, 1994.

[4] facebook.com/Grammarly

[i] “12-month spend of $500” it said. At least it’s hyphenated correctly, but what was wrong with the noun ‘spending’ that we need a new noun to replace it?

[ii] People who know me can attest to this.

[iii] From the band NASALROD

[iv] Again, altering the word order could clarify such cases: After all, we never say “thumb rule” instead of “rule of thumb”; we always take the time to say “rule of thumb”!

[v] Like most of the important things in life

How to Reason in Circuit Analysis

The following conversation played out in my head as I was grading an exam problem that had a supernode composed of two neighboring supernodes. Many students (in introductory circuit analysis) had difficulties with this problem, so here’s what I plan to present when I explain it.


Q: What is the main type of equation involved in performing nodal analysis?

A: KCL equation

Q: What electrical quantity is represented in each term of a KCL equation?

A: current

Q: Are there any elements for which, if the current is not stated, we do not have a way (a defining equation[1]) to know and express the current?

A: yes

Q: What are these elements?

A: voltage sources of any type[2] that are directly between two non-reference essential nodes (NRENs)

Q: Why is that a problem?

A: There is no defining equation (like Ohm’s Law) for a source, and if it’s directly between two NRENs, then there is no other element in series with it.

Q: So what if there is no other element in series with it?

A: If there were a resistor in series with it, we could use Ohm’s Law on the resistor.

Q: Why not use Ohm’s Law on the source?

A: Ohm’s Law does not apply to sources, does not deal with sources; it’s only for resistors[3].

Q: Fine… What’s with the non-reference thing?

A: If a voltage source (of any kind) has one terminal attached to the reference node (ground), then we automatically know the voltage at the other end (with respect to ground).


Conclusion: If there is a voltage source between two NRENs, circle it to make a (super)node, and write KCL out of that node, without going inside it (until later, when you need another equation, at which point you use KVL).
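
As a minimal sketch of this conclusion, consider a hypothetical circuit (not the exam problem): a voltage source Vs sits directly between two NRENs, node 1 and node 2, with its positive terminal at node 1. Resistor R1 connects node 1 to ground, R2 connects node 2 to ground, and a current source Is drives current into node 1. The topology, the polarity, the element values, and the function name are all assumptions made for illustration. KCL out of the supernode gives one equation, and KVL inside it supplies the second:

```python
def solve_supernode(r1, r2, i_s, v_s):
    """Return (v1, v2) for the hypothetical two-node supernode circuit.

    KCL out of the supernode:  v1/r1 + v2/r2 = i_s
    KVL inside the supernode:  v1 - v2       = v_s
    """
    # Coefficients of the 2x2 linear system in the unknowns v1 and v2
    a11, a12, b1 = 1.0 / r1, 1.0 / r2, i_s   # KCL row
    a21, a22, b2 = 1.0, -1.0, v_s            # KVL row
    det = a11 * a22 - a12 * a21              # solve by Cramer's rule
    v1 = (b1 * a22 - a12 * b2) / det
    v2 = (a11 * b2 - b1 * a21) / det
    return v1, v2


# Assumed values: R1 = 2 ohms, R2 = 4 ohms, Is = 6 A, Vs = 6 V
v1, v2 = solve_supernode(r1=2.0, r2=4.0, i_s=6.0, v_s=6.0)
print(v1, v2)
```

Note that the current through the voltage source never appears in either equation, which is exactly why the supernode is circled rather than entered.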


[1] A defining equation is an expression that relates the current through a two-terminal element to the voltage across that same element by means of the inertial aspect of the element (its capacitance, resistance, inductance, and I suppose, pretty soon, its memristance) and the passive sign convention (PSC).

[2] i.e., independent voltage source, current-dependent voltage source, voltage-dependent voltage source: It’s about the voltage aspect, not about the dependence aspect.

[3] two-terminal elements with a linear current–voltage relationship; note the en dash : )

Science, Clave, and Understanding

When Dr. Eben Alexander defended, in one of the major news magazines, his book (“proof[i]”) [1] about a spiritual non-physical afterlife realm, part of his argument was that he is a surgeon, and therefore a scientist. Surgeons are highly trained, highly specialized people who perform a very difficult and critically important service. It would be absurd not to recognize their value. Their work is without a doubt science-based, but does that make it “science”? (There are, of course, surgeons who publish scholarly work (although I’ve noticed that, in some cases, it’s not about surgery but about fields as distant as music), and thus function as scholars, and therefore scientists.)

A scientist is not anyone who functions as a professional practitioner of a difficult and science-based field; a scientist is someone who sets up, tests, and evaluates (mostly via statistical data analysis) testable hypotheses (about anything, including the afterlife and spiritual realms, if necessary), and more importantly, does so within the guidelines of rigor, accuracy, objectivity, skepticism, and open-mindedness [2][ii]. It is worrisome to imagine that surgeons are setting up double-blinded clinical trials of surgical practices as part of their work, choosing to apply a known good technique on one patient and an as-yet-unsupported one on another patient. (In other words, I really hope surgeons do not act as scientists.) Maybe they do; I’d like to know, so please give me feedback on this question.

Assuming, though, that they don’t endanger patients’ lives for the sake of science, as we tend not to do anymore, it seems safe to assume that surgeons are highly trained specialists who practice state-of-the-art medicine. In this sense, they are not scientists. They use the findings and results of science in their practical, applied work (medicine). They must, then, fall somewhere between applied scientists and technologists (inclusive).

To say that someone who practices a specialty that is based on scientific findings is therefore a scientist is like saying a sandwich-shop employee is a farmer because they use bacon, lettuce, and tomato in their work. (The fact that surgery is far more specialized does not invalidate the argument.)

The professions that discover, invent, develop, and apply are all different. The roles can overlap; scientists do develop and build new equipment to perform their experiments, but these are not mass-produced. Anything we can purchase repeatedly on Amazon or at Best Buy, say, was not made by scientists. It was designed, developed, tested, and manufactured by engineers, technologists, technicians, and other professionals, not by scientists, even if scientists were involved in the early stages. As for applied scientists, including those who work at laboratories, characterizing soil samples, say, or performing tests, they are also highly trained specialists of scientific background who are not doing science at that point. As one XKCD comic suggested [3], you can simply order a lab coat from a catalog; no one will check your publication record. Science is not solely about what you’re wearing or what degrees you have; it’s about what, exactly, you’re doing.

The public’s idea of what science is seems to be “mathy and difficult, preferably done in a lab coat while uttering multisyllabic words you don’t want to see in your cereal’s list of ingredients.” This may be a decent shortcut for pop-culture purposes, but it is not what science really is. I will not go into the inductive-method-vs-hypothetico-deductive-method-vs-what-have-you debate here because there are people who do that professionally, and do it very well. (I have been enjoying Salmon’s The Foundations of Scientific Inference [4] immensely.) What I do want to do is draw two parallels in succession, first from the preceding discussion to explanation and understanding, and from those concepts, to explanations and understanding of clave (in music).

The former has been done quite successfully in Paul Dirac-medal-and-prize-winning physicist Deutsch’s earlier book The Fabric of Reality [5]. I am not concerned here with the bulk and main point of his book, but only with his opening argument about the role of science (explanation) and what it means to understand. Deutsch criticizes instrumentalism because of its emphasis on prediction at the cost of explanation (pp. 3–7). He gives rather good examples of situations in which no scientist (or layperson, for that matter) would be satisfied with good predictions without explanations (p. 6, for example). He does not deny the role and importance of predictions, but argues that “[t]o say that prediction is the purpose of scientific theory is [. . .] like saying that the purpose of a spaceship is to burn fuel” (similar to another author’s argument that the purpose of a car is not to make vrooom–vrooom noises; cars just happen to do that as part of their operation[iii]). Deutsch states that just as spaceships have to burn fuel to do what they’re really meant to do, theories have to pass experimental tests in order “to achieve the real purpose of science, which is to explain the world.” (Think about it: Why did we all, as children, get excited about science? To understand the world!)

He then moves on to explain that theories with greater explanatory power than the ones they’ve replaced are not necessarily more difficult to understand, and certainly do not necessarily add to the list of theories one has to understand to be a scientist (or an enthusiast). Theories with better explanatory power can be simpler. Furthermore, not everything that could be learned and understood needs to be: See his example of multiplication with Roman numerals (pp. 9–10). It might be fun, and occasionally necessary to have some source in which to look it up (for purposes of the history of mathematics, say), but it’s not something anyone today needs a working knowledge of; it has been superseded. His example for this is how the Copernican system superseded the Ptolemaic system, and made astronomy simpler in the process (p. 9). All of this is discussed in order to make the point that there is a distinction between “understanding and ‘mere’ knowing” (p. 10), which is where my interest in clave comes into play.

Several “explanations” of clave (sometimes even with that word in the title) that were published in recent years have been of the “mere knowing” type in which clave patterns are listed, without any explanation as to how and why they indicate what other patterns are allowed or disallowed in the idiom. Telling someone that x..x..x...x.x... is 3-2, and ..x.x...x..x..x. is its opposite, so 2-3, and (essentially) “there you go, you now know clave” does nothing towards explaining why a certain piano pattern played over one is “sick” (good) and over the other, sickening (bad) within the idiom.
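
The “mere knowing” relationship between those two listed patterns can at least be made concrete. The following is a small sketch, with hypothetical function names, in which each pattern is written as a 16-step string with one character per step (‘x’ for a stroke, ‘.’ for a rest). It shows only that 2-3 is 3-2 with its two halves swapped; it says nothing about why a given piano pattern works over one and not the other, which is precisely the missing explanation.

```python
SON_3_2 = "x..x..x...x.x..."  # 16 steps; 'x' = stroke, '.' = rest


def swap_halves(pattern):
    """Exchange the first and second halves of a step pattern."""
    half = len(pattern) // 2
    return pattern[half:] + pattern[:half]


def onsets(pattern):
    """Return the step indices at which a stroke falls."""
    return [i for i, step in enumerate(pattern) if step == "x"]


son_2_3 = swap_halves(SON_3_2)
print(son_2_3)                            # the 2-3 form of the same pattern
print(onsets(SON_3_2), onsets(son_2_3))   # stroke positions in each form
```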

Imagine if the natural sciences went about education the way we musicians do with clave. A chapter in a high-school biology book would contain a diagram of the Krebs cycle, with all the inputs, outputs (sorry for the electrical-engineer language), and enzymes given by name and formula, followed by “and now you know biochemical pathways,” without any explanation as to how it has anything to do with an organism being alive. I’m flabbergasted that musicians and music scholars find mere listings of clave son, clave rumba, [and . . . you know, the other one that won’t be named[iv]] sufficient as so-called explanations[v].

All of this reminds me of an argument I once had with a very intelligent person. I had said, in my talk at Tuesday Talks, that science is concerned with ‘why’ and ‘how’, not just ‘how’. He disagreed, which I think is because he thought of a different type of ‘why’: the theological ‘why’. I, instead, had in mind Deutsch’s type of ‘why’: “about what must be so, rather than what merely happens to be so; about laws of nature rather than rules of thumb” (p. 11). I would add, about consistency (even given Goedel, because I’m Bayesian like that, and not so solipsistic), which Deutsch mentions immediately afterwards, calling it ‘coherence’.[vi]

I understand that Hume, Goedel, and others have shown us that our confidence in science, or even math, ought not to be infinite. It isn’t. Even in a book like The God Delusion, Richard Dawkins makes it clear that he is not absolutely certain. Scientific honesty requires that we not be absolutely certain. But we can examine degrees of (un)certainty, and specifically because of the solipsists, we have to ignore them[vii], and be imperfect pursuers of an imperfect truth, improving our understanding, all the while knowing that it could all be wrong.

To that end, I continue to test my clave hypothesis under different genres. Even if it’s wrong, it definitely is elegant.

[1] Alexander, M.D., E., Proof of Heaven: A Neurosurgeon’s Near-Death Experience and Journey into the Afterlife, Simon & Schuster, 2012.

[2] Baron, R. A., and Kalsher, M. J., Essentials of Psychology, Needham, MA: Allyn & Bacon, A Pearson Education Company, 2002.

[3] http://xkcd.com/699/ (last accessed 12/25/2015).

[4] Salmon, W. C., The Foundations of Scientific Inference, Pittsburgh, Pennsylvania: University of Pittsburgh Press, 1966.

[5] Deutsch, D., The Fabric of Reality: A leading scientist interweaves evolution, theoretical physics, and computer science to offer a new understanding of reality, New York: Penguin Books, 1997.

[i] Scientists do not speak of proof; they deal with evidence. Proofs are limited to the realm of mathematics. There are no scientific proofs; there are just statistically significant results, which are presented to laypersons as ‘proof’ because even scientists have quite a lot of difficulty interpreting measures of statistical significance, and the average person has no patience for or interest in the details of philosophy of science.

[ii] The authors of [2] give the following excellent definitions for these precise terms. Accuracy: “gathering and evaluating information in as careful, precise, and error-free a manner as possible”; objectivity: “obtaining and evaluating such information in a manner as free from bias as possible” [Ibid.]. ‘Bias’ in this case refers to the cognitive biases that are natural to human thinking and judgment, such as confirmation bias, Hawthorne effect[ii], selection bias, etc.; skepticism: the willingness to accept findings “only after they have been verified over and over”; and open-mindedness: not resisting changing one’s own views—even those that are strongly held—in the face of evidence that they are inaccurate [2]. To these we can add principles like transferability and falsifiability, and the key tools of double-blinding, randomization, blocking, and the like. Together, all these techniques and principles constitute science. Simply being trained in science and carrying out science-based work is not sufficient.

[iii] I think it was Philips in The Undercover Philosopher, but I’m not sure.

[iv] If you’ve read my post about running into cool people from SoundCloud at NIPS ’15, you’ll know what pattern I’m talking about: the English-horn-like-named pattern.

[v] Fortunately, we do have work from the likes of Mauleón and Lehmann that shows causal relationships between individual notes or phrases in different instrumental lines, but since their work and mine, the trend has reverted to listing three patterns, and calling that an explanation.

[vi] Perhaps this paragraph needs its own blog post. . .

[vii] Because, according to them, they don’t exist.