Deep learning set off the latest AI revolution, transforming computer vision and the field as a whole. Hinton believes deep learning should be almost all that’s needed to fully replicate human intelligence.
But despite rapid progress, there are still major challenges. Expose a neural net to an unfamiliar data set or a foreign environment, and it reveals itself to be brittle and inflexible. Self-driving cars and essay-writing language generators impress, but things can go awry. AI visual systems can be easily confused: a coffee mug recognized from the side would be an unknown from above if the system had not been trained on that view; and with the manipulation of a few pixels, a panda can be mistaken for an ostrich, or even a school bus.
GLOM addresses two of the most difficult problems for visual perception systems: understanding a whole scene in terms of objects and their natural parts; and recognizing objects when seen from a new viewpoint. (GLOM’s focus is on vision, but Hinton expects the idea could be applied to language as well.)
An object such as Hinton’s face, for instance, is made up of his lively if dog-tired eyes (too many people asking questions; too little sleep), his mouth and ears, and a prominent nose, all topped by a not-too-untidy tousle of mostly gray. And given his nose, he is easily recognized even on first sight in profile view.
Both of these factors—the part-whole relationship and the viewpoint—are, from Hinton’s perspective, crucial to how humans do vision. “If GLOM ever works,” he says, “it’s going to do perception in a way that’s much more human-like than current neural nets.”
Grouping parts into wholes, however, can be a hard problem for computers, since parts are sometimes ambiguous. A circle could be an eye, or a doughnut, or a wheel. As Hinton explains it, the first generation of AI vision systems tried to recognize objects by relying mostly on the geometry of the part-whole-relationship—the spatial orientation among the parts and between the parts and the whole. The second generation instead relied mostly on deep learning—letting the neural net train on large amounts of data. With GLOM, Hinton combines the best aspects of both approaches.
“There’s a certain intellectual humility that I like about it,” says Gary Marcus, founder and CEO of Robust.AI and a well-known critic of the heavy reliance on deep learning. Marcus admires Hinton’s willingness to challenge something that brought him fame, to admit it’s not quite working. “It’s brave,” he says. “And it’s a great corrective to say, ‘I’m trying to think outside the box.’”
The GLOM architecture
In crafting GLOM, Hinton tried to model some of the mental shortcuts—intuitive strategies, or heuristics—that people use in making sense of the world. “GLOM, and indeed much of Geoff’s work, is about looking at heuristics that people seem to have, building neural nets that could themselves have those heuristics, and then showing that the nets do better at vision as a result,” says Nick Frosst, a computer scientist at a language startup in Toronto who worked with Hinton at Google Brain.
With visual perception, one strategy is to parse parts of an object—such as different facial features—and thereby understand the whole. If you see a certain nose, you might recognize it as part of Hinton’s face; it’s a part-whole hierarchy. To build a better vision system, Hinton says, “I have a strong intuition that we need to use part-whole hierarchies.” Human brains understand this part-whole composition by creating what’s called a “parse tree”—a branching diagram demonstrating the hierarchical relationship between the whole, its parts and subparts. The face itself is at the top of the tree, and the component eyes, nose, ears, and mouth form the branches below.
One of Hinton’s main goals with GLOM is to replicate the parse tree in a neural net—this would distinguish it from neural nets that came before. For technical reasons, it’s hard to do. “It’s difficult because each individual image would be parsed by a person into a unique parse tree, so we would want a neural net to do the same,” says Frosst. “It’s hard to get something with a static architecture—a neural net—to take on a new structure—a parse tree—for each new image it sees.” Hinton has made various attempts. GLOM is a major revision of his previous attempt in 2017, combined with other related advances in the field.
“I’m part of a nose!”
A generalized way of thinking about the GLOM architecture is as follows: The image of interest (say, a photograph of Hinton’s face) is divided into a grid. Each region of the grid is a “location” on the image—one location might contain the iris of an eye, while another might contain the tip of his nose. For each location in the net there are about five layers, or levels. And level by level, the system makes a prediction, with a vector representing the content or information. At a level near the bottom, the vector representing the tip-of-the-nose location might predict: “I’m part of a nose!” And at the next level up, in building a more coherent representation of what it’s seeing, the vector might predict: “I’m part of a face at side-angle view!”
But then the question is, do neighboring vectors at the same level agree? When in agreement, vectors point in the same direction, toward the same conclusion: “Yes, we both belong to the same nose.” Or, further up the parse tree: “Yes, we both belong to the same face.”
Seeking consensus about the nature of an object (ultimately, about what precisely the object is), GLOM’s vectors iteratively average, location by location and level by level, with neighboring vectors at the same level as well as with predicted vectors from the levels above and below.
However, the net doesn’t “willy-nilly average” with just anything nearby, says Hinton. It averages selectively, with neighboring predictions that display similarities. “This is kind of well-known in America, this is called an echo chamber,” he says. “What you do is you only accept opinions from people who already agree with you; and then what happens is that you get an echo chamber where a whole bunch of people have exactly the same opinion. GLOM actually uses that in a constructive way.” The analogous phenomenon in Hinton’s system is those “islands of agreement.”
“Imagine a bunch of people in a room, shouting slight variations of the same idea,” says Frosst—or imagine those people as vectors pointing in slight variations of the same direction. “They would, after a while, converge on the one idea, and they would all feel it stronger, because they had it confirmed by the other people around them.” That’s how GLOM’s vectors reinforce and amplify their collective predictions about an image.
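The selective averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hinton's implementation: it handles a single level, ignores the predictions coming from the levels above and below, treats every location as a potential neighbor, and uses cosine similarity with a hypothetical `temperature` parameter to decide how much weight each neighbor gets. The names `consensus_step` and `run_consensus` are invented for this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def consensus_step(vectors, temperature=0.1):
    """One round of similarity-weighted averaging at a single level.

    Each vector moves toward the others in proportion to how much it
    already agrees with them, so dissimilar vectors barely influence it.
    """
    updated = []
    for v in vectors:
        # Assumed weighting rule: exp(similarity / temperature).
        weights = [math.exp(cosine(v, other) / temperature) for other in vectors]
        total = sum(weights)
        updated.append([
            sum(w * other[d] for w, other in zip(weights, vectors)) / total
            for d in range(len(v))
        ])
    return updated


def run_consensus(vectors, steps=10, temperature=0.1):
    """Repeat the averaging step a fixed number of times."""
    for _ in range(steps):
        vectors = consensus_step(vectors, temperature)
    return vectors


# Five toy locations: three lean one way ("nose"), two lean another ("mouth").
embeddings = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.0, 1.0], [0.1, 0.9]]
settled = run_consensus(embeddings)
```

After a few rounds the first three vectors point in nearly the same direction, as do the last two, while the two groups stay far apart: two small echo chambers rather than one blurred average.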
GLOM uses these islands of agreeing vectors to accomplish the trick of representing a parse tree in a neural net. Whereas some recent neural nets use agreement among vectors for activation, GLOM uses agreement for representation—building up representations of things within the net. For instance, when several vectors agree that they all represent part of the nose, their small cluster of agreement collectively represents the nose in the net’s parse tree for the face. Another smallish cluster of agreeing vectors might represent the mouth in the parse tree; and the big cluster at the top of the tree would represent the emergent conclusion that the image as a whole is Hinton’s face. “The way the parse tree is represented here,” Hinton explains, “is that at the object level you have a big island; the parts of the object are smaller islands; the subparts are even smaller islands, and so on.”
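To make the island picture concrete, here is a toy sketch that reads islands off a single row of locations at two levels. It assumes, purely for illustration, that an island is a run of adjacent locations whose vectors have cosine similarity above a hypothetical `threshold`; the function `find_islands` and the example vectors are invented here and do not come from Hinton's paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_islands(row, threshold=0.95):
    """Split a row of location vectors into runs of adjacent, agreeing vectors."""
    if not row:
        return []
    islands = [[0]]
    for i in range(1, len(row)):
        if cosine(row[i - 1], row[i]) >= threshold:
            islands[-1].append(i)   # same island as the previous location
        else:
            islands.append([i])     # agreement breaks, so a new island starts
    return islands


# Part level: locations 0-1 agree on one part, locations 2-3 on another.
part_level = [[1.0, 0.0, 0.0], [1.0, 0.05, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.05]]
# Object level: all four locations agree on the same whole.
object_level = [[0.0, 0.0, 1.0], [0.0, 0.05, 1.0], [0.05, 0.0, 1.0], [0.0, 0.0, 1.0]]

print(find_islands(part_level))    # [[0, 1], [2, 3]]
print(find_islands(object_level))  # [[0, 1, 2, 3]]
```

The smaller islands at the part level sit inside the single large island at the object level, which is the nesting that a parse tree expresses.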
According to Hinton’s long-time friend and collaborator Yoshua Bengio, a computer scientist at the University of Montreal, if GLOM manages to solve the engineering challenge of representing a parse tree in a neural net, it would be a feat—it would be important for making neural nets work properly. “Geoff has produced amazingly powerful intuitions many times in his career, many of which have proven right,” Bengio says. “Hence, I pay attention to them, especially when he feels as strongly about them as he does about GLOM.”
The strength of Hinton’s conviction is rooted not only in the echo chamber analogy, but also in mathematical and biological analogies that inspired and justified some of the design decisions in GLOM’s novel engineering.
“Geoff is a highly unusual thinker in that he is able to draw upon complex mathematical concepts and integrate them with biological constraints to develop theories,” says Sue Becker, a former student of Hinton’s, now a computational cognitive neuroscientist at McMaster University. “Researchers who are more narrowly focused on either the mathematical theory or the neurobiology are much less likely to solve the infinitely compelling puzzle of how both machines and humans might learn and think.”
Turning philosophy into engineering
So far, Hinton’s new idea has been well received, especially in some of the world’s greatest echo chambers. “On Twitter, I got a lot of likes,” he says. And a YouTube tutorial laid claim to the term “MeGLOMania.”
Hinton is the first to admit that at present GLOM is little more than philosophical musing (he spent a year as a philosophy undergrad before switching to experimental psychology). “If an idea sounds good in philosophy, it is good,” he says. “How would you ever have a philosophical idea that just sounds like rubbish, but actually turns out to be true? That wouldn’t pass as a philosophical idea.” Science, by comparison, is “full of things that sound like complete rubbish” but turn out to work remarkably well—for example, neural nets, he says.
GLOM is designed to sound philosophically plausible. But will it work?
Why mixing vaccines could help boost immunity
We should soon have a better idea. A handful of trials are now under way to test the power of vaccine combinations, with the first results due later this month. If these mixed regimens prove safe and effective, countries will be able to keep the vaccine rollout moving even if supplies of one vaccine dwindle because of manufacturing delays, unforeseen shortages, or safety concerns.
But there’s another, more exciting prospect that could be a vital part of our strategy in the future: mixing vaccines might lead to broader immunity and hamper the virus’s attempts to evade our immune systems. Eventually, a mix-and-match approach might be the best way to protect ourselves.
Mixing on trial
The covid-19 vaccines currently in use protect against the virus in slightly different ways. Most target the coronavirus’s spike protein, which it uses to gain entry to our cells. But some deliver the instructions for making the protein in the form of messenger RNA (Pfizer, Moderna). Some deliver the spike protein itself (Novavax). Some use another harmless virus to ferry in the instructions for making it, like a Trojan horse (Johnson & Johnson, Oxford-AstraZeneca, Sputnik V). Some offer up whole inactivated virus (Sinopharm, Sinovac).
In a study published in March, researchers from the National Institutes for Food and Drug Control in China tested combinations of four different covid-19 vaccines in mice, and found that some did improve immune response. When they first gave the rodents a vaccine that relies on a harmless cold virus to smuggle in the instructions and then a second dose of a different type of vaccine, they saw higher antibody levels and a better T-cell response. But when they reversed the order, giving the viral vaccine second, they did not see an improvement.
Why combining shots might improve efficacy is a bit of a mystery, says Shan Lu, a physician and vaccine researcher at the University of Massachusetts Medical School who pioneered this mixing strategy. “The mechanism we can explain partially, but we don’t fully understand.” Different vaccines present the same information in slightly different ways. Those differences might awaken different parts of the immune system or sharpen the immune response. This strategy might also make immunity last longer.
Whether those results translate to humans remains to be seen. Researchers at Oxford University have launched a human trial to test just how mixing might work. The study, called Com-CoV, offers participants a first shot of Pfizer or Oxford-AstraZeneca. For their second dose, they will either get the same vaccine or a shot of Moderna or Novavax. The first results should be available in the coming weeks.
Other studies are under way as well. In Spain, where Oxford-AstraZeneca is now being given only to people over 60, researchers plan to recruit 600 people to test whether a first dose of the shot can be paired with a second dose from Pfizer. According to reporting in El País, about a million people received the first dose of the vaccine but aren’t old enough to receive the second dose. Health officials are waiting for the results of this study before issuing recommendations for this group, but it’s not clear whether any participants have yet been recruited.
Late last year Oxford-AstraZeneca announced that it would partner with Russia’s Gamaleya Institute, which developed the Sputnik V vaccine, to test how the two shots work in combination. The trial was supposed to launch in March and provide interim results in May, but it’s not clear whether it has actually begun. And Chinese officials have hinted that they’ll explore mixing vaccines to boost the efficacy of their shots.
The biggest gains might come from mixing vaccines that have lower efficacies. The mRNA vaccines from Pfizer and Moderna provide excellent protection. “I don’t think there’s reason to mess with that,” says Donna Farber, an immunologist at Columbia University. But mixing might improve protection for some of the vaccines that have reported lower levels of protection, like Oxford-AstraZeneca and Johnson & Johnson, as well as some of the Chinese vaccines. Many of these vaccines work quite well, but mixing might help them work even better.
The US’s online language gaps are an urgent problem for Asian-Americans
Chen says that while content moderation policies from Facebook, Twitter, and others succeeded in filtering out some of the most obvious English-language disinformation, the system often misses such content when it’s in other languages. That work instead had to be done by volunteers like her team, who looked for disinformation and were trained to defuse it and minimize its spread. “Those mechanisms meant to catch certain words and stuff don’t necessarily catch that dis- and misinformation when it’s in a different language,” she says.
Google’s translation services and technologies such as Translatotron and real-time translation headphones use artificial intelligence to convert between languages. But Xiong finds these tools inadequate for Hmong, a deeply complex language where context is incredibly important. “I think we’ve become really complacent and dependent on advanced systems like Google,” she says. “They claim to be ‘language accessible,’ and then I read it and it says something totally different.”
(A Google spokesperson admitted that smaller languages “pose a more difficult translation task” but said that the company has “invested in research that particularly benefits low-resource language translations,” using machine learning and community feedback.)
All the way down
The challenges of language online go beyond the US and reach down, quite literally, to the underlying code. Yudhanjaya Wijeratne is a researcher and data scientist at the Sri Lankan think tank LIRNEasia. In 2018, he started tracking bot networks whose activity on social media encouraged violence against Muslims: in February and March of that year, a string of riots by Sinhalese Buddhists targeted Muslims and mosques in the cities of Ampara and Kandy. His team documented “the hunting logic” of the bots, catalogued hundreds of thousands of Sinhalese social media posts, and took the findings to Twitter and Facebook. “They’d say all sorts of nice and well-meaning things, basically canned statements,” he says. (In a statement, Twitter says it uses human review and automated systems to “apply our rules impartially for all people in the service, regardless of background, ideology, or placement on the political spectrum.”)
When contacted by MIT Technology Review, a Facebook spokesperson said the company commissioned an independent human rights assessment of the platform’s role in the violence in Sri Lanka, which was published in May 2020, and made changes in the wake of the attacks, including hiring dozens of Sinhala and Tamil-speaking content moderators. “We deployed proactive hate speech detection technology in Sinhala to help us more quickly and effectively identify potentially violating content,” they said.
When the bot behavior continued, Wijeratne grew skeptical of the platitudes. He decided to look at the code libraries and software tools the companies were using, and found that the mechanisms to monitor hate speech in most non-English languages had not yet been built.
“Much of the research, in fact, for a lot of languages like ours has simply not been done yet,” Wijeratne says. “What I can do with three lines of code in Python in English literally took me two years of looking at 28 million words of Sinhala to build the core corpuses, to build the core tools, and then get things up to that level where I could potentially do that level of text analysis.”
After suicide bombers targeted churches in Colombo, the Sri Lankan capital, in April 2019, Wijeratne built a tool to analyze hate speech and misinformation in Sinhala and Tamil. The system, called Watchdog, is a free mobile application that aggregates news and attaches warnings to false stories. The warnings come from volunteers who are trained in fact-checking.
Wijeratne stresses that this work goes far beyond translation.
“Many of the algorithms that we take for granted that are often cited in research, in particular in natural-language processing, show excellent results for English,” he says. “And yet many identical algorithms, even used on languages that are only a few degrees of difference apart—whether they’re West German or from the Romance tree of languages—may return completely different results.”
Natural-language processing is the basis of automated content moderation systems. Wijeratne published a paper in 2019 that examined the discrepancies between their accuracy in different languages. He argues that the more computational resources that exist for a language, like data sets and web pages, the better the algorithms can work. Languages from poorer countries or communities are disadvantaged.
“If you’re building, say, the Empire State Building for English, you have the blueprints. You have the materials,” he says. “You have everything on hand and all you have to do is put this stuff together. For every other language, you don’t have the blueprints.
“You have no idea where the concrete is going to come from. You don’t have steel and you don’t have the workers, either. So you’re going to be sitting there tapping away one brick at a time and hoping that maybe your grandson or your granddaughter might complete the project.”
The movement to provide those blueprints is known as language justice, and it is not new. The American Bar Association describes language justice as a “framework” that preserves people’s rights “to communicate, understand, and be understood in the language in which they prefer and feel most articulate and powerful.”
We reviewed three at-home covid-19 tests. Here’s what happened
As a result, I don’t think home tests are as useful as some have hoped. If used at scale to screen for covid, they could send millions of anxious people in search of lab tests and medical care they don’t need.
As the covid-19 pandemic spread around the globe last year, economists and scientists called for massive expansion of testing and contact tracing in the US, to find and isolate infected people. But the number of daily tests in the US has never much exceeded 2 million, according to the Covid Tracking Project, and most of those were done in labs or on special instruments.
Home tests will now be manufactured in the tens of millions, say their makers, but some experts aren’t sure how much they will matter at this point. “The real value of these tests was six months ago,” says Amitabh Chandra, a professor at Harvard University’s Kennedy School. “I think that the move to over-the-counter is great, but it has limited value in a world where vaccines become more widely available.” Vaccination credentials could be more important for travel and dining than test results are.
Companies selling the tests say they are still a relevant strategy for getting back to normal, especially given that kids aren’t getting vaccinated yet. For employers who want to keep an office or factory open, they say, self-directed consumer tests might be a good option. A spokesperson for Abbott told me that they might also help people “start thinking about coordinating more covid-conscious bridal showers, baby showers, or birthday parties.”
The UK government started giving away covid antigen tests for free, by mail and on street corners, on April 9, saying it wants people “to get in the habit” of testing themselves twice a week as social distancing restrictions are eased. Along with vaccines, free tests are part of that nation’s plan to quash the virus. Later, though, a leaked government memo said health officials were privately worried about a tsunami of false positives.
In the US, there’s still no national campaign around home tests or subsidy for them, and as an out-of-pocket expense, they are still too expensive for most people to use with any frequency. That may be for the best, given my experience.
Types of tests
The three tests we tried included two antigen tests, BinaxNow from Abbott Laboratories and a kit from Ellume, as well as one molecular test, called Lucira. In general, molecular tests, which detect the genes of the coronavirus, are more reliable than antigen tests, which sense the presence of the virus’s outer shell.
Everything you need is in one box, except in the case of the Ellume test, which must be paired with an app. Overall, the Lucira test had the best combination of advertised accuracy and simplicity, but it was also the most expensive at $55.
We didn’t try Quidel QuickVue, another antigen test, or a molecular test from Cue Health. Those tests, while authorized for home use, are not being sold directly to the public yet.
After trying all the tests, I am not planning to invest in using them regularly. I work from home and don’t socialize, so I don’t really need to. Instead, I plan to keep at least one test in my cupboard so that if I do feel sick, or lose my sense of smell, I will be able to quickly find out whether it’s covid-19. The ability to test at home might become more important next winter when cold and flu season returns.
BinaxNow by Abbott
Time required: about 20 minutes
Price: $23.99 for two
Availability: At some CVS stores starting in April. Abbott says it is making tens of millions of BinaxNow tests per month.
Accuracy: 84.6% for detecting covid-19 infections, 98.5% for correctly identifying covid-19 negatives
This is the at-home version of the fast, 15-minute test the White House was using last year to screen staff and visitors. It’s an antigen test, meaning that it examines a sample from a nasal swab to detect a protein in the shell of the virus. It went on sale in the US last week, and I was able to buy a two-test kit at CVS for $23.99 plus tax.
The technology used is called a “lateral flow immunoassay.” In simple terms, that means it works like a pregnancy test. It’s basically a paper card with a test strip. As the sample flows through it, it hits antibodies that stick to the virus protein and then to a colored marker. If the virus is present, a pink bar appears on the strip.
I found the test fairly easy to perform. You use an eye dropper to dispense six drops of chemical into a small hole in the card; then you insert a swab after you’ve run it around in both nostrils. Rotate the swab counterclockwise, fold the card to bring the test strip in contact with the swab, and that’s it. Fifteen minutes later, a positive result will show up as a faint pink line.
The drawback of the test is that there’s room for two different kinds of user error. It’s hard to see the drops come out of the dropper, and using too few could cause a false negative. So could swabbing your nose incorrectly. Unlike the other tests, this one can’t tell if you’ve made a mistake.
And besides the prospect of user error, the test itself has issues with accuracy. BinaxNow is the cheapest test out there, but it’s also the most likely to be wrong, missing about one in seven real infections. Abbott cautions that results “should be treated as presumptive” and “do not rule out SARS-Cov-2.”
But a buyer won’t find the accuracy rate without digging into the fine print. The company also buries a crucial requirement imposed by regulators: to compensate for the lower accuracy, you are supposed to use both tests in the kit, at least 36 hours apart. I doubt a casual buyer will realize that. The two-test requirement is barely mentioned in the instructions.
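The arithmetic behind these accuracy figures is easy to check. The sketch below uses the two numbers quoted above for BinaxNow (84.6% for detecting infections, 98.5% for identifying negatives); everything else is an assumption made for illustration. In particular, the 1% infection rate among people being screened is hypothetical, and the two-test calculation assumes the two results are statistically independent, which is optimistic for two swabs from the same person.

```python
SENSITIVITY = 0.846  # share of real infections detected (figure quoted above)
SPECIFICITY = 0.985  # share of uninfected people correctly cleared (quoted above)


def miss_rate(sensitivity):
    """Fraction of real infections that a single test fails to detect."""
    return 1 - sensitivity


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Chance that a positive result is a real infection (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def serial_sensitivity(sensitivity, n_tests=2):
    """Chance that at least one of n tests catches an infection.

    Assumes independent results: an optimistic simplification.
    """
    return 1 - (1 - sensitivity) ** n_tests


single_miss = miss_rate(SENSITIVITY)
ppv_at_1_percent = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.01)
two_test_sensitivity = serial_sensitivity(SENSITIVITY, 2)

print(f"missed infections per test: {single_miss:.1%}")
print(f"positives that are real at 1% prevalence: {ppv_at_1_percent:.1%}")
print(f"two tests, independence assumed: {two_test_sensitivity:.1%}")
```

A miss rate of 15.4% is roughly one in six or seven. Under the assumed 1% infection rate, only about 36% of positive results would be real infections, which is why mass screening can generate so many false alarms. Under the independence assumption, two tests together would catch close to 98% of infections, which is the logic behind the requirement to use both.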