Fully vaccinated Americans can now gather indoors, maskless and without distancing—as long as it’s with others who’ve gotten their shots, according to new guidance from the Centers for Disease Control and Prevention.
The advice, which comes as vaccinations continue to gain speed in America, is a positive signal for those who have had a course of shots. But it also shows there’s a lot we still don’t know about how the virus behaves—and leaves plenty of questions about who can do what, and what’s fair.
Three things the new CDC guidance says
- Indoor, maskless, and non-distanced gatherings are okay, as long as individuals have been fully vaccinated for at least two weeks. The CDC says medium and large gatherings should still be avoided, although it doesn’t specify a number of people for a small gathering.
- In public, keep your mask on and continue to distance from others. When you’re out and about in your community, on the train or at the grocery store, you might cross paths with people who haven’t been vaccinated yet.
- Vaccinated and unvaccinated people can gather together, with limitations. If you’re vaccinated, the CDC says you can visit indoors unmasked with unvaccinated people from one other household. There are important considerations discussed below, like the health profiles of the unvaccinated people involved.
Three things that are still unanswered
- Whether vaccinated people are still considered a serious transmission risk. We know that vaccinated individuals are much less likely to become infected, and much less likely to transmit the virus. But it’s crucial for vaccinated people to understand that interacting with others who haven’t been vaccinated or infected carries “an undefined, finite risk,” says Thomas Russo, professor of infectious diseases at the University at Buffalo. That risk of transmission may be decreased, but it’s probably not zero.
- Whether vaccines can prevent long-term effects of covid-19—and what they are. All vaccines approved for emergency use in the US have proved to be highly effective at preventing death, but we’re still learning about the long-term effects of covid-19. Even people with relatively minor cases could still battle symptoms for weeks or months. The safest bet, Russo says, is to do everything you can not to get infected.
- What your personal risk tolerance should be. Though the CDC guidelines say unmasked indoor gatherings are acceptable between a vaccinated person and unvaccinated people from one household, there’s a big caveat: whether anyone in the unvaccinated household is at an increased risk for severe illness from covid-19.
Even if you read up on the health conditions that are proved to increase risk, “there are still people that end up getting severe disease for reasons that we’re not certain about,” Russo says. “[The guidelines] count on the public to sort that out.” That risk calculation may be especially tricky if you live with some people who are vaccinated but others who aren’t. Russo, who is in a mixed household, says he is taking a conservative approach and being as careful as possible.
More of the same … for now
Though these new guidelines might give some families the peace of mind to organize much-needed visits with grandparents, not much changed today for the vast majority of the US—particularly for people of color. A New York Times analysis found Black people were undervaccinated relative to their population in each one of the 38 states that report on race and ethnicity for vaccinations. A gap exists for Hispanic people, too. And though the new CDC guidance applies only to private activities—not large-scale public reopening—bioethicists have warned that using vaccination status as a prerequisite to participating in reopening could further entrench existing racial inequities.
“We need to make every effort [to ensure] that the vaccination process is equitable and fair,” Russo says. “And we’re still struggling.”
This story is part of the Pandemic Technology Project, supported by the Rockefeller Foundation.
Anti-vaxxers are weaponizing Yelp to punish bars that require vaccine proof
Smith’s Yelp reviews were shut down after the sudden flurry of activity on the bar’s page. The company calls these shutdowns “unusual activity alerts”: a stopgap measure that gives both the business and Yelp time to filter through a flood of reviews and pick out which are spam and which aren’t. Noorie Malik, Yelp’s vice president of user operations, said Yelp has a “team of moderators” that investigate pages that get an unusual amount of traffic. “After we’ve seen activity dramatically decrease or stop, we will then clean up the page so that only firsthand consumer experiences are reflected,” she said in a statement.
It’s a practice that Yelp has had to deploy more often over the course of the pandemic: According to Yelp’s 2020 Trust & Safety Report, the company saw a 206% increase over 2019 levels in unusual activity alerts. “Since January 2021, we’ve placed more than 15 unusual activity alerts on business pages related to a business’s stance on covid-19 vaccinations,” said Malik.
The majority of those cases have been since May, like the gay bar C.C. Attles in Seattle, which got an alert from Yelp after it made patrons show proof of vaccination at the door. Earlier this month, Moe’s Cantina in Chicago’s River North neighborhood got spammed after it attempted to isolate vaccinated customers from unvaccinated ones.
Spamming a business with one-star reviews is not a new tactic. In fact, perhaps the best-known case is Colorado’s Masterpiece Cakeshop, which won a 2018 Supreme Court battle over its refusal to make a wedding cake for a same-sex couple, after which it got pummeled by one-star reviews. “People are still writing fake reviews. People will always write fake reviews,” Liu says.
But he adds that today’s online audiences know that platforms use algorithms to detect and flag problematic words, so bad actors can disguise their grievances as complaints about poor restaurant service, like a more typical negative review, to ensure the rating stays up and counts.
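Why disguised reviews survive is easy to see with a toy keyword filter. This sketch is purely illustrative (real platforms use far more sophisticated models, and these flagged terms are hypothetical): a review that names the vaccine policy trips the filter, while one that complains about service does not.

```python
# Purely illustrative keyword-based flagging, NOT Yelp's actual system.
FLAGGED_TERMS = {"vaccine", "vaccinated", "mandate", "passport"}

def is_suspicious(review: str) -> bool:
    # Flag a review if any word matches the watch list.
    words = set(review.lower().split())
    return bool(words & FLAGGED_TERMS)

honest_gripe = "One star. Requiring a vaccine passport to drink a beer is tyranny."
disguised = "One star. Slow service and there was hair in my food."

print(is_suspicious(honest_gripe))  # True: caught by the filter
print(is_suspicious(disguised))     # False: slips through and counts
```

The disguised review reads like any other complaint, which is exactly the point: a filter keyed to policy-related vocabulary has nothing to latch onto.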
That seems to have been the case with Knapp’s bar. Its Yelp reviews included comments like “There was hair in my food” and alleged cockroach sightings. “Really ridiculous, fantastic shit,” Knapp says. “If you looked at previous reviews, you would understand immediately that this doesn’t make sense.”
Liu also says there is a limit to how much Yelp can improve its spam detection, since natural language — or the way we speak, read, and write — “is very tough for computer systems to detect.”
But Liu doesn’t think putting a human being in charge of figuring out which reviews are spam or not will solve the problem. “Human beings can’t do it,” he says. “Some people might get it right, some people might get it wrong. I have fake reviews on my webpage and even I can’t tell which are real or not.”
You might notice that I’ve only mentioned Yelp reviews thus far, despite the fact that Google reviews — which appear in the business description box on the right side of the Google search results page under “reviews” — are arguably more influential. That’s because Google’s review operations are, frankly, even more mysterious.
While businesses I spoke to said Yelp worked with them on identifying spam reviews, none of them had any luck with contacting Google’s team. “You would think Google would say, ‘Something is fucked up here,’” Knapp says. “These are IP addresses from overseas. It really undermines the review platform when things like this are allowed to happen.”
These creepy fake humans herald a new age in AI
Once viewed as less desirable than real data, synthetic data is now seen by some as a panacea. Real data is messy and riddled with bias. New data privacy regulations make it hard to collect. By contrast, synthetic data is pristine and can be used to build more diverse data sets. You can produce perfectly labeled faces, say, of different ages, shapes, and ethnicities to build a face-detection system that works across populations.
But synthetic data has its limitations. If it fails to reflect reality, it could end up producing even worse AI than messy, biased real-world data—or it could simply inherit the same problems. “What I don’t want to do is give the thumbs up to this paradigm and say, ‘Oh, this will solve so many problems,’” says Cathy O’Neil, a data scientist and founder of the algorithmic auditing firm ORCAA. “Because it will also ignore a lot of things.”
Realistic, not real
Deep learning has always been about data. But in the last few years, the AI community has learned that good data is more important than big data. Even small amounts of the right, cleanly labeled data can do more to improve an AI system’s performance than 10 times the amount of uncurated data, or even a more advanced algorithm.
That changes the way companies should approach developing their AI models, says Datagen’s CEO and cofounder, Ofir Chakon. Today, they start by acquiring as much data as possible and then tweak and tune their algorithms for better performance. Instead, they should be doing the opposite: use the same algorithm while improving on the composition of their data.
But collecting real-world data to perform this kind of iterative experimentation is too costly and time intensive. This is where Datagen comes in. With a synthetic data generator, teams can create and test dozens of new data sets a day to identify which one maximizes a model’s performance.
To ensure the realism of its data, Datagen gives its vendors detailed instructions on how many individuals to scan in each age bracket, BMI range, and ethnicity, as well as a set list of actions for them to perform, like walking around a room or drinking a soda. The vendors send back both high-fidelity static images and motion-capture data of those actions. Datagen’s algorithms then expand this data into hundreds of thousands of combinations. The synthesized data is sometimes then checked again. Fake faces are plotted against real faces, for example, to see if they seem realistic.
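The expansion step Datagen describes, turning a modest number of scans into hundreds of thousands of combinations, can be illustrated with a toy sketch. The attribute names and values here are hypothetical, not Datagen’s actual schema; the point is how quickly a few base attributes multiply out.

```python
from itertools import product

# Illustrative sketch (not Datagen's actual pipeline): a handful of
# scanned attributes are expanded combinatorially into many distinct
# synthetic-sample specifications.
ages = ["18-30", "31-50", "51-70"]
bmi_ranges = ["underweight", "normal", "overweight"]
actions = ["walking", "drinking_soda", "sitting"]
lightings = ["daylight", "indoor", "low_light"]

specs = [
    {"age": a, "bmi": b, "action": act, "lighting": light}
    for a, b, act, light in product(ages, bmi_ranges, actions, lightings)
]

print(len(specs))  # 3 * 3 * 3 * 3 = 81 combinations from 12 base values
```

With realistic attribute lists (dozens of values per axis, plus pose, camera angle, and background), the same cross product reaches the hundreds of thousands of combinations the company describes.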
Datagen is now generating facial expressions to monitor driver alertness in smart cars, body motions to track customers in cashier-free stores, and irises and hand motions to improve the eye- and hand-tracking capabilities of VR headsets. The company says its data has already been used to develop computer-vision systems serving tens of millions of users.
It’s not just synthetic humans that are being mass-manufactured. Click-Ins is a startup that uses synthetic AI to perform automated vehicle inspections. Using design software, it re-creates all car makes and models that its AI needs to recognize and then renders them with different colors, damages, and deformations under different lighting conditions, against different backgrounds. This lets the company update its AI when automakers put out new models, and helps it avoid data privacy violations in countries where license plates are considered private information and thus cannot be present in photos used to train AI.
Mostly.ai works with financial, telecommunications, and insurance companies to provide spreadsheets of fake client data that let companies share their customer database with outside vendors in a legally compliant way. Anonymization can reduce a data set’s richness yet still fail to adequately protect people’s privacy. But synthetic data can be used to generate detailed fake data sets that share the same statistical properties as a company’s real data. It can also be used to simulate data that the company doesn’t yet have, including a more diverse client population or scenarios like fraudulent activity.
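The core idea behind statistically faithful synthetic tabular data can be sketched under a deliberately simple assumption: fit a distribution to real records, then sample brand-new records from it. This Gaussian toy model is not Mostly.ai’s actual method (real generators are far more sophisticated), but it shows how synthetic rows can match a data set’s statistics while corresponding to no real customer.

```python
import numpy as np

# Toy sketch, NOT Mostly.ai's actual generator.
rng = np.random.default_rng(0)

# Stand-in for "real" customer data: columns = [age, annual_spend].
real = rng.multivariate_normal(
    mean=[45.0, 12_000.0],
    cov=[[120.0, 3_000.0], [3_000.0, 4_000_000.0]],
    size=5_000,
)

# Fit the empirical mean and covariance of the real records...
mu = real.mean(axis=0)
sigma = np.cov(real, rowvar=False)

# ...then draw synthetic records that share those statistics but
# describe no actual person.
synthetic = rng.multivariate_normal(mu, sigma, size=5_000)
```

The same fit-then-sample logic extends to rarer scenarios: oversampling a fitted model in a region where real data is thin is one way to simulate, say, a more diverse client population or fraudulent activity.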
Proponents of synthetic data say that it can help evaluate AI as well. In a recent paper published at an AI conference, Suchi Saria, an associate professor of machine learning and health care at Johns Hopkins University, and her coauthors demonstrated how data-generation techniques could be used to extrapolate different patient populations from a single set of data. This could be useful if, for example, a company only had data from New York City’s more youthful population but wanted to understand how its AI performs on an aging population with higher prevalence of diabetes. She’s now starting her own company, Bayesian Health, which will use this technique to help test medical AI systems.
The limits of faking it
But is synthetic data overhyped?
When it comes to privacy, “just because the data is ‘synthetic’ and does not directly correspond to real user data does not mean that it does not encode sensitive information about real people,” says Aaron Roth, a professor of computer and information science at the University of Pennsylvania. Some data generation techniques have been shown to closely reproduce images or text found in the training data, for example, while others are vulnerable to attacks that make them fully regurgitate that data.
This might be fine for a firm like Datagen, whose synthetic data isn’t meant to conceal the identity of the individuals who consented to be scanned. But it would be bad news for companies that offer their solution as a way to protect sensitive financial or patient information.
Clinical trials are better, faster, cheaper with big data
“One of the most difficult parts of my job is enrolling patients into studies,” says Nicholas Borys, chief medical officer for Lawrenceville, N.J., biotechnology company Celsion, which develops next-generation chemotherapy and immunotherapy agents for liver and ovarian cancers and certain types of brain tumors. Borys estimates that fewer than 10% of cancer patients are enrolled in clinical trials. “If we could get that up to 20% or 30%, we probably could have had several cancers conquered by now.”
Clinical trials test new drugs, devices, and procedures to determine whether they’re safe and effective before they’re approved for general use. But the path from study design to approval is long, winding, and expensive. Today, researchers are using artificial intelligence and advanced data analytics to speed up the process, reduce costs, and get effective treatments more swiftly to those who need them. And they’re tapping into an underused but rapidly growing resource: data on patients from past trials.
Building external controls
Clinical trials usually involve at least two groups, or “arms”: a test or experimental arm that receives the treatment under investigation, and a control arm that doesn’t. A control arm may receive no treatment at all, a placebo, or the current standard of care for the disease being treated, depending on what type of treatment is being studied and what it’s being compared with under the study protocol. It’s easy to see the recruitment problem for investigators studying therapies for cancer and other deadly diseases: patients with a life-threatening condition need help now. While they might be willing to take a risk on a new treatment, “the last thing they want is to be randomized to a control arm,” Borys says. Combine that reluctance with the need to recruit patients who have relatively rare diseases—for example, a form of breast cancer characterized by a specific genetic marker—and the time to recruit enough people can stretch out for months, or even years. Nine out of 10 clinical trials worldwide—not just for cancer but for all types of conditions—can’t recruit enough people within their target timeframes. Some trials fail altogether for lack of enough participants.
What if researchers didn’t need to recruit a control group at all and could offer the experimental treatment to everyone who agreed to be in the study? Celsion is exploring such an approach with New York-headquartered Medidata, which provides management software and electronic data capture for more than half of the world’s clinical trials, serving most major pharmaceutical and medical device companies, as well as academic medical centers. Acquired by French software company Dassault Systèmes in 2019, Medidata has compiled an enormous “big data” resource: detailed information from more than 23,000 trials and nearly 7 million patients going back about 10 years.
The idea is to reuse data from patients in past trials to create “external control arms.” These groups serve the same function as traditional control arms, but they can be used in settings where a control group is difficult to recruit: for extremely rare diseases, for example, or conditions such as cancer, which are imminently life-threatening. They can also be used effectively for “single-arm” trials, in which a control group is impractical: for example, to measure the effectiveness of an implanted device or a surgical procedure. Perhaps their most valuable immediate use is for doing rapid preliminary trials, to evaluate whether a treatment is worth pursuing to the point of a full clinical trial.
Medidata uses artificial intelligence to plumb its database and find patients who served as controls in past trials of treatments for a certain condition to create its proprietary version of external control arms. “We can carefully select these historical patients and match the current-day experimental arm with the historical trial data,” says Arnaub Chatterjee, senior vice president for products, Acorn AI at Medidata. (Acorn AI is Medidata’s data and analytics division.) The trials and the patients are matched for the objectives of the study—the so-called endpoints, such as reduced mortality or how long patients remain cancer-free—and for other aspects of the study designs, such as the type of data collected at the beginning of the study and along the way.
When creating an external control arm, “We do everything we can to mimic an ideal randomized controlled trial,” says Ruthie Davi, vice president of data science, Acorn AI at Medidata. The first step is to search the database for possible control arm candidates using the key eligibility criteria from the investigational trial: for example, the type of cancer, the key features of the disease and how advanced it is, and whether it’s the patient’s first time being treated. It’s essentially the same process used to select control patients in a standard clinical trial—except data recorded at the beginning of the past trial, rather than the current one, is used to determine eligibility, Davi says. “We are finding historical patients who would qualify for the trial if they existed today.”
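The matching step Davi describes can be sketched as a simple filter over historical records, using each past trial’s baseline data in place of present-day screening. The field names and criteria below are hypothetical, not Medidata’s actual schema.

```python
# Simplified sketch, NOT Medidata's actual system: select external-control
# candidates from historical trial records using the investigational
# trial's key eligibility criteria.

historical_patients = [
    {"id": 1, "cancer_type": "ovarian", "stage": 3, "prior_treatment": False},
    {"id": 2, "cancer_type": "ovarian", "stage": 1, "prior_treatment": False},
    {"id": 3, "cancer_type": "liver", "stage": 3, "prior_treatment": False},
    {"id": 4, "cancer_type": "ovarian", "stage": 3, "prior_treatment": True},
]

criteria = {"cancer_type": "ovarian", "min_stage": 2, "prior_treatment": False}

def eligible(patient: dict) -> bool:
    # Baseline data recorded at the start of the past trial stands in
    # for present-day eligibility screening.
    return (
        patient["cancer_type"] == criteria["cancer_type"]
        and patient["stage"] >= criteria["min_stage"]
        and patient["prior_treatment"] == criteria["prior_treatment"]
    )

external_control = [p["id"] for p in historical_patients if eligible(p)]
print(external_control)  # [1]: the one historical patient who would qualify today
```

Only patient 1 passes all three criteria, mirroring the article’s description of “finding historical patients who would qualify for the trial if they existed today”; the real matching also aligns endpoints and study-design details, which this sketch omits.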
Download the full report.
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.