Developing the capacity to annotate massive volumes of data while maintaining quality is a part of the model development lifecycle that enterprises often underestimate. It’s resource-intensive and requires specialized expertise.
At the heart of any successful machine learning/artificial intelligence (ML/AI) initiative is a commitment to high-quality training data and a proven, well-defined pathway to producing it. Without that quality data pipeline, the initiative is doomed to fail.
Computer vision or data science teams often turn to external partners to develop their training data pipeline, and these partnerships drive model performance.
There is no one definition of quality: “quality data” is completely contingent on the specific computer vision or machine learning project. However, there is a general process all teams can follow when working with an external partner, and this path to quality data can be broken down into four prioritized phases.
Annotation criteria and quality requirements
Training data quality is an evaluation of a data set’s fitness to serve its purpose in a given ML/AI use case.
The computer vision team needs to establish an unambiguous set of rules that describe what quality means in the context of their project. Annotation criteria are the collection of rules that define which objects to annotate, how to annotate them correctly, and what the quality targets are.
Quality targets define the lowest acceptable values for evaluation metrics such as accuracy, recall, precision, and F1 score. Typically, a computer vision team will have quality targets for how accurately objects of interest are classified, how accurately objects are localized, and how accurately relationships between objects are identified.
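To make quality targets concrete, here is a minimal sketch, assuming a classification task and scikit-learn, of how a team might encode thresholds that a batch of annotations must clear before acceptance. The threshold values, labels, and acceptance logic are illustrative assumptions, not a prescribed standard:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical quality targets: the lowest acceptable value for each metric.
QUALITY_TARGETS = {"accuracy": 0.95, "precision": 0.92, "recall": 0.90, "f1": 0.91}

def evaluate_batch(gold_labels, worker_labels):
    """Score a batch of worker annotations against expert (gold) labels."""
    scores = {
        "accuracy": accuracy_score(gold_labels, worker_labels),
        "precision": precision_score(gold_labels, worker_labels,
                                     average="macro", zero_division=0),
        "recall": recall_score(gold_labels, worker_labels,
                               average="macro", zero_division=0),
        "f1": f1_score(gold_labels, worker_labels,
                       average="macro", zero_division=0),
    }
    # Collect any metric that falls below its target.
    failures = {m: round(s, 3) for m, s in scores.items() if s < QUALITY_TARGETS[m]}
    return scores, failures

scores, failures = evaluate_batch(
    gold_labels=["car", "person", "car", "bike", "person"],
    worker_labels=["car", "person", "car", "car", "person"],
)
print("Batch accepted" if not failures else f"Below target: {failures}")
```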
Workforce training and platform configuration
Platform configuration. Task design and workflow setup require time and expertise, and accurate annotation requires task-specific tools. At this stage, data science teams need a partner with expertise to help them determine how best to configure labeling tools, classification taxonomies, and annotation interfaces for accuracy and throughput.
Worker testing and scoring. To accurately label data, annotators need a well-designed training curriculum so they fully understand the annotation criteria and domain context. The annotation platform or external partner should ensure accuracy by actively tracking annotator proficiency, both against gold data tasks and whenever a judgment is modified by a higher-skilled worker or admin.
Ground truth or gold data. Ground truth data is crucial at this stage of the process as the baseline to score workers and measure output quality. Many computer vision teams are already working with a ground truth data set.
Sources of authority and quality assurance
There is no one-size-fits-all quality assurance (QA) approach that will meet the quality standards of all ML use cases. Specific business objectives, as well as the risk associated with an under-performing model, will drive quality requirements. Some projects reach target quality using multiple annotators. Others require complex reviews against ground truth data or escalation workflows with verification from a subject matter expert.
There are two primary sources of authority that can be used to measure annotation quality and to score workers: gold data and expert review.
- Gold data: The gold data or ground truth set of records can serve both as a qualification tool for testing and scoring workers at the outset of the process and as the measure of output quality. When you use gold data to measure quality, you compare worker annotations to your expert annotations for the same data set; the difference between these two independent, blind answers yields quantitative measurements like accuracy, recall, precision, and F1 scores.
- Expert review: This method of quality assurance relies on review by a highly skilled worker, an admin, or an expert on the customer side (sometimes all three), and it can be used in conjunction with gold data QA. The expert reviewer looks at the answer given by the qualified worker and either approves it or makes corrections as needed, producing a new correct answer. Initially, an expert review may take place for every single instance of labeled data, but over time, as worker quality improves, expert review can shift to random sampling for ongoing quality control. (Both mechanisms are sketched in the code after this list.)
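As a rough illustration of both sources of authority, the sketch below scores each worker by blind comparison against expert answers on seeded gold items, then sets an expert review rate accordingly. The worker IDs, labels, and thresholds are invented for the example:

```python
from collections import defaultdict

# Expert (gold) answers for items seeded into the task stream.
gold = {"img_01": "pedestrian", "img_02": "cyclist", "img_03": "vehicle"}

# Worker annotations as (worker_id, item_id, worker_label) records.
annotations = [
    ("worker_a", "img_01", "pedestrian"),
    ("worker_a", "img_02", "cyclist"),
    ("worker_b", "img_01", "pedestrian"),
    ("worker_b", "img_03", "cyclist"),
]

def score_workers(annotations, gold):
    """Blind comparison of worker answers to expert answers on gold items."""
    hits, totals = defaultdict(int), defaultdict(int)
    for worker, item, label in annotations:
        if item in gold:
            totals[worker] += 1
            hits[worker] += int(label == gold[item])
    return {worker: hits[worker] / totals[worker] for worker in totals}

def review_rate(worker_accuracy, full_review_below=0.90, sample_rate=0.10):
    """Review every answer from low scorers; randomly sample the rest."""
    return 1.0 if worker_accuracy < full_review_below else sample_rate

for worker, acc in score_workers(annotations, gold).items():
    print(f"{worker}: accuracy={acc:.2f}, expert review rate={review_rate(acc):.0%}")
```

In this toy run, worker_b falls below the 0.90 threshold and gets full expert review, while worker_a drops to 10% random sampling.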
Iterating on data success
Once a computer vision team has successfully launched a high-quality training data pipeline, it can accelerate progress toward a production-ready model. Through ongoing support, optimization, and quality control, an external partner can help the team:
- Track velocity: To scale effectively, measure annotation throughput (a minimal sketch follows this list). How long does it take data to move through the process? Is the process getting faster?
- Tune worker training: As the project scales, labeling and quality requirements may evolve. This necessitates ongoing workforce training and scoring.
- Train on edge cases: Over time, training data should include more and more edge cases in order to make your model as accurate and robust as possible.
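For velocity, a minimal throughput-tracking sketch might look like the following; the timestamps are fabricated, and in practice they would come from the labeling platform’s task log:

```python
from datetime import datetime

# Hypothetical completion timestamps for annotated items.
completions = [
    datetime(2021, 6, 1, 9, 5), datetime(2021, 6, 1, 9, 20),
    datetime(2021, 6, 1, 10, 2), datetime(2021, 6, 2, 9, 15),
    datetime(2021, 6, 2, 9, 40), datetime(2021, 6, 2, 9, 55),
]

def throughput_per_day(completions):
    """Count annotations completed per calendar day to track velocity."""
    counts = {}
    for ts in completions:
        counts[ts.date()] = counts.get(ts.date(), 0) + 1
    return dict(sorted(counts.items()))

for day, n in throughput_per_day(completions).items():
    print(day, n, "items")
```

Comparing day-over-day counts shows whether throughput is rising; per-item latency could be tracked the same way from start and finish timestamps.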
Without high-quality training data, even the best funded, most ambitious ML/AI projects cannot succeed. Computer vision teams need partners and platforms they can trust to deliver the data quality they need and to power life-changing ML/AI models for the world.
Alegion is the proven partner to build the training data pipeline that will fuel your model throughout its lifecycle. Contact Alegion at [email protected].
This content was produced by Alegion. It was not written by MIT Technology Review’s editorial staff.
A nonprofit promised to preserve wildlife. Then it made millions claiming it could cut down trees
Clegern said the program’s safeguards prevent the problems identified by CarbonPlan.
California’s offsets are considered additional carbon reductions because the floor serves “as a conservative backstop,” Clegern said. Without it, he explained, many landowners could have logged to even lower levels in the absence of offsets.
Clegern added that the agency’s rules were adopted as a result of a lengthy process of debate and were upheld by the courts. A California Court of Appeal found the Air Resources Board had the discretion to use a standardized approach to evaluate whether projects were additional.
But the court did not make an independent determination about the effectiveness of the standard, and was “quite deferential to the agency’s judgment,” said Alice Kaswan, a law professor at the University of San Francisco School of Law, in an email.
California law requires the state’s cap-and-trade regulations to ensure that emissions reductions are “real, permanent, quantifiable, verifiable” and “in addition to any other greenhouse gas emission reduction that otherwise would occur.”
“If there’s new scientific information that suggests serious questions about the integrity of offsets, then, arguably, CARB has an ongoing duty to consider that information and revise their protocols accordingly,” Kaswan said. “The agency’s obligation is to implement the law, and the law requires additionality.”
On an early spring day, Lautzenheiser, the Audubon scientist, brought a reporter to a forest protected by the offset project. The trees here were mainly tall white pines mixed with hemlocks, maples and oaks. Lautzenheiser is usually the only human in this part of the woods, where he spends hours looking for rare plants or surveying stream salamanders.
The nonprofit’s planning documents acknowledge that the forests enrolled in California’s program were protected long before they began generating offsets: “A majority of the project area has been conserved and designated as high conservation value forest for many years with deliberate management focused on long-term natural resource conservation values.”
Meet Jennifer Daniel, the woman who decides what emoji we get to use
Emoji are now part of our language. If you’re like most people, you pepper your texts, Instagram posts, and TikTok videos with various little images to augment your words—maybe the syringe with a bit of blood dripping from it when you got your vaccination, the prayer (or high-fiving?) hands as a shortcut to “thank you,” a rosy-cheeked smiley face with jazz hands for a covid-safe hug from afar. Today’s emoji catalogue includes nearly 3,000 illustrations representing everything from emotions to food, natural phenomena, flags, and people at various stages of life.
Behind all those symbols is the Unicode Consortium, a nonprofit group of hardware and software companies aiming to make text and emoji readable and accessible to everyone. Part of their goal is to make languages look the same on all devices; a Japanese character should be typographically consistent across all media, for example. But Unicode is probably best known for being the gatekeeper of emoji: releasing them, standardizing them, and approving or rejecting new ones.
Jennifer Daniel is the first woman at the helm of the Emoji Subcommittee for the Unicode Consortium and a fierce advocate for inclusive, thoughtful emoji. She initially rose to prominence for introducing Mx. Claus, a gender-inclusive alternative to Santa and Mrs. Claus; a non-gendered person breastfeeding a non-gendered baby; and a masculine face wearing a bridal veil.
Now she’s on a mission to bring emoji to a post-pandemic future in which they are as broadly representative as possible. That means taking on an increasingly public role, whether it’s with her popular and delightfully nerdy Substack newsletter, What Would Jennifer Do? (in which she analyzes the design process for upcoming emoji), or inviting the general public to submit concerns about emoji and speak up if they aren’t representative or accurate.
“There isn’t a precedent here,” Daniel says of her job. And to Daniel, that’s exciting not just for her but for the future of human communication.
I spoke to her about how she sees her role and the future of emoji. The interview has been lightly edited and condensed.
What does it mean to chair the subcommittee on emoji? What do you do?
It’s not sexy. [laughs] A lot of it is managing volunteers [the committee is composed of volunteers who review applications and help in approval and design]. There’s a lot of paperwork. A lot of meetings. We meet twice a week.
I read a lot and talk to a lot of people. I recently talked to a gesture linguist to learn how people use their hands in different cultures. How do we make better hand-gesture emoji? If the image is no good or isn’t clear, it’s a dealbreaker. I’m constantly doing lots of research and consulting with different experts. I’ll be on the phone with a botanical garden about flowers, or a whale expert to get the whale emoji right, or a cardiovascular surgeon so we have the anatomy of the heart down.
There’s an old essay by Beatrice Warde about typography. She asked if a good typeface is a bedazzled crystal goblet or a transparent one. Some would say the ornate one because it’s so fancy, and others would say the transparent goblet because you can see and appreciate the wine. With emoji, I lean more toward the “transparent crystal goblet” philosophy.
Why should we care about how our emoji are designed?
My understanding is that 80% of communication is nonverbal. There’s a parallel in how we communicate. We text how we talk. It’s informal, it’s loose. You’re pausing to take a breath. Emoji are shared alongside words.
When emoji first came around, there was a misconception that they were ruining language. Learning a new language is really hard, and emoji is kind of like a new language, except it works with how you already communicate. It evolves as you evolve, the same way how you communicate and present yourself evolves. You can look at the nearly 3,000 emoji and see that their interpretation changes by age or gender or geographic area. When we talk to someone and make eye contact, we shift our body language, and that’s an emotional contagion. It builds empathy and connection. It gives you permission to reveal something about yourself. Emoji can do that, all in an image.
Product design gets an AI makeover
It’s a tall order, but one that Zapf says artificial intelligence (AI) technology can support by capturing the right data and guiding engineers through product design and development.
No wonder a November 2020 McKinsey survey reveals that more than half of organizations have adopted AI in at least one function, and 22% of respondents report at least 5% of their companywide earnings are attributable to AI. And in manufacturing, 71% of respondents have seen a 5% or more increase in revenue with AI adoption.
But that wasn’t always the case. Once “rarely used in product development,” AI has experienced an evolution over the past few years, Zapf says. Today, tech giants known for their innovations in AI, such as Google, IBM, and Amazon, “have set new standards for the use of AI in other processes,” such as engineering.
“AI is a promising and exploratory area that can significantly improve user experience for designing engineers, as well as gather relevant data in the development process for specific applications,” says Katrien Wyckaert, director of industry solutions for Siemens Industry Software.
The result is a growing appreciation for a technology that promises to simplify complex systems, get products to market faster, and drive product innovation.
Simplifying complex systems
A perfect example of AI’s power to overhaul product development is Renault. In response to increasing consumer demand, the French automaker is equipping a growing number of new vehicle models with an automated manual transmission (AMT)—a system that behaves like an automatic transmission but allows drivers to shift gears electronically using a push-button command.
AMTs are popular among consumers, but designing them can present formidable challenges. That’s because an AMT’s performance depends on the operation of three distinct subsystems: an electro-mechanical actuator that shifts the gears, electronic sensors that monitor vehicle status, and software embedded in the transmission control unit, which coordinates gear shifts with the engine. Because of this complexity, it can take up to a year of extensive trial and error to define the system’s functional requirements, design the actuator mechanics, develop the necessary software, and validate the overall system.
In an effort to streamline its AMT development process, Renault turned to Simcenter Amesim software from Siemens Digital Industries Software. The simulation technology relies on artificial neural networks, AI “learning” systems loosely modeled on the human brain. Engineers simply drag, drop, and connect icons to graphically create a model. When displayed on a screen as a sketch, the model illustrates the relationship between all the various elements of an AMT system. In turn, engineers can predict the behavior and performance of the AMT and make any necessary refinements early in the development cycle, avoiding late-stage problems and delays. In fact, by using a virtual engine and transmissions as stand-ins while developing hardware, Renault has managed to cut its AMT development time almost in half.
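The article doesn’t describe Simcenter Amesim’s internals, but the general technique, training a neural network on simulation runs so engineers can predict system behavior without rerunning the full simulation, can be sketched as below. The data, variable names, and model are synthetic stand-ins, not Renault’s or Siemens’ actual setup:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for logged simulation runs: inputs might be actuator
# force (N) and engine speed (rpm); the output, gear-shift time (s).
rng = np.random.default_rng(0)
X = rng.uniform([50, 1000], [200, 6000], size=(500, 2))
y = 0.002 * X[:, 1] / X[:, 0] + rng.normal(0, 0.005, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small neural network learns the input-to-output map from the runs and can
# then answer "what if" queries far faster than the simulation itself.
surrogate = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
)
surrogate.fit(X_train, y_train)

print(f"R^2 on held-out runs: {surrogate.score(X_test, y_test):.3f}")
print(f"Predicted shift time at 120 N, 3000 rpm: "
      f"{surrogate.predict([[120, 3000]])[0]:.3f} s")
```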
Speed without sacrificing quality
So, too, are emerging environmental standards prompting Renault to rely more heavily on AI. To comply with emerging carbon dioxide emissions standards, Renault has been working on the design and development of hybrid vehicles. But hybrid engines are far more complex to develop than those found in vehicles with a single energy source, such as a conventional car. That’s because hybrid engines require engineers to perform complex feats like balancing the power required from multiple energy sources, choosing from a multitude of architectures, and examining the impact of transmissions and cooling systems on a vehicle’s energy performance.
“To meet new environmental standards for a hybrid engine, we must completely rethink the architecture of gasoline engines,” says Vincent Talon, head of simulation at Renault. The problem, he adds, is that carefully examining “the dozens of different actuators that can influence the final results of fuel consumption and pollutant emissions” is a lengthy and complex process, made all the more difficult by rigid timelines.
“Today, we clearly don’t have the time to painstakingly evaluate various hybrid powertrain architectures,” says Talon. “Rather, we need to use an advanced methodology to manage this new complexity.”
For more on AI in industrial applications, visit www.siemens.com/artificialintelligence.
Download the full report.
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.