By Mark Brayan, CEO, Appen (APX:ASX)
The rise of autonomous vehicles and the connected car has created an enormous business opportunity. We have already seen significant investment in self-driving technology as vendors race to help citizens and businesses enjoy the benefits of fewer road accidents, traffic reduction, Mobility-as-a-Service (MaaS), and improved logistics and haulage services.
Artificial intelligence (AI) and machine learning (ML) are powering this auto industry transformation, changing the way companies build cars and reshaping how customers will think about and ultimately buy or use those cars.
However, to create the car of the future – with world-class AI, ultra-fast connectivity and environmental impact in mind – manufacturers must bring together an array of capabilities and processes for using massive amounts of high-quality data.
The data challenge for today’s building blocks of self-driving
For a car to “see,” “hear,” “understand,” “talk” and “think,” it needs video, image, audio, text, LIDAR and other sensor data to be correctly collected, structured and understood by its ML models. Today, teams working to make these capabilities a reality, whether for fully autonomous vehicles, driver-assistance features or anything in between, often have to work with multiple vendors and applications to collect and label all the data required to train those models effectively.
However, training data is complicated enough without having to connect several dozen data pipeline components and integrate dozens of APIs. Moreover, cars with autonomous and driver-assist capabilities must not only abide by strict national and regional regulations but also understand hundreds of languages and dialects, which makes the challenge far harder.
Putting all the building blocks together shouldn’t be this difficult.
The single-source approach to multi-modal AI needed for autonomous vehicles
In automotive AI, niche vendors work fine until developers need to automate and integrate different systems and data types. At that point, bottlenecks and data compatibility issues appear in both data collection and annotation. Building in-house systems creates its own problems as well. In addition to overcoming all the challenges related to rolling out AI, companies then need to allocate resources to build, maintain and improve non-core pieces of software.
Development teams that don’t have to struggle with integration obstacles or overcome data challenges related to the diversity of regulations and languages can bring their solutions to market faster. Running and combining multiple collection and annotation jobs already takes significant time. Doing this for 130 countries with more than 180 languages and dialects – each with its own datasets – will create processes that are very expensive and prone to latency and manual error.
By finding a single, proven, global source of reliable training data, developers can help reduce risks by automating multi-step projects, breaking complex tasks down into simple jobs, and routing them in a flexible fashion – while still coordinating and working in a single pipeline. The operational complexity of building and deploying world-class AI can be significantly reduced by performing 2D, 3D and audio annotations in a single process stream, especially if ML assistance can be applied to accelerate speech recognition, object and event detection, and complex annotation of LIDAR and radar data.
With a single source of data, development teams can focus on model building and training rather than on data collection, data preparation or maintaining custom software.
Cars that work for everyone
World-class AI has to work for everyone, in every market. This is why developers of self-driving vehicle capabilities must think beyond simple efficiencies, speed and cost. They must remove bias from data, so that AI recognizes everything and everyone equally. Top OEMs and Tier 1 automotive suppliers must ensure their customers are safe and understood by the cars they are using, no matter their ethnicity, gender, age or geography.
What’s more, leading companies must consider their supply chain impact. An ethical-first approach to data that relies on unbiased AI creates a positive global impact and offsets some of the disruption Level 5 self-driving will bring to the world.
Smart cars of the future
Companies building world-class AI for autonomous vehicles not only need to hire the best people for their teams, but also find the right partners to help them deliver a car that interacts well with everything and everyone, that keeps drivers and pedestrians safe, and that is built and improved using ethical data that is compliant with evolving privacy regulations (such as the GDPR and CCPA) and annotated by a global workforce that is treated fairly.
The smart car of the future will depend on AI, and AI is only as good as the data that powers it. This makes finding a single source of high-quality, reliable, ethical data a critical success factor.
At Appen, we bring our 25 years of training data experience and quality consistency to help accelerate self-driving capabilities. We have the annotation tools, including ML-assisted LiDAR, video, event and pixel-level labeling, as well as speech and natural language annotation, interconnected with Workflows to deliver higher productivity to a market racing to name a winner.
- Our 1M multi-lingual Crowd and ability to scale
- On-prem solutions, end-to-end managed services
- Workflows for multistep labeling for multimodal requirements and complex labeling tasks
- Ethical AI – GDPR, CCPA, Fair Pay pledge, diversity and inclusion at the core of our AI training data practices