Is it still worth joining the autonomous driving industry in 2026? What types of people have an edge?


05/21 2026

In 2026, the autonomous driving industry has entered a surreal phase. On one hand, news that the intelligent driving sector is in a knockout round keeps surfacing. On the other hand, the salaries companies offer to attract talent keep climbing to startling levels. Momenta offers interns a daily wage of 2,000 yuan, core intelligent driving positions at XPENG Motors can reach up to a million yuan in salary, and some high-end algorithm roles carry annual salaries of 800,000 to 2 million yuan. What exactly is happening in this sector? Is it still worth diving into now?

Intelligent Driving Frontier has always focused on sharing technical content about the autonomous driving industry and has rarely discussed the job market. Given the industry's ups and downs, however, we want to take up the topic this time, and we welcome your opinions in the comments section.

Does the change in technical architecture drive changes in job market demand?

In any industry, job roles are driven by industry needs. Since the emergence of the autonomous driving industry, its underlying technical architecture has undergone significant changes. Before diving into today's topic, let's first review the development of the autonomous driving industry. If you want to jump straight to today's main point, you can skip this section.

Rewinding to before 2023, the development approach for autonomous driving was entirely different from today. The past method involved breaking the system down into four independent modules: perception, prediction, planning, and control, each handled by a dedicated team, with data passed between modules through predefined interfaces. The perception module first identifies vehicles and pedestrians ahead, abstracts target information into rectangular boxes and coordinates, and passes this to the planning module; the planning module then uses these coordinates, combined with manually written rules, to determine whether to accelerate, decelerate, or turn. The core of this approach lies in the thousands of "if-then" logical rules written by engineers.

The biggest issue with this approach is information loss. The target boxes and coordinates output by the perception module have already discarded a vast amount of useful details (e.g., whether a pedestrian is looking back at the car or preparing to sprint), and once these subtle dynamics are simplified into a string of coordinates, the planning module has no way of knowing them. More troublingly, the complexity of the real world simply cannot be covered by manual enumeration, and as more patches are applied, the system only becomes increasingly cumbersome.
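The information loss described above can be made concrete with a toy sketch. Everything here is hypothetical and written for illustration only (the `Pedestrian` and `Box` classes, the rule thresholds, the field names); it is not any company's actual interface. Two pedestrians with very different intent collapse into the same box, so the rule-based planner cannot tell them apart.

```python
from dataclasses import dataclass


@dataclass
class Pedestrian:
    """Ground-truth scene object (hypothetical, for illustration only)."""
    distance_m: float        # distance ahead of the ego vehicle
    lateral_m: float         # lateral offset from the lane centre
    looking_at_car: bool     # detail the interface below cannot carry
    about_to_sprint: bool    # detail the interface below cannot carry


@dataclass
class Box:
    """Predefined perception -> planning interface: position only."""
    distance_m: float
    lateral_m: float


def perceive(p: Pedestrian) -> Box:
    # Perception abstracts the target into a box and coordinates.
    return Box(p.distance_m, p.lateral_m)


def plan(box: Box, speed_mps: float) -> str:
    # Hand-written if-then rules (thresholds are made up for this sketch).
    if abs(box.lateral_m) < 1.5 and box.distance_m < 2.0 * speed_mps:
        return "brake"
    if abs(box.lateral_m) < 3.0 and box.distance_m < 4.0 * speed_mps:
        return "decelerate"
    return "keep_speed"


calm = Pedestrian(30.0, 2.5, looking_at_car=True, about_to_sprint=False)
risky = Pedestrian(30.0, 2.5, looking_at_car=False, about_to_sprint=True)

decision_calm = plan(perceive(calm), speed_mps=10.0)
decision_risky = plan(perceive(risky), speed_mps=10.0)
print(decision_calm, decision_risky)
```

Both calls return the same decision because the intent fields never cross the interface. Covering the difference would take a wider interface plus yet another hand-written rule, which is exactly the patching cycle described above.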

Around 2023, the autonomous driving industry began to collectively shift toward a new technical approach: end-to-end systems. Compared to traditional modular frameworks, end-to-end uses a unified deep neural network to directly output steering angles, throttle, and brake commands from raw sensor data (primarily video streams from cameras). It no longer requires abstracting what is seen into rules and then making decisions based on those rules—this intermediate step is eliminated, with the entire process seamlessly handled within the neural network.


Tesla is a highly representative automaker in this regard. Its FSD v12 version replaced over 300,000 lines of C++ code with a unified neural network, no longer relying on manually defined logic but instead learning driving behavior from vast amounts of human driving data. The larger and higher-quality the training data, the closer the model's behavior comes to natural human driving habits (related reading: Tesla FSD V14.3: Letting AI Drive Directly?). This shift from rule-driven to data-driven is the most profound technical restructuring in the autonomous driving field in recent years.
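As a rough illustration of what "data-driven" means here, the sketch below learns a steering policy purely from demonstration pairs, with no boxes and no if-then rules in between. It is a deliberately tiny stand-in: the five-pixel "camera image", the `human_steer` labels, and the single linear layer are all assumptions made for this example, whereas production systems use deep networks trained on real video.

```python
import random

random.seed(0)

WIDTH = 5  # toy "camera image": one row of 5 pixels


def render(lane_pos: int) -> list[float]:
    # Raw sensor input: a bright pixel where the lane centre appears.
    return [1.0 if i == lane_pos else 0.0 for i in range(WIDTH)]


def human_steer(lane_pos: int) -> float:
    # Demonstration label: steer toward the lane centre (range -1..1).
    return (lane_pos - WIDTH // 2) / (WIDTH // 2)


# "Human driving data": (pixels, steering) pairs, no boxes or rules.
data = [(render(p), human_steer(p)) for p in range(WIDTH)]

# One linear layer standing in for the unified deep network.
weights = [0.0] * WIDTH
bias = 0.0


def policy(pixels: list[float]) -> float:
    # Direct mapping from raw pixels to a steering command.
    return sum(w * x for w, x in zip(weights, pixels)) + bias


# Behaviour cloning: stochastic gradient descent on the squared error
# between the policy's output and the human label.
for _ in range(2000):
    pixels, target = random.choice(data)
    error = policy(pixels) - target
    for i in range(WIDTH):
        weights[i] -= 0.1 * error * pixels[i]
    bias -= 0.1 * error

max_error = max(abs(policy(x) - y) for x, y in data)
print(f"max imitation error: {max_error:.6f}")
```

Nothing in the training loop encodes what a lane is. The behavior comes entirely from the demonstrations, which is why data volume and quality matter so much in this paradigm.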

From an architectural standpoint, end-to-end approaches mainly fall into two categories. One is the global approach, using a single, massive model to cover the entire process from perception to control, with a unified architecture but extremely high computational demands. The other is the segmented approach, which retains some modular structure, such as dividing processing into perception and planning stages, with smaller parameter sizes and easier deployment on vehicles. Both approaches have their pros and cons, and currently, companies in the industry are using both, but the core trend is consistent: models are becoming larger and more unified, while the portion involving manual, handcrafted design is being continuously reduced.

After entering 2025, new changes began to accelerate, with a new technical concept emerging in the industry: the VLA (Vision-Language-Action) model. Previously, the industry primarily focused on mapping vision to action, i.e., inputting images and outputting controls. VLA adds a language component in between. Why add language? Because language is the natural carrier for humans to express common sense and reasoning. A model that can simultaneously understand images, language, and actions can interpret written information on the road, understand the meanings of traffic police gestures, receive passenger instructions in natural language, and make driving decisions within a unified framework.


Li Auto is an active promoter of the VLA approach. Its end-to-end + VLM (Vision-Language Model) dual-system architecture, which entered mass production in 2024, is one of the earlier attempts in the industry to incorporate language understanding into mass-produced intelligent driving systems. XPENG Motors is equally aggressive: its second-generation VLA system, released in 2026, was first deployed on Robotaxi models and integrates highway intelligent driving, voice-controlled vehicle operation, and intelligent light signals across domains.

VLA not only enables vehicles to understand human speech but also solves a deeper issue: while end-to-end models excel in intuitive driving, they lack logical reasoning capabilities for complex scenarios. Language, as a carrier for reasoning, allows the system to first deduce at the conceptual level before taking action.

Another important direction developing in parallel with VLA is the world model. The core function of the world model is to understand the operating laws of the physical world. Through large-scale video data pre-training, it learns to predict how the physical world will evolve in the next few seconds—for example, if brake lights light up ahead, it knows a traffic jam or accident is likely to follow; if a cardboard box rolls onto the road, it can deduce which direction the box will roll. This capability is beyond what pure end-to-end imitation learning can achieve.

The practical applications of world models have three layers. Pre-training allows the model to form a foundational understanding of physical laws through vast driving data; simulation can generate various long-tail scenarios for the system to repeatedly rehearse in a virtual environment; reinforcement learning uses the world model as a virtual training ground, allowing the system to continuously learn through trial and error, autonomously exploring optimal driving strategies through reward and punishment mechanisms. At the 2026 Beijing Auto Show, Momenta's R7 reinforcement learning world model, Huawei's Qian Kun ADS 5, and QCraft's Chengfeng Max all adopted this combination of world models and reinforcement learning, indicating rapid convergence in technical roadmaps across the industry.
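The three layers can be sketched on a toy car-following problem. This is only an illustration under strong assumptions: the "world model" is a two-parameter least-squares fit rather than a video-pretrained network, and the final step is a crude search over braking thresholds standing in for reinforcement learning. All names and numbers (`true_step`, `model_step`, `imagined_return`, the penalties) are invented for this example.

```python
import random

random.seed(0)
DT, DECEL = 0.1, 6.0  # "real" physics, hidden from the learner


def true_step(gap, closing, brake):
    # gap: metres to the car ahead; closing: closing speed in m/s.
    return gap - closing * DT, max(0.0, closing - brake * DECEL * DT)


# 1) Pre-training: fit a dynamics model from logged driving transitions.
logs = []
for _ in range(500):
    gap = random.uniform(5.0, 50.0)
    closing = random.uniform(1.0, 15.0)
    brake = random.choice([0, 1])
    logs.append((gap, closing, brake) + true_step(gap, closing, brake))

# One-parameter least squares for each effect (no intercept term).
a = -sum((g2 - g) * c for g, c, u, g2, c2 in logs) / sum(c * c for g, c, u, g2, c2 in logs)
b = -sum((c2 - c) * u for g, c, u, g2, c2 in logs) / sum(u * u for g, c, u, g2, c2 in logs)


def model_step(gap, closing, brake):
    # Learned world model: predicts the next state without touching reality.
    return gap - a * closing, max(0.0, closing - b * brake)


# 2) Simulation: imagined rollout of a "brake below threshold" policy.
def imagined_return(threshold, gap=40.0, closing=12.0, steps=100):
    total = 0.0
    for _ in range(steps):
        brake = 1 if gap < threshold else 0
        gap, closing = model_step(gap, closing, brake)
        if gap <= 0.0:
            return total - 1000.0  # collision penalty
        total -= 0.1 * brake  # small comfort penalty for each braking step
    return total


# 3) Trial and error inside the model: keep the best-scoring policy.
candidates = [5.0, 10.0, 15.0, 20.0, 30.0]
best = max(candidates, key=imagined_return)
print("chosen braking threshold (m):", best)
```

Every crash in this sketch happens inside the learned model, never in the real environment. That is the appeal of using a world model as the virtual training ground: dangerous long-tail cases can be rehearsed as often as needed.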

The integration of world models, reinforcement learning, and end-to-end systems is upgrading autonomous driving from a pattern-matching system to one with a certain degree of understanding. Of course, this understanding is still fundamentally different from true human understanding, but it is far more reliable than pure rule-based matching or pure imitation learning when handling unseen complex scenarios. This is also the technical confidence behind the industry's bold discussions about L3 or even L4 mass production deployment.

Salaries are rising, but the money only flows to one type of person

The restructuring of technical architectures is also reshaping the talent market. At Intelligent Driving Frontier, we believe that the autonomous driving industry in 2026 is not short of people; it is short of specific types of people. Data shows that China's total talent gap in intelligent connected vehicles is as high as 680,000, with a supply-demand ratio of only 0.38 for intelligent driving engineers, meaning roughly one suitable candidate for every three positions in the market. This is not a sign of industry contraction but rather a structural supply shortage.

However, a large gap doesn't mean just anyone can get in. Data from Liepin for the first quarter of 2026 shows that roles requiring a master's degree or higher in the new energy vehicle industry grew by 67.65% year-on-year, while the proportion of roles requiring a college degree or below dropped by 22.79% year-on-year. Clearly, the bar is rising significantly. Additionally, hiring for pure application-layer software development roles plummeted by 74% year-on-year in 2026, reflecting the current reality: as the industry shifts from writing if-then rules to training large models, the demand for maintaining intermediate code through sheer manpower is disappearing.

So, where has the demand for roles gone? To the very top of the technical chain. From January to April 2026, the average monthly salary for AI scientists/leaders reached 132,800 yuan, making it the only role to break the 100,000-yuan monthly salary mark. The supply-demand ratio for navigation algorithms in intelligent driving dropped from 0.84 to 0.46, while planning and control algorithms stood at 0.64 and simulation application engineers at 0.58. Meanwhile, salary increases for core technical talent during job-hopping generally ranged from 15% to 25%, with some top roles seeing increases exceeding 30%.
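To read these ratios, it helps to invert them. The short calculation below assumes the supply-demand ratio means candidates divided by open positions, which is how the article interprets the 0.38 figure; the function name is made up for this sketch.

```python
def positions_per_candidate(supply_demand_ratio: float) -> float:
    # Assumed definition: ratio = candidates / open positions,
    # so the inverse is open positions per available candidate.
    return 1.0 / supply_demand_ratio


# Ratios quoted in the article.
ratios = {
    "intelligent driving engineers": 0.38,
    "navigation algorithms": 0.46,
    "simulation application engineers": 0.58,
    "planning and control algorithms": 0.64,
}

for role, ratio in ratios.items():
    print(f"{role}: {positions_per_candidate(ratio):.2f} positions per candidate")
```

Under that assumption, a ratio of 0.38 works out to about 2.6 open positions per candidate, which is where "one candidate for every three positions" comes from after rounding.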

Why is there such extreme differentiation? The reason lies in how changes in technical roadmaps directly reshape job demands. As end-to-end large models replace traditional modular architectures, the intermediate steps that once required large numbers of engineers to manually maintain are now automatically absorbed by neural networks. The industry no longer needs as many people to write if-then rules; instead, it urgently needs those who can design, train, and optimize models.

So, after all this discussion, which directions in the autonomous driving industry are most in demand currently? Perception algorithm engineers are required to have a deep understanding of Transformer architectures and BEV perception technology, proficiency in PyTorch, and solid project experience in processing visual and LiDAR data. Planning and control algorithm engineers need to understand the mapping from perception features to control commands and be familiar with the basics of imitation learning and reinforcement learning. End-to-end/VLA algorithm engineers face the highest barriers, requiring candidates to have multimodal model design capabilities, large-scale data training experience, and on-vehicle deployment optimization experience. Talent in world model directions is also becoming a new scarce resource, especially engineers with compound skills in both model training and simulation system development.

As industry architectures change, so do the skill requirements for talent. In the traditional era, proficiency in C++ was nearly all that mattered for intelligent driving roles, as almost all mass-production code was written in C++. Today, however, hiring demands increasingly emphasize dual-language proficiency in C++ and Python—Python for algorithm prototyping and model training, C++ for on-vehicle engineering deployment. At the same time, familiarity with deep learning frameworks and a deep understanding of Transformer architecture principles have shifted from being bonus points to standard requirements. Additionally, the industry's demand for engineering capabilities is rising overall, with skills like understanding functional safety standards (ISO 26262), large-scale software system architecture design, and on-device inference optimization carrying significant weight in high-end roles.

Thus, the talent demand in the autonomous driving industry in 2026 can be summed up in one sentence: high-end algorithm roles are undergoing an arms race with no budget limits, while low-end software roles are being rapidly replaced by large models and automated tools. If you plan to enter this industry, the most important thing is to figure out which end of the chain you're aiming for.

The technical roadmap is still in flux—is this the window of opportunity?

Whether an industry is worth entering depends not just on money but also on whether it has settled into a fixed mold. When an industry's technical standards become rigid and everyone is doing roughly the same thing, opportunities for latecomers shrink significantly. The autonomous driving industry is clearly not at that stage yet. In 2026, the biggest technical highlight in the autonomous driving industry is that internal debates over technical roadmaps have not only failed to subside but have become even more public.

At NVIDIA's GTC conference this year, Li Chuanhai, CTO of Geely Automobile, and Cao Xudong, CEO of Momenta, publicly questioned the VLA approach, arguing that it merely matches standard answers and lacks true understanding of the physical world's laws. Meanwhile, Jin Yuzhi, CEO of Huawei's Automotive BU, opposed the practice of distilling cloud-based large VLA models into smaller on-vehicle models, arguing that model hallucinations could be fatal in driving scenarios. On the other side, companies like Yuanrong Qixing and XPENG Motors are doubling down on VLA, with XPENG releasing its second-generation VLA system in 2026 and claiming a 62% reduction in extreme-scenario takeover rates.

Yu Kai, CEO of Horizon Robotics, put it bluntly at the XuanYuan Automotive Blue Book Forum: "VLA, world models, end-to-end—often, commercial packaging outweighs technical substance. The top players truly in the game don't differ fundamentally in their technical roadmaps." But his words also highlight a key fact: the debate over autonomous driving roadmaps itself means there is no standard answer yet.

For practitioners, the lack of convergence in technical roadmaps means two things. On one hand, the technical direction you choose today might not be mainstream in two or three years, requiring continuous adaptation. On the other hand, precisely because there is no standard answer, newcomers still have opportunities to help shape technical roadmaps rather than merely acting as executors of established solutions. This window-of-opportunity state is relatively rare in the history of tech industry development.

Roles are changing, and so are skill requirements

If you look at the job descriptions for NIO, XPENG, and other automakers' 2026 campus recruitment drives, you'll notice a significant change: traditional role classifications are blurring. NIO's VLA algorithm engineer position lists keywords like LLM, VLM, end-to-end autonomous driving, and world models in its job responsibilities. XPENG's reinforcement learning algorithm engineer role requires candidates to be familiar with both deep learning and reinforcement learning algorithms and to have dual-language proficiency in Python and C++.


This also illustrates a trend: roles are converging. The modular era, where perception, planning, and control operated in silos, is over. Today, the industry needs people who understand the entire chain from sensor data to control commands.

Additionally, it's worth noting that high-paying roles are highly concentrated geographically and academically. Shanghai, Shenzhen, and Guangzhou are currently the biggest hubs for intelligent driving talent, with Beijing and Chongqing rapidly gaining appeal, while industrial powerhouses like Hefei, Wuhan, and Xi'an form self-contained regional ecosystems. If you're not in these cities, salary levels and job density will be noticeably lower. Regarding academic requirements, core algorithm roles at leading companies still primarily require a master's degree or higher, but the industry now places greater emphasis on actual project experience and engineering implementation capabilities rather than just paper publication counts.

Even if you leave intelligent driving, this path won't be for nothing

When discussing whether it's worth entering, we must consider not just upward mobility but also outward exit options. An industry's resilience largely depends on whether the skills you accumulate can be transferred elsewhere.

Autonomous driving has a prominent feature in this regard: its technology stack overlaps heavily with embodied intelligence, currently the hottest field. BEV perception, end-to-end models, reinforcement learning, world models, and simulation training are all capabilities that, once accumulated in intelligent driving, can transfer almost seamlessly into the robotics sector. From January to April 2026, the recruitment index in the embodied intelligence field reached 579, up from 36 during the same period last year (about 16 times the earlier level), with the average monthly salary for positions rising to RMB 61,625.

Beyond data, the actual technological integration matters more. When releasing its MindVLA-o1 system, Li Auto explicitly stated that this unified model can not only control vehicles but also extend to robotics. Xiaomi's MiMo-Embodied model, released in 2025, is the industry's first cross-domain foundational model that bridges autonomous driving and embodied intelligence. This signifies that the two fields are gradually converging at the underlying technological level.

Data on talent mobility also supports this trend. From 2023 to 2024, the number of technical staff moving from autonomous driving to embodied intelligence increased by 78% year-on-year. Many people around you may already have made the move. The career paths of leading entrepreneurs further illustrate this point: Chen Yilun, former Chief Scientist of Huawei's Automotive BU, and Li Zhenyu, former head of Baidu's Intelligent Driving Group, co-founded the embodied intelligence company Tasai Zhihang, securing $120 million in angel-round financing. Zhipingfang completed seven rounds of multi-million-dollar financing within six months, with its founding team bringing engineering experience, data iteration systems, and supply chain management capabilities accumulated in the automotive industry into the robotics field.

This combination of transferable technology and lateral talent mobility provides a greater safety margin for entering the autonomous driving industry. You don't need to view this field as a lifelong commitment. You can spend two years here building experience in perception fusion or reinforcement learning, and if you ever feel burnt out or believe the industry is declining, a step toward robotics means you won't have to restart your experience or salary from scratch.

A few suggestions if you're seriously considering entering the field

If you're serious about joining the autonomous driving industry, let's clarify the foundational requirements. Based on 2026 recruitment standards, a master's degree isn't an absolute requirement for core algorithm or system roles, but leading companies still predominantly hire master's degree holders or above for core algorithm positions. The most relevant academic backgrounds are computer science, automation, artificial intelligence, and electronic engineering. For campus recruits, starting salaries for master's graduates in core algorithm roles at leading companies generally exceed RMB 300,000 annually, with higher salaries for PhDs.

For campus recruitment, the key competitive factors are solid fundamentals and demonstrable project experience. You must master Python and C++: Python for algorithm prototyping and model training, and C++ for engineering deployment. Lacking either will hinder your job search. Regarding deep learning frameworks, PyTorch has become nearly ubiquitous in intelligent driving, and proficiency directly impacts interview competitiveness. For project experience, demonstrating a complete end-to-end perception or planning project—even a university research project—holds far more value than scattered minor projects on your resume.

The situation for experienced hires is more complex. If you come from traditional automakers or Tier 1 suppliers, the biggest challenge isn't technical expertise but mindset transformation: shifting from requirements-driven development to iterative improvement amid uncertainty, and from following established processes to proactively defining solutions. However, such professionals have unique advantages: their understanding of automotive industry processes, functional safety, and mass production scalability—which internet-trained algorithm engineers typically lack—makes them more competitive for roles like functional safety engineers and system integration engineers.

If you transition from the internet industry, you likely already have strong Python skills and model training experience. The most common stumbling blocks are vehicle-side constraints: models must not be too large, inference must not be too slow, and safety cannot be compromised. Acquiring expertise in edge inference optimization, model quantization, and C++ performance tuning will significantly enhance your negotiating power in the intelligent driving industry.
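For readers coming from cloud-side training, model quantization is a good first example of these vehicle-side constraints. The sketch below shows symmetric per-tensor int8 quantization in plain Python. It is a minimal illustration of the idea, not any toolchain's actual API; the function names and sample weights are made up, and real deployments typically add calibration data and per-channel scales.

```python
def quantize_int8(weights):
    # Symmetric per-tensor quantization: map [-max_abs, max_abs]
    # onto the integer range [-127, 127] with a single scale factor.
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    # Recover approximate float weights from the stored integers.
    return [v * scale for v in q]


weights = [0.82, -1.27, 0.003, 0.45, -0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("max round-trip error:", max_err)
```

Each weight shrinks from a 4-byte float32 to a 1-byte integer, at the cost of a rounding error of at most half the scale. Note how the small weight 0.003 rounds to zero: checking whether that kind of loss is acceptable for a safety-critical model is the part that takes real engineering judgment.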


Finally, regardless of your background, if you lack direct experience in this field, a practical suggestion is to read technical blogs from leading companies and industry publications (such as Intelligent Driving Frontier) before deciding on your direction. Understand the industry's current key concerns, from VLA models to world models to simulation data generation, and use these topics as clues to gradually build your knowledge framework. Information flows freely in this industry, so almost nothing has to be learned in isolation. The key lies in identifying the right direction and maintaining sustained, in-depth engagement.

The most critical requirement for entering autonomous driving in 2026 isn't what you've already learned, but what you can continuously learn. Technical architectures, job definitions, and capability boundaries are all evolving—the only constant is change itself. If you're mentally prepared for this and possess sufficient interest and patience in technology, this field remains highly worth entering.

-- END --

Solemnly declare: the copyright of this article belongs to the original author. The reprinted article is only for the purpose of spreading more information. If the author's information is marked incorrectly, please contact us immediately to modify or delete it. Thank you.
