Self-driving cars have been a topic of intense discussion and development for several years now. While many companies focus on a range of sensors like LIDAR, radar, and ultrasonic sensors, there are companies that insist on a vision-only approach.
What is the LIDAR approach?
Traditionally, autonomous vehicles have been equipped with an array of expensive sensors. LIDAR, short for Light Detection and Ranging, uses lasers to create a 3D map of the car’s surroundings. Radar helps in detecting objects’ speed and distance, and ultrasonic sensors aid in close-range detection. These systems often require a lot of computational power and can significantly drive up the cost of the vehicle.
What is the vision approach?
A vision-only approach relies mainly on cameras, coupled with advanced machine learning algorithms, to give the vehicle its “sense” of direction. Cameras serve as the eyes of the car, capturing images that are then processed in real time by the vehicle’s onboard computer. This setup significantly reduces costs and computational requirements.
Tesla is famous for its efforts to crack self-driving using a vision-only approach, but there is also a startup called comma AI that develops an open source vision-based system called openpilot and sells the hardware to run it on, the latest iteration of which is the comma 3X.
How does the computer understand how to drive a car?
By now, many of us are familiar with OpenAI’s ChatGPT, a generative AI tool trained on a huge amount of data to help it understand the world. So how does an AI know how to drive a car?
The answer is simple – by watching videos on how a car is driven.
Through this process, the AI driving model learns to recognise patterns and distinguish between a pedestrian, a cyclist, another vehicle, or even road signs and signals.
Once the machine learning model understands the environment, it makes real-time driving decisions. Should the car slow down because a pedestrian is crossing? Should it speed up to merge onto a highway? All of these decisions are made based on the continuous input of visual data and the patterns the model has learned.
For humans to get better at driving, they need more driving experience. The same goes for the AI model: it improves over time as it is fed more good-quality data.
Where do Comma and Tesla get their driving videos?
Comma AI gathers its training data primarily through crowd-sourcing. When users install Comma AI’s hardware into their cars, the devices collect video data as well as other sensor information while driving. This data is then anonymized and used to improve the machine learning model.
This approach has two advantages. First, it allows the system to be trained on a diverse range of real-world driving conditions, including different types of roads, traffic situations, and even varying weather conditions. Second, the data is continually updated and expanded, which means the model is continuously refined as more users participate and share their data.
Tesla also collects data from its fleet of vehicles. The video feed from the cameras installed all around a Tesla is used to train Tesla’s FSD AI.
Of course, it’s not just the video feed that’s needed. Video teaches the AI how a car should behave on the road, but not how to actually control the car. So driving videos are collected along with associated data like steering angles, acceleration, and braking input from human drivers. These serve as the “training examples.”
For example, based on a human driver’s acceleration curve, the AI can learn how to accelerate the car comfortably. Without this, the driving would be very jerky, with abrupt acceleration and braking, just like a learner driver’s.
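To make the idea of a “training example” a little more concrete, here is a rough Python sketch. It is only an illustration: the field names (frame_id, steering_angle_deg, accel_mps2) and the max_jerk helper are made up for this article and are not taken from Tesla’s or comma’s actual systems. It pairs each video frame with what the human driver was doing, then measures how abruptly the acceleration changes, which is one simple way to tell smooth driving from jerky driving.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrainingExample:
    """One hypothetical record: a video frame plus the driver's inputs."""
    frame_id: int              # which video frame this record belongs to
    steering_angle_deg: float  # steering wheel angle chosen by the human
    accel_mps2: float          # acceleration in m/s^2 (negative = braking)


def max_jerk(examples: List[TrainingExample], dt: float) -> float:
    """Largest change in acceleration per second between consecutive frames."""
    return max(
        abs(b.accel_mps2 - a.accel_mps2) / dt
        for a, b in zip(examples, examples[1:])
    )


# A driver who builds up acceleration gradually...
smooth = [TrainingExample(i, 0.0, a) for i, a in enumerate([0.0, 0.5, 1.0, 1.5, 2.0])]
# ...versus one who stomps on the pedal.
abrupt = [TrainingExample(i, 0.0, a) for i, a in enumerate([0.0, 0.0, 2.0, 2.0, 2.0])]

smooth_jerk = max_jerk(smooth, dt=0.5)  # frames assumed 0.5 s apart
abrupt_jerk = max_jerk(abrupt, dt=0.5)
print(smooth_jerk, abrupt_jerk)
```

The second driver’s number comes out several times higher. A model trained mostly on examples like the first driver’s would have a much better chance of learning a comfortable acceleration curve.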
So the AI is like a child that we are teaching how to drive?
Yes. Imagine teaching a child how to recognize a red traffic light and understand what it means. You would show the child many pictures of red lights and say, “When you see this, you stop.” Over time, the child learns to associate the red light with the action of stopping. This process is similar to how machine learning trains a self-driving car.
First, the machine learning system is shown thousands of images of different traffic lights—red, green, yellow—alongside other road conditions. The system is told which color the light is displaying in each image. Over time, the system learns to identify traffic lights and their colors accurately, just like a child would.
Next, the system needs to learn what to do when it sees each color. It is fed more images, but this time, each image also comes with information about what the car did or should do: stop, go, or slow down. So, when it sees a red light in an image, the information that comes with it would be ‘stop.’
After looking at many images, the computer starts getting the hang of it. When it sees a red light, it knows that it’s time to stop. It has learned this through lots and lots of examples, sort of like practicing over and over again.
The same principle applies to recognizing pedestrians. The system is shown many pictures of people walking, standing, or running near the road. It learns that when it sees a shape that looks like a person, it needs to be cautious and prepared to stop.
After all this learning, the system is tested to make sure it understands correctly. It’s shown new pictures it hasn’t seen before to check if it knows what to do. If it passes these tests, it’s ready to be used in a real car to help it drive by itself.
In this way, the machine learning system gets better at understanding what different road signs, traffic lights, and objects like pedestrians mean and what actions it should take in response.
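The learn-then-test process described above can be shown with a toy Python example. This is a deliberately simplified sketch, not how Tesla or comma actually do it: each “image” is reduced to a single average colour, the colour values are invented for illustration, and the “model” just remembers the average colour for each action (a technique known as nearest-centroid classification). Real systems use neural networks on full camera frames.

```python
from collections import defaultdict

# Labelled examples: (average R, G, B colour of the light, what the car should do).
TRAIN = [
    ((250, 20, 30), "stop"), ((230, 40, 20), "stop"),
    ((250, 210, 40), "slow down"), ((240, 190, 20), "slow down"),
    ((30, 220, 60), "go"), ((50, 240, 90), "go"),
]


def train(examples):
    """Average the colours seen for each action (a nearest-centroid model)."""
    sums = defaultdict(lambda: [0, 0, 0])
    counts = defaultdict(int)
    for colour, action in examples:
        for i, channel in enumerate(colour):
            sums[action][i] += channel
        counts[action] += 1
    return {a: tuple(s / counts[a] for s in sums[a]) for a in sums}


def predict(model, colour):
    """Pick the action whose average colour is closest to the new colour."""
    def dist(centroid):
        return sum((c - x) ** 2 for c, x in zip(centroid, colour))
    return min(model, key=lambda action: dist(model[action]))


model = train(TRAIN)

# New colours the model has never seen, used only for testing.
TEST = [((240, 30, 40), "stop"), ((245, 200, 30), "slow down"), ((40, 230, 70), "go")]
accuracy = sum(predict(model, c) == a for c, a in TEST) / len(TEST)
print(accuracy)
```

The important part is the split between TRAIN and TEST: the model is only judged on examples it did not learn from, which mirrors the “new pictures it hasn’t seen before” step.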
What is the current state of AI driving?
Here are some videos that show what FSD V12 is currently capable of. FSD V12 has not yet been released to consumers as a beta, but Elon Musk gave a live demo of it on public roads in a livestream on X a few days ago.
Below is what Comma AI is capable of.
What will these AI driving systems cost and when will they be available?
Tesla’s FSD V11 beta can be purchased with a Model 3 or Model Y in Malaysia for RM32,000. But note that since the beta doesn’t work in Malaysia, you would essentially be parking your money with Tesla, giving them an interest-free loan until it eventually becomes available.
As for Comma AI, you can actually buy the Comma 3X right now and run openpilot on it. Of course, it doesn’t support all cars, so you have to check the list of supported cars (over 250 currently) to see if you can use it. The Hyundai Ioniq 5 is one example of a car available in Malaysia that the Comma 3X supports.
This video will show you how the Comma 3X is installed in your car, in case you are curious as to how it works. It’s basically just like installing a dashcam and plugging it into the OBD port.
The Comma 3X costs USD1,250, plus USD200 for a car-specific harness and USD30 for international shipping. That’s USD1,480 in total, or under RM7,000. Costs have come down quite significantly compared to previous versions of Comma.
Now if you think these prices are expensive, wait till you hear how much LIDAR-based systems cost…
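If you want to redo the sums yourself, here is a quick Python sketch. The USD prices are the ones quoted above, but the exchange rate is purely an assumed figure for illustration, so swap in the current rate before relying on the ringgit total.

```python
comma_3x_usd = 1250
harness_usd = 200
shipping_usd = 30
total_usd = comma_3x_usd + harness_usd + shipping_usd

# Assumed exchange rate for illustration only; check the current rate.
assumed_myr_per_usd = 4.65
total_myr = total_usd * assumed_myr_per_usd

print(f"USD{total_usd:,} is roughly RM{total_myr:,.0f}")
```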
Would you spend money on AI driving capabilities for your car? Share your thoughts in the comments.