Many times I have read the argument, which you reiterate here, that "for a safe system you need LiDAR" or "you cannot have a safe system with just a camera". I just don't understand why people connect "safety" with "LiDAR". The safety of a system (Tesla FSD, Waymo, whatever) should be evaluated by a series of standardized tests: basically, you put the AI through a driving test, similar to how people get a driving license. That has nothing to do with the underlying technology. (Personally I believe that LiDAR plus cameras and cameras alone will both work fine, obviously with a machine learning core rather than a rule-based system.)
From where we stand today, the idea that LiDAR is safer comes from two observations:
1) FSD still does not offer Level 4 autonomous driving the way Waymo does in SF.
2) There have been more legal issues with FSD, and more safety issues related to crashes. This is also confounded by the fact that there are far more Teslas running FSD than there are Waymos, so more mishaps are likely in absolute terms.
I try to make the case in the articles that a pure-vision self-driving system can actually be safer if it is sufficiently trained, where the bet is on compute and data, compared to LiDAR-based perception modeled on human experience.
We don't seem to be fully there yet with camera-only driving, but that doesn't mean it isn't possible, or even safer. I have no idea what the "right" approach will turn out to be. Or maybe both can coexist.
Vision may not be the perfect solution to the problem; it's just the one humans use, given the limits of human perception. One could argue that sensing the world in other ways, vision plus LiDAR included, while keeping the same data-driven approach, will end up being superior. Using LiDAR doesn't force the system to be rules-based by default.
Yup. Could very well be. LiDARs just need to become cost-competitive.
It's not clear that this is just a perception and classification problem. Humans don't only classify the scene; they relate it to past experience driving that route, or similar routes and combinations of elements. This suggests that a true FSD solution might use a RAG scheme to retrieve similar environments and their associated solutions, rather than trying to pack all knowledge into the LLM.
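For what it's worth, here is a minimal sketch of what that retrieval step could look like. All names (SceneStore, embed_scene) are hypothetical and not taken from any real self-driving stack; the point is just to show the idea of embedding the current scene, pulling the most similar past scenes, and handing their associated maneuvers to the planner as extra context instead of relying on model weights alone.

```python
# Hypothetical RAG-style lookup over past driving scenes (illustrative only).
import numpy as np

class SceneStore:
    """Stores embeddings of past driving scenes plus the maneuver taken there."""
    def __init__(self):
        self.embeddings = []   # unit-normalized scene embedding vectors
        self.maneuvers = []    # e.g. "yield to tram", "unprotected left turn"

    def add(self, embedding, maneuver):
        self.embeddings.append(embedding / np.linalg.norm(embedding))
        self.maneuvers.append(maneuver)

    def retrieve(self, query, k=3):
        """Return the k most similar past scenes by cosine similarity."""
        q = query / np.linalg.norm(query)
        sims = np.array([q @ e for e in self.embeddings])
        top = np.argsort(-sims)[:k]
        return [(self.maneuvers[i], float(sims[i])) for i in top]

# Usage sketch: embed_scene is an assumed vision encoder for the current frame.
# context = store.retrieve(embed_scene(current_frame), k=3)
# The retrieved (maneuver, similarity) pairs would condition the planner.
```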
If you want to see what RAG is like, try Perplexity, which uses a search engine to prefill the context associated with your conversation. Try it on task-oriented questions and compare it to the underlying LLM (you can set Perplexity to use different LLMs). To me it feels more insightful and stays on track better. Maybe driving should borrow from that.
Yes, it definitely seems like RAG should be a part of self-driving. Even for humans, it's not simple to adapt to left- versus right-hand driving, or to driving in a new geographical location versus your daily commute.
To me, baking everything into an LLM by powering it with a nuclear reactor still feels weird. Humans always specialize. So should machines. RAG seems to be the way in general, imo.