It takes decades: The above examples show that it often takes decades to move up the edge-case maturity ladder.

Similarly, for previous waves of digital innovation such as web and cloud, we can use their uptime as a signal for maturity. For AI, the F1 score can be a useful approximation for maturity. It is clear from the above examples that an F1 of 65% is easily achievable by today’s AI, but how far away are we from an F1 of six nines?

A roadmap to hi-fi

As discussed earlier, maturity and market readiness for any technology are tied to how well it handles edge cases.

We estimate that a robo-taxi must achieve over 99.9999% precision and 99.9999% recall in detecting red lights in order to be on par with humans. We devised a method to estimate the level of AI accuracy needed to achieve parity between autonomy and human drivers, taking into account current intersection collision rates and other factors. Both blowing a red light (false negative) and unexpectedly braking at a green (false positive) carry a high risk of collision. When a robo-taxi decides whether to cross at a traffic light, it is making a time-sensitive safety decision.

This is an adequate score, because high precision makes for a great user experience and low user churn, whereas low recall is not noticed by users. If Spotify plays songs you like 95% of the time (precision), but only surfaces half of the songs you like (recall of 50%), its F1 would be about 65%.

Let’s calculate the F1 score for two applications:

By our estimate, some of the best AI today performs at a rate of 99%, though a score above 90% is generally considered high. An F1 of 100% represents a perfectly error-free AI that handles all edge cases.
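The F1 figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the variable names are illustrative, not taken from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when the model gets nothing right.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Music recommendation example from the text: 95% precision, 50% recall.
spotify_f1 = f1_score(0.95, 0.50)

# Robo-taxi target from the text: six nines for both precision and recall.
robotaxi_f1 = f1_score(0.999999, 0.999999)

print(f"Spotify F1:   {spotify_f1:.1%}")
print(f"Robo-taxi F1: {robotaxi_f1:.4%}")
```

Because the harmonic mean is pulled toward the lower of the two inputs, a recall of 50% holds the first score near 65% despite the high precision, while equal precision and recall of six nines give an F1 of six nines.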
Products can be designed to incorporate human assistance at opportune moments, whether by the user or by support staff, to achieve their desired levels of both precision and recall.

A popular metric for evaluating AI reliability is the F1 score, which is the harmonic mean of precision and recall, thus accounting for both false positives and false negatives.

Lo-fi + humans = hi-fi: Safety use cases aside, it is often possible to achieve hi-fi performance by combining artificial and human intelligence.
This is where many autonomous car use cases tend to be focused.
We use the single most important measure of maturity in any technology: its ability to manage unforeseen events, commonly known as edge cases.
There’s a simple framework for differentiating near-term reality from science fiction.