Saturday, February 22, 2025
HomeTechnologySakana walks again claims that its AI can dramatically pace up mannequin...

Sakana walks again claims that its AI can dramatically pace up mannequin coaching


This week, Sakana AI, a Nvidia-backed startup that’s raised a whole lot of tens of millions of {dollars} from VC corporations, made a outstanding declare. The corporate mentioned it had created an AI system, the AI CUDA Engineer, that might successfully pace up the coaching of sure AI fashions by an element of as much as 100x.

The one downside is, the system didn’t work.

Customers on X shortly found that Sakana’s system really resulted in worse-than-average mannequin coaching efficiency. In line with one consumer, Sakana’s AI resulted in a 3x slowdown — not a speedup.

What went mistaken? A bug within the code, in accordance with a publish by Lucas Beyer, a member of the technical employees at OpenAI.

“Their orig code is mistaken in [a] delicate manner,” Beyer wrote on X. “The actual fact they run benchmarking TWICE with wildly totally different outcomes ought to make them cease and assume.”

In a postmortem printed Friday, Sakana admitted that the system has discovered a technique to — as Sakana described it — “cheat” and blamed the programs’s tendency to “reward hack” — i.e. determine flaws to realize excessive metrics with out carrying out the specified purpose (rushing up mannequin coaching). Related phenomena has been noticed in AI that’s skilled to play video games of chess.

In line with Sakana, the system discovered exploits within the analysis code that the corporate was utilizing that allowed it to bypass validations for accuracy, amongst different checks. Sakana says it has addressed the problem, and that it intends to revise its claims in up to date supplies.

“We’ve got since made the analysis and runtime profiling harness extra strong to eradicate a lot of such [sic] loopholes,” the corporate wrote in an X publish. “We’re within the technique of revising our paper, and our outcomes, to replicate and talk about the results […] We deeply apologize for our oversight to our readers. We’ll present a revision of this work quickly, and talk about our learnings.”

Props to Sakana for proudly owning as much as the error. However the episode is an efficient reminder that if a declare sounds too good to be true, particularly in AI, it in all probability is.

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular