Do you know the game "Chinese whispers" (испорченный телефон)? I suggest a benchmark for code-editing AI models based on the same idea: the more transformations a piece of code can survive, the better the model. Take a simple Java algorithm, give it to Claude, and ask to improve it. Then take the result and repeat the process. Eventually, the algorithm will be broken. The question is: how soon?
>>Click here to continue<<