If you know even a little about how an LLM works it’s obvious why regex is basically impossible for it. I suspect perl has similar problems, but no one is capable of actually validating that.
They aren’t context aware, it’s using statistical probability. It can replicate things it’s seen a lot of like a tutorial regex. It can’t apply that to make a more complicated one. Regex in the wild isn’t really standard at all, because it’s rarely used to solve common problems. It has a bunch of random regexs from code it analyzed and will spit something out that looks similar.
As it learns from our data, no wonder it fucks up at regexps. They are the arcane knowledge not accessible to us mere mortals, nor to LLMs.
If you know even a little about how an LLM works it’s obvious why regex is basically impossible for it. I suspect perl has similar problems, but no one is capable of actually validating that.
What do you mean it’s impossible for it? I know how LLMs work but I don’t know if any such limitations
They aren’t context aware, it’s using statistical probability. It can replicate things it’s seen a lot of like a tutorial regex. It can’t apply that to make a more complicated one. Regex in the wild isn’t really standard at all, because it’s rarely used to solve common problems. It has a bunch of random regexs from code it analyzed and will spit something out that looks similar.