Over just a few months, ChatGPT went from correctly answering a simple math problem 98% of the time to just 2%, study finds. Researchers found wild fluctuations—called drift—in the technology’s abi...

L4sBot · 1 year ago

@[email protected] · edit-2 10 months ago

deleted by creator

@[email protected] · 11 months ago

You wildly overestimate the competency of management and the capital owners they answer to.

I guarantee a significant % of entities will grow dependent on AI well before it’s dependable. The profit motive will be too high (source: the frequent failure that is outsourcing).

@[email protected] · 11 months ago

This is spot on. Source: 10+ years at F500 companies.

Senior management and/or board members read one article in Forbes, or some other “business” publication, and think that they know everything they need to know about an emerging technology. Risk management is either a ☑ exercise or extremely limited in scope, usually only including threats that have already been observed and addressed in the past.

Not enough people understand the limitations of this kind of tech. They put it in the same frame as outsourcing because, as long as the output mostly looks correct, the decision makers can push the blame for any issues down to the middle managers and below.

Gonna be a wild time!

@[email protected] · 11 months ago

Definitely not my experience at F100, they are cautious as fuck about everything. Definitely having the right discussions and exploring all sorts of technology, but risk management remains a huge calculation in making these kinds of decisions.

@[email protected] · 11 months ago

I think we’ll see a very large filtering out of companies who do this.

MeanEYE · 11 months ago

We’ve already seen people firing tech support staff and switching to “AI”.

@[email protected] · edit-2 11 months ago

I don’t understand why anyone even considers that. It’s a toy. A novelty, a thing you mess with when you’re bored and want to see how Hank Hill would explain the plot of Fullmetal Alchemist, not something you would entrust anything significant to.

@[email protected] · 11 months ago

These models are black boxes right now, but presumably we could open one up and look inside to see each and every function it runs to produce its output. If we could then see what it is actually doing and fix things up so that we can mathematically verify its output is correct, I think we would be able to use it for mission-critical applications. I think a more advanced LLM like this would be great for automatically managing systems and for doing science and math research.

But yeah. For right now these things are mainly just toys for SUSSY roleplays, basic customer service, and generating boilerplate code. A verifiable LLM is still probably 2-4 years away.

ChatGPT can get worse over time, Stanford study finds | Fortune