Miguel Afonso Caetano@tldr.nettime.org to Programming@fedia.io · 11 days ago

"“Tasks that seemed straightforward often took days rather than hours, with Devin getting stuck in technical dead-ends or producing overly complex, unusable solutions,” the researchers explain in their report. “Even more concerning was Devin’s tendency to press forward with tasks that weren’t actually possible.”

As an example, they cited how Devin, when asked to deploy multiple applications to the infrastructure deployment platform Railway, failed to understand this wasn’t supported and spent more than a day trying approaches that didn’t work and hallucinating non-existent features.

Of 20 tasks presented to Devin, the AI software engineer completed just three of them satisfactorily – the two cited above and a third challenge to research how to build a Discord bot in Python. Three other tasks produced inconclusive results, and 14 projects were outright failures.

The researchers said that Devin provided a polished user experience that was impressive when it worked.

“But that’s the problem – it rarely worked,” they wrote.

“More concerning was our inability to predict which tasks would succeed. Even tasks similar to our early wins would fail in complex, time-consuming ways. The autonomous nature that seemed promising became a liability – Devin would spend days pursuing impossible solutions rather than recognizing fundamental blockers.”"

https://www.theregister.com/2025/01/23/ai_developer_devin_poor_reviews/

#AI #GenerativeAI #AIAgents #Devin #Programming #SoftwareDevelopment

Riskable@programming.dev · 11 days ago
Devin must’ve been trained on enterprise code.
