Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post Xitter web has spawned soo many “esoteric” right wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be)

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Semi-obligatory thanks to @dgerard for starting this.)

  • froztbyte@awful.systems · English · ↑3 · 3 hours ago

    following on from this comment, it is possible to get it turned off for a Workspace Suite Account

    1. contact support (? button from admin view)
    2. ask the first person to connect you to Workspace Support (otherwise you’ll get some made-up bullshit from a person trying to buy time or Case Success or whatever, simply because they don’t have the privileges to do what you’re asking)
    3. tell the referred-to person that you want to enable controls for “Gemini for Google Workspace” (optionally adding that you have already disabled “Gemini App”)

    hopefully you spend less time on this than the 40-something minutes I had to (a lot of which was spent watching some poor support bastard start-stop typing for minutes at a time because they didn’t know how to respond to my request)

  • Steve@awful.systems · English · ↑5 · edited 3 hours ago

    so the new feature in the next macos release 15.3 is “fuck you, apple intelligence is on by default now”

    For users new or upgrading to macOS 15.3, Apple Intelligence will be enabled automatically during Mac onboarding. Users will have access to Apple Intelligence features after setting up their devices. To disable Apple Intelligence, users will need to navigate to the Apple Intelligence & Siri Settings pane and turn off the Apple Intelligence toggle. This will disable Apple Intelligence features on their device.

    https://archive.ph/4pSIw

  • skillissuer@discuss.tchncs.de · English · ↑11 · edited 10 hours ago

    til that there’s not one billionaire with family business in south african mining in current american oligarchy, but at least two. (thiel’s father was an exec at mine in what is today Namibia). (they mined uranium). (it went towards RSA nuclear program). (that’s easily most ghoulish thing i’ve learned today, but i’m up only for 2h)

    • froztbyte@awful.systems · English · ↑6 · 10 hours ago

      there’s probably a fair couple more. tracing anything de beers or a good couple of other industries will probably indicate a couple more

      (my hypothesis is: the kinds of people that flourished under apartheid, the effect that had on local-developed industry, and then the “wider world” of opportunities/prey they got to sink their teeth into after apartheid went away; doubly so because staying ZA-only is extremely limiting for ghouls of their sort - it’s a fixed-size pool, and the still-standing apartheid-vintage capital controls are Limiting for the kinds of bullshit they want to pull)

  • BurgersMcSlopshot@awful.systems · English · ↑9 · 11 hours ago

    Banner start to the next US presidency, with Wiener Von Wrong tossing a Nazi salute and the ADL papering that one over as an “awkward gesture”. 2025 is going to be great for my country.

    Incidentally is “Wiener Von Wrong” or “Wernher Von Brownnose” better?

  • maol@awful.systems · English · ↑4 · 9 hours ago

    It’s term time again and I’m back in college. One professor has laid out his AI policy: you should not use an AI (presumably ChatGPT) to write your assignment, but you can use an AI to proofread your assignment. This must be mentioned in the acknowledgements. He said in class that in his experience AI does not produce good results and that when asked to write about his particular field it produces work with a lot of mistakes.

    Me, I’m just wondering how you can tell the difference between material generated by AI then edited by a human, and material written by a human then edited by an AI.

    • blakestacey@awful.systems [OP] · English · ↑4 · 1 hour ago

      Here is what I wrote in the instructions for the term-paper project that I will be assigning my quantum-physics students this coming semester:

      I can’t very well stop you from using a text-barfing tool. I can, however, point out that the “AI” industry is a disaster for the environment, which is the place that we all have to live in; and that it depends upon datasets made by exploiting and indeed psychologically torturing workers. The point of this project is for you to learn a physics topic and how to write physics, not for you to abase yourself before a blurry average of all the things the Internet says about quantum physics — which, spoiler alert, includes a lot of wrong things. If you are going to spend your time at university not learning physics, there are better ways to do that than making yourself dependent upon a product that is a tech bubble waiting to pop.

  • BigMuffin69@awful.systems · English · ↑13 · 1 day ago

    Reposting this for the new week thread since it truly is a record of how untrustworthy sammy and co are. Remember how OAI claimed that o3 had displayed superhuman levels on the mega hard Frontier Math exam written by Fields Medalists? Funny/totally not fishy story haha. Turns out OAI had exclusive access to that test for months, funded its creation, and refused to let the creators of the test publicly acknowledge this until after OAI did their big stupid magic trick.

    From Subbarao Kambhampati via LinkedIn:

    "𝐎𝐧 𝐭𝐡𝐞 𝐬𝐞𝐞𝐝𝐲 𝐨𝐩𝐭𝐢𝐜𝐬 𝐨𝐟 “𝑩𝒖𝒊𝒍𝒅𝒊𝒏𝒈 𝒂𝒏 𝑨𝑮𝑰 𝑴𝒐𝒂𝒕 𝒃𝒚 𝑪𝒐𝒓𝒓𝒂𝒍𝒍𝒊𝒏𝒈 𝑩𝒆𝒏𝒄𝒉𝒎𝒂𝒓𝒌 𝑪𝒓𝒆𝒂𝒕𝒐𝒓𝒔” #SundayHarangue. One of the big reasons for the increased volume of “𝐀𝐆𝐈 𝐓𝐨𝐦𝐨𝐫𝐫𝐨𝐰” hype has been o3’s performance on the “frontier math” benchmark–something that other models basically had no handle on.

    We are now being told (https://lnkd.in/gUaGKuAE) that this benchmark data may have been exclusively available (https://lnkd.in/g5E3tcse) to OpenAI since before o1–and that the benchmark creators were not allowed to disclose this *until after o3*.

    That o3 does well on frontier math held-out set is impressive, no doubt, but the mental picture of “𝒐1/𝒐3 𝒘𝒆𝒓𝒆 𝒋𝒖𝒔𝒕 𝒃𝒆𝒊𝒏𝒈 𝒕𝒓𝒂𝒊𝒏𝒆𝒅 𝒐𝒏 𝒔𝒊𝒎𝒑𝒍𝒆 𝒎𝒂𝒕𝒉, 𝒂𝒏𝒅 𝒕𝒉𝒆𝒚 𝒃𝒐𝒐𝒕𝒔𝒕𝒓𝒂𝒑𝒑𝒆𝒅 𝒕𝒉𝒆𝒎𝒔𝒆𝒍𝒗𝒆𝒔 𝒕𝒐 𝒇𝒓𝒐𝒏𝒕𝒊𝒆𝒓 𝒎𝒂𝒕𝒉”–that the AGI tomorrow crowd seem to have–that 𝘖𝘱𝘦𝘯𝘈𝘐 𝘸𝘩𝘪𝘭𝘦 𝘯𝘰𝘵 𝘦𝘹𝘱𝘭𝘪𝘤𝘪𝘵𝘭𝘺 𝘤𝘭𝘢𝘪𝘮𝘪𝘯𝘨, 𝘤𝘦𝘳𝘵𝘢𝘪𝘯𝘭𝘺 𝘥𝘪𝘥𝘯’𝘵 𝘥𝘪𝘳𝘦𝘤𝘵𝘭𝘺 𝘤𝘰𝘯𝘵𝘳𝘢𝘥𝘪𝘤𝘵–is shattered by this. (I have, in fact, been grumbling to my students since o3 announcement that I don’t completely believe that OpenAI didn’t have access to the Olympiad/Frontier Math data before hand… )

    I do think o1/o3 are impressive technical achievements (see https://lnkd.in/gvVqmTG9 )

    𝑫𝒐𝒊𝒏𝒈 𝒘𝒆𝒍𝒍 𝒐𝒏 𝒉𝒂𝒓𝒅 𝒃𝒆𝒏𝒄𝒉𝒎𝒂𝒓𝒌𝒔 𝒕𝒉𝒂𝒕 𝒚𝒐𝒖 𝒉𝒂𝒅 𝒑𝒓𝒊𝒐𝒓 𝒂𝒄𝒄𝒆𝒔𝒔 𝒕𝒐 𝒊𝒔 𝒔𝒕𝒊𝒍𝒍 𝒊𝒎𝒑𝒓𝒆𝒔𝒔𝒊𝒗𝒆–𝒃𝒖𝒕 𝒅𝒐𝒆𝒔𝒏’𝒕 𝒒𝒖𝒊𝒕𝒆 𝒔𝒄𝒓𝒆𝒂𝒎 “𝑨𝑮𝑰 𝑻𝒐𝒎𝒐𝒓𝒓𝒐𝒘.”

    We all know that data contamination is an issue with LLMs and LRMs. We also know that reasoning claims need more careful vetting than “𝘸𝘦 𝘥𝘪𝘥𝘯’𝘵 𝘴𝘦𝘦 𝘵𝘩𝘢𝘵 𝘴𝘱𝘦𝘤𝘪𝘧𝘪𝘤 𝘱𝘳𝘰𝘣𝘭𝘦𝘮 𝘪𝘯𝘴𝘵𝘢𝘯𝘤𝘦 𝘥𝘶𝘳𝘪𝘯𝘨 𝘵𝘳𝘢𝘪𝘯𝘪𝘯𝘨” (see “In vs. Out of Distribution analyses are not that useful for understanding LLM reasoning capabilities” https://lnkd.in/gZ2wBM_F ).

    At the very least, this episode further argues for increased vigilance/skepticism on the part of the AI research community in how they parse the benchmark claims put out by commercial entities."

    Big stupid snake oil strikes again.

      • Soyweiser@awful.systems · English · ↑14 · 24 hours ago

      Every time they go ‘this wasn’t in the data’ it turns out it was. A while back they did the same with translating rareish languages. Turns out it was trained on it. Fucked up. But also, wtf how are they expecting this to stay secret and there being no backlash? This world needs a better class of criminals.

        • V0ldek@awful.systems · English · ↑8 · 12 hours ago

        But also, wtf how are they expecting this to stay secret and there being no backlash?

        No, they bet on it not mattering and they’ve been completely right thus far.

        • FredFig@awful.systems · English · ↑12 · 23 hours ago

        The conspiracy theorist who lives in my brain wants to say it’s intentional to make us more open to blatant cheating as something that’s just a “cost of doing business.” (I swear I saw this phrase a half dozen times in the orange site thread about this)

        The earnest part of me tells me no, these guys are just clowns, but I dunno, they can’t all be this dumb right?

          • self@awful.systems · English · ↑10 · 23 hours ago

          holy shit, that’s the excuse they’re going for? they cheated on a benchmark so hard the results are totally meaningless, sold their most expensive new models yet on the back of that cheated benchmark, further eroded the scientific process both with their cheating and by selling those models as better for scientific research… and these weird fucks want that to be fine and normal? fuck them

            • David Gerard@awful.systems [M] · English · ↑9 · 23 hours ago

            they can’t even sell o3 really - in o3 high mode, needed to do this level of query, it’s about $1000 per query lol

              • self@awful.systems · English · ↑3 · 4 hours ago

              do you figure it’s $1000/query because the algorithms they wrote with their insider knowledge to cheat the benchmark are very expensive to run, or is it $1000/query because they’re grifters and all high mode does is use the model trained on frontiermath and allocate more resources to the query? and like any good grifter, they’re targeting whales and institutional marks who are so invested that throwing away $1000 on horseshit feels like a bargain

                • froztbyte@awful.systems · English · ↑3 · edited 3 hours ago

                so, for an extremely unscientific demonstration, here (warning: AWS may try hard to get you to engage with Explainer[0]) is an instance of an aws pricing estimate for big handwave “some gpu compute”

                and when I say “extremely unscientific”, I mean “I largely pulled the numbers out of my ass”. even so, they’re not entirely baseless, nor just picking absolute maxvals and laughing

                parameters and assumptions made:

                • “somewhat beefy” gpu instances (g4dn.4xlarge, selected through the tried and tested “squint until it looks right” method)
                • 6-day traffic pattern, excluding sunday[1]
                • daily “4h peak” total peak load profile[2]
                • 50 instances minimum, 150 maximum (let’s pretend we’re not openai but are instead some random fuckwit flybynight modelfuckery startup)
                • us west coast
                • spot instances, convertible spot reserves, 3y full prepay commit (yeah I know full vs partial is a big diff; once again, snore)

                (and before we get any fucking ruleslawyering dumb motherfuckers rolling in here about accuracy or whatever: get fucked kthx. this is just a very loosely demonstrative example)

                so you’d have a variable buffer of 50…150 instances, featuring 3.2…9.6TiB of RAM for working set size, 800…2400 vCPU, 50…150 nvidia t4 cores, and 800…2400GiB gpu vram
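The fleet totals above can be reproduced with a quick sketch. It assumes the per-instance figures AWS lists for g4dn.4xlarge (16 vCPU, 64 GiB RAM, one T4 with 16 GiB of VRAM); the `InstanceSpec` and `fleet_totals` names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    """Per-instance resources (hypothetical helper type)."""
    vcpus: int
    ram_gib: int
    gpus: int
    gpu_vram_gib: int


# Assumption: g4dn.4xlarge as listed by AWS (16 vCPU, 64 GiB RAM, 1x T4 16 GiB).
G4DN_4XLARGE = InstanceSpec(vcpus=16, ram_gib=64, gpus=1, gpu_vram_gib=16)


def fleet_totals(spec: InstanceSpec, count: int) -> dict:
    """Multiply one instance's resources by the number of instances."""
    return {
        "vcpus": spec.vcpus * count,
        "ram_gib": spec.ram_gib * count,
        "gpus": spec.gpus * count,
        "gpu_vram_gib": spec.gpu_vram_gib * count,
    }


if __name__ == "__main__":
    for count in (50, 150):
        print(count, fleet_totals(G4DN_4XLARGE, count))
```

Strictly, 3200 GiB is 3.125 TiB, so the 3.2…9.6 RAM range reads the GiB totals in decimal thousands. For a loosely demonstrative example the difference does not matter.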

                let’s presume a perfectly spherical ops team of uniform capability[3] and imagine that we have some lovely and capable active instance prewarming and correct host caching and whatnot. y’know, things to reduce user latency. let’s pretend we’re fully dynamic[4]

                so, by the numbers, then

                1y times 4h daily gives us 1460h (in seconds, that’s 5256000). this extremely inaccurate full-of-presumptions number gives us “service-capable life time”. the times your concierge is at the desk, the times you can get pizza delivered.

                x3 to get to lifetime matching our spot commit, x50…x150 to get to “total possible instance hours”. which is the top end of our sunshine and rainbows pretend compute budget. which, of course, we still have exactly no idea how to spend. because we don’t know the real cost of servicing a query!

                but let’s work backwards from some made-up shit, using numbers The Poor Public gets (vs numbers Free Microsoft Credits will imbue unto you), and see where we end up!

                so that means our baseline:

                • upfront cost: $4,527,400.00
                • monthly: $1460.00 (x3 x12 = $52560)
                • whatever the hell else is incurred (s3, bandwidth, …)
                • >=200k/y per ops/whatever person we have

                3y of 4h-daily at 50 instances = 788400000 seconds. at 150 instances, 2365200000 seconds.

                so we can say that, for our deeply Whiffs Ever So Slightly values, a second’s compute on the low instance-count end is $0.00574252 and $0.00191417 at the higher instance-count end (that’s the upfront cost divided by the instance-seconds above)! which gives us a bit of a handle!

                this, of course, entirely ignores parallelism, n-instance job/load/whatever distribution, database lookups, network traffic, allllllll kinds of shit. which we can’t really have good information on without some insider infrastructure leaks anyway. if we pretend to look at the compute alone.

                so what does $1000/query mean, in the sense of our very ridiculous and fantastical numbers? since the units are now The Same, we can simply divide things!

                at the 50 instance mark, we’d need to hypothetically spend 174139.68 instance-seconds. that’s 2.0154 days of linear compute!

                at the 150 instance mark, 522419.05 instance-seconds! 6.0465 days of linear compute!
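For anyone who wants to poke at the numbers, here is a minimal sketch of the chain from instance-seconds to "how much compute does $1000 buy". It assumes the per-second cost is the upfront figure alone divided by total instance-seconds, which is what the instance-second totals for a $1000 query work out to. The monthly, storage, and staffing lines are left out, and all function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86_400

UPFRONT_USD = 4_527_400.00   # upfront figure from the pricing estimate
PEAK_HOURS_PER_DAY = 4       # "4h peak" load profile
DAYS_PER_YEAR = 365
COMMIT_YEARS = 3             # 3y commit
QUERY_PRICE_USD = 1000.0     # the per-query price being discussed


def instance_seconds(instances: int) -> int:
    """Total instance-seconds across the commit at the given fleet size."""
    hours = COMMIT_YEARS * DAYS_PER_YEAR * PEAK_HOURS_PER_DAY
    return instances * hours * SECONDS_PER_HOUR


def usd_per_instance_second(instances: int) -> float:
    """Assumption: only the upfront cost is spread over the instance-seconds."""
    return UPFRONT_USD / instance_seconds(instances)


def seconds_bought(instances: int, price_usd: float = QUERY_PRICE_USD) -> float:
    """Instance-seconds that price_usd pays for at the given fleet size."""
    return price_usd / usd_per_instance_second(instances)


if __name__ == "__main__":
    for n in (50, 150):
        secs = seconds_bought(n)
        print(f"{n} instances: ${usd_per_instance_second(n):.8f}/s, "
              f"{secs:.2f} instance-seconds, {secs / SECONDS_PER_DAY:.4f} days")
```

Folding the monthly charge into `UPFRONT_USD` nudges the per-second cost up by roughly one percent, so it does not change the shape of the result.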

                so! what have we learned? well, we’ve learned that we couldn’t deliver responses to prompts in Reasonable Time at these hardware presumptions! which, again, are linear presumptions. and there’s gonna be a fair chunk of parallelism and other parts involved here. but even so, turns out it’d be a bit of a sizable chunk of compute allocated. to even a single prompt response.

                [0] - a product/service whose very existence I find hilarious; the entire suite of aws products is designed to extract as much money from every possible function whatsoever, leading to complexity, which they then respond to by… producing a chatbot to “guide users”

                [1] - yes yes I know, the world is not uniform and the fucking promptfans come from everywhere. I’m presuming amerocentric design thinking (which imo is probably not wrong)

                [2] - let’s pretend that the calculators’ presumption of 4h persistent peak load and our presumption of short-duration load approaching 4h cumulative are the same

                [3] - oh, who am I kidding, you know it’s gonna be some dumb motherfuckers with ansible and k8s and terraform and chucklefuckery

                  • froztbyte@awful.systems · English · ↑1 · 2 hours ago

                  when digging around I happened to find this thread which has some benchmarks for a diff model

                  it’s apples to square fenceposts, of course, since one llm is not another. but it gives something to presume from. if g4dn.2xl gave them 214 tok/s, and if we make the extremely generous presumption that tok==word (which, well, no; cf. strawberry), then any Use Deserving Of o3 (let’s say 5~15k words) would mean you need a tok-rate of 1000~3000 tok/s for a “reasonable” response latency (“5-ish seconds”)

                  so you’d need something like 5x g4dn.2xl just to shit out 5000 words with dolphin-llama3 in “quick” time. which, again, isn’t even whatever the fuck people are doing with openai’s garbage.
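The token-rate estimate can be sketched the same way. This assumes tok == word, as above, and that throughput scales perfectly linearly across instances, which is generous for a single response. The 214 tok/s figure is the one quoted from the benchmark thread; the function names are invented for illustration.

```python
import math


def required_tok_rate(words: int, latency_s: float) -> float:
    """Tokens per second needed to emit `words` within `latency_s` (tok == word)."""
    return words / latency_s


def instances_needed(words: int, latency_s: float, tok_per_s_per_instance: float) -> int:
    """Instances required, assuming throughput adds up linearly across instances."""
    return math.ceil(required_tok_rate(words, latency_s) / tok_per_s_per_instance)


if __name__ == "__main__":
    for words in (5_000, 15_000):
        rate = required_tok_rate(words, 5)
        count = instances_needed(words, 5, 214)
        print(f"{words} words in 5s: {rate:.0f} tok/s, {count} instances")
```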

                  utter, complete, comprehensive clownery. era-redefining clownery.

                  but some dumb motherfucker in a bar will keep telling me it’s the future. and I get to not boop 'em on the nose. le sigh.

          • FredFig@awful.systems · English · ↑6 · edited 23 hours ago

            They understand that all of the major model providers are doing it, but since the major model providers are richer than they are, they can’t possibly ask OpenAI and friends to stop, so in their heads, it is what it is and therefore must be allowed to continue.

            Or at least, that’s my face value read of it, I certainly hope I’m simplifying things too much.

    • froztbyte@awful.systems · English · ↑11 · 1 day ago

      I recall seeing something of this sort happening on goog for about 12~18mo - every so often a researcher post does the rounds where someone finds Yet Another way goog is fucking it up

      the advertising dept has completely captured all mindshare and it is (demonstrably) the only part that goog-the-business cares about

    • istewart@awful.systems · English · ↑11 · 1 day ago

      Hmm, surely there is no downside to doing all of one’s marketing, both personal* and professional, through the false certainty and low signal of short-form social media. The leopard has only licked Sam’s face, it will never bite and begin chewing!

      *You and I may find the concept of a “personal brand” to be horrifying, but these guys clearly want to become brands more fervently than Bruce Wayne wanted to become a bat