Poked at this for a while, and this was the best I could get. Turns out Bing image gen doesn’t know what a halberd is.

    • tal@lemmy.today
      5 months ago

      Someone wrote a scraper that converts Worm and Ward to ebooks. I used it to read them offline.

      https://github.com/domenic/worm-scraper

      It’d be kind of cool if there were another project that incorporated illustrations to the generated ebook.

      EDIT: And maybe somewhere that could incorporate patches to fix typos and the like. I remember seeing some of those when reading them.

      EDIT2: Looks like the scraper already has a collection of typo fixes, so the functionality is probably already present; it just needs PRs for the issues I noticed. Damn, wish I’d made a note of them when reading through. The project describes it this way:

      This project makes a lot of fixups to the original text, mostly around typos, punctuation, capitalization, and consistency.
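
      For what it’s worth, a patch mechanism like that can be as simple as a list of exact before/after substitutions per chapter. Here is a rough Python sketch of the idea. The `Fix` record, the chapter IDs, and the sample text are all made up for illustration; this is not the scraper’s actual format or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    chapter: str  # hypothetical chapter identifier
    before: str   # exact text expected in the source
    after: str    # corrected text


def apply_fixes(chapter: str, text: str, fixes: list[Fix]) -> tuple[str, list[Fix]]:
    """Apply each fix for this chapter; return the new text plus fixes that did not apply."""
    unmatched = []
    for fix in fixes:
        if fix.chapter != chapter:
            continue
        if text.count(fix.before) != 1:
            # Zero or several matches: skip rather than guess which one is meant.
            unmatched.append(fix)
            continue
        text = text.replace(fix.before, fix.after)
    return text, unmatched


# Made-up example data, not taken from the serial or from the scraper.
fixes = [
    Fix("1.1", "teh bus", "the bus"),
    Fix("1.1", "no such phrase", "irrelevant"),
    Fix("1.2", "recieve", "receive"),
]
fixed, unmatched = apply_fixes("1.1", "She waited for teh bus.", fixes)
print(fixed)
print([f.before for f in unmatched])
```

      Requiring exactly one match means a stale fix gets reported instead of silently doing nothing, which matters if the upstream text is ever edited.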

      EDIT3: One problem with using generative AI to do illustrations, especially across multiple image-generation models, is that it’d be hard to get a consistent appearance for a character.

      • Some of that can, in my experience, be masked by using some artist’s style, because as long as a model is reasonably well trained on that artist’s style, it creates some level of consistency in general appearance. That doesn’t help with specific attributes.

      • Won’t work for the proprietary models, but maybe it’s possible to generate multiple images of a character and use something like Stable Diffusion’s clip-interrogator to “converge” on prompt terms that reasonably consistently generate a character. Maybe train a LoRA on a corpus of “bad” and “good” examples of a character as far as consistency goes, and attach terms to those, like “badworm” and “goodworm” or something.

      • It might also be possible to use some sort of “sketch generator” that generates a rough outline of the character, pose that in a 3D modeler, and then hand it off to the image generator; for Stable Diffusion, at least, that’s existing functionality, though I don’t know about all of the other models. Then one just has to use a modeler to really quickly pose a premade model for the characters and feed it in as input to the image generator. I suspect that artists are, in general, going to develop techniques like this further for working with generative models. Here’s someone using existing Stable Diffusion tools, though I have no idea how well that would work as a general solution.
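
      To make the “converge on prompt terms” idea a bit more concrete: assuming the interrogator gives back a comma-separated caption per image, one could keep only the terms that show up in most of the captions. A minimal Python sketch, with invented captions and a hypothetical `stable_terms` helper; it does not call clip-interrogator itself:

```python
from collections import Counter


def stable_terms(captions: list[str], min_share: float = 0.75) -> list[str]:
    """Return comma-separated terms that appear in at least min_share of the captions."""
    counts = Counter()
    for caption in captions:
        # Count each term once per caption so repeats do not inflate it.
        terms = {t.strip().lower() for t in caption.split(",") if t.strip()}
        counts.update(terms)
    threshold = min_share * len(captions)
    kept = [term for term, n in counts.items() if n >= threshold]
    # Most frequent first; ties broken alphabetically for stable output.
    return sorted(kept, key=lambda t: (-counts[t], t))


# Invented captions standing in for interrogator output on four images.
captions = [
    "girl, long dark curly hair, glasses, grey hoodie",
    "girl, long dark curly hair, glasses, city street",
    "girl, long dark curly hair, rain",
    "girl, short hair, glasses, grey hoodie",
]
print(stable_terms(captions))
```

      The surviving terms would then go back into the prompt for the next batch, and the loop repeats until the set stops changing. Whether that actually converges on a recognizable character is an open question.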

      • tal@lemmy.today
        5 months ago

        I’m not gonna dig up the text in the book, but according to the Fandom wiki, Hookwolf’s eyes are human:

        https://worm.fandom.com/wiki/Hookwolf

        Hookwolf’s eyes still remain human, but are protected by a shifting screen of blades.

        kagis

        https://parahumans.wordpress.com/2012/06/20/interlude-11e/

        Orders given, Hookwolf drew the majority of his flesh into a condensed point in his ‘core’, felt himself come alive as more metal spilled forth. Only his eyes remained where they were, set in recessed sockets, behind a screen of shifting blades. He was half-blind until the movement of the blades hit a rhythm, moving fast enough that they zipped over the surface of his eye at speeds faster than an eyeblink.

        Technically, he should probably look larger than a human, since his body is still intact inside.

        tries in Stable Diffusion

        It really wants to give me yellow eyes for wolves.

        pokes around more

        https://lemmy.today/pictrs/image/5dc1b764-1319-47ae-a6b8-6889f7f72e50.png

        wolf made of hooks and barbs, steel, standing on a street,night,(human blue eyes:.3)
        Negative prompt: (yellow eyes:.3)
        Steps: 20, Sampler: DPM++ 2M, Schedule type: Karras, CFG scale: 7, Seed: 7, Size: 1024x1024, Model hash: ebf42d1fae, Model: realmixXL_v15, Token merging ratio: 0.5, Version: v1.9.4-169-ga30b19dd
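
        If anyone wants to reuse those settings programmatically, the parameter text above can be pulled apart with a few lines of Python. This is only a sketch that works for this particular string (it assumes the settings start at “Steps:” and that no value contains a comma); `parse_infotext` is a name I made up, not the web UI’s own parser.

```python
import re

INFOTEXT = (
    "wolf made of hooks and barbs, steel, standing on a street,night,"
    "(human blue eyes:.3) Negative prompt: (yellow eyes:.3) "
    "Steps: 20, Sampler: DPM++ 2M, Schedule type: Karras, CFG scale: 7, "
    "Seed: 7, Size: 1024x1024, Model hash: ebf42d1fae, Model: realmixXL_v15, "
    "Token merging ratio: 0.5, Version: v1.9.4-169-ga30b19dd"
)


def parse_infotext(text: str) -> dict[str, object]:
    """Split the text into prompt, negative prompt, and a dict of settings."""
    # Everything from "Steps:" onward is treated as settings (assumption).
    head, _, settings = text.partition("Steps:")
    prompt, _, negative = head.partition("Negative prompt:")
    # Settings look like "Key: value" pairs joined by commas.
    pairs = re.findall(r"([A-Za-z][A-Za-z ]*): ([^,]+)", "Steps:" + settings)
    return {
        "prompt": prompt.strip(),
        "negative_prompt": negative.strip(),
        "settings": dict(pairs),
    }


info = parse_infotext(INFOTEXT)
print(info["prompt"])
print(info["negative_prompt"])
print(info["settings"])
```

        All values come back as strings, so anything numeric (steps, seed, CFG scale) would need converting before being passed to a generator.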

        I don’t know what prompt you’d use for the “human internal” portion. Not really the shape of a wolf, exactly. OP did a better job of that.