AI RPG images aren't great

Given the price of art, I’ve been playing a whole heck of a lot with Machine Learning (ML) images (along with every other indie RPG designer out there), and the results are bad. This one is Midjourney, which seems to be one of the better generators.

If the problem is just my lack of skill, that still sounds like a problem. If I have to hire a professional, I’d rather just hire an artist.

I’m writing a campaign about Vampires in Belgrade (Hungary) in the year 1230.

Starting with something without too many parts: a young Tzimisce vampire in the story (well, he was embraced young) has a ghouled raven he speaks with.

dark ages boy speaks to raven in the moonlit rain

Tzimisce and raven

Oh dear… it doesn’t know that human boys are bigger than ravens. So it’s beautiful and enchanting, but it doesn’t convey information, and the kid looks like ‘The Little Prince’, not like a sinister flesh-crafting vampire.

Making some variations, I finally got here:

It’s better, but the raven also looks like a hummingbird, and the moon looks like someone spilled it. It really conveys nothing more than ‘boy and raven’, so it’s not about to enhance the passages. And RPGs really do need good images, because every game conveys a boatload of strange ideas.

Next up, what about that scene where a vampire-hunter finally tracks down the coterie’s lair? He finds them by sunset and has to flee before they wake up, but he’ll be back tomorrow to kill the lot. He rides a horse, and has an ovcharka (bear-hunting Russian dog) by his side. The coterie will find signs of his passing, such as footprints.

After some bad images, I finally left the dog out. Most of them blended the dog and horse into a single creature, if the dog appeared as anything more than a shadow.

Slavic, of-the-night, noble hunter reading tracks, horse, footprints, village, 1300s

So we have a ruddy-great horse dwarfing the world in one, and lots of horse-butts which look out of place.

Time to make lots of variations again.

Slavic, of-the-night, noble hunter reading tracks, horse, footprints, village, 1300s

… so now we have more of a centaur-creature as the horse blends with the man.

Overall

RPG images should explain things, and the explanations should involve the interactions of multiple elements, such as one person shooting an arrow at another, or threats, or setting a building on fire. AI seems to mix styles well - want a vampire drawn by Picasso? I’m sure the results would be stunning. But if interactions are missing, I don’t see how anyone can use these results.

Machine Learning In General

I suspect machine learning will simply not work in our lifetimes. Consider the story of machine learning when translating:

You make a basic dictionary, so you can type ‘cat’, and it gives you ‘le chat’.
You give it rules about nouns and adjectives - now you type ‘the black cat’, and it returns ‘le chat noir’.

It gets 5% of language, then 10%, then 20%, and it’s tempting to imagine that 99%-accurate translations are coming soon. But they’re not: if we ask it to translate ‘James is right, Alice is left’, the machine will return ‘James is correct’, because translating this statement does not rely on rules, but on understanding intention and meaning. Those hold-out sentences may require that we start by programming real AI, with real consciousness, and only then teach it multiple languages.
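
To make the dictionary-plus-rules story concrete, here is a rough sketch in Python. It is a toy, not how any real translation system works: the names LEXICON, ADJECTIVES and translate are made up for this post, the vocabulary is tiny, and the single rule (adjective goes after the noun) is the only grammar it knows. Each English word gets exactly one French sense, which is the whole problem.

```python
# Toy English-to-French translator: a word dictionary plus one word-order rule.
# LEXICON, ADJECTIVES and translate are hypothetical names for illustration only.

LEXICON = {
    "the": "le",
    "cat": "chat",
    "black": "noir",
    "james": "James",
    "alice": "Alice",
    "is": "est",
    "right": "correct",  # assumption: only the 'correct' sense is stored
    "left": "parti",     # assumption: only the 'departed' sense is stored
}

ADJECTIVES = {"black"}


def translate(sentence: str) -> str:
    """Translate word by word, moving known adjectives after the next word."""
    # Split the comma off as its own token so it survives the word lookup.
    words = sentence.lower().replace(",", " ,").split()

    # Rule: a known adjective swaps places with the word that follows it.
    i = 0
    while i < len(words) - 1:
        if words[i] in ADJECTIVES:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2  # skip past the pair we just swapped
        else:
            i += 1

    # Dictionary lookup; unknown words pass through unchanged.
    translated = [LEXICON.get(word, word) for word in words]
    return " ".join(translated).replace(" ,", ",")


print(translate("the black cat"))
print(translate("James is right, Alice is left"))
```

The first call gets the word order right. The second treats ‘right’ as ‘correct’ no matter what the rest of the sentence says, because nothing in a lookup table or a word-order rule can tell which sense was meant. Adding a second sense to the dictionary doesn’t help either, since the program would still need some way to choose between them.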