Improved Wikipedia AI Modernizer

lanolinoil@lemmy.world · 10 months ago

Landsharkgun@midwest.social · 10 months ago

I…what? Who thought this was a good idea, or even a thing that was needed? Why would you need to ‘modernize’ historical artwork? Why the great flying fuck would you do this by putting it through an AI program that - extremely fucking crucially - changes all of the minor details of the piece? This is absolutely terrible. Whoever worked on this needs to unplug all of their devices and go peel some potatoes.

lanolinoil@lemmy.world · edit-2 10 months ago

Check out my milk bread posts if you’re into potatoes.

Can you share more about what information you think is lost in the high fidelity generations like the lighthouse and colossus, especially given you can toggle the images?

The fidelity with the lines and subject is pretty high right now via gpt4 vision and controlnet, but I could work on getting it higher – Which images bother you the most and what level of fidelity/auto restoration would cause you to have a more positive reaction?

Thanks for the feedback!

Landsharkgun@midwest.social · 10 months ago

My problem is the basic concept. These are historical documents, and you are tampering with them. It’s like translating the Declaration of Independence into leetspeak. As a one-time gag, it’s worth a chuckle, but the idea that it would be an ‘improvement’ or a ‘modernisation’ is an insult. The ‘fidelity’ of your process is irrelevant. You do not and should not need an artificial ‘higher resolution image’ of a centuries-old painting.

lanolinoil@lemmy.world · 10 months ago

That’s what I thought – It seems like there’s a group of people this bothers and who aren’t interested regardless of the outputs. It feels ideological, but I won’t go that far and claim that for you or them – Anyway, this isn’t made for you and it is OK that we feel differently.

Thanks for sharing your feedback and opinion again! Have a great day!

Plum@lemmy.world · 10 months ago

That wooly mammoth mangina is not an improvement on the original.

lanolinoil@lemmy.world · 10 months ago

I’m going to lock “I’m old greegggggggg” into the generation prompt

ruckblack@sh.itjust.works · 10 months ago

Absolute garbage

lanolinoil@lemmy.world · 10 months ago

Thanks for the feedback!

Thorry84@feddit.nl · edit-2 10 months ago

What are you doing? That painting shown on Wikipedia for the lighthouse of Alexandria is an actual painting, you can’t just replace that with another random image and say look this is better. The painting is the painting, with its flaws and character and historical aspects.

Please stop and delete this.

lanolinoil@lemmy.world · 10 months ago

The drawing is the original and the painting is the AI image :)

On that image too, I can’t even see any difference in the lines at all – which differences are most obvious and upsetting to you?

Thanks for the feedback!

Thorry84@feddit.nl · edit-2 10 months ago

Yeah I meant the drawing, the original. I would consider that a painting, but that’s semantics.

One of the biggest issues I see with this is the two people in the bottom right. They are naked and most likely slaves. One of them has a lighter skin color and the other a darker skin color. Whilst it isn’t the focus of the original drawing, it is a part of it, because it is a part of history.

In the new “AI improved” version, not only is all detail lost and are there straight up mistakes, the light colored slave now has clothes on (implying it’s not a slave) and the darker colored one is simply deleted. Rewriting history by deleting slavery using AI is a HUGE issue and highlights why you should absolutely not do this.

The original contains so much more information and context, you are deleting all of that. Basically everything that made the original worth anything and worth including in the article is gone.

If you say you can’t see the differences you are full of shit, get the fuck out of here.

lanolinoil@lemmy.world · 10 months ago

No need to be ugly I don’t think — Thanks for the feedback all the same.

I honestly did not notice it because my focus was on the lighthouse structure itself and that level of fidelity wasn’t something I personally needed as an addendum to the original image (not replacement)

So, I could get tiny subject fidelity up by using larger images from the start, but that would dramatically increase generation time.

Right now an image runs through GPT and generates in about 10-15 seconds. If we pretend you were using this, what’s the max time you’d want to wait for the highest quality image?

If I added an upscale button to the AI image that produced a 4X much higher detail/fidelity image but took longer, would that be a solution?

Thorry84@feddit.nl · 10 months ago

I think you have a flawed understanding of what “AI” is and does.

It doesn’t enhance, it doesn’t improve, it doesn’t increase tiny subject fidelity. It makes stuff up, it invents data, in other words it’s total BS.

Using something like AI upscaling for videogames is fine, because if it fucks up it’s at worst a tiny glitch to get annoyed with. Using it on something like Wikipedia, which for many people is a source of information, is VERY dangerous and downright stupid. You can’t rely on anything produced by AI. It isn’t the magic zoom and enhance button we know from TV and movies.

When Elden Ring came out, I wanted a huge ass poster of it. But even the official press release only included images of limited resolution, fine for a wallpaper on the computer, not fine for a high quality print. I messed around with different AI upscaling techniques till I found one I was happy with. Even then I spent hours tweaking the parameters and throwing a lot of computing power against it, till I got out something I was happy with. And even now I know small little details which aren’t right because of that algorithm, but I’m the only one who knows or sees so I was OK with it.

If you are learning about subjects by using AI, please stop and use actual primary sources. What you are learning is fiction, a fantasy and not real life.

lanolinoil@lemmy.world · edit-2 10 months ago

So, this is more than just running the image through an AI generator – It uses a few techniques to lock the image to the lines in the original image and the prompt is aware of the article data and the image via GPT vision, controlnet, and the internet.

Also, this does not replace the images on Wikipedia; it’s a Chrome extension that lets you toggle between original and generated images. The intent is for when you see an old 1700s etching and wonder what it really looked like, or see a poorly drawn Mughal era painting and wonder what the scene might have looked like in real life. The only real ‘functional’ use I’ve seen building it is with coins and other things that are ‘worn down’. It does a pretty good job at making that stuff more visible – there’s a few coin examples in the post.

Can you look at the line drawing of the lighthouse of Alexandria and the AI generated image for me and tell me if there’s some level of fidelity improvement that could be present to make you feel differently? I struggle to find a lot of differences other than the color.

The ‘upscale’ button could just let us start with a higher resolution starting image with all details preserved – In the painting of the lighthouse, where the boy is removed, that kind of thing would get fixed and small characters would be much better preserved, at the cost of generation time – I’m not saying just upscale the AI image.

On the comment about fiction/fantasy – The majority of the images we’re modifying are not ‘primary sources’ in that Hermann Thiersch never saw the Lighthouse – This feels like the same level of fantasy since we’re using his original image with such high fidelity. I’m curious to get your thoughts.

Thanks for the feedback!

Thorry84@feddit.nl · edit-2 10 months ago

Please just stop, you don’t know what you are doing.

People who went to school for over a decade in this subject would be able to tell you a thousand things about some of the images you are referencing. People worked hard to include the best possible image with the article.

You then go and generate some BS image and say: “I struggle to find a lot of differences other than the color.”

And no these things cannot be fixed, there is no fixing a flawed principle. You can’t fix it by renaming it or by saying it’s only a chrome extension. Please stop.

lanolinoil@lemmy.world · 10 months ago

Do you have any articles or reading I can do on what those ‘thousand’ things would be? I can definitely build that into the model either with fine-tuning or connecting GPT to the internet.

I wholesale disagree things can’t be fixed and your logic there doesn’t really track. In general your manner reminds me of the famous Sartre quote. You don’t seem to really be interested in engaging in good faith. I find your failure to even attempt an answer at my question suggests your true motives.

“If you press them too closely, they will abruptly fall silent, loftily indicating by some phrase that the time for argument is past.”

https://www.goodreads.com/quotes/7870768-never-believe-that-anti-semites-are-completely-unaware-of-the-absurdity

Have a great day and look out for the next update! I will incorporate your feedback into the changes.

lanolinoil@lemmy.world · 10 months ago

Oh, I see – I meant the 2nd lighthouse picture. The one that is much higher fidelity to the original image. I see what you’re saying about the first image

Fisch · 10 months ago

Kinda interesting to see what AI can do but I really hope you didn’t actually upload these to Wikipedia

lanolinoil@lemmy.world · 10 months ago

It’s a Chrome extension that loads/saves them to an external DB and lets you toggle between them
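
For anyone curious what "toggle between them" could look like in practice, here is a minimal sketch of the idea, not the extension's actual code. The class name `ImageToggleStore`, the `ImagePair` shape, and the in-memory `Map` standing in for the external DB are all assumptions made for illustration.

```typescript
// Hypothetical sketch: every name and data shape here is an assumption,
// not taken from the real extension.
type ImagePair = { original: string; generated: string };
type View = "original" | "generated";

class ImageToggleStore {
  // Stands in for the external DB; keyed by the original image URL.
  private pairs = new Map<string, ImagePair>();
  // Tracks which version of each image is currently shown.
  private views = new Map<string, View>();

  // Record a generated image for an original; the original is shown first.
  save(original: string, generated: string): void {
    this.pairs.set(original, { original, generated });
    this.views.set(original, "original");
  }

  // Flip the view and return the URL to display.
  // Images with no stored pair always resolve to the original URL.
  toggle(original: string): string {
    const pair = this.pairs.get(original);
    if (!pair) return original;
    const next: View =
      this.views.get(original) === "generated" ? "original" : "generated";
    this.views.set(original, next);
    return pair[next];
  }
}
```

The point of keying everything by the original URL is that the source image is never overwritten: toggling only changes which URL gets displayed, and anything without a stored pair falls back to the original.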

Helix 🧬@feddit.de · 10 months ago

How is rewriting history an “improvement”? I fear the day when this will actually be uploaded to Wikipedia (be it in the open or secretly) and history will be erased by it.

keepthepace@slrpnk.net · edit-2 10 months ago

The first examples make my previous opinion change a bit. The first three kind of work: they are an improvement over later depictions, they add a layer of speculation with more quality, why not. But you really need to be careful about proposing “improvements” of primary sources, see the comments you get here or on imgur. The fresco with four characters is an example of what should not be done: you turn a Roman fresco into a Renaissance painting (looks like Rubens style?).

Your tool can be useful, you changed my mind about it with the lighthouse of Alexandria reconstruction. But really, choose your examples more carefully. Some are akin to writing a manga version of Batman and calling it an improvement. It is a core difference in style that not everyone will like.

lanolinoil@lemmy.world · 10 months ago

Yes – I must never use the word ‘improve’ again, that is clear haha – Do you like ‘modernize’ or ‘update’? Which words are least upsetting?

Somewhere else someone gave me the idea to build different fine tune models that are more aware of styles and techniques from different periods. Thanks for the great feedback I appreciate you!

I wanted to not cherry pick examples and so I just did 15 images and posted them to see how the ‘anger’ reaction has changed since last time.

keepthepace@slrpnk.net · 10 months ago

I would suggest maybe using it more on imaginary renditions of fiction or literature. Or on colorizing some specific styles like engravings.

lanolinoil@lemmy.world · edit-2 10 months ago

I actually have an old project that illustrates books automatically – https://github.com/pwillia7/booksplitter

I haven’t looked at it in a while – good idea and I’ll try to rebuild that on this stack. Here’s an example output from that (this is pretty ‘dumb’ and is just generating prompts based on the text and has some style locking with prompts) https://docs.google.com/document/d/1IsnynQZoxOBmZx9Jac4DfWn15YCevG63CxsIkbu8tgE/edit

Aria@lemmygrad.ml · 10 months ago

Some of these are more clear, easier to grok, have more information etc, all while looking more beautiful. But the images on Wikipedia are usually selected based on specific relevancy, not just being the best illustration. Often there are better illustrations available, just that they are more removed from the subject, so they don’t get picked.