We have transistors at the nanometer scale, so why couldn’t we create really tiny 8086s?
EDIT: thanks guys, I just reinvented the GPU, albeit crappier
You’ve just invented a really shitty GPU, essentially.
Only some work is easily parallelized (and actually benefits from it), such as graphics, so people have essentially done what you’ve described, except the cores aren’t 8086s but are specialised for graphics math like matrices and vectors.
As for more standard workloads, we do a similar thing with server processors that use hundreds of weaker cores, but not all work benefits from the kind of parallelism that computer graphics does (or other stuff like AI, blockchain, physics simulations), and some workloads run better with less parallelism.
But the short answer is that we already do what you describe; it’s just that sometimes it’s still faster to have a few complex cores than thousands of simple ones.
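If it helps to see the difference, here’s a rough sketch in C with OpenMP (everything in it is made up purely for illustration): the first loop splits cleanly across cores, while the second has a dependency chain that no amount of extra cores can help with.

```c
/* Sketch only: made-up data and loop bodies, just to contrast the two cases.
   Build with e.g. `gcc -O2 -fopenmp demo.c`. */
#include <stdio.h>

#define N (1 << 22)

static float a[N], b[N], out[N];

int main(void) {
    for (int i = 0; i < N; i++) { a[i] = i * 0.5f; b[i] = i * 0.25f; }

    /* Embarrassingly parallel: every iteration is independent, so it splits
       cleanly across however many cores you have (this is GPU-style work). */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        out[i] = a[i] * 2.0f + b[i];

    /* Inherently serial: iteration i needs the result of iteration i-1,
       so a thousand tiny cores would still wait on each other one at a time. */
    double running = 0.0;
    for (int i = 0; i < N; i++)
        running = running * 0.999 + a[i];

    printf("%f %f\n", out[N - 1], running);
    return 0;
}
```

The first loop is the kind of thing a GPU (or a sea of tiny cores) chews through; the second doesn’t care how many cores you throw at it.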
How do you get data to all those cores? Think about it like this: imagine a restaurant with thousands of one-person tables packed as tightly as possible. How do you get customers to and from the tables, and how do you get them their food and clear their empty plates?
When it comes to modern processor designs, a large amount of the area on the chip is simply dedicated to solving this problem: getting data into the chip, caching it, and getting the data back out to the rest of the computer.
If your chip is just a massive grid of identical cores, then you’re going to have extreme issues keeping them all fed with data to work on. You’d likely end up with a bunch of the cores spending most of their time doing nothing other than acting as intermediaries that pass data along to inner cores for processing.
If you’re doing that, the advantage of having all those identical cores goes away, and you’re probably better off replacing some of them with specialized hardware that solves the problem (I/O ports and large, fast caches). But follow that line of reasoning and you end up at the designs that are already out there!
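To put made-up numbers on it (every figure below is just an assumption chosen to show the scale of the mismatch, not a measurement of any real chip):

```c
/* Every number here is an assumption picked only to show the shape of the
   problem, not a measurement of any real hardware. */
#include <stdio.h>

int main(void) {
    double cores           = 10000;    /* hypothetical sea of tiny cores          */
    double instr_per_sec   = 1e9;      /* each at ~1 GHz, ~1 instruction/cycle    */
    double bytes_per_instr = 2;        /* a 16-bit core touching a couple bytes   */
    double dram_bytes_sec  = 100e9;    /* ~100 GB/s, a generous desktop figure    */

    double demand = cores * instr_per_sec * bytes_per_instr;
    printf("aggregate demand: %.1f TB/s, memory bus supply: %.0f GB/s\n",
           demand / 1e12, dram_bytes_sec / 1e9);
    printf("the bus covers %.2f%% of the demand; caches and the on-chip network\n"
           "have to hide the rest, and that's where the chip area goes\n",
           100.0 * dram_bytes_sec / demand);
    return 0;
}
```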
Give them all paper plates and just stick big vacuum cleaners to all the windows so when the plates are empty they just go wooosh
Modern processors have a lot of optimizations that go beyond just doing math quickly. For example, branch prediction lets them act on information before it’s even known.
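If you want to see branch prediction with your own eyes, here’s a rough C demo (timings depend on your machine, and an aggressive optimizer may turn the branch into a conditional move and flatten the difference): the same loop runs much faster over sorted data simply because the branch becomes predictable.

```c
/* Rough demo: the same data-dependent branch over unsorted vs. sorted data.
   Unsorted -> the branch is ~random and mispredictions dominate; sorted ->
   the predictor nails it. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 24)

static long sum_if_big(const int *data, int n) {
    long sum = 0;
    for (int i = 0; i < n; i++)
        if (data[i] >= 128)      /* this is the branch the predictor has to guess */
            sum += data[i];
    return sum;
}

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

int main(void) {
    int *data = malloc(N * sizeof *data);
    for (int i = 0; i < N; i++)
        data[i] = rand() % 256;

    clock_t t0 = clock();
    long s1 = sum_if_big(data, N);              /* unsorted: unpredictable branch */
    clock_t t1 = clock();

    qsort(data, N, sizeof *data, cmp_int);      /* sorting makes the branch predictable */
    clock_t t2 = clock();
    long s2 = sum_if_big(data, N);
    clock_t t3 = clock();

    printf("unsorted: %.2fs (sum %ld)\n", (double)(t1 - t0) / CLOCKS_PER_SEC, s1);
    printf("sorted:   %.2fs (sum %ld)\n", (double)(t3 - t2) / CLOCKS_PER_SEC, s2);
    free(data);
    return 0;
}
```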
Also, 8086s could be made smaller to go faster and run more efficiently, but they’d still run into thermal limits. What you’re describing is basically the strategy behind the gigahertz race that happened in the early 2000s. Eventually just going smaller and faster wasn’t enough because you couldn’t get the heat out fast enough.
Finally, writing code for thousands of parallel processors would be a bear, and the memory limitations of a 16-bit processor would add a lot of overhead when trying to meet the memory requirements of modern software.
Edit: sorry, forgot what sub I was in. Basically, a 16-bit processor just can’t address enough memory, and modern processors have a lot of clever tricks that go beyond just being smaller.
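For scale, real-mode 8086 addressing is segment × 16 + offset with 20 address pins, which works out like this (the 16 GiB comparison at the end is just an illustrative figure):

```c
/* 8086 real mode: physical address = segment * 16 + offset, where segment
   and offset are both 16-bit values, and the chip has 20 address pins. */
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t per_segment = 1u << 16;   /* 64 KiB reachable through one segment */
    uint32_t total       = 1u << 20;   /* 20 address pins -> 1 MiB total       */

    printf("one segment:   %u KiB\n", per_segment / 1024);
    printf("whole machine: %u KiB (1 MiB)\n", total / 1024);
    printf("a 16 GiB machine has %u times more RAM than an 8086 can even see\n",
           (16u * 1024 * 1024) / (total / 1024));   /* 16 GiB in KiB over 1 MiB in KiB */
    return 0;
}
```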
Thanks, TIL!
Because an 8086 is really fucking slow, and writing highly multithreaded code is way more difficult.
Still in 5yo mode: why couldn’t you run the 8086s an order of magnitude faster, considering they’d be much smaller?
Heat
Adding more execution units doesn’t speed up tasks as much as you’d think, because not everything can be done in parallel.
It’s kinda the same way that, for example, one woman can have a baby in nine months but nine women can’t have a baby in one month.
For more info, see: https://en.wikipedia.org/wiki/Amdahl%27s_law
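Plugging some made-up numbers into that formula, speedup = 1 / ((1 − p) + p/n), where p is the parallel fraction and n is the core count:

```c
/* Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of
   the work that can run in parallel and n is the number of cores.
   The p values below are just illustrative. */
#include <stdio.h>

static double amdahl(double p, double n) {
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void) {
    double cores[] = {4, 64, 1024, 65536};
    for (int i = 0; i < 4; i++)
        printf("%6.0f cores: %5.1fx if 95%% parallel, %6.1fx if 99%% parallel\n",
               cores[i], amdahl(0.95, cores[i]), amdahl(0.99, cores[i]));
    return 0;
}
```

Even with 65,536 cores, a job that’s 95% parallel tops out at about 20x.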
Also, even when you can in theory parallelize something, it’s not always easy to figure out how to do it.
The other reason we don’t make chips with thousands of tiny 8086s is that, if we’re making new chips anyway, why not incorporate the ideas we’ve learned in the last 40 years instead of just copying old tech? Not using new designs is kinda a waste.
And if you do incorporate new ideas to make a chip with thousands of execution units on it, guess what: that does exist and is called a GPU.
We probably can, but why should we? GPU ISAs are way better optimised for “run everything in parallel” jobs.
Cache coherency. That’s the term you need to look up to understand why this is a bad idea.
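If you want to feel what coherency traffic costs, here’s a rough C sketch (it assumes 64-byte cache lines, which is typical but not universal, and exact numbers vary a lot by machine): two threads bumping counters that share a cache line versus counters on separate lines.

```c
/* Rough sketch, assuming 64-byte cache lines. Two threads bump two different
   counters; the only thing that changes between the cases is whether the
   counters share a cache line. Build with e.g. `gcc -O2 -pthread demo.c`. */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define ITERS 100000000UL

/* Counters packed next to each other: they share a cache line. */
static struct { volatile unsigned long a, b; } same_line;

/* Counters padded apart so they land on different cache lines. */
static struct { volatile unsigned long a; char pad[64]; volatile unsigned long b; } diff_lines;

static void *bump_same_b(void *arg) { for (unsigned long i = 0; i < ITERS; i++) same_line.b++;  return arg; }
static void *bump_diff_b(void *arg) { for (unsigned long i = 0; i < ITERS; i++) diff_lines.b++; return arg; }

static double now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Main thread bumps counter `mine` while a second thread bumps its partner. */
static double run_case(void *(*worker)(void *), volatile unsigned long *mine) {
    pthread_t t;
    double start = now();
    pthread_create(&t, NULL, worker, NULL);
    for (unsigned long i = 0; i < ITERS; i++) (*mine)++;
    pthread_join(t, NULL);
    return now() - start;
}

int main(void) {
    printf("same cache line:       %.2fs\n", run_case(bump_same_b, &same_line.a));
    printf("different cache lines: %.2fs\n", run_case(bump_diff_b, &diff_lines.a));
    /* The first case is usually several times slower: two cores fight over one
       line and the coherency protocol ships it back and forth between them.
       Now imagine thousands of cores all doing that. */
    return 0;
}
```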