The bug allows attackers to swipe data from a CPU’s registers. […] the exploit doesn’t require physical hardware access and can be triggered by loading JavaScript on a malicious website.
That’s a really interesting point (no pun intended).
I had run into a few situations where a particular computer architecture (e.g., the Pentiums for a time) had issues with floating point errors, and I remember thinking about them in largely the same way. It wasn’t until later that I started working in complexity theory, by which time I had completely forgotten about those issues.
One of the earliest discoveries in what would eventually become chaos and complexity theory was the butterfly effect. Edward Lorenz was doing weather modeling back in the 60s. The calculations were complex enough that the model sometimes had to be run over several sessions, starting and stopping with partial results at each stage. Internally, the computer model used six significant figures for floating point data. When Lorenz entered the parameters to continue his runs, he used three sig figs. He found that the seemingly trivial difference in significant figures actually led to wildly different results. This is due to the nature of systems that use present states to determine next states and which also have feedback loops and nonlinearities. Like most complexity folks, I learned and told the story many times over the years.
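For anyone who hasn’t seen the effect directly, here’s a quick toy version in Python (my own sketch, not Lorenz’s actual weather model; the 0.506127 vs 0.506 values are the ones usually quoted in retellings of the story):

```python
# Crude Euler integration of the Lorenz equations -- just enough to show how
# rounding the starting value to three significant figures sends the
# trajectory somewhere completely different.

def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

def run(x0, steps=4000):
    x, y, z = x0, 1.0, 1.0
    for _ in range(steps):
        x, y, z = lorenz_step(x, y, z)
    return x, y, z

print(run(0.506127))  # "full precision" starting value
print(run(0.506))     # same value rounded to three sig figs
# The two runs track each other early on, then diverge until they share
# nothing but the general shape of the attractor.
```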
I’ve never wondered until just now whether anyone working on those kinds of models ran into problems with floating point bugs. I can imagine problematic scenarios, but I don’t know if it ever actually happened or if it would have been detected. That would make for an interesting study.
These would be performance regressions, not correctness errors. Specifically, some false dependencies between instructions. The upshot is that some instructions which could be executed immediately may instead have to wait for a previous instruction to finish, even though they don’t actually need its result. In the worst case, this can be really bad for performance, but it doesn’t look like the affected instructions are too likely to be bottlenecks. I could definitely be wrong, though; I’d want to see some actual data.
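To make “false dependency” concrete, here’s a toy Python model of issue timing driven purely by register dependencies (my own simplification, nothing to do with how the actual fix is implemented):

```python
# Each instruction is (name, registers read, registers written, latency).
# An instruction can start once every register it reads has been produced.
# A false dependency is modeled as an extra "read" the instruction doesn't
# actually need, which forces it to wait anyway.

def schedule(instrs):
    ready = {}   # register -> cycle at which its latest producer finishes
    times = {}
    for name, reads, writes, latency in instrs:
        start = max((ready.get(r, 0) for r in reads), default=0)
        end = start + latency
        for w in writes:
            ready[w] = end
        times[name] = (start, end)
    return times

# The add only needs r3 and r4, so it can issue right away...
independent = [
    ("div", ["r1", "r2"], ["r1"], 20),
    ("add", ["r3", "r4"], ["r5"], 1),
]

# ...but with a false dependency on r1, it stalls behind the slow divide.
false_dep = [
    ("div", ["r1", "r2"], ["r1"], 20),
    ("add", ["r3", "r4", "r1"], ["r5"], 1),
]

print(schedule(independent))  # add finishes at cycle 1
print(schedule(false_dep))    # add can't even start until cycle 20
```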
The Pentium FDIV bug, on the other hand, was a correctness bug and was a catastrophic problem for some workloads. It was absolutely detected; it was found outside of Intel!
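For anyone who hasn’t seen it, the widely circulated test case looked roughly like this (the constants are the commonly quoted ones; I’m going from memory on the exact residue):

```python
# On a correctly rounding FPU this residue is zero (or at worst negligibly
# tiny); on an affected Pentium it famously came out around 256.
x, y = 4195835.0, 3145727.0
print(x - (x / y) * y)
```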
Thanks for the clarification!
I remember having to learn about floating point representations in a numerical analysis class and some of the things you had to worry about back then, but by the time I ended up doing work where I’d actually have to worry about it, most of the gotchas had been taken care of, so I largely stopped paying attention to the topic.