• cacheson@kbin.social
    arrow-up
    105
    ·
    10 months ago

    In this thread: Programmers disassembling the joke to try and figure out why it’s funny.

  • ono@lemmy.ca
    English
    arrow-up
    78
    arrow-down
    3
    ·
    10 months ago

    Cute. It would be funnier if it was correct.

  • Thorry84@feddit.nl
    arrow-up
    65
    ·
    10 months ago

    For people interested in the difference between decompiled machine code and source code, I would recommend looking at the Mario 64 Decomp project. They are attempting to turn a Mario 64 ROM into source code and then back into that same ROM. It’s really hard and they’ve been working on it for a long time. It’s come a long way but still isn’t done.

    https://github.com/n64decomp/sm64

      • Thorry84@feddit.nl
        arrow-up
        6
        ·
        10 months ago

        There is still some stuff that needs documenting, but the original goal of recompiling the created source code into the ROMs has been achieved. People are still actively working on it, so in that sense it’s maybe never done.

    • voxel@sopuli.xyz
      arrow-up
      2
      arrow-down
      2
      ·
      edit-2
      10 months ago

      well assembly is technically “source code” and can be 1:1 translated to and from binary, excluding “syntactic sugar” stuff like macros and labels added on top.
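
To make the 1:1 claim concrete, here is a minimal sketch in Python, not a real assembler. The table holds a handful of genuine single-byte x86 opcodes; the function names `assemble` and `disassemble` are made up for illustration. Mnemonics survive the round trip through bytes, while comments (and, in a real assembler, labels and macros) do not.

```python
# Toy table of real single-byte x86 opcodes that take no operands.
OPCODES = {"nop": 0x90, "ret": 0xC3, "int3": 0xCC, "hlt": 0xF4, "cld": 0xFC, "std": 0xFD}
MNEMONICS = {byte: name for name, byte in OPCODES.items()}


def assemble(source: str) -> bytes:
    """Turn one mnemonic per line into machine code, dropping ';' comments."""
    out = bytearray()
    for line in source.splitlines():
        text = line.split(";", 1)[0].strip().lower()
        if text:
            out.append(OPCODES[text])
    return bytes(out)


def disassemble(code: bytes) -> str:
    """Map each byte back to its mnemonic; comments cannot be recovered."""
    return "\n".join(MNEMONICS[b] for b in code)


source = "cld ; clear direction flag\nnop\nret ; back to caller"
code = assemble(source)
listing = disassemble(code)
print(code.hex())  # fc90c3
print(listing)     # the three mnemonics, with the comments gone
```

Reassembling `listing` gives back exactly the same bytes, which is the sense in which the mapping is one-to-one. Real x86 is messier (variable-length instructions, several encodings for the same operation), so treat this as an illustration only.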

      • Malfeasant@lemm.ee
        arrow-up
        4
        arrow-down
        1
        ·
        10 months ago

        But those things you’re excluding are the most important parts of the source code…

        • 257m
          arrow-up
          3
          ·
          edit-2
          10 months ago

          By “excluded” he means macro assemblers, which in my mind do qualify as an actual language, as they have more complicated syntax than instruction arg1, arg2 …

      • 257m
        arrow-up
        4
        arrow-down
        1
        ·
        10 months ago

        The code is produced by the compiler, but it is not the original source. To qualify as source code, it needs to be in the original language it was written in and a one-for-one copy. Calling compiler-produced assembly source code is wrong, as it isn’t what the author wrote and there could be many versions of it depending on the architecture.

      • newIdentity@sh.itjust.works
        arrow-up
        14
        ·
        10 months ago

        A decompiler won’t give you the source code. Just some code that might not even necessarily work when compiled back.

        • amki@feddit.de
          arrow-up
          1
          arrow-down
          2
          ·
          10 months ago

          From the point of view of the decompiler machine code is indeed the source code though

        • over_clox@lemmy.world
          arrow-up
          4
          arrow-down
          6
          ·
          10 months ago

          And? Decompilers aren’t for noobs. So what if it gives you variable and function names like A000, A001, etc?

          It can still show a seasoned programmer where to go in the raw machine code to mod some things.

        • over_clox@lemmy.world
          arrow-up
          1
          arrow-down
          2
          ·
          10 months ago

          No, it’s actually better when you can read the machine code.

          Most folks don’t care to recompile the whole thing when all they wanna do is bypass the activation and tracker shit.

          • SpaceNoodle@lemmy.world
            arrow-up
            3
            arrow-down
            1
            ·
            10 months ago

            Having access to the source code actually makes reading machine code easier, so you’re also wrong on this entirely different thing you’re going on about.

            • over_clox@lemmy.world
              arrow-up
              1
              arrow-down
              2
              ·
              10 months ago

              I never said disassembly or decompiling was easier in any way. I’ll agree with you on that, it’s way more difficult.

              Back to the point of the meme though, if you can read assembly, you can read it all.

                • over_clox@lemmy.world
                  arrow-up
                  1
                  arrow-down
                  1
                  ·
                  10 months ago

                  I’ve written drivers in 65 bytes of code. I don’t tend to use high level languages that hide what’s going on behind the scenes.

            • over_clox@lemmy.world
              arrow-up
              1
              arrow-down
              2
              ·
              10 months ago

              You’ve clearly never used a disassembler such as HIEW, have you? You get the entire breakdown of the assembly code.

                • over_clox@lemmy.world
                  arrow-up
                  1
                  arrow-down
                  1
                  ·
                  10 months ago

                  I didn’t say it was. I just said loosely what the OG meme said, if you know how to read assembly, you know how to read (and write) what some of the code does.

  • just_ducky_in_NH@lemmy.world
    arrow-up
    30
    arrow-down
    1
    ·
    10 months ago

    Okay, boomer here, be gentle.

    So back in the ’70s I dabbled in programming (now called “coding”, I hear). I only did higher-level languages like Fortran, COBOL, IBM BASIC, but a friend had a job (at age 13!) programming in assembler. Is assembler now called assembly, or are they different?

    • fidodo@lemm.ee
      arrow-up
      32
      ·
      10 months ago

      It’s still called programming, coding is the same thing. Assembler more commonly refers to the utility program that converts the assembly code to machine code while assembly refers to the code itself, but the term assembler code is also valid. It’s uncommon to simply call the code assembler because it would be easily confused with the utility program.

    • Thwompthwomp@lemmy.world
      arrow-up
      9
      ·
      10 months ago

      I thought that the assembler is a specific program that translates mnemonics into the corresponding machine code. Perhaps in early computing this was done by hand so a person was the assembler (and worked in assembler), but now that is handled by software (and supports various macros). So programming in assembly would generate a stream of text that must be assembled by an assembler. (Although I have heard people refer to programming in assembler as well, just not often.)

      • lhamil64@programming.dev
        arrow-up
        9
        ·
        10 months ago

        I hear people say “program in assembler” but IMO that’s wrong. I’d say you write the code in “assembly language” (or better yet, the actual architecture you’re using like “x86 assembly”) but you “assemble” it with an “assembler”. Kind of like how you could write a program in the “C language” and “compile” it with a “compiler”

        • amki@feddit.de
          arrow-up
          1
          arrow-down
          1
          ·
          edit-2
          10 months ago

          A compiler and an assembler do wildly different things though. An assembler simply replaces mnemonics while a compiler transfers instructions to a whole other language.

          • Malfeasant@lemm.ee
            arrow-up
            1
            ·
            10 months ago

            Depends on the language, really… C maps pretty closely to assembly language, it’s not as simple as one mnemonic to one machine code byte, more like tokens get mapped to sequences of machine code, a function call translates to some code that sets up a stack frame, a return tears it down…

    • Overzeetop@sopuli.xyz
      English
      arrow-up
      2
      ·
      10 months ago

      I was too young/poor to afford an assembler for my 6502, so I wrote out the assembly longhand on a legal pad and then manually converted each operation to machine code.

      Needless to say my programs done this way were exceptionally simple, but it’s interesting to understand the underlying code.
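
For anyone curious what that hand conversion looks like, here is a rough sketch in Python of the lookup a person would do on paper. The three opcodes are real 6502 encodings (LDA immediate, STA absolute, RTS); the table layout, the mode names and the `assemble_6502` function are invented for this example, and a real assembler handles far more instructions and addressing modes.

```python
# (mnemonic, addressing mode) -> (opcode byte, operand size in bytes)
OPCODES_6502 = {
    ("LDA", "imm"): (0xA9, 1),   # load accumulator with an immediate value
    ("STA", "abs"): (0x8D, 2),   # store accumulator at a 16-bit address
    ("RTS", "impl"): (0x60, 0),  # return from subroutine
}


def assemble_6502(program):
    """Look up each instruction and emit opcode plus little-endian operand."""
    out = bytearray()
    for mnemonic, mode, operand in program:
        opcode, size = OPCODES_6502[(mnemonic, mode)]
        out.append(opcode)
        out += operand.to_bytes(size, "little")  # the 6502 stores addresses low byte first
    return bytes(out)


program = [
    ("LDA", "imm", 0x01),    # LDA #$01
    ("STA", "abs", 0x0200),  # STA $0200
    ("RTS", "impl", 0),      # RTS
]
code = assemble_6502(program)
print(code.hex(" "))  # a9 01 8d 00 02 60
```

Note how `$0200` comes out as `00 02`: swapping the address bytes by hand is exactly the kind of step that is easy to get wrong on a legal pad.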

  • NounsAndWords@lemmy.world
    arrow-up
    25
    arrow-down
    1
    ·
    10 months ago

    It just occurred to me that AI in the nearish future will probably/almost certainly be able to do this.

    • Psythik@lemm.ee
      arrow-up
      36
      ·
      10 months ago

      I can’t wait for AI to make a PC port of every console game ever so that we can finally stop using emulators.

      • amki@feddit.de
        arrow-up
        19
        arrow-down
        11
        ·
        10 months ago

        This won’t happen in our lifetime. Not only because this is more complex than rambling vaguely correlated human speech while hallucinating half the time.

          • 257m
            arrow-up
            12
            ·
            edit-2
            10 months ago

            That doesn’t really translate to neural nets though. There is nothing inherent about matrix multiplication that would make it good at reading code. Also, computers aren’t reading code, they are executing it. The hardware just reads instruction by instruction and performs each one; it has no idea what the high-level purpose of what it is doing actually is.

          • gens@programming.dev
            arrow-up
            3
            ·
            10 months ago

            Half of programming is writing code, the other half is thinking about the problem. As I learn more about programming, I feel that it is even more about solving problems.

          • amki@feddit.de
            arrow-up
            2
            ·
            10 months ago

            It’s the other way round. Code is being written to fit how a specific machine works. This is what makes Assembly so hard.

            Also there is by design no understanding required, a machine doesn’t “get” what you are trying to do it just does what is there.

            If you want a machine to understand what specific code does and modify that for another machine that is extremely hard because the machine would need to understand the semantics of the operation. It would need to “get” what you were doing which isn’t happening.

        • secret301@sh.itjust.works
          arrow-up
          4
          ·
          10 months ago

          I think it’ll be in our lifetime just not anytime soon. I feel like AI is gonna boom like the internet did. Didn’t happen overnight and not even in a year but over 35ish years

        • SnipingNinja@slrpnk.net
          arrow-up
          3
          ·
          10 months ago

          Idk the specifics, but what you say makes it sound like it would be easier to create an AI that recreates a game based on gameplay visuals (and the relevant controls)

          • amki@feddit.de
            arrow-up
            1
            ·
            10 months ago

            That game would still not work because there is a ton of hidden state in all but the simplest computer games that you cannot tell from just playing through the game normally.

            An AI could probably reinvent Flappy Bird because there is no more depth than what is currently on screen, but that’s about it.

        • GBU_28@lemm.ee
          English
          arrow-up
          3
          ·
          10 months ago

          Off the shelf models do this, yes.

          Sophisticated local trained models on expensive private hardware are already dunking on publicly available versions. The problem of hallucination is generally resolved in those contexts

          • amki@feddit.de
            arrow-up
            5
            arrow-down
            1
            ·
            10 months ago

            Sure, but until I see such a thing I choose not to believe in fairy tales.

            Decompiling arbitrary architecture machine code is quite a few levels above everything I’ve seen so far which is generally pretty basic pattern recognition paired with statistics and training reinforcement.

            I’d argue decompiling arbitrary machine code into either another machine code or legible higher-level code is in a whole other league than what AI has proven to be capable of.

            Especially because with this being 90% accurate is useless.

            • GBU_28@lemm.ee
              English
              arrow-up
              3
              arrow-down
              1
              ·
              10 months ago

              Again you aren’t seeing this because these models are being developed for private enterprise purposes.

              Regarding deep machine code analysis, sure, that’s gonna take work but the whole hallucination thing is an off the shelf, rookie problem these days

              • Rikudou_Sage@lemmings.world
                English
                arrow-up
                1
                ·
                10 months ago

                It’s not, though. Hallucinations are inherent to the technology, it’s not a matter of training. Good training can greatly reduce the likelihood, but cannot solve it.

          • sacredfire@programming.dev
            arrow-up
            1
            ·
            10 months ago

            Why does a pre-trained model need expensive private hardware after it was trained, other than to handle API requests faster? Is OpenAI training ChatGPT on inferior hardware compared to these sophisticated private versions you mentioned?

            • GBU_28@lemm.ee
              English
              arrow-up
              3
              ·
              10 months ago

              The fine tuning, while much more efficient than starting fresh, can still be a large amount of work.

              Then consider that your target corpus of data may also be large.

              Then consider that running your reasoning tasks across that corpus also takes strong hardware to get production-ready response times.

              No, OpenAI isn’t using inferior hardware, but their model goals, token chunking strategies and overall corpus are generalist in nature.

              There are also processing strategies teams are using to go beyond the “memory” limitations GPT-4 has, which provide massive benefits to coherency: essentially anti-hallucination and better overall reasoning.

          • amki@feddit.de
            arrow-up
            8
            arrow-down
            7
            ·
            10 months ago

            About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words.

            Don’t be surprised but about half of the time I can predict the result of a coin flip.

            I’m not saying it’s not interesting but needing custom training and an fMRI is not “an AI can read minds”

            It can see if patterns it saw previously reappear in a heavily time-delayed fMRI. Looking for patterns you already know isn’t such an impressive feat. Computers have done this for ages now.

            It literally can’t read minds.

            • sfgifz@lemmy.dbzer0.com
              arrow-up
              3
              arrow-down
              1
              ·
              edit-2
              10 months ago

              Later, the same participants were scanned listening to a new story or imagining telling a story and the decoder was used to generate text from brain activity alone. About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words.

              You left out the most important context about “half of the time”. Guessing what you’re thinking of just by looking at your brain activity with 50% accuracy is a very, very good achievement. It’s not picking between a 1 or 0 outcome like you are with your coin flip.

              You can pretend that the AI is useless and you’re the smartest boy in the class all you want, doesn’t negate the accomplishments.

              • amki@feddit.de
                arrow-up
                1
                ·
                10 months ago

                Being close (and “sometimes” precise) to the intended meaning is an equally useless metric to measure performance.

                Depending on what you allow for “well close enough I think” asking ChatGPT to tell a story without any reading of fMRI would get you to these results. Especially if you know beforehand it’s gonna be a story told.

    • perviouslyiner@lemm.ee
      English
      arrow-up
      9
      ·
      edit-2
      10 months ago

      It was a staple of Asimov’s books that while trying to predict decisions of the robot brain, nobody in that world ever understood how they fundamentally worked.

      He said that while the first few generations were programmed by humans, everything since that was programmed by the previous generation of programs.

      This leads us to Asimov’s world in which nobody is even remotely capable of creating programs that violate the assumptions built into the first iteration of these systems - are we at that point now?

      • amki@feddit.de
        arrow-up
        8
        ·
        10 months ago

        No. Programs cannot reprogram themselves in a useful way and are very very far from it.

        • legion02@lemmy.world
          arrow-up
          2
          ·
          10 months ago

          Eh, I’d say continuous training models are pretty close to this. Adapting to changing conditions and new input is kinda what they’re for.

          • Bjornir@programming.dev
            arrow-up
            1
            ·
            10 months ago

            Very far from reprogramming though. The general shape of the NN doesn’t change; you won’t get an NN made to process images to suddenly process code just by training it.

  • Southern Wolf@pawb.social
    arrow-up
    22
    arrow-down
    2
    ·
    10 months ago

    It’s honestly remarkable how few people in the comments here seem to get the joke.

    Never stop dissecting things, y’all.

  • oldfart@lemm.ee
    arrow-up
    17
    arrow-down
    1
    ·
    10 months ago

    IDA Pro (a disassembler) is closed source but came with a license that allowed disassembly and binary modification. Unfortunately, that’s no longer the case.

  • kamen@lemmy.world
    English
    arrow-up
    16
    ·
    10 months ago

    Joke aside, that’s kind of like claiming that any web frontend is open source because you can access the built, minified and often obfuscated source of it.

    • Jocker Black
      arrow-up
      1
      ·
      10 months ago

      So true! I have been “hacking” some chrome extensions recently, do you know of a tool for reverse engineering JS?

  • over_clox@lemmy.world
    arrow-up
    13
    ·
    10 months ago

    If you wanna skip a few inconvenient instructions in x86 assembly, throw a few No Operation instructions in the right places.

    NOP = 0x90
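
As a rough illustration of what that patching amounts to, here is a small Python sketch. It assumes you already know the offset and length of the instruction you want gone; the `nop_out` helper and the six sample bytes are made up for the example (they encode `cmp eax, 0`, `jne +5`, `ret`).

```python
NOP = 0x90  # single-byte x86 "no operation"


def nop_out(code: bytes, offset: int, length: int) -> bytes:
    """Return a copy of code with length bytes at offset replaced by NOPs."""
    if offset < 0 or length < 0 or offset + length > len(code):
        raise ValueError("patch range is outside the code")
    return code[:offset] + bytes([NOP]) * length + code[offset + length:]


# 83 f8 00 = cmp eax, 0 ; 75 05 = jne +5 ; c3 = ret
original = bytes.fromhex("83f800" "7505" "c3")

# Overwrite the two-byte conditional jump so execution always falls through.
patched = nop_out(original, offset=3, length=2)
print(patched.hex())  # 83f8009090c3
```

The patch keeps the file the same size, so every other offset stays valid. It only works if you replace whole instructions; overwriting half of one leaves the CPU decoding garbage.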

    • SzethFriendOfNimi@lemmy.world
      English
      arrow-up
      4
      arrow-down
      2
      ·
      edit-2
      10 months ago

      And so you add a hashing check. But then that can be removed.

      So you need one in the OS but that can be removed.

      So you need one in hardware.

      In other words no matter how clever you are there’s always a way to monkey with something unless you have absolute control from silicon on up.

      Here’s a really interesting video the Xbox team did on the challenges of trying to make sure that the content running wasn’t pirated.

      https://youtu.be/U7VwtOrwceo

      While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.
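
The hashing check mentioned at the start of this comment can be sketched in a few lines of Python. This is only an assumption about what such a check might look like, not how any particular product does it; `digest` and `is_untampered` are hypothetical names, and the sample bytes stand in for a real binary.

```python
import hashlib
import hmac


def digest(code: bytes) -> str:
    """SHA-256 of the code, as a hex string."""
    return hashlib.sha256(code).hexdigest()


def is_untampered(code: bytes, expected: str) -> bool:
    """Compare the current hash against the one recorded at build time."""
    return hmac.compare_digest(digest(code), expected)


original = bytes.fromhex("83f8007505c3")  # cmp eax, 0 ; jne +5 ; ret
expected = digest(original)               # recorded when the binary shipped

patched = bytes.fromhex("83f8009090c3")   # same code with the jump replaced by NOPs

print(is_untampered(original, expected))  # True
print(is_untampered(patched, expected))   # False
```

The weakness is the one described above: `is_untampered` is itself just code in the same binary, so whoever patched out the jump can patch out the call to the check as well, which is why the check keeps getting pushed down into the OS and then the hardware.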

      • grue
        arrow-up
        5
        arrow-down
        1
        ·
        10 months ago

        While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.

        Nah, when the user wants to ensure trust and integrity in his own system, it works just fine. The problem comes when the user who needs to be able to access the data is simultaneously the adversary who needs to be stopped from accessing the data.

        In other words, it’s one of those situations where the fact that it’s hard to manage is a gigantic clue that it’s wrongheaded to try to do so in the first place.

        • SzethFriendOfNimi@lemmy.world
          English
          arrow-up
          2
          ·
          10 months ago

          I agree. I mean when doing secure channel communications or weapons systems or health biometrics.

          There are cases where you need to be sure of the integrity of the data and environment