I know people here are very skeptical of AI in general, and there is definitely a lot of hype, but I think the progress in the last decade has been incredible.

Here are some quotes:

“In my field of quantum physics, it gives significantly more detailed and coherent responses” than did the company’s last model, GPT-4o, says Mario Krenn, leader of the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany.

Strikingly, o1 has become the first large language model to beat PhD-level scholars on the hardest series of questions — the ‘diamond’ set — in a test called the Graduate-Level Google-Proof Q&A Benchmark (GPQA). OpenAI says that its scholars scored just under 70% on GPQA Diamond, and o1 scored 78% overall, with a particularly high score of 93% in physics.

OpenAI also tested o1 on a qualifying exam for the International Mathematics Olympiad. Its previous best model, GPT-4o, correctly solved only 13% of the problems, whereas o1 scored 83%.

Kyle Kabasares, a data scientist at the Bay Area Environmental Research Institute in Moffett Field, California, used o1 to replicate some coding from his PhD project that calculated the mass of black holes. “I was just in awe,” he says, noting that it took o1 about an hour to accomplish what took him many months.

Catherine Brownstein, a geneticist at Boston Children’s Hospital in Massachusetts, says the hospital is currently testing several AI systems, including o1-preview, for applications such as connecting the dots between patient characteristics and genes for rare diseases. She says o1 “is more accurate and gives options I didn’t think were possible from a chatbot”.

  • hotcouchguy [he/him]@hexbear.net · 7 hours ago

    Kyle Kabasares, a data scientist at the Bay Area Environmental Research Institute in Moffett Field, California, used o1 to replicate some coding from his PhD project that calculated the mass of black holes. “I was just in awe,” he says, noting that it took o1 about an hour to accomplish what took him many months.

    Bro it was trained on your thesis

    • UlyssesT [he/him]@hexbear.net · 7 hours ago

      This is dangerously close to “prompt: say ‘I love you senpai’” and then suddenly feeling as if the treat printer really does love the computer toucher.

      People will believe what they want to believe, and even a data scientist isn’t immune to that impulse, especially when their job encourages it.

  • InevitableSwing [none/use name]@hexbear.net · 7 hours ago

    There is definitely a lot of hype.

    I’m not being sarcastic when I say I have yet to see a single real world example where the AI does extraordinarily well and lives up to the hype. It’s always the same.

    It’s brilliant!*

    *When it’s spoonfed in a non-real-world situation. Your results may vary. Void where prohibited.

    OpenAI also tested o1 on a qualifying exam for the International Mathematics Olympiad. Its previous best model, GPT-4o, correctly solved only 13% of the problems, whereas o1 scored 83%.

    Ah, I read an article on the Mathematics Olympiad. The NYT agrees!

    Move Over, Mathematicians, Here Comes AlphaProof

    A.I. is getting good at math — and might soon make a worthy collaborator for humans.

    The problem, as always, is that the US media is shit. Comments on that article by randos are better and far more informative than that PR-hype article pretending to be journalism.

    Major problem with this article: competition math problems use a standardized collection of solution techniques, it is known in advance that a solution exists, and that the solution can be obtained by a prepared competitor within a few hours.

    “Applying known solutions to problems of bounded complexity” is exactly what machines always do and doesn’t compete with the frontier in any discipline.

    ---

    Note in the caption of the figure that the problem had to be translated into a formalized statement in AlphaGeometry’s own language (presumably by people). This is often the hardest part of solving one of these problems.

    AI tech bros keep promising the moon and the stars. But then their AI doesn’t deliver, so the tech bros lie even more about everything to get more funding. But things don’t pan out again, and the churn continues. Tech bros promise the moon and the stars…

    • UlyssesT [he/him]@hexbear.net · 7 hours ago

      The Rube Goldbergian machine that burns forests and dries up lakes needs just a few more Rube Goldbergian layers to do… what we already had, more or less, but quicker and sloppier with more errors and more burned forests and dried up lakes.

      I truly do believe that most of the loudest “AI” proselytizers are trying to convince everyone else, and perhaps themselves, that there’s more to this than what’s being presented. Just like in the cyberpunkerino treats, criticism, doubt, or even concern about the harm this technology has already done, and will do on a larger scale, gets framed in a tiresome, lazy, thought-terminating “you are just Luddites afraid of the future” way. soypoint-1 k-pain soypoint-2

      • batsforpeace [any, any]@hexbear.net · 16 minutes ago

        Despite skepticism over whether nuclear fusion—which doesn’t emit greenhouse gases or carbon dioxide—will actually come to fruition in the next few years or decades, Gates said he remains optimistic. “Although their timeframes are further out, I think the role of fusion over time will be very, very critical,” he told The Verge.

        gangster-spongebob don’t worry climate folks, we will throw some dollars at nuclear fusion startups and they will make us beautiful clean energy for AI datacenters in just a few years, only a few more years of big fossil fuel use while we wait, promise

        Oracle currently has 162 data centers in operation and under construction globally, Ellison told analysts during a recent earnings call, adding that he expects the company to eventually have 1,000 to 2,000 of these facilities. The company’s largest data center is 800 megawatts and will contain “acres” of Nvidia (NVDA)’s graphics processing units (GPUs) to train A.I. models, he said.

        porky-happy I want football fields of gpus

        Ellison described a dinner with Elon Musk and Jensen Huang, the CEO of Nvidia, where the Oracle head and Musk were “begging” Jensen for more A.I. chips. “Please take our money. No, take more of it. You’re not taking enough, we need you to take more of it,” recalled Ellison, who said the strategy worked.

        NOOOOO give us more chips brooo

  • fubarx · 8 hours ago

    Tried it for Python coding involving PDFs, OCR, and text substitution. Did worse than GPT-4o (which also failed).

    Gave up and told it so. At least it was very apologetic.
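    For context on the kind of task described: once OCR has turned the PDF pages into plain text, the substitution step is ordinary regex work. A minimal sketch of just that step (the function name and patterns are illustrative, and the OCR itself, e.g. pytesseract over pdf2image-rendered pages, is only indicated in a comment):

```python
import re

# OCR step not shown: in a real pipeline, `text` would come from something
# like pytesseract.image_to_string() run over pages rendered from the PDF.
def substitute_terms(text: str, replacements: dict) -> str:
    """Apply whole-word substitutions to OCR-extracted text."""
    for pattern, repl in replacements.items():
        text = re.sub(rf"\b{re.escape(pattern)}\b", repl, text)
    return text

print(substitute_terms("ACME Corp invoice for ACME", {"ACME": "REDACTED"}))
# prints: REDACTED Corp invoice for REDACTED
```

    In practice the hard part of such a pipeline is OCR quality, not the substitution; garbled OCR output is why word-boundary matching alone often isn't enough.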

    • sgtlion [any]@hexbear.net · 5 hours ago

      I feel like a broken record saying this, but AI frequently does solve coding problems for me that would’ve taken hours. It can’t solve everything, and can’t handle large amounts of code, but it can be genuinely useful.

      • JoeByeThen [he/him, they/them]@hexbear.net · 2 hours ago

        Same, but it has to be prompted well. If you want it to work for you like a junior coding assistant, you need to talk to it like one: outline what you need, refine the prompt for caveats, and provide unique information for specialized use cases. I find it especially helpful for one-off programming in languages I’m not familiar with, or for getting past the mental block of a blank page.

        Also, there’s a lot of stuff being thrown at LLMs that really shouldn’t be. It’s not the be-all and end-all of AI tech.

  • KobaCumTribute [she/her]@hexbear.net · 10 hours ago

    All of their models have consistently done pretty well on any sort of standard test, and then performed horribly in real use. Which makes sense: if they train it specifically to produce something that looks like the answers to a test, it will probably be good at answering that test. But it’s still fundamentally just a language parser and predictor, without knowledge or any sort of internal modeling.

    Their entire approach is just so fundamentally lazy and grifty, burning massive amounts of energy on what is fundamentally a dumbshit approach to building AI. It’s like trying to make a brain by just making the speech processing lobe bigger and bigger and expecting it’ll eventually get so good at talking that the things it says will be intrinsically right instead of only looking like text.

  • EelBolshevikism [none/use name]@hexbear.net · 8 hours ago

    I feel like it passing the tests, but then also getting 30% of biology responses wrong, might be the most ironic and sad example of the failure of our memory-based education system I can think of.

  • hypercracker [he/him]@hexbear.net · 10 hours ago

    Kyle Kabasares, a data scientist at the Bay Area Environmental Research Institute in Moffett Field, California, used o1 to replicate some coding from his PhD project that calculated the mass of black holes. “I was just in awe,” he says, noting that it took o1 about an hour to accomplish what took him many months.

    Yeah, I’m gonna doubt that, or he didn’t actually compile/run/test that code. Like all LLMs, it’s amazing until you interact with it a bit and see how incredibly limited it is.

  • It works 100% of the time 70% of the time now! While this is interesting, and chain-of-thought reasoning is a creative way to get better at logic, it is inefficient and expensive to the point where hiring a person is certainly cheaper. I believe the API is only available to those who have already spent $1k on OpenAI subscriptions.

  • FrogPrincess · 8 hours ago

    I know people here are very skeptical of AI in general, and…

    People here are gonna have to come up with something to say about AI apart from just screeching “I hate it.”

    When man attains the knowledge of this common essence, he uses it as a guide and proceeds to study various concrete things which have not yet been studied, or studied thoroughly, and to discover the particular essence of each; only thus is he able to supplement, enrich and develop his knowledge of their common essence and prevent such knowledge from withering or petrifying. These are the two processes of cognition: one, from the particular to the general, and the other, from the general to the particular. Thus cognition always moves in cycles and (so long as scientific method is strictly adhered to) each cycle advances human knowledge a step higher and so makes it more and more profound. Where our dogmatists err on this question is that, on the one hand, they do not understand that we have to study the particularity of contradiction and know the particular essence of individual things before we can adequately know the universality of contradiction and the common essence of things, and that, on the other hand, they do not understand that after knowing the common essence of things, we must go further and study the concrete things that have not yet been thoroughly studied or have only just emerged. Our dogmatists are lazy-bones. They refuse to undertake any painstaking study of concrete things, they regard general truths as emerging out of the void, they turn them into purely abstract unfathomable formulas, and thereby completely deny and reverse the normal sequence by which man comes to know truth. Nor do they understand the interconnection of the two processes in cognition— from the particular to the general and then from the general to the particular. They understand nothing of the Marxist theory of knowledge.

    • UlyssesT [he/him]@hexbear.net · 7 hours ago

      apart from just screeching

      Their emotional screeching.

      Your enlightened totally-non-emotional proselytizing.

      https://futurism.com/openai-employees-say-firms-chief-scientist-has-been-making-strange-spiritual-claims

      You voluntarily came in here, leading with your passive-aggressive “screeching” framing. Don’t whine about being greeted as a clown while you’re riding a Silicon Valley unicycle and juggling cherry-picked very-smart-and-very-important quotes, all while internalizing those tech startup hype pitches and corporate ad copy. clown

    • EelBolshevikism [none/use name]@hexbear.net · 8 hours ago

      We have consistently given other reasons. Our criticisms of AI are salient and grounded both in the fundamental structure of the technology and in the ways it’s employed.

      edit: the quote you are giving is far more critical of the highly idealistic form of ‘rationality’ present in Silicon Valley than anyone on this site is.

      edit 2: you have likely seen a lot of people criticize it on the grounds of its bad outputs, as with visual arts. While that criticism is often fair, it’s a really weak one; the far more pressing issues with ‘AI’ are structural and far less dependent on the specific capabilities of a given model.