ylai · English · 5 months ago: Scalable MatMul-free Language Modeling (arxiv.org)
ylai · English · 6 months ago: 1-bit LLMs Could Solve AI’s Energy Demands (spectrum.ieee.org)
ylai · English · 8 months ago: Mistral 7B v0.2 Base (released at SHACK15sf hackathon) (github.com)
ylai · English · 8 months ago: Evolving New Foundation Models: Unleashing the Power of Automating Model Development (sakana.ai)
ylai · English · 8 months ago: GaLore: Advancing Large Model Training on Consumer-grade Hardware (huggingface.co)
ylai · English · 8 months ago: How Chain-of-Thought Reasoning Helps Neural Networks Compute (www.quantamagazine.org)
ylai · English · 8 months ago: Why Are Large AI Models Being Red Teamed? (spectrum.ieee.org)
ylai · English · 8 months ago: GPT-4 won't run DOOM but will play the game poorly (www.theregister.com)
ylai · English · 8 months ago: LLMs become more covertly racist with human intervention (www.technologyreview.com)
ylai · English · 8 months ago: AI chatbot models ‘think’ in English even when using other languages (www.newscientist.com)
ylai · English · 9 months ago: AI Prompt Engineering Is Dead (spectrum.ieee.org)
ylai · English · 9 months ago: Large language models can do jaw-dropping things. But nobody knows exactly why. (www.technologyreview.com)
ylai · English · 10 months ago: Mixtral of Experts (arxiv.org)
ylai · English · 10 months ago: Finetune LLMs on your own consumer hardware using tools from PyTorch and Hugging Face ecosystem (pytorch.org)
ylai · English · 11 months ago: “AI’s Ostensible Emergent Abilities Are a Mirage” paper won the Outstanding Paper Award at NeurIPS 2023 (nitter.net, image)
ylai · English · 1 year ago: High-Performance Llama 2 Training and Inference with PyTorch/XLA on Cloud TPUs (pytorch.org)
ylai · English · 1 year ago: Llemma: An Open Language Model For Mathematics (blog.eleuther.ai)
ylai · English · 1 year ago: “Large Language Models (in 2023)” (Talk by Hyung Won Chung, OpenAI, at Seoul National University) (www.youtube.com)
ylai · English · 1 year ago: LLM Finetuning Risks (llm-tuning-safety.github.io)
ylai · English · 1 year ago: Phi 1.5 and the Shift Towards Smaller Models with Curated Data: A Closer Look (medium.com)