Torch package

obbeel · 3 years ago

wazowski · 3 years ago

basically torch is a huge lib in itself, and it targets not only virtually all cpu architectures, but also multiple gpu frameworks (cuda, rocm, vulkan), which between them support thousands of gpus, both desktop and mobile

and all of this is packaged into a single binary so that it works for everyone, regardless of hardware

if you want a smaller size, you can compile it from source for your specific architecture, or download minimised precompiled versions for your target architecture
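
if you want to see what your installed build actually ships with before swapping it for a slimmer one, here's a minimal sketch. `describe_torch_build` is a name i made up for illustration, and it assumes a reasonably recent torch where `torch.version.cuda` and `torch.version.hip` exist (it falls back to `None` if they don't). it also works when torch isn't installed at all

```python
import importlib.util


def describe_torch_build():
    """Report which gpu toolkits the installed torch wheel was built against."""
    # Skip the import entirely when torch is not installed.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA toolkit version string, or None for a build without CUDA.
        "cuda_build": getattr(torch.version, "cuda", None),
        # ROCm/HIP version string, or None for a build without ROCm.
        "rocm_build": getattr(torch.version, "hip", None),
        # True only if a usable gpu is visible right now.
        "gpu_usable": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

if both `cuda_build` and `rocm_build` come back as `None` you're already on a cpu-only build. one way to get that on purpose is `pip install torch --index-url https://download.pytorch.org/whl/cpu`, which pulls the cpu-only wheels from pytorch's own index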

Amicese · edit-2 2 years ago

deleted by creator

wazowski · 3 years ago

it’s really a miracle how all of this is held together tbh while being so cross-platform

the core engine of torch, which contains things like vector calculus (automatic differentiation), some tensor operations, data preprocessing, data de/serialisation, et cetera, is written in regular C++, so it runs on anything a C++ compiler can target, which is basically everything
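
to give an idea of what "automatic differentiation" means, here's a toy sketch in plain python. this is not how torch implements it (the real thing is C++ and works on tensors), and the `Value` class is just a made-up name for illustration: every result remembers its inputs and the local derivatives, and `backward` walks the graph in reverse applying the chain rule

```python
class Value:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        # One function per parent: maps the output gradient to that parent's share.
        self._grad_fns = grad_fns

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Build a topological order so each node is processed after its users.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(output)/d(output)
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)


if __name__ == "__main__":
    x, y = Value(3.0), Value(4.0)
    z = x * y + x          # dz/dx = y + 1, dz/dy = x
    z.backward()
    print(z.data, x.grad, y.grad)  # 15.0 5.0 3.0
```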

the problem starts when you want to add gpu acceleration in order to speed up things like matrix multiplication (which is typically the most computationally expensive part of the machine learning pipeline)
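
a rough illustration of why matrix multiplication is the part worth accelerating: a naive version in plain python that also counts the multiply-adds it performs. the function name and the counter are only there for illustration, real libs use heavily tuned kernels instead of loops like this

```python
def matmul_counting(a, b):
    """Naive matrix product of a (n x k) and b (k x m), plus the multiply-add count."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    multiply_adds = 0
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
                multiply_adds += 1
            out[i][j] = acc
    return out, multiply_adds


if __name__ == "__main__":
    product, ops = matmul_counting([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(product)  # [[19.0, 22.0], [43.0, 50.0]]
    print(ops)      # 8
```

the count is n * k * m, so two square 1024x1024 matrices already need 1024^3 (about 1.07 billion) multiply-adds. every one of those inner products is independent of the others, which is exactly the kind of work a gpu is good at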

when torch (and other ml libs) started out, cuda was basically the most advanced, easiest to use lib for gpu compute (probably still is), and nvidia gpus were far superior to anything the competition could offer, and ml on mobile devices wasn’t a thing, so everyone went for it and for a long time ml existed almost solely on devices with nvidia graphics cards that could support cuda

then amd and arm started to catch up: amd rocm was added to support amd gpus, and vulkan was added to support gpus on mobile devices as well as nvidia and amd gpus. at the moment all of this exists in this kind of mess, where practically all functionality is supported if you use cuda, but for rocm and vulkan a lot of things don’t work, and you often have to compile everything from scratch for things to be supported

and now all of this mess is wrapped in python to simplify the api, which was a big mistake in my opinion, bc not only is the api simplification unnecessary, but now if you want to target any specific architecture, it must be supported by the core torch engine, some version of a gpu compute lib (unless you want to do inference on the cpu, which you prolly don’t), and the python wrapper

so now, bc you want everything to work out of the box, all of these things are put into a binary, which results in this huge file size, and i imagine the maintenance of torch is pretty hard at least partially as a result of this

if you were building something like torch today, things would be a lot simpler, bc you could just write the core engine in smth like C++, and then use smth like vulkan kompute, which is a wrapper api around regular vulkan, but massively simpler and more user-friendly, and works on pretty much any gpu that has a vulkan driver, and boom, you have a much more concise and easily maintainable library