Eerke Boiten, Professor of Cyber Security at De Montfort University Leicester, explains his belief that current AI should not be used for serious applications.
If you’re unfamiliar with FPGAs, you may want to read up a bit, but essentially an FPGA is a generic platform that is reprogrammed between iterations so it can do something more efficiently than a generic instruction set. You tell it what to do, and it does it.
This is more efficient than x86, ARM, or RISC because you’re setting the boundaries and capabilities, not the other way around.
Your understanding of GPUs is wrong though. What people run now is BECAUSE of GPUs being available and able to run those workloads. Not even well, just quickly. Having an FPGA set for YOUR specific work is drastically more efficient, and potentially faster depending on what you’re doing. Obviously for certain things, it’s a round peg in a square hole, but you have to develop for what is going to work for your own specific use-case.
I know exactly what they are. I design CPUs for a living, use FPGAs to emulate them, and have worked on GPUs and many other ASICs in the past.
FPGAs can accelerate certain functions, yes, but neural net evaluation is basically massive matrix multiplies. That’s something that GPUs are already highly optimised for. Hence why I asked what circuit you’d put on the FPGA. Unless you can accelerate the algorithmic evaluation by several orders of magnitude, the inefficiency of FPGAs vs ASICs will cripple you.
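To illustrate the "basically massive matrix multiplies" point, here is a minimal pure-Python sketch of one fully connected layer. The names `matmul`, `dense_layer`, and `mac_count` are made up for this example and do not come from any library or from the posts above. Real frameworks use optimised kernels rather than loops like these.

```python
def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix, one multiply-accumulate at a time."""
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]  # the multiply-accumulate (MAC)
            out[i][j] = acc
    return out


def dense_layer(x, w, bias):
    """One fully connected layer: ReLU(x @ w + bias). Hypothetical helper."""
    z = matmul(x, w)
    return [[max(0.0, v + bias[j]) for j, v in enumerate(row)] for row in z]


def mac_count(batch, in_features, out_features):
    """Number of multiply-accumulates the layer's matrix multiply performs."""
    return batch * in_features * out_features


if __name__ == "__main__":
    x = [[1.0, 2.0]]                  # one input with two features
    w = [[1.0, -1.0], [0.5, 2.0]]     # 2 x 2 weight matrix
    bias = [0.0, 1.0]
    print(dense_layer(x, w, bias))
    print(mac_count(1, 2, 2))
```

Almost all of the work is in the inner `acc += ...` line, which is the operation that GPUs and dedicated matrix accelerators are built to do in bulk.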
You don’t design CPUs for a living unless you’re talking about the manufacturing process, or maybe you’re just bad at it and work for Intel. Your understanding of how FPGA works is super flawed, and your boner for GPUs is awkward. Let me explain some things as someone who actually works in this industry.
Matrix math is just stupid for whatever you pipe through it. It does the input, and gives an output.
That is exactly what all these “NPU” co-processing cores are about from AMD, Intel, and to a further subset Amazon and Google on whatever they’re calling their chips now. They are all about an input and output for math operations as fast as possible.
In my own work, these little AMD XDNA chips pop out multiple segmented channels way better than GPUs when gated for single purpose. Image inference, audio, logic, you name it. And then, SHOCKER!, if I try and move this to a cloud instance, I can reprogram the chip on the fly to swap from one workload to another in 5ms. It’s not just a single purpose math shoveling instance anymore, it’s doing articulations on audio clips, or if the worker wants, doing ML transactions for data correlation. This costs almost 75% less than provisioning stock sets of any instances to do the same workload.
Matrix math is just stupid for whatever you pipe through it. It does the input, and gives an output.
Indeed.
That is exactly what all these “NPU” co-processing cores are about from AMD, Intel, and to a further subset Amazon and Google on whatever they’re calling their chips now. They are all about an input and output for math operations as fast as possible.
Yes, they are all matrix math accelerators, and none of them have any FPGA aspects.
Yes, but what you need to be doing is tons of multiply-accumulate, using a fuckton of memory bandwidth… which a GPU is designed for. You won’t design anything much better with an FPGA.
You have no idea what you’re talking about.
Indeed.
Yes, they are all matrix math accelerators, and none of them have any FPGA aspects.
Except AMD XDNA is a straight up FPGA, and Intel XEco is as well.
For someone who claims to work in this industry, you sure have no idea what’s going on.
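To put rough numbers on the multiply-accumulate and memory-bandwidth point raised in the thread, here is a back-of-the-envelope Python sketch of a roofline-style estimate for a single matrix-vector multiply. The function names are invented for illustration. The peak-compute and bandwidth figures are hypothetical placeholders, not measurements of any real GPU, NPU, or FPGA.

```python
def matvec_flops(rows, cols):
    """FLOPs for y = W @ x: one multiply and one add per weight."""
    return 2 * rows * cols


def matvec_bytes(rows, cols, bytes_per_value=2):
    """Bytes moved, assuming weights, input, and output are each touched once (fp16 by default)."""
    return bytes_per_value * (rows * cols + cols + rows)


def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved


def attainable_flops(peak_flops, bandwidth_bytes_per_s, intensity):
    """Roofline estimate: capped by either compute or memory bandwidth."""
    return min(peak_flops, bandwidth_bytes_per_s * intensity)


if __name__ == "__main__":
    n = 4096                 # hypothetical layer width
    peak = 100e12            # hypothetical 100 TFLOP/s of peak compute
    bandwidth = 1e12         # hypothetical 1 TB/s of memory bandwidth
    ai = arithmetic_intensity(matvec_flops(n, n), matvec_bytes(n, n))
    print(f"arithmetic intensity: {ai:.3f} FLOP/byte")
    print(f"attainable: {attainable_flops(peak, bandwidth, ai) / 1e12:.2f} TFLOP/s")
```

Under these assumptions the workload gets roughly one FLOP per byte, so memory bandwidth sets the ceiling long before peak compute does. By this estimate the limit applies to whichever chip runs the multiply, unless it can keep the weights close to the compute units.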