To me, that’s the killer flaw of these things.
It would be great if they were designed from the ground up to be good machines for running models, say with a GPU that had a copious amount of memory that didn’t cost $1,500 as an add-on. Unfortunately, to do that they’d have to create something from nothing, so instead they’ve added something that is worse than most GPUs, paired it with some dumb software whose ultimate result is disappointing people, and called it a day.
You can get a lot done currently with ARC. The mobile ARC versions share system memory, so if you get a mini PC with ARC and upgrade it to 96GB, you can share system RAM with the GPU and load decently large models. They’re a little slow, it not being VRAM and all, but still useful (and cheap).
https://www.youtube.com/watch?v=xyKEQjUzfAk
I have it running on a Zenbook Duo with 32GB, so I can’t load the 70B models, but it works shockingly well.
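For a rough sense of why a 70B model won't fit in 32GB but should in 96GB, here's a back-of-the-envelope sketch. The numbers are assumptions for illustration, not measurements: roughly 4, 8, or 16 bits per weight depending on quantization, and 8GB held back for the OS and everything else. It only counts the weights, so the context cache and runtime overhead come on top. The function names are made up for this example.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def fits_in_shared_ram(params_billion: float, bits_per_weight: float,
                       total_ram_gb: float, reserved_gb: float = 8.0) -> bool:
    """True if the weights fit in RAM after reserving some for the OS (assumed 8 GB)."""
    return weight_footprint_gb(params_billion, bits_per_weight) <= total_ram_gb - reserved_gb


if __name__ == "__main__":
    for ram_gb in (32, 96):
        for bits in (16, 8, 4):
            size = weight_footprint_gb(70, bits)
            verdict = "fits" if fits_in_shared_ram(70, bits, ram_gb) else "does not fit"
            print(f"70B at {bits}-bit is about {size:.0f} GB: {verdict} in {ram_gb} GB")
```

By this estimate a 70B model at 4-bit needs about 35GB for weights alone, which is already over what a 32GB machine has in total, while a 96GB box has room to spare.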