That's nice and all, but what are some FOSS models I can run on a GPU with only 4GB of VRAM?
I've tried DeepSeek Coder, and it's pretty nice for what I use it for. Then there's TinyLlama, which... well, it's fast, but I need to be veeeery exact in how I prompt it.