
Toes (@Toes@ani.social)

4GB is practically nothing in this space. Ideally you want at least 10GB of dedicated VRAM, and more if you can get it. Keep in mind you're probably also sharing that VRAM with your operating system, so it's more like ~3GB before you've even started.

KoboldCpp can use your GPU and CPU together, so you might wanna consider that. It does this by offloading some of the model's layers to the GPU and running the rest on the CPU. There's a trade-off between the memory you have available, the quality of the output, and the speed of the calculation.
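To get a feel for the GPU/CPU split, here's a rough back-of-the-envelope sketch. It assumes every layer is about the same size and reserves a guessed 0.5GB for context and scratch buffers. The function name and all the numbers are made up for illustration and aren't anything KoboldCpp itself reports.

```python
def layers_that_fit(model_size_gb, n_layers, free_vram_gb, overhead_gb=0.5):
    """Rough count of layers that fit in VRAM.

    Assumes all layers are equally sized and that `overhead_gb`
    (a guess, not a measured value) is reserved for context and buffers.
    """
    per_layer_gb = model_size_gb / n_layers
    usable_gb = max(free_vram_gb - overhead_gb, 0.0)
    # Never report more layers than the model actually has.
    return min(n_layers, int(usable_gb // per_layer_gb))


# Hypothetical example: a 4GB model file with 32 layers and ~3GB of free VRAM.
print(layers_that_fit(model_size_gb=4.0, n_layers=32, free_vram_gb=3.0))
```

Whatever doesn't fit stays on the CPU, which is where the speed hit comes from. Treat the result as a starting point and adjust down if you run out of memory.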

The model mentioned in this post can be run on the CPU with enough system RAM or swap.

If you wanna keep it all on the GPU, check out 4-bit models. There's also been a lot of work on getting this running on the Raspberry Pi, and I suspect that work could help you out here as well.
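Why 4-bit helps: the weights alone take roughly (number of parameters) x (bits per weight) / 8 bytes. Here's a quick sketch of that arithmetic. The 7B parameter count is just a hypothetical example, not the model from this post, and real usage is higher once you add context and runtime overhead.

```python
def weight_memory_gb(n_params_billion, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores context/KV cache and runtime overhead, so treat it as a lower bound.
    """
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical 7B-parameter model at a few common bit widths.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(7, bits):.1f} GB")
```

So a model that needs around 14GB at 16-bit drops to around 3.5GB at 4-bit, which is still tight on a 4GB card once the OS takes its share.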
