
BetaDoggo_ ,

Koboldcpp should let you run much larger models by offloading some layers to system RAM. There's a fork that supports ROCm for AMD cards: https://github.com/YellowRoseCx/koboldcpp-rocm

Make sure to use quantized models for the best performance; Q4_K_M is the common standard.
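As a rough sketch of what launching it looks like (the model filename here is a placeholder, and the exact flag set depends on your koboldcpp version — check `--help` on your build):

```shell
# Download a Q4_K_M-quantized GGUF model, then point koboldcpp at it.
# --gpulayers controls how many layers go to the GPU; the rest stay in RAM.
# Lower the number if you run out of VRAM.
python koboldcpp.py mymodel.Q4_K_M.gguf --gpulayers 20 --contextsize 4096
```

The key idea is that `--gpulayers` lets you split the model between VRAM and system RAM, so a model bigger than your VRAM can still run, just more slowly.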
