General_Effort

CUDA 11.4 and above are recommended (this is for GPU users, flash-attention users, etc.) To run Qwen-72B-Chat in bf16/fp16, at least 144GB GPU memory is required (e.g., 2xA100-80G or 5xV100-32G). To run it in int4, at least 48GB GPU memory is required (e.g., 1xA100-80G or 2xV100-32G).

It's derived from Qwen-72B, so the same specs apply. A Q2 quantization brings it down to only ~30GB.
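
For anyone who wants to sanity-check those numbers, here is a rough back-of-the-envelope sketch. It only counts the weights and assumes a nominal 72e9 parameters and 1 GB = 1e9 bytes; the function names are made up for illustration. Activations, KV cache, and quantization metadata are ignored, which is presumably why the quoted int4 requirement (48GB) sits above the raw weight size.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Weights-only footprint in GB (assumes 1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(size_gb: float, n_params: float) -> float:
    """Average bits per weight implied by a given file or memory size."""
    return size_gb * 1e9 * 8 / n_params


N_PARAMS = 72e9  # assumption: nominal parameter count for a "72B" model

if __name__ == "__main__":
    for label, bits in [("bf16/fp16", 16), ("int4", 4)]:
        print(f"{label}: {weight_memory_gb(N_PARAMS, bits):.0f} GB for weights alone")

    # The ~30GB Q2 figure from the comment, turned into an average bit width.
    print(f"~30GB works out to {implied_bits_per_weight(30, N_PARAMS):.2f} bits per weight")
```

The bf16/fp16 line matches the quoted 144GB exactly. The ~30GB figure works out to a bit over 3 bits per weight on average, so a "Q2" file is not literally 2 bits for every weight.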
