scottmeme, 4 months ago
It's not about their frontend; they are running custom LPUs that can process LLM tokens at 500/sec, which is insanely impressive.
For reference, with a max size of 2k tokens, my dual Xeon Silver 4114 procs take 2-3 minutes.
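To put those two numbers side by side, here's a rough back-of-the-envelope sketch. It assumes "2k" means 2000 tokens and that the 2-3 minutes covers the whole 2k-token run; neither is spelled out above, and the variable names are just for illustration.

```python
# Rough comparison of the quoted figures; not a benchmark.
TOKENS = 2000                  # assumption: "2k tokens" read as 2000
LPU_RATE = 500.0               # tokens/sec, the figure quoted for the LPUs
CPU_SECONDS = (120.0, 180.0)   # 2-3 minutes on the dual Xeon Silver 4114

# Time the LPUs would need for the same 2k tokens at the quoted rate.
lpu_seconds = TOKENS / LPU_RATE

# Effective CPU throughput at the fast and slow ends of the 2-3 minute range.
cpu_rates = [TOKENS / s for s in CPU_SECONDS]

# How many times faster the LPU figure is than each CPU timing.
speedups = [s / lpu_seconds for s in CPU_SECONDS]

print(f"LPU: {lpu_seconds:.0f} s for {TOKENS} tokens")
print(f"CPU: {cpu_rates[1]:.1f} to {cpu_rates[0]:.1f} tokens/sec")
print(f"Speedup: {speedups[0]:.0f}x to {speedups[1]:.0f}x")
```

Under those assumptions, the same 2k tokens would take about 4 seconds at 500/sec, against roughly 11 to 17 tokens/sec on the CPUs, so somewhere around 30x to 45x faster.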