Exactly! If you only want to use a Large Language Model (LLM) to run your own local chatbot, then using a quantized version will dramatically improve speed and performance. It also allows consumer hardware to run larger models which would otherwise be prohibitively resource intensive.