Huggingface TGI is just a piece of software handling the models, like gpt4all. Here is a list of models officially supported by TGI, although they state that you can try different ones as well. You follow the link and look for the files section. The size of the model files (safetensors or pickele binaries) gives a good estimate of how much vram you will need. Sadly this is more than most consumer graphics cards have except for santacoder and microsoft phi.