I'm using koboldcpp and ollama. KoboldCpp is really awesome. In terms of hardware it's an old PC with lots of RAM but no graphics card, so it's quite slow for me. I occasionally rent a cloud GPU instance on runpod.io. Not doing anything fancy, mainly role play, recreational stuff, and I occasionally ask it to give me creative ideas for something, translate something, or re-word or draft an unimportant text / email.
Have tried coding, summarizing and other stuff, but the performance of current AI isn't enough for my everyday tasks.
Probably better to ask on !localllama. Ollama should be able to give you a decent LLM, and RAG (Retrieval Augmented Generation) will let it reference your dataset.
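For anyone wondering what the RAG part looks like in practice, here is a minimal sketch, not a real pipeline. It picks the most relevant text chunks for a question and pastes them into the prompt. To stay dependency-free it scores chunks by plain word overlap (cosine similarity over word counts) instead of the embedding model a proper RAG setup would use. The model name `llama3`, the default port, and all helper names are assumptions for illustration; `ask_ollama` only works if an Ollama server is running locally with that model pulled.

```python
import json
import math
import re
import urllib.request
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks whose wording overlaps most with the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Put the retrieved chunks in front of the question."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def ask_ollama(prompt, model="llama3", host="http://localhost:11434"):
    """Send the prompt to a local Ollama server via its /api/generate endpoint.

    Assumption: the server is running on the default port and `model`
    (a placeholder name here) has already been pulled.
    """
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Usage would be something like `ask_ollama(build_prompt("When did the French Revolution begin?", my_chunks))`, where `my_chunks` is your material split into short passages. The retrieved passages are added to the prompt, so the more of them you include, the more context the model has to process.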
The only issue is that you asked for a smart model, which usually means a larger one, plus the RAG portion consumes even more memory, which may be more than a typical laptop can handle. Smaller models have a higher tendency to hallucinate, i.e. produce incorrect answers.
Short answer - yes, you can do it. It's just a matter of how much RAM you have available and how long you're willing to wait for an answer.
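To put a rough number on the RAM question, a back-of-the-envelope sketch: the weights alone take about (parameter count × bits per weight ÷ 8) bytes. The 1.2 overhead factor below is a guess to leave room for context and runtime buffers, not a measured value, and the function names are made up for illustration.

```python
def estimate_weight_gib(params_billion, bits_per_weight):
    """Approximate size of the model weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_ram(params_billion, bits_per_weight, ram_gib, overhead=1.2):
    """Rough check: do weights plus an assumed overhead fit in ram_gib?

    `overhead` is a hypothetical safety factor, not a measured figure.
    """
    return estimate_weight_gib(params_billion, bits_per_weight) * overhead <= ram_gib


# A 7B model quantized to 4 bits per weight needs roughly 3.3 GiB for weights.
print(round(estimate_weight_gib(7, 4), 2))
# A 72B model at 4 bits needs roughly 33.5 GiB, too much for a 16 GiB laptop.
print(fits_in_ram(72, 4, ram_gib=16))
```

By this estimate a small quantized model fits comfortably on a 16 GiB laptop, while the larger "smart" models are out of reach without much more memory.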
As you may know, ChatGPT collects a lot of data on the users for the improvement of their AI, but this poses risks in its own way. I was wondering whether there are privacy alternatives to ChatGPT. Perhaps on F-Droid or Aurora/PlayStore, or for Linux....
Hi there! Looks like you linked to a Lemmy community using a URL instead of its name, which doesn't work well for people on different instances. Try fixing it like this: !localllama
Hello internet users. I have tried gpt4all and like it, but it is very slow on my laptop. I was wondering if anyone here knows of any solutions I could run on my server (debian 12, amd cpu, intel a380 gpu) through a web interface. Has anyone found any good way to do this?
Depends on your needs. Best look around in !localllama or similar. (I don't wanna say reddit but r/localLlama is much larger.)
If you're more into creative writing, maybe look for places that discuss SillyTavern (r/SillyTavernAI is an option). It's software for role-play chats, which may not be what you want. But the community is (relatively) large and likely to have good tips for non-coding/less technical applications.
Any of you have a self-hosted AI "hub"? (e.g. for LLM, stable-diffusion, ...)
I've been looking into self-hosting LLMs or stable diffusion models using something like LocalAI and / or Ollama and LibreChat....
Self hosting an LLM for research
I am a teacher and I have a LOT of different literature material that I wish to study, and play around with....
Privacy alternatives ChatGPT
Inside the Creation of the World’s Most Powerful Open Source AI Model (www.wired.com)
Self hosted LLM
Smaug-72B-v0.1: The New Open-Source LLM Roaring to the Top of the Leaderboard (huggingface.co)
Abacus.ai:...