Heh, you shouldn’t be paying for LLMs. Gemini 2.5 Pro is free, and so are a bunch of great API models. ChatGPT kinda sucks these days (depending on the content).
I have technical reasons for running local models (instant cached responses, constrained grammar, logprob output, finetuning), and I can help you set that up if you want. TBH, though, I'm not going to write up a long technical justification of why that's advantageous unless you really want to try all this yourself.