LLM Inference Costs
The cluster focuses on discussions about the costs of running and using large language models, including API pricing from OpenAI and other providers, comparisons between models like GPT-3, GPT-4, and cheaper alternatives, and debates on self-hosting versus cloud inference expenses.
Sample Comments
Also, is GPT-3 an important part of the cost? Have you tried whether e.g. GPT-2 is enough for your use case?
Likely yes. You could even switch to a less intensive model for doing that (e.g. Curie). The Curie model is 1/10th as costly as Davinci to run. Running 8k tokens through Davinci is $0.16 while Curie would only be $0.016 - and that's likely showing up in back end compute and should be considered if someone was building their own chat bot on top of gpt3.
Based on my research, GPT-3.5 is likely significantly smaller than 70B parameters, so it would make sense that it's cheaper to run. My guess is that OpenAI significantly overtrained GPT-3.5 to get as small a model as possible to optimize for inference. Also, Nvidia chips are way more efficient at inference than M1 Max. OpenAI also has the advantage of batching API calls which leads to better hardware utilization. I don't have definitive proof that they're not dumping, but econo
Self-hosting LLMs is expensive at scale. It's cheaper to use VC-subsidized model inference like the OpenAI APIs.
The cost of using OpenAI is also pretty low compared to hosting models yourself.
Wow! Is it really that cheap? GPT-4 is much more expensive, I imagine?
Cheaper than GPT-3? Can you give a comparison of the costs?
Basically nothing, especially if you use cheaper models and set the max token limit in `settings.json`. GPT-4 is $30.00 / 1M tokens for input and $60.00 / 1M tokens for output. For English text, 1 token is approximately 4 characters or 0.75 words. As a point of reference, the collected works of Shakespeare are about 900,000 words or 1.2M tokens. See https://openai.com/api/pricing and [...]
How much is it costing you to run the LLM? Is it OpenAI?
What % of your revenue goes towards LLM costs?
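The arithmetic in the comments above can be checked with a short sketch. The prices are only the figures quoted by the commenters (Davinci at $0.16 per 8k tokens, Curie at $0.016 per 8k tokens, GPT-4 at $30 / $60 per 1M input / output tokens), converted to a per-1M-token rate; they are not current rates and may well have changed. The names `PRICE_PER_1M_TOKENS`, `words_to_tokens`, and `cost_usd` are made up for illustration, and the 0.75 words-per-token ratio is the rough rule of thumb from the comment, not an exact tokenizer count.

```python
# Prices in USD per 1M tokens, taken from the comments above (assumed, not current).
PRICE_PER_1M_TOKENS = {
    "davinci": 20.00,       # quoted as $0.16 per 8k tokens
    "curie": 2.00,          # quoted as $0.016 per 8k tokens
    "gpt-4-input": 30.00,
    "gpt-4-output": 60.00,
}

# Rule of thumb from the comment: 1 token is roughly 0.75 English words.
WORDS_PER_TOKEN = 0.75


def words_to_tokens(words: int) -> int:
    """Estimate a token count from a word count using the 0.75 ratio."""
    return round(words / WORDS_PER_TOKEN)


def cost_usd(tokens: int, price_key: str) -> float:
    """Cost of processing `tokens` tokens at the quoted per-1M-token price."""
    return tokens * PRICE_PER_1M_TOKENS[price_key] / 1_000_000


if __name__ == "__main__":
    # Davinci vs. Curie for an 8k-token request.
    print(f"davinci, 8k tokens: ${cost_usd(8_000, 'davinci'):.3f}")
    print(f"curie,   8k tokens: ${cost_usd(8_000, 'curie'):.3f}")

    # Collected works of Shakespeare, about 900,000 words.
    shakespeare_tokens = words_to_tokens(900_000)
    print(f"shakespeare tokens: {shakespeare_tokens:,}")
    print(f"as GPT-4 input:     ${cost_usd(shakespeare_tokens, 'gpt-4-input'):.2f}")
```

Under these assumptions, feeding the whole of Shakespeare to GPT-4 as input comes to about $36, and the Davinci-to-Curie ratio works out to the 10x mentioned in the comment.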