LLM Parameter Sizes
Comments discuss the parameter counts of large language models: what counts as a 'large' or 'small' model, comparisons to reference models like GPT-3, and related topics such as quantization, inference efficiency, and scaling.
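Several of the comments below hinge on the same back-of-the-envelope arithmetic: how much memory a given parameter count implies at a given numeric precision, which is what the 4-bit and 8-bit quantization remarks are really about. The Python sketch below is illustrative only: the parameter counts are the ones quoted in the comments, the bytes-per-parameter figures are the standard ones for fp32/fp16/int8/int4, and the totals cover weights alone, ignoring activations, KV cache, and runtime overhead.

# Illustrative weight-memory arithmetic for the parameter counts quoted
# in the comments. Weights only; activations, KV cache, and framework
# overhead are ignored.

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

def weight_memory_gb(n_params: float, precision: str) -> float:
    """Approximate memory needed to hold the weights, in gigabytes."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9

for label, n_params in [("2.7B", 2.7e9), ("7B", 7e9), ("70B", 70e9),
                        ("175B", 175e9), ("280B", 280e9)]:
    row = ", ".join(f"{p}: {weight_memory_gb(n_params, p):,.1f} GB"
                    for p in BYTES_PER_PARAM)
    print(f"{label:>6} params -> {row}")

At 4-bit precision a 175B model needs roughly 88 GB for its weights versus about 350 GB at fp16, which is the kind of saving the quantization comments point to.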
(Charts: Activity Over Time, Top Contributors, Keywords)
Sample Comments
Pffft only 280B parameters? Give me a break
How does training on just 800k pieces of data need 7B parameters?
Is it really close to GPT-3.5 at 2.7B?
Reckon they will (if not already) use 4-bit or 8-bit precision and may not need 175B params
The embedding size is only 8k while the parameter count is 70B, so it's a huge difference
28.7 million parameters is nothing for inference
Is 14B parameters still considered small?
Assuming you're referring to the largest model - BLOOM is huge, this is not, so presumably much worse
How can a Large Language Model be a small language model?
Is a tiny large language model equivalent to a normal sized one?
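One comment above contrasts an 8k embedding (hidden) size with a 70B total parameter count. The sketch below is a rough decomposition of where such parameters sit in a decoder-only transformer; the 32k vocabulary and 80-layer depth are assumptions chosen for illustration, not any specific model's published configuration.

# Rough parameter decomposition for a transformer with an 8k hidden size.
# vocab_size and n_layers are illustrative assumptions, not a real config.

hidden_size = 8192      # the "8k" embedding / hidden dimension
vocab_size = 32_000     # assumed vocabulary size
n_layers = 80           # assumed number of transformer layers

embedding_params = vocab_size * hidden_size      # token embedding table
attention_params = 4 * hidden_size ** 2          # Q, K, V, output projections
mlp_params = 8 * hidden_size ** 2                # ~4x expansion, up + down projections
per_layer_params = attention_params + mlp_params

total_params = embedding_params + n_layers * per_layer_params

print(f"embedding params : {embedding_params / 1e9:.2f}B")
print(f"per-layer params : {per_layer_params / 1e6:.0f}M")
print(f"approx total     : {total_params / 1e9:.1f}B")

Under those assumptions the attention and MLP blocks account for roughly 64B of the total while the embedding table contributes about 0.26B, which is why a hidden size of "only 8k" is perfectly compatible with a 70B-parameter model.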