News

Fine-tuned models are generally smaller than their large language model counterparts. One example is OpenAI’s Codex, a direct descendant of GPT-3 that was fine-tuned for programming tasks.
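Purely as an illustration of what "fine-tuned for programming tasks" means in practice (not OpenAI's actual pipeline), a minimal sketch of fine-tuning a small pretrained causal language model on toy code strings might look like the following; the model name, data, and hyperparameters are assumptions.

```python
# Hedged sketch: fine-tuning a small pretrained causal LM on code snippets.
# The model name, the toy training strings, and all hyperparameters are
# illustrative assumptions, not details from the article.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "distilgpt2"  # small stand-in for a much larger base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# A couple of toy examples standing in for a real code corpus.
samples = [
    "# add two numbers\ndef add(a, b):\n    return a + b",
    "# reverse a string\ndef reverse(s):\n    return s[::-1]",
]
batch = tokenizer(samples, return_tensors="pt", padding=True)
batch["labels"] = batch["input_ids"].clone()
batch["labels"][batch["attention_mask"] == 0] = -100  # ignore padding in the loss

optimizer = AdamW(model.parameters(), lr=5e-5)
model.train()
for step in range(3):  # a few gradient steps, just to show the loop shape
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"step {step}: loss {loss.item():.3f}")
```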
All of today’s well-known language models—e.g., GPT-3 from OpenAI, PaLM or LaMDA from Google, Galactica or OPT from Meta, Megatron-Turing from Nvidia/Microsoft, Jurassic-1 from AI21 Labs—are ...
Students often train large language models (LLMs) as part of a group. In that case, the group should implement robust access ...
More information: Valentin Hofmann et al, Derivational morphology reveals analogical generalization in large language models, Proceedings of the National Academy of Sciences (2025). DOI: 10.1073 ...
Many NLP applications are built on language representation models (LRMs) designed to understand and generate human language. Examples of such models include GPT (Generative Pre-trained Transformer ...
UC Berkeley researchers say large language models have gained "metalinguistic ability," a hallmark of human language and ...
They dubbed the resulting dataset “TinyStories” and used it to train very small language models of around 10 million parameters. To their surprise, when prompted to create its own stories, the small ...
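For a rough sense of what a model at that scale looks like, here is a minimal sketch of a GPT-2-style configuration in the roughly 10-million-parameter range; the vocabulary size, depth, and width below are assumptions, not the TinyStories paper's actual settings.

```python
# Hedged sketch: instantiating a GPT-2-style model at roughly the scale the
# TinyStories item describes. The exact vocabulary size, depth, and width are
# assumptions, not the paper's configuration.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=8192,   # a small vocabulary keeps the embedding table tiny
    n_positions=512,   # short context is enough for short stories
    n_embd=256,        # hidden size
    n_layer=10,        # transformer blocks
    n_head=8,          # attention heads
)
model = GPT2LMHeadModel(config)
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params / 1e6:.1f}M")  # prints roughly 10M for this config
```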
Large language models are infamous for spewing toxic biases, thanks to the reams of awful human-produced content they get trained on. But if the models are large enough, and humans have helped ...
The starting point of a FlexOlmo project is a so-called anchor AI model. Every organization that participates in the project ...
For example, AI21 Labs in April debuted a model called Jamba, an intriguing combination of transformers with a second neural network called a state space model (SSM). The mixture has allowed Jamba ...
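The Jamba item describes the hybrid only at a high level; purely as an illustration of what interleaving attention layers with a state-space-style layer can look like, here is a minimal sketch. The toy recurrence, layer sizes, and stacking pattern are assumptions for illustration, not Jamba's or Mamba's actual design.

```python
# Hedged sketch: alternating attention blocks with a highly simplified
# state-space-style recurrence, to illustrate the hybrid idea behind models
# like Jamba. This is not Jamba's architecture.
import torch
import torch.nn as nn

class SimpleSSMBlock(nn.Module):
    """Per-channel linear recurrence h_t = a * h_{t-1} + b * x_t, output c * h_t."""
    def __init__(self, dim: int):
        super().__init__()
        self.a = nn.Parameter(torch.full((dim,), 0.9))  # per-channel decay
        self.b = nn.Parameter(torch.ones(dim))
        self.c = nn.Parameter(torch.ones(dim))
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        h = torch.zeros(x.size(0), x.size(2), device=x.device)
        outs = []
        for t in range(x.size(1)):       # sequential scan over time steps
            h = self.a * h + self.b * x[:, t]
            outs.append(self.c * h)
        return self.norm(x + torch.stack(outs, dim=1))  # residual connection

def hybrid_stack(dim: int = 64, n_heads: int = 4, n_pairs: int = 2) -> nn.Sequential:
    """Alternate one attention block with one SSM-style block, n_pairs times."""
    layers = []
    for _ in range(n_pairs):
        layers.append(nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                                 batch_first=True))
        layers.append(SimpleSSMBlock(dim))
    return nn.Sequential(*layers)

model = hybrid_stack()
tokens = torch.randn(2, 16, 64)  # (batch, seq_len, dim) dummy activations
print(model(tokens).shape)       # torch.Size([2, 16, 64])
```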