February 28, 2024
First 'truly' open source LLM from AI2 to drive 'critical shift' in AI development


The Allen Institute for AI (AI2), a non-profit research institute founded in 2014 by the late Microsoft co-founder Paul Allen, announced today the release of OLMo, an open source large language model that it calls the “first truly open LLM and framework.” AI2 says the release creates an “alternative to current models that are restrictive and closed” and drives a “critical shift” in AI development.

While other models have included the model code and model weights, OLMo also provides the training code, training data and associated toolkits, as well as evaluation toolkits. In addition, OLMo was released under an Open Source Initiative (OSI)-approved license, with AI2 saying that “all code, weights, and intermediate checkpoints are released under the Apache 2.0 License.”

The news comes at a moment when open source/open science AI, which has been playing catch-up to closed, proprietary LLMs like OpenAI’s GPT-4 and Anthropic’s Claude, is making significant headway.

For example, yesterday the CEO of Paris-based open source AI startup Mistral confirmed the ‘leak’ of a new open source AI model nearing GPT-4 performance. And on Monday, Meta released a new and improved version of its code generation model, Code Llama 70B, as many eagerly await the third iteration of its Llama LLM.

However, open source AI continues to come under fire from some researchers, regulators and policy makers. A recent, widely shared opinion piece in IEEE Spectrum, for instance, is titled “Open-Source AI is Uniquely Dangerous.”

The OLMo framework’s “completely open” AI development tools, available to the public, include full pretraining data, training code, model weights and evaluation. The framework provides inference code, training metrics and training logs, as well as the evaluation suite used in development: 500+ checkpoints per model, “from every 1000 steps during the training process and evaluation code under the umbrella of the Catwalk project.”

The researchers at AI2 said they will continue to iterate on OLMo with different model sizes, modalities, datasets, and capabilities.

“Many language models today are published with limited transparency,” said Hanna Hajishirzi, OLMo project lead, a senior director of NLP Research at AI2, and a UW professor, in a press release. “Without having access to training data, researchers cannot scientifically understand how a model is working. It’s the equivalent of drug discovery without clinical trials or studying the solar system without a telescope. With our new framework, researchers will finally be able to study the science of LLMs, which is critical to building the next generation of safe and trustworthy AI.”

Nathan Lambert, an ML scientist at AI2, posted on LinkedIn saying that “OLMo will represent a new type of LLM enabling new approaches to ML research and deployment, because on a key axis of openness, OLMo represents something entirely different. OLMo is built for scientists to be able to develop research directions at every point in the development process and execute on them, which was previously not available due to incomplete information and tools.”

Jonathan Frankle, chief scientist at MosaicML and Databricks, called AI2’s OLMo release “a giant leap for open science,” while Hugging Face’s CTO posted on X that the model/framework is “pushing the envelope of open source AI.”

And Meta chief scientist Yann LeCun contributed a quote to AI2’s press release: “Open foundation models have been critical in driving a burst of innovation and development around generative AI,” he said. “The vibrant community that comes from open source is the fastest and most effective way to build the future of AI.”
