As the chatbot wars heat up, Facebook parent company Meta is introducing its own AI-powered large language model, which it says can outperform rival programs, including the older GPT-3 model from OpenAI. 

Meta’s large language model is called LLaMA, and it can generate human-like conversations by essentially autocompleting strings of text, much like other AI-powered chatbots. However, the company says it runs more efficiently than other large language models and has lower hardware requirements.

“LLaMA-13B outperforms GPT-3 on most benchmarks, despite being 10x smaller,” the company’s researchers wrote in a paper. (For perspective, GPT-3 originally debuted in 2020 before a newer version was used to power OpenAI’s ChatGPT program.)

Despite the touted improvements, Meta is only releasing LLaMA to the research community. The goal is to gather more input from experts at a time when it’s clear that AI-powered chatbots, such as ChatGPT, can generate factual errors and show biases in their answers.

“We believe that this model will help democratize the access and study of LLMs (large language models), since it can be run on a single GPU,” the company’s researchers added in a paper.

In the past, Meta has released its own chatbots, but they’ve failed to attract excitement like OpenAI’s ChatGPT, which may have recently seen as many as 100 million people using it. So it’s possible the company wants to take its time before releasing LLaMA to the public.

Meta also points out that access to large language models can be limited since they often require servers to run. “This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues, such as bias, toxicity, and the potential for generating misinformation,” the company wrote in a blog post.

Meta’s LLaMA model comes in four versions with 7 billion, 13 billion, 33 billion, or 65 billion parameters. That’s significantly smaller than OpenAI’s GPT-3, which has 175 billion parameters. However, Meta says LLaMA was able to outperform existing large language models because it was trained on more snippets of data, known as tokens.
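
For a rough sense of what those parameter counts mean in practice, the short Python sketch below compares each LLaMA size with GPT-3 and estimates the memory needed just to hold the weights. The parameter counts come from the figures above; the 2 bytes per parameter (16-bit weights) is an assumption for illustration, not something Meta states here, and the estimate ignores everything else a running model needs.

```python
# Parameter counts cited in the article, in billions.
LLAMA_PARAMS_B = {"LLaMA-7B": 7, "LLaMA-13B": 13, "LLaMA-33B": 33, "LLaMA-65B": 65}
GPT3_PARAMS_B = 175

# Assumption (not from the article): weights stored as 16-bit numbers.
BYTES_PER_PARAM = 2


def weight_memory_gb(params_billion, bytes_per_param=BYTES_PER_PARAM):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billion * bytes_per_param


def size_ratio(params_billion, reference_billion=GPT3_PARAMS_B):
    """How many times larger the reference model is, by parameter count."""
    return reference_billion / params_billion


if __name__ == "__main__":
    for name, params in LLAMA_PARAMS_B.items():
        print(
            f"{name}: ~{weight_memory_gb(params):.0f} GB of weights; "
            f"GPT-3 has ~{size_ratio(params):.1f}x as many parameters"
        )
```

By this estimate the 13-billion-parameter version needs roughly 26 GB for its weights under the 16-bit assumption, and GPT-3 has about 13 times as many parameters, which is in line with the researchers’ “10x smaller” description.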

“We trained LLaMA 65B and LLaMA 33B on 1.4 trillion tokens. Our smallest model, LLaMA 7B, is trained on one trillion tokens,” the company added. This included training the models on text from across the internet, along with books and Wikipedia articles. 

Researchers can apply for access to Meta’s LLaMA model through a company website.
