A recent breakthrough in Artificial Intelligence is the introduction of Large Language Models (LLMs). These models process language more effectively than earlier approaches and so advance Natural Language Processing (NLP) and Natural Language Understanding (NLU). They perform well on a wide range of tasks, including text summarization, question answering, content generation, and language translation. They can follow complex textual prompts, including ones that involve reasoning and logic, and identify patterns and relationships in the data.
Although language models have improved significantly and shown competence across a variety of tasks, they still struggle to use tools efficiently through API calls. Even prominent LLMs like GPT-4 have difficulty generating precise input arguments and frequently recommend inappropriate API calls. To address this issue, researchers from Berkeley and Microsoft Research have proposed Gorilla, a finetuned LLaMA-based model that beats GPT-4 at producing API calls. Gorilla helps select the appropriate API, improving LLMs’ ability to work with external tools to carry out specific tasks.
The researchers have also created APIBench, a dataset made up of a sizable corpus of APIs with overlapping functionality. It was built by collecting the ML APIs published on public model hubs: TorchHub, TensorHub, and HuggingFace. Every API call from TorchHub and TensorHub is included, while for HuggingFace the top 20 models in each task category are chosen. Additionally, ten synthetic user query prompts are generated for each API using the self-instruct method.
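The per-category selection described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' pipeline: the record fields, the model names, and the choice to rank by download count are all assumptions made for the example.

```python
from collections import defaultdict


def top_models_per_task(models, k=20):
    """Keep the k highest-ranked models within each task category."""
    by_task = defaultdict(list)
    for model in models:
        by_task[model["task"]].append(model)
    # Ranking by download count is an assumption made for this sketch.
    return {
        task: sorted(group, key=lambda m: m["downloads"], reverse=True)[:k]
        for task, group in by_task.items()
    }


# Hypothetical hub records; names and download counts are made up.
hub_models = [
    {"name": "model-a", "task": "image-classification", "downloads": 900},
    {"name": "model-b", "task": "image-classification", "downloads": 1500},
    {"name": "model-c", "task": "image-classification", "downloads": 300},
    {"name": "model-d", "task": "translation", "downloads": 700},
]

# k=2 keeps the toy example small; the article describes k=20.
selected = top_models_per_task(hub_models, k=2)
for task, group in selected.items():
    print(task, [m["name"] for m in group])
```

Capping each category keeps a large hub from being dominated by a handful of popular task types.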
The researchers finetuned Gorilla using the APIBench dataset together with document retrieval. The 7-billion-parameter Gorilla model outperforms GPT-4 in the correctness of the API calls it generates and reduces hallucination errors. The document retriever’s effective integration with Gorilla demonstrates the potential for LLMs to use tools more precisely. Gorilla’s improved API call generation and its ability to adapt to changes in documentation improve the applicability and reliability of the model’s output. This development is important because it allows LLMs to keep up with regularly updated documentation, giving users more accurate and current information.
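To make the retrieval idea concrete, here is a minimal Python sketch of pairing a user request with the most relevant API documentation before it reaches the model. The token-overlap retriever, the documentation records, the API call strings, and the prompt wording are all assumptions chosen for illustration; they are not presented as the retriever or prompt format Gorilla actually uses.

```python
import json
import re


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs):
    """Return the doc whose description shares the most tokens with the query."""
    query_tokens = tokenize(query)
    return max(docs, key=lambda d: len(query_tokens & tokenize(d["description"])))


def build_prompt(query, docs):
    """Append the best-matching documentation entry to the user query."""
    doc = retrieve(query, docs)
    # The instruction wording here is an assumed format for this sketch.
    return f"{query}\nUse this API documentation for reference: {json.dumps(doc)}"


# Hypothetical documentation entries; the api_call values are made up.
api_docs = [
    {"api_call": "hub.load('example/image-classifier')",
     "description": "Classify images into categories"},
    {"api_call": "hub.load('example/speech-to-text')",
     "description": "Transcribe speech audio into text"},
]

user_query = "I want to transcribe audio recordings of speech into text"
print(build_prompt(user_query, api_docs))
```

Because the documentation is looked up at request time, updating an entry in `api_docs` changes what the model sees without retraining it, which is the property the paragraph above highlights.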
One example shared by the researchers shows how Gorilla correctly recognizes a task and returns a fully qualified API call. In the same example, GPT-4 produced API requests for hypothetical models, which demonstrates a lack of comprehension of the task, and Claude chose the wrong library, showing an inability to identify the right resources. Gorilla, in contrast, correctly recognized the task. Its API calls are accurate, which sets it apart from GPT-4 and Claude and demonstrates both its improved performance and its task comprehension.
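The two failure modes in this example, calling a model that does not exist and calling a real but unsuitable API, can be told apart with a simple check against a catalogue of known APIs. The Python sketch below is a simplification under stated assumptions: it compares calls by exact string match, and the catalogue entries and label names are hypothetical rather than taken from the researchers' evaluation.

```python
def classify_call(generated, expected, known_apis):
    """Label a generated API call against the expected call and a catalogue."""
    if generated == expected:
        return "correct"
    if generated not in known_apis:
        # The call refers to an API that is not in the catalogue at all.
        return "hallucination"
    # The API exists, but it is not the right one for this task.
    return "wrong_api"


# Hypothetical catalogue of real API calls; the names are made up.
known_apis = {
    "hub.load('example/speech-to-text')",
    "hub.load('example/image-classifier')",
}
expected_call = "hub.load('example/speech-to-text')"

for call in [
    "hub.load('example/speech-to-text')",
    "hub.load('example/imaginary-model')",
    "hub.load('example/image-classifier')",
]:
    print(call, "->", classify_call(call, expected_call, known_apis))
```

Separating the two labels matters because they point to different problems: an invented model suggests the system is guessing, while a wrong but real API suggests it misread the task.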
In conclusion, Gorilla is a notable addition to the list of language models because it addresses the problem of writing API calls. Its capabilities help reduce problems related to hallucination and reliability.
Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.