Nvidia’s NVLM Is a Game-Changing Open-Source AI Model

Krishi Chowdhary

from Techreport on 2024-10-04 10:28 (#6R7PA)

Nvidia is venturing into OpenAI, Meta, and Google's territory with its multimodal large language model NVLM 1.0 and its flagship version NVLM-D-72B.
As the name suggests, this model has 72 billion parameters.
It's proficient in processing complex information, but the key takeaway is that it is completely open source.


Nvidia's latest open-source artificial intelligence model can give current industry leaders, such as OpenAI (which recently released a new AI model, OpenAI o1), Meta, and Google, a serious run for their money.

After years of making best-in-class chips for AI models, Nvidia has released NVLM 1.0, a new open-source family of multimodal large language models, along with its flagship model, NVLM-D-72B (featuring 72 billion parameters).

Making the announcement, the researchers wrote in a paper: "We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models."

About the Model

The NVLM-D-72B model is setting new benchmarks in the industry. It's extremely proficient in processing complex visual and textual inputs. Examples available online show it solving mathematical problems, interpreting charts, and analyzing images (even memes).

Unlike many similar models, which lose text performance after multimodal training, Nvidia's model improves on text-only tasks by an average of 4.3 points across various text benchmarks.

The Impact of an Open-Source AI Model

Almost all popular AI models are closed, but Nvidia's model weights are publicly available, with the training code set to be released soon. As expected, the AI community has reacted positively to Nvidia's decision to make this model open source.

For example, a social media user posted on X: "Wow! Nvidia just published a 72B model which is ~on par with llama 3.1 405B in math and coding evals and also has vision"

Nvidia's decision to make it open source will give researchers and developers more access to its technology. This in turn will boost the overall AI industry and might even pressure other companies into introducing open-source models.

Another reason why Nvidia's latest AI model is so revolutionary is that it uses a hybrid approach combining different multimodal processing techniques. Simply put, it has the potential to change the direction of future research.

However, with great power comes great responsibility, especially at a time when senior figures at a giant company are resigning, citing safety concerns associated with AI development. Yep, I'm talking about OpenAI.

In the few years since AI boomed, we've already had a glimpse of how dangerous it can be in the wrong hands. This raises the question: is it wise to make such potent technology widely accessible?

Look, I'm not saying that research on AI needs to be stopped. Open-source tools are undoubtedly a blessing for researchers and developers, allowing them to make valuable contributions despite the dominance of the top players.

However, the most important thing is balance. AI can really transform lives for the better, but it can also ruin them - think deepfakes, privacy violations, job losses, and more.

Therefore, it's important to ensure that we have enough regulation, protection, and safety nets in place before making a tool like this public.

As for NVLM 1.0, the coming months will tell us more about whether and how it will change the course of AI development. Stay tuned to find out!

The post Nvidia's NVLM Is a Game-Changing Open-Source AI Model appeared first on The Tech Report.

Source	RSS or Atom Feed
Feed Location	https://techreport.com/feed/
Feed Title	Techreport
Feed Link	https://techreport.com/