Nvidia Is Piloting a Generative AI for Its Engineers
In a keynote address at the IEEE/ACM International Conference on Computer-Aided Design Monday, Nvidia chief technology officer Bill Dally revealed that the company has been testing a large-language-model AI to boost the productivity of its chip designers.
"Even if we made them 5 percent more productive, that's a huge win," Dally said in an interview ahead of the conference. Nvidia can't claim it's reached that goal yet. The system, called ChipNeMo, isn't ready for the kind of large, lengthy trial that would really prove its worth. But a cadre of volunteers at Nvidia is using it, and there are some positive indications, Dally said.
ChipNeMo is a specially tuned spin on a large language model. It starts as an LLM made up of 43 billion parameters that acquires its skills from one trillion tokens (fundamental language units) of data. "That's like giving it a liberal arts education," said Dally. "But if you want to send it to graduate school and have it become specialized, you fine-tune it on a particular corpus of data...in this case, chip design."
That took two more steps. First, that already-trained model was trained again on 24 billion tokens of specialized data. Twelve billion of those tokens came from design documents, bug reports, and other English-language internal data accumulated over Nvidia's 30 years of work designing chips. The other 12 billion tokens came from code, such as the hardware description language Verilog and scripts for carrying things out with industrial electronic design automation (EDA) tools. Finally, the resulting model was submitted to "supervised fine-tuning," training on 130,000 sample conversations and designs.
The result, ChipNeMo, was set three different tasks: as a chatbot, as an EDA-tool script writer, and as a summarizer of bug reports.
Acting as a chatbot for engineers could save designers time, said Dally. "Senior designers spend a lot of time answering questions for junior designers," he said. As a chatbot, the AI can save senior designers' time by answering questions that require experience, like what a strange signal might mean or how a specific test should be run.
Chatbots, however, are notorious for their willingness to lie when they don't know the answer and their tendency to hallucinate. So Nvidia developers integrated a function called retrieval-augmented generation into ChipNeMo to keep it on the level. That function forces the AI to retrieve documents from Nvidia's internal data to back up its suggestions.
"The addition of retrieval-augmented generation improves the accuracy quite a bit," said Dally. "More importantly, it reduces hallucination."
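To make the idea concrete, here is a minimal sketch of the retrieval step in retrieval-augmented generation. It is not ChipNeMo's implementation, which the article does not describe. The documents, the function names, and the word-overlap scoring are all assumptions chosen for illustration; a production system would more likely use learned embeddings and a vector index.

```python
import re
from collections import Counter

# Hypothetical stand-ins for internal documents; not real Nvidia data.
DOCS = {
    "timing-guide": "Setup timing violations often appear when a clock path is too slow.",
    "sim-howto": "Run the regression test suite with the nightly simulation script.",
    "power-notes": "Power gating reduces leakage in idle blocks.",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, doc):
    """Count the words shared by the query and the document (toy relevance score)."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(doc))
    return sum(min(q[w], d[w]) for w in q)

def retrieve(query, docs, k=2):
    """Return up to k (name, text) pairs with the highest nonzero overlap."""
    ranked = sorted(docs.items(), key=lambda item: score(query, item[1]), reverse=True)
    return [(name, text) for name, text in ranked[:k] if score(query, text) > 0]

def build_prompt(query, docs, k=2):
    """Prepend the retrieved sources so the model answers from them, not from memory."""
    hits = retrieve(query, docs, k)
    context = "\n".join(f"[{name}] {text}" for name, text in hits)
    return f"Answer using only the sources below.\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    print(build_prompt("How do I run the regression test?", DOCS))
```

The prompt returned by `build_prompt` would then be passed to the language model. Because the answer is tied to named sources, an engineer can check it against the cited document.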
In its second application, ChipNeMo helped engineers run tests on designs and parts of them. "We use many design tools," said Dally. "These tools are pretty complicated and typically involve many lines of scripting." ChipNeMo simplifies the designer's job by providing "a very natural human interface to what otherwise would be some very arcane commands."
ChipNeMo's final use case, analyzing and summarizing bug reports, "is probably the one where we see the prospects for the most productivity gain earliest," said Dally. When a test fails, he explained, it gets logged into Nvidia's internal bug-report system, and each report can include pages and pages of detailed data. Then an "ARB" (short for "action required by") is sent to a designer for a fix, and the clock starts ticking.
ChipNeMo summarizes the bug report's many pages into as little as a single paragraph, speeding decisions. It can even write that summary in two modes: one for the engineer and one for the manager.
Makers of chip-design tools, such as Synopsys and Cadence, have been diving into integration of AI into their systems. But according to Dally, they won't be able to achieve the same thing Nvidia is after.
"The thing that enables us to do this is 30 years of design documents and code in a database," he said. "ChipNeMo is learning from the entire experience of Nvidia." EDA companies just don't have that kind of data.