AI Models Are Undertrained by 100-1000 Times – AI Will Be Better With More Training Resources

by
Brian Wang
from NextBigFuture.com on (#6NPQ5)
The Chinchilla compute-optimal point for an 8B (8 billion parameter) model would be to train it on ~200B (billion) tokens (if you were only interested in getting the most "bang for the buck" w.r.t. model performance at that size). So this is training ~75X beyond that point, which is unusual, but personally, [Karpathy] thinks this is extremely welcome. ...
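For readers who want the arithmetic behind those figures, here is a minimal sketch. It assumes the Chinchilla rule of thumb that compute-optimal training uses roughly 20-25 tokens per parameter; the 25 tokens/parameter ratio is inferred from the article's ~200B figure for an 8B model, and the helper function name is illustrative, not from the article.

```python
def chinchilla_optimal_tokens(params: float, tokens_per_param: float = 25.0) -> float:
    """Approximate compute-optimal training token count for a given parameter count.

    The 25 tokens/parameter default is an assumption inferred from the
    article's ~200B-token figure for an 8B model; the Chinchilla paper's
    rule of thumb is roughly 20 tokens per parameter.
    """
    return tokens_per_param * params

n_params = 8e9                                  # the article's 8B-parameter model
d_opt = chinchilla_optimal_tokens(n_params)     # ~200B tokens, matching the article
d_actual = 75 * d_opt                           # training ~75X beyond the optimal point

print(f"Chinchilla-optimal tokens: {d_opt / 1e9:.0f}B")    # -> 200B
print(f"Tokens actually trained:   {d_actual / 1e12:.0f}T "
      f"({d_actual / d_opt:.0f}x the optimal point)")      # -> 15T (75x)
```

Multiplying the ~200B-token optimal point by 75 gives roughly 15T training tokens, which is the sense in which the headline calls current models undertrained by orders of magnitude relative to the data they could usefully absorb.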
