Granite Experiments Collection Experimental projects under consideration for the Granite family. • 13 items • Updated 2 days ago • 19
Saving Memory Using Padding-Free Transformer Layers during Finetuning Article • mayank-mishra • Jun 11, 2024 • 21
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Paper • 2404.05567 • Published Apr 8, 2024 • 10
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler Paper • 2408.13359 • Published Aug 23, 2024 • 23
Granite Code Models: A Family of Open Foundation Models for Code Intelligence Paper • 2405.04324 • Published May 7, 2024 • 26