TinyLlama is a project that aims to pretrain a 1.1B-parameter Llama model on 3 trillion tokens. The project is optimized to achieve this within a span of 90 days using only 16 A100-40G GPUs. TinyLlama is an independent effort, distinct from LLaMA, Meta AI's series of large language models (LLaMA and Llama 2); Llama 2 is openly released and licensed for commercial use.
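As a rough back-of-envelope check of what that schedule implies, the sketch below (plain Python, using the stated round figures and ignoring warmup, restarts, and evaluation pauses) estimates the sustained token throughput the cluster would need to finish 3 trillion tokens in 90 days on 16 GPUs.

```python
# Back-of-envelope: sustained throughput needed for the stated budget.
# Round numbers assumed; real training loses time to warmup, restarts, evals.
total_tokens = 3e12          # 3 trillion training tokens
days = 90
num_gpus = 16                # A100-40G cards

seconds = days * 24 * 3600
tokens_per_second_total = total_tokens / seconds
tokens_per_second_per_gpu = tokens_per_second_total / num_gpus

print(f"cluster-wide: {tokens_per_second_total:,.0f} tokens/s")
print(f"per GPU:      {tokens_per_second_per_gpu:,.0f} tokens/s")
# Roughly ~386k tokens/s across the cluster, or ~24k tokens/s per A100.
```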
GitHub Repo: TinyLlama
TinyLlama plans to pretrain a relatively small 1.1 billion parameter Llama model on a massive 3 trillion tokens, far beyond the token budget suggested by the commonly cited Chinchilla scaling laws. This sparks an interesting debate over whether such a small model can keep benefiting from that much data, since OpenAI's early scaling work suggested small models tend to saturate or overfit; others argue the sheer size and diversity of the dataset could overcome this.
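To put the Chinchilla comparison into numbers, a minimal sketch follows, assuming the paper's common rule of thumb of roughly 20 training tokens per parameter (the exact "compute-optimal" ratio varies with the compute budget).

```python
# Chinchilla rule of thumb: ~20 training tokens per model parameter.
params = 1.1e9               # TinyLlama parameter count
planned_tokens = 3e12        # planned training budget
chinchilla_tokens = 20 * params

print(f"Chinchilla-optimal budget: ~{chinchilla_tokens / 1e9:.0f}B tokens")
print(f"Planned budget:            {planned_tokens / 1e12:.0f}T tokens")
print(f"Overshoot factor:          ~{planned_tokens / chinchilla_tokens:.0f}x")
# ~22B tokens would be "compute-optimal" for 1.1B params; 3T is ~136x that.
```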
There is also discussion of why Meta stopped training the much larger Llama 2 models while the loss was still decreasing, and whether a "Llama 2.1" continuation would be feasible. Compared to the 7B-parameter Llama 2, TinyLlama's smaller size may make it easier to deploy in production and enable new applications, though consumer GPUs can already run the 7B model reasonably well.
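The deployment point is mostly about memory. A rough sketch of the weight footprint at fp16 precision (an assumption; it ignores KV cache, activations, and runtime overhead) illustrates the gap:

```python
# Rough fp16 weight memory for inference, ignoring KV cache and runtime overhead.
BYTES_PER_PARAM_FP16 = 2

for name, params in [("TinyLlama 1.1B", 1.1e9), ("Llama 2 7B", 7e9)]:
    gib = params * BYTES_PER_PARAM_FP16 / 2**30
    print(f"{name}: ~{gib:.1f} GiB of weights")
# ~2.0 GiB vs ~13.0 GiB: the smaller model fits comfortably on modest hardware.
```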
Questions arise around how the project is funded, likely as a self-funded side project of well-paid engineers or through academic grants. Overall, TinyLlama is an ambitious project that should yield valuable lessons for the AI community on training small versus large models and on model optimization, which could inform future research directions for models like ChatGPT and Claude. It may also show that the Chinchilla scaling laws are guidelines rather than rigid physical laws.
In summary, TinyLlama is an interesting project to watch as it explores how effective small models can be when trained on very large datasets.