Bloomberg trained an LLM from scratch on AWS (64 × 8 A100 40GB for 53 days). They constructed a 363 billion token dataset based on Bloomberg’s extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general-purpose datasets.

BloombergGPT outperforms similarly-sized open models on financial NLP tasks by significant margins — without sacrificing performance on general LLM benchmarks