Data Gravity Calculator
Determine whether your data is too heavy to move to a cheaper compute provider, or whether you should burst to the cloud.
Gravity Score & Verdict
The cost of moving data is significant but not prohibitive. Evaluate if compute savings exceed this cost within 3 months.
How the Math Works
Most AI infrastructure decisions get made on hourly GPU rates. That's often the wrong input variable. Where your data lives determines what your AI costs.
The Data Gravity Score Formula
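Read against the thresholds under "Interpreting the Score" (a score above 0.5 means egress exceeds 50% of monthly compute cost), the score appears to be the ratio of the one-time egress cost to the monthly compute cost. A sketch under that assumption follows; the symbols G, D, and r are illustrative names, not taken from the calculator.

```latex
\[
G = \frac{C_{\text{egress}}}{C_{\text{compute}}},
\qquad
C_{\text{egress}} = D \cdot r
\]
% G          : data gravity score (dimensionless)
% C_egress   : one-time cost of moving the dataset out (USD)
% C_compute  : monthly compute cost (USD per month)
% D          : dataset size (GB), r : egress rate (USD per GB); both assumed inputs
```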
Interpreting the Score
- Score > 0.5 (High Gravity): Egress exceeds 50% of your monthly compute cost. The data is too heavy to move economically. Verdict: Stay Put or Full Repatriation.
- Score < 0.1 (Low Gravity): Data is effectively weightless. Egress is a minor factor. Verdict: Hybrid Burst (Cheapest compute wins).
- Between 0.1 and 0.5: The architectural decision space where provider selection and migration planning actually matter.
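The three bands above can be sketched as a small classifier. This is a minimal illustration, not the calculator's actual code: it assumes the score is egress cost divided by monthly compute cost, and the function names and the "Middle" label are hypothetical. Scores of exactly 0.1 or 0.5 land in the middle band, because the thresholds above are strict inequalities.

```python
def gravity_score(egress_cost_usd: float, monthly_compute_usd: float) -> float:
    """Assumed definition: one-time egress cost / monthly compute cost."""
    if monthly_compute_usd <= 0:
        raise ValueError("monthly compute cost must be positive")
    return egress_cost_usd / monthly_compute_usd


def verdict(score: float) -> str:
    """Map a score to the bands described in 'Interpreting the Score'."""
    if score > 0.5:
        return "High Gravity: Stay Put or Full Repatriation"
    if score < 0.1:
        return "Low Gravity: Hybrid Burst (cheapest compute wins)"
    # 0.1 <= score <= 0.5: the band where provider selection and
    # migration planning actually matter (label is hypothetical).
    return "Middle: evaluate provider selection and migration planning"


if __name__ == "__main__":
    # Illustrative numbers only: $6,000 egress against $10,000/month compute.
    print(verdict(gravity_score(6_000, 10_000)))
```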
Pricing Captured: April 2026.
Sources: Egress rates based on standard tiers from AWS, GCP, and Azure.
Disclaimer: Cloud providers frequently update egress policies and free tiers. Please double-check the latest rates on the provider's website before making final architectural decisions.
Frequently Asked Questions
What is 'Data Gravity'?
Data Gravity is the idea that data and applications are pulled toward large datasets, much like gravity pulls objects toward massive bodies. In cloud computing, this effect is magnified by egress fees (the cost charged by providers to move data out of their network).
How do egress fees affect AI training?
If you store 50TB of training data in AWS S3 but want to use cheaper GPUs on a specialized cloud like CoreWeave, you must pay AWS egress fees to transfer that data. This one-time cost can often wipe out the savings you gained from the cheaper GPUs.
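To make the 50TB example concrete, here is a rough break-even sketch. The flat $0.09/GB rate and the $2,000/month GPU savings are assumptions chosen for illustration. Real egress pricing is tiered and changes often, so substitute current figures from your provider.

```python
def egress_cost_usd(dataset_tb: float, rate_per_gb_usd: float) -> float:
    """One-time egress cost, using decimal terabytes (1 TB = 1,000 GB)."""
    return dataset_tb * 1000 * rate_per_gb_usd


def breakeven_months(egress_usd: float, monthly_savings_usd: float) -> float:
    """Months of compute savings needed to pay back the egress cost."""
    if monthly_savings_usd <= 0:
        return float("inf")  # no savings means the move never pays back
    return egress_usd / monthly_savings_usd


if __name__ == "__main__":
    ASSUMED_RATE = 0.09        # USD per GB, illustrative flat rate
    ASSUMED_SAVINGS = 2_000.0  # USD per month saved on cheaper GPUs, illustrative

    cost = egress_cost_usd(50, ASSUMED_RATE)  # roughly $4,500
    months = breakeven_months(cost, ASSUMED_SAVINGS)
    print(f"egress: ${cost:,.0f}, break-even: {months:.2f} months")
```

Under these assumed numbers the move pays back in a little over two months. With smaller monthly savings, the same egress bill could take far longer to recover.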
What is 'Full Repatriation'?
Full Repatriation means moving both your data and your compute workloads out of the public cloud and into a colocation facility or on-premises data center that you own or lease as dedicated capacity, eliminating egress fees entirely for steady-state workloads.
How can I reduce egress fees?
Use specialized cloud providers with zero or low egress fees, use a dedicated connection such as AWS Direct Connect or Google Cloud Interconnect for discounted rates, or apply data compression and delta syncs to minimize the volume of data you move.
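As a rough illustration of how compression and delta syncs shrink the billable volume, the sketch below estimates the gigabytes actually transferred. The function name, the 2x compression ratio, and the 10% changed fraction are hypothetical; real ratios depend heavily on the data format.

```python
def effective_transfer_gb(
    dataset_gb: float,
    compression_ratio: float = 1.0,
    changed_fraction: float = 1.0,
) -> float:
    """Estimate GB sent over the wire.

    compression_ratio: original size / compressed size (1.0 = no compression).
    changed_fraction:  share of the dataset that a delta sync must resend
                       (1.0 = full copy).
    """
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be >= 1.0")
    if not 0.0 <= changed_fraction <= 1.0:
        raise ValueError("changed_fraction must be between 0 and 1")
    return dataset_gb * changed_fraction / compression_ratio


if __name__ == "__main__":
    # Illustrative: 50,000 GB dataset, 2x compression, 10% changed since last sync.
    print(effective_transfer_gb(50_000, compression_ratio=2.0, changed_fraction=0.1))
```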
What is cloud lock-in in AI?
Cloud lock-in occurs when the cost and complexity of moving data out of a cloud provider (due to egress fees and proprietary services) makes it impractical to switch to a competitor.
Should I store data in specialized clouds?
Specialized clouds (like CoreWeave or Lambda) often have much lower egress fees and cheaper GPU rates, but might lack the rich ecosystem of storage and data processing tools found in AWS or GCP.
How does Direct Connect help?
AWS Direct Connect and comparable services (Google Cloud Interconnect, Azure ExpressRoute) provide a dedicated physical network connection to the cloud, often with lower latency and significantly discounted egress rates compared to the public internet.
What is the impact on multi-region setups?
Data gravity applies between regions too. Moving large datasets between regions within the same cloud still incurs inter-region data transfer costs, which can accumulate quickly.
How often should I re-evaluate?
Regularly, especially when dataset size grows significantly or compute needs change (e.g., shifting from active training to steady-state inference).
Does regulation affect data placement?
Yes. Data sovereignty and data protection laws (such as GDPR) may require data to remain within specific geographic boundaries, which can override pure cost considerations.