Data Gravity Calculator

Determine if your data is too heavy to move to cheaper compute providers or if you should burst to the cloud.

Dataset size: 50 TB (range: 1 TB to 500 TB)
Monthly compute cost: $10,000 (range: $1k to $50k)

Gravity Score & Verdict

One-Time Egress Cost: $4,500
Gravity Score: 45.0%
Verdict:
Architectural Decision Space
Architect Tip

The cost of moving the data is significant but not prohibitive. Check whether the compute savings at the cheaper provider would recover this one-time egress cost within 3 months.
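The three-month check in the tip can be sketched as a simple payback calculation. This is a minimal illustration, not the calculator's own code. The function name `payback_months` and the $8,000 target monthly compute cost are hypothetical. The $4,500 egress cost is the figure shown above, and the $10,000 figure is taken from the calculator's compute cost input.

```python
import math


def payback_months(one_time_egress_cost: float,
                   current_monthly_compute: float,
                   target_monthly_compute: float) -> float:
    """Months of compute savings needed to recover a one-time egress cost."""
    monthly_savings = current_monthly_compute - target_monthly_compute
    if monthly_savings <= 0:
        # No savings at the target provider: the move never pays for itself.
        return math.inf
    return one_time_egress_cost / monthly_savings


# $4,500 egress from the example above; $8,000 target cost is a made-up value.
months = payback_months(4_500, 10_000, 8_000)
print(f"Payback period: {months:.2f} months")
print("Within 3-month window:", months <= 3)
```

With these assumed inputs the move pays for itself in 2.25 months, inside the 3-month window. A smaller monthly saving pushes the payback period out proportionally.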

How the Math Works

Most AI infrastructure decisions get made on hourly GPU rates. That's often the wrong input variable. Where your data lives determines what your AI costs.

The Data Gravity Score Formula

Gravity Score (G) = (Dataset Size in GB × Egress Rate in $ per GB) ÷ Monthly Compute Cost in $, shown as a percentage
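As a rough sketch, the formula can be written out in a few lines of Python. The function names are illustrative. The egress rate of $0.09 per GB and the conversion of 1 TB = 1,000 GB are assumptions, chosen because they reproduce the example figures shown earlier ($4,500 egress, 45.0% score) for a 50 TB dataset and $10,000 of monthly compute. They are not stated by the tool itself.

```python
def egress_cost(dataset_tb: float, egress_rate_per_gb: float,
                gb_per_tb: int = 1000) -> float:
    """One-time cost in dollars to move the whole dataset out of the provider."""
    return dataset_tb * gb_per_tb * egress_rate_per_gb


def gravity_score(dataset_tb: float, egress_rate_per_gb: float,
                  monthly_compute_cost: float) -> float:
    """Egress cost as a fraction of monthly compute spend (0.45 means 45%)."""
    if monthly_compute_cost <= 0:
        raise ValueError("monthly_compute_cost must be positive")
    return egress_cost(dataset_tb, egress_rate_per_gb) / monthly_compute_cost


# Assumed inputs: 50 TB, $0.09/GB egress, $10,000/month compute.
cost = egress_cost(50, 0.09)
score = gravity_score(50, 0.09, 10_000)
print(f"One-time egress cost: ${cost:,.0f}")
print(f"Gravity score: {score:.1%}")
```

A higher score means the data is "heavier" relative to compute spend: the same 50 TB scores 450% against a $1,000 monthly compute bill and 9% against a $50,000 one.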

Interpreting the Score

Pricing Captured: April 2026.
Sources: Egress rates based on standard tiers from AWS, GCP, and Azure.
Disclaimer: Cloud providers frequently update egress policies and free tiers. Please double-check the latest rates on the provider's website before making final architectural decisions.

Frequently Asked Questions

What is 'Data Gravity'?

Data Gravity is the idea that data and applications are pulled toward large datasets, much like gravity pulls objects toward massive bodies. In cloud computing, this effect is magnified by egress fees (the cost charged by providers to move data out of their network).

How do egress fees affect AI training?

If you store 50 TB of training data in AWS S3 but want to use cheaper GPUs on a specialized cloud like CoreWeave, you must pay AWS egress fees to transfer that data. This one-time cost can wipe out the savings from the cheaper GPUs if the workload does not run long enough to recoup it.

What is 'Full Repatriation'?

Full Repatriation means moving both your data and your compute workloads out of the public cloud and into a colocation facility or on-premises data center that you own or rent as dedicated capacity, eliminating egress fees entirely for steady-state workloads.

How can I reduce egress fees?

Use specialized cloud providers with zero or low egress fees, use dedicated links such as AWS Direct Connect or Google Cloud Interconnect for discounted rates, or use data compression and delta syncs to reduce the volume of data you move.
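To see how compression and delta syncs change the bill, here is a minimal sketch. The function name `transfer_cost`, the 2:1 compression ratio, the 10% changed fraction, and the $0.09 per GB rate are all hypothetical values for illustration. Real compression ratios depend heavily on the data (already-compressed images or video may barely shrink).

```python
def transfer_cost(dataset_gb: float, egress_rate_per_gb: float,
                  compression_ratio: float = 1.0,
                  changed_fraction: float = 1.0) -> float:
    """Egress cost when only part of the data moves, after compression.

    compression_ratio: original size / compressed size (2.0 halves the bytes).
    changed_fraction: share of the dataset a delta sync actually transfers.
    """
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be >= 1.0")
    if not 0.0 <= changed_fraction <= 1.0:
        raise ValueError("changed_fraction must be between 0 and 1")
    gb_moved = dataset_gb * changed_fraction / compression_ratio
    return gb_moved * egress_rate_per_gb


# Assumed: 50,000 GB dataset at $0.09/GB.
full = transfer_cost(50_000, 0.09)
compressed = transfer_cost(50_000, 0.09, compression_ratio=2.0)
delta = transfer_cost(50_000, 0.09, compression_ratio=2.0, changed_fraction=0.1)
print(f"Full copy:          ${full:,.0f}")
print(f"Compressed 2:1:     ${compressed:,.0f}")
print(f"Compressed + delta: ${delta:,.0f}")
```

Under these assumptions the full copy costs $4,500, the compressed copy $2,250, and a compressed delta sync of 10% of the data $225. The first transfer to a new provider is always a full copy, so delta syncs only help with subsequent refreshes.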

What is cloud lock-in in AI?

Cloud lock-in occurs when the cost and complexity of moving data out of a cloud provider (due to egress fees and proprietary services) make it impractical to switch to a competitor.

Should I store data in specialized clouds?

Specialized clouds (like CoreWeave or Lambda) often have much lower egress fees and cheaper GPU rates, but might lack the rich ecosystem of storage and data processing tools found in AWS or GCP.

How does Direct Connect help?

Direct Connect and similar dedicated interconnect services provide a dedicated physical network connection to the cloud, often offering lower latency and significantly discounted egress rates compared to the public internet.

What is the impact on multi-region setups?

Data gravity applies between regions too. Moving large datasets between regions within the same cloud still incurs inter-region data transfer costs, which can accumulate quickly.

How often should I re-evaluate?

Regularly, especially when dataset size grows significantly or compute needs change (e.g., shifting from active training to steady-state inference).

Does regulation affect data placement?

Yes. Data sovereignty and data protection laws (like GDPR) may require data to remain within specific geographic boundaries, which can override pure cost considerations.