Four-part series: the problem and the product, how it works, the API, and scope and running it.
CloudBroker: An API That Knows the Price of Every VM Across Seven Clouds
The problem and the product: one place to ask "what's cheapest?"
You need a 4 vCPU / 8 GB machine. AWS shows one price, Hetzner another, Scaleway another — different sites, currencies, terms. Doing that once is tedious; doing it automatically at scale is impossible without a single, normalized source. CloudBroker is that source: ingest from seven providers, one schema, one request with constraints, one ranked answer. No provisioning, no Kubernetes — just "here are my constraints; what do you recommend?"
The price-comparison problem
Multi-cloud is real; comparison is manual. Each provider has a different API, different currencies, different update cycles. Cost control and vendor flexibility depend on knowing the options. Checking each provider's console every time you need a VM doesn't scale. Scripts that scrape pricing pages break when the pages change. What teams need is one place that already has the numbers and can answer: "What's the cheapest VM that fits these specs, in this region, under this price?"
One API to rule them all
CloudBroker aggregates AWS, GCP, Azure, Hetzner, Scaleway, DigitalOcean, and OVH. One database, one request/response. You send constraints — min vCPU, min RAM, architecture, region (e.g. EU only), max price per hour in EUR, allowed providers — and get back a ranked list of instance types and regions. It does not create or delete VMs. It only answers "what do you recommend?"
Why separate this from "the thing that provisions"?
Pricing and scoring can evolve without touching a provisioner. The API is reusable: called by a Kubernetes controller (e.g. Cloudburst Autoscaler), a script, or a dashboard. Consumers can be anything that needs "cheapest VM for these constraints." Keeping the recommendation service separate means one team can own pricing logic and many systems can consume it.
The next piece explains how CloudBroker gets its data: ingestion and the data model.
Where CloudBroker Gets Its Numbers: Ingestion Across Seven Clouds
How it works: from seven APIs to one database.
CloudBroker can't recommend what it doesn't know. Its database is filled by ingestion: connectors that call each provider's pricing or catalog API, pull instance types and hourly rates, and upsert into PostgreSQL. Each provider is different; the connectors do the dirty work. Result: one model — providers, regions, instance types, prices — comparable in EUR.
The hierarchy
Four entities form the core:
- Provider — slug (e.g. aws, hetzner, scaleway) and type (hyperscaler, EU, regional).
- Region — data center location with country code and EU flag (used for region constraints like "EU only").
- InstanceType — name, vCPU, RAM, architecture, family. Unique per provider.
- Price — hourly price for an instance type in a region. Original currency and EUR-normalized. Append-only; the engine uses the latest price per (instance_type, region).
- FxRate — lookup for USD→EUR (and others) so recommendations are apples-to-apples. A stub rate is seeded if missing.
Provider --> Region
   |
   +--> InstanceType --> Price <-- Region
                           |
                        FxRate (lookup)
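One way to picture the hierarchy is as plain records. This is an illustrative sketch only: the field names are assumptions based on the descriptions above, not CloudBroker's actual ORM models.

```python
from dataclasses import dataclass

# Illustrative sketch of the core entities. Field names are assumptions
# drawn from the prose above, not CloudBroker's real schema.

@dataclass(frozen=True)
class Provider:
    slug: str           # e.g. "hetzner"
    type: str           # "hyperscaler", "EU", or "regional"

@dataclass(frozen=True)
class Region:
    provider_slug: str
    slug: str           # e.g. "fsn1"
    country_code: str
    is_eu: bool         # used for region constraints like "EU only"

@dataclass(frozen=True)
class InstanceType:
    provider_slug: str
    name: str           # unique per provider
    vcpu: int
    ram_gb: float
    arch: str           # "x86_64" or "arm64"
    family: str

@dataclass(frozen=True)
class Price:
    instance_type_name: str
    region_slug: str
    amount: float       # in the original currency
    currency: str       # e.g. "USD"
    amount_eur: float   # normalized via FxRate
```

The key property is that Price carries both the original amount and the EUR-normalized one, so ranking never mixes currencies.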
Ingestion in practice
You run ingestion via the CLI. Each provider has a dedicated connector; the Makefile wraps them:
make ingest-hetzner # Hetzner Cloud API
make ingest-gcp # GCP Billing Catalog (requires ADC)
make ingest-aws # AWS EC2 Pricing API
make ingest-azure # Azure Retail Prices (public API)
make ingest-scaleway # Scaleway API
make ingest-digitalocean
make ingest-ovh
make ingest-all # All of the above
Ingestion is idempotent. Providers, regions, and instance types are upserted by unique keys. Prices are append-only — duplicates for the same (instance_type, region) within the same hour are skipped. You can rerun at any time; the recommendation engine always uses the last ingested snapshot. You control how often it runs (e.g. daily cron).
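The append-only dedup rule can be sketched in a few lines. This is an in-memory stand-in for the PostgreSQL uniqueness check, with illustrative names, not CloudBroker's actual ingestion code:

```python
from datetime import datetime

def hour_key(ts: datetime) -> tuple:
    """Truncate a timestamp to the hour, matching the dedup granularity."""
    return (ts.year, ts.month, ts.day, ts.hour)

def ingest_price(store: dict, instance_type: str, region: str,
                 eur_per_hour: float, ts: datetime) -> bool:
    """Append a price row; skip it if one exists for the same hour."""
    key = (instance_type, region, hour_key(ts))
    if key in store:
        return False            # duplicate within the same hour: skipped
    store[key] = eur_per_hour   # append-only: older hours are kept as history
    return True

store = {}
assert ingest_price(store, "cx23", "fsn1", 0.0048, datetime(2024, 5, 1, 12, 30))
# Rerunning in the same hour is a no-op — this is what makes reruns safe:
assert not ingest_price(store, "cx23", "fsn1", 0.0048, datetime(2024, 5, 1, 12, 45))
# A new hour appends a new row, preserving price history:
assert ingest_price(store, "cx23", "fsn1", 0.0049, datetime(2024, 5, 1, 13, 0))
```

Because only the latest row per (instance_type, region) feeds recommendations, the history accumulated this way is what the analytics endpoints can work from.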
What's in scope for ingestion
Seven providers; on-demand (and optionally spot) pricing. What's not in scope: no real-time tickers, no reserved-instance marketplaces in this baseline. The project focuses on comparable, hourly, normalized data so the recommendation API can rank options.
With this data in place, the next step is the interface: how a request becomes a ranked list — the recommendation API.
How CloudBroker Picks the Cheapest VM: Constraints, Scoring, and the Recommendation API
The interface: from "I need 2 vCPUs in Europe" to a ranked list.
Send a JSON body with constraints; get a list of recommendations, best first. In between: filter (drop what doesn't qualify), then score (price + fit, or multi-criteria). Cheaper and better-fit rank higher. Every recommendation can include an explain block so you see why it ranked where it did.
The request
Parameters you send to POST /api/recommendations:
- min_vcpu, min_ram_gb — minimum specs. Only instance types that meet or exceed these are considered.
- arch — x86_64 or arm64.
- region_constraint — e.g. "EU" to restrict to EU regions (uses the is_eu flag on regions).
- max_price_eur_per_hour — ceiling in EUR. No candidate above this is returned.
- allowed_providers — list of provider slugs (e.g. ["gcp", "hetzner", "scaleway"]). Only these providers are considered.
Optional: preferred_providers (score boost), purchase_model (on_demand, spot), limit (how many results).
Example:
{
"min_vcpu": 2,
"min_ram_gb": 4,
"arch": "x86_64",
"region_constraint": "EU",
"max_price_eur_per_hour": 0.50,
"allowed_providers": ["gcp", "hetzner", "scaleway"]
}
The pipeline: filter, then score
Filter: Drop candidates that don't match specs (vcpu >= min, ram_gb >= min, arch match), price (<= max_price_eur_per_hour), region (e.g. is_eu when region_constraint is EU), and allowed providers. Only the latest price per (instance_type, region) is used.
Score: Two modes. Legacy mode (default): weighted blend of normalized price (cheaper = higher) and resource fit (closer to what you asked for = higher). Multi-criteria mode: adds performance and reliability dimensions; hyperscalers get a higher reliability score than regional providers. After the base score, adjustments: preferred providers get a boost; USD-only prices can get a small penalty; spot gets a bonus and an interruption-risk penalty.
Results are sorted by score descending. Top of the list = recommendation.
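The filter-then-score pass can be sketched as follows. The 0.5/0.5 weights match the explain block in the response example below, but the normalization and fit formulas here are assumptions for illustration, not CloudBroker's actual scoring code:

```python
# Illustrative filter + legacy-mode scoring. Candidate dicts use the same
# field names as the API response; formulas are assumptions, not the
# project's real implementation.

def passes_filter(c: dict, req: dict) -> bool:
    """Drop candidates that miss specs, price ceiling, region, or provider."""
    return (c["vcpu"] >= req["min_vcpu"]
            and c["ram_gb"] >= req["min_ram_gb"]
            and c["arch"] == req["arch"]
            and c["price_eur_per_hour"] <= req["max_price_eur_per_hour"]
            and (req.get("region_constraint") != "EU" or c["region_is_eu"])
            and c["provider_slug"] in req["allowed_providers"])

def score(c: dict, pool: list, req: dict,
          price_weight: float = 0.5, fit_weight: float = 0.5) -> float:
    prices = [x["price_eur_per_hour"] for x in pool]
    lo, hi = min(prices), max(prices)
    # Cheaper => higher normalized price (1.0 for the cheapest candidate).
    norm_price = 1.0 if hi == lo else (hi - c["price_eur_per_hour"]) / (hi - lo)
    # Exact spec match => fit 1.0; oversized machines score lower.
    fit = (req["min_vcpu"] / c["vcpu"] + req["min_ram_gb"] / c["ram_gb"]) / 2
    return price_weight * norm_price + fit_weight * fit

def recommend(candidates: list, req: dict, limit: int = 3) -> list:
    pool = [c for c in candidates if passes_filter(c, req)]
    return sorted(pool, key=lambda c: score(c, pool, req), reverse=True)[:limit]
```

Under this sketch a cheap, exact-fit instance beats a pricier, oversized one, which is the behavior the legacy mode describes.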
The response
Each item includes provider_slug, region_slug, instance_type_name, price_eur_per_hour, score, and optionally explain (resource_fit, normalized_price, weights, region_is_eu, etc.).
{
"instance_type_name": "cx23",
"provider_slug": "hetzner",
"region_slug": "fsn1",
"vcpu": 2,
"ram_gb": 4.0,
"price_eur_per_hour": 0.0048,
"score": 0.9898,
"explain": {
"resource_fit": 1.0,
"normalized_price": 0.9797,
"price_weight": 0.5,
"fit_weight": 0.5,
"region_is_eu": true
}
}
What CloudBroker doesn't do
No Kubernetes. No VM lifecycle. No notion of "who's calling" — it's a generic API. You (or another service) take the recommendation and provision elsewhere. CloudBroker is the price brain; something like Cloudburst Autoscaler is the provisioner.
So we have an API that, given constraints, returns the best option. The next piece is scope, limits, and how to run it.
CloudBroker in Practice: Scope, Stack, and Getting It Running
Scope, limits, and how to run it.
We've covered the problem, the data, and the interface. This piece ties it up: scope (seven providers, ingestion, recommendation API, optional analytics), what CloudBroker does not do (no provisioning, no Kubernetes, no real-time spot streaming), and how to get from clone to first recommendation.
Scope recap
CloudBroker ingests instance types and hourly prices from seven providers into one PostgreSQL database, normalized to EUR. It exposes a recommendation API: send constraints, get a ranked list. Optional price analytics endpoints (changes, trends) operate on the stored history. The API uses the last ingested snapshot — you control how often ingestion runs.
Out of scope (by design)
- No VM lifecycle — CloudBroker does not create or delete VMs.
- No cluster awareness — it doesn't know about Kubernetes or pods.
- Self-hosted — there is no SaaS offering in this project; you run it, you own the data and credentials.
- No non-compute products — focus is compute instance pricing for recommendation.
How to run it
- Clone the repo.
- Copy .env.example to .env and set any required vars (e.g. DB URL, optional API key).
- Start the stack: make up (API + PostgreSQL).
- Migrate and seed: make migrate, make seed.
- Ingest for the providers you care about: make ingest-hetzner, make ingest-gcp, or make ingest-all.
The API is available at http://localhost:8000; interactive docs at /docs. Any client can call POST /api/recommendations. For a Kubernetes controller that uses this API to provision burst nodes, see Cloudburst Autoscaler (separate project and docs).
Quick start:
git clone https://github.com/braghettos/cloudbroker
cd cloudbroker
cp .env.example .env
make up
make migrate
make seed
make ingest-all
# API at http://localhost:8000, Swagger at http://localhost:8000/docs
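Once the stack is up, any HTTP client can ask for a recommendation. A minimal Python sketch, assuming the default http://localhost:8000, no API key, and that the endpoint returns a JSON array ranked best-first (all assumptions to adjust for your setup):

```python
import json
import urllib.request

def build_request(constraints: dict,
                  base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST to /api/recommendations with a JSON constraints body."""
    return urllib.request.Request(
        url=f"{base_url}/api/recommendations",
        data=json.dumps(constraints).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def top_recommendation(constraints: dict) -> dict:
    """Return the best-ranked option, or an empty dict if nothing qualifies."""
    with urllib.request.urlopen(build_request(constraints)) as resp:
        results = json.load(resp)
    return results[0] if results else {}

# Example usage (requires a running CloudBroker instance):
# best = top_recommendation({
#     "min_vcpu": 2, "min_ram_gb": 4, "arch": "x86_64",
#     "region_constraint": "EU", "max_price_eur_per_hour": 0.50,
#     "allowed_providers": ["gcp", "hetzner", "scaleway"],
# })
# print(best["provider_slug"], best["instance_type_name"], best["price_eur_per_hour"])
```

The same request shape works from a cron script, a dashboard backend, or a controller like Cloudburst Autoscaler.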
Closing
CloudBroker is the place you ask "what's cheapest?" What you do with the answer — script, dashboard, or autoscaler — is up to you.