# NEBU
A toolkit for building Stellar indexers.
A stable Go contract, standalone processors, and Unix-composable pipelines. From on-chain truth to your applications in two minutes — no orchestrator, no YAML, no daemon.
Install with Go:

```sh
go install github.com/withObsrvr/nebu/cmd/nebu@latest
export PATH="$HOME/go/bin:$PATH"
```

or with the install script:

```sh
curl -sSL https://nebu.withobsrvr.com/install.sh | sh
```

or run the Docker image:

```sh
docker run --rm withobsrvr/nebu:latest \
  token-transfer --start-ledger 60200000 --end-ledger 60200001
```
## ABOUT
> "The GUI is a filter. The pipe is a lens."
nebu (pronounced "neh-boo") is built on the supported building blocks Stellar provides for modern indexing — RPC-backed ledger access, the ingest SDK, XDR-native extraction — and packages them into a stable Go contract, standalone processor binaries, and Unix-composable pipelines.
Named after the Nebuchadnezzar from The Matrix. nebu is the vessel that carries data from the on-chain truth to your applications.
* One repo, two products.
* A stable contract in pkg/processor.
* A CLI and reference processors for everyone else.
## ARCHITECTURE
Two ledger sources, one contract. Use --mode rpc for live tailing and recent-range backfills. Use --mode archive against a GCS or S3 ledger archive to read deep history without RPC rate limits — the same XDR, the same processors, the same JSON on the way out.
Everything downstream of [ ORIGIN ] communicates via Unix pipes (stdin/stdout, newline-delimited JSON) — trivial to compose with jq, duckdb, or anything that reads a stream.
## FEATURES
pkg/processor and pkg/source are committed-stable, enforced by CI against API snapshots in .api/. Build your own processor, ship it as its own binary.
No daemon. No orchestrator. Origin | Transform | Sink, composed with the same pipe you've used since 1973. Pipe to jq, to duckdb, to tee.
Fetch raw XDR directly from GCS or S3 at 100–500 ledgers/sec — bypassing RPC rate limits entirely. Backfill Stellar's full history into a data lakehouse without a single custom worker.
Every processor emits a JSON-schema-valid --describe-json manifest. Agents and tools can discover flags, inputs, outputs, and schema IDs without reading source.
Every event carries _schema and _nebu_version. Field rename? Bump the version. Your downstream queries never break silently.
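A downstream consumer can enforce that versioning contract in a few lines. The sketch below (the helper name accept is hypothetical) admits only schema IDs it knows, using the nebu.token_transfer.v1 ID shown in SAMPLE_EVENT; anything newer is rejected rather than silently misread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// accept returns the decoded event only for schema versions this
// consumer understands, so a field rename shipped under a new _schema
// surfaces as an explicit rejection instead of a silent misparse.
func accept(line []byte) (map[string]any, bool) {
	var ev map[string]any
	if err := json.Unmarshal(line, &ev); err != nil {
		return nil, false
	}
	switch ev["_schema"] {
	case "nebu.token_transfer.v1":
		return ev, true
	default:
		return nil, false // unknown version: reject loudly, don't guess
	}
}

func main() {
	ev, ok := accept([]byte(`{"_schema":"nebu.token_transfer.v1","transfer":{"assetCode":"USDC"}}`))
	fmt.Println(ok, ev["_schema"])
}
```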
Register any Git repo as a processor source via description.yml. nebu install your-processor just works.
## QUICKSTART
```sh
go install github.com/withObsrvr/nebu/cmd/nebu@latest
export PATH="$HOME/go/bin:$PATH"
```

```sh
nebu list
nebu install token-transfer
```

```sh
token-transfer --start-ledger 60200000 --end-ledger 60200001 | jq
```

```sh
token-transfer --start-ledger 60200000 --follow \
  | jq -c 'select(.transfer.assetCode == "USDC")' \
  | dedup --key meta.txHash \
  | json-file-sink --out usdc.jsonl
```
## COOKBOOK
Read deep history from the public aws-public-blockchain S3 archive — no AWS account, no RPC rate limits.
```sh
nebu fetch --mode archive \
  --datastore-type S3 \
  --bucket-path "aws-public-blockchain/v1.1/stellar/ledgers/pubnet" \
  --region us-east-2 \
  62080000 62080100 | gzip > historical.xdr.gz
```
Use authenticated RPC endpoints via NEBU_RPC_AUTH for higher throughput and private lanes.
```sh
export NEBU_RPC_AUTH="Api-Key YOUR_KEY"
token-transfer \
  --rpc-url https://rpc-pubnet.nodeswithobsrvr.co \
  --start-ledger 60200000 --end-ledger 60200100
```
Extract → filter → dedupe → store, composed with only the shell's pipe.
```sh
token-transfer --start-ledger 60200000 --follow \
  | jq -c 'select(.transfer.assetCode == "USDC")' \
  | dedup --key meta.txHash \
  | json-file-sink --out usdc.jsonl
```
One stream, many sinks. Live NATS for consumers, a JSONL archive on disk, a human-readable tail in the terminal.
```sh
token-transfer --start-ledger 60200000 --follow \
  | tee >(nats-sink --subject "stellar.live" --jetstream) \
  | tee >(json-file-sink --out archive.jsonl) \
  | jq -r '"L\(.meta.ledgerSequence): \(.transfer.amount)"'
```
SQL aggregations directly over the stream. No database setup, no schema, no ETL job.
```sh
token-transfer --start-ledger 60200000 --end-ledger 60200100 \
  | duckdb -c "
      SELECT json_extract_string(transfer,'\$.assetCode') AS asset,
             COUNT(*) AS n
      FROM read_json('/dev/stdin')
      WHERE transfer IS NOT NULL
      GROUP BY asset ORDER BY n DESC"
```
Capture raw XDR once, replay through any number of processors. Great for backfills and schema diffs.
```sh
nebu fetch 60200000 60200100 > ledgers.xdr
cat ledgers.xdr | token-transfer | jq 'select(.transfer)'
cat ledgers.xdr | contract-events | grep -i swap
```
See also: nebu(1) › EXAMPLES for the exhaustive reference, and docs/DUCKDB_COOKBOOK.md for deeper SQL recipes.
Separating fetch from process works because nebu fetch writes raw LedgerCloseMeta XDR to stdout: the same captured range also serves regression tests and side-by-side comparisons of two schema versions over identical ledgers.
## SAMPLE_EVENT
```json
{
  "_schema": "nebu.token_transfer.v1",
  "_nebu_version": "v0.6.7",
  "meta": {
    "ledgerSequence": 60200000,
    "closedAtUnix": "1765158311",
    "txHash": "abc...",
    "contractAddress": "CA..."
  },
  "transfer": {
    "from": "GA...",
    "to": "GB...",
    "assetCode": "USDC",
    "amount": "1000000"
  }
}
```

_schema, _nebu_version, and meta are nebu metadata; transfer is the payload.
## REGISTRY
## TWO_WAYS_IN
Install the CLI, pick a processor, pipe events through jq or duckdb. You're done. No Go required.
[ QUICKSTART → ]

The contract is just Processor, Origin, Transform, Sink, Emitter[T], and Reporter. Implement, ship as a binary, register via description.yml.
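To make "small contract" concrete, here is a hypothetical sketch of that shape. The authoritative signatures live in pkg/processor and take precedence; every method set below is a guess for illustration only, reusing the contract's vocabulary (Transform, Emitter).

```go
package main

import "fmt"

// HYPOTHETICAL shapes for illustration; see pkg/processor for the real API.
type Event map[string]any

type Emitter[T any] interface {
	Emit(T) error
}

type Transform interface {
	Process(in Event, out Emitter[Event]) error
}

// sliceEmitter collects emitted events; stands in for a real sink.
type sliceEmitter struct{ events []Event }

func (s *sliceEmitter) Emit(e Event) error { s.events = append(s.events, e); return nil }

// usdcOnly passes through only USDC transfers.
type usdcOnly struct{}

func (usdcOnly) Process(in Event, out Emitter[Event]) error {
	if t, ok := in["transfer"].(map[string]any); ok && t["assetCode"] == "USDC" {
		return out.Emit(in)
	}
	return nil
}

func main() {
	sink := &sliceEmitter{}
	tr := usdcOnly{}
	tr.Process(Event{"transfer": map[string]any{"assetCode": "USDC"}}, sink)
	tr.Process(Event{"transfer": map[string]any{"assetCode": "XLM"}}, sink)
	fmt.Println(len(sink.events)) // prints 1: only the USDC event passed
}
```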
[ BUILD_GUIDE → ]

## NEBU vs FLOWCTL
Every processor is agent-legible.
--describe-json emits flags, inputs, outputs, and schema IDs. Runtime hooks expose metrics, tracing, progress, and agent gates. The contract is small enough for an LLM to hold in its head.