Indexa reference
Everything to take a source from raw rows to a live, queryable REST API — declaratively.
Introduction
Writing an indexer by hand means re-solving the same hard problems every time: tracking where you left off, resuming after a crash without double-writing, backfilling history while tailing new data, and exposing a query API. Indexa solves these once. You only describe your data.
declare (yaml + optional handlers) → indexa deploy → REST query API
Install
Requires Node.js ≥ 22.5 — Indexa uses the built-in node:sqlite, so the default setup needs zero database installation.
$ npm install   # installs js-yaml (pg is optional, only for Postgres)
$ npm link      # optional: makes the `indexa` command global
Quickstart
Scaffold a starter project, then deploy. That backfills the sample data into SQLite and starts a REST API on port 4000.
$ indexa init my-app
$ cd my-app
$ indexa deploy --config indexa.config.yaml
$ curl localhost:4000/                       # metadata + schema
$ curl "localhost:4000/orders?status=paid"   # filter
$ curl "localhost:4000/orders/1"             # get by id
The config file
Everything lives in one file: a name, a source, a target, and a schema of output entities. If a stream's raw columns already match an entity (case/plural-insensitive: orders → Order), Indexa maps them automatically — no handler needed.
name: orders-indexer
source:
  type: csv              # csv | postgres | evm | <your own>
  sources:
    - { key: orders, file: data/orders.csv }
target:
  type: sqlite           # sqlite | postgres
  path: ./orders.db
schema:                  # your output entities
  Order:
    id: ID               # every entity needs an id
    customer: String
    total: BigDecimal
    status: String
    items: Int
    created_at: Timestamp
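The automatic mapping rule (case- and plural-insensitive, so orders maps to Order) can be pictured with a small sketch. The helper names normalize and matchEntity are hypothetical, and the naive trailing-"s" rule is an assumption; this reference does not show Indexa's actual matching logic.

```javascript
// Hypothetical sketch of case/plural-insensitive matching; not Indexa's real code.
function normalize(name) {
  const lower = name.toLowerCase();
  // Naive singularization (assumption): strip one trailing "s".
  return lower.endsWith('s') ? lower.slice(0, -1) : lower;
}

// Return the schema entity whose normalized name equals the stream key's, or null.
function matchEntity(streamKey, entityNames) {
  const wanted = normalize(streamKey);
  return entityNames.find((entity) => normalize(entity) === wanted) ?? null;
}

console.log(matchEntity('orders', ['Customer', 'Order']));  // Order
console.log(matchEntity('refunds', ['Customer', 'Order'])); // null
```

A stream with no matching entity, like refunds here, is the case where a handler would be needed.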
Field types
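The example schema above uses the ID, String, BigDecimal, Int, and Timestamp types; a field can also reference another entity. ${ENV_VAR} and ${ENV_VAR:-default} interpolation is supported anywhere in the config file.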
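To make the two interpolation forms concrete, here is a minimal sketch of how such expansion might behave. The interpolate function is hypothetical. Treating an empty variable as unset follows the shell convention for :- and is an assumption about Indexa, as is leaving unresolved placeholders untouched.

```javascript
// Hypothetical sketch of ${VAR} / ${VAR:-default} expansion; not Indexa's real code.
function interpolate(text, env) {
  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g;
  return text.replace(pattern, (match, name, fallback) => {
    const value = env[name];
    if (value !== undefined && value !== '') return value; // variable is set
    if (fallback !== undefined) return fallback;           // use the :- default
    return match; // assumption: leave unresolved placeholders as written
  });
}

const sampleEnv = { SOURCE_DB_URL: 'postgres://localhost/shop' };
console.log(interpolate('connection: ${SOURCE_DB_URL}', sampleEnv));
// connection: postgres://localhost/shop
console.log(interpolate('path: ${DB_PATH:-./orders.db}', sampleEnv));
// path: ./orders.db
```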
Handlers
Add a handlers: ./handlers.js line only when raw data needs transforming or aggregating. Each source row is processed exactly once, so read-modify-write increments are safe.
export default {
  async orders(row, ctx) {
    await ctx.store.upsert('Order', row.id, {
      id: row.id, customer: row.customer, total: row.total, status: row.status,
    });
    // Stateful aggregate: read prior entity, then write.
    const c = await ctx.store.get('Customer', row.customer);
    await ctx.store.upsert('Customer', row.customer, {
      id: row.customer, name: row.customer,
      totalSpent: (c ? Number(c.totalSpent) : 0) + Number(row.total),
      orderCount: (c ? Number(c.orderCount) : 0) + 1,
    });
  },
};
Run indexa types to generate indexa-types.d.ts for autocomplete on entities and ctx.store.
REST endpoints
From your schema, every entity becomes a REST resource — no code.
Query params: any schema field (?status=paid&customer=Bob), plus limit, offset, orderBy, desc=true.
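A client can assemble these parameters with the standard URL API. The sketch below assumes the Quickstart's localhost:4000 address and that orderBy takes a schema field name; buildQueryUrl is a hypothetical helper, not part of Indexa.

```javascript
// Hypothetical client-side helper: build a query URL for one entity resource.
function buildQueryUrl(base, entity, params) {
  const url = new URL(`/${entity}`, base);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value)); // also handles URL-encoding
  }
  return url.toString();
}

console.log(buildQueryUrl('http://localhost:4000', 'orders', {
  status: 'paid', orderBy: 'total', desc: true, limit: 10, offset: 20,
}));
// http://localhost:4000/orders?status=paid&orderBy=total&desc=true&limit=10&offset=20
```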
CLI
$ indexa init [dir]                  scaffold a starter project
$ indexa deploy --config <file>      backfill + live tail + API
    [--port 4000] [--once] [--no-api]
$ indexa validate --config <file>    check config without running
$ indexa types --config <file>       generate TypeScript types
--once runs a single backfill and stops (good for batch jobs / CI). Without it, Indexa keeps polling the source for new rows (pollIntervalMs, default 2000).
Sources
Built-in: csv (zero-dependency files), postgres (tails a table by a monotonic cursor column), and evm (indexes blockchain events with automatic reorg handling). A connector exposes ordered streams with a monotonic cursor; the engine persists the cursor of the last written record inside the same transaction as the writes — that is what makes resumption idempotent.
source:
  type: postgres
  connection: ${SOURCE_DB_URL}
  tables:
    - { key: orders, table: orders, cursorColumn: updated_at }
Write your own by implementing init / streams / close, then register it:
import { registerConnector } from 'indexa';
import KafkaConnector from './kafka-connector.js';
registerConnector('kafka', KafkaConnector);
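For a feel of what a connector might look like, here is a toy in-memory one. Only the method names init, streams, and close come from this reference. Every argument, the stream descriptor shape, and the read(afterCursor) generator are assumptions made for illustration, so check the real connector interface before relying on any of it.

```javascript
// Hypothetical in-memory connector. Only the method names init / streams / close
// come from this reference; every argument and return shape is an assumption.
class ArrayConnector {
  init(config) {
    this.rows = config.rows; // assumed: rows pre-sorted by a numeric `seq` cursor
    this.open = true;
  }

  // Assumed shape: one descriptor per stream, with a reader that yields
  // records strictly after the given cursor, in cursor order.
  streams() {
    const rows = this.rows;
    return [{
      key: 'orders',
      *read(afterCursor) {
        for (const row of rows) {
          if (row.seq > afterCursor) yield { cursor: row.seq, row };
        }
      },
    }];
  }

  close() {
    this.open = false;
  }
}

const connector = new ArrayConnector();
connector.init({ rows: [{ seq: 1, id: 'a' }, { seq: 2, id: 'b' }, { seq: 3, id: 'c' }] });
const [ordersStream] = connector.streams();
// Resuming from cursor 1 yields only the records after it.
console.log([...ordersStream.read(1)].map((rec) => rec.row.id)); // [ 'b', 'c' ]
connector.close();
```

The point of the sketch is the contract described above: records come out in cursor order, and reading "after cursor N" never returns a record at or before N.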
Targets
sqlite (default, built-in) or postgres (requires npm install pg). Tables are created automatically from your schema.
How it works
source connector ──stream(cursor)──▶ engine ──transaction──▶ target store ──▶ REST API
│ ▲
└─ checkpoint persisted ─┘ (same txn = idempotent)
1. Backfill: drain each stream from its last checkpoint until caught up.
2. Live tail: poll for records after the cursor on an interval.
3. Idempotency: entity writes + checkpoint advance commit together; a crash never double-writes or skips.
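The idempotency step can be illustrated with a toy in-memory model. MemoryStore and drain are hypothetical names, and the staged copy stands in for a real database transaction; this is a sketch of the idea under those assumptions, not Indexa's engine.

```javascript
// Hypothetical in-memory model of "entity writes + checkpoint commit together".
class MemoryStore {
  constructor() {
    this.entities = new Map();
    this.checkpoint = 0;
  }

  // Run fn against a staged copy; publish it only if fn returns normally.
  transaction(fn) {
    const staged = { entities: new Map(this.entities), checkpoint: this.checkpoint };
    fn(staged); // a throw here discards `staged`, like a rollback
    this.entities = staged.entities;
    this.checkpoint = staged.checkpoint;
  }
}

// Apply every record after the stored checkpoint, one transaction per record.
function drain(store, records, crashAtCursor = null) {
  for (const rec of records) {
    if (rec.cursor <= store.checkpoint) continue; // already applied
    store.transaction((tx) => {
      tx.entities.set('total', (tx.entities.get('total') ?? 0) + rec.amount);
      if (rec.cursor === crashAtCursor) throw new Error('simulated crash');
      tx.checkpoint = rec.cursor;
    });
  }
}

const records = [{ cursor: 1, amount: 10 }, { cursor: 2, amount: 5 }, { cursor: 3, amount: 7 }];
const store = new MemoryStore();
try {
  drain(store, records, 2); // crashes mid-transaction on cursor 2
} catch (err) {
  // the process "restarts" below
}
drain(store, records); // resumes after checkpoint 1
console.log(store.checkpoint, store.entities.get('total')); // 3 22
```

Because the half-applied write for cursor 2 was never committed, the retry adds its amount exactly once, and the running total matches an uninterrupted run.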
Reorg handling
The evm connector handles chain reorganizations automatically. Every entity write made while indexing an unfinalized block is recorded in an undo journal. When a previously-indexed block hash changes, the engine rolls the affected entities back to the last common ancestor (including aggregated values like running balances), then re-indexes the new canonical chain. The full walkthrough is in the EVM guide.
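The undo-journal idea can be sketched in a few lines. JournaledStore, its methods, and the block numbers are all hypothetical, and the sketch omits pruning the journal once blocks finalize; it illustrates the technique, not Indexa's implementation.

```javascript
// Hypothetical undo journal; not Indexa's real implementation.
class JournaledStore {
  constructor() {
    this.entities = new Map();
    this.journal = []; // entries: { block, id, previous }, in write order
  }

  // Record the prior value (undefined = did not exist) before overwriting.
  upsert(block, id, value) {
    this.journal.push({ block, id, previous: this.entities.get(id) });
    this.entities.set(id, value);
  }

  // Undo, newest first, every write made in blocks after `ancestorBlock`.
  rollbackTo(ancestorBlock) {
    while (this.journal.length > 0 && this.journal[this.journal.length - 1].block > ancestorBlock) {
      const { id, previous } = this.journal.pop();
      if (previous === undefined) this.entities.delete(id);
      else this.entities.set(id, previous);
    }
  }
}

const ledger = new JournaledStore();
const credit = (block, account, amount) =>
  ledger.upsert(block, account, (ledger.entities.get(account) ?? 0) + amount);

credit(100, 'alice', 50);
credit(101, 'alice', 20); // block 101 is later orphaned
credit(101, 'bob', 5);
ledger.rollbackTo(100);   // last common ancestor
credit(101, 'alice', 30); // re-index the new canonical block 101
console.log(ledger.entities.get('alice'), ledger.entities.has('bob')); // 80 false
```

Replaying the journal in reverse restores the running balance to its value at the ancestor, so the aggregate is rebuilt from the canonical chain rather than carrying the orphaned credit.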
Deploy with Docker
# put your config + handlers + data under ./app, then:
$ docker build -t my-indexer .
$ docker run -p 4000:4000 -e CONFIG=app/indexa.config.yaml my-indexer
The image has a healthcheck on /_health and respects INDEXA_LOG_LEVEL. Use Postgres as the target for production.
Roadmap
Deliberately left out to keep the core small and the "just deploy" promise intact: