bss-cli
v1.7.0
How the nine services and two portals fit together: cockpit, write-through policy, and end-to-end traces.
The operator cockpit (REPL canonical)
bss REPL is the cockpit. Type natural language; the agent calls tools; results render as ASCII cards. Slash commands (/ports, /360, /focus, /confirm) cover deterministic operator flows. v0.20 adds knowledge.search / knowledge.get so the cockpit can answer how-to questions from the indexed handbook + runbooks with cited anchors.
Since v0.13 the REPL is the canonical conversational cockpit. The browser
surface on port 9002 shares the same Postgres-backed Conversation store, so you can
exit bss, open the page, and see the same turns. Since v1.6 it also carries a
full operator CRM around that chat: customer 360s, a case workbench, order
and catalog screens, and subscription lifecycle, all as direct policy-gated CRUD with a two-step
confirm panel on every destructive or money-moving verb (see
Portals). No login wall: the cockpit is
single-operator-by-design behind a secure perimeter, with audit attribution coming from
.bss-cli/settings.toml. On the conversational path, destructive actions propose
first and wait for /confirm.
This is what "LLM-native" actually feels like in practice. The agent is not a separate surface grafted
on top of the system — it goes through the same HTTP endpoints, the same typed tool signatures, the
same write-through policy layer, and the same audit trail a human CLI user does. The only difference is
the front door: a conversation instead of a verb. When the agent tries to do something illegal, the
policy layer hands back a structured PolicyViolation with a rule code and
machine-readable context, and the agent self-corrects from that.
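The self-correction loop described above can be sketched as follows. This is a minimal illustration, not the project's code: the rule code, the `terminate_subscription` tool, and the `call_tool` wrapper are hypothetical names, and only the idea of a `PolicyViolation` carrying a rule code plus machine-readable context comes from the text.

```python
from typing import Any, Callable, Optional


class PolicyViolation(Exception):
    """Structured refusal: a rule code plus machine-readable context."""

    def __init__(self, rule: str, context: Optional[dict] = None) -> None:
        super().__init__(rule)
        self.rule = rule
        self.context = context or {}


def terminate_subscription(sub: dict) -> dict:
    # Hypothetical policy rule: only an active subscription may be terminated.
    if sub["state"] != "active":
        raise PolicyViolation(
            "subscription.terminate.requires_active",
            {"subscription_id": sub["id"], "state": sub["state"]},
        )
    return {**sub, "state": "terminated"}


def call_tool(tool: Callable[..., Any], *args: Any) -> dict:
    """Run a tool and turn a violation into data the agent can reason over."""
    try:
        return {"ok": True, "result": tool(*args)}
    except PolicyViolation as pv:
        # The agent sees the rule and context instead of an opaque failure.
        return {"ok": False, "rule": pv.rule, "context": pv.context}
```

Because the refusal comes back as data rather than as a free-text error, an agent can read the `rule` and `context` fields and pick a different next step, for example activating the subscription first or reporting the block to the operator.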
v1.5 unlocks multi-step orchestration — compound questions
("investigate CASE-…", "register CUST + create order", "show customer and their
subscriptions") chain in one operator turn. A new
BSS_REPL_LLM_AUTONOMY env var (granular by default, batched as
opt-in) controls how many /confirms a compound
destructive action needs: granular re-gates after each destructive call, while batched authorises
the whole loop on the first /confirm. The destructive-tools list is
unchanged: autonomy changes how many /confirms are needed, not
which tools require one. Two safety rails sit alongside: a 3-strike loop bail
catches LLM thrash on repeated tool failures, and a cockpit chrome filter strips
banner-shape and narrated-call mimicry from both history rehydration and live emits so
the bubble the operator reads matches the action that actually ran. Before v1.5
the propose bubble and the execute bubble both said "Done.", so one of them was lying; v1.5
replaces them with "Proposed X(args). Type /confirm to authorise." and
"Executed X(args)." respectively.
Architecture
Every BSS write goes through the per-service policy layer. Three trigger paths feed it.
The direct path goes through bss-clients: every CLI/REPL command, the
entire signup funnel, every post-login self-serve route, and every read. It is sub-second,
deterministic, and involves no LLM. The orchestrator-mediated path goes through astream_once:
it serves the customer-portal chat surface (the only orchestrator-mediated route post-v0.11)
and wraps a LangGraph ReAct agent over the same tool registry. The in-process path is tick loops:
the v0.18 renewal worker fires automatic renewals and sends upcoming-renewal reminder
emails on a 60-second tick, and FOR UPDATE SKIP LOCKED makes it multi-replica safe by
construction.
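The renewal tick can be pictured with the sketch below. The table and column names in the SQL, the psycopg-style `%(batch)s` placeholder, and every function name are assumptions. The in-memory `claim_batch` only imitates what `FOR UPDATE SKIP LOCKED` does in PostgreSQL, namely that a second worker skips rows a first worker already holds instead of waiting on them.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Set

# Assumed shape of the claim query; real table and column names may differ.
CLAIM_DUE_SQL = """
SELECT id
FROM subscription.subscription
WHERE next_renewal_at <= now()
ORDER BY next_renewal_at
LIMIT %(batch)s
FOR UPDATE SKIP LOCKED
"""


def claim_batch(due_ids: List[str], locked: Set[str], batch: int) -> List[str]:
    """In-memory stand-in for the query: skip rows another worker has locked."""
    claimed = [i for i in due_ids if i not in locked][:batch]
    locked.update(claimed)
    return claimed


async def tick_loop(
    sweep: Callable[[], Awaitable[None]],
    interval_s: float = 60.0,
    max_ticks: Optional[int] = None,
) -> int:
    """Run `sweep` every `interval_s` seconds; `max_ticks` exists only for demos."""
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        await sweep()
        ticks += 1
        await asyncio.sleep(interval_s)
    return ticks
```

With two replicas calling `claim_batch` against the same `locked` set, each due row is handed to exactly one of them, which is the property the text refers to as multi-replica safe by construction.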
Inside, two planes connect the services: synchronous HTTP over the typed
bss-clients layer (TMF APIs, named BSS_*_API_TOKEN) for calls that
need an immediate answer, and asynchronous events over a RabbitMQ topic
exchange bss.events for reactions. Every service writes directly to its own
schema in one shared PostgreSQL 16; audit.domain_event is written in the
same transaction as the domain write, with the RabbitMQ publish happening after commit
(simplified outbox). Every service exports OpenTelemetry spans to Jaeger;
bss trace renders the same spans as an ASCII swimlane in the terminal.
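The simplified outbox described above (event row in the same transaction as the domain write, broker publish only after commit) can be sketched with the standard-library `sqlite3` module standing in for PostgreSQL. The table layouts, the `place_order` function, and the `publish` callback are hypothetical; the real system writes to `audit.domain_event` and publishes to the `bss.events` exchange.

```python
import json
import sqlite3
from typing import Callable


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a domain table and an event table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product_order (id TEXT PRIMARY KEY, state TEXT)")
    conn.execute(
        "CREATE TABLE domain_event "
        "(id INTEGER PRIMARY KEY, routing_key TEXT, payload TEXT)"
    )
    conn.commit()
    return conn


def place_order(
    conn: sqlite3.Connection,
    publish: Callable[[str, dict], None],
    order_id: str,
) -> None:
    event = {"type": "order.created", "order_id": order_id}
    # One transaction: both inserts commit together or roll back together.
    with conn:
        conn.execute(
            "INSERT INTO product_order (id, state) VALUES (?, ?)",
            (order_id, "acknowledged"),
        )
        conn.execute(
            "INSERT INTO domain_event (routing_key, payload) VALUES (?, ?)",
            ("order.created", json.dumps(event)),
        )
    # Reached only after a successful commit, so nothing is published
    # for a write that was rolled back.
    publish("order.created", event)
```

Calling `place_order` twice with the same id raises `sqlite3.IntegrityError` on the second call, leaves the event table unchanged, and never invokes `publish`.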
The audit log gets a coherent attribution on every write: actor,
channel (portal-self-serve / portal-csr /
portal-chat / cli / system:renewal_worker), and
service_identity (named-token perimeter, v0.9).
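As an illustration of the attribution triple, the sketch below models it as a small immutable record. The field names and the five channel values come from the text; the `Attribution` class itself, the validation, and the decision to treat the channel list as a closed set are assumptions.

```python
from dataclasses import dataclass
from typing import Dict

# The channels named above; treating them as exhaustive is an assumption.
CHANNELS = {
    "portal-self-serve",
    "portal-csr",
    "portal-chat",
    "cli",
    "system:renewal_worker",
}


@dataclass(frozen=True)
class Attribution:
    actor: str
    channel: str
    service_identity: str

    def __post_init__(self) -> None:
        # Reject writes that would land in the audit log without a known channel.
        if self.channel not in CHANNELS:
            raise ValueError(f"unknown channel: {self.channel!r}")
        if not self.actor or not self.service_identity:
            raise ValueError("actor and service_identity must be non-empty")

    def as_audit_fields(self) -> Dict[str, str]:
        """Columns to store alongside each audited write."""
        return {
            "actor": self.actor,
            "channel": self.channel,
            "service_identity": self.service_identity,
        }
```

Validating at construction time means an incomplete attribution fails before any write is attempted, rather than surfacing later as a gap in the audit log.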
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────────┐
│ Self-serve UI │ │ Operator cockpit │ │ bss (CLI + REPL) │
│ port 9001 (v0.4) │ │ port 9002 (v0.13)│ │ + LangGraph orchestrator│
│ signup · chat │ │ browser veneer │ │ canonical cockpit │
└────────┬─────────┘ └─────────┬────────┘ └────────────┬─────────────┘
│ │ │
┌────────┴────────┐ │ │
│ direct │ │ shared cockpit.session│
│ bss-clients │ │ Conversation store │
│ (signup, post- │ │ (Postgres-backed) │
│ login, reads) │ │ │
│ │ │ │
│ chat → astream │ │ │
│ (customer_self_ │ │ │
│ serve profile) │ │ │
└────────┬────────┘ │ │
│ ▼ ▼
│ ┌──────────────────────────────────────────────────┐
│ │ bss_orchestrator.session.astream_once(channel, │
│ │ actor=…, service_identity=…) · ReAct agent · │
│ │ allow_destructive gated by /confirm │
│ └──────────────────────────┬───────────────────────┘
│ │
└────────────────────►────────────┴──────────────────────►
│ HTTP (TMF APIs) + bss-clients
┌──────┬────────┬──────────────┼───────┬────────┐
▼ ▼ ▼ ▼ ▼
┌─────┐┌─────┐ ┌─────┐ ┌─────┐┌─────┐
│CRM* ││Pay │ │Cat │ │COM ││Subs │
│8002 ││8003 │ │8001 │ │8004 ││8006 │
└──┬──┘└──┬──┘ └──┬──┘ └──┬──┘└──┬──┘
│ │ │ │ │ ↑
│ └─ HTTP (e.g. Pay→CRM "customer exists?")
│ │ │
│ ┌───────────────────────┼──────┘
│ │ ┌─────┐ ┌─────┐ ┌─────┐ ┌──────┐
│ │ │SOM │ │Med │ │Rate │ │Prov │
│ │ │8005 │ │8007 │ │8008 │ │ Sim │
│ │ └──┬──┘ └──┬──┘ └──┬──┘ │ 8010 │
│ │ │ │ │ └──┬───┘
▼ ▼ ▼ ▼ ▼ ▼
═══════════════════════════════════════════════════════════
║ RabbitMQ — topic exchange: bss.events ║
║ order.* · service_order.* · service.* · provisioning.* ║
║ subscription.* · usage.* · crm.* · payment.* · port_* ║
═══════════════════════════════════════════════════════════
In-process tick loop (v0.18): subscription service runs a
60s renewal-sweep + reminder-email worker in lifespan; FOR
UPDATE SKIP LOCKED keeps it safe across replicas.
External providers (v0.14 → v0.16):
Resend ─ transactional email (magic link, OTP, renewal)
Didit ─ KYC: hosted UI + signed-webhook corroboration
Stripe ─ Checkout + webhook reconciliation (PCI SAQ A)
External sibling adapter (v1.1, OPTIONAL):
loyalty-cli ─ promo entitlements. Bearer-auth HTTP at
loyalty-http:8080; catalog + com + crm hold the client.
Unset BSS_LOYALTY_API_TOKEN → promo subsystem off,
every core flow unchanged (graceful degradation).
Cockpit knowledge tool (v0.20+):
knowledge.search / knowledge.get over the indexed doc corpus
(HANDBOOK + CLAUDE + runbooks). Tier-0 Postgres FTS by default;
Tier-1 pgvector hybrid behind BSS_KNOWLEDGE_BACKEND=hybrid.
Operator-cockpit profile only; never customer chat.
┌────────────────────────────────────────────────┐ ┌──────────────┐
│ PostgreSQL 16 + pgvector (single instance) │ │ Jaeger │
│ │ │ (v0.2+) │
│ crm · catalog · inventory · payment · │ │ OTLP/HTTP │
│ order_mgmt · service_inventory · provisioning │ │ → traces UI │
│ subscription · mediation · billing · cockpit ·│ └──────────────┘
│ audit · integrations · portal_auth · │
│ knowledge (v0.20+ doc-corpus RAG) │
└────────────────────────────────────────────────┘
* CRM hosts the Inventory sub-domain (MSISDN + eSIM pools) on port 8002
under /inventory-api/v1/...; not a separate container in v0.x.
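The two optional subsystems in the diagram notes are switched by environment variables, which can be sketched as below. `BSS_LOYALTY_API_TOKEN` and `BSS_KNOWLEDGE_BACKEND=hybrid` come from the text. The `OptionalFeatures` record, the `"fts"` label for the default backend, and treating an empty token as unset are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OptionalFeatures:
    loyalty_enabled: bool
    knowledge_backend: str  # "fts" (default, label assumed) or "hybrid"


def load_optional_features(env: Optional[Mapping[str, str]] = None) -> OptionalFeatures:
    env = os.environ if env is None else env
    # No token means the promo subsystem stays off and core flows are untouched.
    token = env.get("BSS_LOYALTY_API_TOKEN", "").strip()
    # Anything other than an explicit "hybrid" keeps the Postgres FTS default.
    backend = "hybrid" if env.get("BSS_KNOWLEDGE_BACKEND", "") == "hybrid" else "fts"
    return OptionalFeatures(loyalty_enabled=bool(token), knowledge_backend=backend)
```

Callers would check `loyalty_enabled` before building a loyalty client, so that an unset token results in no client at all rather than a client that fails on every request.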
Trace anything end-to-end
bss trace for-order ORD-… — the swimlane, rendered in the terminal.
Same trace, two surfaces — bss trace for the terminal, Jaeger UI for when you want to
drill down.
OpenTelemetry is the source of truth. The ASCII view is a terminal-native lens on the same spans, not a parallel tracing system; no second pipeline to maintain, nothing to drift. If something is happening in the real traces, it is happening in the swimlane — and if you need timing breakdowns, span tags, or log correlation, Jaeger is right there.
trace 4825e0bb25ae0870 766ms 125 spans 8 services 0 errors
POST /tmf-api/productOrderingManagement/v4/productOrder [com ] ████████████████████████ 766ms
└─ POST /tmf-api/customerManagement/v4/customer/CUST-022 [crm ] ██ 18ms
└─ POST /tmf-api/paymentMethod/v1/charge [payment ] ████ 34ms
└─ POST /tmf-api/serviceOrderingManagement/v4/serviceOrder [som ] ████████████████ 512ms
└─ INSERT INTO service_inventory.cfs [postgres ] ▌ 2ms
└─ POST /tmf-api/resourceInventoryManagement/v4/resource [inventory ] ███ 31ms
└─ POST /provisioning/task [provisioning] ████████████ 381ms
└─ AMQP publish bss.events provisioning.task.completed [rabbitmq ] ▌ 3ms
└─ POST /tmf-api/subscription/v1/activate [subscription] ███ 47ms
└─ AMQP publish bss.events order.completed [rabbitmq ] ▌ 3ms
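A swimlane like the one above can be produced from span data with very little code. The sketch below is an assumption about the general approach, not the `bss trace` implementation: the `Span` record, the 24-cell bar width, and the scaling rule are all hypothetical, and the bar lengths it produces will not match the sample output exactly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    name: str
    service: str
    duration_ms: int
    depth: int = 0  # 0 = root span, 1 = child, 2 = grandchild, ...


def render_swimlane(spans: List[Span], width: int = 24) -> List[str]:
    """Render each span as a bar scaled against the longest span."""
    longest = max(s.duration_ms for s in spans)
    lines = []
    for s in spans:
        cells = round(width * s.duration_ms / longest)
        # Spans too short for a full cell still get a half-block marker.
        bar = "█" * cells if cells else "▌"
        prefix = "  " * (s.depth - 1) + "└─ " if s.depth else ""
        lines.append(f"{prefix}{s.name} [{s.service}] {bar} {s.duration_ms}ms")
    return lines


if __name__ == "__main__":
    demo = [
        Span("POST /productOrder", "com", 766),
        Span("POST /serviceOrder", "som", 512, depth=1),
        Span("INSERT INTO service_inventory.cfs", "postgres", 2, depth=2),
    ]
    print("\n".join(render_swimlane(demo)))
```

Since the renderer only reads span names, services, durations, and nesting, it can sit on top of the same exported spans that Jaeger shows, which is the sense in which the ASCII view is a lens rather than a second tracing pipeline.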