opencode-krates-connector/design/backend.md
Hermes Agent 89bc5e8c15 docs: Finalize all design documents
Signed-off-by: Hermes Agent <hermes@nosuchhost>
2026-06-16 09:00:33 -04:00

Design Document: Backend & Real-Time Sync Feature Specification

Overview

The backend service aggregates Kubernetes API data and provides real-time WebSocket connections for shell, logs, and multi-user collaboration.

Backend Architecture

  • Language: Go (for Kubernetes client integration)
  • Kubernetes Client: kubernetes/client-go
  • WebSockets: gorilla/websocket
  • CRDT Layer: Yjs + y-websocket
  • Storage: Redis or in-memory (single-node), y-leveldb/y-mongodb (multi-node)

Service Responsibilities

  1. Wrap Kubernetes API access (via client-go)
  2. Provide WebSocket multiplexer for:
    • Shell sessions
    • Log streaming
    • Watch API updates
  3. Handle CRDT sync for shared workspace state
  4. Broadcast user presence/awareness

Kubernetes Data Sources

Required Resources

  • Namespaces
  • Pods
  • Deployments
  • StatefulSets
  • DaemonSets
  • Services
  • Ingresses
  • ConfigMaps
  • Secrets (values are base64-encoded in the API; see Secret Decoding)
  • PVCs
  • Events
  • CRDs

Live Updates

Implementation: Kubernetes Watch API

GET /api/v1/namespaces/{ns}/pods?watch=true
GET /api/v1/namespaces/{ns}/services?watch=true
GET /apis/apps/v1/namespaces/{ns}/deployments?watch=true
# etc for each resource type

Aggregation Strategy

  • Single endpoint: /api/watch that multiplexes all watch streams
  • Or: Resource-specific endpoints (/api/watch/pods, /api/watch/services, etc.)
  • Include namespace filtering in requests
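
The single-endpoint option amounts to a fan-in of per-resource watch streams. A minimal sketch, assuming each watch is already exposed as a Go channel; the `WatchEvent` type and `mergeWatches` function are hypothetical names, not part of client-go:

```go
package main

import "sync"

// WatchEvent is a hypothetical envelope for one watch notification.
type WatchEvent struct {
	Kind      string // e.g. "Pod", "Service"
	Namespace string
	Name      string
	Type      string // "ADDED", "MODIFIED", or "DELETED"
}

// mergeWatches fans several per-resource event channels into one channel.
// The output channel is closed once every input channel has been closed.
func mergeWatches(sources ...<-chan WatchEvent) <-chan WatchEvent {
	out := make(chan WatchEvent)
	var wg sync.WaitGroup
	for _, src := range sources {
		wg.Add(1)
		go func(c <-chan WatchEvent) {
			defer wg.Done()
			for ev := range c {
				out <- ev
			}
		}(src)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}
```

Ordering is preserved within one source but not across sources, which is usually acceptable because each event carries its own kind and name.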

Secret Decoding

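  • Option 1: Decode server-side before sending
  • Option 2: Decode client-side (safer: the server never holds decoded values, so it cannot log them)
  • Include a decoded flag in the response so the client knows whether values are still base64-encoded
  • Never log decoded values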
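
Whichever side decodes, the step itself is small. The following sketch assumes a hypothetical `SecretPayload` response shape carrying the decoded flag; error messages name the offending key but never echo the value:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// SecretPayload is a hypothetical response shape for one Secret.
type SecretPayload struct {
	Name    string            `json:"name"`
	Data    map[string]string `json:"data"`
	Decoded bool              `json:"decoded"` // true when Data holds decoded values
}

// decodeSecret returns a copy of p with every value base64-decoded.
// A payload that is already decoded is returned unchanged.
func decodeSecret(p SecretPayload) (SecretPayload, error) {
	if p.Decoded {
		return p, nil
	}
	out := SecretPayload{
		Name:    p.Name,
		Data:    make(map[string]string, len(p.Data)),
		Decoded: true,
	}
	for k, v := range p.Data {
		raw, err := base64.StdEncoding.DecodeString(v)
		if err != nil {
			// Deliberately omit v and the decoder error text from the message.
			return SecretPayload{}, fmt.Errorf("secret %q: key %q is not valid base64", p.Name, k)
		}
		out.Data[k] = string(raw)
	}
	return out, nil
}
```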

WebSocket Endpoints

Shell Endpoint

/ws/shell
  • Upstream: equivalent of kubectl exec -it <pod> -n <ns> -- sh
  • Protocol: Bidirectional WebSocket
    • Client → Server: keystrokes (terminal input), terminal resize
    • Server → Client: stdout, stderr (merged into one stream when a TTY is allocated)
  • Terminal: xterm.js on client
  • Backend: client-go exec subresource with streaming (tools/remotecommand)
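
Because the client sends two kinds of frames (input and resize), the server needs a way to tell them apart. One possible framing is a small JSON envelope; the `ClientMessage` type and its field names below are assumptions for illustration, not a protocol defined elsewhere in this document:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClientMessage is a hypothetical JSON envelope for client-to-server frames.
type ClientMessage struct {
	Type string `json:"type"`           // "input" or "resize"
	Data string `json:"data,omitempty"` // keystrokes, for "input"
	Cols uint16 `json:"cols,omitempty"` // terminal width, for "resize"
	Rows uint16 `json:"rows,omitempty"` // terminal height, for "resize"
}

// parseClientMessage decodes one frame and rejects unknown or incomplete ones.
func parseClientMessage(frame []byte) (ClientMessage, error) {
	var m ClientMessage
	if err := json.Unmarshal(frame, &m); err != nil {
		return ClientMessage{}, fmt.Errorf("malformed frame: %w", err)
	}
	switch m.Type {
	case "input":
		return m, nil
	case "resize":
		if m.Cols == 0 || m.Rows == 0 {
			return ClientMessage{}, fmt.Errorf("resize needs non-zero cols and rows")
		}
		return m, nil
	default:
		return ClientMessage{}, fmt.Errorf("unknown message type %q", m.Type)
	}
}
```

An "input" frame would be written to the exec stream's stdin, while a "resize" frame would be forwarded to the terminal size queue.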

Logs Endpoint

/ws/logs
  • Upstream: equivalent of kubectl logs --follow <pod> -n <ns>
  • Protocol: WebSocket text frames
  • Streaming: Each line as separate message
  • Client: Auto-scroll unless user scrolled up

Watch Endpoint

/ws/watch
  • Purpose: Broadcast resource changes to all connected clients
  • Payload: JSON Patch or full object update
  • Filtering: Per-client namespace/resource filters
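
Per-client filtering can be sketched as a predicate evaluated before each broadcast. The `ResourceEvent` and `Filter` types are hypothetical, and the sketch assumes an empty filter set means "match everything":

```go
package main

import "sort"

// ResourceEvent is a hypothetical, minimal description of one change.
type ResourceEvent struct {
	Kind, Namespace, Name string
}

// Filter holds one client's subscription. An empty set matches everything.
type Filter struct {
	Namespaces map[string]bool
	Kinds      map[string]bool
}

// Matches reports whether ev passes both the namespace and the kind filter.
func (f Filter) Matches(ev ResourceEvent) bool {
	if len(f.Namespaces) > 0 && !f.Namespaces[ev.Namespace] {
		return false
	}
	if len(f.Kinds) > 0 && !f.Kinds[ev.Kind] {
		return false
	}
	return true
}

// recipients returns the sorted IDs of clients whose filter matches ev.
func recipients(clients map[string]Filter, ev ResourceEvent) []string {
	var ids []string
	for id, f := range clients {
		if f.Matches(ev) {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}
```

This filter only narrows what a client asked for; it does not replace the server-side authorization check on what the client may see.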

Multi-User Sync (The Yard)

CRDT Strategy

  • Library: Yjs (proven CRDT implementation)
  • Provider:
    • Single-node: in-memory or Redis
    • Multi-node: y-leveldb, y-mongodb, or y-webrtc
  • WebSocket: y-websocket for real-time sync

Synced State

Only sync the structure of the workspace, not content.

Synced:

  • Krate positions (wx, wy)
  • Krate existence (create/delete)
  • Krate minimize state
  • Window layout (grid positions, sizes)
  • Room/canvas camera state (optional)

Not synced:

  • Window content (logs, YAML, shell): each client streams independently

Presence/Awareness

Broadcast ephemeral user state:

{
  userId: string,
  name: string,
  color: string,
  cursorPosition: { x, y } | null,
  activeKrateId: string | null,
  spotlightQuery: string | null,
  timestamp: number
}

Implementation:

  • Each client publishes to Yjs Awareness channel
  • Admin panel subscribes to awareness updates
  • Update interval: every 5 seconds or on significant change

Real-Time Shell Implementation

Client-Side (xterm.js)

1. Connect WebSocket
2. Initialize xterm with {
   rows: 24,
   cols: 80,
   fontFamily: 'IBM Plex Mono',
   fontSize: 12
}
3. Attach to DOM: term.open(container)
4. Attach xterm to WebSocket
   - term.onData() → send to server
   - WebSocket.onmessage → term.write()
5. Handle resize: term.resize(cols, rows), then send the new size to the server

Server-Side (Go + k8s client)

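1. Extract pod/ns from URL: /ws/shell?pod=xxx&ns=yyy
2. Build the exec request with client-go:
   // imports: corev1 "k8s.io/api/core/v1", "k8s.io/client-go/kubernetes/scheme",
   //          "k8s.io/client-go/tools/remotecommand"
   req := clientset.CoreV1().RESTClient().Post().
     Resource("pods").Name(pod).Namespace(ns).SubResource("exec").
     VersionedParams(&corev1.PodExecOptions{
       Command: []string{"sh"},
       Stdin:   true,
       Stdout:  true,
       Stderr:  false, // with a TTY, stderr is merged into stdout
       TTY:     true,
     }, scheme.ParameterCodec)
   // Create exec stream; config is the *rest.Config used to build clientset
   exec, err := remotecommand.NewSPDYExecutor(config, "POST", req.URL())
   if err != nil { /* report the error and close the WebSocket */ }
   // in and out are io.Reader / io.Writer adapters around the WebSocket
   err = exec.StreamWithContext(ctx, remotecommand.StreamOptions{
     Stdin: in, Stdout: out, Tty: true,
   })
3. Duplex: WebSocket ↔ Exec stream
4. Handle close gracefully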

Real-Time Logs Implementation

Client-Side

1. Connect WebSocket
2. Maintain scroll position state
3. On new message:
   - Append to log buffer
   - If wasAtBottom: scroll to bottom
   - Else: keep existing scroll position
4. Colorize lines on receive (ERROR/WARN/INFO patterns)
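
The colorizing step reduces to classifying each line by the first level keyword it contains. The sketch below is written in Go for consistency with the other examples in this document; the browser client would port the same pattern to JavaScript. The exact keyword set and the "plain" fallback are assumptions:

```go
package main

import (
	"regexp"
	"strings"
)

// levelPattern matches the first whole-word level keyword, case-insensitively.
var levelPattern = regexp.MustCompile(`(?i)\b(error|warn|warning|info)\b`)

// classifyLine returns "error", "warn", "info", or "plain" for a log line.
func classifyLine(line string) string {
	m := levelPattern.FindStringSubmatch(line)
	if m == nil {
		return "plain"
	}
	switch strings.ToLower(m[1]) {
	case "error":
		return "error"
	case "warn", "warning":
		return "warn"
	default:
		return "info"
	}
}
```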

Server-Side

1. Follow logs via the pod log subresource (client-go GetLogs with Follow: true), the equivalent of kubectl logs --follow
2. Stream lines as WebSocket messages
3. Add keep-alive ping every 30s
4. Reconnection is initiated by the client on disconnect

Backend API Endpoints

HTTP Endpoints (REST)

GET  /api/cluster              # Cluster metadata
GET  /api/resources            # All resources (cached, for initial load)
GET  /api/resources/pods       # Filtered by namespace (optional)
GET  /api/resources/deployments
GET  /api/resources/services
GET  /api/resources/secrets
GET  /api/resources/configmaps
GET  /api/resources/namespaces
GET  /api/resources/crds
GET  /api/resource/{kind}/{name}?ns={namespace}
GET  /api/health               # Backend health check
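
For the single-resource route, the handler has to pull kind, name, and namespace out of the URL. A minimal sketch using only the standard library; `ResourceRef` and `parseResourceURL` are hypothetical names, and an absent ns parameter is assumed to mean a cluster-scoped resource:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ResourceRef identifies one resource requested via /api/resource/{kind}/{name}.
type ResourceRef struct {
	Kind, Name, Namespace string
}

// parseResourceURL extracts kind and name from the path and ns from the query.
func parseResourceURL(raw string) (ResourceRef, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return ResourceRef{}, err
	}
	const prefix = "/api/resource/"
	if !strings.HasPrefix(u.Path, prefix) {
		return ResourceRef{}, fmt.Errorf("path %q does not start with %s", u.Path, prefix)
	}
	parts := strings.Split(strings.TrimPrefix(u.Path, prefix), "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return ResourceRef{}, fmt.Errorf("expected %s{kind}/{name}", prefix)
	}
	return ResourceRef{
		Kind:      parts[0],
		Name:      parts[1],
		Namespace: u.Query().Get("ns"), // empty for cluster-scoped resources
	}, nil
}
```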

WebSocket Endpoints

/ws/shell                      # Shell session
/ws/logs                       # Log streaming
/ws/watch                      # Resource watch updates
/ws/sync                       # CRDT sync + awareness

Error Handling

WebSocket Disconnects

  • Shell: Notify user, keep terminal buffer (don't clear)
  • Logs: Auto-reconnect with exponential backoff
  • Watch: Auto-reconnect, then fetch the delta since the last received update (e.g. by resourceVersion)

Backpressure

  • Shell: Buffer output if client slow, drop old lines if buffer full
  • Logs: Same strategy
  • Watch: Batch updates if high frequency
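
The "buffer, then drop old lines" strategy corresponds to a fixed-capacity ring buffer that overwrites its oldest entry. A minimal, non-thread-safe sketch; `LineBuffer` is a hypothetical type, and a caller shared between goroutines would need to add a mutex:

```go
package main

// LineBuffer keeps at most cap(lines) pending lines, discarding the oldest
// when a new line arrives and the buffer is full.
type LineBuffer struct {
	lines   []string
	start   int // index of the oldest buffered line
	count   int // number of buffered lines
	Dropped int // total lines discarded so far
}

func NewLineBuffer(capacity int) *LineBuffer {
	return &LineBuffer{lines: make([]string, capacity)}
}

// Push appends a line, overwriting the oldest one if the buffer is full.
func (b *LineBuffer) Push(line string) {
	if len(b.lines) == 0 {
		b.Dropped++
		return
	}
	if b.count == len(b.lines) {
		b.lines[b.start] = line
		b.start = (b.start + 1) % len(b.lines)
		b.Dropped++
		return
	}
	b.lines[(b.start+b.count)%len(b.lines)] = line
	b.count++
}

// Drain returns the buffered lines, oldest first, and empties the buffer.
func (b *LineBuffer) Drain() []string {
	out := make([]string, 0, b.count)
	for i := 0; i < b.count; i++ {
		out = append(out, b.lines[(b.start+i)%len(b.lines)])
	}
	b.start, b.count = 0, 0
	return out
}
```

Exposing the Dropped counter lets the server tell the client that output was skipped, which matters more for shells than for logs because dropped escape sequences can leave the terminal in an inconsistent state.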

Connection Limits

  • Per-user limits: max concurrent shells/logs per namespace
  • Global limits: max connections
  • Reject with 429 if limit exceeded
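
These limits can be enforced before the WebSocket upgrade with a counting limiter. The sketch below keys the per-user count by a single string; to get per-namespace limits, the key could combine user ID and namespace. `Limiter` and the `userOf` callback are hypothetical names:

```go
package main

import (
	"net/http"
	"sync"
)

// Limiter enforces a per-key and a global cap on concurrent connections.
type Limiter struct {
	mu      sync.Mutex
	perUser int
	global  int
	total   int
	byUser  map[string]int
}

func NewLimiter(perUser, global int) *Limiter {
	return &Limiter{perUser: perUser, global: global, byUser: make(map[string]int)}
}

// Acquire reserves one slot for user, or reports false if a limit is reached.
func (l *Limiter) Acquire(user string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.total >= l.global || l.byUser[user] >= l.perUser {
		return false
	}
	l.total++
	l.byUser[user]++
	return true
}

// Release frees one slot previously acquired for user.
func (l *Limiter) Release(user string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.byUser[user] == 0 {
		return
	}
	l.byUser[user]--
	l.total--
	if l.byUser[user] == 0 {
		delete(l.byUser, user)
	}
}

// Wrap rejects requests over the limit with 429 and holds the slot for as
// long as next runs, which for a WebSocket handler is the connection lifetime.
func (l *Limiter) Wrap(userOf func(*http.Request) string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user := userOf(r)
		if !l.Acquire(user) {
			http.Error(w, "connection limit exceeded", http.StatusTooManyRequests)
			return
		}
		defer l.Release(user)
		next.ServeHTTP(w, r)
	})
}
```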

Security Considerations

Authentication

  • JWT or session cookie
  • Validate token on WebSocket upgrade
  • Bind connections to user ID

Authorization

  • Namespace-level RBAC
  • User can only access resources they have permission for
  • Backend must enforce (not just client-side filtering)

Secret Handling

  • Never log decoded secrets
  • Sanitize before display
  • Consider explicit user action to reveal (future enhancement)

Scalability

Single-Node (Development)

  • In-memory Yjs provider
  • Redis for presence
  • Direct k8s API calls

Multi-Node (Production)

  • Shared Redis for Yjs CRDT state
  • WebSocket leader election (one node handles all CRDT updates)
  • Alternative: y-webrtc for peer-to-peer sync (simpler, less reliable)

Health Monitoring

  • WebSocket connection count
  • Active shell sessions
  • API latency (p50, p95, p99)
  • CRDT sync lag
  • Error rates by endpoint

Testing Strategy

  1. Unit tests for Kubernetes API wrappers
  2. Integration tests with kind (Kubernetes in Docker)
  3. Load tests for WebSocket connections
  4. CRDT conflict resolution tests