# Backend & Real-Time Sync Feature Specification

## Overview

The backend service aggregates Kubernetes API data and provides real-time WebSocket connections for shell, logs, and multi-user collaboration.

## Backend Architecture

### Recommended Stack

- **Language**: Go (for Kubernetes client integration)
- **Kubernetes Client**: kubernetes/client-go
- **WebSockets**: gorilla/websocket
- **CRDT Layer**: Yjs + y-websocket
- **Storage**: Redis or in-memory (single-node), y-leveldb/y-mongodb (multi-node)

### Service Responsibilities

1. Wrap Kubernetes API access (the same API that kubectl talks to)
2. Provide a WebSocket multiplexer for:
   - Shell sessions
   - Log streaming
   - Watch API updates
3. Handle CRDT sync for shared workspace state
4. Broadcast user presence/awareness
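
One way to multiplex several streams over a single socket is to wrap every frame in an envelope that names its logical channel. The sketch below is a minimal illustration using only the standard library; the `Envelope` fields, the channel names, and the `Mux` type are assumptions made for this example, not part of the spec.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical wire format: each frame names the logical
// channel it belongs to, so one socket can carry several streams.
type Envelope struct {
	Channel string          `json:"channel"` // e.g. "shell", "logs", "watch"
	ID      string          `json:"id"`      // stream ID chosen by the client
	Payload json.RawMessage `json:"payload"` // left undecoded for the handler
}

// Handler consumes the raw JSON payload of one frame.
type Handler func(id string, payload []byte) error

// Mux routes decoded frames to the handler registered for their channel.
type Mux struct {
	handlers map[string]Handler
}

func NewMux() *Mux { return &Mux{handlers: make(map[string]Handler)} }

// Handle registers the handler for one channel name.
func (m *Mux) Handle(channel string, h Handler) { m.handlers[channel] = h }

// Dispatch decodes one raw frame and forwards its payload.
func (m *Mux) Dispatch(frame []byte) error {
	var env Envelope
	if err := json.Unmarshal(frame, &env); err != nil {
		return fmt.Errorf("decode frame: %w", err)
	}
	h, ok := m.handlers[env.Channel]
	if !ok {
		return fmt.Errorf("unknown channel %q", env.Channel)
	}
	return h(env.ID, env.Payload)
}

func main() {
	mux := NewMux()
	mux.Handle("logs", func(id string, payload []byte) error {
		fmt.Printf("logs stream %s: %s\n", id, payload)
		return nil
	})
	frame := []byte(`{"channel":"logs","id":"a1","payload":{"pod":"web-0"}}`)
	if err := mux.Dispatch(frame); err != nil {
		fmt.Println("dispatch failed:", err)
	}
}
```

In a real service the frames would come from the WebSocket read loop; `Dispatch` stays independent of the transport so it can be unit-tested without a connection.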

## Kubernetes Data Sources

### Required Resources

- Namespaces
- Pods (needed by the shell, logs, and watch endpoints)
- Deployments
- StatefulSets
- DaemonSets
- Services
- Ingresses
- ConfigMaps
- Secrets (base64-decoded; see Secret Decoding)
- PersistentVolumeClaims (PVCs)
- Events
- CustomResourceDefinitions (CRDs)

### Live Updates

Implementation: Kubernetes Watch API

```
GET /api/v1/namespaces/{ns}/pods?watch=true
GET /api/v1/namespaces/{ns}/services?watch=true
GET /apis/apps/v1/namespaces/{ns}/deployments?watch=true
# etc. for each resource type
```
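
The paths above follow one rule: resources in the core group live under `/api/{version}`, all other groups under `/apis/{group}/{version}`. The helper below is a small sketch of that rule for namespaced resources; the function name `watchPath` is made up for this example, and a client-go based backend would normally let informers build these requests instead.

```go
package main

import "fmt"

// watchPath builds the REST path for a namespaced watch request.
// An empty group selects the core group, which is served under /api
// rather than /apis/{group}.
func watchPath(group, version, namespace, resource string) string {
	prefix := "/apis/" + group + "/" + version
	if group == "" {
		prefix = "/api/" + version
	}
	return fmt.Sprintf("%s/namespaces/%s/%s?watch=true", prefix, namespace, resource)
}

func main() {
	fmt.Println(watchPath("", "v1", "default", "pods"))
	// /api/v1/namespaces/default/pods?watch=true
	fmt.Println(watchPath("apps", "v1", "default", "deployments"))
	// /apis/apps/v1/namespaces/default/deployments?watch=true
}
```

Cluster-scoped resources such as Namespaces and CRDs have no `/namespaces/{ns}` segment and would need a separate branch.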

### Aggregation Strategy

- Single endpoint: `/api/watch` that multiplexes all watch streams
- Or: Resource-specific endpoints (`/api/watch/pods`, `/api/watch/services`, etc.)
- Include namespace filtering in requests

### Secret Decoding

- Option 1: Server-side decode before sending
- Option 2: Client-side decode (safer: decoded values never reach server logs)
- In either case: include a `decoded` flag in the response so the client knows whether values are still base64-encoded
- **Never log decoded values**
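
For the server-side option, one way to reduce the risk of accidental logging is a value type whose formatting methods redact its content. The sketch below assumes the backend handles the raw JSON form of a Secret, where `data` values are base64 strings (client-go's typed `Secret.Data` is already decoded). `SecretValue` and `decodeSecretData` are names invented for this example.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// SecretValue holds decoded secret bytes. Its String and GoString methods
// redact the content, so %v, %s, and %#v do not print the plaintext.
type SecretValue []byte

func (SecretValue) String() string   { return "[REDACTED]" }
func (SecretValue) GoString() string { return "[REDACTED]" }

// decodeSecretData base64-decodes every value of a Secret's data map.
func decodeSecretData(data map[string]string) (map[string]SecretValue, error) {
	out := make(map[string]SecretValue, len(data))
	for key, encoded := range data {
		raw, err := base64.StdEncoding.DecodeString(encoded)
		if err != nil {
			// Report the key only; the value must not end up in an error log.
			return nil, fmt.Errorf("secret key %q: invalid base64", key)
		}
		out[key] = SecretValue(raw)
	}
	return out, nil
}

func main() {
	decoded, err := decodeSecretData(map[string]string{"password": "aHVudGVyMg=="})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%v\n", decoded)                    // values print as [REDACTED]
	fmt.Println(len(decoded["password"]), "bytes") // explicit access still works
}
```

This is a guard against careless logging, not a guarantee: an explicit `string(v)` conversion or a numeric verb such as `%d` still exposes the bytes.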

## WebSocket Endpoints

### Shell Endpoint

```
/ws/shell
```

- **Upstream**: `kubectl exec -it <pod> -n <ns> -- sh`
- **Protocol**: Bidirectional WebSocket
  - Client → Server: Keystrokes, terminal resize events
  - Server → Client: stdout, stderr
- **Terminal**: xterm.js on client
- **Backend**: Use the Kubernetes Go client's exec subresource with streaming (`k8s.io/client-go/tools/remotecommand`)
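
The spec does not fix a wire format for the client-to-server direction. A minimal sketch, assuming JSON text frames with a `type` field of `stdin` or `resize` (both the format and the names below are illustrative assumptions), could route input like this:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// clientMessage is a hypothetical JSON frame sent by the browser.
type clientMessage struct {
	Type string `json:"type"`           // "stdin" or "resize"
	Data string `json:"data,omitempty"` // keystrokes for "stdin"
	Cols uint16 `json:"cols,omitempty"` // terminal width for "resize"
	Rows uint16 `json:"rows,omitempty"` // terminal height for "resize"
}

// shellSession receives decoded client input. In a real handler, Stdin would
// feed the exec stream and Resize would feed its terminal size queue.
type shellSession struct {
	Stdin  chan []byte
	Resize chan [2]uint16 // {cols, rows}
}

// handleFrame decodes one client frame and routes it to the right channel.
func (s *shellSession) handleFrame(frame []byte) error {
	var msg clientMessage
	if err := json.Unmarshal(frame, &msg); err != nil {
		return fmt.Errorf("decode frame: %w", err)
	}
	switch msg.Type {
	case "stdin":
		s.Stdin <- []byte(msg.Data)
	case "resize":
		if msg.Cols == 0 || msg.Rows == 0 {
			return errors.New("resize needs non-zero cols and rows")
		}
		s.Resize <- [2]uint16{msg.Cols, msg.Rows}
	default:
		return fmt.Errorf("unknown message type %q", msg.Type)
	}
	return nil
}

func main() {
	s := &shellSession{Stdin: make(chan []byte, 8), Resize: make(chan [2]uint16, 1)}
	_ = s.handleFrame([]byte(`{"type":"stdin","data":"ls\n"}`))
	_ = s.handleFrame([]byte(`{"type":"resize","cols":120,"rows":40}`))
	fmt.Printf("%q %v\n", <-s.Stdin, <-s.Resize)
}
```

The channels are buffered here so the demo does not block; a production handler would need a consumer goroutine and a policy for a full buffer.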

### Logs Endpoint

```
/ws/logs
```

- **Upstream**: `kubectl logs --follow <pod> -n <ns>`
- **Protocol**: WebSocket text frames
- **Streaming**: Each line as separate message
- **Client**: Auto-scroll unless user scrolled up

### Watch Endpoint

```
/ws/watch
```

- **Purpose**: Broadcast resource changes to all connected clients
- **Payload**: JSON Patch or full object update
- **Filtering**: Per-client namespace/resource filters
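
Per-client filtering can be sketched as a predicate that is evaluated for each event before it is sent. The `Event` and `Filter` types below are simplified stand-ins invented for this example; the event type strings mirror the Kubernetes watch event types.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a simplified resource change notification.
type Event struct {
	Type      string // "ADDED", "MODIFIED", or "DELETED"
	Kind      string // e.g. "Pod"
	Namespace string
	Name      string
}

// Filter describes what one client subscribed to. An empty set means "all".
type Filter struct {
	Namespaces map[string]bool
	Kinds      map[string]bool
}

// Matches reports whether the event passes the client's filter.
func (f Filter) Matches(e Event) bool {
	nsOK := len(f.Namespaces) == 0 || f.Namespaces[e.Namespace]
	kindOK := len(f.Kinds) == 0 || f.Kinds[e.Kind]
	return nsOK && kindOK
}

// fanOut returns the sorted IDs of the clients that should receive the event.
func fanOut(clients map[string]Filter, e Event) []string {
	var ids []string
	for id, f := range clients {
		if f.Matches(e) {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids) // map iteration order is random; sort for stable output
	return ids
}

func main() {
	clients := map[string]Filter{
		"alice": {Namespaces: map[string]bool{"dev": true}},
		"bob":   {Kinds: map[string]bool{"Service": true}},
	}
	e := Event{Type: "MODIFIED", Kind: "Pod", Namespace: "dev", Name: "web-0"}
	fmt.Println(fanOut(clients, e)) // [alice]
}
```

Filtering here is a convenience for the client, not an access check; authorization still has to be enforced separately on the server.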

## Multi-User Sync (The Yard)

### CRDT Strategy

- **Library**: Yjs (proven CRDT implementation)
- **Provider**:
  - Single-node: in-memory or Redis
  - Multi-node: y-leveldb, y-mongodb, or y-webrtc
- **WebSocket**: y-websocket for real-time sync

### Synced State

Only sync the *structure* of the workspace, not content:

- ✅ Krate positions (wx, wy)
- ✅ Krate existence (create/delete)
- ✅ Krate minimize state
- ✅ Window layout (grid positions, sizes)
- ✅ Room/canvas camera state (optional)
- ❌ Window content (logs, YAML, shell): each client streams independently

### Presence/Awareness

Broadcast ephemeral user state:

```
{
  userId: string,
  name: string,
  color: string,
  cursorPosition: { x: number, y: number } | null,
  activeKrateId: string | null,
  spotlightQuery: string | null,
  timestamp: number
}
```

Implementation:

- Each client publishes to the Yjs Awareness channel
- Admin panel subscribes to awareness updates
- Update interval: every 5 seconds or on significant change
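
If the Go backend ever needs to decode or re-broadcast presence payloads, the shape above maps onto a struct with pointer fields for the nullable members. The sketch below also shows one possible reading of "every 5 seconds or on significant change". Treating a change of active Krate or Spotlight query as significant, and the timestamp as Unix milliseconds, are assumptions of this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Point is a cursor position on the canvas.
type Point struct {
	X float64 `json:"x"`
	Y float64 `json:"y"`
}

// Presence mirrors the awareness payload; nil pointers encode as JSON null.
type Presence struct {
	UserID         string  `json:"userId"`
	Name           string  `json:"name"`
	Color          string  `json:"color"`
	CursorPosition *Point  `json:"cursorPosition"`
	ActiveKrateID  *string `json:"activeKrateId"`
	SpotlightQuery *string `json:"spotlightQuery"`
	Timestamp      int64   `json:"timestamp"` // assumed: Unix milliseconds
}

// heartbeatMillis is the 5-second update interval from the spec.
const heartbeatMillis = 5000

// sameString compares two optional strings, treating nil as a distinct value.
func sameString(a, b *string) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// shouldPublish reports whether next should be broadcast, given the last
// published state prev: either the heartbeat elapsed or something
// significant changed. Cursor movement alone is not treated as significant.
func shouldPublish(prev, next Presence) bool {
	if next.Timestamp-prev.Timestamp >= heartbeatMillis {
		return true
	}
	return !sameString(prev.ActiveKrateID, next.ActiveKrateID) ||
		!sameString(prev.SpotlightQuery, next.SpotlightQuery)
}

func main() {
	krate := "krate-1"
	p := Presence{UserID: "u1", Name: "Ada", Color: "#ff8800",
		ActiveKrateID: &krate, Timestamp: 1700000000000}
	out, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // unset pointer fields appear as null
}
```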

## Real-Time Shell Implementation

### Client-Side (xterm.js)

```
1. Connect WebSocket
2. Initialize xterm with {
     rows: 24,
     cols: 80,
     fontFamily: 'IBM Plex Mono',
     fontSize: 12
   }
3. Attach to DOM: term.open(container)
4. Attach xterm to WebSocket
   - term.onData() → send to server
   - WebSocket.onmessage → term.write()
5. Handle resize: term.resize(cols, rows), then send the new size to the server
```
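
### Server-Side (Go + k8s client)

```
1. Extract pod/ns from URL: /ws/shell?pod=xxx&ns=yyy
2. Build the exec request with client-go
   (corev1 = k8s.io/api/core/v1, scheme = k8s.io/client-go/kubernetes/scheme):
   req := clientset.CoreV1().RESTClient().Post().
       Resource("pods").Name(pod).Namespace(ns).SubResource("exec").
       VersionedParams(&corev1.PodExecOptions{
           Command: []string{"sh"},
           Stdin:   true,
           Stdout:  true,
           Stderr:  true,
           TTY:     true,
       }, scheme.ParameterCodec)
   // Create exec stream (k8s.io/client-go/tools/remotecommand);
   // restConfig is the *rest.Config the clientset was built from
   exec, err := remotecommand.NewSPDYExecutor(restConfig, "POST", req.URL())
3. Duplex: WebSocket ↔ Exec stream
   (exec.StreamWithContext with remotecommand.StreamOptions)
4. Handle close gracefully
```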

## Real-Time Logs Implementation

### Client-Side

```
1. Connect WebSocket
2. Maintain scroll position state
3. On new message:
   - Append to log buffer
   - If wasAtBottom: scroll to bottom
   - Else: keep existing scroll position
4. Colorize lines on receive (ERROR/WARN/INFO patterns)
```

### Server-Side

```
1. Follow logs via the pod log subresource (equivalent to kubectl logs --follow):
   clientset.CoreV1().Pods(ns).GetLogs(pod, &corev1.PodLogOptions{Follow: true}).Stream(ctx)
2. Stream lines as WebSocket messages
3. Add keep-alive ping every 30s
4. Reconnect on disconnect (client-side)
```

## Backend API Endpoints

### HTTP Endpoints (REST)

```
GET /api/cluster                # Cluster metadata
GET /api/resources              # All resources (cached, for initial load)
GET /api/resources/pods         # Filtered by namespace (optional)
GET /api/resources/deployments
GET /api/resources/services
GET /api/resources/secrets
GET /api/resources/configmaps
GET /api/resources/namespaces
GET /api/resources/crds
GET /api/resource/{kind}/{name}?ns={namespace}
GET /api/health                 # Backend health check
```
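
As a rough sketch of how two of these routes could be wired with the standard library, the example below parses `/api/resource/{kind}/{name}` by hand and echoes the parsed request instead of querying a cluster. The handler names and the response shapes are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// parseResourcePath extracts kind and name from "/api/resource/{kind}/{name}".
func parseResourcePath(path string) (kind, name string, ok bool) {
	const prefix = "/api/resource/"
	if !strings.HasPrefix(path, prefix) {
		return "", "", false
	}
	parts := strings.Split(strings.TrimPrefix(path, prefix), "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", false
	}
	return parts[0], parts[1], true
}

// resourceHandler answers GET /api/resource/{kind}/{name}?ns={namespace}.
// It only echoes the parsed request; a real handler would query the cluster.
func resourceHandler(w http.ResponseWriter, r *http.Request) {
	kind, name, ok := parseResourcePath(r.URL.Path)
	if !ok {
		http.NotFound(w, r)
		return
	}
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]string{
		"kind": kind, "name": name, "namespace": r.URL.Query().Get("ns"),
	})
}

// healthHandler answers GET /api/health with a static status document.
func healthHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprint(w, `{"status":"ok"}`)
}

func newRouter() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/health", healthHandler)
	mux.HandleFunc("/api/resource/", resourceHandler)
	return mux
}

func main() {
	// Exercise the router in-process instead of opening a port.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/resource/pods/web-0?ns=dev", nil)
	newRouter().ServeHTTP(rec, req)
	fmt.Print(rec.Code, " ", rec.Body.String())
}
```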

### WebSocket Endpoints

```
/ws/shell    # Shell session
/ws/logs     # Log streaming
/ws/watch    # Resource watch updates
/ws/sync     # CRDT sync + awareness
```

## Error Handling

### WebSocket Disconnects

- Shell: Notify user, keep terminal buffer (don't clear)
- Logs: Auto-reconnect with exponential backoff
- Watch: Auto-reconnect, fetch delta on reconnect
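
The reconnect itself happens in the browser, but the delay schedule is language-independent. For consistency with the rest of the backend it is sketched here in Go: the delay doubles on each attempt and is capped. The 500 ms base and 30 s cap are illustrative values, not taken from the spec.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseDelay = 500 * time.Millisecond // assumed delay before the first retry
	maxDelay  = 30 * time.Second       // assumed upper bound
)

// backoffDelay returns the wait before reconnect attempt n (0-based):
// baseDelay doubled n times, capped at maxDelay.
func backoffDelay(attempt int) time.Duration {
	d := baseDelay
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

func main() {
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, backoffDelay(attempt))
	}
}
```

Adding random jitter to each delay is a common refinement so that many clients do not reconnect at the same instant after a backend restart.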

### Backpressure

- Shell: Buffer output if the client is slow; drop the oldest lines if the buffer is full
- Logs: Same strategy
- Watch: Batch updates if high frequency
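
The "buffer, then drop the oldest lines" strategy can be sketched as a fixed-capacity ring buffer. `LineBuffer` is a name invented for this example, and the sketch is not safe for concurrent use without a lock around it.

```go
package main

import "fmt"

// LineBuffer is a fixed-capacity FIFO that discards the oldest line when full.
type LineBuffer struct {
	lines   []string
	head    int // index of the oldest buffered line
	size    int // number of buffered lines
	Dropped int // lines discarded so far
}

func NewLineBuffer(capacity int) *LineBuffer {
	return &LineBuffer{lines: make([]string, capacity)}
}

// Push appends a line, overwriting the oldest one if the buffer is full.
func (b *LineBuffer) Push(line string) {
	if len(b.lines) == 0 {
		b.Dropped++ // zero capacity: nothing can be kept
		return
	}
	if b.size == len(b.lines) {
		b.lines[b.head] = line
		b.head = (b.head + 1) % len(b.lines)
		b.Dropped++
		return
	}
	b.lines[(b.head+b.size)%len(b.lines)] = line
	b.size++
}

// Drain returns the buffered lines oldest-first and empties the buffer.
func (b *LineBuffer) Drain() []string {
	out := make([]string, 0, b.size)
	for i := 0; i < b.size; i++ {
		out = append(out, b.lines[(b.head+i)%len(b.lines)])
	}
	b.head, b.size = 0, 0
	return out
}

func main() {
	buf := NewLineBuffer(3)
	for _, l := range []string{"a", "b", "c", "d", "e"} {
		buf.Push(l)
	}
	fmt.Println(buf.Drain(), "dropped:", buf.Dropped) // [c d e] dropped: 2
}
```

Exposing the `Dropped` counter lets the server tell the client how many lines were skipped, so the UI can show a gap marker instead of silently losing output.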

### Connection Limits

- Per-user limits: max concurrent shells/logs per namespace
- Global limits: max connections
- Reject with HTTP 429 (Too Many Requests) if a limit is exceeded
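
A minimal sketch of such limits is a mutex-protected counter keyed by a string such as `user/namespace`, checked before the WebSocket upgrade. The `Limiter` type, the key format, and the limit values are assumptions of this example.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Limiter tracks open connections per key and in total.
type Limiter struct {
	mu        sync.Mutex
	perKey    map[string]int
	total     int
	maxPerKey int
	maxTotal  int
}

func NewLimiter(maxPerKey, maxTotal int) *Limiter {
	return &Limiter{perKey: make(map[string]int), maxPerKey: maxPerKey, maxTotal: maxTotal}
}

// Acquire reserves a slot for key. It returns false if either limit is reached.
func (l *Limiter) Acquire(key string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.total >= l.maxTotal || l.perKey[key] >= l.maxPerKey {
		return false
	}
	l.perKey[key]++
	l.total++
	return true
}

// Release frees a slot previously reserved with Acquire.
func (l *Limiter) Release(key string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.perKey[key] == 0 {
		return // nothing to release for this key
	}
	l.perKey[key]--
	l.total--
	if l.perKey[key] == 0 {
		delete(l.perKey, key)
	}
}

// Guard rejects requests with 429 when no slot is free. The slot is held
// until next returns, which assumes next blocks for the connection lifetime.
func (l *Limiter) Guard(keyOf func(*http.Request) string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := keyOf(r)
		if !l.Acquire(key) {
			http.Error(w, "connection limit exceeded", http.StatusTooManyRequests)
			return
		}
		defer l.Release(key)
		next.ServeHTTP(w, r)
	})
}

func main() {
	l := NewLimiter(2, 100)
	fmt.Println(l.Acquire("alice/dev"), l.Acquire("alice/dev"), l.Acquire("alice/dev"))
	// true true false
}
```

The 429 has to be sent before the upgrade completes; once the connection is a WebSocket, the only way to refuse it is a close frame.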

## Security Considerations

### Authentication

- JWT or session cookie
- Validate token on WebSocket upgrade
- Bind connections to user ID

### Authorization

- Namespace-level RBAC
- Users can only access resources they have permission for
- The backend must enforce this (client-side filtering alone is not sufficient)

### Secret Handling

- Never log decoded secrets
- Sanitize before display
- Consider explicit user action to reveal (future enhancement)

## Scalability

### Single-Node (Development)

- In-memory Yjs provider
- Redis for presence
- Direct k8s API calls

### Multi-Node (Production)

- Shared Redis for Yjs CRDT state
- WebSocket leader election (one node handles all CRDT updates)
- Or: y-webrtc for peer-to-peer sync (simpler, less reliable)

## Health Monitoring

- WebSocket connection count
- Active shell sessions
- API latency (p50, p95, p99)
- CRDT sync lag
- Error rates by endpoint
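
For the latency percentiles, a simple starting point is the nearest-rank method over a window of recorded samples. The sketch below works on latencies in milliseconds. The function name and the in-memory sample slice are assumptions; a production setup would more likely use histograms from a metrics library.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of the samples using
// the nearest-rank method. It returns 0 for an empty slice.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // leave the input untouched
	sort.Float64s(sorted)
	// Multiply before dividing so whole-number ranks stay exact.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	latenciesMs := []float64{12, 15, 11, 240, 18, 14, 13, 16, 95, 17}
	for _, p := range []float64{50, 95, 99} {
		fmt.Printf("p%.0f = %.0f ms\n", p, percentile(latenciesMs, p))
	}
}
```

With only ten samples, p95 and p99 both resolve to the slowest request, which is why tail percentiles need a reasonably large window to be meaningful.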

## Testing Strategy

1. Unit tests for Kubernetes API wrappers
2. Integration tests with kind (Kubernetes in Docker)
3. Load tests for WebSocket connections
4. CRDT conflict resolution tests