NovaVantage Backend Reference

Logging · Observability · Operations

NovaVantage Backend

Authoritative reference for logging, observability & operations under apps/backend/src

v1.0.0
🛡️
Entry Points
Honeypot, Correlation Middleware, JWT Verification, GeoIP, CLS Store
5 components
📋
Core Processing
NovaLogger, Winston Pipeline, HTTP Interceptor, Exception Filters, Client Bridge
5 components
🔭
Observability
Discord, MongoDB Audit, Pulse Anomaly, Alert Dispatcher, PII Redaction
5 components
📊
Analytics
Financial Tracker, Frustration Detector, AI Predictor, Circuit Observer
4 components
🔐
Security
Security Fingerprint, Black Box, Ghost Mode, GDPR Sanitizer
4 components
♻️
Self-Healing
SelfHealingService, CircuitObserver, StateSnapshotService
3 components

Mental Model

Anything that calls NovaLogger during a request automatically merges CLS context (correlation ID, user, geo, route) into every Winston log line. Separately, AlertTransport forwards error- and fatal-level logs to AlertDispatcherService, which deduplicates them by error signature and dispatches to email and Monday.com.

Request flow: Inbound → Honeypot → CorrelationMiddleware (IP/Geo/JWT) → NestJS → CLS enrichment → NovaLogger → Winston → [Console · File · Discord · MongoDB · Alert]
HoneypotMiddleware
middleware/honeypot.middleware.ts
Security Fully Wired

Purpose

Detects scanners and automated attacks by trapping requests to known exploit paths before they reach any real route handler. Applied globally via LoggerModule.configure — runs first, before CorrelationMiddleware.

Behavior

Compares req.path.toLowerCase() against the HONEYPOT_PATHS set (e.g. /.env, /.git, /wp-admin, /phpmyadmin, /actuator). On match: logs a warn with tags HONEYPOT_HIT, SECURITY, plus honeypotPath, ip, userAgent, method. Responds 404 with { success: false, data: null, error: 'Not Found' }. Does not call next() — request ends here.
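
A minimal sketch of the path check described above. The set below only mirrors the examples in the text (the real HONEYPOT_PATHS is likely longer), exact set membership is assumed, and trap() is a hypothetical helper standing in for the middleware body.

```typescript
// Hypothetical subset of HONEYPOT_PATHS; the real set lives in the codebase.
const HONEYPOT_PATHS = new Set<string>([
  "/.env", "/.git", "/wp-admin", "/phpmyadmin", "/actuator",
]);

// Case-insensitive membership test, mirroring req.path.toLowerCase().
function isHoneypotPath(path: string): boolean {
  return HONEYPOT_PATHS.has(path.toLowerCase());
}

// Body the text says trapped requests receive, together with HTTP 404.
const HONEYPOT_RESPONSE = { success: false, data: null, error: "Not Found" } as const;

interface TrapResult { status: number; body: typeof HONEYPOT_RESPONSE }

// Returns the 404 payload for a trap path, or null when the request should continue.
function trap(path: string): TrapResult | null {
  return isHoneypotPath(path) ? { status: 404, body: HONEYPOT_RESPONSE } : null;
}
```

In a real middleware, a non-null result would be written to the response without calling next().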

CorrelationMiddleware
middleware/correlation.middleware.ts
Middleware Fully Wired

Steps (in order)

1. IP: reads x-forwarded-for first segment, else req.socket.remoteAddress.
2. Sets CLS: ip, userAgent, route (originalUrl), method.
3. Geo: GeoIpService.lookup(ip) → sets geoCountry, geoCity, geoFlag.
4. JWT: If Authorization: Bearer present — verifies with JWT_SECRET. On success sets userId, studentId, isGhostAdmin. On failure: silent (no throw).
5. Entity IDs: EntityContextService.extract(req) → sets entityIds on CLS if non-empty.
6. Calls next().

Runs after HoneypotMiddleware. It enriches the CLS store created by ClsModule — it does NOT replace the correlationId set at CLS mount.
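
Step 1 can be sketched as a pure function. The Headers type and the "unknown" fallback are assumptions made for illustration; the source only specifies the x-forwarded-for first segment and the socket address.

```typescript
type Headers = Record<string, string | string[] | undefined>;

// First x-forwarded-for segment wins; otherwise fall back to the socket address.
function resolveClientIp(headers: Headers, remoteAddress?: string): string {
  const raw = headers["x-forwarded-for"];
  const forwarded = Array.isArray(raw) ? raw[0] : raw;
  const first = forwarded?.split(",")[0]?.trim();
  // "unknown" is an assumed placeholder for the case where neither source is set.
  return first || remoteAddress || "unknown";
}
```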
JWT Verification
Part of CorrelationMiddleware
Security Fully Wired

Fields extracted from JWT payload

CLS Field     JWT Source
userId        payload.userId or payload.sub
studentId     payload.studentId
isGhostAdmin  payload.ghostMode === true

Verification failure is silent: the auth layer elsewhere handles unauthorized access. This middleware only enriches context when a valid token exists.
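
The field mapping above, sketched for an already-verified payload. Token verification itself is omitted, the interface names are hypothetical, and using ?? for the userId fallback is an assumption about how "or" is implemented.

```typescript
interface JwtPayload { userId?: string; sub?: string; studentId?: string; ghostMode?: unknown }
interface ClsAuthFields { userId?: string; studentId?: string; isGhostAdmin: boolean }

// Maps a verified JWT payload to the CLS fields listed in the table.
function toClsAuthFields(payload: JwtPayload): ClsAuthFields {
  return {
    userId: payload.userId ?? payload.sub,
    studentId: payload.studentId,
    // Strict comparison: only the boolean true enables ghost mode.
    isGhostAdmin: payload.ghostMode === true,
  };
}
```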
GeoIpService
services/geo-ip.service.ts
Utility Fully Wired

Resolution logic

Uses geoip-lite (offline DB). Normalizes IPv4-mapped IPv6 addresses (::ffff:x). Private IPs return { country: 'LOCAL', city: 'localhost', flag: '🏠' }. Public IPs return ISO country, city, and a flag emoji derived from regional indicator codepoints.
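
A sketch of the normalization and flag derivation described above. The geoip-lite lookup itself is left out, and the private-range check covers only loopback and RFC 1918 ranges, which is an assumption about what the service treats as private.

```typescript
// Strip the IPv4-mapped IPv6 prefix, e.g. "::ffff:10.0.0.1" -> "10.0.0.1".
function normalizeIp(ip: string): string {
  return ip.toLowerCase().startsWith("::ffff:") ? ip.slice(7) : ip;
}

// Loopback and RFC 1918 ranges only; the real check may cover more.
function isPrivateIp(ip: string): boolean {
  if (ip === "::1" || ip.startsWith("127.") || ip.startsWith("10.") || ip.startsWith("192.168.")) {
    return true;
  }
  const match = /^172\.(\d+)\./.exec(ip);
  return match !== null && Number(match[1]) >= 16 && Number(match[1]) <= 31;
}

// Two-letter ISO code -> flag emoji via regional indicator codepoints.
function countryToFlag(iso: string): string {
  const base = 0x1f1e6; // REGIONAL INDICATOR SYMBOL LETTER A
  return iso
    .toUpperCase()
    .split("")
    .map((ch) => String.fromCodePoint(base + ch.charCodeAt(0) - 65))
    .join("");
}
```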

CLS Store (NovaClsStore)
core/cls-setup.service.ts · nestjs-cls (AsyncLocalStorage)
Core Fully Wired

Fields on the CLS store

Field                            Meaning
correlationId                    UUID per request (set at CLS mount in AppModule)
requestStartTime                 Date.now() at request start
userId                           From JWT userId or sub
studentId                        From JWT studentId
geoCountry / geoCity / geoFlag   From GeoIpService
ip / userAgent / route / method  Request metadata
isGhostAdmin                     true if JWT has ghostMode: true
entityIds                        Map of extracted route/query IDs
isBlackBoxTarget                 For forcing DEBUG verbosity on a user
NovaLogger
core/nova-logger.service.ts
Core Fully Wired

Log levels & methods

Method      Level       Use
log / info  info (3)    General information
warn        warn (2)    Warnings, shadow exceptions, security
error       error (1)   Errors; triggers alert transport
fatal       fatal (0)   Catastrophic; triggers all alert channels
debug       debug (4)   Verbose diagnostics
system      system (2)  Boot, deployment, operational banners

Automatic context merging

Every write() call builds LogContext via getContext() and passes it to Winston. This means correlationId, userId, geo, route and all CLS fields are merged automatically — you never need to pass them manually when logging inside a request.

setLevel(level) mutates Winston level at runtime — used by the Discord /logs-level slash command.
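
The automatic merge can be illustrated with Node's AsyncLocalStorage, which nestjs-cls builds on. This is a simplified sketch, not the NovaLogger implementation: RequestContext and buildEntry are hypothetical names, and letting explicit metadata override context fields is an assumption.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Simplified stand-in for NovaClsStore with only a few of the documented fields.
interface RequestContext { correlationId: string; userId?: string; route?: string }

const als = new AsyncLocalStorage<RequestContext>();

// Merges whatever context is active into the log entry.
// Assumption: explicitly passed metadata wins on key conflicts.
function buildEntry(
  level: string,
  message: string,
  meta: Record<string, unknown> = {},
): Record<string, unknown> {
  return { level, message, ...(als.getStore() ?? {}), ...meta };
}

// Inside run(), every call down the async chain sees the same store.
const entry = als.run({ correlationId: "req-1", userId: "u-42" }, () =>
  buildEntry("info", "profile loaded", { tags: ["PROFILE"] }),
);
```

Outside of run() there is no active store, so the same call produces an entry without request fields.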
Winston Pipeline
core/winston.config.ts
Core Fully Wired

Transports (all attached to main logger)

Transport              Details
Console                consoleFormat with chalk colors per level
Daily Rotate File      logs/nova-%DATE%.log · max 20MB · 30-day retention · zipped archive
Error-only File        logs/nova-errors-%DATE%.log · error level only · 60-day retention
DiscordTransport       Silent if no DISCORD_BOT_TOKEN. Buffers 100 items until channel resolver is set
MongoDbAuditTransport  Buffers 200 entries. Note: modelResolver may be unset; logs may not persist until bound
AlertTransport         level: error. Invokes AlertDispatcherService for error/fatal events

exitOnError: false, handleExceptions: true, handleRejections: true. Uncaught errors/rejections are logged instead of crashing the process.
HttpLoggingInterceptor
interceptors/http-logging.interceptor.ts
Interceptor Fully Wired

Behavior

Skips non-HTTP contexts. On first request: calls BootBannerService.recordFirstRequest() to record TTFR on the deployment fingerprint.

Event                   Log level  Tags
Request in              info       REQUEST_IN
Response out (success)  info       RESPONSE_OUT + durationMs + statusCode
Slow request (>500ms)   warn       🐢 SLOW_REQUEST
Response error          error      RESPONSE_ERROR
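
A sketch of how the response-side rows of the table might be chosen. The precedence (error first, then slow, then normal) and the idea that a slow response replaces rather than supplements RESPONSE_OUT are assumptions; classifyResponse is a hypothetical helper.

```typescript
const SLOW_REQUEST_THRESHOLD_MS = 500; // value from the constants reference

interface HttpLogDecision { level: "info" | "warn" | "error"; tag: string }

// Picks the log level and tag for a finished request.
function classifyResponse(durationMs: number, failed: boolean): HttpLogDecision {
  if (failed) return { level: "error", tag: "RESPONSE_ERROR" };
  // Strictly greater than the threshold, matching ">500ms" in the table.
  if (durationMs > SLOW_REQUEST_THRESHOLD_MS) return { level: "warn", tag: "SLOW_REQUEST" };
  return { level: "info", tag: "RESPONSE_OUT" };
}
```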
Exception Filters
filters/all-exceptions.filter.ts · filters/domain-exception.filter.ts
Filters Fully Wired

AllExceptionsFilter (500+)

Catches everything for HTTP. Status from HttpException or 500. If status ≥ 500: logger.fatal with tags UNHANDLED_EXCEPTION, FATAL, stack, path, method. Response: { success: false, data: null, error: message }.

DomainExceptionFilter (4xx — "Shadow")

Catches only HttpException. For status 400–499: logger.warn with 👻 [SHADOW], tags SHADOW_EXCEPTION, full stack. Implements "shadow tracking" — client errors are tracked without treating them as server crashes.

Client Bridge
POST /api/logs/client · ClientBridgeController + ClientBridgeService
Core Fully Wired

Frontend log ingestion pipeline

1. Validates payload via Zod (clientLogPayloadSchema): level, message (1–2000 chars), optional stack, context, rrwebEvents (max 500).
2. PiiRedactorService.redact(payload) on full payload.
3. Logs via NovaLogger at appropriate level with tags CLIENT_LOG.
4. If rrwebEvents non-empty and userId set: stores session via RrwebSessionService (in-memory Map, max 500 sessions).
5. Returns { correlationId: "client-{Date.now()}", replayUrl }.

RrwebSessionService is in-memory only — no Mongo/S3 persistence yet. The /api/logs/replay/:id route does not exist yet; replayUrl is a future contract.
Discord Integration
discord/ module — multiple services
Observability Fully Wired

Services

Service                  Role
DiscordBotService        Bot login, env-aware channel resolution (dev vs prod). Sets channel resolver on DiscordTransport.
DiscordFeedService       sendLog (drops if isGhostAdmin), sendEmbed, sendAlert to configured channels.
DiscordDashboardService  setInterval 30s → updates Dependency Health + Traffic Hub embeds.
DiscordIncidentsService  Alert embeds with Acknowledge / Mute 1hr / Create Ticket buttons.

Slash Commands

Command           Access  What it does
/server-health    public  StateSnapshot + deployment fingerprint embed
/logs-level       owner   NovaLogger.setLevel() at runtime
/logs-sample      owner   Updates in-memory sampling rate
/debug-user       devops  BlackBoxService.activate(userId) for 10 min
/user-trace       devops  AuditService.getByActor(): last 10 actions
/export-trace     devops  getRecent(200) filtered by correlationId → JSON file
/run-diagnostics  devops  Parallel: Mongo ping, Redis, Discord WS, Resend, Monday, OpenAI
MongoDB Audit
audit/audit.schema.ts · AuditService · MongoDbAuditTransport
Storage Wired, model resolver unset

Schema — collection: audit_logs

Field          Type               Notes
action         string (required)  Stable action string e.g. logs.level_changed
actor          object             type: user|admin|discord|system · id · name?
metadata       object             default {}
source         enum               api | discord | system
correlationId  string?            optional
createdAt      Date               TTL index: 90 days auto-delete

Indexes

actor.id + createdAt · action + createdAt · TTL on createdAt (expireAfterSeconds: 7,776,000)

MongoDbAuditTransport: setModelResolver is never called, so buffered logs (up to 200) are not persisted to Mongo until you bind a Mongoose model to this transport.
PulseAnomalyService
analytics/pulse-anomaly.service.ts
Analytics Needs recordRequest() calls, runs every minute

How it works

Call recordRequest() to increment a per-minute counter. The @Cron(EVERY_MINUTE) checkPulse method: rolls the minute bucket, skips if fewer than 10 buckets, computes baseline as average of all buckets, compares latest minute to baseline.

If deviation > 40%: logs 📈 SPIKE or 📉 DROP with tag PULSE_ANOMALY. Keeps up to 1440 snapshots (~24h) in history.

recordRequest() is not called from HttpLoggingInterceptor — you must wire it from middleware/interceptor to get real traffic baselines.
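
The baseline comparison can be sketched as a pure function over the minute buckets. Including the latest minute in the baseline follows "average of all buckets"; returning null for a zero baseline is an assumption, and checkPulse here is a standalone illustration rather than the cron method itself.

```typescript
const PULSE_DEVIATION_THRESHOLD_PERCENT = 40;
const MIN_BUCKETS = 10;

type PulseResult = { kind: "SPIKE" | "DROP"; deviationPercent: number } | null;

// buckets: per-minute request counts, oldest first; the last entry is the latest minute.
function checkPulse(buckets: number[]): PulseResult {
  if (buckets.length < MIN_BUCKETS) return null; // not enough history yet
  const baseline = buckets.reduce((sum, n) => sum + n, 0) / buckets.length;
  if (baseline === 0) return null; // assumption: no traffic means nothing to compare
  const latest = buckets[buckets.length - 1];
  const deviationPercent = ((latest - baseline) / baseline) * 100;
  if (Math.abs(deviationPercent) <= PULSE_DEVIATION_THRESHOLD_PERCENT) return null;
  return { kind: deviationPercent > 0 ? "SPIKE" : "DROP", deviationPercent };
}
```

For nine minutes at 100 requests followed by one at 300, the baseline is 120 and the latest minute deviates by 150%, which is reported as a spike.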
AlertDispatcherService
alerts/alert-dispatcher.service.ts
Alerts Fully Wired

dispatch(alert) flow

1. Key = alert.errorSignature ?? alert.title.
2. rateLimiter.shouldAlert(signature) — if false, muted.
3. If severity === CRITICAL: stateSnapshot.capture('fatal').
4. Expand ALL channels → Discord + Email + Monday.
5. Parallel Promise.allSettled: Resend (email), Monday.com.

Discord case in the channel switch is empty — Discord alerting for structured dispatches uses DiscordIncidentsService or the Winston DiscordTransport separately.

Alert Rate Limiter

Per signature: within 60 seconds, if more than 3 alerts → mute for another 60s. muteSignature(sig, 3_600_000) used by Discord's "Mute 1hr" button.
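
One way to read the rule above, as a sketch with an injectable clock. The class shape and the choice to count the muted alert toward the window are assumptions; only shouldAlert, muteSignature, and the two constants come from the source.

```typescript
const ALERT_RATE_LIMIT_COUNT = 3;
const ALERT_RATE_LIMIT_WINDOW_MS = 60_000;

class AlertRateLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly mutedUntil = new Map<string, number>();

  // The clock is injectable so the behavior can be checked without waiting.
  constructor(private readonly now: () => number = () => Date.now()) {}

  shouldAlert(signature: string): boolean {
    const t = this.now();
    if ((this.mutedUntil.get(signature) ?? 0) > t) return false;
    // Keep only hits inside the sliding window, then record this one.
    const recent = (this.hits.get(signature) ?? []).filter(
      (ts) => t - ts < ALERT_RATE_LIMIT_WINDOW_MS,
    );
    recent.push(t);
    this.hits.set(signature, recent);
    if (recent.length > ALERT_RATE_LIMIT_COUNT) {
      this.muteSignature(signature, ALERT_RATE_LIMIT_WINDOW_MS);
      return false;
    }
    return true;
  }

  // Also usable for longer mutes, e.g. 3_600_000 ms for "Mute 1hr".
  muteSignature(signature: string, durationMs: number): void {
    this.mutedUntil.set(signature, this.now() + durationMs);
  }
}
```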

Error Signature deduplication

ErrorSignatureService.evaluate(error): strips line numbers + node_modules frames, SHA-256 hex slice(0,16). New errors → escalate to ALL channels. Known errors → Discord only.
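
A sketch of the signature step using Node's crypto module. The normalization regex and the in-memory set of known signatures are assumptions; the source only states that line numbers and node_modules frames are stripped before hashing.

```typescript
import { createHash } from "node:crypto";

// Normalizes a stack trace so the same bug hashes identically across builds.
function errorSignature(error: Error): string {
  const normalized = (error.stack ?? error.message)
    .split("\n")
    .filter((line) => !line.includes("node_modules")) // drop dependency frames
    .map((line) => line.replace(/:\d+(:\d+)?/g, "").trim()) // drop :line:col suffixes
    .join("\n");
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}

// Hypothetical in-memory registry; the real service may persist this elsewhere.
const knownSignatures = new Set<string>();

// New signatures would escalate to all channels, known ones to Discord only.
function evaluate(error: Error): { signature: string; isNew: boolean } {
  const signature = errorSignature(error);
  const isNew = !knownSignatures.has(signature);
  knownSignatures.add(signature);
  return { signature, isNew };
}
```

Note that the simple regex also strips port-like suffixes such as :3000 from messages, which is acceptable for grouping but lossy.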

PII Redaction & Payload Truncation
services/pii-redactor.service.ts · services/payload-truncator.service.ts
Utility Manual use recommended

PiiRedactorService

Recursively walks objects (max depth 10). Keys matching PII_SENSITIVE_KEYS (case-insensitive) → [REDACTED]. Strings: regex replace for email, phone, IPv4 patterns → [REDACTED]. Strings > 5120 bytes truncated first. Used automatically by ClientBridgeService and GdprSanitizerService.
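
A reduced sketch of the recursive walk. The key list and regexes are illustrative assumptions (the real PII_SENSITIVE_KEYS and patterns are defined in the codebase), phone matching and the 5120-byte pre-truncation are left out, and the "[MAX_DEPTH]" marker is an assumption about how the depth cap surfaces.

```typescript
// Hypothetical key list; the real PII_SENSITIVE_KEYS is defined in the codebase.
const PII_SENSITIVE_KEYS = new Set(["password", "token", "email", "phone"]);
const MAX_DEPTH = 10;
const EMAIL_RE = /[\w.+-]+@[\w-]+(\.[\w-]+)+/g;
const IPV4_RE = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;

function redact(value: unknown, depth = 0): unknown {
  if (typeof value === "string") {
    return value.replace(EMAIL_RE, "[REDACTED]").replace(IPV4_RE, "[REDACTED]");
  }
  if (value === null || typeof value !== "object") return value;
  if (depth >= MAX_DEPTH) return "[MAX_DEPTH]"; // assumed marker for the depth cap
  if (Array.isArray(value)) return value.map((item) => redact(item, depth + 1));
  const out: Record<string, unknown> = {};
  for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
    // Sensitive keys are replaced wholesale; everything else is walked.
    out[key] = PII_SENSITIVE_KEYS.has(key.toLowerCase())
      ? "[REDACTED]"
      : redact(inner, depth + 1);
  }
  return out;
}
```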

PayloadTruncatorService

Deep truncation: max depth 8, strings 5120 chars, arrays max 50 items, objects max 100 keys, Buffer summarized. Not auto-applied to every Winston log — call manually when logging large objects.

FinancialTrackerService
analytics/financial-tracker.service.ts
Analytics Needs manual wiring

trackApiCost(userId, service, estimatedCost)

Rolling hourly bucket per user. Logs each track with tag COST_TRACKED. If hourly total > $10 USD: logs 💸 [FINANCIAL_ALERT]. Call from code that wraps OpenAI / Resend / etc. with estimated USD cost.
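
A sketch of the hourly bucket under one interpretation: buckets are aligned to clock hours and reset when the hour changes. The class name, the CostResult shape, and the alignment choice are assumptions; the real service may use a true rolling window.

```typescript
const FINANCIAL_HOURLY_THRESHOLD_USD = 10;
const HOUR_MS = 3_600_000;

interface CostResult { service: string; hourlyTotalUsd: number; overThreshold: boolean }

class HourlyCostTracker {
  private readonly buckets = new Map<string, { hourStart: number; totalUsd: number }>();

  constructor(private readonly now: () => number = () => Date.now()) {}

  trackApiCost(userId: string, service: string, estimatedCost: number): CostResult {
    const hourStart = Math.floor(this.now() / HOUR_MS) * HOUR_MS;
    const bucket = this.buckets.get(userId);
    // Carry the running total only if the stored bucket belongs to the current hour.
    const carried = bucket && bucket.hourStart === hourStart ? bucket.totalUsd : 0;
    const hourlyTotalUsd = carried + estimatedCost;
    this.buckets.set(userId, { hourStart, totalUsd: hourlyTotalUsd });
    return {
      service,
      hourlyTotalUsd,
      overThreshold: hourlyTotalUsd > FINANCIAL_HOURLY_THRESHOLD_USD,
    };
  }
}
```

A caller wrapping a paid API would log the financial alert whenever overThreshold is true.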

FrustrationDetectorService
analytics/frustration-detector.service.ts
Analytics Needs manual wiring

recordUserError(userId)

Keeps timestamps in a 2-minute (120,000ms) sliding window. If count ≥ 5 errors and not yet alerted: logs 😤 [FRUSTRATION]. Resets alerted when count drops back below threshold. Call from shadow filter or auth failure handler.
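
The sliding window and the alerted flag can be sketched as follows. The class name and the boolean return value are assumptions made so the behavior is observable; the real method logs instead of returning.

```typescript
const FRUSTRATION_ERROR_COUNT = 5;
const FRUSTRATION_WINDOW_MS = 120_000;

class FrustrationWindow {
  private readonly errors = new Map<string, number[]>();
  private readonly alerted = new Set<string>();

  constructor(private readonly now: () => number = () => Date.now()) {}

  // Returns true only on the call that first crosses the threshold.
  recordUserError(userId: string): boolean {
    const t = this.now();
    const recent = (this.errors.get(userId) ?? []).filter(
      (ts) => t - ts < FRUSTRATION_WINDOW_MS,
    );
    recent.push(t);
    this.errors.set(userId, recent);
    if (recent.length < FRUSTRATION_ERROR_COUNT) {
      this.alerted.delete(userId); // count dropped below threshold: re-arm
      return false;
    }
    if (this.alerted.has(userId)) return false; // already alerted for this burst
    this.alerted.add(userId);
    return true;
  }
}
```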

AiPredictorService
analytics/ai-predictor.service.ts
Analytics Daily 6AM

Daily OpenAI analysis

Requires OPENAI_API_KEY env var. Every day at 06:00: loads 100 recent audit docs via AuditService.getRecent(100), builds a text summary of actions + metadata snippets, calls OpenAI gpt-4o-mini with a DevOps-style system prompt. Logs result as system with tag 🔮 [PREDICTIVE_WARNING].

SecurityFingerprintService
security/security-fingerprint.service.ts
Security Needs manual wiring

recordFailedAuth(ip, userId?)

5-minute sliding window per IP. Each failure logs 🛡️ [SECURITY]. At ≥ 5 failures: logs 🚨 brute-force style error. Call from your auth failure path.

BlackBoxService
services/black-box.service.ts
Debug markCurrentRequest not auto-called

API

Method                    Behavior
activate(userId, reason)  Stores target for 10 minutes (BLACK_BOX_RECORDING_DURATION_MS = 600,000)
isActive(userId)          Returns true + expires old entries
markCurrentRequest()      If CLS userId is active, sets cls isBlackBoxTarget = true

markCurrentRequest() is not called from CorrelationMiddleware today; flagging alone does not change log level automatically. The Discord /debug-user command calls activate() directly.
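
A sketch of the expiry logic plus one possible way to close the wiring gap noted above. BlackBox and markRequest are hypothetical stand-ins: the real service reads the CLS store itself, whereas here the store is passed in to keep the example self-contained.

```typescript
const BLACK_BOX_RECORDING_DURATION_MS = 600_000;

class BlackBox {
  private readonly targets = new Map<string, { reason: string; expiresAt: number }>();

  constructor(private readonly now: () => number = () => Date.now()) {}

  activate(userId: string, reason: string): void {
    this.targets.set(userId, {
      reason,
      expiresAt: this.now() + BLACK_BOX_RECORDING_DURATION_MS,
    });
  }

  // Lazily evicts the entry once its recording window has passed.
  isActive(userId: string): boolean {
    const target = this.targets.get(userId);
    if (!target) return false;
    if (target.expiresAt <= this.now()) {
      this.targets.delete(userId);
      return false;
    }
    return true;
  }
}

// Hypothetical wiring: flag the request when its user is an active target.
function markRequest(
  store: { userId?: string; isBlackBoxTarget?: boolean },
  box: BlackBox,
): void {
  if (store.userId && box.isActive(store.userId)) store.isBlackBoxTarget = true;
}
```

A logger would still need to check isBlackBoxTarget and lower its level for that request; the flag by itself changes nothing.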
GhostModeService
security/ghost-mode.service.ts
Security Wired via JWT

Behavior

isGhostAdmin() reads CLS isGhostAdmin (from JWT ghostMode). activateForCurrentRequest() sets CLS flag manually. DiscordFeedService.sendLog drops messages when isGhostAdmin is true — reduces noise for admin ghost actions.

Winston DiscordTransport does not filter ghost in the same way — consistency depends on which path emits the message.
GdprSanitizerService
security/gdpr-sanitizer.service.ts
Compliance Daily 3AM

scrubExpiredPii (cron)

Finds audit docs with createdAt older than 14 days (batch 1000). Redacts metadata via PiiRedactorService, then updateOne.

purgeUserData(userId)

deleteMany on actor.id. Returns count. Logs purge action for audit trail.

SelfHealingService
self-healing/self-healing.service.ts
Self-Healing Needs manual wiring

API

Method                            Behavior
registerStrategy(name, async fn)  Stores a recovery function for the named service
attemptRecovery(name, reason)     Logs phases: Analysis → Strategy → Execution → Verification. Invokes strategy. On success: updates CircuitObserver to CLOSED.
onCircuitOpen(name, reason)       If a strategy exists for this service, starts recovery automatically

onCircuitOpen is not auto-wired to CircuitObserverService: the observer only logs state changes; it does not call self-heal automatically. You must wire this from your circuit breaker library.
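
The manual wiring could look roughly like this. Both classes are simplified stand-ins, and onChange is a hypothetical hook: the documented observer only logs and exposes no listener API, so in practice the call to onCircuitOpen would come from your circuit breaker's own open event.

```typescript
type CircuitState = "OPEN" | "HALF_OPEN" | "CLOSED";
type Strategy = () => Promise<void>;
type Listener = (name: string, state: CircuitState, reason?: string) => void;

class CircuitObserver {
  private readonly states = new Map<string, CircuitState>();
  private readonly listeners: Listener[] = [];

  // Hypothetical hook, not part of the documented service.
  onChange(listener: Listener): void { this.listeners.push(listener); }

  reportState(name: string, state: CircuitState, reason?: string): void {
    if (this.states.get(name) === state) return; // only react to actual changes
    this.states.set(name, state);
    this.listeners.forEach((listener) => listener(name, state, reason));
  }

  getState(name: string): CircuitState | undefined { return this.states.get(name); }
}

class SelfHealing {
  private readonly strategies = new Map<string, Strategy>();

  constructor(private readonly observer: CircuitObserver) {}

  registerStrategy(name: string, fn: Strategy): void { this.strategies.set(name, fn); }

  // Runs the registered strategy; on success marks the circuit CLOSED.
  async attemptRecovery(name: string, reason: string): Promise<boolean> {
    const strategy = this.strategies.get(name);
    if (!strategy) return false;
    try {
      await strategy();
      this.observer.reportState(name, "CLOSED", `recovered after: ${reason}`);
      return true;
    } catch {
      return false;
    }
  }

  onCircuitOpen(name: string, reason: string): Promise<boolean> {
    return this.attemptRecovery(name, reason);
  }
}
```

With these stand-ins, the wiring is a single listener: observer.onChange((name, state, reason) => { if (state === "OPEN") void healing.onCircuitOpen(name, reason ?? "unknown"); }).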
CircuitObserverService
analytics/circuit-observer.service.ts
Self-Healing Needs manual wiring

reportState(serviceName, newState, reason?)

On state change, pushes to history (max 500 entries). Logs OPEN as ⚡ [CIRCUIT_OPEN] and HALF_OPEN as 🔄; CLOSED is logged as well. Methods: getState, getHistory, getAllStates for diagnostics. Call from your resilience layer (e.g. opossum).

StateSnapshotService
alerts/state-snapshot.service.ts
Self-Healing Wired via Discord + Alerts

capture(reason?)

Reasons: 'fatal' (default) | 'health-check' | 'diagnostic'. Captures: heap used/total (MB), RSS, external, uptime (s), PID, Node version, NODE_ENV, active handle count, CPU user/system (ms). Logs as system level with distinct message per reason.

Scheduled Jobs (Cron)
Requires ScheduleModule.forRoot() — present in AppModule
Cron Registered
Service               Schedule      Behavior
PulseAnomalyService   Every minute  Baseline deviation check (needs 10+ buckets)
GdprSanitizerService  Daily 03:00   Scrub PII on audit docs older than 14 days
AiPredictorService    Daily 06:00   OpenAI analysis of recent audit entries (if OPENAI_API_KEY set)
Constants Reference
logger.constants.ts — single source of truth for all thresholds
Reference
Constant                           Value    Meaning
SLOW_REQUEST_THRESHOLD_MS          500      HTTP/WS "slow" warning threshold
SLOW_DB_QUERY_THRESHOLD_MS         100      Slow query warning
PAYLOAD_TRUNCATION_LIMIT_BYTES     5120     ~5KB string cap for PII/truncation
ALERT_RATE_LIMIT_COUNT             3        Alerts per window before mute
ALERT_RATE_LIMIT_WINDOW_MS         60,000   1 minute rate limit window
LOG_GROUP_BUFFER_MS                60,000   LogGrouper flush interval
FRUSTRATION_ERROR_COUNT            5        Errors in window to trigger frustration log
FRUSTRATION_WINDOW_MS              120,000  2-minute frustration detection window
FINANCIAL_HOURLY_THRESHOLD_USD     $10      Per-user hourly spend alert
BLACK_BOX_RECORDING_DURATION_MS    600,000  10 minutes of black box recording
GDPR_RETENTION_DAYS                14       Age threshold for PII metadata scrub
AUDIT_LOG_TTL_DAYS                 90       MongoDB TTL index for audit logs
RRWEB_SESSION_TTL_DAYS             7        rrweb session retention
PULSE_DEVIATION_THRESHOLD_PERCENT  40       Pulse anomaly sensitivity
Infrastructure Map

Architecture layers

Entry Points (Red): Inbound requests pass through HoneypotMiddleware first — trap paths are blocked with 404. Legitimate requests flow to CorrelationMiddleware which enriches the CLS store with IP, geo, JWT claims, and entity IDs.

Core Processing (Blue): NestJS AppModule + CLS store per request. NovaLogger automatically merges all CLS context into every log line via Winston. HttpLoggingInterceptor and Exception Filters both route through NovaLogger.

Observability (Green): Winston fans out to Console, rotating files, Discord (live feed + alerts), MongoDB audit, and AlertTransport. AlertDispatcherService deduplicates via error signatures and rate-limits to avoid alert storms.

Self-Healing (Purple): CircuitObserverService tracks circuit breaker state changes. SelfHealingService executes registered recovery strategies. StateSnapshotService captures system health on demand or on CRITICAL events.