Monitoring & Observability

Sorcha uses OpenTelemetry (OTEL) for distributed tracing, metrics, and structured logging, collected by the .NET Aspire Dashboard.

Architecture

┌──────────────┐     OTLP/gRPC      ┌────────────────────┐
│  All Sorcha  │───────────────────>│  Aspire Dashboard  │
│  Services    │   (port 18889)     │  (traces, logs,    │
└──────────────┘                    │   metrics)         │
                                    └─────────┬──────────┘
                                              │
                              ┌───────────────┴───────────────┐
                              v                               v
                     ┌────────────────┐             ┌──────────────────┐
                     │ :18888 direct  │             │ /admin/dashboard │
                     │ (dev only)     │             │ (SystemAdmin JWT)│
                     └────────────────┘             └──────────────────┘

Every Sorcha service exports its telemetry over OTLP/gRPC (port 18889) to the Aspire Dashboard container. The dashboard provides a web UI for exploring traces, logs, and metrics.

Accessing the Dashboard

Direct Access (Development)

http://localhost:18888

In the default Docker configuration the Aspire Dashboard is served without authentication (DOTNET_DASHBOARD_UNSECURED_ALLOW_ANONYMOUS=true), so use this URL for local development only.

Via API Gateway (Production)

http://localhost/admin/dashboard

Access through the API Gateway requires a JWT that carries the SystemAdmin role. This is the recommended approach for production deployments.

Securing the Dashboard

For production, disable anonymous access:

yaml
aspire-dashboard:
  environment:
    - DOTNET_DASHBOARD_UNSECURED_ALLOW_ANONYMOUS=false
    - DASHBOARD__FRONTEND__AUTHMODE=BrowserToken
    - DASHBOARD__FRONTEND__BROWSERTOKEN=<secure-token>
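
The <secure-token> placeholder should be a long, unpredictable value. One way to produce one is sketched below. generate_dashboard_token is a hypothetical helper name, and the 32-byte length is an assumption rather than a documented requirement.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a random 256-bit token as 64 lowercase hex characters.
generate_dashboard_token() {
  # Read 32 bytes from the kernel random source, dump them as hex bytes,
  # then strip the spaces and newlines that od inserts.
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Usage:
#   generate_dashboard_token
```

Keep the generated value in a secret store or an environment file that is not committed to source control.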

Health Check Endpoints

Every Sorcha service exposes a /health endpoint. Via the API Gateway:

| Endpoint | Service | Check Includes |
| --- | --- | --- |
| http://localhost/health | API Gateway | Gateway process |
| http://localhost/blueprint/health | Blueprint | Redis, MongoDB, downstream services |
| http://localhost/tenant/health | Tenant | PostgreSQL, Redis |
| http://localhost/wallet/health | Wallet | PostgreSQL, Redis, encryption provider |
| http://localhost/register/health | Register | MongoDB, Redis |
| http://localhost/validator/health | Validator | Redis, MongoDB |
| http://localhost/peer/health | Peer | Redis, MongoDB |
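
The gateway routes listed above can be polled in one pass. The following is a minimal sketch, assuming a healthy service answers with HTTP 200. probe_status and check_gateway_health are hypothetical helper names, and the 5-second curl timeout is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Probe one URL and print its HTTP status code ("000" if the request fails).
probe_status() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$1"
}

# Check every gateway health route. Prints one line per route and
# returns non-zero if any route did not answer with HTTP 200.
check_gateway_health() {
  local base="${1:-http://localhost}"
  local failed=0
  local path code
  for path in /health /blueprint/health /tenant/health /wallet/health \
              /register/health /validator/health /peer/health; do
    code="$(probe_status "${base}${path}")"
    if [ "$code" = "200" ]; then
      echo "OK   ${path}"
    else
      echo "FAIL ${path} (HTTP ${code})"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_gateway_health                      # defaults to http://localhost
#   check_gateway_health http://gateway:80    # hypothetical alternative base URL
```

Because the function returns a non-zero status on any failure, it can be used directly in a cron job or CI step.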

Direct Service Health (bypassing gateway)

| Endpoint | Service |
| --- | --- |
| http://localhost:5000/health | Blueprint |
| http://localhost:5450/health | Tenant |
| http://localhost:5380/health | Register |
| http://localhost:5800/health | Validator |

Docker Health Checks

The Docker Compose file defines a health check for every service. To monitor container health:

bash
# Show all container statuses
docker-compose ps

# Show only unhealthy containers
docker ps --filter health=unhealthy

# Inspect a specific container's health check history
docker inspect --format='{{json .State.Health}}' sorcha-blueprint-service | jq

Health check configuration (per container):

  • Interval: 10 seconds
  • Timeout: 5 seconds
  • Retries: 10
  • Start period: 30 seconds
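
These settings determine roughly how long a container that never becomes healthy takes to be reported as unhealthy. The sketch below assumes Docker's usual semantics: failures during the start period do not count toward the retry limit, and the container turns unhealthy after the configured number of consecutive failures. It ignores the time each probe itself takes, so treat the result as an approximate lower bound. time_to_unhealthy is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Approximate seconds from container start until Docker reports "unhealthy"
# for a service whose health check always fails.
# Arguments: interval (s), retries, start period (s).
time_to_unhealthy() {
  local interval="$1" retries="$2" start_period="$3"
  echo $(( start_period + interval * retries ))
}

# With the values listed above: 30 + 10 * 10 = 130 seconds.
#   time_to_unhealthy 10 10 30
```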

Logging

Log Levels

Each service picks its default log verbosity from its ASPNETCORE_ENVIRONMENT value:

| Environment | Default Level | SQL Queries | Request Details |
| --- | --- | --- | --- |
| Development | Debug | Visible | Verbose |
| Docker | Information | Hidden | Standard |
| Production | Warning | Hidden | Minimal |

Viewing Logs

Docker Compose logs:

bash
# All services
docker-compose logs -f

# Specific service
docker-compose logs -f blueprint-service

# Last 100 lines
docker-compose logs --tail=100 tenant-service

# Since a specific time
docker-compose logs --since="2026-01-01T00:00:00" wallet-service

Aspire Dashboard: the Structured Logs tab provides filtering, search, and correlation of log entries across all services.

Structured Logging Format

All Sorcha services use Serilog for structured logging. Log entries include:

| Field | Description |
| --- | --- |
| Timestamp | ISO 8601 timestamp |
| Level | Log level (Debug, Information, Warning, Error, Fatal) |
| MessageTemplate | Structured message with named placeholders |
| Properties | Key-value pairs (correlation ID, user ID, etc.) |
| Exception | Exception details (if applicable) |
| SourceContext | Originating class/namespace |
| TraceId | OpenTelemetry trace ID for correlation |
| SpanId | OpenTelemetry span ID |
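
Because each entry carries a TraceId, console logs can be narrowed to a single request from the command line. This is a minimal sketch that assumes the trace ID appears somewhere in the rendered log line, which depends on the configured output format. filter_by_trace is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Keep only the lines on stdin that contain the given trace ID.
# -F treats the ID as a literal string; -i ignores case, since hex IDs
# may be rendered in upper or lower case.
filter_by_trace() {
  local trace_id="$1"
  grep -i -F -- "$trace_id"
}

# Usage (the trace ID shown is only an example value):
#   docker-compose logs --no-color | filter_by_trace 4bf92f3577b34da6a3ce929d0e0e4736
```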

Log Output Configuration

By default, logs are written to stdout (captured by Docker). To add file-based logging or external sinks, configure Serilog in appsettings.json or via environment variables:

bash
# Set minimum log level
Serilog__MinimumLevel__Default=Information

# Override for specific namespaces
Serilog__MinimumLevel__Override__Microsoft=Warning
Serilog__MinimumLevel__Override__System=Warning

Distributed Tracing

Viewing Traces

Open the Aspire Dashboard Traces tab to see:

  • End-to-end request flows across services
  • Latency breakdown per service hop
  • Error traces highlighted in red
  • Dependency calls (database queries, Redis operations, HTTP clients)

Trace Correlation

All HTTP requests flowing through the API Gateway receive a trace ID that propagates to downstream services. This enables end-to-end visibility of a single user request across all services.
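
To find one specific request in the Traces tab, it can help to supply the trace ID yourself. The sketch below builds a W3C Trace Context traceparent header value. It assumes the gateway and services accept an incoming traceparent header, which is the usual ASP.NET Core behavior but is not confirmed here for Sorcha. The helper names are hypothetical.

```bash
#!/usr/bin/env bash
# Print N random bytes as lowercase hex.
random_hex() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Build a traceparent value:
#   version (00) - trace ID (32 hex) - parent span ID (16 hex) - flags (01 = sampled)
new_traceparent() {
  printf '00-%s-%s-01\n' "$(random_hex 16)" "$(random_hex 8)"
}

# Extract the trace ID (second dash-separated field) from a traceparent value.
trace_id_of() {
  printf '%s\n' "$1" | cut -d- -f2
}

# Usage:
#   tp="$(new_traceparent)"
#   curl -s -H "traceparent: ${tp}" http://localhost/health
#   echo "Search the Traces tab for: $(trace_id_of "$tp")"
```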

Key Trace Attributes

| Attribute | Description |
| --- | --- |
| service.name | Service that generated the span |
| deployment.environment | Deployment environment (docker or production) |
| http.method | HTTP method (GET, POST, etc.) |
| http.url | Request URL |
| http.status_code | Response status code |
| db.system | Database type (postgresql, mongodb, redis) |
| db.statement | Database query (in Development mode) |

Metrics

The Aspire Dashboard Metrics tab shows runtime metrics including:

  • ASP.NET Core: Request rate, response time, active connections
  • Runtime: GC collections, thread pool usage, memory
  • HTTP Client: Outbound request rate and latency
  • Database: Connection pool size, query duration

External OTEL Integration

To send telemetry to an external observability platform instead of the Aspire Dashboard, point the OTLP endpoint at that platform's agent or collector. To export to both at once, see Dual Export below.

Datadog

yaml
environment:
  OTEL_EXPORTER_OTLP_ENDPOINT: http://datadog-agent:4317
  OTEL_EXPORTER_OTLP_PROTOCOL: grpc

Grafana (via Grafana Agent / Alloy)

yaml
environment:
  OTEL_EXPORTER_OTLP_ENDPOINT: http://grafana-agent:4317
  OTEL_EXPORTER_OTLP_PROTOCOL: grpc

Azure Monitor (Application Insights)

yaml
environment:
  APPLICATIONINSIGHTS_CONNECTION_STRING: InstrumentationKey=<key>;IngestionEndpoint=https://<region>.in.applicationinsights.azure.com/

Azure Monitor is configured through the Application Insights connection string rather than the OTLP endpoint variables. Add the Azure.Monitor.OpenTelemetry.AspNetCore NuGet package (the Azure Monitor OpenTelemetry distro) to the services for native support.

Dual Export (Dashboard + External)

To keep the Aspire Dashboard while also exporting to an external system, use an OpenTelemetry Collector as an intermediary:

Services --> OTEL Collector --> Aspire Dashboard
                            --> External Platform
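
The fan-out above can be expressed as a Collector configuration with one OTLP receiver and two OTLP exporters. The sketch below writes such a file. The external-backend:4317 address, the output file name, and the plaintext (insecure) connections are assumptions to adapt, and write_collector_config is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Write a minimal OpenTelemetry Collector config that receives OTLP/gRPC
# and forwards every signal to both the Aspire Dashboard and one external backend.
write_collector_config() {
  local out="${1:-otel-collector-config.yaml}"
  cat > "$out" <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch: {}

exporters:
  otlp/aspire:
    endpoint: aspire-dashboard:18889   # dashboard OTLP/gRPC port
    tls:
      insecure: true                   # assumption: plaintext inside the Docker network
  otlp/external:
    endpoint: external-backend:4317    # assumption: replace with the real backend
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/aspire, otlp/external]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/aspire, otlp/external]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/aspire, otlp/external]
EOF
}

# Usage:
#   write_collector_config ./otel-collector-config.yaml
```

With this in place, the services would set OTEL_EXPORTER_OTLP_ENDPOINT to the Collector's address (for example http://otel-collector:4317, a hypothetical service name) instead of the dashboard's.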

Alerting Recommendations

For production deployments, configure alerts on:

| Metric | Threshold | Severity |
| --- | --- | --- |
| Health check failure | Any service unhealthy > 2 min | Critical |
| Response time (p95) | > 2 seconds | Warning |
| Error rate (5xx) | > 1% of requests | Critical |
| Disk usage | > 80% on database volumes | Warning |
| Memory usage | > 90% per container | Warning |
| MongoDB oplog lag | > 10 seconds | Warning |
| PostgreSQL connection pool | > 80% utilized | Warning |
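
As an illustration of how a percentage threshold from this table might be evaluated, the sketch below checks the 5xx error-rate rule using integer arithmetic only. error_rate_exceeds is a hypothetical helper name, and the request counts would come from whatever metrics backend is in use.

```bash
#!/usr/bin/env bash
# Succeed (exit status 0) when errors/total is strictly greater than
# threshold_pct percent. Arguments: error count, total count, threshold (default 1).
error_rate_exceeds() {
  local errors="$1" total="$2" threshold_pct="${3:-1}"
  # Compare errors * 100 against total * threshold to avoid fractions.
  [ "$total" -gt 0 ] && [ $(( errors * 100 )) -gt $(( total * threshold_pct )) ]
}

# Usage:
#   if error_rate_exceeds 11 1000; then echo "CRITICAL: 5xx rate above 1%"; fi
```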

Released under the MIT License.