Software Feature Specification

SF-DIAG: Diagnostics & Health Monitoring

Software Feature ID: SF-DIAG
Mapped System Feature: F-DIAG (Diagnostics & Health Monitoring Features)
Version: 1.0
Date: 2025-02-01

1. Feature Overview

The Diagnostics & Health Monitoring software feature implements system health monitoring, fault detection, diagnostic data collection, and engineering access. It is the software implementation of the four mapped system features: diagnostic code management, diagnostic data storage, diagnostic sessions, and the layered watchdog system.

1.1 Mapped System Features

  • F-DIAG-01: Diagnostic Code Management
  • F-DIAG-02: Diagnostic Data Storage
  • F-DIAG-03: Diagnostic Session
  • F-DIAG-04: Layered Watchdog System

2. Static View - Component Architecture

graph TB
    subgraph "Application Layer"
        DM[Diagnostics Manager]
        HS[Health Monitor]
        DS[Diagnostic Session]
    end
    
    subgraph "Diagnostic Services"
        DC[Diagnostic Collector]
        DR[Diagnostic Reporter]
        WD[Watchdog Manager]
    end
    
    subgraph "Storage Layer"
        DL[Diagnostic Logger]
        DP[Data Pool]
    end
    
    subgraph "Hardware Monitoring"
        TWD[Task Watchdog]
        IWD[Interrupt Watchdog]
        RWD[RTC Watchdog]
        TM[Temperature Monitor]
        VM[Voltage Monitor]
    end
    
    DM --> DC
    DM --> DR
    DM --> WD
    HS --> TM
    HS --> VM
    HS --> WD
    HS --> DM
    DS --> DM
    DC --> DL
    DR --> DP
    WD --> TWD
    WD --> IWD
    WD --> RWD

2.1 Component Interfaces

2.1.1 Diagnostics Manager Interfaces

Provided Interfaces:

  • IDiagnosticsManager: Main diagnostics interface
  • IDiagnosticReporter: Diagnostic event reporting interface
  • IHealthMonitor: System health monitoring interface

Required Interfaces:

  • IDiagnosticCollector: Diagnostic data collection interface
  • IDataPool: Data storage interface
  • IEventSystem: Event notification interface

2.1.2 Diagnostic Session Interfaces

Provided Interfaces:

  • IDiagnosticSession: Engineering diagnostic access interface
  • ISystemInspection: System inspection interface
  • ILogAccess: Log access interface

Required Interfaces:

  • IDiagnosticsManager: Diagnostics management interface
  • ISecurityManager: Access control interface

3. Dynamic View - Diagnostic Sequences

3.1 Diagnostic Event Generation Sequence

sequenceDiagram
    participant COMP as System Component
    participant DM as Diagnostics Manager
    participant DC as Diagnostic Collector
    participant DL as Diagnostic Logger
    participant ES as Event System
    
    COMP->>DM: reportDiagnostic(code, severity, context)
    DM->>DM: validateDiagnostic(code, severity)
    DM->>DC: collectDiagnosticData(code, context)
    DC->>DC: enrichDiagnostic(timestamp, source, details)
    DC->>DL: logDiagnostic(diagnostic_record)
    DL->>DL: persistDiagnostic(record)
    DM->>ES: publishDiagnosticEvent(diagnostic)
    
    alt Critical Diagnostic
        DM->>DM: triggerEmergencyAction()
    end
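
The event generation sequence above can be sketched as a single reporting path. This is a minimal host-side illustration, not the feature's implementation: the class layout, the `source` parameter, the in-memory `log` and `subscribers` lists (standing in for the Diagnostic Logger and Event System), and the choice to treat CRITICAL and FATAL alike as "critical" are all assumptions.

```python
import time
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0
    WARNING = 1
    ERROR = 2
    CRITICAL = 3
    FATAL = 4


@dataclass
class DiagnosticRecord:
    code: str
    severity: Severity
    context: dict
    timestamp: float
    source: str


class DiagnosticsManager:
    def __init__(self, known_codes, clock=time.time):
        self.known_codes = set(known_codes)
        self.clock = clock
        self.log = []           # stands in for the Diagnostic Logger
        self.subscribers = []   # stands in for the Event System
        self.emergency_count = 0

    def report_diagnostic(self, code, severity, context, source="unknown"):
        # validateDiagnostic: reject unknown codes and malformed severities.
        if code not in self.known_codes or not isinstance(severity, Severity):
            return None
        # collectDiagnosticData + enrichDiagnostic: attach timestamp and source.
        record = DiagnosticRecord(code, severity, dict(context), self.clock(), source)
        # logDiagnostic / persistDiagnostic (in memory only in this sketch).
        self.log.append(record)
        # publishDiagnosticEvent: notify every subscriber.
        for callback in self.subscribers:
            callback(record)
        # Assumption: CRITICAL and FATAL both take the emergency branch.
        if severity >= Severity.CRITICAL:
            self.emergency_count += 1
        return record
```

Injecting the clock keeps the enrichment step deterministic in unit tests, which fits the Unit Test verification method listed for the Diagnostics Manager.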

3.2 Health Monitoring Sequence

sequenceDiagram
    participant HM as Health Monitor
    participant TM as Temperature Monitor
    participant VM as Voltage Monitor
    participant WD as Watchdog Manager
    participant DM as Diagnostics Manager
    
    loop Health Check Cycle
        HM->>TM: getTemperature()
        TM-->>HM: temperature_value
        HM->>VM: getVoltage()
        VM-->>HM: voltage_value
        HM->>WD: getWatchdogStatus()
        WD-->>HM: watchdog_status
        
        alt Health Issue Detected
            HM->>DM: reportHealthIssue(issue_type, severity)
        end
    end
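
One iteration of the health check cycle above might look like the following sketch. The `Thresholds` fields, the issue names, and the severities assigned to each issue are hypothetical; in the real feature the limits would come from Machine Constants and the readings from the Temperature Monitor, Voltage Monitor, and Watchdog Manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits; the specification only says they are configurable.
    max_temperature_c: float
    min_voltage_v: float
    max_voltage_v: float


def health_check(temperature_c, voltage_v, watchdog_ok, limits):
    """Return a list of (issue_type, severity) pairs for one cycle."""
    issues = []
    if temperature_c > limits.max_temperature_c:
        issues.append(("OVER_TEMPERATURE", "ERROR"))
    if not (limits.min_voltage_v <= voltage_v <= limits.max_voltage_v):
        issues.append(("VOLTAGE_OUT_OF_RANGE", "ERROR"))
    if not watchdog_ok:
        issues.append(("WATCHDOG_FAULT", "CRITICAL"))
    return issues
```

Each returned pair corresponds to one `reportHealthIssue(issue_type, severity)` call; an empty list means the cycle found nothing to report.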

3.3 Diagnostic Session Sequence

sequenceDiagram
    participant ENG as Engineer
    participant DS as Diagnostic Session
    participant DM as Diagnostics Manager
    participant DL as Diagnostic Logger
    participant SM as Security Manager
    
    ENG->>DS: requestDiagnosticSession(credentials)
    DS->>SM: authenticateUser(credentials)
    SM-->>DS: authentication_result
    
    alt Authentication Success
        DS->>DM: getDiagnosticSummary()
        DM-->>DS: diagnostic_summary
        DS-->>ENG: session_established(summary)
        
        ENG->>DS: retrieveDiagnostics(filter)
        DS->>DL: queryDiagnostics(filter)
        DL-->>DS: diagnostic_records
        DS-->>ENG: diagnostic_data
        
        ENG->>DS: clearDiagnostics(codes)
        DS->>DM: clearDiagnosticCodes(codes)
        DM-->>DS: clear_result
        DS-->>ENG: operation_complete
    else Authentication Failed
        DS-->>ENG: access_denied
    end
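
The session sequence above reduces to an authentication gate in front of the query and clear operations. The sketch below assumes an injected `authenticate` callable in place of the Security Manager, a plain list of dict records in place of the Diagnostic Logger, and an in-memory `audit_log`; none of these names come from the specification.

```python
class DiagnosticSession:
    def __init__(self, authenticate, records):
        self._authenticate = authenticate  # callable(credentials) -> bool
        self._records = records            # list of {"code": ...} dicts
        self._user = None
        self.audit_log = []

    def request(self, user, credentials):
        # authenticateUser: the session opens only on success.
        if self._authenticate(credentials):
            self._user = user
            return "session_established"
        self._user = None
        return "access_denied"

    def _require_open(self):
        if self._user is None:
            raise PermissionError("diagnostic session is not authenticated")

    def retrieve(self, predicate=lambda record: True):
        # retrieveDiagnostics(filter)
        self._require_open()
        return [r for r in self._records if predicate(r)]

    def clear(self, codes):
        # clearDiagnostics(codes): every clear is recorded for audit.
        self._require_open()
        codes = set(codes)
        before = len(self._records)
        self._records[:] = [r for r in self._records if r["code"] not in codes]
        removed = before - len(self._records)
        self.audit_log.append(
            {"user": self._user, "codes": sorted(codes), "removed": removed}
        )
        return removed
```

Recording the clear operation inside `clear` itself, rather than leaving it to the caller, is one way to make the audit trail hard to bypass.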

4. Software Constraints

4.1 Performance Constraints

  • SWC-DIAG-001: Diagnostic event processing must complete within 10 ms
  • SWC-DIAG-002: Health monitoring cycle must not exceed 1 second
  • SWC-DIAG-003: Diagnostic logging must not block system operations

4.2 Resource Constraints

  • SWC-DIAG-004: Maximum diagnostic buffer size limited to 8 KB
  • SWC-DIAG-005: Diagnostic log storage limited to 1 MB with rotation
  • SWC-DIAG-006: Maximum 100 active diagnostic codes supported

4.3 Reliability Constraints

  • SWC-DIAG-007: Diagnostic system must remain operational during system faults
  • SWC-DIAG-008: Critical diagnostics must be persisted immediately
  • SWC-DIAG-009: Watchdog system must be independent of main application tasks

4.4 Security Constraints

  • SWC-DIAG-010: Diagnostic session access must be authenticated
  • SWC-DIAG-011: Sensitive diagnostic data must be protected
  • SWC-DIAG-012: Diagnostic clear operations must be logged and audited

5. Traceability Matrix - Software Requirements

| Software Requirement ID | Feature Mapping | Component | Verification Method |
|---|---|---|---|
| SWR-DIAG-001 | F-DIAG-01 | Diagnostics Manager | Unit Test |
| SWR-DIAG-002 | F-DIAG-01 | Diagnostic Collector | Unit Test |
| SWR-DIAG-003 | F-DIAG-01 | Diagnostics Manager | Unit Test |
| SWR-DIAG-004 | F-DIAG-01 | Diagnostic Collector | Unit Test |
| SWR-DIAG-005 | F-DIAG-02 | Diagnostic Logger | Integration Test |
| SWR-DIAG-006 | F-DIAG-02 | Diagnostic Logger | Unit Test |
| SWR-DIAG-007 | F-DIAG-02 | Diagnostic Logger | Unit Test |
| SWR-DIAG-008 | F-DIAG-03 | Diagnostic Session | Integration Test |
| SWR-DIAG-009 | F-DIAG-03 | Diagnostic Session | Unit Test |
| SWR-DIAG-010 | F-DIAG-03 | Diagnostic Session | Unit Test |
| SWR-DIAG-011 | F-DIAG-03 | Diagnostic Session | Unit Test |
| SWR-DIAG-012 | F-DIAG-04 | Watchdog Manager | Hardware Test |
| SWR-DIAG-013 | F-DIAG-04 | Watchdog Manager | Hardware Test |
| SWR-DIAG-014 | F-DIAG-04 | Watchdog Manager | Hardware Test |

6. Implementation Notes

6.1 Diagnostic Code System

  • Hierarchical diagnostic codes: CATEGORY-COMPONENT-ERROR (e.g., SEN-TEMP-001)
  • Severity levels: INFO, WARNING, ERROR, CRITICAL, FATAL
  • Diagnostic context includes timestamp, source component, and relevant data
  • Diagnostic codes are versioned and documented in a separate specification
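
The CATEGORY-COMPONENT-ERROR format can be checked with a small parser. The exact grammar is an assumption inferred from the single example SEN-TEMP-001 (uppercase letters for the first two fields, exactly three digits for the third); the separate code specification may define it differently.

```python
import re
from typing import NamedTuple

# Assumed grammar: LETTERS-LETTERS-DDD, inferred from "SEN-TEMP-001".
CODE_PATTERN = re.compile(r"([A-Z]+)-([A-Z]+)-(\d{3})")


class DiagnosticCode(NamedTuple):
    category: str
    component: str
    error: int


def parse_code(text):
    """Split a hierarchical diagnostic code into its three fields."""
    match = CODE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"malformed diagnostic code: {text!r}")
    return DiagnosticCode(match.group(1), match.group(2), int(match.group(3)))
```

Parsing into fields lets a session filter by category or component without string matching on the full code.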

6.2 Health Monitoring

  • Continuous monitoring of system vital signs:
    • CPU temperature and usage
    • Memory usage (heap, stack)
    • Storage space and health
    • Communication link status
    • Power supply voltage and current
  • Configurable thresholds via Machine Constants
  • Predictive health analysis for proactive maintenance

6.3 Watchdog System Architecture

  • Task Watchdog (TWDT): Monitors FreeRTOS tasks for deadlocks (10 s timeout)
  • Interrupt Watchdog (IWDT): Detects ISR hangs (3 s timeout)
  • RTC Watchdog (RWDT): Final safety net for total system freeze (30 s timeout)
  • Watchdog feeding distributed across system components
  • Watchdog timeout triggers diagnostic event before system reset
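
The layering can be illustrated with a host-side simulation that uses the three timeouts listed above. This is only a model of the ordering: on the target the watchdogs are hardware or RTOS timers rather than polled objects, and the `Watchdog` class, `poll` function, and `report` callback are hypothetical names.

```python
class Watchdog:
    def __init__(self, name, timeout_s, now=0.0):
        self.name = name
        self.timeout_s = timeout_s
        self.last_fed = now

    def feed(self, now):
        self.last_fed = now

    def expired(self, now):
        return now - self.last_fed > self.timeout_s


def poll(watchdogs, now, report):
    """Report every expired layer, then return their names.

    The diagnostic event is raised first; the caller performs the
    system reset only after this function returns.
    """
    expired = [w.name for w in watchdogs if w.expired(now)]
    for name in expired:
        report(name)
    return expired
```

With the shortest timeout on the interrupt layer and the longest on the RTC layer, a stall that the inner layers fail to catch is still caught by the outer one.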

6.4 Diagnostic Storage

  • Diagnostic events stored in a circular buffer with persistence
  • Log rotation based on size and age limits
  • Critical diagnostics stored in separate high-priority storage
  • Diagnostic data survives system resets and power cycles
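
The split between the circular buffer and the high-priority store might be sketched as follows. Persistence and age-based rotation are omitted, capacity is counted in records rather than bytes, and the `DiagnosticStore` name and `critical` flag are assumptions made for illustration.

```python
from collections import deque


class DiagnosticStore:
    def __init__(self, capacity):
        # A bounded deque drops the oldest entry once it is full,
        # which gives circular-buffer behaviour.
        self.ring = deque(maxlen=capacity)
        # Critical records are kept apart so routine traffic
        # can never overwrite them.
        self.critical = []

    def add(self, record, critical=False):
        if critical:
            self.critical.append(record)
        else:
            self.ring.append(record)

    def snapshot(self):
        """Return critical records first, then the buffered ones, oldest first."""
        return list(self.critical) + list(self.ring)
```

Keeping critical records outside the ring means a burst of INFO events cannot evict the one record that explains a reset.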

6.5 Engineering Access

  • Secure diagnostic session with role-based access control
  • Remote diagnostic access via the communication interface
  • Local diagnostic access via OLED display and buttons
  • Diagnostic data export capabilities for analysis tools

6.6 Error Handling

  • Diagnostic system failures are self-reported when possible
  • Fallback diagnostic mechanisms for critical system failures
  • Diagnostic data integrity verification and recovery
  • Emergency diagnostic flush during system teardown
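
Integrity verification of stored records could, for example, append a checksum to each serialized record and check it on read. The specification does not name a mechanism; CRC-32 and JSON serialization are assumptions chosen for this sketch, and `seal` and `unseal` are hypothetical helper names.

```python
import json
import zlib


def seal(record):
    """Serialize a record and append a 4-byte big-endian CRC-32."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unseal(blob):
    """Return the record, or None if the blob is truncated or corrupted."""
    if len(blob) < 4:
        return None
    payload, stored = blob[:-4], int.from_bytes(blob[-4:], "big")
    if zlib.crc32(payload) != stored:
        return None
    return json.loads(payload)
```

Returning `None` instead of raising lets a recovery pass skip damaged records and keep the readable ones.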