Watchdog Manager Component

ASF Sensor Hub (Sub-Hub) Embedded System

Component ID: C-WATCHDOG-001
Version: 1.0
Date: 2025-02-01
Location: application_layer/services/watchdog_manager/
Platform: ESP32-S3, ESP-IDF v5.4


1. Component Overview

The Watchdog Manager implements a layered watchdog system that provides task-level, interrupt-level, and system-level protection. It supports system reliability by detecting task deadlocks, ISR hangs, and total system freezes, and by recovering from them through a system reset.

Primary Purpose: Provide multi-level watchdog protection for system reliability.


2. Responsibilities

2.1 In-Scope

  • Task Watchdog (Task WDT) management
  • Interrupt Watchdog (Interrupt WDT) management
  • RTC Watchdog (RTC WDT) management
  • Watchdog timeout configuration
  • Watchdog feeding coordination
  • Watchdog event reporting

2.2 Out-of-Scope

  • Low-level watchdog hardware access (handled by ESP-IDF)
  • Watchdog reset handling (handled by System State Manager)

3. Layered Watchdog Architecture

┌─────────────────────────────────────┐
│   RTC Watchdog (30s)                │  System-level protection
│   └─> Detects total system freeze   │
├─────────────────────────────────────┤
│   Interrupt Watchdog (3s)           │  Interrupt-level protection
│   └─> Detects ISR hangs             │
├─────────────────────────────────────┤
│   Task Watchdog (10s)               │  Task-level protection
│   └─> Detects task deadlocks        │
└─────────────────────────────────────┘

4. Provided Interfaces

4.1 Task Watchdog Interface

/**
 * @brief Register task with Task Watchdog
 * @param task_handle Task handle
 * @param timeout_ms Timeout in milliseconds
 * @return true on success
 */
bool watchdog_task_register(osal_task_handle_t task_handle, uint32_t timeout_ms);

/**
 * @brief Feed Task Watchdog
 * @param task_handle Task handle
 * @return true on success
 */
bool watchdog_task_feed(osal_task_handle_t task_handle);

/**
 * @brief Unregister task from Task Watchdog
 * @param task_handle Task handle
 * @return true on success
 */
bool watchdog_task_unregister(osal_task_handle_t task_handle);
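
The ESP-IDF Task Watchdog applies a single timeout to all subscribed tasks, so the per-task timeout_ms in this interface suggests that the manager keeps its own deadline table. The following is a minimal off-target sketch of such bookkeeping, not the component's implementation. The osal_task_handle_t typedef, WDT_MAX_TASKS, watchdog_sketch_set_time() and watchdog_sketch_find_expired() are assumptions introduced for illustration; on target the time source would be the OSAL or ESP-IDF tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the OSAL handle type, which this document does not define. */
typedef void *osal_task_handle_t;

#define WDT_MAX_TASKS 8u /* assumed capacity, not taken from the spec */

typedef struct {
    osal_task_handle_t task; /* NULL marks a free slot */
    uint32_t timeout_ms;
    uint32_t last_feed_ms;
} wdt_slot_t;

static wdt_slot_t s_slots[WDT_MAX_TASKS];
static uint32_t s_now_ms; /* injected clock so the sketch runs off-target */

void watchdog_sketch_set_time(uint32_t now_ms) { s_now_ms = now_ms; }

/* Returns the slot holding 'task', or the first free slot when task is NULL. */
static wdt_slot_t *find_slot(osal_task_handle_t task)
{
    for (size_t i = 0; i < WDT_MAX_TASKS; i++) {
        if (s_slots[i].task == task) {
            return &s_slots[i];
        }
    }
    return NULL;
}

bool watchdog_task_register(osal_task_handle_t task_handle, uint32_t timeout_ms)
{
    if (task_handle == NULL || timeout_ms == 0u || find_slot(task_handle) != NULL) {
        return false; /* invalid argument or already registered */
    }
    wdt_slot_t *slot = find_slot(NULL);
    if (slot == NULL) {
        return false; /* table full */
    }
    assert(slot->task == NULL);
    slot->task = task_handle;
    slot->timeout_ms = timeout_ms;
    slot->last_feed_ms = s_now_ms; /* registration counts as the first feed */
    return true;
}

bool watchdog_task_feed(osal_task_handle_t task_handle)
{
    wdt_slot_t *slot = (task_handle != NULL) ? find_slot(task_handle) : NULL;
    if (slot == NULL) {
        return false; /* task was never registered */
    }
    slot->last_feed_ms = s_now_ms;
    return true;
}

bool watchdog_task_unregister(osal_task_handle_t task_handle)
{
    wdt_slot_t *slot = (task_handle != NULL) ? find_slot(task_handle) : NULL;
    if (slot == NULL) {
        return false;
    }
    slot->task = NULL;
    return true;
}

/* Returns the first task whose deadline has passed, or NULL if none has.
 * Unsigned subtraction keeps the comparison correct across tick wraparound. */
osal_task_handle_t watchdog_sketch_find_expired(void)
{
    for (size_t i = 0; i < WDT_MAX_TASKS; i++) {
        const wdt_slot_t *slot = &s_slots[i];
        if (slot->task != NULL &&
            (uint32_t)(s_now_ms - slot->last_feed_ms) >= slot->timeout_ms) {
            return slot->task;
        }
    }
    return NULL;
}
```

In this arrangement a supervisor would call watchdog_sketch_find_expired() periodically and stop feeding the hardware Task Watchdog once it returns a task, which leaves the actual reset to ESP-IDF.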

4.2 Interrupt Watchdog Interface

/**
 * @brief Feed Interrupt Watchdog
 * @return true on success
 */
bool watchdog_interrupt_feed(void);

4.3 RTC Watchdog Interface

/**
 * @brief Feed RTC Watchdog
 * @return true on success
 */
bool watchdog_rtc_feed(void);

/**
 * @brief Initialize RTC Watchdog
 * @param timeout_ms Timeout in milliseconds
 * @return true on success
 */
bool watchdog_rtc_init(uint32_t timeout_ms);
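
One way to approach the feeding coordination listed under 2.1 is to feed the RTC Watchdog only when every supervised task has checked in during the current cycle. The sketch below illustrates that pattern under stated assumptions: wdt_coordinator_t and its functions are hypothetical names that are not part of the interfaces above, and the call to watchdog_rtc_feed() is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check-in tracker: one bit per supervised task (up to 32). */
typedef struct {
    uint32_t required_mask; /* tasks that must report every cycle */
    uint32_t alive_mask;    /* tasks that have reported in this cycle */
} wdt_coordinator_t;

void wdt_coordinator_init(wdt_coordinator_t *c, uint32_t required_mask)
{
    assert(c != NULL);
    c->required_mask = required_mask;
    c->alive_mask = 0u;
}

/* Called by a supervised task to report that it is still making progress. */
void wdt_coordinator_check_in(wdt_coordinator_t *c, unsigned task_index)
{
    assert(c != NULL);
    assert(task_index < 32u);
    c->alive_mask |= (UINT32_C(1) << task_index);
}

/* Ends the current cycle. Returns true only if every required task checked
 * in; the caller would invoke watchdog_rtc_feed() in that case and skip the
 * feed otherwise. The check-in state is cleared either way. */
bool wdt_coordinator_cycle(wdt_coordinator_t *c)
{
    assert(c != NULL);
    bool all_alive = (c->alive_mask & c->required_mask) == c->required_mask;
    c->alive_mask = 0u;
    return all_alive;
}
```

With this design a single stalled task is enough to withhold the feed, so the 30-second RTC Watchdog can still fire even if the supervisor task itself keeps running.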

5. Watchdog Configuration

5.1 Default Timeouts

  • Task Watchdog: 10 seconds (baseline)
  • Interrupt Watchdog: 3 seconds (baseline)
  • RTC Watchdog: 30 seconds (baseline)

5.2 Configurable Timeouts

  • Task-specific timeouts (from Machine Constants)
  • Interrupt watchdog timeout (system-wide; in ESP-IDF this is a build-time Kconfig setting, CONFIG_ESP_INT_WDT_TIMEOUT_MS)
  • RTC watchdog timeout (system-wide)
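
The baseline values in 5.1 and the overrides in 5.2 could be held in a single configuration record. The sketch below is one possible shape, with a check that the layers stay ordered (interrupt < task < RTC) so that each outer watchdog acts as a backstop for the inner one. The type watchdog_config_t and both functions are assumptions made for illustration, not part of the specified interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Baseline timeouts from section 5.1, in milliseconds. */
#define WDT_DEFAULT_TASK_TIMEOUT_MS      10000u
#define WDT_DEFAULT_INTERRUPT_TIMEOUT_MS  3000u
#define WDT_DEFAULT_RTC_TIMEOUT_MS       30000u

/* Hypothetical configuration record for the three watchdog layers. */
typedef struct {
    uint32_t task_timeout_ms;
    uint32_t interrupt_timeout_ms;
    uint32_t rtc_timeout_ms;
} watchdog_config_t;

/* Returns the baseline configuration. */
watchdog_config_t watchdog_config_default(void)
{
    watchdog_config_t cfg = {
        .task_timeout_ms = WDT_DEFAULT_TASK_TIMEOUT_MS,
        .interrupt_timeout_ms = WDT_DEFAULT_INTERRUPT_TIMEOUT_MS,
        .rtc_timeout_ms = WDT_DEFAULT_RTC_TIMEOUT_MS,
    };
    return cfg;
}

/* Accepts a configuration only if all timeouts are non-zero and each outer
 * layer waits longer than the layer it backs up. */
bool watchdog_config_is_valid(const watchdog_config_t *cfg)
{
    assert(cfg != NULL);
    return cfg->interrupt_timeout_ms > 0u &&
           cfg->interrupt_timeout_ms < cfg->task_timeout_ms &&
           cfg->task_timeout_ms < cfg->rtc_timeout_ms;
}
```

Overrides from Machine Constants would be applied to a copy of the default record and validated before any watchdog is initialized.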

6. ESP-IDF Integration

6.1 ESP-IDF Services Used

  • esp_task_wdt.h - Task Watchdog
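  • esp_private/esp_int_wdt.h - Interrupt Watchdog (ESP-IDF spells it esp_int_wdt; there is no esp_intr_wdt.h, and the timeout is normally set through Kconfig)
  • hal/wdt_hal.h - RTC Watchdog (RWDT) access; esp_system.h provides esp_reset_reason() for identifying a watchdog reset after reboot, not RTC Watchdog control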

6.2 Watchdog Behavior

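  • Task Watchdog: Resets system on timeout (requires the ESP-IDF panic option, e.g. trigger_panic = true; otherwise ESP-IDF only logs a warning)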
  • Interrupt Watchdog: Resets system on timeout
  • RTC Watchdog: Resets system on timeout (final safety net)

7. Error Handling

7.1 Watchdog Timeout Handling

  • Log timeout event to Diagnostics Manager
  • Generate diagnostic event (FATAL severity)
  • Request state transition to FAULT (best effort; may not complete before the reset)
  • Trigger system reset (handled by ESP-IDF)

7.2 Diagnostic Events

  • DIAG-DIAG-WDT-0001: Task Watchdog timeout
  • DIAG-DIAG-WDT-0002: Interrupt Watchdog timeout
  • DIAG-DIAG-WDT-0003: RTC Watchdog timeout
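
The event IDs above and the FATAL severity from 7.1 can be tied together in a small lookup. The following is a sketch under stated assumptions: wdt_source_t, diag_severity_t, watchdog_diag_event_t and watchdog_diag_build_event() are hypothetical names, and the real Diagnostics Manager types may differ. On target, the watchdog source could be derived after reboot from esp_reset_reason(), which distinguishes ESP_RST_TASK_WDT, ESP_RST_INT_WDT and ESP_RST_WDT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which watchdog layer fired (hypothetical enumeration). */
typedef enum {
    WDT_SOURCE_TASK = 0,
    WDT_SOURCE_INTERRUPT,
    WDT_SOURCE_RTC,
    WDT_SOURCE_COUNT
} wdt_source_t;

/* Only the severity level named in section 7.1 is modeled here. */
typedef enum {
    DIAG_SEVERITY_FATAL
} diag_severity_t;

typedef struct {
    const char *id; /* diagnostic event ID from section 7.2 */
    diag_severity_t severity;
    wdt_source_t source;
} watchdog_diag_event_t;

/* Fills 'out' with the diagnostic event for the given watchdog source.
 * Returns false, leaving 'out' untouched, if the source is out of range. */
bool watchdog_diag_build_event(wdt_source_t source, watchdog_diag_event_t *out)
{
    static const char *const ids[WDT_SOURCE_COUNT] = {
        [WDT_SOURCE_TASK] = "DIAG-DIAG-WDT-0001",
        [WDT_SOURCE_INTERRUPT] = "DIAG-DIAG-WDT-0002",
        [WDT_SOURCE_RTC] = "DIAG-DIAG-WDT-0003",
    };

    assert(out != NULL);
    if ((unsigned)source >= (unsigned)WDT_SOURCE_COUNT) {
        return false;
    }
    out->id = ids[source];
    out->severity = DIAG_SEVERITY_FATAL;
    out->source = source;
    return true;
}
```

Keeping the IDs in one table means a change to the numbering scheme only has to be made in a single place.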

8. Dependencies

8.1 OSAL Dependencies

  • OSAL-TASK: Task handle management

8.2 Application Dependencies

  • Logger: Watchdog event logging
  • Diagnostics Manager: Error reporting
  • System State Manager: State transition on timeout

9. Traceability

9.1 System Requirements

  • SR-DIAG-012: Task Watchdog implementation
  • SR-DIAG-013: Interrupt Watchdog implementation
  • SR-DIAG-014: RTC Watchdog implementation

9.2 Software Requirements

  • SWR-DIAG-012: Task Watchdog with 10-second timeout
  • SWR-DIAG-013: Interrupt Watchdog with 3-second timeout
  • SWR-DIAG-014: RTC Watchdog with 30-second timeout

Document Status: Complete
Next Review: Before implementation