cleanup sw req

2026-02-01 19:47:53 +01:00
parent 0bdbcb1657
commit 304371c6b8
608 changed files with 47798 additions and 0 deletions

@@ -0,0 +1,561 @@
# Feature Specification: Communication
# Feature ID: F-COM (F-COM-001 to F-COM-005)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Communication
## 1. Feature Overview
### 1.1 Feature Purpose
The Communication feature provides comprehensive data exchange capabilities between the ASF Sensor Hub and external entities including the Main Hub and peer Sensor Hubs. This feature ensures reliable, secure, and deterministic transfer of sensor data, diagnostics, configuration updates, and control commands.
### 1.2 Feature Scope
**In Scope:**
- Bidirectional communication with Main Hub via MQTT over TLS 1.2
- On-demand data broadcasting and request/response handling
- Peer-to-peer communication between Sensor Hubs via ESP-NOW
- Long-range fallback communication options (LoRa/Cellular)
- Communication protocol management and error handling
**Out of Scope:**
- Main Hub broker implementation and configuration
- Cloud communication protocols and interfaces
- Internet connectivity and routing management
- Physical network infrastructure design
## 2. Sub-Features
### 2.1 F-COM-001: Main Hub Communication
**Description:** Primary bidirectional communication channel with the Main Hub using MQTT over TLS 1.2 for secure and reliable data exchange.
**Protocol Stack:**
```mermaid
graph TB
subgraph "Communication Protocol Stack"
APP[Application Layer - CBOR Messages]
MQTT[MQTT Layer - QoS 1, Topics, Keepalive]
TLS[TLS 1.2 Layer - mTLS, X.509 Certificates]
TCP[TCP Layer - Reliable Transport]
IP[IP Layer - Network Routing]
WIFI[Wi-Fi 802.11n - 2.4 GHz Physical Layer]
end
APP --> MQTT
MQTT --> TLS
TLS --> TCP
TCP --> IP
IP --> WIFI
```
**MQTT Configuration:**
- **Broker:** Main Hub / Edge Gateway
- **QoS Level:** QoS 1 (At least once delivery)
- **Keepalive:** 60 seconds with 30-second timeout
- **Max Message Size:** 8KB per message
- **Payload Format:** CBOR (Concise Binary Object Representation)
**Topic Structure:**
```
/farm/{site_id}/{house_id}/{node_id}/data/{sensor_type}
/farm/{site_id}/{house_id}/{node_id}/status/heartbeat
/farm/{site_id}/{house_id}/{node_id}/status/system
/farm/{site_id}/{house_id}/{node_id}/cmd/{command_type}
/farm/{site_id}/{house_id}/{node_id}/diag/{severity_level}
/farm/{site_id}/{house_id}/{node_id}/ota/{action}
```
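A topic following this structure could be assembled as in the sketch below. The helper name `build_data_topic` and the rule that identifiers may not contain `/`, `+`, or `#` are assumptions made for illustration; the specification does not define either.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* True when `segment` is a usable topic level: non-empty and free of the
 * MQTT level separator '/' and the wildcards '+' and '#'. */
static int is_valid_topic_level(const char *segment)
{
    return segment != NULL && segment[0] != '\0' &&
           strpbrk(segment, "/+#") == NULL;
}

/* Hypothetical helper: writes "/farm/{site}/{house}/{node}/data/{sensor}"
 * into `out`. Returns the topic length, or -1 if a level is invalid or the
 * buffer is too small (the topic is never silently truncated). */
int build_data_topic(char *out, size_t out_size,
                     const char *site_id, const char *house_id,
                     const char *node_id, const char *sensor_type)
{
    assert(out != NULL && out_size > 0);
    if (!is_valid_topic_level(site_id) || !is_valid_topic_level(house_id) ||
        !is_valid_topic_level(node_id) || !is_valid_topic_level(sensor_type)) {
        return -1;
    }
    int n = snprintf(out, out_size, "/farm/%s/%s/%s/data/%s",
                     site_id, house_id, node_id, sensor_type);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

Rejecting wildcard characters keeps a malformed identifier from turning a publish topic into something a subscriber filter would interpret differently.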
### 2.2 F-COM-002: On-Demand Data Broadcasting
**Description:** Real-time data request/response mechanism allowing the Main Hub to query current sensor data without waiting for periodic updates.
**Request/Response Flow:**
```mermaid
sequenceDiagram
participant MH as Main Hub
participant API as Main Hub APIs
participant DP as Data Pool
participant SM as Sensor Manager
Note over MH,SM: On-Demand Data Request
MH->>API: REQUEST_SENSOR_DATA(sensor_ids)
API->>DP: getLatestSensorData(sensor_ids)
DP-->>API: sensor_data_records
alt Data available and fresh
API->>API: formatCBORResponse(data)
API->>MH: SENSOR_DATA_RESPONSE(data)
else Data stale or unavailable
API->>SM: requestImmediateSample(sensor_ids)
SM->>SM: performSampling()
SM->>DP: updateSensorData(fresh_data)
DP-->>API: fresh_sensor_data
API->>MH: SENSOR_DATA_RESPONSE(fresh_data)
end
Note over MH,SM: Response time < 100ms
```
**Response Characteristics:**
- **Maximum Response Time:** 100ms from request to response
- **Data Freshness:** Timestamp included with all data
- **Validity Status:** Data quality indicators included
- **Batch Support:** Multiple sensors in single response
### 2.3 F-COM-003: Peer Sensor Hub Communication
**Description:** Limited peer-to-peer communication between Sensor Hubs using ESP-NOW for coordination and status exchange.
**ESP-NOW Configuration:**
- **Protocol:** ESP-NOW (IEEE 802.11 vendor-specific)
- **Range:** ~200m line-of-sight, ~50m through walls
- **Security:** Application-layer AES-128 encryption
- **Max Peers:** 20 concurrent peer connections
- **Acknowledgment:** Application-layer retry mechanism
**Peer Message Types:**
```c
typedef enum {
    PEER_MSG_PING = 0x01,           // Connectivity check
    PEER_MSG_PONG = 0x02,           // Connectivity response
    PEER_MSG_TIME_SYNC_REQ = 0x03,  // Time synchronization request
    PEER_MSG_TIME_SYNC_RESP = 0x04, // Time synchronization response
    PEER_MSG_STATUS_UPDATE = 0x05,  // Status information exchange
    PEER_MSG_EMERGENCY = 0x06       // Emergency notification
} peer_message_type_t;

typedef struct {
    uint8_t message_type;           // One of peer_message_type_t
    uint8_t source_id[6];           // MAC address
    uint8_t sequence_number;
    uint16_t payload_length;
    // 10 header bytes + 1 checksum byte = 11 bytes of overhead, so the
    // whole frame stays within ESP_NOW_MAX_DATA_LEN
    uint8_t payload[ESP_NOW_MAX_DATA_LEN - 11];
    uint8_t checksum;
} peer_message_t;
```
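The structure does not say how `checksum` is computed or how the fields are ordered on the wire. The sketch below is one possible reading, assuming field-by-field serialization with a little-endian length, a simple XOR checksum over all preceding bytes, and the 250-byte ESP-NOW frame limit. All function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PEER_FRAME_MAX_LEN   250u  /* assumed equal to ESP_NOW_MAX_DATA_LEN */
#define PEER_HEADER_LEN      10u   /* type + MAC + sequence + 16-bit length */
#define PEER_MAX_PAYLOAD_LEN (PEER_FRAME_MAX_LEN - PEER_HEADER_LEN - 1u)

/* XOR of all bytes; an assumed placeholder, not the specified algorithm. */
static uint8_t xor_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum ^= data[i];
    }
    return sum;
}

/* Serializes one frame into `out`; returns bytes written, or 0 on error. */
size_t peer_frame_encode(uint8_t *out, size_t out_size,
                         uint8_t message_type, const uint8_t source_id[6],
                         uint8_t sequence_number,
                         const uint8_t *payload, uint16_t payload_length)
{
    assert(out != NULL && source_id != NULL);
    size_t total = PEER_HEADER_LEN + (size_t)payload_length + 1u;
    if (payload_length > PEER_MAX_PAYLOAD_LEN || total > out_size ||
        (payload == NULL && payload_length > 0)) {
        return 0;
    }
    out[0] = message_type;
    memcpy(&out[1], source_id, 6);
    out[7] = sequence_number;
    out[8] = (uint8_t)(payload_length & 0xFFu);  /* little-endian length */
    out[9] = (uint8_t)(payload_length >> 8);
    if (payload_length > 0) {
        memcpy(&out[PEER_HEADER_LEN], payload, payload_length);
    }
    out[total - 1u] = xor_checksum(out, total - 1u);
    return total;
}

/* A frame is intact when the XOR over all bytes, checksum included, is 0. */
int peer_frame_is_intact(const uint8_t *frame, size_t len)
{
    return frame != NULL && len > PEER_HEADER_LEN &&
           xor_checksum(frame, len) == 0;
}
```

An XOR checksum only catches simple corruption. Since the section already calls for application-layer AES-128, an authenticated mode would give stronger integrity protection, but that choice is outside what the specification states.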
### 2.4 F-COM-004: Heartbeat and Status Reporting
**Description:** Continuous system health and status reporting to maintain connection awareness and system monitoring.
**Heartbeat Message Structure:**
```c
typedef struct {
uint32_t uptime_seconds;
char firmware_version[16];
uint32_t free_heap_bytes;
int8_t wifi_rssi_dbm;
uint32_t error_bitmap;
system_state_t current_state;
uint8_t sensor_count_active;
uint8_t sensor_count_total;
uint32_t last_data_timestamp;
uint16_t communication_errors;
} heartbeat_payload_t;
```
**Status Reporting Schedule:**
- **Heartbeat Interval:** 10 seconds (configurable)
- **Status Update:** On state changes (immediate)
- **Error Reporting:** On fault detection (immediate)
- **Performance Metrics:** Every 5 minutes
### 2.5 F-COM-005: Long-Range Fallback Communication
**Description:** Optional long-range communication capability for farm-scale distances where Wi-Fi coverage is insufficient.
**Fallback Options:**
1. **LoRa Module (Optional):**
- External LoRa transceiver (SX1276/SX1262)
- LoRaWAN or proprietary protocol
- Use cases: Emergency alerts, basic status
- Data rate: Low (not suitable for OTA updates)
2. **Cellular Module (Alternative):**
- LTE-M or NB-IoT modem
- Higher data rate than LoRa
- Suitable for OTA updates
- Higher power consumption and cost
**Fallback Activation Logic:**
```mermaid
graph TD
START[Communication Start] --> WIFI{Wi-Fi Available?}
WIFI -->|Yes| CONNECT[Connect to Wi-Fi]
WIFI -->|No| FALLBACK{Fallback Enabled?}
CONNECT --> MQTT{MQTT Connected?}
MQTT -->|Yes| NORMAL[Normal Operation]
MQTT -->|No| RETRY[Retry Connection]
RETRY --> TIMEOUT{Timeout Exceeded?}
TIMEOUT -->|No| MQTT
TIMEOUT -->|Yes| FALLBACK
FALLBACK -->|Yes| LORA[Activate LoRa/Cellular]
FALLBACK -->|No| OFFLINE[Offline Mode]
LORA --> LIMITED[Limited Communication]
OFFLINE --> STORE[Store Data Locally]
NORMAL --> MONITOR[Monitor Connection]
LIMITED --> MONITOR
MONITOR --> WIFI
```
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-COM-001** | SR-COM-001, SR-COM-002, SR-COM-003 | MQTT over TLS communication with Main Hub |
| **F-COM-002** | SR-COM-004, SR-COM-005 | On-demand data requests and responses |
| **F-COM-003** | SR-COM-006, SR-COM-007 | ESP-NOW peer communication |
| **F-COM-004** | SR-COM-008, SR-COM-009 | Heartbeat and status reporting |
| **F-COM-005** | SR-COM-010, SR-COM-011 | Long-range fallback communication |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-COM-001** | SWR-COM-001, SWR-COM-002, SWR-COM-003 | MQTT client, TLS implementation, topic management |
| **F-COM-002** | SWR-COM-004, SWR-COM-005, SWR-COM-006 | Request handling, data formatting, response timing |
| **F-COM-003** | SWR-COM-007, SWR-COM-008, SWR-COM-009 | ESP-NOW driver, peer management, encryption |
| **F-COM-004** | SWR-COM-010, SWR-COM-011, SWR-COM-012 | Status collection, heartbeat scheduling, error reporting |
| **F-COM-005** | SWR-COM-013, SWR-COM-014, SWR-COM-015 | Fallback protocols, activation logic, data prioritization |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **Main Hub APIs** | MQTT communication, message handling, protocol management | `application_layer/business_stack/main_hub_apis/` |
| **Network Stack** | Wi-Fi management, TCP/IP, TLS implementation | `drivers/network_stack/` |
| **Peer Communication Manager** | ESP-NOW management, peer coordination | `application_layer/peer_comm/` |
| **Communication Controller** | Protocol coordination, fallback management | `application_layer/comm_controller/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **Event System** | Message routing, status notifications | `application_layer/business_stack/event_system/` |
| **Data Pool** | Latest sensor data access | `application_layer/DP_stack/data_pool/` |
| **Security Manager** | Certificate management, encryption | `application_layer/security/` |
| **Diagnostics Task** | Communication error logging | `application_layer/diag_task/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "Communication Feature"
API[Main Hub APIs]
NS[Network Stack]
PCM[Peer Comm Manager]
CC[Communication Controller]
end
subgraph "Core System"
ES[Event System]
DP[Data Pool]
SEC[Security Manager]
DIAG[Diagnostics Task]
end
subgraph "Hardware Interfaces"
WIFI[Wi-Fi Radio]
ESPNOW[ESP-NOW Interface]
LORA[LoRa Module]
CELL[Cellular Module]
end
subgraph "External"
MAINHUB[Main Hub]
PEERS[Peer Hubs]
end
API <--> NS
API <--> ES
API <--> DP
API <--> SEC
PCM <--> ESPNOW
PCM <--> ES
PCM <--> SEC
CC --> API
CC --> PCM
CC --> NS
NS --> WIFI
NS --> SEC
NS --> DIAG
WIFI <--> MAINHUB
ESPNOW <--> PEERS
LORA -.-> MAINHUB
CELL -.-> MAINHUB
ES -.->|Status Events| API
DP -.->|Sensor Data| API
DIAG -.->|Error Events| API
```
### 4.4 Communication Flow Sequence
```mermaid
sequenceDiagram
participant SM as Sensor Manager
participant ES as Event System
participant API as Main Hub APIs
participant NS as Network Stack
participant MH as Main Hub
Note over SM,MH: Sensor Data Communication Flow
SM->>ES: publish(SENSOR_DATA_UPDATE, data)
ES->>API: sensorDataEvent(data)
API->>API: formatMQTTMessage(data)
API->>NS: publishMQTT(topic, payload)
NS->>NS: encryptTLS(payload)
NS->>MH: MQTT_PUBLISH(encrypted_data)
MH-->>NS: MQTT_PUBACK
NS-->>API: publishComplete()
alt Communication Error
NS->>DIAG: logCommError(error_details)
NS->>ES: publish(COMM_ERROR, error)
ES->>API: commErrorEvent(error)
API->>API: handleCommError()
end
Note over SM,MH: Heartbeat Flow
loop Every 10 seconds
API->>API: collectSystemStatus()
API->>NS: publishHeartbeat(status)
NS->>MH: MQTT_PUBLISH(heartbeat)
end
```
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **Connection Establishment:**
- Initialize Wi-Fi connection with configured credentials
- Establish TLS session with Main Hub broker
- Authenticate using device certificate (mTLS)
- Subscribe to command and configuration topics
2. **Data Communication:**
- Publish sensor data on acquisition completion
- Send heartbeat messages at regular intervals
- Handle on-demand data requests from Main Hub
- Process configuration and command messages
3. **Peer Communication:**
- Maintain ESP-NOW peer list and connections
- Exchange status information with nearby hubs
- Coordinate time synchronization when needed
- Handle emergency notifications from peers
4. **Error Recovery:**
- Detect communication failures and timeouts
- Implement exponential backoff for reconnection
- Switch to fallback communication if available
- Store data locally during communication outages
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **Wi-Fi Disconnection** | Link status monitoring | Attempt reconnection, activate fallback |
| **MQTT Broker Unreachable** | Connection timeout | Retry with backoff, store data locally |
| **TLS Certificate Error** | Certificate validation failure | Log security event, request new certificate |
| **Message Timeout** | Acknowledgment timeout | Retry message, escalate if persistent |
| **Peer Communication Failure** | ESP-NOW transmission failure | Remove peer, attempt rediscovery |
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Establish connections, authenticate, subscribe to topics |
| **RUNNING** | Full communication functionality, all protocols active |
| **WARNING** | Continue communication, increase error reporting |
| **FAULT** | Emergency communication only, diagnostic data priority |
| **OTA_UPDATE** | OTA-specific communication, suspend normal data flow |
| **TEARDOWN** | Send final status, gracefully close connections |
| **SERVICE** | Engineering communication enabled, diagnostic access |
| **SD_DEGRADED** | Continue communication, no local data buffering |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Connection Establishment:** Maximum 30 seconds for initial connection
- **Message Transmission:** Maximum 5 seconds for MQTT publish
- **On-Demand Response:** Maximum 100ms from request to response
- **Heartbeat Interval:** 10 seconds ±1 second tolerance
### 6.2 Resource Constraints
- **Memory Usage:** Maximum 64KB for communication buffers
- **Bandwidth Usage:** Maximum 1 Mbps average, 5 Mbps peak
- **Connection Limit:** 1 Main Hub + 20 peer connections maximum
- **Message Queue:** Maximum 100 pending messages
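The 100-message limit suggests a fixed-capacity queue. The sketch below is a statically allocated ring buffer that rejects new entries when full. Both the overflow policy and the use of 16-bit message handles are assumptions, and the `msg_queue_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_QUEUE_CAPACITY 100u  /* maximum pending messages */

typedef struct {
    uint16_t ids[MSG_QUEUE_CAPACITY];  /* hypothetical message handles */
    size_t head;                       /* index of the oldest entry */
    size_t count;                      /* number of queued entries */
} msg_queue_t;

void msg_queue_init(msg_queue_t *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Returns false when the queue is full; the caller decides what to drop. */
bool msg_queue_push(msg_queue_t *q, uint16_t id)
{
    assert(q != NULL);
    if (q->count == MSG_QUEUE_CAPACITY) {
        return false;
    }
    q->ids[(q->head + q->count) % MSG_QUEUE_CAPACITY] = id;
    q->count++;
    return true;
}

/* Removes the oldest entry into *id; returns false when the queue is empty. */
bool msg_queue_pop(msg_queue_t *q, uint16_t *id)
{
    assert(q != NULL && id != NULL);
    if (q->count == 0) {
        return false;
    }
    *id = q->ids[q->head];
    q->head = (q->head + 1u) % MSG_QUEUE_CAPACITY;
    q->count--;
    return true;
}
```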
### 6.3 Security Constraints
- **Encryption:** All Main Hub communication must use TLS 1.2 or higher; ESP-NOW peer traffic relies on application-layer AES-128 encryption (see 2.3)
- **Authentication:** Mutual TLS required for Main Hub communication
- **Certificate Validation:** Full certificate chain validation required
- **Key Management:** Automatic key rotation support required
## 7. Interface Specifications
### 7.1 Main Hub APIs Public Interface
```c
// Connection management
bool mainHubAPI_initialize(const comm_config_t* config);
bool mainHubAPI_connect(void);
bool mainHubAPI_disconnect(void);
bool mainHubAPI_isConnected(void);
// Message publishing
bool mainHubAPI_publishSensorData(const sensor_data_record_t* data);
bool mainHubAPI_publishHeartbeat(const heartbeat_payload_t* heartbeat);
bool mainHubAPI_publishDiagnostic(const diagnostic_event_t* event);
bool mainHubAPI_publishStatus(const system_status_t* status);
// Message handling
bool mainHubAPI_subscribeToCommands(command_handler_t handler);
bool mainHubAPI_subscribeToConfig(config_handler_t handler);
bool mainHubAPI_handleOnDemandRequest(const data_request_t* request);
// Status and statistics
comm_status_t mainHubAPI_getConnectionStatus(void);
comm_stats_t mainHubAPI_getStatistics(void);
bool mainHubAPI_resetStatistics(void);
```
### 7.2 Peer Communication Manager API
```c
// Peer management
bool peerComm_initialize(void);
bool peerComm_addPeer(const uint8_t* mac_address);
bool peerComm_removePeer(const uint8_t* mac_address);
bool peerComm_getPeerList(peer_info_t* peers, size_t* count);
// Message transmission
bool peerComm_sendPing(const uint8_t* peer_mac);
bool peerComm_sendTimeSync(const uint8_t* peer_mac, uint64_t timestamp);
bool peerComm_sendStatus(const uint8_t* peer_mac, const peer_status_t* status);
bool peerComm_broadcastEmergency(const emergency_msg_t* emergency);
// Message reception
bool peerComm_registerMessageHandler(peer_message_handler_t handler);
// Security
bool peerComm_setEncryptionKey(const uint8_t* key, size_t key_length);
```
### 7.3 Network Stack Interface
```c
// Network management
bool networkStack_initialize(void);
bool networkStack_connectWiFi(const wifi_config_t* config);
bool networkStack_disconnectWiFi(void);
wifi_status_t networkStack_getWiFiStatus(void);
// MQTT operations
bool networkStack_connectMQTT(const mqtt_config_t* config);
bool networkStack_publishMQTT(const char* topic, const uint8_t* payload, size_t length);
bool networkStack_subscribeMQTT(const char* topic, mqtt_message_handler_t handler);
bool networkStack_disconnectMQTT(void);
// TLS management
bool networkStack_loadCertificate(const uint8_t* cert, size_t cert_length);
bool networkStack_loadPrivateKey(const uint8_t* key, size_t key_length);
bool networkStack_validateCertificate(const uint8_t* cert);
```
## 8. Testing and Validation
### 8.1 Unit Testing
- **Protocol Implementation:** MQTT, TLS, ESP-NOW protocol compliance
- **Message Formatting:** CBOR encoding/decoding validation
- **Error Handling:** Network failure and recovery scenarios
- **Security:** Certificate validation and encryption testing
### 8.2 Integration Testing
- **Main Hub Communication:** End-to-end MQTT communication testing
- **Peer Communication:** ESP-NOW multi-device testing
- **Fallback Systems:** LoRa/Cellular fallback activation
- **Event Integration:** Communication event publication and handling
### 8.3 System Testing
- **Load Testing:** High-frequency data transmission under load
- **Reliability Testing:** 48-hour continuous communication
- **Security Testing:** Penetration testing and certificate validation
- **Interoperability:** Communication with actual Main Hub systems
### 8.4 Acceptance Criteria
- Successful connection establishment within timing constraints
- 99.9% message delivery success rate under normal conditions
- On-demand responses within 100ms requirement
- Secure communication with proper certificate validation
- Graceful handling of all communication error conditions
- Peer communication functional with multiple concurrent peers
## 9. Dependencies
### 9.1 Internal Dependencies
- **Event System:** Message routing and status notifications
- **Data Pool:** Access to latest sensor data for transmission
- **Security Manager:** Certificate management and encryption
- **State Manager:** System state awareness for communication control
### 9.2 External Dependencies
- **ESP-IDF Framework:** Wi-Fi, TCP/IP, TLS, ESP-NOW drivers
- **Main Hub Broker:** MQTT broker availability and configuration
- **Network Infrastructure:** Wi-Fi access points and internet connectivity
- **Certificate Authority:** X.509 certificates for device authentication
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Adaptive QoS:** Dynamic quality of service based on network conditions
- **Mesh Networking:** Sensor Hub mesh for extended coverage
- **Edge Computing:** Local data processing and filtering
- **5G Integration:** 5G connectivity for high-bandwidth applications
### 10.2 Scalability Considerations
- **Protocol Optimization:** Compressed protocols for bandwidth efficiency
- **Load Balancing:** Multiple Main Hub connections for redundancy
- **Cloud Integration:** Direct cloud connectivity that bypasses the Main Hub
- **IoT Platform Integration:** Standard IoT platform protocol support
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-COM, SWR-COM)
**Next Review:** After component implementation

@@ -0,0 +1,445 @@
# Feature Specification: Sensor Data Acquisition
# Feature ID: F-DAQ (F-DAQ-001 to F-DAQ-005)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Sensor Data Acquisition
## 1. Feature Overview
### 1.1 Feature Purpose
The Sensor Data Acquisition feature provides comprehensive environmental sensor data collection, processing, and preparation capabilities for the ASF Sensor Hub. This feature ensures reliable, high-quality sensor data is available for persistence, communication, and system monitoring.
### 1.2 Feature Scope
**In Scope:**
- Multi-sensor data acquisition from 7 sensor types
- High-frequency sampling with configurable parameters
- Local data filtering and noise reduction
- Timestamped data record generation
- Sensor state management and lifecycle control
**Out of Scope:**
- Sensor hardware design and manufacturing
- Main Hub data processing and analytics
- Control algorithm implementation
- Actuator management
## 2. Sub-Features
### 2.1 F-DAQ-001: Multi-Sensor Data Acquisition
**Description:** Simultaneous data acquisition from multiple heterogeneous environmental sensors.
**Supported Sensor Types:**
- Temperature sensors (I2C/Analog)
- Humidity sensors (I2C)
- Carbon Dioxide (CO₂) sensors (UART/I2C)
- Ammonia (NH₃) sensors (Analog/I2C)
- Volatile Organic Compounds (VOC) sensors (I2C)
- Particulate Matter (PM) sensors (UART/I2C)
- Light Intensity sensors (Analog/I2C)
**Key Capabilities:**
- Concurrent sensor handling without blocking
- Modular sensor driver architecture
- Runtime sensor presence awareness
- Per-sensor enable/disable control
### 2.2 F-DAQ-002: High-Frequency Sampling
**Description:** Multiple raw readings per acquisition cycle with configurable sampling parameters.
**Sampling Characteristics:**
- Default: 10 samples per sensor per cycle
- Configurable sampling count (5-20 samples)
- Bounded sampling time window (max 800ms)
- Deterministic sampling intervals
**Benefits:**
- Noise reduction through oversampling
- Statistical confidence in measurements
- Outlier detection capability
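The sampling parameters can be turned into a per-sample interval as sketched below. Spreading the samples evenly across the full 800 ms window, and treating a configured count of 0 as "use the default", are assumptions made for illustration. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_COUNT_MIN     5u
#define SAMPLE_COUNT_MAX     20u
#define SAMPLE_COUNT_DEFAULT 10u
#define SAMPLING_WINDOW_MS   800u

/* Clamps a configured count into the supported 5-20 range;
 * 0 is treated here as "not configured" and maps to the default. */
uint8_t effective_sample_count(uint8_t configured)
{
    if (configured == 0u) {
        return SAMPLE_COUNT_DEFAULT;
    }
    if (configured < SAMPLE_COUNT_MIN) {
        return SAMPLE_COUNT_MIN;
    }
    if (configured > SAMPLE_COUNT_MAX) {
        return SAMPLE_COUNT_MAX;
    }
    return configured;
}

/* Spacing between sample starts so that `count` samples fit the window. */
uint32_t sample_interval_ms(uint8_t count)
{
    assert(count >= SAMPLE_COUNT_MIN && count <= SAMPLE_COUNT_MAX);
    return SAMPLING_WINDOW_MS / count;
}
```

With the default of 10 samples this gives one sample every 80 ms.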
### 2.3 F-DAQ-003: Local Data Filtering
**Description:** Configurable filtering algorithms applied to raw sensor samples.
**Available Filters:**
- **Median Filter:** Removes outliers and impulse noise
- **Moving Average:** Smooths data and reduces random noise
- **Rate-of-Change Limiter:** Prevents unrealistic value jumps
**Filter Configuration:**
- Filter type selectable per sensor
- Filter parameters configurable via Machine Constants
- Real-time filter switching capability
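The three filters can each be expressed in a few lines. The sketch below assumes each filter operates on the raw samples of one acquisition cycle (at most 20), in which case the moving average reduces to the arithmetic mean of that window. The function names and the `max_step` parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLES 20u  /* upper bound of the configurable sampling count */

/* Median: sorts a copy of the samples and returns the middle value. */
float filter_median(const float *samples, size_t n)
{
    assert(samples != NULL && n > 0 && n <= MAX_SAMPLES);
    float sorted[MAX_SAMPLES];
    for (size_t i = 0; i < n; i++) {  /* insertion sort into the copy */
        float v = samples[i];
        size_t j = i;
        while (j > 0 && sorted[j - 1] > v) {
            sorted[j] = sorted[j - 1];
            j--;
        }
        sorted[j] = v;
    }
    return (n % 2u) ? sorted[n / 2u]
                    : 0.5f * (sorted[n / 2u - 1u] + sorted[n / 2u]);
}

/* Average over one window of samples. */
float filter_mean(const float *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return sum / (float)n;
}

/* Rate-of-change limiter: moves from `previous` toward `candidate`
 * by at most `max_step` per cycle. */
float filter_rate_limit(float previous, float candidate, float max_step)
{
    assert(max_step >= 0.0f);
    float delta = candidate - previous;
    if (delta > max_step) {
        return previous + max_step;
    }
    if (delta < -max_step) {
        return previous - max_step;
    }
    return candidate;
}
```

For the samples `{20.0, 21.0, 95.0, 20.5, 19.5}` the median is 20.5, so the single 95.0 spike has no effect, whereas the mean would be pulled up to 35.2.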
### 2.4 F-DAQ-004: Timestamped Data Generation
**Description:** Association of processed sensor values with accurate timestamps.
**Timestamp Characteristics:**
- Generated after filtering completion
- System time based (RTC or synchronized)
- Accuracy: ±1 second
- ISO 8601 format for persistence
**Data Record Structure:**
```c
typedef struct {
uint8_t sensor_id;
sensor_type_t sensor_type;
float filtered_value;
char unit[8];
uint64_t timestamp_ms;
data_validity_t validity;
uint16_t sample_count;
float raw_min, raw_max;
} sensor_data_record_t;
```
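The record stores `timestamp_ms` as an integer, while persistence uses ISO 8601. A conversion could look like the sketch below, assuming `timestamp_ms` counts milliseconds since the Unix epoch in UTC (the specification does not state the epoch). `format_iso8601_utc` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Formats an epoch-millisecond timestamp as "YYYY-MM-DDTHH:MM:SS.mmmZ".
 * Returns the string length (24), or -1 if the buffer is too small or the
 * time cannot be converted. */
int format_iso8601_utc(uint64_t timestamp_ms, char *out, size_t out_size)
{
    assert(out != NULL);
    time_t seconds = (time_t)(timestamp_ms / 1000u);
    /* gmtime() uses a shared static buffer; prefer gmtime_r() where
     * available if several tasks format timestamps concurrently. */
    const struct tm *utc = gmtime(&seconds);
    if (utc == NULL) {
        return -1;
    }
    size_t len = strftime(out, out_size, "%Y-%m-%dT%H:%M:%S", utc);
    if (len == 0) {
        return -1;  /* date and time did not fit */
    }
    int n = snprintf(out + len, out_size - len, ".%03uZ",
                     (unsigned)(timestamp_ms % 1000u));
    if (n < 0 || (size_t)n >= out_size - len) {
        return -1;  /* fractional part did not fit */
    }
    return (int)len + n;
}
```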
### 2.5 F-DAQ-005: Sensor State Management
**Description:** Comprehensive sensor lifecycle and state management.
**Sensor States:**
- **UNKNOWN:** Initial state, not yet detected
- **DETECTED:** Sensor presence confirmed
- **INITIALIZED:** Driver loaded and configured
- **ENABLED:** Active data acquisition
- **DISABLED:** Present but not acquiring data
- **FAULTY:** Detected failure condition
- **REMOVED:** Previously present, now absent
**State Transitions:**
```mermaid
stateDiagram-v2
[*] --> UNKNOWN
UNKNOWN --> DETECTED : Presence detected
DETECTED --> INITIALIZED : Driver loaded
INITIALIZED --> ENABLED : Acquisition started
ENABLED --> DISABLED : Manual disable
DISABLED --> ENABLED : Manual enable
ENABLED --> FAULTY : Failure detected
FAULTY --> ENABLED : Recovery successful
DETECTED --> REMOVED : Presence lost
INITIALIZED --> REMOVED : Presence lost
ENABLED --> REMOVED : Presence lost
DISABLED --> REMOVED : Presence lost
FAULTY --> REMOVED : Presence lost
REMOVED --> DETECTED : Presence restored
```
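The transitions in the diagram can be checked with a small lookup table. In this sketch each state holds a bitmask of the states it may move to. The enumerator names and `sensor_transition_allowed` are hypothetical, and the table mirrors only the arrows shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    SENSOR_UNKNOWN,
    SENSOR_DETECTED,
    SENSOR_INITIALIZED,
    SENSOR_ENABLED,
    SENSOR_DISABLED,
    SENSOR_FAULTY,
    SENSOR_REMOVED,
    SENSOR_STATE_COUNT
} sensor_state_t;

#define BIT(s) (1u << (s))

/* allowed_next[from] has bit `to` set when from -> to is in the diagram. */
static const uint8_t allowed_next[SENSOR_STATE_COUNT] = {
    [SENSOR_UNKNOWN]     = BIT(SENSOR_DETECTED),
    [SENSOR_DETECTED]    = BIT(SENSOR_INITIALIZED) | BIT(SENSOR_REMOVED),
    [SENSOR_INITIALIZED] = BIT(SENSOR_ENABLED) | BIT(SENSOR_REMOVED),
    [SENSOR_ENABLED]     = BIT(SENSOR_DISABLED) | BIT(SENSOR_FAULTY) |
                           BIT(SENSOR_REMOVED),
    [SENSOR_DISABLED]    = BIT(SENSOR_ENABLED) | BIT(SENSOR_REMOVED),
    [SENSOR_FAULTY]      = BIT(SENSOR_ENABLED) | BIT(SENSOR_REMOVED),
    [SENSOR_REMOVED]     = BIT(SENSOR_DETECTED),
};

bool sensor_transition_allowed(sensor_state_t from, sensor_state_t to)
{
    assert(from < SENSOR_STATE_COUNT && to < SENSOR_STATE_COUNT);
    return (allowed_next[from] & BIT(to)) != 0u;
}
```

Keeping the rules in one table makes it straightforward to unit-test every pair of states against the diagram.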
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-DAQ-001** | SR-DAQ-001 | Multi-sensor support for 7 sensor types |
| **F-DAQ-002** | SR-DAQ-002 | High-frequency sampling (default 10 samples/cycle, configurable 5-20) |
| **F-DAQ-003** | SR-DAQ-003 | Local data filtering with configurable algorithms |
| **F-DAQ-004** | SR-DAQ-004 | Timestamped data generation (±1 second accuracy) |
| **F-DAQ-005** | SR-DAQ-005 | Sensor state management and lifecycle control |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-DAQ-001** | SWR-DAQ-001, SWR-DAQ-002, SWR-DAQ-003 | Sensor driver abstraction, type enumeration, concurrent handling |
| **F-DAQ-002** | SWR-DAQ-004, SWR-DAQ-005, SWR-DAQ-006 | Configurable sampling, time windows, buffer management |
| **F-DAQ-003** | SWR-DAQ-007, SWR-DAQ-008, SWR-DAQ-009 | Median filter, moving average, filter selection |
| **F-DAQ-004** | SWR-DAQ-010, SWR-DAQ-011, SWR-DAQ-012 | Time interface, timestamp API, data record structure |
| **F-DAQ-005** | SWR-DAQ-013, SWR-DAQ-014, SWR-DAQ-015 | State enumeration, transition logic, persistence interface |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **Sensor Manager** | Acquisition coordination, filtering, state management | `application_layer/business_stack/sensor_manager/` |
| **Sensor Drivers** | Hardware interface, raw data acquisition | `drivers/sensor_drivers/` |
| **Event System** | Data publication, component coordination | `application_layer/business_stack/event_system/` |
| **Data Pool** | Latest sensor data storage | `application_layer/DP_stack/data_pool/` |
### 4.2 Supporting Components
| Component | Support Role | Location |
|-----------|-------------|-----------|
| **Time Utils** | Timestamp generation | `utils/time_utils/` |
| **Logger** | Debug and diagnostic logging | `utils/logger/` |
| **Machine Constant Manager** | Filter configuration, sensor parameters | `application_layer/business_stack/mc_manager/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "Sensor Data Acquisition Feature"
SM[Sensor Manager]
SD[Sensor Drivers]
ES[Event System]
DP[Data Pool]
TU[Time Utils]
MCM[MC Manager]
end
subgraph "External Interfaces"
SENSORS[Physical Sensors]
PERSIST[Persistence]
COMM[Communication]
end
SENSORS -->|I2C/SPI/UART/ADC| SD
SD -->|Raw Samples| SM
MCM -->|Filter Config| SM
TU -->|Timestamp| SM
SM -->|Filtered Data| ES
ES -->|Data Update Event| DP
ES -->|Data Update Event| PERSIST
ES -->|Data Update Event| COMM
SM -.->|State Changes| ES
SM -.->|Diagnostics| ES
```
### 4.4 Data Flow Sequence
```mermaid
sequenceDiagram
participant Sensor as Physical Sensor
participant Driver as Sensor Driver
participant Manager as Sensor Manager
participant MCMgr as MC Manager
participant TimeUtil as Time Utils
participant EventSys as Event System
participant DataPool as Data Pool
Note over Sensor,DataPool: Acquisition Cycle (1 second)
Manager->>MCMgr: getSamplingConfig(sensor_id)
MCMgr-->>Manager: sampling_config
loop 10 samples
Manager->>Driver: readSensor(sensor_id)
Driver->>Sensor: I2C/SPI/UART read
Sensor-->>Driver: raw_value
Driver-->>Manager: raw_sample
end
Manager->>Manager: applyFilter(raw_samples)
Manager->>TimeUtil: getCurrentTimestamp()
TimeUtil-->>Manager: timestamp
Manager->>Manager: createDataRecord()
Manager->>EventSys: publish(SENSOR_DATA_UPDATE, record)
EventSys->>DataPool: updateSensorData(record)
Note over Manager,DataPool: Data available for persistence and communication
```
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **Initialization Phase:**
- Load sensor configuration from Machine Constants
- Initialize sensor drivers for detected sensors
- Configure sampling and filtering parameters
- Transition sensors to ENABLED state
2. **Acquisition Cycle (1 second):**
- For each enabled sensor:
- Perform high-frequency sampling (10 samples)
- Apply configured filter to raw samples
- Generate timestamp for filtered value
- Create sensor data record
- Publish data update event
3. **State Management:**
- Monitor sensor health during acquisition
- Detect and handle sensor failures
- Update sensor states based on conditions
- Report state changes via events
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **Sensor Communication Failure** | Timeout or invalid response | Mark sensor as FAULTY, continue with other sensors |
| **Out-of-Range Values** | Range validation | Mark data as invalid, log diagnostic event |
| **Sampling Timeout** | Bounded time window exceeded | Use partial samples, mark data quality degraded |
| **Filter Failure** | Error reported by filter algorithm | Use raw average, log diagnostic event |
| **Memory Allocation Failure** | Buffer allocation failure | Skip cycle, trigger system diagnostic |
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Initialize sensors, load configuration |
| **RUNNING** | Normal acquisition cycles |
| **WARNING** | Continue acquisition, increase diagnostic reporting |
| **FAULT** | Stop acquisition, preserve sensor states |
| **OTA_UPDATE** | Stop acquisition, save sensor states |
| **MC_UPDATE** | Stop acquisition, reload configuration after update |
| **TEARDOWN** | Stop acquisition, flush pending data |
| **SERVICE** | Limited acquisition for diagnostics |
| **SD_DEGRADED** | Continue acquisition, no persistence |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Acquisition Cycle:** Must complete within 1 second
- **Sampling Window:** Maximum 800ms per sensor
- **Filter Processing:** Maximum 50ms per sensor
- **Event Publication:** Maximum 10ms delay
### 6.2 Resource Constraints
- **Memory Usage:** Maximum 32KB for sensor data buffers
- **CPU Usage:** Maximum 20% of available CPU time
- **I/O Bandwidth:** Shared among all sensors, priority-based
### 6.3 Quality Constraints
- **Data Accuracy:** Within sensor specification limits
- **Timestamp Accuracy:** ±1 second of system time
- **Filter Effectiveness:** >90% noise reduction for median filter
- **State Consistency:** 100% accurate state representation
## 7. Interface Specifications
### 7.1 Sensor Manager Public API
```c
// Initialization and configuration
bool sensorMgr_initialize(void);
bool sensorMgr_loadConfiguration(const machine_constants_t* mc);
bool sensorMgr_detectSensors(void);
// Acquisition control
bool sensorMgr_startAcquisition(void);
bool sensorMgr_stopAcquisition(void);
bool sensorMgr_pauseAcquisition(void);
bool sensorMgr_resumeAcquisition(void);
// Sensor control
bool sensorMgr_enableSensor(uint8_t sensor_id);
bool sensorMgr_disableSensor(uint8_t sensor_id);
bool sensorMgr_configureSensor(uint8_t sensor_id, const sensor_config_t* config);
// Data access
bool sensorMgr_getLatestData(uint8_t sensor_id, sensor_data_record_t* record);
bool sensorMgr_getAllSensorData(sensor_data_record_t* records, size_t* count);
// State management
sensor_state_t sensorMgr_getSensorState(uint8_t sensor_id);
bool sensorMgr_isSensorPresent(uint8_t sensor_id);
bool sensorMgr_isSensorEnabled(uint8_t sensor_id);
bool sensorMgr_isSensorHealthy(uint8_t sensor_id);
// Statistics and diagnostics
bool sensorMgr_getSensorStatistics(uint8_t sensor_id, sensor_stats_t* stats);
bool sensorMgr_resetSensorStatistics(uint8_t sensor_id);
```
### 7.2 Event System Integration
**Published Events:**
- `EVENT_SENSOR_DATA_UPDATE`: New sensor data available
- `EVENT_SENSOR_STATE_CHANGED`: Sensor state transition
- `EVENT_SENSOR_FAULT_DETECTED`: Sensor failure detected
- `EVENT_SENSOR_RECOVERY`: Sensor recovered from fault
**Subscribed Events:**
- `EVENT_STATE_CHANGED`: System state transitions
- `EVENT_MC_UPDATED`: Machine constants updated
- `EVENT_TEARDOWN_INITIATED`: System teardown requested
### 7.3 Data Pool Integration
**Data Storage:**
- Latest sensor data records for all sensors
- Sensor state information
- Acquisition statistics and health metrics
**Access Patterns:**
- Write: Sensor Manager updates after each acquisition cycle
- Read: Communication and Persistence components access latest data
- Query: HMI and Diagnostics access for display and analysis
## 8. Testing and Validation
### 8.1 Unit Testing
- **Sensor Driver Interface:** Mock sensors for driver testing
- **Filtering Algorithms:** Known input/output validation
- **State Machine:** All state transitions and edge cases
- **Data Record Generation:** Structure and content validation
### 8.2 Integration Testing
- **Sensor Manager + Drivers:** Real sensor hardware testing
- **Event System Integration:** Event publication and subscription
- **Data Pool Integration:** Data storage and retrieval
- **State Management:** Cross-component state coordination
### 8.3 System Testing
- **Multi-Sensor Acquisition:** All 7 sensor types simultaneously
- **Performance Testing:** Timing constraints under load
- **Fault Injection:** Sensor failure scenarios
- **Long-Duration Testing:** 24-hour continuous operation
### 8.4 Acceptance Criteria
- All sensor types successfully detected and initialized
- Acquisition cycles complete within the 1-second constraint
- Filter algorithms reduce noise by >90%
- State transitions occur correctly under all conditions
- No memory leaks during continuous operation
- Graceful handling of all error conditions
## 9. Dependencies
### 9.1 Internal Dependencies
- **Machine Constant Manager:** Sensor configuration and parameters
- **Event System:** Inter-component communication
- **Data Pool:** Data storage and access
- **Time Utils:** Timestamp generation
- **Logger:** Debug and diagnostic output
### 9.2 External Dependencies
- **ESP-IDF Framework:** Hardware abstraction and drivers
- **FreeRTOS:** Task scheduling and timing
- **Hardware Sensors:** Physical sensor devices
- **System State Manager:** State-aware operation control
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Adaptive Filtering:** Machine learning-based filter optimization
- **Predictive Maintenance:** Sensor degradation prediction
- **Advanced Calibration:** Multi-point calibration support
- **Sensor Fusion:** Cross-sensor validation and fusion
### 10.2 Scalability Considerations
- **Additional Sensor Types:** Framework supports easy extension
- **Higher Sampling Rates:** Configurable for future requirements
- **Distributed Processing:** Support for sensor processing offload
- **Cloud Integration:** Direct sensor data streaming capability
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (45 SR, 122 SWR)
**Next Review:** After component implementation

@@ -0,0 +1,581 @@
# Feature Specification: Diagnostics & Health Monitoring
# Feature ID: F-DIAG (F-DIAG-001 to F-DIAG-004)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Diagnostics & Health Monitoring
## 1. Feature Overview
### 1.1 Feature Purpose
The Diagnostics & Health Monitoring feature provides comprehensive system health assessment, fault detection, diagnostic event management, and engineering access capabilities for the ASF Sensor Hub. This feature ensures system reliability through proactive monitoring, structured fault reporting, and maintenance support.
### 1.2 Feature Scope
**In Scope:**
- Structured diagnostic code framework with severity classification
- Persistent diagnostic event storage and management
- Engineering diagnostic sessions with secure access
- System health monitoring and performance metrics
- Cross-component fault correlation and root cause analysis
**Out of Scope:**
- Main Hub diagnostic aggregation and analysis
- Predictive maintenance algorithms (future enhancement)
- Hardware fault injection testing equipment
- Remote diagnostic access without Main Hub coordination
## 2. Sub-Features
### 2.1 F-DIAG-001: Diagnostic Code Management
**Description:** Comprehensive diagnostic code framework for standardized fault identification, classification, and reporting across all system components.
**Diagnostic Code Structure:**
```c
typedef enum {
    DIAG_SEVERITY_INFO = 0,     // Informational, no action required
    DIAG_SEVERITY_WARNING = 1,  // Warning, monitoring required
    DIAG_SEVERITY_ERROR = 2,    // Error, corrective action needed
    DIAG_SEVERITY_FATAL = 3     // Fatal, system functionality compromised
} diagnostic_severity_t;
typedef enum {
    DIAG_CATEGORY_SENSOR = 0,   // Sensor-related diagnostics
    DIAG_CATEGORY_COMM = 1,     // Communication diagnostics
    DIAG_CATEGORY_STORAGE = 2,  // Storage and persistence diagnostics
    DIAG_CATEGORY_SYSTEM = 3,   // System management diagnostics
    DIAG_CATEGORY_SECURITY = 4, // Security-related diagnostics
    DIAG_CATEGORY_POWER = 5,    // Power and fault handling diagnostics
    DIAG_CATEGORY_OTA = 6       // OTA update diagnostics
} diagnostic_category_t;
// Declared after the enums so that the field types are already known here
typedef struct {
    uint16_t code;                  // Unique diagnostic code (0x0001-0xFFFF)
    diagnostic_severity_t severity; // INFO, WARNING, ERROR, FATAL
    diagnostic_category_t category; // SENSOR, COMM, STORAGE, SYSTEM, SECURITY, POWER, OTA
    uint64_t timestamp_ms;          // Event occurrence time
    uint8_t source_component_id;    // Component that generated the event
    char description[64];           // Human-readable description
    uint8_t data[32];               // Context-specific diagnostic data
    uint16_t occurrence_count;      // Number of times this event occurred
} diagnostic_event_t;
```
**Diagnostic Code Registry (Examples):**
| Code | Severity | Category | Description |
|------|----------|----------|-------------|
| 0x1001 | WARNING | SENSOR | Sensor communication timeout |
| 0x1002 | ERROR | SENSOR | Sensor out-of-range value detected |
| 0x1003 | FATAL | SENSOR | Critical sensor hardware failure |
| 0x2001 | WARNING | COMM | Wi-Fi signal strength low |
| 0x2002 | ERROR | COMM | MQTT broker connection failed |
| 0x2003 | FATAL | COMM | TLS certificate validation failed |
| 0x3001 | WARNING | STORAGE | SD card space low (< 10%) |
| 0x3002 | ERROR | STORAGE | SD card write failure |
| 0x3003 | FATAL | STORAGE | SD card not detected |
| 0x4001 | INFO | SYSTEM | System state transition |
| 0x4002 | WARNING | SYSTEM | Memory usage high (> 80%) |
| 0x4003 | FATAL | SYSTEM | Watchdog timer reset |
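
The example registry above could be held as a constant lookup table. The following is a minimal sketch, not the specified implementation: `diag_registry_entry_t`, `k_diag_registry`, and `diag_registry_lookup` are hypothetical names introduced for illustration, and the enums are trimmed local copies so that the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local, trimmed copies of the enums so this sketch compiles on its own. */
typedef enum {
    DIAG_SEVERITY_INFO = 0,
    DIAG_SEVERITY_WARNING = 1,
    DIAG_SEVERITY_ERROR = 2,
    DIAG_SEVERITY_FATAL = 3
} diagnostic_severity_t;

typedef enum {
    DIAG_CATEGORY_SENSOR = 0,
    DIAG_CATEGORY_COMM = 1,
    DIAG_CATEGORY_STORAGE = 2,
    DIAG_CATEGORY_SYSTEM = 3
} diagnostic_category_t;

/* Hypothetical registry entry; not part of the specified API. */
typedef struct {
    uint16_t code;
    diagnostic_severity_t severity;
    diagnostic_category_t category;
    const char *description;
} diag_registry_entry_t;

/* One row per line of the example registry table. */
static const diag_registry_entry_t k_diag_registry[] = {
    { 0x1001, DIAG_SEVERITY_WARNING, DIAG_CATEGORY_SENSOR,  "Sensor communication timeout" },
    { 0x1002, DIAG_SEVERITY_ERROR,   DIAG_CATEGORY_SENSOR,  "Sensor out-of-range value detected" },
    { 0x1003, DIAG_SEVERITY_FATAL,   DIAG_CATEGORY_SENSOR,  "Critical sensor hardware failure" },
    { 0x2001, DIAG_SEVERITY_WARNING, DIAG_CATEGORY_COMM,    "Wi-Fi signal strength low" },
    { 0x2002, DIAG_SEVERITY_ERROR,   DIAG_CATEGORY_COMM,    "MQTT broker connection failed" },
    { 0x2003, DIAG_SEVERITY_FATAL,   DIAG_CATEGORY_COMM,    "TLS certificate validation failed" },
    { 0x3001, DIAG_SEVERITY_WARNING, DIAG_CATEGORY_STORAGE, "SD card space low (< 10%)" },
    { 0x3002, DIAG_SEVERITY_ERROR,   DIAG_CATEGORY_STORAGE, "SD card write failure" },
    { 0x3003, DIAG_SEVERITY_FATAL,   DIAG_CATEGORY_STORAGE, "SD card not detected" },
    { 0x4001, DIAG_SEVERITY_INFO,    DIAG_CATEGORY_SYSTEM,  "System state transition" },
    { 0x4002, DIAG_SEVERITY_WARNING, DIAG_CATEGORY_SYSTEM,  "Memory usage high (> 80%)" },
    { 0x4003, DIAG_SEVERITY_FATAL,   DIAG_CATEGORY_SYSTEM,  "Watchdog timer reset" }
};

#define DIAG_REGISTRY_LEN (sizeof k_diag_registry / sizeof k_diag_registry[0])

/* Fails the build if a row is added or removed without revisiting the table. */
static_assert(DIAG_REGISTRY_LEN == 12, "registry must mirror the example table");

/* Returns the registry entry for a code, or NULL if the code is unknown. */
const diag_registry_entry_t *diag_registry_lookup(uint16_t code)
{
    for (size_t i = 0; i < DIAG_REGISTRY_LEN; ++i) {
        if (k_diag_registry[i].code == code) {
            return &k_diag_registry[i];
        }
    }
    return NULL;
}
```

A linear scan is adequate for a table of this size; a larger registry might instead be kept sorted by code and searched with `bsearch`.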
### 2.2 F-DIAG-002: Diagnostic Data Storage
**Description:** Persistent storage of diagnostic events in non-volatile memory with efficient storage management and retrieval capabilities.
**Storage Architecture:**
```mermaid
graph TB
subgraph "Diagnostic Storage System"
GEN[Diagnostic Generator] --> BUF[Ring Buffer]
BUF --> FILTER[Severity Filter]
FILTER --> PERSIST[Persistence Layer]
PERSIST --> SD[SD Card Storage]
PERSIST --> NVS[NVS Flash Storage]
end
subgraph "Storage Policy"
CRITICAL[FATAL/ERROR Events] --> NVS
NORMAL[WARNING/INFO Events] --> SD
OVERFLOW[Buffer Overflow] --> DISCARD[Discard Oldest]
end
subgraph "Retrieval Interface"
QUERY[Query Interface] --> PERSIST
EXPORT[Export Interface] --> PERSIST
CLEAR[Clear Interface] --> PERSIST
end
```
**Storage Management:**
- **Ring Buffer:** 100 events in RAM for immediate access
- **NVS Storage:** Critical events (ERROR/FATAL) persisted to flash
- **SD Card Storage:** All events stored to SD card when available
- **Retention Policy:** 30 days or 10,000 events maximum
- **Compression:** Event data compressed for efficient storage
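
The ring buffer and the severity-based storage policy described above can be sketched as follows. This is an illustration under assumptions: `diag_event_lite_t` is a trimmed stand-in for `diagnostic_event_t`, and `diag_ring_t`, `diag_ring_push`, `diag_ring_oldest`, and `diag_storage_targets` are hypothetical names that do not appear in the specified API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DIAG_RING_CAPACITY 100u /* "100 events in RAM" */

typedef enum {
    DIAG_SEVERITY_INFO = 0,
    DIAG_SEVERITY_WARNING = 1,
    DIAG_SEVERITY_ERROR = 2,
    DIAG_SEVERITY_FATAL = 3
} diagnostic_severity_t;

/* Trimmed stand-in for diagnostic_event_t (assumption, for brevity). */
typedef struct {
    uint16_t code;
    diagnostic_severity_t severity;
} diag_event_lite_t;

typedef struct {
    diag_event_lite_t slots[DIAG_RING_CAPACITY];
    size_t head;      /* index of the oldest stored event */
    size_t count;     /* number of valid events */
    uint32_t dropped; /* events overwritten because the buffer was full */
} diag_ring_t;

/* Appends an event; when the buffer is full the oldest event is discarded. */
void diag_ring_push(diag_ring_t *ring, diag_event_lite_t ev)
{
    assert(ring != NULL);
    size_t tail = (ring->head + ring->count) % DIAG_RING_CAPACITY;
    ring->slots[tail] = ev;
    if (ring->count == DIAG_RING_CAPACITY) {
        /* tail == head here, so the oldest slot was just overwritten */
        ring->head = (ring->head + 1u) % DIAG_RING_CAPACITY;
        ring->dropped++;
    } else {
        ring->count++;
    }
}

/* Copies the oldest stored event into *out; returns false if the buffer is empty. */
bool diag_ring_oldest(const diag_ring_t *ring, diag_event_lite_t *out)
{
    assert(ring != NULL && out != NULL);
    if (ring->count == 0u) {
        return false;
    }
    *out = ring->slots[ring->head];
    return true;
}

/* Storage targets expressed as bit flags. */
enum { DIAG_TARGET_SD = 1u << 0, DIAG_TARGET_NVS = 1u << 1 };

/* All events go to the SD card when it is available; ERROR and FATAL also go to NVS. */
unsigned diag_storage_targets(diagnostic_severity_t severity, bool sd_available)
{
    unsigned targets = 0u;
    if (sd_available) {
        targets |= DIAG_TARGET_SD;
    }
    if (severity >= DIAG_SEVERITY_ERROR) {
        targets |= DIAG_TARGET_NVS;
    }
    return targets;
}
```

With the SD card unavailable, only ERROR and FATAL events are persisted in this sketch, which corresponds to the NVS-only degraded mode.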
### 2.3 F-DIAG-003: Diagnostic Session
**Description:** Secure engineering access interface for diagnostic data retrieval, system inspection, and maintenance operations.
**Session Types:**
| Session Type | Access Level | Authentication | Capabilities |
|-------------|-------------|----------------|--------------|
| **Read-Only** | Basic | PIN code | View diagnostics, system status |
| **Engineering** | Advanced | Certificate | Diagnostic management, configuration |
| **Service** | Full | Multi-factor | System control, debug access |
**Session Interface:**
```c
typedef struct {
session_id_t session_id;
session_type_t type;
uint64_t start_time;
uint64_t last_activity;
uint32_t timeout_seconds;
bool authenticated;
char user_id[32];
} diagnostic_session_t;
// Session management API
session_id_t diag_createSession(session_type_t type);
bool diag_authenticateSession(session_id_t session, const auth_credentials_t* creds);
bool diag_closeSession(session_id_t session);
bool diag_isSessionValid(session_id_t session);
// Diagnostic access API
bool diag_getEvents(session_id_t session, diagnostic_filter_t* filter,
diagnostic_event_t* events, size_t* count);
bool diag_clearEvents(session_id_t session, diagnostic_filter_t* filter);
bool diag_exportEvents(session_id_t session, export_format_t format,
uint8_t* buffer, size_t* size);
bool diag_getSystemHealth(session_id_t session, system_health_t* health);
```
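
A possible idle-timeout check behind `diag_isSessionValid` is sketched below. It assumes that `last_activity` is a millisecond timestamp, which the structure above does not state; `session_view_t`, `session_is_valid`, and `session_touch` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced view of diagnostic_session_t; the millisecond unit is an assumption. */
typedef struct {
    uint64_t last_activity_ms;
    uint32_t timeout_seconds;
    bool authenticated;
} session_view_t;

/* A session is valid while it is authenticated and has been idle for less than its timeout. */
bool session_is_valid(const session_view_t *s, uint64_t now_ms)
{
    assert(s != NULL);
    if (!s->authenticated) {
        return false;
    }
    /* Treat a clock that moved backwards as zero idle time. */
    uint64_t idle_ms = (now_ms >= s->last_activity_ms) ? (now_ms - s->last_activity_ms) : 0u;
    return idle_ms < (uint64_t)s->timeout_seconds * 1000u;
}

/* Records activity on a valid session; an expired session loses its authentication. */
bool session_touch(session_view_t *s, uint64_t now_ms)
{
    assert(s != NULL);
    if (!session_is_valid(s, now_ms)) {
        s->authenticated = false;
        return false;
    }
    s->last_activity_ms = now_ms;
    return true;
}
```

Calling `session_touch` at the start of every diagnostic request would both enforce the timeout and refresh the activity timestamp in one place.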
### 2.4 F-DIAG-004: System Health Monitoring
**Description:** Continuous monitoring of system performance metrics, resource utilization, and component health status.
**Health Metrics:**
```c
typedef struct {
// CPU and Memory
uint8_t cpu_usage_percent;
uint32_t free_heap_bytes;
uint32_t min_free_heap_bytes;
uint16_t task_count;
// Storage
uint64_t sd_free_bytes;
uint64_t sd_total_bytes;
uint32_t nvs_free_entries;
uint32_t nvs_used_entries;
// Communication
int8_t wifi_rssi_dbm;
uint32_t mqtt_messages_sent;
uint32_t mqtt_messages_failed;
uint32_t comm_error_count;
// Sensors
uint8_t sensors_active;
uint8_t sensors_total;
uint8_t sensors_failed;
uint32_t sensor_error_count;
// System
uint32_t uptime_seconds;
uint32_t reset_count;
system_state_t current_state;
uint32_t state_change_count;
// Power
float supply_voltage;
bool brownout_detected;
uint32_t power_cycle_count;
} system_health_t;
```
**Health Monitoring Flow:**
```mermaid
sequenceDiagram
participant HM as Health Monitor
participant COMP as System Components
participant DIAG as Diagnostic Storage
participant ES as Event System
participant HMI as Local HMI
Note over HM,HMI: Health Monitoring Cycle (10 seconds)
loop Every 10 seconds
HM->>COMP: collectHealthMetrics()
COMP-->>HM: health_data
HM->>HM: analyzeHealthTrends()
HM->>HM: detectAnomalies()
alt Anomaly detected
HM->>DIAG: logDiagnosticEvent(anomaly)
HM->>ES: publish(HEALTH_ANOMALY, details)
end
HM->>ES: publish(HEALTH_UPDATE, metrics)
ES->>HMI: updateHealthDisplay(metrics)
end
```
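
The `detectAnomalies()` step could start with simple threshold checks tied to the registry codes 0x4002 (memory usage above 80%) and 0x3001 (SD card free space below 10%). The sketch below is an assumption about how such checks might look: `health_sample_t` is a reduced stand-in for `system_health_t`, the total heap size is passed in because `system_health_t` does not carry it, and `health_check_thresholds` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIAG_CODE_SD_SPACE_LOW 0x3001u /* SD card space low (< 10%) */
#define DIAG_CODE_MEMORY_HIGH  0x4002u /* Memory usage high (> 80%) */

/* Reduced stand-in for system_health_t (assumption, for brevity). */
typedef struct {
    uint32_t free_heap_bytes;
    uint64_t sd_free_bytes;
    uint64_t sd_total_bytes;
} health_sample_t;

/* Writes the diagnostic codes of all violated thresholds into codes[] and
 * returns how many were written (at most max_codes). */
size_t health_check_thresholds(const health_sample_t *h, uint32_t total_heap_bytes,
                               uint16_t *codes, size_t max_codes)
{
    assert(h != NULL && codes != NULL);
    size_t n = 0;

    /* usage > 80%  <=>  free < 20% of total  <=>  5 * free < total */
    if (n < max_codes && (uint64_t)h->free_heap_bytes * 5u < total_heap_bytes) {
        codes[n++] = DIAG_CODE_MEMORY_HIGH;
    }

    /* free < 10% of total  <=>  10 * free < total; skipped when no card is mounted */
    if (n < max_codes && h->sd_total_bytes > 0u &&
        h->sd_free_bytes * 10u < h->sd_total_bytes) {
        codes[n++] = DIAG_CODE_SD_SPACE_LOW;
    }
    return n;
}
```

Integer comparisons are used instead of percentages so that the check needs no floating-point arithmetic or division.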
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-DIAG-001** | SR-DIAG-001, SR-DIAG-002, SR-DIAG-003, SR-DIAG-004 | Diagnostic code framework and event management |
| **F-DIAG-002** | SR-DIAG-005, SR-DIAG-006, SR-DIAG-007 | Persistent diagnostic storage and retention |
| **F-DIAG-003** | SR-DIAG-008, SR-DIAG-009, SR-DIAG-010, SR-DIAG-011 | Engineering diagnostic sessions and access control |
| **F-DIAG-004** | SR-DIAG-012, SR-DIAG-013, SR-DIAG-014 | System health monitoring and performance metrics |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-DIAG-001** | SWR-DIAG-001, SWR-DIAG-002, SWR-DIAG-003 | Event structure, code registry, severity classification |
| **F-DIAG-002** | SWR-DIAG-004, SWR-DIAG-005, SWR-DIAG-006 | Storage management, persistence, retrieval interface |
| **F-DIAG-003** | SWR-DIAG-007, SWR-DIAG-008, SWR-DIAG-009 | Session management, authentication, access control |
| **F-DIAG-004** | SWR-DIAG-010, SWR-DIAG-011, SWR-DIAG-012 | Health metrics collection, anomaly detection, reporting |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **Diagnostics Task** | Health monitoring, event coordination, session management | `application_layer/diag_task/` |
| **Error Handler** | Diagnostic event generation, fault classification | `application_layer/error_handler/` |
| **Diagnostic Storage Manager** | Event persistence, retrieval, storage management | `application_layer/diag_storage/` |
| **Health Monitor** | System metrics collection, anomaly detection | `application_layer/health_monitor/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **Event System** | Diagnostic event distribution, component coordination | `application_layer/business_stack/event_system/` |
| **Data Persistence** | Storage abstraction, NVS and SD card access | `application_layer/DP_stack/persistence/` |
| **Security Manager** | Session authentication, access control | `application_layer/security/` |
| **State Manager** | System state awareness, state-dependent diagnostics | `application_layer/business_stack/STM/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "Diagnostics & Health Monitoring Feature"
DT[Diagnostics Task]
EH[Error Handler]
DSM[Diagnostic Storage Manager]
HM[Health Monitor]
end
subgraph "Core System Components"
ES[Event System]
DP[Data Persistence]
SEC[Security Manager]
STM[State Manager]
end
subgraph "System Components"
SM[Sensor Manager]
COM[Communication]
OTA[OTA Manager]
PWR[Power Manager]
end
subgraph "Storage"
NVS[NVS Flash]
SD[SD Card]
end
subgraph "Interfaces"
HMI[Local HMI]
UART[UART Debug]
NET[Network Session]
end
DT <--> ES
DT <--> DSM
DT <--> HM
DT <--> SEC
EH --> ES
EH --> DSM
DSM <--> DP
DSM --> NVS
DSM --> SD
HM --> SM
HM --> COM
HM --> OTA
HM --> PWR
HM --> STM
ES -.->|Health Events| HMI
ES -.->|Diagnostic Events| COM
DT -.->|Session Access| UART
DT -.->|Session Access| NET
```
### 4.4 Diagnostic Event Flow
```mermaid
sequenceDiagram
participant COMP as System Component
participant EH as Error Handler
participant ES as Event System
participant DSM as Diagnostic Storage
participant DT as Diagnostics Task
participant COM as Communication
Note over COMP,COM: Diagnostic Event Generation and Processing
COMP->>EH: reportError(error_info)
EH->>EH: classifyError(error_info)
EH->>EH: generateDiagnosticEvent()
EH->>ES: publish(DIAGNOSTIC_EVENT, event)
ES->>DSM: storeDiagnosticEvent(event)
ES->>DT: processDiagnosticEvent(event)
ES->>COM: reportDiagnosticEvent(event)
DSM->>DSM: checkStoragePolicy(event.severity)
alt Critical Event (ERROR/FATAL)
DSM->>NVS: persistToFlash(event)
end
DSM->>SD: persistToSDCard(event)
DT->>DT: updateHealthMetrics(event)
DT->>DT: checkSystemHealth()
alt Health degradation detected
DT->>ES: publish(HEALTH_DEGRADATION, metrics)
end
```
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **System Initialization:**
- Initialize diagnostic storage and load existing events
- Start health monitoring tasks and metric collection
- Register diagnostic event handlers with all components
- Establish baseline health metrics and thresholds
2. **Continuous Monitoring:**
- Collect system health metrics every 10 seconds
- Process diagnostic events from all system components
- Store events according to severity and storage policy
- Analyze health trends and detect anomalies
3. **Event Processing:**
- Classify and timestamp all diagnostic events
- Apply filtering and correlation rules
- Persist events to appropriate storage (NVS/SD)
- Distribute events to interested components
4. **Session Management:**
- Handle engineering session requests and authentication
- Provide secure access to diagnostic data and system health
- Log all diagnostic session activities for audit
- Enforce session timeouts and access controls
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **Storage Full** | Storage capacity monitoring | Apply retention policy, discard oldest events |
| **SD Card Failure** | Write operation failure | Switch to NVS-only storage, log degradation |
| **Memory Exhaustion** | Heap monitoring | Reduce buffer sizes, increase event filtering |
| **Session Timeout** | Activity monitoring | Close session, clear authentication |
| **Authentication Failure** | Credential validation | Reject session, log security event |
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Initialize storage, load existing events, start monitoring |
| **RUNNING** | Full diagnostic functionality, continuous health monitoring |
| **WARNING** | Enhanced monitoring, increased event generation |
| **FAULT** | Critical diagnostics only, preserve fault information |
| **OTA_UPDATE** | Suspend monitoring, log OTA-related events |
| **TEARDOWN** | Flush pending events, preserve diagnostic state |
| **SERVICE** | Full diagnostic access, engineering session support |
| **SD_DEGRADED** | NVS-only storage, reduced event retention |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Event Processing:** Maximum 10ms from generation to storage
- **Health Monitoring:** 10-second monitoring cycle with ±1 second tolerance
- **Session Response:** Maximum 500ms for diagnostic queries
- **Storage Operations:** Maximum 100ms for event persistence
### 6.2 Resource Constraints
- **Memory Usage:** Maximum 32KB for diagnostic buffers and storage
- **Event Storage:** Maximum 10,000 events or 30 days retention
- **Session Limit:** Maximum 2 concurrent diagnostic sessions
- **CPU Usage:** Maximum 5% of available CPU time for diagnostics
### 6.3 Security Constraints
- **Session Authentication:** All diagnostic access must be authenticated
- **Data Protection:** Diagnostic data encrypted when stored
- **Access Logging:** All diagnostic activities logged for audit
- **Privilege Separation:** Role-based access to diagnostic functions
## 7. Interface Specifications
### 7.1 Diagnostics Task Public API
```c
// Initialization and control
bool diagTask_initialize(void);
bool diagTask_start(void);
bool diagTask_stop(void);
bool diagTask_isRunning(void);
// Event management
bool diagTask_reportEvent(const diagnostic_event_t* event);
bool diagTask_getEvents(const diagnostic_filter_t* filter,
diagnostic_event_t* events, size_t* count);
bool diagTask_clearEvents(const diagnostic_filter_t* filter);
bool diagTask_exportEvents(export_format_t format, uint8_t* buffer, size_t* size);
// Health monitoring
bool diagTask_getSystemHealth(system_health_t* health);
bool diagTask_getHealthHistory(health_history_t* history, size_t* count);
bool diagTask_resetHealthMetrics(void);
// Session management
session_id_t diagTask_createSession(session_type_t type);
bool diagTask_authenticateSession(session_id_t session, const auth_credentials_t* creds);
bool diagTask_closeSession(session_id_t session);
bool diagTask_isSessionValid(session_id_t session);
```
### 7.2 Error Handler API
```c
// Error reporting
bool errorHandler_reportError(component_id_t source, error_code_t code,
const char* description, const uint8_t* context_data);
bool errorHandler_reportWarning(component_id_t source, warning_code_t code,
const char* description);
bool errorHandler_reportInfo(component_id_t source, info_code_t code,
const char* description);
// Error classification
diagnostic_severity_t errorHandler_classifyError(error_code_t code);
diagnostic_category_t errorHandler_categorizeError(component_id_t source, error_code_t code);
bool errorHandler_isErrorCritical(error_code_t code);
// Error statistics
bool errorHandler_getErrorStatistics(error_statistics_t* stats);
bool errorHandler_resetErrorStatistics(void);
```
### 7.3 Health Monitor API
```c
// Health monitoring
bool healthMonitor_initialize(void);
bool healthMonitor_startMonitoring(void);
bool healthMonitor_stopMonitoring(void);
bool healthMonitor_getCurrentHealth(system_health_t* health);
// Metric collection
bool healthMonitor_collectMetrics(void);
bool healthMonitor_updateMetric(health_metric_id_t metric_id, float value);
bool healthMonitor_getMetricHistory(health_metric_id_t metric_id,
metric_history_t* history, size_t* count);
// Anomaly detection
bool healthMonitor_setThreshold(health_metric_id_t metric_id, float threshold);
bool healthMonitor_enableAnomalyDetection(health_metric_id_t metric_id, bool enable);
bool healthMonitor_getAnomalies(anomaly_t* anomalies, size_t* count);
```
## 8. Testing and Validation
### 8.1 Unit Testing
- **Event Generation:** Diagnostic event creation and classification
- **Storage Management:** Event persistence and retrieval operations
- **Health Monitoring:** Metric collection and anomaly detection
- **Session Management:** Authentication and access control
### 8.2 Integration Testing
- **Cross-Component Events:** Diagnostic events from all system components
- **Storage Integration:** NVS and SD card storage operations
- **Event Distribution:** Event system integration and notification
- **Session Integration:** Engineering access via multiple interfaces
### 8.3 System Testing
- **Long-Duration Monitoring:** 48-hour continuous diagnostic operation
- **Storage Stress Testing:** High-frequency event generation and storage
- **Session Security Testing:** Authentication bypass attempts
- **Fault Injection Testing:** Component failure simulation and detection
### 8.4 Acceptance Criteria
- All diagnostic events properly classified and stored
- Health monitoring detects system anomalies within timing constraints
- Engineering sessions provide secure access to diagnostic data
- Storage management maintains data integrity under all conditions
- Diagnostic overhead stays within the resource constraints of Section 6.2 and does not degrade core system functionality
- Complete audit trail of all diagnostic activities
## 9. Dependencies
### 9.1 Internal Dependencies
- **Event System:** Diagnostic event distribution and coordination
- **Data Persistence:** Storage abstraction for diagnostic data
- **Security Manager:** Session authentication and access control
- **State Manager:** System state awareness for state-dependent diagnostics
### 9.2 External Dependencies
- **ESP-IDF Framework:** NVS, SD card, and system monitoring APIs
- **FreeRTOS:** Task scheduling and system resource monitoring
- **Hardware Components:** SD card, NVS flash, UART interface
- **System Components:** All components for health metric collection
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Predictive Analytics:** Machine learning for failure prediction
- **Advanced Correlation:** Multi-component fault correlation analysis
- **Remote Diagnostics:** Cloud-based diagnostic data analysis
- **Automated Recovery:** Self-healing mechanisms based on diagnostics
### 10.2 Scalability Considerations
- **Distributed Diagnostics:** Cross-hub diagnostic correlation
- **Cloud Integration:** Real-time diagnostic streaming to cloud
- **Advanced Analytics:** Big data analytics for fleet-wide diagnostics
- **Mobile Interface:** Smartphone app for field diagnostic access
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-DIAG, SWR-DIAG)
**Next Review:** After component implementation

@@ -0,0 +1,528 @@
# Feature Specification: Data Quality & Calibration
# Feature ID: F-DQC (F-DQC-001 to F-DQC-005)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Data Quality & Calibration
## 1. Feature Overview
### 1.1 Feature Purpose
The Data Quality & Calibration feature ensures that all sensor data generated by the ASF Sensor Hub is valid, trustworthy, correctly classified, and calibrated throughout the system lifecycle. This feature provides mechanisms for automatic sensor identification, compatibility enforcement, failure detection, and centralized calibration management.
### 1.2 Feature Scope
**In Scope:**
- Automatic sensor detection and identification
- Sensor-slot compatibility enforcement
- Real-time sensor failure detection and isolation
- Machine constants and calibration parameter management
- Redundant sensor support and sensor fusion
**Out of Scope:**
- Sensor hardware manufacturing and design
- External calibration equipment and procedures
- Main Hub calibration algorithms
- Sensor replacement and maintenance procedures
## 2. Sub-Features
### 2.1 F-DQC-001: Automatic Sensor Detection
**Description:** Dynamic detection and identification of connected sensors using hardware-based presence detection mechanisms.
**Detection Methods:**
- **GPIO Presence Pins:** Dedicated detection signals per sensor slot
- **I2C Device Scanning:** Automatic I2C address enumeration
- **Device ID Reading:** Sensor-specific identification protocols
- **Electrical Signature:** Voltage/resistance-based detection
**Detection Process:**
```mermaid
sequenceDiagram
participant SM as Sensor Manager
participant DD as Device Detector
participant SD as Sensor Driver
participant MC as Machine Constants
Note over SM,MC: Sensor Detection Cycle
SM->>DD: scanForSensors()
loop For each sensor slot
DD->>DD: checkPresencePin(slot_id)
alt Presence detected
DD->>SD: probeSensorType(slot_id)
SD->>SD: readDeviceID()
SD-->>DD: sensor_type_info
DD->>MC: validateSensorSlot(slot_id, sensor_type)
MC-->>DD: validation_result
end
end
DD-->>SM: detected_sensors_list
SM->>SM: updateSensorRegistry()
```
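
The detection cycle in the diagram can be sketched as a loop over the slots with the hardware access injected through function pointers. Everything here is illustrative: `detect_ops_t`, `scan_for_sensors`, and the `fake_*` functions are hypothetical, sensor types are reduced to plain numbers, and the simulated hardware exists only to make the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 7u

/* Hardware and machine-constants access, injected so the loop can be tested off-target. */
typedef struct {
    bool (*check_presence_pin)(uint8_t slot_id);
    bool (*probe_sensor_type)(uint8_t slot_id, uint8_t *sensor_type);
    bool (*validate_sensor_slot)(uint8_t slot_id, uint8_t sensor_type);
} detect_ops_t;

typedef struct {
    uint8_t slot_id;
    uint8_t sensor_type;
} detected_sensor_t;

/* Scans every slot and records the sensors that are present, identifiable and valid
 * for their slot. Returns the number of entries written to out[]. */
size_t scan_for_sensors(const detect_ops_t *ops, detected_sensor_t *out, size_t max_out)
{
    assert(ops != NULL && out != NULL);
    size_t found = 0;
    for (uint8_t slot_id = 0; slot_id < SLOT_COUNT && found < max_out; ++slot_id) {
        uint8_t sensor_type = 0;
        if (!ops->check_presence_pin(slot_id)) continue;              /* empty slot */
        if (!ops->probe_sensor_type(slot_id, &sensor_type)) continue; /* no device ID */
        if (!ops->validate_sensor_slot(slot_id, sensor_type)) continue; /* wrong slot */
        out[found].slot_id = slot_id;
        out[found].sensor_type = sensor_type;
        found++;
    }
    return found;
}

/* Simulated hardware for the example: slots 0, 2 and 5 are populated,
 * and slot 5 holds a sensor of the wrong type. */
bool fake_presence(uint8_t slot_id)
{
    return slot_id == 0u || slot_id == 2u || slot_id == 5u;
}

bool fake_probe(uint8_t slot_id, uint8_t *sensor_type)
{
    *sensor_type = (slot_id == 5u) ? 1u : (uint8_t)(slot_id + 1u);
    return true;
}

bool fake_validate(uint8_t slot_id, uint8_t sensor_type)
{
    return sensor_type == slot_id + 1u; /* slot n expects type n + 1 in this simulation */
}
```

In this sketch a rejected sensor is silently skipped; a real implementation would presumably also raise a diagnostic event at each `continue`.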
### 2.2 F-DQC-002: Sensor Type Enforcement
**Description:** Enforcement of sensor-slot compatibility to prevent incorrect sensor installation and configuration errors.
**Enforcement Mechanisms:**
- **Physical Slot Design:** Mechanical keying for sensor types
- **Electrical Validation:** Pin configuration verification
- **Software Validation:** Machine constants cross-reference
- **Protocol Validation:** Communication interface verification
**Slot Mapping Table:**
| Slot ID | Sensor Type | Interface | Detection Pin | Validation Method |
|---------|-------------|-----------|---------------|-------------------|
| SLOT_01 | Temperature | I2C/Analog | GPIO_12 | Device ID + Range |
| SLOT_02 | Humidity | I2C | GPIO_13 | Device ID + Protocol |
| SLOT_03 | CO2 | UART/I2C | GPIO_14 | Device ID + Calibration |
| SLOT_04 | NH3 | Analog/I2C | GPIO_15 | Range + Sensitivity |
| SLOT_05 | VOC | I2C | GPIO_16 | Device ID + Algorithm |
| SLOT_06 | PM | UART/I2C | GPIO_17 | Protocol + Range |
| SLOT_07 | Light | Analog/I2C | GPIO_18 | Range + Spectral |
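
The slot mapping table lends itself to a constant rule table that the software validation step can consult. The sketch below covers only the sensor type and interface columns; the `sensor_type_t` values, the `IFACE_*` flags, `slot_rule_t`, and `slot_accepts` are assumptions made for illustration and may differ from the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed enumeration; the real sensor_type_t is defined elsewhere. */
typedef enum {
    SENSOR_TYPE_TEMPERATURE,
    SENSOR_TYPE_HUMIDITY,
    SENSOR_TYPE_CO2,
    SENSOR_TYPE_NH3,
    SENSOR_TYPE_VOC,
    SENSOR_TYPE_PM,
    SENSOR_TYPE_LIGHT
} sensor_type_t;

/* Interfaces as bit flags so that a slot can allow more than one. */
enum { IFACE_I2C = 1u << 0, IFACE_UART = 1u << 1, IFACE_ANALOG = 1u << 2 };

typedef struct {
    sensor_type_t type;  /* sensor type the slot is keyed for */
    uint8_t iface_mask;  /* interfaces allowed in this slot */
    uint8_t detect_gpio; /* presence detection pin number */
} slot_rule_t;

#define SLOT_COUNT 7u

/* Index 0 corresponds to SLOT_01, index 6 to SLOT_07. */
static const slot_rule_t k_slot_rules[] = {
    { SENSOR_TYPE_TEMPERATURE, IFACE_I2C | IFACE_ANALOG, 12 },
    { SENSOR_TYPE_HUMIDITY,    IFACE_I2C,                13 },
    { SENSOR_TYPE_CO2,         IFACE_UART | IFACE_I2C,   14 },
    { SENSOR_TYPE_NH3,         IFACE_ANALOG | IFACE_I2C, 15 },
    { SENSOR_TYPE_VOC,         IFACE_I2C,                16 },
    { SENSOR_TYPE_PM,          IFACE_UART | IFACE_I2C,   17 },
    { SENSOR_TYPE_LIGHT,       IFACE_ANALOG | IFACE_I2C, 18 }
};

static_assert(sizeof k_slot_rules / sizeof k_slot_rules[0] == SLOT_COUNT,
              "one rule per slot");

/* Returns true if a sensor of the given type on the given interface may occupy
 * slot_number (1-based, as in SLOT_01 .. SLOT_07). */
bool slot_accepts(unsigned slot_number, sensor_type_t type, unsigned iface)
{
    if (slot_number < 1u || slot_number > SLOT_COUNT) {
        return false;
    }
    const slot_rule_t *rule = &k_slot_rules[slot_number - 1u];
    return rule->type == type && (rule->iface_mask & iface) != 0u;
}
```

The per-slot validation methods in the last table column (device ID, range, protocol checks) would run after this coarse type and interface check.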
### 2.3 F-DQC-003: Sensor Failure Detection
**Description:** Continuous monitoring of sensor behavior to detect and isolate failures in real-time.
**Failure Detection Methods:**
- **Communication Timeouts:** I2C/UART/SPI response monitoring
- **Range Validation:** Out-of-specification value detection
- **Trend Analysis:** Abnormal rate-of-change detection
- **Cross-Validation:** Multi-sensor consistency checking
- **Health Monitoring:** Sensor self-diagnostic features
**Failure Classification:**
```c
typedef enum {
SENSOR_FAILURE_NONE = 0,
SENSOR_FAILURE_COMMUNICATION, // Timeout, NACK, protocol error
SENSOR_FAILURE_OUT_OF_RANGE, // Values outside physical limits
SENSOR_FAILURE_STUCK_VALUE, // No change over time
SENSOR_FAILURE_ERRATIC, // Excessive noise or variation
SENSOR_FAILURE_CALIBRATION, // Drift beyond acceptable limits
SENSOR_FAILURE_HARDWARE, // Self-diagnostic failure
SENSOR_FAILURE_UNKNOWN // Unclassified failure
} sensor_failure_type_t;
```
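
Two of the failure types, out-of-range and stuck value, can be derived from a short window of recent samples. The following sketch shows one way to do that; `classify_window`, the `stuck_epsilon` parameter, and the choice to report an empty window as a communication failure are assumptions, not documented behavior.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SENSOR_FAILURE_NONE = 0,
    SENSOR_FAILURE_COMMUNICATION,
    SENSOR_FAILURE_OUT_OF_RANGE,
    SENSOR_FAILURE_STUCK_VALUE,
    SENSOR_FAILURE_ERRATIC,
    SENSOR_FAILURE_CALIBRATION,
    SENSOR_FAILURE_HARDWARE,
    SENSOR_FAILURE_UNKNOWN
} sensor_failure_type_t;

/* Classifies a window of recent samples.
 * - An empty window is reported as a communication failure (assumption).
 * - Any sample outside [min_value, max_value], including NaN, is out of range.
 * - A window whose spread is at most stuck_epsilon is reported as stuck. */
sensor_failure_type_t classify_window(const float *samples, size_t count,
                                      float min_value, float max_value,
                                      float stuck_epsilon)
{
    if (count == 0u) {
        return SENSOR_FAILURE_COMMUNICATION;
    }
    assert(samples != NULL);

    float lowest = samples[0];
    float highest = samples[0];
    for (size_t i = 0; i < count; ++i) {
        const float s = samples[i];
        /* The negated form is also true for NaN, which fails every comparison. */
        if (!(s >= min_value && s <= max_value)) {
            return SENSOR_FAILURE_OUT_OF_RANGE;
        }
        if (s < lowest) lowest = s;
        if (s > highest) highest = s;
    }
    if (count >= 2u && (highest - lowest) <= stuck_epsilon) {
        return SENSOR_FAILURE_STUCK_VALUE;
    }
    return SENSOR_FAILURE_NONE;
}
```

The stuck-value threshold has to be chosen per sensor type, since a slowly changing quantity can legitimately stay constant over a short window.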
### 2.4 F-DQC-004: Machine Constants & Calibration Management
**Description:** Centralized management of sensor configuration, calibration parameters, and system identity information.
**Machine Constants Structure:**
```c
// Per-sensor configuration (declared first because machine_constants_t embeds it)
typedef struct {
    uint8_t sensor_id;
    sensor_type_t type;
    bool enabled;
    uint32_t sampling_rate;
    filter_config_t filter;
    float min_value, max_value;
    char unit[8];
} sensor_config_t;
// Per-sensor calibration parameters
typedef struct {
    float offset;
    float scale;
    float polynomial_coeffs[MAX_CALIBRATION_COEFFS]; // Fixed capacity; MAX_CALIBRATION_COEFFS is an assumed constant, not defined in this document
    uint8_t coeff_count;                             // Number of valid entries in polynomial_coeffs
    uint64_t calibration_date;
    uint32_t calibration_interval;
} calibration_params_t;
typedef struct {
    // System Identity
    char device_id[32];
    char site_id[16];
    char house_id[16];
    uint32_t firmware_version;
    // Sensor Configuration
    sensor_config_t sensors[MAX_SENSORS];
    // Calibration Parameters
    calibration_params_t calibration[MAX_SENSORS];
    // Communication Settings
    comm_config_t communication;
    // System Limits
    system_limits_t limits;
    // Validation
    uint32_t checksum;
    uint64_t timestamp;
} machine_constants_t;
```
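
How the offset, scale, and polynomial coefficients combine is not specified above. The sketch below shows one plausible convention, stated as an assumption: a linear correction `offset + scale * raw` is always applied, and if polynomial coefficients are present they are evaluated on the linear result as `c0 + c1*x + c2*x^2 + ...`. `calib_sketch_t`, `CALIB_MAX_COEFFS`, and `calib_apply` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CALIB_MAX_COEFFS 4u /* assumed capacity */

/* Reduced stand-in for calibration_params_t (assumption, for brevity). */
typedef struct {
    float offset;
    float scale;
    float coeffs[CALIB_MAX_COEFFS]; /* c0, c1, c2, ... */
    uint8_t coeff_count;            /* number of valid coefficients */
} calib_sketch_t;

/* Applies the linear correction, then the optional polynomial (Horner's method). */
float calib_apply(const calib_sketch_t *cal, float raw)
{
    assert(cal != NULL && cal->coeff_count <= CALIB_MAX_COEFFS);

    const float linear = cal->offset + cal->scale * raw;
    if (cal->coeff_count == 0u) {
        return linear;
    }

    /* Start with the highest-order coefficient and fold downwards. */
    float acc = cal->coeffs[cal->coeff_count - 1u];
    for (size_t k = cal->coeff_count - 1u; k > 0u; --k) {
        acc = acc * linear + cal->coeffs[k - 1u];
    }
    return acc;
}
```

Horner's method keeps the evaluation to one multiplication and one addition per coefficient, which suits a microcontroller without needing `powf`.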
### 2.5 F-DQC-005: Redundant Sensor Support
**Description:** Support for redundant sensors and sensor fusion for critical measurements.
**Redundancy Strategies:**
- **Dual Sensors:** Two sensors of same type for critical parameters
- **Cross-Validation:** Different sensor types measuring related parameters
- **Voting Logic:** Majority voting for multiple sensors
- **Graceful Degradation:** Fallback to single sensor operation
**Sensor Fusion Algorithm:**
```mermaid
graph TB
subgraph "Sensor Fusion Process"
S1[Sensor 1] --> V[Validator]
S2[Sensor 2] --> V
S3[Sensor 3] --> V
V --> F[Fusion Algorithm]
F --> O[Output Value]
F --> C[Confidence Level]
end
subgraph "Fusion Methods"
AVG[Weighted Average]
MED[Median Filter]
KAL[Kalman Filter]
VOT[Voting Logic]
end
F --> AVG
F --> MED
F --> KAL
F --> VOT
```
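
Of the fusion methods listed, the median is the simplest to illustrate, and it degrades naturally to single-sensor operation. The sketch below is one possible form: `fusion_input_t`, `fusion_result_t`, and `fuse_median` are hypothetical, and the confidence value (fraction of inputs that passed validation) is an assumed definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FUSION_MAX_INPUTS 8u /* assumed upper bound on redundant sensors */

typedef struct {
    float value;
    bool valid; /* set by the validator stage */
} fusion_input_t;

typedef struct {
    float value;      /* fused output */
    float confidence; /* fraction of inputs that were valid (assumed definition) */
    bool ok;          /* false when no input was valid */
} fusion_result_t;

/* Median of the valid inputs; with an even number of valid inputs the two
 * middle values are averaged. */
fusion_result_t fuse_median(const fusion_input_t *inputs, size_t count)
{
    assert(inputs != NULL && count > 0u && count <= FUSION_MAX_INPUTS);

    float sorted[FUSION_MAX_INPUTS];
    size_t n = 0;
    for (size_t i = 0; i < count; ++i) {
        if (!inputs[i].valid) continue;
        /* Insertion sort: shift larger values up to make room. */
        size_t pos = n++;
        while (pos > 0u && sorted[pos - 1u] > inputs[i].value) {
            sorted[pos] = sorted[pos - 1u];
            --pos;
        }
        sorted[pos] = inputs[i].value;
    }

    fusion_result_t result = { 0.0f, 0.0f, false };
    if (n == 0u) {
        return result;
    }
    result.value = (n % 2u == 1u)
        ? sorted[n / 2u]
        : 0.5f * (sorted[n / 2u - 1u] + sorted[n / 2u]);
    result.confidence = (float)n / (float)count;
    result.ok = true;
    return result;
}
```

With three valid sensors a single outlier cannot move the output, which is the practical benefit of the median over a plain average.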
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-DQC-001** | SR-DQC-001, SR-DQC-002 | Automatic sensor detection and enumeration |
| **F-DQC-002** | SR-DQC-003, SR-DQC-004 | Sensor type enforcement and slot validation |
| **F-DQC-003** | SR-DQC-005, SR-DQC-006 | Failure detection and isolation |
| **F-DQC-004** | SR-DQC-007, SR-DQC-008 | Machine constants management and persistence |
| **F-DQC-005** | SR-DQC-009, SR-DQC-010 | Redundant sensor support and fusion |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-DQC-001** | SWR-DQC-001, SWR-DQC-002, SWR-DQC-003 | Detection algorithms, device probing, registry management |
| **F-DQC-002** | SWR-DQC-004, SWR-DQC-005, SWR-DQC-006 | Slot mapping, validation logic, error reporting |
| **F-DQC-003** | SWR-DQC-007, SWR-DQC-008, SWR-DQC-009 | Health monitoring, failure classification, isolation |
| **F-DQC-004** | SWR-DQC-010, SWR-DQC-011, SWR-DQC-012 | MC structure, persistence, update mechanisms |
| **F-DQC-005** | SWR-DQC-013, SWR-DQC-014, SWR-DQC-015 | Fusion algorithms, redundancy management, voting logic |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **Machine Constant Manager** | MC loading, validation, update coordination | `application_layer/business_stack/machine_constant_manager/` |
| **Sensor Manager** | Detection coordination, failure monitoring | `application_layer/business_stack/sensor_manager/` |
| **Device Detector** | Hardware detection, sensor probing | `drivers/device_detector/` |
| **Calibration Manager** | Calibration algorithms, parameter management | `application_layer/calibration/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **Sensor Drivers** | Hardware interface, device identification | `drivers/sensors/` |
| **Event System** | Detection events, failure notifications | `application_layer/business_stack/event_system/` |
| **Data Persistence** | MC storage, calibration data persistence | `application_layer/DP_stack/persistence/` |
| **Diagnostics Task** | Failure logging, health reporting | `application_layer/diag_task/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "Data Quality & Calibration Feature"
MCM[Machine Constant Manager]
SM[Sensor Manager]
DD[Device Detector]
CAL[Calibration Manager]
end
subgraph "Supporting Components"
SD[Sensor Drivers]
ES[Event System]
DP[Data Persistence]
DIAG[Diagnostics Task]
end
subgraph "Hardware"
SENSORS[Physical Sensors]
GPIO[Detection Pins]
I2C[I2C Bus]
UART[UART Interface]
end
MCM <--> DP
MCM --> SM
MCM --> CAL
SM <--> ES
SM --> DD
SM --> DIAG
DD --> SD
DD --> GPIO
CAL --> MCM
CAL <--> DP
SD --> I2C
SD --> UART
SD --> SENSORS
ES -.->|Detection Events| SM
ES -.->|Failure Events| DIAG
DIAG -.->|Health Data| SM
```
### 4.4 Detection and Validation Flow
```mermaid
sequenceDiagram
participant HW as Hardware
participant DD as Device Detector
participant SD as Sensor Driver
participant SM as Sensor Manager
participant MCM as MC Manager
participant ES as Event System
participant DIAG as Diagnostics
Note over HW,DIAG: Sensor Detection and Validation
SM->>DD: initiateSensorScan()
loop For each sensor slot
DD->>HW: checkPresencePin(slot_id)
HW-->>DD: presence_status
alt Sensor present
DD->>SD: probeSensorType(slot_id)
SD->>HW: readDeviceID()
HW-->>SD: device_info
SD-->>DD: sensor_type_info
DD->>MCM: validateSensorSlot(slot_id, sensor_type)
MCM-->>DD: validation_result
alt Validation successful
DD->>SM: registerSensor(slot_id, sensor_info)
SM->>ES: publish(SENSOR_DETECTED, sensor_info)
else Validation failed
DD->>DIAG: reportValidationError(slot_id, error)
DIAG->>ES: publish(SENSOR_VALIDATION_FAILED, error)
end
end
end
DD-->>SM: scanComplete(detected_sensors)
SM->>ES: publish(SENSOR_SCAN_COMPLETE, summary)
```
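The scan loop in the diagram above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the slot count, the `stub_present` and `stub_probed` tables that stand in for the presence pins and device-ID probe, and the `mc_expected` table that stands in for the machine constants are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 4u  /* assumption: number of sensor slots */

typedef enum { TYPE_NONE = 0, TYPE_TEMPERATURE, TYPE_HUMIDITY, TYPE_CO2 } slot_type_t;

typedef struct {
    uint8_t     slot_id;
    slot_type_t type;
} detected_slot_t;

/* Hypothetical stand-ins for presence pins, device-ID probing and machine constants. */
static const bool        stub_present[SLOT_COUNT] = { true, true, false, true };
static const slot_type_t stub_probed[SLOT_COUNT]  = { TYPE_TEMPERATURE, TYPE_CO2, TYPE_NONE, TYPE_CO2 };
static const slot_type_t mc_expected[SLOT_COUNT]  = { TYPE_TEMPERATURE, TYPE_HUMIDITY, TYPE_CO2, TYPE_CO2 };

/* Scans every slot, registers sensors whose probed type matches the expected
 * type, and counts mismatches as validation errors. Returns the number of
 * registered sensors written to `out`. */
static size_t scan_slots(detected_slot_t *out, size_t out_cap, size_t *validation_errors)
{
    size_t found = 0u;

    assert(out != NULL && validation_errors != NULL);
    *validation_errors = 0u;

    for (uint8_t slot = 0u; slot < SLOT_COUNT; ++slot) {
        if (!stub_present[slot]) {
            continue;                       /* empty slot: nothing to probe */
        }
        if (stub_probed[slot] != mc_expected[slot]) {
            ++(*validation_errors);         /* would be reported to Diagnostics */
            continue;
        }
        if (found < out_cap) {
            out[found].slot_id = slot;      /* would be registered with Sensor Manager */
            out[found].type = stub_probed[slot];
            ++found;
        }
    }
    return found;
}
```

With the stub tables shown, slots 0 and 3 are registered, slot 1 is rejected as a type mismatch, and slot 2 is skipped as empty.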
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **System Initialization:**
- Load machine constants from persistent storage
- Validate MC integrity and version compatibility
- Initialize sensor detection and calibration systems
- Perform initial sensor scan and validation
2. **Sensor Detection Cycle:**
- Scan all sensor slots for presence
- Identify sensor types and validate compatibility
- Register detected sensors in system registry
- Configure sensors according to machine constants
3. **Continuous Monitoring:**
- Monitor sensor health and communication status
- Detect failures and classify failure types
- Apply calibration corrections to sensor data
- Manage redundant sensors and sensor fusion
4. **Configuration Management:**
- Handle machine constants updates from Main Hub
- Validate new configurations before application
- Coordinate system teardown for MC updates
- Persist updated configurations to storage
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **Sensor Mismatch** | Type validation against MC | Reject sensor, log diagnostic event |
| **Communication Failure** | Timeout or protocol error | Mark sensor as faulty, continue with others |
| **Calibration Drift** | Value trend analysis | Flag for recalibration, continue operation |
| **MC Corruption** | Checksum validation | Use backup MC, request update from Main Hub |
| **Detection Hardware Failure** | GPIO or I2C failure | Disable detection, use last known configuration |
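
The MC corruption row above can be illustrated with a checksum check and a backup fallback. The sketch below assumes a CRC-32 checksum; the specification does not name the checksum algorithm, and `mc_blob_t`, `mc_source_t` and `mc_select` are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, init and final XOR 0xFFFFFFFF). */
static uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || len == 0u);
    for (size_t i = 0u; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return ~crc;
}

/* Hypothetical stored layout: payload followed by its checksum. */
typedef struct {
    uint8_t  payload[16];
    uint32_t crc;
} mc_blob_t;

typedef enum { MC_SRC_PRIMARY, MC_SRC_BACKUP, MC_SRC_NONE } mc_source_t;

static bool mc_blob_valid(const mc_blob_t *blob)
{
    return crc32_compute(blob->payload, sizeof blob->payload) == blob->crc;
}

/* Prefers the primary copy; falls back to the backup when the primary is
 * corrupt. On MC_SRC_BACKUP the caller would also request an update from
 * the Main Hub, as the table describes. */
static mc_source_t mc_select(const mc_blob_t *primary, const mc_blob_t *backup)
{
    if (mc_blob_valid(primary)) {
        return MC_SRC_PRIMARY;
    }
    if (mc_blob_valid(backup)) {
        return MC_SRC_BACKUP;
    }
    return MC_SRC_NONE;
}
```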
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Load MC, detect sensors, validate configuration |
| **RUNNING** | Continuous monitoring, failure detection, calibration |
| **WARNING** | Enhanced monitoring, diagnostic reporting |
| **FAULT** | Minimal operation, preserve sensor states |
| **MC_UPDATE** | Stop monitoring, update MC, re-detect sensors |
| **SERVICE** | Full diagnostic access, manual sensor control |
| **SD_DEGRADED** | Continue operation, no MC persistence |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Detection Cycle:** Maximum 5 seconds for complete sensor scan
- **Failure Detection:** Maximum 3 seconds to detect communication failure
- **MC Update:** Maximum 30 seconds for complete MC reload
- **Calibration Application:** Maximum 100ms per sensor
### 6.2 Resource Constraints
- **Memory Usage:** Maximum 8KB for MC data and sensor registry
- **Detection Frequency:** Maximum once per minute for presence scan
- **Calibration Storage:** Maximum 1KB per sensor for calibration data
- **Failure History:** Maximum 100 failure events in memory
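
One way to honor the 100-event failure history limit is a fixed-size ring buffer that overwrites the oldest entry. The sketch below is an assumption about how this could look; `failure_event_t` and its fields are hypothetical and not taken from the specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAILURE_HISTORY_CAPACITY 100u  /* from the constraint above */

/* Hypothetical event record. */
typedef struct {
    uint8_t  sensor_id;
    uint32_t code;
} failure_event_t;

typedef struct {
    failure_event_t events[FAILURE_HISTORY_CAPACITY];
    size_t head;   /* index of the next write */
    size_t count;  /* number of valid entries, at most the capacity */
} failure_history_t;

/* Appends an event; once full, the oldest event is overwritten. */
static void failure_history_push(failure_history_t *h, failure_event_t ev)
{
    assert(h != NULL);
    h->events[h->head] = ev;
    h->head = (h->head + 1u) % FAILURE_HISTORY_CAPACITY;
    if (h->count < FAILURE_HISTORY_CAPACITY) {
        ++h->count;
    }
}

/* Reads the i-th retained event, where index 0 is the oldest. */
static bool failure_history_get(const failure_history_t *h, size_t i, failure_event_t *out)
{
    size_t start;

    assert(h != NULL && out != NULL);
    if (i >= h->count) {
        return false;
    }
    start = (h->head + FAILURE_HISTORY_CAPACITY - h->count) % FAILURE_HISTORY_CAPACITY;
    *out = h->events[(start + i) % FAILURE_HISTORY_CAPACITY];
    return true;
}
```

Memory use is fixed at compile time, which keeps the history within the stated budget regardless of how many failures occur.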
### 6.3 Quality Constraints
- **Detection Accuracy:** 99.9% accurate sensor presence detection
- **Type Validation:** 100% rejection of sensors whose type does not match the slot configuration in the machine constants
- **Failure Detection:** 95% of failures detected within 10 seconds
- **Calibration Accuracy:** Within ±2% of reference standards
## 7. Interface Specifications
### 7.1 Machine Constant Manager API
```c
// Machine constants management
bool mcMgr_initialize(void);
bool mcMgr_loadMachineConstants(machine_constants_t* mc);
bool mcMgr_validateMachineConstants(const machine_constants_t* mc);
bool mcMgr_updateMachineConstants(const machine_constants_t* new_mc);
bool mcMgr_backupMachineConstants(void);
// Sensor configuration access
bool mcMgr_getSensorConfig(uint8_t sensor_id, sensor_config_t* config);
bool mcMgr_getCalibrationParams(uint8_t sensor_id, calibration_params_t* params);
bool mcMgr_validateSensorSlot(uint8_t slot_id, sensor_type_t sensor_type);
// System configuration
bool mcMgr_getSystemIdentity(system_identity_t* identity);
bool mcMgr_getCommunicationConfig(comm_config_t* config);
bool mcMgr_getSystemLimits(system_limits_t* limits);
```
### 7.2 Device Detector API
```c
// Detection operations
bool detector_initialize(void);
bool detector_scanSensors(detected_sensor_t* sensors, size_t* count);
bool detector_probeSensorType(uint8_t slot_id, sensor_type_info_t* info);
bool detector_validateSensorPresence(uint8_t slot_id);
// Hardware interface
bool detector_checkPresencePin(uint8_t slot_id);
bool detector_readDeviceID(uint8_t slot_id, device_id_t* id);
bool detector_testCommunication(uint8_t slot_id, comm_interface_t interface);
```
### 7.3 Calibration Manager API
```c
// Calibration operations
bool cal_initialize(void);
bool cal_applyCalibration(uint8_t sensor_id, float raw_value, float* calibrated_value);
bool cal_updateCalibrationParams(uint8_t sensor_id, const calibration_params_t* params);
bool cal_validateCalibration(uint8_t sensor_id, validation_result_t* result);
// Calibration management
bool cal_scheduleRecalibration(uint8_t sensor_id, uint32_t interval_days);
bool cal_isCalibrationExpired(uint8_t sensor_id);
bool cal_getCalibrationStatus(uint8_t sensor_id, calibration_status_t* status);
```
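The specification does not define the contents of `calibration_params_t`. As an illustration only, the sketch below assumes a linear gain/offset model derived from two reference points; `linear_cal_t` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical calibration parameters: calibrated = gain * raw + offset. */
typedef struct {
    float gain;
    float offset;
} linear_cal_t;

/* Derives gain and offset from two (raw, reference) pairs.
 * Fails when both raw points coincide, since the slope is undefined. */
static bool linear_cal_from_two_points(float raw_lo, float ref_lo,
                                       float raw_hi, float ref_hi,
                                       linear_cal_t *cal)
{
    assert(cal != NULL);
    if (raw_hi == raw_lo) {
        return false;
    }
    cal->gain = (ref_hi - ref_lo) / (raw_hi - raw_lo);
    cal->offset = ref_lo - cal->gain * raw_lo;
    return true;
}

/* Applies the correction to one raw reading. */
static bool linear_cal_apply(const linear_cal_t *cal, float raw, float *calibrated)
{
    assert(cal != NULL && calibrated != NULL);
    *calibrated = cal->gain * raw + cal->offset;
    return true;
}
```

A function such as `cal_applyCalibration()` could look up the stored parameters for `sensor_id` and then delegate to a routine of this shape. Sensors with a non-linear response would need a different model.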
## 8. Testing and Validation
### 8.1 Unit Testing
- **Detection Logic:** Mock hardware for detection algorithm testing
- **Validation Rules:** Sensor-slot compatibility matrix testing
- **Calibration Math:** Known input/output calibration validation
- **MC Management:** Configuration loading and validation testing
### 8.2 Integration Testing
- **Hardware Detection:** Real sensor hardware detection testing
- **Communication Interfaces:** I2C, UART, SPI sensor communication
- **Event Integration:** Detection and failure event publication
- **Persistence Integration:** MC storage and retrieval testing
### 8.3 System Testing
- **Multi-Sensor Detection:** All 7 sensor types simultaneously
- **Failure Scenarios:** Sensor disconnection and failure injection
- **MC Update Testing:** Complete configuration update scenarios
- **Long-Duration Testing:** 48-hour continuous monitoring
### 8.4 Acceptance Criteria
- All supported sensor types correctly detected and identified
- 100% prevention of incorrect sensor-slot configurations
- Failure detection within specified timing constraints
- Machine constants updates complete without data loss
- Calibration accuracy within specified tolerances
- No false-positive or false-negative detection events
## 9. Dependencies
### 9.1 Internal Dependencies
- **Sensor Manager:** Sensor lifecycle and state management
- **Event System:** Detection and failure event communication
- **Data Persistence:** Machine constants and calibration storage
- **Diagnostics Task:** Failure logging and health reporting
### 9.2 External Dependencies
- **ESP-IDF Framework:** GPIO, I2C, UART, SPI drivers
- **Hardware Sensors:** Physical sensor devices and interfaces
- **Detection Hardware:** Presence pins and identification circuits
- **Main Hub:** Machine constants updates and calibration data
## 10. Future Enhancements
### 10.1 Planned Improvements
- **AI-Based Detection:** Machine learning for sensor identification
- **Predictive Calibration:** Drift prediction and automatic correction
- **Advanced Fusion:** Multi-sensor data fusion algorithms
- **Remote Calibration:** Over-the-air calibration updates
### 10.2 Scalability Considerations
- **Additional Sensor Types:** Framework supports easy extension
- **Enhanced Validation:** Multi-level validation mechanisms
- **Cloud Calibration:** Cloud-based calibration management
- **Sensor Networks:** Cross-hub sensor validation and fusion
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-DQC, SWR-DQC)
**Next Review:** After component implementation


@@ -0,0 +1,650 @@
# Feature Specification: Hardware Abstraction
# Feature ID: F-HW (F-HW-001 to F-HW-002)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Hardware Abstraction
## 1. Feature Overview
### 1.1 Feature Purpose
The Hardware Abstraction feature provides a clean separation between application logic and hardware interfaces for the ASF Sensor Hub. This feature ensures hardware independence, maintainability, and testability by abstracting all hardware access through well-defined interfaces and preventing direct hardware access from the application layer.
### 1.2 Feature Scope
**In Scope:**
- Sensor Abstraction Layer (SAL) for uniform sensor access
- Hardware interface abstraction for I2C, SPI, UART, ADC, GPIO
- Storage interface abstraction for SD Card and NVM
- Display and user interface abstraction
- GPIO discipline enforcement and resource conflict detection
**Out of Scope:**
- Hardware driver implementation details (delegated to ESP-IDF)
- Hardware-specific performance optimizations
- Physical hardware design and pin assignments
- Low-level hardware debugging interfaces
## 2. Sub-Features
### 2.1 F-HW-001: Sensor Abstraction Layer (SAL)
**Description:** Comprehensive sensor abstraction layer providing uniform access to all sensor types while maintaining hardware independence and enabling runtime sensor management.
**Sensor Abstraction Interface:**
```c
typedef struct {
uint8_t sensor_id; // Unique sensor identifier
sensor_type_t type; // Sensor type enumeration
char name[32]; // Human-readable sensor name
char unit[8]; // Measurement unit (°C, %, ppm, etc.)
float min_value; // Minimum valid measurement
float max_value; // Maximum valid measurement
float accuracy; // Sensor accuracy specification
uint32_t warmup_time_ms; // Required warmup time
uint32_t sampling_interval_ms; // Minimum sampling interval
sensor_interface_t interface; // Hardware interface type
} sensor_metadata_t;
typedef enum {
SENSOR_TYPE_TEMPERATURE = 0, // Temperature sensors
SENSOR_TYPE_HUMIDITY = 1, // Humidity sensors
SENSOR_TYPE_CO2 = 2, // Carbon dioxide sensors
SENSOR_TYPE_AMMONIA = 3, // Ammonia sensors
SENSOR_TYPE_VOC = 4, // Volatile organic compounds
SENSOR_TYPE_LIGHT = 5, // Light intensity sensors
SENSOR_TYPE_PARTICULATE = 6 // Particulate matter sensors
} sensor_type_t;
typedef enum {
SENSOR_INTERFACE_I2C = 0, // I2C interface
SENSOR_INTERFACE_SPI = 1, // SPI interface
SENSOR_INTERFACE_UART = 2, // UART interface
SENSOR_INTERFACE_ADC = 3, // Analog ADC interface
SENSOR_INTERFACE_GPIO = 4 // Digital GPIO interface
} sensor_interface_t;
```
**Sensor State Management:**
```c
typedef enum {
SENSOR_STATE_UNKNOWN = 0, // Initial state, not detected
SENSOR_STATE_DETECTED = 1, // Presence confirmed
SENSOR_STATE_INITIALIZED = 2, // Driver loaded and configured
SENSOR_STATE_WARMUP = 3, // Warming up, not stable
SENSOR_STATE_STABLE = 4, // Operational and stable
SENSOR_STATE_ENABLED = 5, // Active data acquisition
SENSOR_STATE_DISABLED = 6, // Present but not acquiring
SENSOR_STATE_DEGRADED = 7, // Operational but degraded
SENSOR_STATE_FAULTY = 8, // Detected failure condition
SENSOR_STATE_REMOVED = 9 // Previously present, now absent
} sensor_state_t;
typedef struct {
sensor_state_t current_state; // Current sensor state
sensor_state_t previous_state; // Previous sensor state
uint64_t state_change_time; // Last state change timestamp
uint32_t state_change_count; // Total state changes
uint32_t fault_count; // Number of faults detected
uint32_t recovery_count; // Number of successful recoveries
float last_valid_reading; // Last known good reading
uint64_t last_reading_time; // Timestamp of last reading
} sensor_state_info_t;
```
**Sensor State Machine:**
```mermaid
stateDiagram-v2
[*] --> UNKNOWN
UNKNOWN --> DETECTED : Presence detected
DETECTED --> INITIALIZED : Driver loaded
INITIALIZED --> WARMUP : Acquisition started
WARMUP --> STABLE : Warmup complete
STABLE --> ENABLED : Enable command
ENABLED --> DISABLED : Disable command
DISABLED --> ENABLED : Enable command
ENABLED --> DEGRADED : Performance degradation
DEGRADED --> ENABLED : Performance restored
ENABLED --> FAULTY : Failure detected
DEGRADED --> FAULTY : Failure detected
FAULTY --> ENABLED : Recovery successful
FAULTY --> REMOVED : Hardware removed
ENABLED --> REMOVED : Hardware removed
DISABLED --> REMOVED : Hardware removed
DEGRADED --> REMOVED : Hardware removed
REMOVED --> DETECTED : Hardware restored
```
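The transitions in the diagram above can be encoded as a lookup table so that illegal state changes are rejected in one place. This is a sketch under assumptions: the enum repeats the `sensor_state_t` values from this section, while `SENSOR_STATE_COUNT`, `STATE_BIT` and `sensor_transition_allowed` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    SENSOR_STATE_UNKNOWN = 0,
    SENSOR_STATE_DETECTED = 1,
    SENSOR_STATE_INITIALIZED = 2,
    SENSOR_STATE_WARMUP = 3,
    SENSOR_STATE_STABLE = 4,
    SENSOR_STATE_ENABLED = 5,
    SENSOR_STATE_DISABLED = 6,
    SENSOR_STATE_DEGRADED = 7,
    SENSOR_STATE_FAULTY = 8,
    SENSOR_STATE_REMOVED = 9
} sensor_state_t;

#define SENSOR_STATE_COUNT 10u                    /* hypothetical helper constant */
#define STATE_BIT(s) ((uint16_t)(1u << (s)))

static_assert(SENSOR_STATE_COUNT <= 16u, "state bitmask must fit in uint16_t");

/* allowed_next[from] holds one bit per permitted target state. */
static const uint16_t allowed_next[SENSOR_STATE_COUNT] = {
    [SENSOR_STATE_UNKNOWN]     = STATE_BIT(SENSOR_STATE_DETECTED),
    [SENSOR_STATE_DETECTED]    = STATE_BIT(SENSOR_STATE_INITIALIZED),
    [SENSOR_STATE_INITIALIZED] = STATE_BIT(SENSOR_STATE_WARMUP),
    [SENSOR_STATE_WARMUP]      = STATE_BIT(SENSOR_STATE_STABLE),
    [SENSOR_STATE_STABLE]      = STATE_BIT(SENSOR_STATE_ENABLED),
    [SENSOR_STATE_ENABLED]     = STATE_BIT(SENSOR_STATE_DISABLED) | STATE_BIT(SENSOR_STATE_DEGRADED) |
                                 STATE_BIT(SENSOR_STATE_FAULTY) | STATE_BIT(SENSOR_STATE_REMOVED),
    [SENSOR_STATE_DISABLED]    = STATE_BIT(SENSOR_STATE_ENABLED) | STATE_BIT(SENSOR_STATE_REMOVED),
    [SENSOR_STATE_DEGRADED]    = STATE_BIT(SENSOR_STATE_ENABLED) | STATE_BIT(SENSOR_STATE_FAULTY) |
                                 STATE_BIT(SENSOR_STATE_REMOVED),
    [SENSOR_STATE_FAULTY]      = STATE_BIT(SENSOR_STATE_ENABLED) | STATE_BIT(SENSOR_STATE_REMOVED),
    [SENSOR_STATE_REMOVED]     = STATE_BIT(SENSOR_STATE_DETECTED)
};

/* Returns true only for transitions drawn in the state diagram. */
static bool sensor_transition_allowed(sensor_state_t from, sensor_state_t to)
{
    if ((unsigned)from >= SENSOR_STATE_COUNT || (unsigned)to >= SENSOR_STATE_COUNT) {
        return false;
    }
    return (allowed_next[from] & STATE_BIT(to)) != 0u;
}
```

Keeping the table next to the diagram makes it straightforward to review one against the other when transitions are added.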
**Uniform Sensor API:**
```c
// Sensor lifecycle management
bool sal_initializeSensor(uint8_t sensor_id);
bool sal_enableSensor(uint8_t sensor_id);
bool sal_disableSensor(uint8_t sensor_id);
bool sal_resetSensor(uint8_t sensor_id);
// Sensor data operations
bool sal_readSensor(uint8_t sensor_id, float* value);
bool sal_calibrateSensor(uint8_t sensor_id, const calibration_data_t* cal_data);
bool sal_validateReading(uint8_t sensor_id, float value);
bool sal_performHealthCheck(uint8_t sensor_id);
// Sensor information and status
bool sal_getSensorMetadata(uint8_t sensor_id, sensor_metadata_t* metadata);
sensor_state_t sal_getSensorState(uint8_t sensor_id);
bool sal_getSensorStateInfo(uint8_t sensor_id, sensor_state_info_t* info);
bool sal_isSensorPresent(uint8_t sensor_id);
bool sal_isSensorHealthy(uint8_t sensor_id);
```
**Sensor Driver Interface:**
```c
typedef struct {
// Driver identification
char driver_name[32]; // Driver name
char driver_version[16]; // Driver version
sensor_type_t supported_type; // Supported sensor type
// Driver operations
bool (*initialize)(uint8_t sensor_id);
bool (*read_raw)(uint8_t sensor_id, uint32_t* raw_value);
bool (*convert_value)(uint32_t raw_value, float* converted_value);
bool (*calibrate)(uint8_t sensor_id, const calibration_data_t* cal_data);
bool (*health_check)(uint8_t sensor_id);
bool (*reset)(uint8_t sensor_id);
// Driver configuration
sensor_metadata_t metadata; // Sensor metadata
void* driver_config; // Driver-specific configuration
} sensor_driver_interface_t;
```
### 2.2 F-HW-002: Hardware Interface Abstraction
**Description:** Comprehensive abstraction of all hardware interfaces to prevent direct hardware access from the application layer and ensure consistent interface usage across the system.
**GPIO Discipline and Management:**
```c
typedef struct {
uint8_t gpio_number; // Physical GPIO pin number
gpio_function_t function; // Assigned function
gpio_direction_t direction; // Input/Output direction
gpio_pull_t pull_config; // Pull-up/Pull-down configuration
bool is_strapping_pin; // Strapping pin flag
bool is_reserved; // Reserved for system use
char assigned_component[32]; // Component using this GPIO
} gpio_pin_config_t;
typedef enum {
GPIO_FUNC_UNUSED = 0, // Pin not used
GPIO_FUNC_I2C_SDA = 1, // I2C data line
GPIO_FUNC_I2C_SCL = 2, // I2C clock line
GPIO_FUNC_SPI_MOSI = 3, // SPI master out, slave in
GPIO_FUNC_SPI_MISO = 4, // SPI master in, slave out
GPIO_FUNC_SPI_CLK = 5, // SPI clock
GPIO_FUNC_SPI_CS = 6, // SPI chip select
GPIO_FUNC_UART_TX = 7, // UART transmit
GPIO_FUNC_UART_RX = 8, // UART receive
GPIO_FUNC_ADC_INPUT = 9, // ADC analog input
GPIO_FUNC_DIGITAL_INPUT = 10, // Digital input
GPIO_FUNC_DIGITAL_OUTPUT = 11, // Digital output
GPIO_FUNC_PWM_OUTPUT = 12 // PWM output
} gpio_function_t;
// Strapping pins that must be avoided for general-purpose I/O
#define GPIO_STRAPPING_PINS {0, 3, 45, 46}
```
**I2C Interface Abstraction:**
```c
typedef struct {
uint8_t i2c_port; // I2C port number (0 or 1)
uint8_t sda_pin; // SDA pin assignment
uint8_t scl_pin; // SCL pin assignment
uint32_t frequency_hz; // I2C frequency (100kHz, 400kHz)
bool pullup_enabled; // Internal pull-up enable
uint32_t timeout_ms; // Transaction timeout
} i2c_config_t;
// I2C abstraction API
bool hw_i2c_initialize(uint8_t port, const i2c_config_t* config);
bool hw_i2c_write(uint8_t port, uint8_t device_addr, const uint8_t* data, size_t len);
bool hw_i2c_read(uint8_t port, uint8_t device_addr, uint8_t* data, size_t len);
bool hw_i2c_write_read(uint8_t port, uint8_t device_addr,
const uint8_t* write_data, size_t write_len,
uint8_t* read_data, size_t read_len);
bool hw_i2c_scan_devices(uint8_t port, uint8_t* found_devices, size_t* count);
```
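An I2C address conflict check over a configured device list might look like the sketch below. It assumes 7-bit addressing and a hypothetical `i2c_device_entry_t` table; the specification does not say how the device list is stored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical configuration entry: one device on a given bus. */
typedef struct {
    uint8_t     addr;       /* 7-bit I2C address */
    const char *component;  /* owning component, for reporting */
} i2c_device_entry_t;

/* 7-bit addresses 0x00-0x07 and 0x78-0x7F are reserved by the I2C specification. */
static bool i2c_addr_is_valid(uint8_t addr)
{
    return addr >= 0x08u && addr <= 0x77u;
}

/* Counts pairs of devices on the same bus that share an address. */
static size_t i2c_count_address_conflicts(const i2c_device_entry_t *devs, size_t n)
{
    size_t conflicts = 0u;

    assert(devs != NULL || n == 0u);
    for (size_t i = 0u; i < n; ++i) {
        for (size_t j = i + 1u; j < n; ++j) {
            if (devs[i].addr == devs[j].addr) {
                ++conflicts;
            }
        }
    }
    return conflicts;
}
```

Note that a bus scan alone cannot reveal two devices answering at the same address, so a check like this has to run against the configured device list rather than the scan result.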
**SPI Interface Abstraction:**
```c
typedef struct {
uint8_t spi_host; // SPI host (SPI2_HOST, SPI3_HOST)
uint8_t mosi_pin; // MOSI pin assignment
uint8_t miso_pin; // MISO pin assignment
uint8_t sclk_pin; // Clock pin assignment
uint8_t cs_pin; // Chip select pin
uint32_t frequency_hz; // SPI frequency
uint8_t mode; // SPI mode (0-3)
uint8_t bit_order; // MSB/LSB first
} spi_config_t;
// SPI abstraction API
bool hw_spi_initialize(uint8_t host, const spi_config_t* config);
bool hw_spi_transmit(uint8_t host, const uint8_t* tx_data, size_t len);
bool hw_spi_receive(uint8_t host, uint8_t* rx_data, size_t len);
bool hw_spi_transmit_receive(uint8_t host, const uint8_t* tx_data,
uint8_t* rx_data, size_t len);
```
**ADC Interface Abstraction:**
```c
typedef struct {
adc_unit_t adc_unit; // ADC1 or ADC2 (ADC1 only when Wi-Fi active)
adc_channel_t channel; // ADC channel
adc_atten_t attenuation; // Input attenuation
adc_bitwidth_t resolution; // ADC resolution
uint32_t sample_count; // Samples for averaging
} adc_config_t;
// ADC abstraction API
bool hw_adc_initialize(const adc_config_t* config);
bool hw_adc_read_raw(adc_unit_t unit, adc_channel_t channel, uint32_t* raw_value);
bool hw_adc_read_voltage(adc_unit_t unit, adc_channel_t channel, float* voltage);
bool hw_adc_calibrate(adc_unit_t unit, adc_channel_t channel);
```
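The ADC2/Wi-Fi restriction and the `sample_count` averaging field can be illustrated as follows. The sketch uses its own `sketch_adc_unit_t` enum instead of the ESP-IDF `adc_unit_t` so that it compiles on a host; both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-in for the ADC unit selection. */
typedef enum { SKETCH_ADC_UNIT_1 = 1, SKETCH_ADC_UNIT_2 = 2 } sketch_adc_unit_t;

/* ADC1 is always usable; ADC2 only while Wi-Fi is inactive. */
static bool adc_unit_usable(sketch_adc_unit_t unit, bool wifi_active)
{
    return (unit == SKETCH_ADC_UNIT_1) || !wifi_active;
}

/* Averages `count` raw samples; a 64-bit sum avoids overflow for large counts. */
static bool average_samples(const uint32_t *samples, size_t count, uint32_t *average)
{
    uint64_t sum = 0u;

    assert(samples != NULL && average != NULL);
    if (count == 0u) {
        return false;
    }
    for (size_t i = 0u; i < count; ++i) {
        sum += samples[i];
    }
    *average = (uint32_t)(sum / count);
    return true;
}
```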
**Storage Interface Abstraction:**
```c
// SD Card abstraction
typedef struct {
uint8_t mosi_pin; // SD card MOSI pin
uint8_t miso_pin; // SD card MISO pin
uint8_t clk_pin; // SD card clock pin
uint8_t cs_pin; // SD card chip select pin
uint32_t frequency_hz; // SD card SPI frequency
bool format_if_mount_failed; // Auto-format on mount failure
} sd_card_config_t;
bool hw_sd_initialize(const sd_card_config_t* config);
bool hw_sd_mount(const char* mount_point);
bool hw_sd_unmount(void);
bool hw_sd_get_info(sd_card_info_t* info);
// NVM (NVS) abstraction
bool hw_nvs_initialize(void);
// Parameter is named ns_name because `namespace` is a reserved word in C++
bool hw_nvs_write_blob(const char* ns_name, const char* key,
const void* data, size_t size);
bool hw_nvs_read_blob(const char* ns_name, const char* key,
void* data, size_t* size);
bool hw_nvs_erase_key(const char* ns_name, const char* key);
bool hw_nvs_get_stats(nvs_stats_t* stats);
```
**GPIO Map Management:**
```c
typedef struct {
gpio_pin_config_t pins[GPIO_NUM_MAX]; // All GPIO pin configurations
uint32_t used_pin_count; // Number of used pins
uint32_t conflict_count; // Number of detected conflicts
bool map_validated; // GPIO map validation status
} gpio_map_t;
// GPIO map management API
bool hw_gpio_initialize_map(void);
bool hw_gpio_reserve_pin(uint8_t gpio_num, gpio_function_t function,
const char* component_name);
bool hw_gpio_release_pin(uint8_t gpio_num);
bool hw_gpio_validate_map(void);
bool hw_gpio_detect_conflicts(gpio_conflict_t* conflicts, size_t* count);
bool hw_gpio_get_map(gpio_map_t* map);
```
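Pin reservation with strapping-pin and duplicate checks could be sketched as below. This is an illustration under assumptions: the pin count of 49 (GPIO0 to GPIO48) is assumed for the ESP32-S3, the strapping list repeats `GPIO_STRAPPING_PINS` from this section, and `pin_entry_t`, `pin_result_t` and `reserve_pin` are hypothetical simplifications of `gpio_pin_config_t` and `hw_gpio_reserve_pin()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_GPIO_COUNT 49u  /* assumption: GPIO0..GPIO48 */

typedef enum { PIN_OK, PIN_ERR_RANGE, PIN_ERR_STRAPPING, PIN_ERR_CONFLICT } pin_result_t;

typedef struct {
    bool reserved;
    char owner[32];  /* component that holds the pin */
} pin_entry_t;

static pin_entry_t pin_map[SKETCH_GPIO_COUNT];
static const uint8_t strapping_pins[] = { 0, 3, 45, 46 };

static bool is_strapping_pin(uint8_t gpio)
{
    for (size_t i = 0u; i < sizeof strapping_pins / sizeof strapping_pins[0]; ++i) {
        if (strapping_pins[i] == gpio) {
            return true;
        }
    }
    return false;
}

/* Reserves a pin for one component; rejects out-of-range pins,
 * strapping pins, and pins that already have an owner. */
static pin_result_t reserve_pin(uint8_t gpio, const char *owner)
{
    assert(owner != NULL);
    if (gpio >= SKETCH_GPIO_COUNT) {
        return PIN_ERR_RANGE;
    }
    if (is_strapping_pin(gpio)) {
        return PIN_ERR_STRAPPING;
    }
    if (pin_map[gpio].reserved) {
        return PIN_ERR_CONFLICT;
    }
    pin_map[gpio].reserved = true;
    strncpy(pin_map[gpio].owner, owner, sizeof pin_map[gpio].owner - 1u);
    pin_map[gpio].owner[sizeof pin_map[gpio].owner - 1u] = '\0';
    return PIN_OK;
}
```

Returning a distinct result per failure lets the caller map each case onto the matching `hw_conflict_type_t` value when reporting to Diagnostics.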
**Hardware Resource Conflict Detection:**
```c
typedef enum {
HW_CONFLICT_GPIO_DUPLICATE = 0, // Same GPIO used by multiple components
HW_CONFLICT_I2C_ADDRESS = 1, // I2C address conflict
HW_CONFLICT_SPI_CS = 2, // SPI chip select conflict
HW_CONFLICT_ADC_WIFI = 3, // ADC2 used with Wi-Fi active
HW_CONFLICT_STRAPPING_PIN = 4, // Strapping pin used for I/O
HW_CONFLICT_POWER_DOMAIN = 5 // Power domain conflict
} hw_conflict_type_t;
typedef struct {
hw_conflict_type_t type; // Conflict type
uint8_t resource_id; // Conflicting resource ID
char component1[32]; // First component involved
char component2[32]; // Second component involved
char description[128]; // Human-readable description
} hw_conflict_t;
```
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-HW-001** | SR-HW-001, SR-HW-002, SR-HW-003, SR-HW-004 | Sensor abstraction layer and state management |
| **F-HW-002** | SR-HW-005, SR-HW-006, SR-HW-007, SR-HW-008 | Hardware interface abstraction and GPIO discipline |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-HW-001** | SWR-HW-001, SWR-HW-002, SWR-HW-003 | SAL interface, sensor drivers, state machine |
| **F-HW-002** | SWR-HW-004, SWR-HW-005, SWR-HW-006 | Interface abstraction, GPIO management, conflict detection |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **Sensor Abstraction Layer** | Uniform sensor interface, state management | `drivers/sensor_abstraction/` |
| **Hardware Interface Manager** | Interface abstraction, resource management | `drivers/hw_interface_mgr/` |
| **GPIO Manager** | GPIO discipline, conflict detection | `drivers/gpio_manager/` |
| **Sensor Drivers** | Hardware-specific sensor implementations | `drivers/sensors/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **ESP-IDF Wrappers** | Low-level hardware access | `ESP_IDF_FW_wrappers/` |
| **Diagnostics** | Hardware fault reporting | `application_layer/diag_task/` |
| **Machine Constant Manager** | Hardware configuration management | `application_layer/business_stack/machine_constant_manager/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "Hardware Abstraction Feature"
SAL[Sensor Abstraction Layer]
HW_MGR[Hardware Interface Manager]
GPIO_MGR[GPIO Manager]
SENSOR_DRV[Sensor Drivers]
end
subgraph "Application Layer"
SENSOR_MGR[Sensor Manager]
PERSIST[Persistence]
DIAG[Diagnostics]
end
subgraph "ESP-IDF Wrappers"
I2C_WRAP[I2C Wrapper]
SPI_WRAP[SPI Wrapper]
UART_WRAP[UART Wrapper]
ADC_WRAP[ADC Wrapper]
GPIO_WRAP[GPIO Wrapper]
end
subgraph "Physical Hardware"
SENSORS[Physical Sensors]
SD_CARD[SD Card]
OLED[OLED Display]
BUTTONS[Buttons]
end
SENSOR_MGR --> SAL
SAL --> SENSOR_DRV
SENSOR_DRV --> HW_MGR
HW_MGR --> I2C_WRAP
HW_MGR --> SPI_WRAP
HW_MGR --> UART_WRAP
HW_MGR --> ADC_WRAP
GPIO_MGR --> GPIO_WRAP
I2C_WRAP --> SENSORS
SPI_WRAP --> SD_CARD
UART_WRAP --> SENSORS
ADC_WRAP --> SENSORS
GPIO_WRAP --> BUTTONS
GPIO_WRAP --> OLED
SAL -.->|Hardware Events| DIAG
HW_MGR -.->|Resource Conflicts| DIAG
```
### 4.4 Sensor Abstraction Flow
```mermaid
sequenceDiagram
participant APP as Application
participant SAL as Sensor Abstraction Layer
participant DRV as Sensor Driver
participant HW as Hardware Interface
participant SENSOR as Physical Sensor
Note over APP,SENSOR: Sensor Access Through Abstraction
APP->>SAL: sal_readSensor(sensor_id)
SAL->>SAL: validateSensorState(sensor_id)
alt Sensor State Valid
SAL->>DRV: driver_read_raw(sensor_id)
DRV->>HW: hw_i2c_read(port, addr, data)
HW->>SENSOR: I2C transaction
SENSOR-->>HW: sensor_data
HW-->>DRV: raw_value
DRV->>DRV: convert_value(raw_value)
DRV-->>SAL: converted_value
SAL->>SAL: validateReading(converted_value)
SAL-->>APP: sensor_reading
else Sensor State Invalid
SAL->>SAL: attemptSensorRecovery(sensor_id)
SAL-->>APP: error_sensor_not_ready
end
```
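The read path in the diagram above (state check, raw read, conversion, range validation) can be sketched as a small dispatch function. The sketch makes several assumptions: `sensor_slot_t` is a hypothetical reduction of `sensor_driver_interface_t` and `sensor_metadata_t`, the `demo_*` functions replace a real bus transaction, and the recovery attempt on the invalid-state branch is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    READ_OK = 0,
    READ_ERR_NOT_READY,  /* sensor state does not permit reads */
    READ_ERR_DRIVER,     /* raw read or conversion failed */
    READ_ERR_RANGE       /* value outside the valid range */
} read_result_t;

/* Hypothetical reduction of the driver interface and metadata. */
typedef struct {
    bool  enabled;       /* stands in for SENSOR_STATE_ENABLED */
    float min_value;
    float max_value;
    bool (*read_raw)(uint8_t sensor_id, uint32_t *raw_value);
    bool (*convert_value)(uint32_t raw_value, float *converted_value);
} sensor_slot_t;

/* Demo driver: pretends the bus returned 250, meaning 25.0 in tenths of a unit. */
static bool demo_read_raw(uint8_t sensor_id, uint32_t *raw_value)
{
    (void)sensor_id;
    *raw_value = 250u;
    return true;
}

static bool demo_convert(uint32_t raw_value, float *converted_value)
{
    *converted_value = (float)raw_value / 10.0f;
    return true;
}

/* Mirrors the sequence: validate state, read raw, convert, validate reading. */
static read_result_t read_sensor(const sensor_slot_t *slot, uint8_t sensor_id, float *value)
{
    uint32_t raw = 0u;
    float converted = 0.0f;

    assert(slot != NULL && value != NULL);
    if (!slot->enabled) {
        return READ_ERR_NOT_READY;
    }
    if (!slot->read_raw(sensor_id, &raw) || !slot->convert_value(raw, &converted)) {
        return READ_ERR_DRIVER;
    }
    if (converted < slot->min_value || converted > slot->max_value) {
        return READ_ERR_RANGE;
    }
    *value = converted;
    return READ_OK;
}
```

Because the application only sees `read_sensor()`, swapping the demo driver for an I2C, SPI or UART driver does not change the calling code.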
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **System Initialization:**
- Initialize GPIO map and validate pin assignments
- Detect hardware resource conflicts and report violations
- Initialize hardware interface abstractions (I2C, SPI, UART, ADC)
- Scan for connected sensors and initialize sensor drivers
- Establish sensor abstraction layer and state management
2. **Sensor Management:**
- Maintain sensor state machine for all detected sensors
- Provide uniform access interface regardless of hardware interface
- Handle sensor failures and recovery attempts
- Monitor sensor health and performance metrics
3. **Hardware Interface Management:**
- Enforce GPIO discipline and prevent strapping pin usage
- Manage shared resources and prevent conflicts
- Provide consistent interface abstraction across all hardware types
- Monitor interface health and detect hardware failures
4. **Resource Conflict Prevention:**
- Validate GPIO assignments during initialization
- Detect and report hardware resource conflicts
- Enforce ADC1/ADC2 separation when Wi-Fi is active
- Maintain canonical GPIO map as single source of truth
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **GPIO Conflict** | Pin assignment validation | Report conflict, prevent initialization |
| **Sensor Communication Failure** | Interface timeout/error | Mark sensor as faulty, attempt recovery |
| **Hardware Interface Failure** | Transaction failure | Report hardware fault, disable interface |
| **Strapping Pin Usage** | GPIO map validation | Prevent usage, report configuration error |
| **ADC2/Wi-Fi Conflict** | Resource validation | Force ADC1 usage, report conflict |
| **I2C Address Conflict** | Device scanning | Report conflict, disable conflicting devices |
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Initialize hardware abstractions, detect sensors |
| **RUNNING** | Full hardware abstraction, continuous sensor monitoring |
| **WARNING** | Enhanced hardware monitoring, sensor health checks |
| **FAULT** | Critical hardware functions only, preserve sensor states |
| **OTA_UPDATE** | Maintain hardware state, suspend sensor operations |
| **TEARDOWN** | Graceful hardware shutdown, preserve configurations |
| **SERVICE** | Limited hardware access for diagnostics |
| **SD_DEGRADED** | Continue sensor operations, report storage degradation |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Sensor State Transitions:** Maximum 100ms for state changes
- **Hardware Interface Operations:** Bounded timeouts for all transactions
- **GPIO Conflict Detection:** Complete during system initialization
- **Sensor Recovery Attempts:** Maximum 3 attempts with exponential backoff
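
The recovery limit above can be sketched as a bounded retry loop with doubling delays. The base delay of 100 ms is an assumption, as the specification only fixes the attempt count, and the callback types and the `record_delay` and `succeed_on_third_call` stubs are hypothetical helpers that let the sketch run on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECOVERY_ATTEMPTS 3u    /* from the constraint above */
#define BACKOFF_BASE_MS       100u  /* assumption: delay before the first retry */

typedef bool (*recover_fn_t)(void *ctx);
typedef void (*delay_fn_t)(uint32_t delay_ms);

/* Delay before retry number `retry` (0-based): base, 2x base, 4x base, ... */
static uint32_t backoff_delay_ms(uint32_t retry)
{
    return BACKOFF_BASE_MS << retry;
}

/* Tries recovery up to MAX_RECOVERY_ATTEMPTS times, waiting between attempts. */
static bool recover_with_backoff(recover_fn_t try_recover, void *ctx,
                                 delay_fn_t delay, uint32_t *attempts_used)
{
    assert(try_recover != NULL && delay != NULL && attempts_used != NULL);
    for (uint32_t attempt = 0u; attempt < MAX_RECOVERY_ATTEMPTS; ++attempt) {
        if (attempt > 0u) {
            delay(backoff_delay_ms(attempt - 1u));
        }
        if (try_recover(ctx)) {
            *attempts_used = attempt + 1u;
            return true;
        }
    }
    *attempts_used = MAX_RECOVERY_ATTEMPTS;
    return false;
}

/* Host-side stubs: record the requested delay instead of sleeping. */
static uint32_t total_delay_ms;

static void record_delay(uint32_t delay_ms)
{
    total_delay_ms += delay_ms;
}

static bool succeed_on_third_call(void *ctx)
{
    int *calls = ctx;
    return ++(*calls) >= 3;
}
```

On the target, the delay callback would typically wrap a FreeRTOS delay such as `vTaskDelay(pdMS_TO_TICKS(delay_ms))`. After the final failed attempt the sensor would remain in the faulty state.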
### 6.2 Resource Constraints
- **GPIO Usage:** Enforce strapping pin restrictions and conflict prevention
- **I2C Pull-ups:** Physical pull-ups required (2.2kΩ - 4.7kΩ for 3.3V)
- **ADC Constraints:** Use ADC1 only while Wi-Fi is active; ADC2 is available only while Wi-Fi is inactive
- **Memory Usage:** Maximum 32KB for abstraction layer buffers and state
### 6.3 Hardware Constraints
- **ESP32-S3 Limitations:** Respect hardware capabilities and restrictions
- **Interface Speeds:** I2C up to 400kHz, SPI up to 80MHz
- **Voltage Levels:** 3.3V logic levels for all interfaces
- **Current Limitations:** Respect GPIO current drive capabilities
## 7. Interface Specifications
### 7.1 Sensor Abstraction Layer API
```c
// SAL initialization and management
bool sal_initialize(void);
bool sal_detectSensors(void);
bool sal_getSensorCount(uint8_t* count);
bool sal_getSensorList(uint8_t* sensor_ids, size_t* count);
// Sensor operations
bool sal_readSensor(uint8_t sensor_id, float* value);
bool sal_readMultipleSensors(uint8_t* sensor_ids, size_t count,
sensor_reading_t* readings);
bool sal_calibrateSensor(uint8_t sensor_id, const calibration_data_t* cal_data);
bool sal_resetSensor(uint8_t sensor_id);
// Sensor state and health
sensor_state_t sal_getSensorState(uint8_t sensor_id);
bool sal_getSensorHealth(uint8_t sensor_id, sensor_health_t* health);
bool sal_performSensorDiagnostics(uint8_t sensor_id, sensor_diagnostics_t* diag);
```
### 7.2 Hardware Interface Manager API
```c
// Interface initialization
bool hwMgr_initialize(void);
bool hwMgr_initializeI2C(uint8_t port, const i2c_config_t* config);
bool hwMgr_initializeSPI(uint8_t host, const spi_config_t* config);
bool hwMgr_initializeUART(uint8_t port, const uart_config_t* config);
bool hwMgr_initializeADC(const adc_config_t* config);
// Resource management
bool hwMgr_reserveResource(hw_resource_type_t type, uint8_t resource_id,
const char* component_name);
bool hwMgr_releaseResource(hw_resource_type_t type, uint8_t resource_id);
bool hwMgr_validateResources(void);
bool hwMgr_getResourceConflicts(hw_conflict_t* conflicts, size_t* count);
```
### 7.3 GPIO Manager API
```c
// GPIO management
bool gpioMgr_initialize(void);
bool gpioMgr_reservePin(uint8_t gpio_num, gpio_function_t function,
const char* component_name);
bool gpioMgr_releasePin(uint8_t gpio_num);
bool gpioMgr_configurePin(uint8_t gpio_num, const gpio_config_t* config);
// GPIO validation and conflict detection
bool gpioMgr_validateGPIOMap(void);
bool gpioMgr_isStrappingPin(uint8_t gpio_num);
bool gpioMgr_detectConflicts(gpio_conflict_t* conflicts, size_t* count);
bool gpioMgr_getGPIOMap(gpio_map_t* map);
```
## 8. Testing and Validation
### 8.1 Unit Testing
- **Sensor Abstraction:** All sensor types and state transitions
- **Interface Abstraction:** I2C, SPI, UART, ADC operations
- **GPIO Management:** Pin assignment and conflict detection
- **Resource Management:** Resource allocation and validation
### 8.2 Integration Testing
- **Cross-Interface Testing:** Multiple interfaces operating simultaneously
- **Sensor Integration:** All sensor types through abstraction layer
- **Resource Conflict Testing:** Deliberate conflict scenarios
- **State Coordination:** Hardware abstraction with system states
### 8.3 System Testing
- **Hardware Compatibility:** All supported sensor hardware configurations
- **Performance Testing:** Interface throughput and timing constraints
- **Fault Injection:** Hardware failure simulation and recovery
- **Long-Duration Testing:** Extended operation with hardware monitoring
### 8.4 Acceptance Criteria
- All sensor types accessible through uniform SAL interface
- Hardware interfaces properly abstracted from application layer
- GPIO conflicts detected and prevented during initialization
- No direct hardware access from application components
- Sensor state management operates correctly under all conditions
- Hardware resource conflicts properly detected and reported
- Complete hardware abstraction maintains system performance
## 9. Dependencies
### 9.1 Internal Dependencies
- **Sensor Manager:** Primary consumer of sensor abstraction layer
- **Diagnostics:** Hardware fault reporting and event logging
- **Machine Constant Manager:** Hardware configuration management
- **State Manager:** Hardware behavior coordination across system states
### 9.2 External Dependencies
- **ESP-IDF Framework:** Low-level hardware drivers and HAL
- **Hardware Components:** Physical sensors, interfaces, and peripherals
- **FreeRTOS:** Task coordination and resource management
- **Hardware Design:** GPIO assignments and interface configurations
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Dynamic Sensor Discovery:** Runtime sensor detection and configuration
- **Advanced Sensor Fusion:** Multi-sensor data correlation and validation
- **Hardware Health Monitoring:** Predictive hardware failure detection
- **Plug-and-Play Support:** Hot-swappable sensor support
### 10.2 Scalability Considerations
- **Additional Sensor Types:** Framework supports easy sensor type extension
- **Multiple Interface Support:** Support for additional hardware interfaces
- **Advanced GPIO Management:** Dynamic GPIO allocation and optimization
- **Hardware Virtualization:** Virtual hardware interfaces for testing
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-HW, SWR-HW)
**Next Review:** After component implementation


@@ -0,0 +1,749 @@
# Feature Specification: Firmware Update (OTA)
# Feature ID: F-OTA (F-OTA-001 to F-OTA-005)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Firmware Update (OTA)
## 1. Feature Overview
### 1.1 Feature Purpose
The Firmware Update (OTA) feature provides secure, reliable over-the-air firmware update capabilities for the ASF Sensor Hub. This feature enables controlled firmware lifecycle management, ensuring system availability, data integrity, and fault containment during firmware update operations.
### 1.2 Feature Scope
**In Scope:**
- OTA negotiation and readiness validation with Main Hub
- Secure firmware reception over encrypted communication channels
- Firmware integrity validation using cryptographic verification
- Safe firmware activation with A/B partitioning and automatic rollback
- Controlled system teardown and data preservation during updates
**Out of Scope:**
- Firmware generation and cryptographic signing infrastructure
- Cloud-side firmware distribution and management
- Main Hub OTA coordination logic
- Hardware-level secure boot implementation (dependency)
## 2. Sub-Features
### 2.1 F-OTA-001: OTA Update Negotiation
**Description:** Comprehensive negotiation phase between Sensor Hub and Main Hub to establish OTA readiness and coordinate update initiation.
**Readiness Validation Criteria:**
```c
typedef struct {
    system_state_t current_state;   // Must be RUNNING
    bool power_stable;              // Power supply stable
    bool storage_available;         // SD card accessible with sufficient space
    bool communication_stable;      // Network connection stable
    uint32_t free_sd_space_mb;      // Available SD card space
    uint32_t free_nvs_entries;      // Available NVS entries
    float supply_voltage;           // Current supply voltage
    uint32_t uptime_seconds;        // System uptime for stability
} ota_readiness_t;

typedef enum {
    OTA_READY_ACCEPT = 0,           // System ready for OTA
    OTA_READY_REJECT_STATE = 1,     // Invalid system state
    OTA_READY_REJECT_POWER = 2,     // Power instability
    OTA_READY_REJECT_STORAGE = 3,   // Storage unavailable
    OTA_READY_REJECT_COMM = 4,      // Communication unstable
    OTA_READY_REJECT_RESOURCES = 5  // Insufficient resources
} ota_readiness_result_t;
```
**Negotiation Sequence:**
```mermaid
sequenceDiagram
participant MH as Main Hub
participant API as Main Hub APIs
participant OTA as OTA Manager
participant STM as State Manager
participant DIAG as Diagnostics
MH->>API: OTA_AVAILABILITY_NOTIFICATION
API->>OTA: otaAvailabilityReceived(metadata)
OTA->>OTA: validateSystemReadiness()
alt System Ready
OTA->>STM: requestStateTransition(OTA_PREP)
STM->>STM: validateTransition()
STM-->>OTA: transitionAccepted()
OTA->>API: otaResponse(ACCEPT, readiness_info)
API->>MH: OTA_NEGOTIATION_RESPONSE(ACCEPT)
else System Not Ready
OTA->>DIAG: logDiagnosticEvent(OTA_REJECTED, reason)
OTA->>API: otaResponse(REJECT, rejection_reason)
API->>MH: OTA_NEGOTIATION_RESPONSE(REJECT)
end
```
**Readiness Validation Logic:**
- **System State Check:** Must be in RUNNING state (not WARNING/FAULT/SERVICE/SD_DEGRADED)
- **Power Stability:** Supply voltage within 3.0V-3.6V range for >30 seconds
- **Storage Availability:** SD card accessible with >100MB free space
- **Communication Stability:** Network connection stable for >60 seconds
- **Resource Availability:** Sufficient NVS entries and heap memory
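The criteria above can be read as an ordered series of checks in which the first failure determines the rejection reason. The following is a minimal sketch under stated assumptions, not the project's implementation: the two types are repeated from the definitions above so the block compiles on its own, while the `SYSTEM_STATE_*` enumerators, the `OTA_MIN_*`/`OTA_MAX_*` constants, the NVS threshold of 16 entries, and the function name `ota_check_readiness` are hypothetical. The duration conditions (>30 s, >60 s) are assumed to be already reflected in `power_stable` and `communication_stable` by whoever fills the structure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed subset of system states; the real enum belongs to the State Manager. */
typedef enum {
    SYSTEM_STATE_INIT = 0,
    SYSTEM_STATE_RUNNING,
    SYSTEM_STATE_WARNING,
    SYSTEM_STATE_FAULT,
    SYSTEM_STATE_SERVICE,
    SYSTEM_STATE_SD_DEGRADED
} system_state_t;

/* Repeated from the readiness definitions above. */
typedef struct {
    system_state_t current_state;
    bool power_stable;
    bool storage_available;
    bool communication_stable;
    uint32_t free_sd_space_mb;
    uint32_t free_nvs_entries;
    float supply_voltage;
    uint32_t uptime_seconds;
} ota_readiness_t;

typedef enum {
    OTA_READY_ACCEPT = 0,
    OTA_READY_REJECT_STATE = 1,
    OTA_READY_REJECT_POWER = 2,
    OTA_READY_REJECT_STORAGE = 3,
    OTA_READY_REJECT_COMM = 4,
    OTA_READY_REJECT_RESOURCES = 5
} ota_readiness_result_t;

#define OTA_MIN_SUPPLY_V     3.0f
#define OTA_MAX_SUPPLY_V     3.6f
#define OTA_MIN_SD_FREE_MB   100u  /* spec requires strictly more than this */
#define OTA_MIN_NVS_ENTRIES  16u   /* hypothetical: the spec gives no number */

/* Returns the first failing criterion, or OTA_READY_ACCEPT if all pass. */
ota_readiness_result_t ota_check_readiness(const ota_readiness_t *r)
{
    if (r->current_state != SYSTEM_STATE_RUNNING) {
        return OTA_READY_REJECT_STATE;
    }
    if (!r->power_stable ||
        r->supply_voltage < OTA_MIN_SUPPLY_V ||
        r->supply_voltage > OTA_MAX_SUPPLY_V) {
        return OTA_READY_REJECT_POWER;
    }
    if (!r->storage_available || r->free_sd_space_mb <= OTA_MIN_SD_FREE_MB) {
        return OTA_READY_REJECT_STORAGE;
    }
    if (!r->communication_stable) {
        return OTA_READY_REJECT_COMM;
    }
    if (r->free_nvs_entries < OTA_MIN_NVS_ENTRIES) {
        return OTA_READY_REJECT_RESOURCES;
    }
    return OTA_READY_ACCEPT;
}
```

Returning a single reason keeps the negotiation response simple; a variant that collects every failing criterion in a bitmask would give the Main Hub more detail at the cost of a wider interface.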
### 2.2 F-OTA-002: Firmware Reception and Storage
**Description:** Secure reception of firmware image from Main Hub with chunked download, progress monitoring, and temporary storage management.
**Download Configuration:**
```c
typedef struct {
    uint32_t chunk_size;            // 4096 bytes (matches the 4 KB flash erase sector)
    uint32_t total_size;            // Total firmware size
    uint32_t total_chunks;          // Number of chunks
    char firmware_version[32];      // Target firmware version
    uint8_t sha256_hash[32];        // Expected SHA-256 hash
    uint32_t timeout_seconds;       // Download timeout (600 seconds)
} ota_download_config_t;

// Declared before ota_download_progress_t, which uses it
typedef enum {
    OTA_DOWNLOAD_IDLE = 0,
    OTA_DOWNLOAD_ACTIVE = 1,
    OTA_DOWNLOAD_PAUSED = 2,
    OTA_DOWNLOAD_COMPLETE = 3,
    OTA_DOWNLOAD_FAILED = 4,
    OTA_DOWNLOAD_TIMEOUT = 5
} ota_download_state_t;

typedef struct {
    uint32_t chunks_received;       // Number of chunks received
    uint32_t bytes_received;        // Total bytes received
    uint32_t chunks_failed;         // Failed chunk count
    uint32_t retries_performed;     // Retry attempts
    uint64_t start_time_ms;         // Download start timestamp
    uint64_t last_chunk_time_ms;    // Last chunk received timestamp
    ota_download_state_t state;     // Current download state
} ota_download_progress_t;
```
**Storage Management:**
- **Temporary Storage:** SD card path `/ota/firmware_temp.bin`
- **Chunk Verification:** Per-chunk CRC32 validation
- **Progress Persistence:** Download state persisted to NVS for recovery
- **Timeout Handling:** 10-minute maximum download duration
- **Retry Logic:** Up to 3 retries per failed chunk
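The per-chunk CRC32 check and the three-retry limit can be sketched as follows. This is an illustration only: the spec does not name the CRC32 variant, so the common IEEE 802.3 (reflected, polynomial `0xEDB88320`) form is assumed, and `ota_crc32`, `ota_check_chunk`, `ota_total_chunks`, and the `CHUNK_ACK_ABORT` value are hypothetical names that do not appear in the specification.

```c
#include <stddef.h>
#include <stdint.h>

#define OTA_MAX_CHUNK_RETRIES 3u

typedef enum {
    CHUNK_ACK_SUCCESS = 0,  /* chunk accepted, may be written to storage */
    CHUNK_ACK_RETRY   = 1,  /* CRC mismatch, ask the sender to retransmit */
    CHUNK_ACK_ABORT   = 2   /* hypothetical: retries exhausted, abort download */
} chunk_ack_t;

/* Bitwise CRC-32 (IEEE 802.3 variant, assumed). Slow but table-free. */
uint32_t ota_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return ~crc;
}

/* Validates one chunk; *retries counts retransmissions of the current chunk. */
chunk_ack_t ota_check_chunk(const uint8_t *data, size_t len,
                            uint32_t expected_crc, uint32_t *retries)
{
    if (ota_crc32(data, len) == expected_crc) {
        *retries = 0;  /* next chunk starts with a fresh retry budget */
        return CHUNK_ACK_SUCCESS;
    }
    if (*retries >= OTA_MAX_CHUNK_RETRIES) {
        return CHUNK_ACK_ABORT;
    }
    (*retries)++;
    return CHUNK_ACK_RETRY;
}

/* Number of chunks needed for an image, rounding up without overflow. */
uint32_t ota_total_chunks(uint32_t total_size, uint32_t chunk_size)
{
    if (chunk_size == 0u) {
        return 0u;
    }
    return total_size / chunk_size + ((total_size % chunk_size) != 0u ? 1u : 0u);
}
```

On the target, a hardware-assisted or table-driven CRC would likely replace the bitwise loop; the retry accounting stays the same either way.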
**Download Flow:**
```mermaid
sequenceDiagram
participant MH as Main Hub
participant NET as Network Stack
participant OTA as OTA Manager
participant SD as SD Card Storage
participant NVS as NVS Storage
Note over MH,NVS: Firmware Download Phase
MH->>NET: firmwareChunk(chunk_id, data, crc32)
NET->>OTA: chunkReceived(chunk_id, data, crc32)
OTA->>OTA: validateChunkCRC(data, crc32)
alt CRC Valid
OTA->>SD: writeChunk(chunk_id, data)
SD-->>OTA: writeComplete()
OTA->>NVS: updateProgress(chunks_received++)
OTA->>NET: chunkAck(chunk_id, SUCCESS)
NET->>MH: CHUNK_ACK(chunk_id, SUCCESS)
else CRC Invalid
OTA->>NET: chunkAck(chunk_id, RETRY)
NET->>MH: CHUNK_ACK(chunk_id, RETRY)
end
OTA->>OTA: checkDownloadComplete()
alt All Chunks Received
OTA->>OTA: transitionToValidation()
end
```
### 2.3 F-OTA-003: Firmware Integrity Validation
**Description:** Comprehensive firmware integrity and authenticity validation using cryptographic verification before activation.
**Validation Methods:**
```c
typedef struct {
    bool size_valid;            // Firmware size matches metadata
    bool sha256_valid;          // SHA-256 hash verification
    bool signature_valid;       // Digital signature verification (if available)
    bool partition_valid;       // Partition table validation
    bool version_valid;         // Version number validation
    bool compatibility_valid;   // Hardware compatibility check
} ota_validation_result_t;

typedef enum {
    OTA_VALIDATION_PENDING = 0,
    OTA_VALIDATION_IN_PROGRESS = 1,
    OTA_VALIDATION_SUCCESS = 2,
    OTA_VALIDATION_FAILED_SIZE = 3,
    OTA_VALIDATION_FAILED_HASH = 4,
    OTA_VALIDATION_FAILED_SIGNATURE = 5,
    OTA_VALIDATION_FAILED_PARTITION = 6,
    OTA_VALIDATION_FAILED_VERSION = 7,
    OTA_VALIDATION_FAILED_COMPATIBILITY = 8
} ota_validation_status_t;
```
**Validation Sequence:**
1. **Size Validation:** Verify received firmware size matches metadata
2. **SHA-256 Verification:** Calculate and compare full image hash
3. **Signature Verification:** Verify the digital signature of the image (if available)
4. **Partition Table Validation:** Verify partition structure compatibility
5. **Version Validation:** Ensure version progression (anti-rollback)
6. **Hardware Compatibility:** Verify target platform compatibility
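The Version Validation step depends on comparing version strings numerically rather than lexically. A minimal sketch follows, assuming the `firmware_version` field holds a `MAJOR.MINOR.PATCH` string; the spec does not define the format, and `fw_version_t`, `fw_version_parse`, `fw_version_compare`, and `ota_version_is_upgrade` are hypothetical helpers. It covers only the software comparison, not any eFuse-backed enforcement.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    unsigned long major;
    unsigned long minor;
    unsigned long patch;
} fw_version_t;

/* Parses one decimal field that must be followed by `terminator`. */
static bool parse_field(const char **cursor, unsigned long *out, char terminator)
{
    char *end = NULL;
    if (!isdigit((unsigned char)**cursor)) {
        return false;  /* rejects empty fields, signs, and whitespace */
    }
    *out = strtoul(*cursor, &end, 10);
    if (*end != terminator) {
        return false;
    }
    *cursor = (terminator == '\0') ? end : end + 1;
    return true;
}

/* Accepts exactly "MAJOR.MINOR.PATCH" (assumed format). */
bool fw_version_parse(const char *text, fw_version_t *v)
{
    return text != NULL &&
           parse_field(&text, &v->major, '.') &&
           parse_field(&text, &v->minor, '.') &&
           parse_field(&text, &v->patch, '\0');
}

/* Returns <0, 0, >0 when a is older than, equal to, or newer than b. */
int fw_version_compare(const fw_version_t *a, const fw_version_t *b)
{
    if (a->major != b->major) return (a->major < b->major) ? -1 : 1;
    if (a->minor != b->minor) return (a->minor < b->minor) ? -1 : 1;
    if (a->patch != b->patch) return (a->patch < b->patch) ? -1 : 1;
    return 0;
}

/* True only if both strings parse and the candidate is strictly newer. */
bool ota_version_is_upgrade(const char *current, const char *candidate)
{
    fw_version_t cur;
    fw_version_t cand;
    if (!fw_version_parse(current, &cur) || !fw_version_parse(candidate, &cand)) {
        return false;  /* malformed versions are treated as a validation failure */
    }
    return fw_version_compare(&cand, &cur) > 0;
}
```

Treating an unparsable version as a failure is a conservative choice: it maps naturally onto `OTA_VALIDATION_FAILED_VERSION` rather than letting an unknown image through.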
**Validation Flow:**
```mermaid
sequenceDiagram
participant OTA as OTA Manager
participant SD as SD Card Storage
participant SEC as Security Manager
participant DIAG as Diagnostics
participant API as Main Hub APIs
Note over OTA,API: Firmware Validation Phase
OTA->>SD: readFirmwareImage()
SD-->>OTA: firmware_data
OTA->>OTA: validateSize(firmware_data)
alt Size Valid
OTA->>SEC: calculateSHA256(firmware_data)
SEC-->>OTA: calculated_hash
OTA->>OTA: compareSHA256(calculated_hash, expected_hash)
alt Hash Valid
OTA->>SEC: validateSignature(firmware_data)
SEC-->>OTA: signature_result
alt Signature Valid
OTA->>OTA: validatePartitionTable(firmware_data)
OTA->>OTA: validateVersion(firmware_data)
OTA->>OTA: validateCompatibility(firmware_data)
OTA->>API: validationComplete(SUCCESS)
else Signature Invalid
OTA->>DIAG: logDiagnosticEvent(OTA_VALIDATION_FAILED)
OTA->>API: validationComplete(FAILED_SIGNATURE)
end
else Hash Invalid
OTA->>DIAG: logDiagnosticEvent(OTA_HASH_MISMATCH)
OTA->>API: validationComplete(FAILED_HASH)
end
else Size Invalid
OTA->>DIAG: logDiagnosticEvent(OTA_SIZE_MISMATCH)
OTA->>API: validationComplete(FAILED_SIZE)
end
```
### 2.4 F-OTA-004: Safe Firmware Activation
**Description:** Controlled firmware activation with system teardown, data preservation, and safe transition to new firmware.
**Activation Sequence:**
```c
typedef enum {
    OTA_ACTIVATION_IDLE = 0,
    OTA_ACTIVATION_TEARDOWN = 1,
    OTA_ACTIVATION_DATA_FLUSH = 2,
    OTA_ACTIVATION_FLASHING = 3,
    OTA_ACTIVATION_PARTITION_UPDATE = 4,
    OTA_ACTIVATION_REBOOT = 5,
    OTA_ACTIVATION_COMPLETE = 6,
    OTA_ACTIVATION_FAILED = 7
} ota_activation_state_t;

typedef struct {
    bool sensor_data_flushed;        // Latest sensor data preserved
    bool diagnostics_flushed;        // Diagnostic events preserved
    bool machine_constants_flushed;  // Machine constants preserved
    bool calibration_data_flushed;   // Calibration data preserved
    bool system_state_flushed;       // System state preserved
} ota_data_flush_status_t;
```
**Activation Flow:**
```mermaid
sequenceDiagram
participant OTA as OTA Manager
participant STM as State Manager
participant PERSIST as Persistence
participant FLASH as Flash Manager
participant BOOT as Boot Manager
Note over OTA,BOOT: Firmware Activation Phase
OTA->>STM: requestStateTransition(TEARDOWN)
STM->>STM: initiateTeardown()
STM->>PERSIST: flushCriticalData()
PERSIST->>PERSIST: flushSensorData()
PERSIST->>PERSIST: flushDiagnostics()
PERSIST->>PERSIST: flushMachineConstants()
PERSIST->>PERSIST: flushCalibrationData()
PERSIST-->>STM: flushComplete()
STM->>OTA: teardownComplete()
OTA->>STM: requestStateTransition(OTA_UPDATE)
OTA->>FLASH: flashFirmwareToInactivePartition()
FLASH-->>OTA: flashingComplete()
OTA->>BOOT: updatePartitionTable()
BOOT-->>OTA: partitionTableUpdated()
OTA->>STM: systemReboot()
```
**Data Preservation Priority:**
1. **Critical System Data:** Machine constants, calibration data
2. **Diagnostic Data:** Recent diagnostic events and system health
3. **Sensor Data:** Latest sensor readings and statistics
4. **System State:** Current system state and configuration
### 2.5 F-OTA-005: A/B Partitioning with Rollback
**Description:** A/B partitioning implementation with automatic rollback capability for safe firmware updates.
**Partition Management:**
```c
typedef enum {
    OTA_PARTITION_A = 0,        // Primary partition (ota_0)
    OTA_PARTITION_B = 1,        // Secondary partition (ota_1)
    OTA_PARTITION_FACTORY = 2   // Factory partition (rescue)
} ota_partition_t;

typedef struct {
    ota_partition_t active_partition;    // Currently running partition
    ota_partition_t inactive_partition;  // Target for next update
    char active_version[32];             // Version of active firmware
    char inactive_version[32];           // Version of inactive firmware
    uint32_t boot_count;                 // Boot attempts since activation
    uint64_t activation_time;            // Activation timestamp
    bool rollback_pending;               // Rollback flag
} ota_partition_status_t;
```
**Rollback Triggers:**
- **Boot Failure:** System fails to boot within 60 seconds
- **Health Check Failure:** No health report within 120 seconds after boot
- **Application Crash:** Critical application failure during confirmation period
- **Manual Rollback:** Explicit rollback command from Main Hub
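The four triggers can be evaluated by a single decision function. The sketch below is an assumption-laden illustration: `rollback_inputs_t`, the `ROLLBACK_REASON_*` enumerators, and `ota_evaluate_rollback` are hypothetical (the spec names `rollback_reason_t` but not its values), and it assumes a manual request is honored even after the firmware has been confirmed. On real hardware the boot-failure case would have to be observed by a watchdog or the bootloader, since a firmware that cannot boot cannot run this check itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define OTA_BOOT_TIMEOUT_MS    60000u   /* boot must complete within 60 s */
#define OTA_HEALTH_TIMEOUT_MS  120000u  /* health report due within 120 s */

typedef enum {
    ROLLBACK_REASON_NONE = 0,
    ROLLBACK_REASON_BOOT_FAILURE,
    ROLLBACK_REASON_HEALTH_TIMEOUT,
    ROLLBACK_REASON_APP_CRASH,
    ROLLBACK_REASON_MANUAL
} rollback_reason_t;

/* Hypothetical snapshot of the conditions the triggers depend on. */
typedef struct {
    bool firmware_confirmed;      /* stability already confirmed */
    bool boot_completed;          /* application reached its main loop */
    bool health_report_received;  /* first health report seen after boot */
    bool critical_crash;          /* critical failure during confirmation */
    bool manual_request;          /* explicit rollback command received */
    uint32_t ms_since_reset;      /* time since the new image started */
    uint32_t ms_since_boot_done;  /* time since boot completed */
} rollback_inputs_t;

/* Returns the reason to roll back, or ROLLBACK_REASON_NONE to keep running. */
rollback_reason_t ota_evaluate_rollback(const rollback_inputs_t *in)
{
    if (in->manual_request) {
        return ROLLBACK_REASON_MANUAL;
    }
    if (in->firmware_confirmed) {
        return ROLLBACK_REASON_NONE;  /* automatic triggers no longer apply */
    }
    if (in->critical_crash) {
        return ROLLBACK_REASON_APP_CRASH;
    }
    if (!in->boot_completed) {
        return (in->ms_since_reset >= OTA_BOOT_TIMEOUT_MS)
                   ? ROLLBACK_REASON_BOOT_FAILURE
                   : ROLLBACK_REASON_NONE;
    }
    if (!in->health_report_received &&
        in->ms_since_boot_done >= OTA_HEALTH_TIMEOUT_MS) {
        return ROLLBACK_REASON_HEALTH_TIMEOUT;
    }
    return ROLLBACK_REASON_NONE;
}
```

Keeping the decision free of side effects makes each trigger easy to unit test; the caller would pass the result to `otaMgr_triggerRollback()` when it is not `ROLLBACK_REASON_NONE`.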
**Rollback Flow:**
```mermaid
sequenceDiagram
participant BOOT as Boot Manager
participant APP as Application
participant HEALTH as Health Monitor
participant OTA as OTA Manager
participant DIAG as Diagnostics
Note over BOOT,DIAG: Firmware Rollback Scenario
BOOT->>APP: startApplication()
alt Boot Successful
APP->>HEALTH: startHealthMonitoring()
HEALTH->>HEALTH: waitForConfirmationWindow(120s)
alt Health Report Received
HEALTH->>OTA: confirmFirmwareStability()
OTA->>OTA: markFirmwareAsValid()
else No Health Report
HEALTH->>OTA: firmwareValidationTimeout()
OTA->>OTA: triggerRollback(HEALTH_TIMEOUT)
end
else Boot Failed
BOOT->>OTA: bootFailureDetected()
OTA->>OTA: triggerRollback(BOOT_FAILURE)
end
alt Rollback Triggered
OTA->>BOOT: switchToInactivePartition()
OTA->>DIAG: logDiagnosticEvent(FIRMWARE_ROLLBACK)
OTA->>BOOT: systemReboot()
end
```
**Rollback Process:**
1. **Failure Detection:** Detect rollback trigger condition
2. **Partition Switch:** Update the boot partition selection (OTA data) so the system boots from the previous partition
3. **Diagnostic Logging:** Record rollback event and reason
4. **System Reboot:** Restart system with previous firmware
5. **Rollback Notification:** Report rollback to Main Hub after recovery
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-OTA-001** | SR-OTA-001, SR-OTA-002, SR-OTA-003 | OTA negotiation and readiness validation |
| **F-OTA-002** | SR-OTA-004, SR-OTA-005, SR-OTA-006 | Firmware reception and temporary storage |
| **F-OTA-003** | SR-OTA-007, SR-OTA-008, SR-OTA-009 | Firmware integrity and authenticity validation |
| **F-OTA-004** | SR-OTA-010, SR-OTA-011, SR-OTA-012, SR-OTA-013 | Safe firmware activation and data preservation |
| **F-OTA-005** | SR-OTA-014, SR-OTA-015, SR-OTA-016 | A/B partitioning and automatic rollback |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-OTA-001** | SWR-OTA-001, SWR-OTA-002, SWR-OTA-003 | Readiness validation, negotiation protocol, state coordination |
| **F-OTA-002** | SWR-OTA-004, SWR-OTA-005, SWR-OTA-006 | Chunked download, progress tracking, storage management |
| **F-OTA-003** | SWR-OTA-007, SWR-OTA-008, SWR-OTA-009 | SHA-256 validation, signature verification, compatibility checks |
| **F-OTA-004** | SWR-OTA-010, SWR-OTA-011, SWR-OTA-012 | Teardown coordination, data flush, firmware flashing |
| **F-OTA-005** | SWR-OTA-013, SWR-OTA-014, SWR-OTA-015 | Partition management, rollback detection, recovery procedures |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **OTA Manager** | OTA coordination, validation, activation | `application_layer/business_stack/fw_upgrader/` |
| **State Manager** | System state coordination, teardown management | `application_layer/business_stack/STM/` |
| **Persistence** | Data flush, firmware storage | `application_layer/DP_stack/persistence/` |
| **Security Manager** | Cryptographic validation, signature verification | `application_layer/security/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **Main Hub APIs** | OTA communication protocol, message handling | `application_layer/business_stack/main_hub_apis/` |
| **Network Stack** | Secure firmware download transport | `drivers/network_stack/` |
| **SD Card Driver** | Temporary firmware storage | `drivers/SDcard/` |
| **NVM Driver** | Progress persistence, partition management | `drivers/nvm/` |
| **Diagnostics** | OTA event logging, failure reporting | `application_layer/diag_task/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "OTA Firmware Update Feature"
OTA[OTA Manager]
STM[State Manager]
PERSIST[Persistence]
SEC[Security Manager]
end
subgraph "Communication Components"
API[Main Hub APIs]
NET[Network Stack]
end
subgraph "Storage Components"
SD[SD Card Driver]
NVM[NVM Driver]
end
subgraph "System Components"
DIAG[Diagnostics]
BOOT[Boot Manager]
HEALTH[Health Monitor]
end
subgraph "External Interfaces"
MH[Main Hub]
FLASH[Flash Memory]
PART[Partition Table]
end
MH <-->|OTA Protocol| API
API <--> OTA
OTA <--> STM
OTA <--> PERSIST
OTA <--> SEC
OTA --> NET
NET -->|Firmware Download| MH
PERSIST --> SD
PERSIST --> NVM
OTA --> DIAG
OTA --> BOOT
OTA --> HEALTH
BOOT --> FLASH
BOOT --> PART
STM -.->|State Events| DIAG
SEC -.->|Validation Events| DIAG
```
### 4.4 OTA Update Sequence
```mermaid
sequenceDiagram
participant MH as Main Hub
participant API as Main Hub APIs
participant OTA as OTA Manager
participant STM as State Manager
participant PERSIST as Persistence
participant SEC as Security Manager
participant BOOT as Boot Manager
Note over MH,BOOT: Complete OTA Update Flow
MH->>API: OTA_AVAILABILITY_NOTIFICATION
API->>OTA: otaAvailabilityReceived()
OTA->>OTA: validateReadiness()
OTA->>STM: requestStateTransition(OTA_PREP)
OTA->>API: otaResponse(ACCEPT)
loop Firmware Download
MH->>API: firmwareChunk(data)
API->>OTA: chunkReceived(data)
OTA->>PERSIST: storeChunk(data)
end
OTA->>SEC: validateFirmware()
SEC-->>OTA: validationResult(SUCCESS)
OTA->>STM: requestStateTransition(TEARDOWN)
STM->>PERSIST: flushCriticalData()
PERSIST-->>STM: flushComplete()
OTA->>BOOT: flashFirmwareToInactivePartition()
OTA->>BOOT: updatePartitionTable()
OTA->>STM: systemReboot()
Note over MH,BOOT: System Reboots with New Firmware
BOOT->>BOOT: bootFromNewPartition()
BOOT->>OTA: firmwareActivated()
OTA->>API: otaStatus(ACTIVATION_SUCCESS)
API->>MH: OTA_STATUS_REPORT(SUCCESS)
```
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **OTA Availability Phase:**
   - Receive OTA availability notification from Main Hub
   - Validate system readiness (state, power, storage, communication)
   - Negotiate OTA acceptance or rejection with detailed reasons
   - Transition to OTA_PREP state if accepted
2. **Firmware Download Phase:**
   - Receive firmware in 4KB chunks over secure channel
   - Validate each chunk with CRC32 verification
   - Store chunks to SD card temporary location
   - Track download progress and handle retries
   - Enforce 10-minute download timeout
3. **Validation Phase:**
   - Perform comprehensive firmware integrity validation
   - Verify SHA-256 hash, digital signature, and compatibility
   - Report validation results to Main Hub
   - Proceed to activation only if all validations pass
4. **Activation Phase:**
   - Coordinate system teardown with State Manager
   - Flush all critical data to persistent storage
   - Flash validated firmware to inactive partition
   - Update the boot partition selection (OTA data) for the next boot
   - Perform controlled system reboot
5. **Confirmation Phase:**
   - Boot with new firmware and start health monitoring
   - Confirm firmware stability within 120-second window
   - Mark firmware as valid or trigger automatic rollback
   - Report final OTA status to Main Hub
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **System Not Ready** | Readiness validation failure | Reject OTA request, report specific reason |
| **Download Timeout** | 10-minute timeout exceeded | Abort download, clean up temporary files |
| **Chunk Corruption** | CRC32 validation failure | Request chunk retransmission (up to 3 retries) |
| **Validation Failure** | Integrity check failure | Abort OTA, report validation error |
| **Flash Failure** | Firmware flashing error | Abort OTA, maintain current firmware |
| **Boot Failure** | New firmware boot failure | Automatic rollback to previous firmware |
| **Health Check Failure** | No health report within window | Automatic rollback to previous firmware |
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | OTA Manager initialization, partition status check |
| **RUNNING** | Accept OTA requests, perform readiness validation |
| **WARNING** | Reject OTA requests (system not stable) |
| **FAULT** | Reject OTA requests (system in fault state) |
| **OTA_PREP** | Prepare for OTA, download and validate firmware, coordinate with other components |
| **OTA_UPDATE** | Flash validated firmware to the inactive partition and switch the boot partition |
| **TEARDOWN** | Coordinate data flush before firmware activation |
| **SERVICE** | Reject OTA requests (maintenance mode) |
| **SD_DEGRADED** | Reject OTA requests (storage unavailable) |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Negotiation Response:** Maximum 5 seconds for readiness validation
- **Download Timeout:** Maximum 10 minutes for complete firmware download
- **Validation Time:** Maximum 2 minutes for integrity validation
- **Activation Time:** Maximum 5 minutes for firmware activation
- **Confirmation Window:** 120 seconds for firmware stability confirmation
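These limits lend themselves to a small lookup table that each phase can check against a monotonic millisecond clock. The following is a sketch under stated assumptions: `ota_phase_t`, `ota_phase_limit_ms`, and `ota_phase_timed_out` are hypothetical names, and the clock source is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    OTA_PHASE_NEGOTIATION = 0,
    OTA_PHASE_DOWNLOAD,
    OTA_PHASE_VALIDATION,
    OTA_PHASE_ACTIVATION,
    OTA_PHASE_CONFIRMATION,
    OTA_PHASE_COUNT
} ota_phase_t;

/* Maximum duration per phase, taken from the timing constraints above. */
static const uint32_t ota_phase_limit_ms[OTA_PHASE_COUNT] = {
    [OTA_PHASE_NEGOTIATION]  = 5u * 1000u,
    [OTA_PHASE_DOWNLOAD]     = 10u * 60u * 1000u,
    [OTA_PHASE_VALIDATION]   = 2u * 60u * 1000u,
    [OTA_PHASE_ACTIVATION]   = 5u * 60u * 1000u,
    [OTA_PHASE_CONFIRMATION] = 120u * 1000u
};

/* True once the phase has run longer than its limit. */
bool ota_phase_timed_out(ota_phase_t phase, uint64_t start_ms, uint64_t now_ms)
{
    if ((unsigned)phase >= (unsigned)OTA_PHASE_COUNT || now_ms < start_ms) {
        return false;  /* unknown phase or clock went backwards: no verdict */
    }
    return (now_ms - start_ms) > ota_phase_limit_ms[phase];
}
```

What happens on expiry differs by phase: a download timeout aborts the transfer, whereas an expired confirmation window leads to rollback, so the caller is expected to map the result to the appropriate action.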
### 6.2 Resource Constraints
- **Storage Requirements:** Minimum 100MB free space on SD card
- **Memory Usage:** Maximum 64KB for OTA buffers and state
- **Network Bandwidth:** Optimized for 4KB chunk size
- **Flash Wear:** Minimize flash write cycles during activation
### 6.3 Security Constraints
- **Encrypted Transport:** All firmware data must be transmitted over TLS
- **Integrity Validation:** SHA-256 verification mandatory
- **Anti-Rollback:** Version progression enforcement via eFuse
- **Secure Storage:** Temporary firmware files encrypted on SD card
## 7. Interface Specifications
### 7.1 OTA Manager Public API
```c
// OTA lifecycle management
bool otaMgr_initialize(void);
bool otaMgr_isReady(void);
ota_status_t otaMgr_getStatus(void);

// OTA operations
bool otaMgr_handleAvailabilityNotification(const ota_metadata_t* metadata);
bool otaMgr_receiveFirmwareChunk(uint32_t chunk_id, const uint8_t* data,
                                 size_t size, uint32_t crc32);
bool otaMgr_validateFirmware(void);
bool otaMgr_activateFirmware(void);

// Rollback operations
bool otaMgr_triggerRollback(rollback_reason_t reason);
bool otaMgr_confirmFirmwareStability(void);
bool otaMgr_isRollbackPending(void);

// Status and diagnostics
bool otaMgr_getDownloadProgress(ota_download_progress_t* progress);
bool otaMgr_getValidationResult(ota_validation_result_t* result);
bool otaMgr_getPartitionStatus(ota_partition_status_t* status);
```
### 7.2 State Manager Integration
**State Transitions:**
- `RUNNING → OTA_PREP`: OTA request accepted
- `OTA_PREP → TEARDOWN`: Firmware validated, ready for activation
- `TEARDOWN → OTA_UPDATE`: Data flushed, ready for flashing
- `OTA_UPDATE → REBOOT`: Firmware activated, system restart
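The transitions listed above can be enforced with a small table lookup. The sketch below is illustrative only: `ota_flow_state_t`, its `SYS_STATE_*` enumerators, and `ota_transition_allowed` are hypothetical stand-ins for the State Manager's real types, and failure paths (for example, returning to RUNNING after a rejected image) are omitted because the list does not define them.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of system states involved in the OTA flow. */
typedef enum {
    SYS_STATE_RUNNING = 0,
    SYS_STATE_OTA_PREP,
    SYS_STATE_TEARDOWN,
    SYS_STATE_OTA_UPDATE,
    SYS_STATE_REBOOT
} ota_flow_state_t;

typedef struct {
    ota_flow_state_t from;
    ota_flow_state_t to;
} ota_transition_t;

/* One entry per transition in the list above. */
static const ota_transition_t ota_allowed_transitions[] = {
    { SYS_STATE_RUNNING,    SYS_STATE_OTA_PREP   },  /* OTA request accepted */
    { SYS_STATE_OTA_PREP,   SYS_STATE_TEARDOWN   },  /* firmware validated */
    { SYS_STATE_TEARDOWN,   SYS_STATE_OTA_UPDATE },  /* data flushed */
    { SYS_STATE_OTA_UPDATE, SYS_STATE_REBOOT     }   /* firmware activated */
};

/* True if `from -> to` is one of the OTA transitions listed above. */
bool ota_transition_allowed(ota_flow_state_t from, ota_flow_state_t to)
{
    const size_t count =
        sizeof(ota_allowed_transitions) / sizeof(ota_allowed_transitions[0]);
    for (size_t i = 0; i < count; i++) {
        if (ota_allowed_transitions[i].from == from &&
            ota_allowed_transitions[i].to == to) {
            return true;
        }
    }
    return false;
}
```

A table keeps the allowed paths in one reviewable place, so adding an abort or recovery transition later is a one-line change rather than new branching logic.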
**State Coordination:**
```c
// State transition requests
bool otaMgr_requestStateTransition(system_state_t target_state,
                                   transition_reason_t reason);
bool otaMgr_onStateChanged(system_state_t new_state, system_state_t old_state);

// Teardown coordination
bool otaMgr_initiateTeardown(teardown_reason_t reason);
bool otaMgr_onTeardownComplete(void);
```
### 7.3 Security Manager Integration
**Validation Interface:**
```c
// Firmware integrity validation
bool secMgr_calculateSHA256(const uint8_t* data, size_t size, uint8_t* hash);
bool secMgr_validateSignature(const uint8_t* firmware, size_t size,
                              const uint8_t* signature);
bool secMgr_validateAntiRollback(const char* version);

// Secure storage
bool secMgr_encryptFirmwareChunk(const uint8_t* plaintext, size_t size,
                                 uint8_t* ciphertext);
bool secMgr_decryptFirmwareChunk(const uint8_t* ciphertext, size_t size,
                                 uint8_t* plaintext);
```
## 8. Testing and Validation
### 8.1 Unit Testing
- **Readiness Validation:** All readiness criteria and rejection scenarios
- **Chunk Processing:** Chunk reception, validation, and storage
- **Integrity Validation:** SHA-256, signature, and compatibility checks
- **Rollback Logic:** All rollback triggers and recovery procedures
### 8.2 Integration Testing
- **End-to-End OTA:** Complete OTA flow from negotiation to confirmation
- **State Coordination:** Integration with State Manager and other components
- **Security Integration:** Cryptographic validation and secure storage
- **Network Integration:** Firmware download over encrypted channels
### 8.3 System Testing
- **Fault Injection:** Network failures, power loss, corruption scenarios
- **Performance Testing:** Large firmware downloads and timing constraints
- **Security Testing:** Malicious firmware rejection and rollback scenarios
- **Long-Duration Testing:** Multiple OTA cycles and partition wear
### 8.4 Acceptance Criteria
- OTA negotiation completes within timing constraints
- Firmware download handles all error conditions gracefully
- Integrity validation rejects all invalid firmware images
- Activation preserves all critical data during transition
- Rollback mechanism recovers from all failure scenarios
- No security vulnerabilities in OTA process
- Complete audit trail of all OTA activities
## 9. Dependencies
### 9.1 Internal Dependencies
- **State Manager:** System state coordination and teardown management
- **Persistence:** Data flush and firmware storage operations
- **Security Manager:** Cryptographic validation and secure storage
- **Main Hub APIs:** OTA communication protocol implementation
- **Diagnostics:** OTA event logging and failure reporting
### 9.2 External Dependencies
- **ESP-IDF Framework:** Partition management and flash operations
- **Secure Boot:** Hardware-enforced firmware authentication
- **Network Stack:** Secure communication transport (TLS/DTLS)
- **Hardware Components:** SD card, NVS flash, network interface
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Delta Updates:** Incremental firmware updates to reduce download size
- **Compression:** Firmware compression to optimize storage and bandwidth
- **Multi-Stage Rollback:** Graduated rollback with multiple recovery points
- **Predictive Validation:** Pre-validation of firmware compatibility
### 10.2 Scalability Considerations
- **Fleet Management:** Coordinated OTA updates across multiple sensor hubs
- **Cloud Integration:** Direct cloud-based firmware distribution
- **Advanced Analytics:** OTA success rate monitoring and optimization
- **Automated Testing:** Continuous integration with automated OTA testing
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-OTA, SWR-OTA)
**Next Review:** After component implementation

@@ -0,0 +1,586 @@
# Feature Specification: Power & Fault Handling
# Feature ID: F-PWR (F-PWR-001 to F-PWR-002)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Power & Fault Handling
## 1. Feature Overview
### 1.1 Feature Purpose
The Power & Fault Handling feature provides comprehensive power management and fault recovery capabilities for the ASF Sensor Hub. This feature ensures reliable operation under power fluctuations, graceful recovery from power interruptions, and protection of critical data during power loss events.
### 1.2 Feature Scope
**In Scope:**
- Hardware-based brownout detection and response
- Power-loss data protection with supercapacitor backup
- Graceful shutdown and recovery procedures
- Power quality monitoring and reporting
- Critical data preservation during power events
**Out of Scope:**
- Battery-powered operation modes (system assumes continuous power)
- Advanced power management for low-power modes
- External power supply design and regulation
- Hardware power supply fault diagnosis
## 2. Sub-Features
### 2.1 F-PWR-001: Brownout Detection and Handling
**Description:** Hardware-based brownout detection with immediate response to protect system integrity and preserve critical data during power supply fluctuations.
**Brownout Detection Configuration:**
```c
// Declared before brownout_config_t, which uses it
typedef enum {
    BROWNOUT_RESPONSE_IMMEDIATE_FLUSH = 0,    // Flush critical data immediately
    BROWNOUT_RESPONSE_GRACEFUL_SHUTDOWN = 1,  // Attempt graceful shutdown
    BROWNOUT_RESPONSE_EMERGENCY_SAVE = 2      // Emergency data save only
} brownout_response_t;

typedef struct {
    float brownout_threshold_v;        // 3.0V threshold (configurable)
    uint32_t detection_delay_ms;       // 10ms detection delay
    bool hardware_detection_enabled;   // ESP32-S3 BOD enabled
    brownout_response_t response;      // Immediate response action
    uint32_t supercap_runtime_ms;      // Available supercapacitor runtime
} brownout_config_t;

// Declared before brownout_status_t, which uses it
typedef enum {
    POWER_LOSS_MINOR = 0,     // Brief voltage dip, no action needed
    POWER_LOSS_MODERATE = 1,  // Voltage drop, flush critical data
    POWER_LOSS_SEVERE = 2,    // Extended brownout, emergency shutdown
    POWER_LOSS_CRITICAL = 3   // Imminent power loss, immediate save
} power_loss_severity_t;

typedef struct {
    bool brownout_detected;            // Current brownout status
    uint64_t brownout_start_time;      // Brownout detection timestamp
    uint32_t brownout_duration_ms;     // Duration of current brownout
    uint32_t brownout_count;           // Total brownout events
    float min_voltage_recorded;        // Minimum voltage during event
    power_loss_severity_t severity;    // Brownout severity classification
} brownout_status_t;
```
**Hardware Configuration:**
- **Brownout Detector:** ESP32-S3 hardware BOD with 3.0V threshold
- **Supercapacitor:** 0.5-1.0F capacitor providing 1-2 seconds runtime at 3.3V
- **Detection ISR:** High-priority interrupt service routine for immediate response
- **Voltage Monitoring:** Continuous ADC monitoring of supply voltage
**Brownout Response Flow:**
```mermaid
sequenceDiagram
participant PWR as Power Supply
participant BOD as Brownout Detector
participant ISR as Brownout ISR
participant PWR_MGR as Power Manager
participant PERSIST as Persistence
participant DIAG as Diagnostics
Note over PWR,DIAG: Brownout Detection and Response
PWR->>PWR: voltageDropBelow3.0V()
PWR->>BOD: triggerBrownoutDetection()
BOD->>ISR: brownoutInterrupt()
ISR->>ISR: setPowerLossFlag()
ISR->>PWR_MGR: notifyBrownoutDetected()
PWR_MGR->>PWR_MGR: assessSeverity(voltage, duration)
alt Severity >= MODERATE
PWR_MGR->>PERSIST: flushCriticalDataImmediate()
PERSIST->>PERSIST: flushMachineConstants()
PERSIST->>PERSIST: flushCalibrationData()
PERSIST->>PERSIST: flushDiagnosticEvents()
PERSIST-->>PWR_MGR: criticalDataFlushed()
end
alt Severity >= SEVERE
PWR_MGR->>PWR_MGR: initiateGracefulShutdown()
PWR_MGR->>DIAG: logPowerEvent(BROWNOUT_SHUTDOWN)
end
PWR->>PWR: voltageRestored()
BOD->>ISR: brownoutCleared()
ISR->>PWR_MGR: notifyBrownoutCleared()
PWR_MGR->>PWR_MGR: initiatePowerRecovery()
```
**Critical Data Flush Priority:**
1. **Machine Constants:** System configuration and calibration parameters
2. **Sensor Calibration:** Current sensor calibration data
3. **Diagnostic Events:** Recent fault and warning events
4. **System State:** Current system state and operational parameters
5. **Recent Sensor Data:** Latest sensor readings (if time permits)
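Because the supercapacitor provides only a short window, the flush has to respect both the priority order and a time budget. One possible way to plan this is sketched below; it is an assumption, not the specified behavior. `flush_item_t`, `power_plan_flush`, and the per-item time estimates are hypothetical, and the caller is expected to pass the items already sorted in the priority order given above. The sketch stops at the first item that does not fit rather than skipping ahead, so a lower-priority item is never flushed in place of a higher-priority one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one data category to flush. */
typedef struct {
    const char *name;       /* label for diagnostics */
    uint32_t estimated_ms;  /* assumed worst-case flush time */
    bool selected;          /* set by power_plan_flush() */
} flush_item_t;

/*
 * Marks, in priority order, the items that fit into budget_ms.
 * Returns the total estimated time of the selected items.
 */
uint32_t power_plan_flush(flush_item_t *items, size_t count, uint32_t budget_ms)
{
    uint32_t used_ms = 0u;
    bool budget_exhausted = false;

    for (size_t i = 0; i < count; i++) {
        uint32_t remaining_ms = budget_ms - used_ms;  /* used_ms <= budget_ms */
        if (!budget_exhausted && items[i].estimated_ms <= remaining_ms) {
            items[i].selected = true;
            used_ms += items[i].estimated_ms;
        } else {
            items[i].selected = false;
            budget_exhausted = true;  /* strict priority: stop at first miss */
        }
    }
    return used_ms;
}
```

The budget would typically come from `flush_time_budget_ms`, with some margin left below the estimated supercapacitor runtime so the final write completes before the cutoff voltage is reached.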
**Supercapacitor Runtime Management:**
```c
typedef struct {
    float capacitance_f;             // Supercapacitor capacitance (0.5-1.0F)
    float initial_voltage_v;         // Initial charge voltage (3.3V)
    float cutoff_voltage_v;          // Minimum operating voltage (2.7V)
    uint32_t estimated_runtime_ms;   // Calculated runtime at current load
    uint32_t flush_time_budget_ms;   // Time allocated for data flush
    bool supercap_present;           // Supercapacitor detection status
} supercapacitor_config_t;

// Runtime calculation: t = C * (V_initial - V_cutoff) / I_load
uint32_t calculateSupercapRuntime(float capacitance, float v_init,
                                  float v_cutoff, float current_ma);
```
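The runtime formula in the comment above can be implemented directly. The sketch below is one possible body for `calculateSupercapRuntime`, assuming a constant-current load and an ideal capacitor (no ESR, no leakage), with the result in milliseconds; the input guards and the rounding are choices made for the example, not requirements from the spec.

```c
#include <stdint.h>

/*
 * t[s] = C[F] * (V_init - V_cutoff)[V] / I[A].
 * With the current given in mA and the result in ms:
 * t[ms] = C * (V_init - V_cutoff) * 1e6 / I[mA].
 * Returns 0 for non-physical inputs and saturates at UINT32_MAX.
 */
uint32_t calculateSupercapRuntime(float capacitance, float v_init,
                                  float v_cutoff, float current_ma)
{
    if (capacitance <= 0.0f || current_ma <= 0.0f || v_init <= v_cutoff) {
        return 0u;
    }

    double runtime_ms = (double)capacitance *
                        ((double)v_init - (double)v_cutoff) *
                        1.0e6 / (double)current_ma;

    if (runtime_ms >= (double)UINT32_MAX) {
        return UINT32_MAX;
    }
    return (uint32_t)(runtime_ms + 0.5);  /* round to nearest millisecond */
}
```

As a plausibility check, a 0.5 F capacitor discharging from 3.3 V to 2.7 V at an assumed 200 mA load yields about 1.5 s, which sits inside the 1-2 second range stated under Hardware Configuration. If the discharge is counted from the 3.0 V brownout threshold instead of 3.3 V, the same inputs give roughly half that, so the flush time budget is probably better derived from the threshold voltage than from the nominal rail.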
### 2.2 F-PWR-002: Power-Loss Recovery
**Description:** Comprehensive power-loss recovery system that detects power restoration, performs system integrity checks, and restores normal operation with full data consistency validation.
**Recovery Configuration:**
```c
// Declared before power_recovery_config_t, which uses it
typedef enum {
    RECOVERY_MODE_FAST = 0,        // Quick recovery, minimal checks
    RECOVERY_MODE_SAFE = 1,        // Full integrity checks
    RECOVERY_MODE_DIAGNOSTIC = 2   // Extended diagnostics during recovery
} recovery_mode_t;

typedef struct {
    uint32_t power_stabilization_delay_ms;  // 100ms stabilization wait
    uint32_t recovery_timeout_ms;           // 30s maximum recovery time
    bool integrity_check_required;          // Data integrity validation
    bool state_restoration_enabled;         // System state restoration
    recovery_mode_t recovery_mode;          // Recovery behavior mode
} power_recovery_config_t;

// Declared before power_recovery_status_t, which uses it
typedef enum {
    RECOVERY_STATUS_PENDING = 0,      // Recovery not started
    RECOVERY_STATUS_IN_PROGRESS = 1,  // Recovery in progress
    RECOVERY_STATUS_SUCCESS = 2,      // Recovery completed successfully
    RECOVERY_STATUS_FAILED = 3,       // Recovery failed
    RECOVERY_STATUS_PARTIAL = 4       // Partial recovery (degraded mode)
} recovery_status_t;

// data_integrity_result_t is defined under Data Integrity Validation
typedef struct {
    bool power_restored;                // Power restoration status
    uint64_t power_loss_start;          // Power loss start timestamp
    uint64_t power_loss_duration;       // Total power loss duration
    uint32_t recovery_attempts;         // Number of recovery attempts
    recovery_status_t status;           // Current recovery status
    data_integrity_result_t integrity;  // Data integrity check results
} power_recovery_status_t;
```
**Data Integrity Validation:**
```c
typedef struct {
    bool machine_constants_valid;   // MC data integrity
    bool calibration_data_valid;    // Calibration integrity
    bool diagnostic_logs_valid;     // Diagnostic data integrity
    bool system_state_valid;        // State data integrity
    bool sensor_data_valid;         // Sensor data integrity
    uint32_t corrupted_files;       // Number of corrupted files
    uint32_t recovered_files;       // Number of recovered files
} data_integrity_result_t;

typedef enum {
    INTEGRITY_CHECK_PASS = 0,           // All data intact
    INTEGRITY_CHECK_MINOR_LOSS = 1,     // Minor data loss, recoverable
    INTEGRITY_CHECK_MAJOR_LOSS = 2,     // Major data loss, degraded operation
    INTEGRITY_CHECK_CRITICAL_LOSS = 3   // Critical data loss, requires intervention
} integrity_check_result_t;
```
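**Illustrative Sketch (non-normative):** The specification does not state how the per-category flags in `data_integrity_result_t` map to the overall `integrity_check_result_t` verdict used by the recovery flow. The sketch below shows one possible mapping. The function name `classify_integrity` and the severity policy (which flag leads to which verdict) are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool machine_constants_valid;
    bool calibration_data_valid;
    bool diagnostic_logs_valid;
    bool system_state_valid;
    bool sensor_data_valid;
    uint32_t corrupted_files;
    uint32_t recovered_files;
} data_integrity_result_t;

typedef enum {
    INTEGRITY_CHECK_PASS = 0,
    INTEGRITY_CHECK_MINOR_LOSS = 1,
    INTEGRITY_CHECK_MAJOR_LOSS = 2,
    INTEGRITY_CHECK_CRITICAL_LOSS = 3
} integrity_check_result_t;

// Hypothetical policy (assumption, not from the specification):
//  - invalid machine constants                      -> CRITICAL_LOSS
//  - invalid calibration or system state            -> MAJOR_LOSS
//  - invalid logs/sensor data or unrecovered files  -> MINOR_LOSS
integrity_check_result_t classify_integrity(const data_integrity_result_t* r)
{
    if (!r->machine_constants_valid) {
        return INTEGRITY_CHECK_CRITICAL_LOSS;
    }
    if (!r->calibration_data_valid || !r->system_state_valid) {
        return INTEGRITY_CHECK_MAJOR_LOSS;
    }
    if (!r->diagnostic_logs_valid || !r->sensor_data_valid ||
        r->corrupted_files > r->recovered_files) {
        return INTEGRITY_CHECK_MINOR_LOSS;
    }
    return INTEGRITY_CHECK_PASS;
}
```

Checking the most severe condition first keeps the verdict deterministic when several flags are cleared at once.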
**Power Recovery Flow:**
```mermaid
sequenceDiagram
participant PWR as Power Supply
participant PWR_MGR as Power Manager
participant PERSIST as Persistence
participant DIAG as Diagnostics
participant STM as State Manager
participant SENSOR as Sensor Manager
Note over PWR,SENSOR: Power Recovery Sequence
PWR->>PWR: powerRestored()
PWR->>PWR_MGR: notifyPowerRestoration()
PWR_MGR->>PWR_MGR: waitForStabilization(100ms)
PWR_MGR->>PWR_MGR: detectPowerLossDuration()
PWR_MGR->>PERSIST: performIntegrityCheck()
PERSIST->>PERSIST: validateMachineConstants()
PERSIST->>PERSIST: validateCalibrationData()
PERSIST->>PERSIST: validateDiagnosticLogs()
PERSIST-->>PWR_MGR: integrityResults()
alt Integrity Check PASS
PWR_MGR->>STM: restoreSystemState()
STM->>STM: transitionToRunningState()
PWR_MGR->>SENSOR: restoreSensorConfiguration()
PWR_MGR->>DIAG: logPowerEvent(RECOVERY_SUCCESS)
else Integrity Check MINOR_LOSS
PWR_MGR->>PERSIST: attemptDataRecovery()
PWR_MGR->>STM: transitionToWarningState()
PWR_MGR->>DIAG: logPowerEvent(RECOVERY_PARTIAL)
else Integrity Check MAJOR_LOSS
PWR_MGR->>STM: transitionToFaultState()
PWR_MGR->>DIAG: logPowerEvent(RECOVERY_FAILED)
end
PWR_MGR->>PWR_MGR: reportRecoveryStatus()
```
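**Illustrative Sketch (non-normative):** The branch structure of the recovery flow reduces to a mapping from the integrity verdict to a recovery status. The sketch below models only that decision. `recovery_outcome` is a hypothetical helper, and handling `INTEGRITY_CHECK_CRITICAL_LOSS` the same way as `INTEGRITY_CHECK_MAJOR_LOSS` is an assumption, since the diagram does not show that branch.

```c
typedef enum {
    INTEGRITY_CHECK_PASS = 0,
    INTEGRITY_CHECK_MINOR_LOSS = 1,
    INTEGRITY_CHECK_MAJOR_LOSS = 2,
    INTEGRITY_CHECK_CRITICAL_LOSS = 3
} integrity_check_result_t;

typedef enum {
    RECOVERY_STATUS_PENDING = 0,
    RECOVERY_STATUS_IN_PROGRESS = 1,
    RECOVERY_STATUS_SUCCESS = 2,
    RECOVERY_STATUS_FAILED = 3,
    RECOVERY_STATUS_PARTIAL = 4
} recovery_status_t;

// Maps the integrity verdict to the status logged at the end of recovery.
recovery_status_t recovery_outcome(integrity_check_result_t verdict)
{
    switch (verdict) {
    case INTEGRITY_CHECK_PASS:
        return RECOVERY_STATUS_SUCCESS;   // restore state, resume RUNNING
    case INTEGRITY_CHECK_MINOR_LOSS:
        return RECOVERY_STATUS_PARTIAL;   // attempt data recovery, enter WARNING
    case INTEGRITY_CHECK_MAJOR_LOSS:
    case INTEGRITY_CHECK_CRITICAL_LOSS:   // assumption: treated like MAJOR_LOSS
    default:
        return RECOVERY_STATUS_FAILED;    // enter FAULT
    }
}
```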
**Recovery Validation Steps:**
1. **Power Stabilization:** Wait 100ms for power supply stabilization
2. **System Clock Recovery:** Restore system time from RTC (if available)
3. **Data Integrity Check:** Validate all persistent data structures
4. **Configuration Restoration:** Reload machine constants and calibration
5. **State Restoration:** Restore system state and component configuration
6. **Sensor Reinitialization:** Reinitialize sensors and resume acquisition
7. **Communication Recovery:** Re-establish network connections
8. **Recovery Reporting:** Log recovery status and any data loss
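**Illustrative Sketch (non-normative):** The validation steps above run strictly in order, so one way to model them is a table of step functions executed until the first failure. All names below (`recovery_step_t`, `run_recovery_steps`, the placeholder steps) are hypothetical, and the 30-second recovery timeout is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*recovery_step_fn)(void);

typedef struct {
    const char* name;        // Step label for logging
    recovery_step_fn run;    // Returns true when the step succeeds
} recovery_step_t;

// Runs the steps in order and stops at the first failure.
// Returns -1 when every step passes, otherwise the index of the failing step.
int run_recovery_steps(const recovery_step_t* steps, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (!steps[i].run()) {
            return (int)i;
        }
    }
    return -1;
}

// Placeholder steps standing in for the real component calls.
bool step_ok(void)   { return true; }
bool step_fail(void) { return false; }
```

Returning the index of the failing step lets the caller log exactly where recovery stopped before choosing between degraded and fault handling.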
**RTC Battery Support (Optional):**
```c
typedef struct {
bool rtc_battery_present; // External RTC battery detected
float rtc_battery_voltage; // Current RTC battery voltage
uint64_t time_before_loss; // Time before power loss
uint64_t time_after_recovery; // Time after power recovery
bool time_accuracy_maintained; // Time accuracy status
} rtc_battery_status_t;
// RTC battery specifications: CR2032, 3V, 220mAh
// Keeps the RTC running through power losses lasting up to several months
```
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-PWR-001** | SR-PWR-001, SR-PWR-002, SR-PWR-003, SR-PWR-004 | Brownout detection, data flush, graceful shutdown, clean reboot |
| **F-PWR-002** | SR-PWR-005, SR-PWR-006, SR-PWR-007, SR-PWR-008 | Power recovery, integrity validation, state restoration, event reporting |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-PWR-001** | SWR-PWR-001, SWR-PWR-002, SWR-PWR-003 | BOD configuration, ISR handling, supercapacitor management |
| **F-PWR-002** | SWR-PWR-004, SWR-PWR-005, SWR-PWR-006 | Recovery procedures, integrity checks, state restoration |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **Power Manager** | Power event coordination, recovery management | `application_layer/power_manager/` |
| **Error Handler** | Power fault classification, escalation | `application_layer/error_handler/` |
| **Persistence** | Critical data flush, integrity validation | `application_layer/DP_stack/persistence/` |
| **State Manager** | System state coordination during power events | `application_layer/business_stack/STM/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **Diagnostics** | Power event logging, recovery reporting | `application_layer/diag_task/` |
| **Sensor Manager** | Sensor state preservation and restoration | `application_layer/business_stack/sensor_manager/` |
| **Machine Constant Manager** | Configuration preservation and restoration | `application_layer/business_stack/machine_constant_manager/` |
| **ADC Driver** | Voltage monitoring, supercapacitor status | `ESP_IDF_FW_wrappers/adc/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "Power & Fault Handling Feature"
PWR_MGR[Power Manager]
ERR[Error Handler]
PERSIST[Persistence]
STM[State Manager]
end
subgraph "System Components"
DIAG[Diagnostics]
SENSOR[Sensor Manager]
MC_MGR[MC Manager]
end
subgraph "Hardware Interfaces"
BOD[Brownout Detector]
ADC[ADC Driver]
SUPERCAP[Supercapacitor]
RTC[RTC Battery]
end
subgraph "Storage"
NVS[NVS Flash]
SD[SD Card]
end
BOD -->|Brownout ISR| PWR_MGR
ADC -->|Voltage Monitoring| PWR_MGR
SUPERCAP -->|Runtime Power| PWR_MGR
RTC -->|Time Backup| PWR_MGR
PWR_MGR <--> STM
PWR_MGR <--> ERR
PWR_MGR <--> PERSIST
PWR_MGR --> DIAG
PWR_MGR --> SENSOR
PWR_MGR --> MC_MGR
PERSIST --> NVS
PERSIST --> SD
ERR -.->|Power Faults| DIAG
STM -.->|State Events| DIAG
```
### 4.4 Power Event Sequence
```mermaid
sequenceDiagram
participant HW as Hardware
participant BOD as Brownout Detector
participant PWR as Power Manager
participant PERSIST as Persistence
participant STM as State Manager
participant DIAG as Diagnostics
Note over HW,DIAG: Complete Power Event Cycle
HW->>BOD: voltageDropDetected(2.9V)
BOD->>PWR: brownoutISR()
PWR->>PWR: assessPowerLossSeverity()
alt Critical Power Loss
PWR->>PERSIST: emergencyDataFlush()
PERSIST->>NVS: flushCriticalData()
PWR->>STM: notifyPowerLoss(CRITICAL)
PWR->>DIAG: logPowerEvent(BROWNOUT_CRITICAL)
end
Note over HW,DIAG: Power Loss Period
HW->>HW: powerLost()
Note over HW,DIAG: Power Restoration
HW->>PWR: powerRestored()
PWR->>PWR: waitForStabilization()
PWR->>PERSIST: performIntegrityCheck()
alt Data Integrity OK
PWR->>STM: restoreSystemState()
STM->>STM: transitionToRunning()
PWR->>DIAG: logPowerEvent(RECOVERY_SUCCESS)
else Data Corruption Detected
PWR->>STM: transitionToFault()
PWR->>DIAG: logPowerEvent(RECOVERY_FAILED)
end
```
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **Continuous Monitoring:**
- Monitor supply voltage using ADC and hardware brownout detector
- Track supercapacitor charge level and estimated runtime
- Maintain power quality statistics and trend analysis
- Report power events to diagnostics system
2. **Brownout Response:**
- Detect voltage drop below 3.0V threshold within 10ms
- Assess brownout severity based on voltage level and duration
- Execute immediate data flush for critical system data
- Coordinate graceful shutdown if extended brownout detected
3. **Power Recovery:**
- Detect power restoration and wait for stabilization
- Perform comprehensive data integrity validation
- Restore system state and component configuration
- Resume normal operation or enter degraded mode if data loss detected
4. **Event Reporting:**
- Log all power events with timestamps and severity
- Report power quality metrics to Main Hub
- Maintain power event history for trend analysis
- Generate diagnostic alerts for recurring power issues
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **Supercapacitor Failure** | Voltage monitoring, runtime calculation | Log warning, reduce flush scope |
| **Data Flush Timeout** | Flush operation timeout | Abort flush, log partial completion |
| **Recovery Failure** | Integrity check failure | Enter fault state, request intervention |
| **RTC Battery Low** | Battery voltage monitoring | Log warning, continue without RTC |
| **Repeated Brownouts** | Event frequency analysis | Escalate to system fault, notify Main Hub |
| **Critical Data Loss** | Integrity validation failure | Enter fault state, preserve remaining data |
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Initialize power monitoring, configure brownout detection |
| **RUNNING** | Full power monitoring, immediate brownout response |
| **WARNING** | Enhanced power monitoring, preemptive data flush |
| **FAULT** | Critical power functions only, preserve fault data |
| **OTA_UPDATE** | Reject OTA if power unstable, maintain power monitoring |
| **TEARDOWN** | Coordinate with teardown, ensure data preservation |
| **SERVICE** | Limited power monitoring for diagnostics |
| **SD_DEGRADED** | NVS-only data flush, reduced recovery capabilities |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Brownout Detection:** Maximum 10ms detection delay
- **Data Flush:** Must complete within supercapacitor runtime (1-2 seconds)
- **Power Stabilization:** 100ms wait after power restoration
- **Recovery Timeout:** Maximum 30 seconds for complete recovery
### 6.2 Resource Constraints
- **Supercapacitor Runtime:** 1-2 seconds at 3.3V with 0.5-1.0F capacitance
- **Critical Data Size:** Bounded by the amount of data that can be flushed within the supercapacitor runtime
- **Memory Usage:** Maximum 16KB for power management buffers
- **Flash Wear:** Minimize NVS writes during frequent brownouts
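
**Illustrative Sketch (non-normative):** The runtime figure can be sanity-checked with a constant-current discharge estimate, t = C × (V_start − V_min) / I. The sketch below assumes a constant load current and ignores ESR and regulator dropout. The function name and all input values are illustrative assumptions, not values taken from this specification.

```c
#include <stdint.h>

// Estimates how long a supercapacitor can hold the rail above v_min.
// Returns the runtime in milliseconds, or 0 for invalid inputs.
uint32_t supercap_runtime_ms(double capacitance_f, double v_start,
                             double v_min, double load_current_a)
{
    if (capacitance_f <= 0.0 || load_current_a <= 0.0 || v_start <= v_min) {
        return 0;
    }
    double seconds = capacitance_f * (v_start - v_min) / load_current_a;
    return (uint32_t)(seconds * 1000.0 + 0.5);  // round to nearest ms
}
```

With 0.5 F, an assumed usable window of 0.3 V and an assumed 100 mA load, the estimate is about 1.5 s, which falls inside the stated 1-2 second range.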
### 6.3 Hardware Constraints
- **Brownout Threshold:** 3.0V ±0.1V (ESP32-S3 BOD limitation)
- **Voltage Monitoring:** ADC accuracy ±50mV
- **Supercapacitor Leakage:** Account for self-discharge over time
- **RTC Battery Life:** CR2032 provides several months of timekeeping
## 7. Interface Specifications
### 7.1 Power Manager Public API
```c
#include <stdbool.h>   // bool
#include <stddef.h>    // size_t
#include <stdint.h>    // uint32_t

// Power management initialization
bool powerMgr_initialize(void);
bool powerMgr_configureBrownoutDetection(const brownout_config_t* config);
bool powerMgr_configureRecovery(const power_recovery_config_t* config);
// Power monitoring
bool powerMgr_getCurrentVoltage(float* voltage);
bool powerMgr_getSupercapStatus(supercapacitor_status_t* status);
bool powerMgr_getPowerQuality(power_quality_metrics_t* metrics);
// Power event handling
bool powerMgr_isBrownoutActive(void);
bool powerMgr_isRecoveryInProgress(void);
bool powerMgr_getPowerEventHistory(power_event_t* events, size_t* count);
// Emergency operations
bool powerMgr_triggerEmergencyFlush(void);
bool powerMgr_estimateFlushTime(uint32_t* estimated_ms);
```
### 7.2 Brownout ISR Interface
```c
// Brownout interrupt service routine
void IRAM_ATTR brownout_isr_handler(void* arg);
// ISR-safe operations
void IRAM_ATTR powerMgr_setBrownoutFlag(void);
void IRAM_ATTR powerMgr_recordBrownoutTime(void);
void IRAM_ATTR powerMgr_triggerEmergencyResponse(void);
```
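**Illustrative Sketch (non-normative):** An ISR of this kind normally does as little as possible: it records that the event happened and leaves the flush to task context. The sketch below shows that pattern with C11 atomics so that it compiles off-target. `IRAM_ATTR` is defined as an empty macro when the ESP-IDF definition is absent, and `powerMgr_consumeBrownoutFlag` is a hypothetical helper that is not part of the interface above.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef IRAM_ATTR
#define IRAM_ATTR  /* on target this is provided by ESP-IDF (esp_attr.h) */
#endif

// Static storage: zero-initialized, i.e. starts as false.
static atomic_bool s_brownout_flag;

// ISR-safe: a single atomic store, no blocking calls.
void IRAM_ATTR powerMgr_setBrownoutFlag(void)
{
    atomic_store(&s_brownout_flag, true);
}

void IRAM_ATTR brownout_isr_handler(void* arg)
{
    (void)arg;
    powerMgr_setBrownoutFlag();
}

// Task context (hypothetical helper): returns true once per event
// and clears the flag in the same atomic operation.
bool powerMgr_consumeBrownoutFlag(void)
{
    return atomic_exchange(&s_brownout_flag, false);
}
```

Using an atomic exchange on the task side avoids losing an event that arrives between a separate read and clear.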
### 7.3 Recovery Validation Interface
```c
// Data integrity validation
bool powerMgr_validateDataIntegrity(data_integrity_result_t* result);
bool powerMgr_attemptDataRecovery(const char* data_type);
bool powerMgr_restoreSystemConfiguration(void);
// Recovery status
recovery_status_t powerMgr_getRecoveryStatus(void);
bool powerMgr_isRecoveryComplete(void);
bool powerMgr_getRecoveryReport(recovery_report_t* report);
```
## 8. Testing and Validation
### 8.1 Unit Testing
- **Brownout Detection:** Simulated voltage drops and ISR response
- **Data Flush:** Critical data preservation under time constraints
- **Recovery Logic:** Data integrity validation and state restoration
- **Supercapacitor Management:** Runtime calculation and monitoring
### 8.2 Integration Testing
- **End-to-End Power Cycle:** Complete brownout and recovery sequence
- **State Coordination:** Integration with State Manager during power events
- **Data Persistence:** Integration with Persistence component for data flush
- **Diagnostic Integration:** Power event logging and reporting
### 8.3 System Testing
- **Hardware Power Testing:** Real power supply interruptions and brownouts
- **Stress Testing:** Repeated power cycles and brownout events
- **Data Integrity Testing:** Validation of data preservation under various scenarios
- **Performance Testing:** Timing constraints under different system loads
### 8.4 Acceptance Criteria
- Brownout detection responds within 10ms of voltage drop
- Critical data successfully preserved during power loss events
- System recovers gracefully from power interruptions
- Data integrity maintained across power cycles
- No data corruption during normal power events
- Power quality monitoring provides accurate metrics
- Complete audit trail of all power events
## 9. Dependencies
### 9.1 Internal Dependencies
- **State Manager:** System state coordination during power events
- **Persistence:** Critical data flush and integrity validation
- **Error Handler:** Power fault classification and escalation
- **Diagnostics:** Power event logging and reporting
### 9.2 External Dependencies
- **ESP-IDF Framework:** Brownout detector, ADC, NVS, RTC
- **Hardware Components:** Supercapacitor, RTC battery, voltage regulators
- **FreeRTOS:** ISR handling, task coordination
- **Power Supply:** Stable 3.3V supply with brownout protection
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Predictive Power Management:** Machine learning for power failure prediction
- **Advanced Supercapacitor Management:** Dynamic runtime optimization
- **Power Quality Analytics:** Advanced power supply analysis and reporting
- **Battery Backup Support:** Optional battery backup for extended operation
### 10.2 Scalability Considerations
- **Fleet Power Monitoring:** Centralized power quality monitoring across hubs
- **Predictive Maintenance:** Power supply health monitoring and replacement alerts
- **Advanced Recovery:** Multi-level recovery strategies based on data loss severity
- **Energy Harvesting:** Integration with renewable energy sources
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-PWR, SWR-PWR)
**Next Review:** After component implementation

@@ -0,0 +1,693 @@
# Feature Specification: Security & Safety
# Feature ID: F-SEC (F-SEC-001 to F-SEC-004)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** Security & Safety
## 1. Feature Overview
### 1.1 Feature Purpose
The Security & Safety feature provides comprehensive security enforcement and safety mechanisms for the ASF Sensor Hub. This feature ensures that only trusted firmware executes, sensitive data is protected at rest and in transit, and all communications maintain confidentiality and integrity through cryptographic mechanisms.
### 1.2 Feature Scope
**In Scope:**
- Hardware-enforced secure boot with cryptographic verification
- Flash encryption for sensitive data protection at rest
- Mutual TLS (mTLS) for secure communication channels
- Security violation detection and response mechanisms
- Device identity management and authentication
**Out of Scope:**
- Cloud server security policies and infrastructure
- User identity management and access control systems
- Physical tamper detection hardware (future enhancement)
- Cryptographic key generation and signing infrastructure
## 2. Sub-Features
### 2.1 F-SEC-001: Secure Boot
**Description:** Hardware-enforced secure boot implementation using Secure Boot V2 to ensure only authenticated and authorized firmware images execute on the Sensor Hub.
**Secure Boot Configuration:**
```c
#include <stdbool.h>
#include <stdint.h>

// Enums are declared before the struct that embeds them so the header compiles top to bottom.
typedef enum {
    SECURE_BOOT_V2 = 2                  // Only supported version
} secure_boot_version_t;

typedef enum {
    SIG_ALG_RSA_3072 = 0,               // RSA-3072 signature
    SIG_ALG_ECDSA_P256 = 1              // ECDSA-P256 signature
} signature_algorithm_t;

typedef enum {
    BOOT_MODE_DEVELOPMENT = 0,          // Development mode (key revocable)
    BOOT_MODE_PRODUCTION = 1            // Production mode (key permanent)
} boot_verification_mode_t;

typedef struct {
    secure_boot_version_t version;      // Secure Boot V2
    signature_algorithm_t algorithm;    // RSA-3072 or ECDSA-P256
    uint8_t root_key_hash[32];          // Root-of-trust key hash (eFuse)
    bool anti_rollback_enabled;         // eFuse-based anti-rollback
    uint32_t security_version;          // Current security version
    boot_verification_mode_t mode;      // Hardware-enforced verification
} secure_boot_config_t;
```
**Boot Verification Flow:**
```mermaid
sequenceDiagram
participant PWR as Power On
participant ROM as ROM Bootloader
participant SB as Secure Boot V2
participant EFUSE as eFuse Storage
participant APP as Application
participant DIAG as Diagnostics
PWR->>ROM: System Reset/Power On
ROM->>SB: Load Firmware Image
SB->>EFUSE: readRootKeyHash()
EFUSE-->>SB: root_key_hash
SB->>SB: verifyFirmwareSignature(root_key_hash)
alt Signature Valid
SB->>SB: checkAntiRollback()
alt Version Valid
SB->>APP: jumpToApplication()
APP->>DIAG: logBootEvent(SECURE_BOOT_SUCCESS)
else Version Invalid
SB->>SB: enterBootFailureState()
SB->>DIAG: logBootEvent(ANTI_ROLLBACK_VIOLATION)
end
else Signature Invalid
SB->>SB: enterBootFailureState()
SB->>DIAG: logBootEvent(SECURE_BOOT_FAILURE)
end
```
**Root-of-Trust Management:**
- **Key Storage:** Root public key hash stored in eFuse (one-time programmable)
- **Key Revocation:** Not supported in production mode (permanent key)
- **Anti-Rollback:** eFuse-based security version enforcement
- **Verification:** Every boot cycle (cold and warm boots)
**Boot Failure Handling:**
- **BOOT_FAILURE State:** System enters safe state, no application execution
- **Diagnostic Logging:** Boot failure events logged to NVS (if accessible)
- **Recovery:** Manual intervention required (re-flashing with valid firmware)
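
**Illustrative Sketch (non-normative):** The boot verification flow checks the signature first and the security version second. On real hardware these checks run in the ROM and bootloader stages, not in application code; the sketch below is only a host-side model of the decision order. `boot_decision_t` and `decide_boot` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    BOOT_DECISION_RUN_APP = 0,          // Signature and version accepted
    BOOT_DECISION_FAIL_SIGNATURE = 1,   // SECURE_BOOT_FAILURE path
    BOOT_DECISION_FAIL_ROLLBACK = 2     // ANTI_ROLLBACK_VIOLATION path
} boot_decision_t;

// Models the order of checks: signature first, then anti-rollback.
// An image is rejected when its security version is lower than the
// minimum version recorded in eFuse.
boot_decision_t decide_boot(bool signature_valid,
                            uint32_t image_security_version,
                            uint32_t efuse_security_version)
{
    if (!signature_valid) {
        return BOOT_DECISION_FAIL_SIGNATURE;
    }
    if (image_security_version < efuse_security_version) {
        return BOOT_DECISION_FAIL_ROLLBACK;
    }
    return BOOT_DECISION_RUN_APP;
}
```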
### 2.2 F-SEC-002: Secure Flash Storage
**Description:** Comprehensive flash encryption implementation using AES-256 to protect sensitive data stored in internal flash and external storage devices.
**Flash Encryption Configuration:**
```c
#include <stdbool.h>
#include <stdint.h>

// Dependent types are declared before flash_encryption_config_t so the header compiles top to bottom.
typedef enum {
    ENCRYPT_AES_256 = 0                 // AES-256 encryption
} encryption_algorithm_t;

typedef enum {
    ENCRYPT_MODE_DEVELOPMENT = 0,       // Development mode (key readable)
    ENCRYPT_MODE_RELEASE = 1            // Release mode (key protected)
} encryption_mode_t;

typedef struct {
    bool firmware_encrypted;            // Application partitions
    bool nvs_encrypted;                 // NVS partition
    bool machine_constants_encrypted;   // MC data
    bool calibration_encrypted;         // Calibration data
    bool diagnostics_encrypted;         // Diagnostic logs
} flash_encryption_scope_t;

typedef struct {
    encryption_algorithm_t algorithm;   // AES-256
    encryption_mode_t mode;             // Release mode (recommended)
    uint8_t encryption_key[32];         // Hardware-derived key (eFuse)
    bool transparent_decryption;        // Automatic decryption on read
    flash_encryption_scope_t scope;     // Encrypted regions
} flash_encryption_config_t;
```
**Encrypted Data Categories:**
| Data Type | Storage Location | Encryption Method | Access Control |
|-----------|------------------|-------------------|----------------|
| **Firmware Images** | Flash partitions | Hardware AES-256 | Transparent |
| **Machine Constants** | NVS partition | Hardware AES-256 | Component-mediated |
| **Calibration Data** | NVS partition | Hardware AES-256 | Component-mediated |
| **Cryptographic Keys** | eFuse/Secure NVS | Hardware AES-256 | Restricted access |
| **Diagnostic Logs** | NVS partition | Hardware AES-256 | Component-mediated |
| **SD Card Data** | External storage | Software AES-256 | Optional encryption |
**External Storage Encryption:**
```c
// The policy enum is declared before the struct that embeds it.
// encryption_algorithm_t comes from the flash encryption configuration.
typedef enum {
    FILE_ENCRYPT_NONE = 0,              // No encryption
    FILE_ENCRYPT_SENSITIVE = 1,         // Encrypt sensitive files only
    FILE_ENCRYPT_ALL = 2                // Encrypt all files
} file_encryption_policy_t;

typedef struct {
    bool sd_encryption_enabled;         // SD card encryption flag
    uint8_t sd_encryption_key[32];      // SD-specific encryption key
    encryption_algorithm_t algorithm;   // AES-256 for SD card
    file_encryption_policy_t policy;    // Per-file encryption policy
} external_storage_encryption_t;
```
**Encryption Flow:**
```mermaid
sequenceDiagram
participant APP as Application
participant PERSIST as Persistence
participant ENCRYPT as Encryption Engine
participant NVS as NVS Storage
participant SD as SD Card
Note over APP,SD: Secure Data Storage Flow
APP->>PERSIST: storeSensitiveData(data, type)
PERSIST->>PERSIST: classifyDataSensitivity(type)
alt Critical Data (NVS)
PERSIST->>ENCRYPT: encryptData(data, NVS_KEY)
ENCRYPT-->>PERSIST: encrypted_data
PERSIST->>NVS: writeEncrypted(encrypted_data)
NVS-->>PERSIST: writeComplete()
else Regular Data (SD Card)
alt SD Encryption Enabled
PERSIST->>ENCRYPT: encryptData(data, SD_KEY)
ENCRYPT-->>PERSIST: encrypted_data
PERSIST->>SD: writeEncrypted(encrypted_data)
else SD Encryption Disabled
PERSIST->>SD: writePlaintext(data)
end
end
PERSIST-->>APP: storageComplete()
```
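**Illustrative Sketch (non-normative):** The encryption flow amounts to a routing decision: critical data goes to encrypted NVS, and everything else goes to the SD card with or without encryption. The sketch below models that decision. `storage_path_t` and `choose_storage_path` are hypothetical, and the way `file_encryption_policy_t` combines with the SD encryption flag is an assumption, since the diagram shows only the enabled/disabled branch.

```c
#include <stdbool.h>

typedef enum {
    FILE_ENCRYPT_NONE = 0,
    FILE_ENCRYPT_SENSITIVE = 1,
    FILE_ENCRYPT_ALL = 2
} file_encryption_policy_t;

typedef enum {
    STORAGE_PATH_NVS_ENCRYPTED = 0,
    STORAGE_PATH_SD_ENCRYPTED = 1,
    STORAGE_PATH_SD_PLAINTEXT = 2
} storage_path_t;

// Critical data always takes the encrypted NVS path. For SD card data,
// encryption applies only when it is enabled and the policy covers the file.
storage_path_t choose_storage_path(bool is_critical, bool is_sensitive,
                                   bool sd_encryption_enabled,
                                   file_encryption_policy_t policy)
{
    if (is_critical) {
        return STORAGE_PATH_NVS_ENCRYPTED;
    }
    if (!sd_encryption_enabled || policy == FILE_ENCRYPT_NONE) {
        return STORAGE_PATH_SD_PLAINTEXT;
    }
    if (policy == FILE_ENCRYPT_ALL || is_sensitive) {
        return STORAGE_PATH_SD_ENCRYPTED;
    }
    return STORAGE_PATH_SD_PLAINTEXT;
}
```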
### 2.3 F-SEC-003: Encrypted Communication
**Description:** Mutual TLS (mTLS) implementation for secure communication with Main Hub and peer devices, ensuring confidentiality, integrity, and authenticity of all transmitted data.
**Device Identity and Authentication:**
```c
#include <stdbool.h>
#include <stdint.h>

// tls_version_t is declared before tls_config_t, which embeds it.
typedef enum {
    TLS_VERSION_1_2 = 0x0303,           // TLS 1.2 (minimum required)
    TLS_VERSION_1_3 = 0x0304            // TLS 1.3 (preferred)
} tls_version_t;

typedef struct {
    uint8_t device_certificate[2048];   // X.509 device certificate (max 2KB)
    uint8_t private_key[256];           // Device private key (ECDSA-P256 fits; a DER-encoded RSA-2048 key needs about 1.2KB)
    uint8_t ca_certificate[2048];       // Certificate Authority certificate
    char device_id[64];                 // Unique device identifier
    uint64_t certificate_expiry;        // Certificate expiration timestamp
    bool certificate_valid;             // Certificate validation status
} device_identity_t;

typedef struct {
    tls_version_t version;              // TLS 1.2 minimum
    cipher_suite_t cipher_suite;        // Supported cipher suites
    bool mutual_auth_required;          // mTLS enforcement
    uint32_t session_timeout;           // TLS session timeout
    bool session_resumption;            // Session resumption support
} tls_config_t;
```
**Supported Cipher Suites:**
| Cipher Suite | Key Exchange | Encryption (AEAD) | PRF Hash | Recommended |
|-------------|-------------|------------|-----|-------------|
| **TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384** | ECDHE-RSA | AES-256-GCM | SHA384 | Yes |
| **TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384** | ECDHE-ECDSA | AES-256-GCM | SHA384 | Yes |
| **TLS_RSA_WITH_AES_256_GCM_SHA384** | RSA | AES-256-GCM | SHA384 | Fallback (no forward secrecy) |
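
**Illustrative Sketch (non-normative):** The table can be expressed as a preference-ordered list of IANA cipher suite identifiers. In practice the TLS library performs this negotiation; the sketch below only models preference-ordered selection. The helper `select_cipher_suite` is hypothetical, and the relative order of the two recommended suites is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

// IANA-assigned TLS cipher suite identifiers.
#define TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 0xC02Cu
#define TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384   0xC030u
#define TLS_RSA_WITH_AES_256_GCM_SHA384         0x009Du

// Local preference order: ECDHE suites first, plain RSA as fallback.
static const uint16_t k_preferred_suites[] = {
    TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
    TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
    TLS_RSA_WITH_AES_256_GCM_SHA384
};

// Returns the first locally preferred suite that the peer also offers,
// or 0 when there is no overlap.
uint16_t select_cipher_suite(const uint16_t* peer_suites, size_t peer_count)
{
    const size_t n = sizeof(k_preferred_suites) / sizeof(k_preferred_suites[0]);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < peer_count; ++j) {
            if (peer_suites[j] == k_preferred_suites[i]) {
                return k_preferred_suites[i];
            }
        }
    }
    return 0;
}
```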
**mTLS Handshake Flow:**
```mermaid
sequenceDiagram
participant SH as Sensor Hub
participant MH as Main Hub
participant CA as Certificate Authority
Note over SH,CA: Mutual TLS Handshake
SH->>MH: ClientHello + SupportedCipherSuites
MH->>SH: ServerHello + SelectedCipherSuite
MH->>SH: ServerCertificate
MH->>SH: ServerKeyExchange (ECDHE suites only)
MH->>SH: CertificateRequest
MH->>SH: ServerHelloDone
SH->>SH: validateServerCertificate()
SH->>CA: verifyCertificateChain(server_cert)
CA-->>SH: validationResult(VALID)
SH->>MH: ClientCertificate
SH->>MH: ClientKeyExchange
SH->>MH: CertificateVerify
SH->>MH: ChangeCipherSpec
SH->>MH: Finished
MH->>MH: validateClientCertificate()
MH->>CA: verifyCertificateChain(client_cert)
CA-->>MH: validationResult(VALID)
MH->>SH: ChangeCipherSpec
MH->>SH: Finished
Note over SH,MH: Secure Channel Established
SH->>MH: EncryptedApplicationData
MH->>SH: EncryptedApplicationData
```
**Certificate Management:**
- **Device Certificate:** Unique X.509 certificate per device (max 2KB)
- **Private Key:** RSA-2048 or ECDSA-P256 stored securely in eFuse/NVS
- **Certificate Chain:** Root CA and intermediate certificates
- **Certificate Rotation:** Managed on broker/server side
- **Revocation:** Certificate Revocation Lists (CRL) or broker-side denylists
**Key Lifecycle Management:**
| Phase | Mechanism | Responsibility |
|-------|-----------|----------------|
| **Manufacturing** | Device certificate and private key injection | Manufacturing process |
| **Provisioning** | Certificate validation and registration | Onboarding system |
| **Operation** | TLS session key generation and management | Runtime TLS stack |
| **Rotation** | Certificate renewal and update | Server-side management |
| **Revocation** | Certificate invalidation and replacement | Certificate Authority |
### 2.4 F-SEC-004: Security Violation Handling
**Description:** Comprehensive security violation detection, classification, and response system to handle security threats and maintain system integrity.
**Security Violation Types:**
```c
typedef enum {
SEC_VIOLATION_BOOT_FAILURE = 0x1001, // Secure boot verification failure
SEC_VIOLATION_AUTH_FAILURE = 0x1002, // Authentication failure
SEC_VIOLATION_CERT_INVALID = 0x1003, // Certificate validation failure
SEC_VIOLATION_MESSAGE_TAMPER = 0x1004, // Message integrity violation
SEC_VIOLATION_UNAUTHORIZED_ACCESS = 0x1005, // Unauthorized access attempt
SEC_VIOLATION_ROLLBACK_ATTEMPT = 0x1006, // Anti-rollback violation
SEC_VIOLATION_KEY_COMPROMISE = 0x1007, // Cryptographic key compromise
SEC_VIOLATION_REPLAY_ATTACK = 0x1008 // Message replay attack
} security_violation_type_t;
typedef struct {
security_violation_type_t type; // Violation type
diagnostic_severity_t severity; // FATAL, ERROR, WARNING
uint64_t timestamp; // Violation timestamp
char source_component[32]; // Component that detected violation
char description[128]; // Human-readable description
uint8_t context_data[64]; // Violation-specific context
uint32_t occurrence_count; // Number of occurrences
bool escalation_triggered; // Escalation flag
} security_violation_event_t;
```
**Violation Response Matrix:**
| Violation Type | Severity | Immediate Response | Escalation Action |
|---------------|----------|-------------------|-------------------|
| **Boot Failure** | FATAL | Enter BOOT_FAILURE state | System halt, manual recovery |
| **Auth Failure** | ERROR | Reject connection, log event | Escalate to FATAL after 3 consecutive failures within 5 minutes |
| **Cert Invalid** | ERROR | Reject connection, log event | Escalate to FATAL if persistent |
| **Message Tamper** | WARNING | Discard message, log event | Escalate to ERROR after 5 tampered messages within 1 minute |
| **Unauthorized Access** | FATAL | Deny access, log event | System lockdown |
| **Rollback Attempt** | FATAL | Prevent rollback, log event | System halt |
| **Key Compromise** | FATAL | Revoke keys, log event | System lockdown |
| **Replay Attack** | WARNING | Discard message, log event | Escalate to ERROR if persistent |
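
**Illustrative Sketch (non-normative):** The severity column of the matrix can be captured as a lookup function. `diagnostic_severity_t` is defined outside this section, so the enum below is an assumed stand-in, and `violation_base_severity` is a hypothetical helper. Treating unknown violation codes as FATAL is also an assumption (fail-safe default).

```c
typedef enum {
    SEC_VIOLATION_BOOT_FAILURE = 0x1001,
    SEC_VIOLATION_AUTH_FAILURE = 0x1002,
    SEC_VIOLATION_CERT_INVALID = 0x1003,
    SEC_VIOLATION_MESSAGE_TAMPER = 0x1004,
    SEC_VIOLATION_UNAUTHORIZED_ACCESS = 0x1005,
    SEC_VIOLATION_ROLLBACK_ATTEMPT = 0x1006,
    SEC_VIOLATION_KEY_COMPROMISE = 0x1007,
    SEC_VIOLATION_REPLAY_ATTACK = 0x1008
} security_violation_type_t;

// Assumed stand-in for the project's diagnostic_severity_t.
typedef enum {
    DIAG_SEVERITY_WARNING = 0,
    DIAG_SEVERITY_ERROR = 1,
    DIAG_SEVERITY_FATAL = 2
} diagnostic_severity_t;

// Returns the initial (pre-escalation) severity from the response matrix.
diagnostic_severity_t violation_base_severity(security_violation_type_t type)
{
    switch (type) {
    case SEC_VIOLATION_AUTH_FAILURE:
    case SEC_VIOLATION_CERT_INVALID:
        return DIAG_SEVERITY_ERROR;
    case SEC_VIOLATION_MESSAGE_TAMPER:
    case SEC_VIOLATION_REPLAY_ATTACK:
        return DIAG_SEVERITY_WARNING;
    case SEC_VIOLATION_BOOT_FAILURE:
    case SEC_VIOLATION_UNAUTHORIZED_ACCESS:
    case SEC_VIOLATION_ROLLBACK_ATTEMPT:
    case SEC_VIOLATION_KEY_COMPROMISE:
    default:                             // assumption: unknown codes fail safe
        return DIAG_SEVERITY_FATAL;
    }
}
```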
**Security Event Flow:**
```mermaid
sequenceDiagram
participant COMP as Security Component
participant SEC as Security Manager
participant DIAG as Diagnostics
participant STM as State Manager
participant LOG as Security Logger
Note over COMP,LOG: Security Violation Detection and Response
COMP->>SEC: reportSecurityViolation(type, context)
SEC->>SEC: classifyViolation(type)
SEC->>SEC: determineResponse(type, severity)
alt Severity == FATAL
SEC->>STM: triggerStateTransition(FAULT)
SEC->>LOG: logSecurityEvent(FATAL, details)
SEC->>DIAG: reportDiagnosticEvent(FATAL, violation)
else Severity == ERROR
SEC->>SEC: checkEscalationCriteria()
alt Escalation Required
SEC->>STM: triggerStateTransition(WARNING)
end
SEC->>LOG: logSecurityEvent(ERROR, details)
SEC->>DIAG: reportDiagnosticEvent(ERROR, violation)
else Severity == WARNING
SEC->>LOG: logSecurityEvent(WARNING, details)
SEC->>DIAG: reportDiagnosticEvent(WARNING, violation)
end
SEC->>SEC: updateViolationStatistics()
SEC->>SEC: checkPatterns()
```
**Escalation Criteria:**
- **Authentication Failures:** 3 consecutive failures within 5 minutes
- **Message Tampering:** 5 tampered messages within 1 minute
- **Certificate Violations:** Persistent certificate validation failures
- **Pattern Detection:** Coordinated attack patterns across multiple violation types
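
**Illustrative Sketch (non-normative):** The authentication criterion (3 consecutive failures within 5 minutes) can be tracked with a small buffer of failure timestamps. The sketch below assumes timestamps in seconds that never decrease and assumes the tracker is reset after every successful authentication. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define AUTH_FAIL_LIMIT     3u      // failures needed to escalate
#define AUTH_FAIL_WINDOW_S  300u    // 5-minute window

typedef struct {
    uint64_t timestamps[AUTH_FAIL_LIMIT];  // most recent failures, oldest first
    uint32_t count;                        // number of valid entries
} auth_failure_tracker_t;

// Call at start-up and after every successful authentication.
void auth_tracker_reset(auth_failure_tracker_t* t)
{
    t->count = 0;
}

// Records a failure at now_s. Returns true when the last AUTH_FAIL_LIMIT
// consecutive failures all fall within AUTH_FAIL_WINDOW_S seconds.
bool auth_tracker_record_failure(auth_failure_tracker_t* t, uint64_t now_s)
{
    if (t->count == AUTH_FAIL_LIMIT) {
        // Buffer full: drop the oldest entry to make room.
        for (uint32_t i = 1; i < AUTH_FAIL_LIMIT; ++i) {
            t->timestamps[i - 1] = t->timestamps[i];
        }
        t->count--;
    }
    t->timestamps[t->count++] = now_s;
    return t->count == AUTH_FAIL_LIMIT &&
           (now_s - t->timestamps[0]) <= AUTH_FAIL_WINDOW_S;
}
```

The same structure would apply to the message tampering criterion with a limit of 5 and a 60-second window.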
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-SEC-001** | SR-SEC-001, SR-SEC-002, SR-SEC-003, SR-SEC-004 | Secure boot verification and root-of-trust protection |
| **F-SEC-002** | SR-SEC-005, SR-SEC-006, SR-SEC-007, SR-SEC-008 | Flash encryption and secure storage |
| **F-SEC-003** | SR-SEC-009, SR-SEC-010, SR-SEC-011, SR-SEC-012 | Encrypted communication and mTLS |
| **F-SEC-004** | SR-SEC-013, SR-SEC-014, SR-SEC-015 | Security violation handling and response |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-SEC-001** | SWR-SEC-001, SWR-SEC-002, SWR-SEC-003 | Boot verification, signature validation, anti-rollback |
| **F-SEC-002** | SWR-SEC-004, SWR-SEC-005, SWR-SEC-006 | AES-256 encryption, key management, storage protection |
| **F-SEC-003** | SWR-SEC-007, SWR-SEC-008, SWR-SEC-009 | mTLS implementation, certificate management, session security |
| **F-SEC-004** | SWR-SEC-010, SWR-SEC-011, SWR-SEC-012 | Violation detection, response coordination, escalation logic |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **Security Manager** | Security policy enforcement, violation handling | `application_layer/security/` |
| **Secure Boot** | Boot-time firmware verification | `bootloader/secure_boot/` |
| **Encryption Engine** | Cryptographic operations, key management | `application_layer/security/crypto/` |
| **Certificate Manager** | Certificate validation, lifecycle management | `application_layer/security/cert_mgr/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **Network Stack** | TLS/DTLS transport layer | `drivers/network_stack/` |
| **NVM Driver** | Secure storage access | `drivers/nvm/` |
| **Diagnostics** | Security event logging | `application_layer/diag_task/` |
| **State Manager** | Security-triggered state transitions | `application_layer/business_stack/STM/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "Security & Safety Feature"
SEC[Security Manager]
BOOT[Secure Boot]
CRYPTO[Encryption Engine]
CERT[Certificate Manager]
end
subgraph "System Components"
STM[State Manager]
DIAG[Diagnostics]
NET[Network Stack]
NVM[NVM Driver]
end
subgraph "Hardware Security"
EFUSE[eFuse Storage]
HWCRYPTO[Hardware Crypto]
FLASH[Flash Memory]
end
subgraph "External Interfaces"
MH[Main Hub]
CA[Certificate Authority]
end
BOOT --> EFUSE
BOOT --> HWCRYPTO
BOOT --> SEC
SEC <--> STM
SEC <--> DIAG
SEC --> CRYPTO
SEC --> CERT
CRYPTO --> HWCRYPTO
CRYPTO --> NVM
CERT --> NET
CERT --> CA
NET <-->|mTLS| MH
SEC -.->|Security Events| DIAG
STM -.->|State Changes| SEC
```
### 4.4 Security Enforcement Flow
```mermaid
sequenceDiagram
participant BOOT as Secure Boot
participant SEC as Security Manager
participant CRYPTO as Encryption Engine
participant CERT as Certificate Manager
participant NET as Network Stack
participant MH as Main Hub
Note over BOOT,MH: Security Enforcement Flow
BOOT->>BOOT: verifyFirmwareSignature()
BOOT->>SEC: secureBootComplete(SUCCESS)
SEC->>SEC: initializeSecurityPolicies()
SEC->>CRYPTO: initializeEncryption()
CRYPTO->>CRYPTO: loadEncryptionKeys()
SEC->>CERT: initializeCertificates()
CERT->>CERT: validateDeviceCertificate()
NET->>CERT: establishTLSConnection(main_hub)
CERT->>CERT: performMutualAuthentication()
CERT-->>NET: tlsConnectionEstablished()
NET->>MH: secureDataExchange()
MH->>NET: secureDataExchange()
alt Security Violation Detected
SEC->>SEC: handleSecurityViolation(type)
SEC->>DIAG: logSecurityEvent(violation)
SEC->>STM: triggerSecurityResponse(action)
end
```
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **Boot-Time Security:**
- Secure Boot V2 verifies firmware signature using root-of-trust
- Anti-rollback mechanism prevents firmware downgrade attacks
- Flash encryption automatically decrypts application code
- Security Manager initializes and loads security policies
2. **Runtime Security:**
- All communication channels use mTLS with mutual authentication
- Sensitive data encrypted before storage using AES-256
- Certificate validation performed for all external connections
- Security violations monitored and logged continuously
3. **Communication Security:**
- Device certificate presented during TLS handshake
- Server certificate validated against trusted CA chain
- Encrypted data exchange using negotiated cipher suite
- Session keys rotated according to security policy
4. **Violation Response:**
- Security violations detected and classified by severity
- Immediate response actions taken based on violation type
- Escalation logic applied for repeated or coordinated attacks
- Security events logged for audit and analysis
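
The violation-response step above (classify by severity, respond, escalate on repeats) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Security Manager's implementation: the severity levels, the `sec_response_t` actions, the function name `sec_respond`, and the escalation threshold of three repeats are all hypothetical and are not defined by this specification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical severity levels and responses; not defined by this specification. */
typedef enum { SEV_LOW = 0, SEV_MEDIUM, SEV_HIGH, SEV_COUNT } sec_severity_t;
typedef enum { RESP_LOG_ONLY = 0, RESP_REJECT_CONNECTION, RESP_ENTER_FAULT } sec_response_t;

#define ESCALATION_THRESHOLD 3u /* assumed: reports of one severity before escalating */

static uint32_t repeat_count[SEV_COUNT]; /* zero-initialised report counters */

/* Map a severity to a response. Once the same severity has been reported
 * ESCALATION_THRESHOLD times, it is treated as one level more severe. */
sec_response_t sec_respond(sec_severity_t severity)
{
    assert(severity < SEV_COUNT);
    repeat_count[severity]++;

    sec_severity_t effective = severity;
    if (repeat_count[severity] >= ESCALATION_THRESHOLD && effective < SEV_HIGH) {
        effective = (sec_severity_t)(effective + 1);
    }

    switch (effective) {
    case SEV_LOW:    return RESP_LOG_ONLY;
    case SEV_MEDIUM: return RESP_REJECT_CONNECTION;
    default:         return RESP_ENTER_FAULT;
    }
}
```

In this sketch the counters never decay; a real policy would probably age them out over a time window.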
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **Boot Verification Failure** | Signature validation failure | Enter BOOT_FAILURE state, halt system |
| **Certificate Validation Failure** | X.509 validation error | Reject connection, log security event |
| **Encryption Key Failure** | Key derivation/access error | Enter FAULT state, disable encryption |
| **TLS Handshake Failure** | Protocol negotiation failure | Retry with fallback, log failure |
| **Message Integrity Failure** | MAC/signature verification failure | Discard message, log tampering event |
| **Anti-Rollback Violation** | Version check failure | Prevent boot, log violation |
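
For the Message Integrity Failure row, the comparison of a received MAC against the computed one is usually done in constant time so that the check does not leak how many leading bytes matched. The helper below is a generic sketch of that technique; the name `mac_equal` is hypothetical and is not part of the Encryption Engine API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compare two MACs without an early exit: every byte is always examined,
 * so the run time does not depend on where the first mismatch occurs. */
bool mac_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    assert(a != NULL && b != NULL);
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++) {
        diff |= (uint8_t)(a[i] ^ b[i]);
    }
    return diff == 0;
}
```

A caller would discard the message and log a tampering event whenever this returns `false`, as the table describes.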
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Initialize security components, load certificates and keys |
| **RUNNING** | Full security enforcement, continuous violation monitoring |
| **WARNING** | Enhanced security monitoring, stricter validation |
| **FAULT** | Critical security functions only, preserve security logs |
| **OTA_UPDATE** | Secure OTA validation, maintain security during update |
| **TEARDOWN** | Secure data flush, maintain encryption during shutdown |
| **SERVICE** | Limited security access for diagnostics |
| **BOOT_FAILURE** | Security violation state, no application execution |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **Boot Verification:** Maximum 5 seconds for secure boot completion
- **TLS Handshake:** Maximum 10 seconds for mTLS establishment
- **Certificate Validation:** Maximum 2 seconds per certificate
- **Violation Response:** Maximum 100ms for immediate response actions
### 6.2 Resource Constraints
- **Certificate Storage:** Maximum 2KB per certificate (device, CA)
- **Key Storage:** Secure storage in eFuse or encrypted NVS
- **Memory Usage:** Maximum 64KB for security buffers and state
- **CPU Usage:** Maximum 10% for cryptographic operations
### 6.3 Security Constraints
- **Root-of-Trust:** eFuse-based, one-time programmable
- **Key Protection:** Hardware-protected keys, no plaintext exposure
- **Certificate Validation:** Full chain validation required
- **Encryption Strength:** AES-256 minimum for all encryption
## 7. Interface Specifications
### 7.1 Security Manager Public API
```c
// Security initialization and control
bool secMgr_initialize(void);
bool secMgr_isSecurityEnabled(void);
security_status_t secMgr_getSecurityStatus(void);

// Violation handling
bool secMgr_reportViolation(security_violation_type_t type,
                            const char* source, const uint8_t* context);
bool secMgr_getViolationHistory(security_violation_event_t* events, size_t* count);
bool secMgr_clearViolationHistory(void);

// Security policy management
bool secMgr_setSecurityPolicy(const security_policy_t* policy);
bool secMgr_getSecurityPolicy(security_policy_t* policy);
bool secMgr_enforceSecurityPolicy(void);
```
### 7.2 Encryption Engine API
```c
// Encryption operations
bool crypto_encrypt(const uint8_t* plaintext, size_t plaintext_len,
                    const uint8_t* key, uint8_t* ciphertext, size_t* ciphertext_len);
bool crypto_decrypt(const uint8_t* ciphertext, size_t ciphertext_len,
                    const uint8_t* key, uint8_t* plaintext, size_t* plaintext_len);

// Hash operations
bool crypto_sha256(const uint8_t* data, size_t data_len, uint8_t* hash);
bool crypto_hmac_sha256(const uint8_t* data, size_t data_len,
                        const uint8_t* key, size_t key_len, uint8_t* hmac);

// Key management
bool crypto_generateKey(key_type_t type, uint8_t* key, size_t key_len);
bool crypto_deriveKey(const uint8_t* master_key, const char* context,
                      uint8_t* derived_key, size_t key_len);
```
### 7.3 Certificate Manager API
```c
// Certificate operations
bool certMgr_loadDeviceCertificate(const uint8_t* cert_data, size_t cert_len);
bool certMgr_validateCertificate(const uint8_t* cert_data, size_t cert_len);
bool certMgr_getCertificateInfo(certificate_info_t* info);
// TLS integration
bool certMgr_setupTLSContext(tls_context_t* ctx);
bool certMgr_validatePeerCertificate(const uint8_t* peer_cert, size_t cert_len);
bool certMgr_getTLSCredentials(tls_credentials_t* credentials);
```
## 8. Testing and Validation
### 8.1 Unit Testing
- **Secure Boot:** Firmware signature validation with valid/invalid signatures
- **Encryption:** AES-256 encryption/decryption with known test vectors
- **Certificate Validation:** X.509 certificate parsing and validation
- **Violation Handling:** All violation types and response actions
### 8.2 Integration Testing
- **End-to-End Security:** Complete security flow from boot to communication
- **mTLS Integration:** Full TLS handshake with certificate validation
- **State Integration:** Security behavior across all system states
- **Cross-Component Security:** Security enforcement across all components
### 8.3 System Testing
- **Security Penetration Testing:** Simulated attacks and vulnerability assessment
- **Performance Testing:** Cryptographic operations under load
- **Fault Injection:** Security behavior under hardware/software faults
- **Long-Duration Testing:** Security stability over extended operation
### 8.4 Acceptance Criteria
- Secure boot prevents execution of unsigned firmware
- All sensitive data encrypted at rest and in transit
- mTLS successfully established with valid certificates
- Security violations properly detected and responded to
- No security vulnerabilities identified in penetration testing
- Performance impact of security features within acceptable limits
- Complete audit trail of all security events
## 9. Dependencies
### 9.1 Internal Dependencies
- **State Manager:** Security-triggered state transitions
- **Diagnostics:** Security event logging and audit trail
- **Network Stack:** TLS/DTLS transport implementation
- **NVM Driver:** Secure storage for keys and certificates
### 9.2 External Dependencies
- **ESP-IDF Security Features:** Secure Boot V2, Flash Encryption, eFuse
- **Hardware Cryptographic Accelerators:** Hardware-accelerated cryptography
- **Certificate Authority:** Certificate validation and management
- **Cryptographic Libraries:** mbedTLS or equivalent for TLS implementation
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Hardware Security Module:** Dedicated HSM for key management
- **Physical Tamper Detection:** Hardware-based tamper detection
- **Advanced Threat Detection:** Machine learning-based anomaly detection
- **Quantum-Resistant Cryptography:** Post-quantum cryptographic algorithms
### 10.2 Scalability Considerations
- **Fleet Security Management:** Centralized security policy management
- **Certificate Automation:** Automated certificate lifecycle management
- **Security Analytics:** Advanced security event correlation and analysis
- **Zero-Trust Architecture:** Comprehensive zero-trust security model
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-SEC, SWR-SEC)
**Next Review:** After component implementation

@@ -0,0 +1,640 @@
# Feature Specification: System Management
# Feature ID: F-SYS (F-SYS-001 to F-SYS-005)
**Document Type:** Feature Specification
**Version:** 1.0
**Date:** 2025-01-19
**Feature Category:** System Management
## 1. Feature Overview
### 1.1 Feature Purpose
The System Management feature provides comprehensive control over the ASF Sensor Hub's operational lifecycle, state management, local human-machine interface, and engineering access capabilities. This feature acts as the supervisory layer governing all other functional domains.
### 1.2 Feature Scope
**In Scope:**
- System finite state machine implementation and control
- Controlled teardown sequences for safe transitions
- Local OLED-based human-machine interface
- Engineering and diagnostic access sessions
- GPIO discipline and hardware resource management
**Out of Scope:**
- Main Hub system management
- Cloud-based management interfaces
- User authentication and role management
- Remote control of other Sub-Hubs
## 2. Sub-Features
### 2.1 F-SYS-001: System State Management
**Description:** Comprehensive finite state machine controlling all system operations and transitions.
**System States (11 Total):**
| State | Description | Entry Conditions | Exit Conditions |
|-------|-------------|------------------|-----------------|
| **INIT** | Hardware and software initialization | Power-on, reset | Initialization complete or failure |
| **BOOT_FAILURE** | Secure boot verification failed | Boot verification failure | Manual recovery or reset |
| **RUNNING** | Normal sensor acquisition and communication | Successful initialization | Fault detected, update requested |
| **WARNING** | Non-fatal fault detected, degraded operation | Recoverable fault | Fault cleared or escalated |
| **FAULT** | Fatal error, core functionality disabled | Critical fault | Manual recovery or reset |
| **OTA_PREP** | OTA preparation phase | OTA request accepted | Teardown complete or OTA cancelled |
| **OTA_UPDATE** | Firmware update in progress | OTA preparation complete | Update complete or failed |
| **MC_UPDATE** | Machine constants update in progress | MC update request | Update complete or failed |
| **TEARDOWN** | Controlled shutdown sequence | Update request, fault escalation | Teardown complete |
| **SERVICE** | Engineering or diagnostic interaction | Service request | Service session ended |
| **SD_DEGRADED** | SD card failure detected, fallback mode | SD card failure | SD card restored or replaced |
**State Transition Diagram:**
```mermaid
stateDiagram-v2
[*] --> INIT
INIT --> RUNNING : Initialization Success
INIT --> BOOT_FAILURE : Boot Verification Failed
BOOT_FAILURE --> INIT : Manual Recovery
RUNNING --> WARNING : Non-Fatal Fault
RUNNING --> FAULT : Fatal Fault
RUNNING --> OTA_PREP : OTA Request
RUNNING --> MC_UPDATE : MC Update Request
RUNNING --> SERVICE : Service Request
RUNNING --> SD_DEGRADED : SD Card Failure
WARNING --> RUNNING : Fault Cleared
WARNING --> FAULT : Fault Escalated
WARNING --> SERVICE : Service Request
FAULT --> INIT : Manual Recovery
FAULT --> SERVICE : Service Request
OTA_PREP --> TEARDOWN : Ready for OTA
OTA_PREP --> RUNNING : OTA Cancelled
TEARDOWN --> OTA_UPDATE : OTA Teardown
TEARDOWN --> MC_UPDATE : MC Teardown
TEARDOWN --> INIT : Reset Teardown
OTA_UPDATE --> INIT : Update Complete
OTA_UPDATE --> FAULT : Update Failed
MC_UPDATE --> RUNNING : Update Success
MC_UPDATE --> FAULT : Update Failed
SERVICE --> RUNNING : Service Complete
SERVICE --> FAULT : Service Error
SD_DEGRADED --> RUNNING : SD Restored
SD_DEGRADED --> FAULT : Critical SD Error
```
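
The transitions in the diagram above can be encoded as a lookup table, which keeps the validation logic free of nested conditionals. The sketch below is one possible encoding, not the State Manager's implementation: the `ST_*` enumerators, the `ST_BIT` macro, and the function name `fsm_transitionAllowed` are hypothetical names chosen for the example. Each state stores a bitmask of the targets it may move to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    ST_INIT = 0, ST_BOOT_FAILURE, ST_RUNNING, ST_WARNING, ST_FAULT,
    ST_OTA_PREP, ST_OTA_UPDATE, ST_MC_UPDATE, ST_TEARDOWN, ST_SERVICE,
    ST_SD_DEGRADED, ST_COUNT
} sys_state_t;

#define ST_BIT(s) (1u << (s))

/* One bitmask of permitted target states per source state,
 * transcribed from the state diagram. */
static const uint16_t allowed_targets[ST_COUNT] = {
    [ST_INIT]         = ST_BIT(ST_RUNNING) | ST_BIT(ST_BOOT_FAILURE),
    [ST_BOOT_FAILURE] = ST_BIT(ST_INIT),
    [ST_RUNNING]      = ST_BIT(ST_WARNING) | ST_BIT(ST_FAULT) | ST_BIT(ST_OTA_PREP) |
                        ST_BIT(ST_MC_UPDATE) | ST_BIT(ST_SERVICE) | ST_BIT(ST_SD_DEGRADED),
    [ST_WARNING]      = ST_BIT(ST_RUNNING) | ST_BIT(ST_FAULT) | ST_BIT(ST_SERVICE),
    [ST_FAULT]        = ST_BIT(ST_INIT) | ST_BIT(ST_SERVICE),
    [ST_OTA_PREP]     = ST_BIT(ST_TEARDOWN) | ST_BIT(ST_RUNNING),
    [ST_OTA_UPDATE]   = ST_BIT(ST_INIT) | ST_BIT(ST_FAULT),
    [ST_MC_UPDATE]    = ST_BIT(ST_RUNNING) | ST_BIT(ST_FAULT),
    [ST_TEARDOWN]     = ST_BIT(ST_OTA_UPDATE) | ST_BIT(ST_MC_UPDATE) | ST_BIT(ST_INIT),
    [ST_SERVICE]      = ST_BIT(ST_RUNNING) | ST_BIT(ST_FAULT),
    [ST_SD_DEGRADED]  = ST_BIT(ST_RUNNING) | ST_BIT(ST_FAULT),
};

/* Returns true only when the diagram contains an edge from -> to. */
bool fsm_transitionAllowed(sys_state_t from, sys_state_t to)
{
    if ((unsigned)from >= ST_COUNT || (unsigned)to >= ST_COUNT) {
        return false;
    }
    return (allowed_targets[from] & ST_BIT(to)) != 0u;
}
```

A table like this can be reviewed line by line against the diagram, and adding a transition is a one-line change.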
### 2.2 F-SYS-002: Controlled Teardown Mechanism
**Description:** Safe system shutdown ensuring data consistency and resource cleanup.
**Teardown Triggers:**
- Firmware update (OTA) request
- Machine constants update request
- Fatal system fault escalation
- Manual engineering command
- System reset request
**Teardown Sequence (Mandatory Order):**
1. **Stop Active Operations**
- Halt sensor acquisition tasks
- Pause communication activities
- Suspend diagnostic operations
2. **Data Preservation**
- Flush pending sensor data via DP component
- Persist current system state
- Save diagnostic events and logs
- Update machine constants if modified
3. **Resource Cleanup**
- Close active communication sessions
- Release hardware resources (I2C, SPI, UART)
- Stop non-essential tasks
- Clear temporary buffers
4. **State Transition**
- Verify data persistence completion
- Update system state to target state
- Signal teardown completion
- Enter target operational mode
**Teardown Verification:**
```mermaid
sequenceDiagram
participant STM as State Manager
participant SM as Sensor Manager
participant DP as Data Persistence
participant COM as Communication
participant ES as Event System
Note over STM,ES: Teardown Initiation
STM->>ES: publish(TEARDOWN_INITIATED)
STM->>SM: stopAcquisition()
SM-->>STM: acquisitionStopped()
STM->>COM: closeSessions()
COM-->>STM: sessionsClosed()
STM->>DP: flushCriticalData()
DP->>DP: persistSensorData()
DP->>DP: persistSystemState()
DP->>DP: persistDiagnostics()
DP-->>STM: flushComplete()
STM->>STM: releaseResources()
STM->>ES: publish(TEARDOWN_COMPLETE)
Note over STM,ES: Ready for Target State
```
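
The ordering in the sequence diagram above can be expressed as an array of steps executed strictly in order, stopping at the first failure. This is a sketch under assumptions: `teardown_step_t`, `teardown_run`, the stub step functions, and the `sim_flush_ok` flag are hypothetical stand-ins for the real component calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*teardown_step_fn)(void);

typedef struct {
    const char      *name; /* label for logging */
    teardown_step_fn run;  /* returns true when the step completed */
} teardown_step_t;

/* Simulation switch so the failure path can be exercised (assumption). */
bool sim_flush_ok = true;

/* Stub steps standing in for the real components. */
static bool stop_acquisition(void)    { return true; }
static bool close_sessions(void)      { return true; }
static bool flush_critical_data(void) { return sim_flush_ok; }
static bool release_resources(void)   { return true; }

/* Step order follows the sequence diagram. */
const teardown_step_t teardown_steps[] = {
    { "stopAcquisition",   stop_acquisition },
    { "closeSessions",     close_sessions },
    { "flushCriticalData", flush_critical_data },
    { "releaseResources",  release_resources },
};
const size_t teardown_step_count = sizeof teardown_steps / sizeof teardown_steps[0];

/* Runs the steps in array order and returns how many completed.
 * A return value equal to step_count means the teardown finished. */
size_t teardown_run(const teardown_step_t *steps, size_t step_count)
{
    assert(steps != NULL || step_count == 0);
    size_t done = 0;
    while (done < step_count && steps[done].run()) {
        done++;
    }
    return done;
}
```

Returning the count of completed steps lets the caller report which step blocked the teardown before deciding whether to force the transition.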
### 2.3 F-SYS-003: Local Human-Machine Interface (HMI)
**Description:** OLED-based local interface with three-button navigation for system status and diagnostics.
**Hardware Components:**
- **OLED Display:** 128x64 pixels, I2C interface (SSD1306 compatible)
- **Navigation Buttons:** 3 physical buttons (Up, Down, Select)
- **Status Indicators:** Software-based status display
**Main Screen Display:**
```
┌─────────────────────────┐
│ ASF Sensor Hub v1.0 │
│ Status: RUNNING │
│ ─────────────────────── │
│ WiFi: Connected (75%) │
│ Sensors: 6/7 Active │
│ Storage: 2.1GB Free │
│ Time: 14:32:15 │
│ │
│ [SELECT] for Menu │
└─────────────────────────┘
```
**Menu Structure:**
```mermaid
graph TD
MAIN[Main Screen] --> MENU{Main Menu}
MENU --> DIAG[Diagnostics]
MENU --> SENSORS[Sensors]
MENU --> HEALTH[System Health]
MENU --> NETWORK[Network Status]
MENU --> SERVICE[Service Mode]
DIAG --> DIAG_ACTIVE[Active Diagnostics]
DIAG --> DIAG_HISTORY[Diagnostic History]
DIAG --> DIAG_CLEAR[Clear Diagnostics]
SENSORS --> SENSOR_LIST[Sensor List]
SENSORS --> SENSOR_STATUS[Sensor Status]
SENSORS --> SENSOR_DATA[Latest Data]
HEALTH --> HEALTH_CPU[CPU Usage]
HEALTH --> HEALTH_MEM[Memory Usage]
HEALTH --> HEALTH_STORAGE[Storage Status]
HEALTH --> HEALTH_UPTIME[System Uptime]
NETWORK --> NET_WIFI[WiFi Status]
NETWORK --> NET_MAIN[Main Hub Conn]
NETWORK --> NET_PEER[Peer Status]
SERVICE --> SERVICE_AUTH[Authentication]
SERVICE --> SERVICE_LOGS[View Logs]
SERVICE --> SERVICE_REBOOT[System Reboot]
```
**Button Navigation Logic:**
- **UP Button:** Navigate up in menus, scroll up in lists
- **DOWN Button:** Navigate down in menus, scroll down in lists
- **SELECT Button:** Enter submenu, confirm action, return to main
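
The button logic above can be sketched as a small handler operating on a menu cursor. The types `button_t` and `menu_t` and the function `menu_handle_button` are hypothetical, and the choice to clamp at the first and last entries (rather than wrap around) is an assumption, since the specification does not say which is intended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { BTN_UP, BTN_DOWN, BTN_SELECT } button_t;

typedef struct {
    size_t item_count; /* number of entries in the current menu */
    size_t selected;   /* index of the highlighted entry */
    bool   entered;    /* set when SELECT confirms the highlighted entry */
} menu_t;

/* Apply one button press to the menu cursor. UP and DOWN clamp at the
 * ends of the list; SELECT marks the highlighted entry as entered. */
void menu_handle_button(menu_t *menu, button_t button)
{
    assert(menu != NULL && menu->item_count > 0);
    switch (button) {
    case BTN_UP:
        if (menu->selected > 0) {
            menu->selected--;
        }
        break;
    case BTN_DOWN:
        if (menu->selected + 1 < menu->item_count) {
            menu->selected++;
        }
        break;
    case BTN_SELECT:
        menu->entered = true;
        break;
    }
}
```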
### 2.4 F-SYS-004: Engineering Access Sessions
**Description:** Secure engineering and diagnostic access for system maintenance and troubleshooting.
**Session Types:**
| Session Type | Access Level | Capabilities | Authentication |
|-------------|-------------|--------------|----------------|
| **Diagnostic Session** | Read-only | Log retrieval, status inspection | Basic PIN |
| **Engineering Session** | Read-write | Configuration, controlled commands | Certificate-based |
| **Service Session** | Full access | System control, firmware access | Multi-factor |
**Supported Access Methods:**
- **Local UART:** Direct serial connection for field service
- **Network Session:** Encrypted connection via Main Hub
- **Local HMI:** Limited diagnostic access via OLED interface
**Engineering Commands:**
```c
// System information
CMD_GET_SYSTEM_INFO
CMD_GET_SENSOR_STATUS
CMD_GET_DIAGNOSTIC_LOGS
CMD_GET_PERFORMANCE_STATS
// Configuration management
CMD_GET_MACHINE_CONSTANTS
CMD_UPDATE_MACHINE_CONSTANTS
CMD_VALIDATE_CONFIGURATION
CMD_BACKUP_CONFIGURATION
// System control
CMD_REBOOT_SYSTEM
CMD_INITIATE_TEARDOWN
CMD_CLEAR_DIAGNOSTICS
CMD_RESET_STATISTICS
// Debug operations
CMD_ENABLE_DEBUG_LOGGING
CMD_DUMP_MEMORY_USAGE
CMD_TRIGGER_DIAGNOSTIC_TEST
CMD_MONITOR_REAL_TIME
```
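
One way to tie the session types to the command groups listed above is a table of minimum access levels. The sketch below is illustrative only: the grouping of commands, the level required for each group, and the names `access_level_t`, `cmd_group_t`, and `session_may_execute` are assumptions, since the specification does not assign individual commands to session types.

```c
#include <assert.h>
#include <stdbool.h>

/* Ordered so that a higher value implies more privileges. */
typedef enum { ACCESS_DIAGNOSTIC = 0, ACCESS_ENGINEERING, ACCESS_SERVICE } access_level_t;

/* Command groups mirroring the four comment headings in the command list. */
typedef enum {
    CMD_GROUP_INFO = 0, /* system information */
    CMD_GROUP_CONFIG,   /* configuration management */
    CMD_GROUP_CONTROL,  /* system control */
    CMD_GROUP_DEBUG,    /* debug operations */
    CMD_GROUP_COUNT
} cmd_group_t;

/* Assumed minimum level per group; adjust to the project's access policy. */
static const access_level_t required_level[CMD_GROUP_COUNT] = {
    [CMD_GROUP_INFO]    = ACCESS_DIAGNOSTIC,
    [CMD_GROUP_CONFIG]  = ACCESS_ENGINEERING,
    [CMD_GROUP_CONTROL] = ACCESS_SERVICE,
    [CMD_GROUP_DEBUG]   = ACCESS_ENGINEERING,
};

/* True when the session's level meets the group's minimum level. */
bool session_may_execute(access_level_t session_level, cmd_group_t group)
{
    assert(group < CMD_GROUP_COUNT);
    return session_level >= required_level[group];
}
```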
### 2.5 F-SYS-005: GPIO Discipline and Hardware Management
**Description:** Centralized GPIO resource management and hardware access control.
**GPIO Ownership Model:**
```c
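#include <stdbool.h>
#include <stdint.h>
#include "driver/gpio.h" // ESP-IDF GPIO driver; assumed source of gpio_mode_t

// Component that currently owns a GPIO pin
typedef enum {
    GPIO_OWNER_NONE = 0,
    GPIO_OWNER_SENSOR_I2C,
    GPIO_OWNER_SENSOR_SPI,
    GPIO_OWNER_SENSOR_UART,
    GPIO_OWNER_SENSOR_ADC,
    GPIO_OWNER_COMMUNICATION,
    GPIO_OWNER_STORAGE,
    GPIO_OWNER_HMI,
    GPIO_OWNER_SYSTEM,
    GPIO_OWNER_DEBUG
} gpio_owner_t;

// Allocation record for a single GPIO pin
typedef struct {
    uint8_t pin_number;
    gpio_owner_t owner;
    gpio_mode_t mode;
    bool is_allocated;
    char description[32];
} gpio_resource_t;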
```
**Resource Allocation Rules:**
- Each GPIO pin has a single owner at any time
- Ownership must be requested and granted before use
- Automatic release on component shutdown
- Conflict detection and resolution
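
The allocation rules above can be sketched with a simple ownership table. This is a minimal illustration, not the GPIO Manager's implementation: the `pin_*` names, the reduced owner enum, and the pin count of 40 are assumptions, and the names deliberately avoid the `gpio_` prefix so they are not mistaken for ESP-IDF driver calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PIN_COUNT 40u /* assumed pin count for the example */

typedef enum { PIN_OWNER_NONE = 0, PIN_OWNER_SENSOR_I2C, PIN_OWNER_HMI } pin_owner_t;

static pin_owner_t pin_table[PIN_COUNT]; /* zero-initialised: all pins free */

/* Grant ownership only if the pin exists and is currently unowned.
 * A false return signals a conflict or an invalid request. */
bool pin_claim(uint8_t pin, pin_owner_t owner)
{
    if (pin >= PIN_COUNT || owner == PIN_OWNER_NONE) {
        return false;
    }
    if (pin_table[pin] != PIN_OWNER_NONE) {
        return false; /* conflict: pin already has an owner */
    }
    pin_table[pin] = owner;
    return true;
}

/* Release every pin held by one owner, e.g. on component shutdown. */
void pin_release_all(pin_owner_t owner)
{
    assert(owner != PIN_OWNER_NONE);
    for (uint8_t pin = 0; pin < PIN_COUNT; pin++) {
        if (pin_table[pin] == owner) {
            pin_table[pin] = PIN_OWNER_NONE;
        }
    }
}
```

The sketch is not thread-safe; on FreeRTOS the table would need to be guarded by a mutex or a critical section.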
## 3. Requirements Coverage
### 3.1 System Requirements (SR-XXX)
| Feature | System Requirements | Description |
|---------|-------------------|-------------|
| **F-SYS-001** | SR-SYS-001 | Finite state machine with 11 defined states |
| **F-SYS-002** | SR-SYS-002, SR-SYS-003 | State-aware operation and controlled teardown |
| **F-SYS-003** | SR-SYS-004 | Local HMI with OLED display and button navigation |
| **F-SYS-004** | SR-SYS-005 | Engineering access sessions with authentication |
| **F-SYS-005** | SR-HW-003 | GPIO discipline and hardware resource management |
### 3.2 Software Requirements (SWR-XXX)
| Feature | Software Requirements | Implementation Details |
|---------|---------------------|----------------------|
| **F-SYS-001** | SWR-SYS-001, SWR-SYS-002, SWR-SYS-003 | FSM implementation, state validation, transition logic |
| **F-SYS-002** | SWR-SYS-007, SWR-SYS-008, SWR-SYS-009 | Teardown sequence, resource cleanup, completion verification |
| **F-SYS-003** | SWR-SYS-010, SWR-SYS-011, SWR-SYS-012 | OLED driver, button handling, menu navigation |
| **F-SYS-004** | SWR-SYS-013, SWR-SYS-014, SWR-SYS-015 | Session authentication, command interface, access control |
| **F-SYS-005** | SWR-HW-007, SWR-HW-008, SWR-HW-009 | GPIO ownership, access control, conflict prevention |
## 4. Component Implementation Mapping
### 4.1 Primary Components
| Component | Responsibility | Location |
|-----------|---------------|----------|
| **State Manager (STM)** | FSM implementation, state transitions, teardown coordination | `application_layer/business_stack/STM/` |
| **HMI Controller** | OLED display management, button handling, menu navigation | `application_layer/hmi/` |
| **Engineering Session Manager** | Session authentication, command processing, access control | `application_layer/engineering/` |
| **GPIO Manager** | Hardware resource allocation, ownership management | `drivers/gpio_manager/` |
### 4.2 Supporting Components
| Component | Support Role | Interface |
|-----------|-------------|-----------|
| **Event System** | State change notifications, component coordination | `application_layer/business_stack/event_system/` |
| **Data Persistence** | State persistence, configuration storage | `application_layer/DP_stack/persistence/` |
| **Diagnostics Task** | System health monitoring, diagnostic reporting | `application_layer/diag_task/` |
| **Security Manager** | Session authentication, access validation | `application_layer/security/` |
### 4.3 Component Interaction Diagram
```mermaid
graph TB
subgraph "System Management Feature"
STM[State Manager]
HMI[HMI Controller]
ESM[Engineering Session Manager]
GPIO[GPIO Manager]
end
subgraph "Core Components"
ES[Event System]
DP[Data Persistence]
DIAG[Diagnostics Task]
SEC[Security Manager]
end
subgraph "Hardware Interfaces"
OLED[OLED Display]
BUTTONS[Navigation Buttons]
UART[UART Interface]
PINS[GPIO Pins]
end
STM <--> ES
STM --> DP
STM --> DIAG
HMI --> OLED
HMI --> BUTTONS
HMI <--> ES
HMI --> DIAG
ESM --> UART
ESM <--> SEC
ESM <--> STM
ESM --> DP
GPIO --> PINS
GPIO <--> ES
ES -.->|State Events| HMI
ES -.->|System Events| ESM
DIAG -.->|Health Data| HMI
```
### 4.4 State Management Flow
```mermaid
sequenceDiagram
participant EXT as External Trigger
participant STM as State Manager
participant ES as Event System
participant COMP as System Components
participant DP as Data Persistence
Note over EXT,DP: State Transition Request
EXT->>STM: requestStateTransition(target_state, reason)
STM->>STM: validateTransition(current, target)
alt Valid Transition
STM->>ES: publish(STATE_TRANSITION_STARTING)
STM->>COMP: prepareForStateChange(target_state)
COMP-->>STM: preparationComplete()
alt Requires Teardown
STM->>STM: initiateTeardown()
STM->>COMP: stopOperations()
STM->>DP: flushCriticalData()
DP-->>STM: flushComplete()
end
STM->>STM: transitionToState(target_state)
STM->>ES: publish(STATE_CHANGED, new_state)
STM->>DP: persistSystemState(new_state)
else Invalid Transition
STM->>ES: publish(STATE_TRANSITION_REJECTED)
STM-->>EXT: transitionRejected(reason)
end
```
## 5. Feature Behavior
### 5.1 Normal Operation Flow
1. **System Initialization:**
- Power-on self-test and hardware verification
- Load system configuration and machine constants
- Initialize all components and establish communication
- Transition to RUNNING state upon successful initialization
2. **State Management:**
- Monitor system health and component status
- Process state transition requests from components
- Validate transitions against state machine rules
- Coordinate teardown sequences when required
3. **HMI Operation:**
- Continuously update main screen with system status
- Process button inputs for menu navigation
- Display diagnostic information and system health
- Provide local access to system functions
4. **Engineering Access:**
- Authenticate engineering session requests
- Process authorized commands and queries
- Provide secure access to system configuration
- Log all engineering activities for audit
### 5.2 Error Handling
| Error Condition | Detection Method | Response Action |
|----------------|------------------|-----------------|
| **Invalid State Transition** | State validation logic | Reject transition, log diagnostic event |
| **Teardown Timeout** | Teardown completion timer | Force transition, log warning |
| **HMI Hardware Failure** | I2C communication failure | Disable HMI, continue operation |
| **Authentication Failure** | Session validation | Reject access, log security event |
| **GPIO Conflict** | Resource allocation check | Deny allocation, report conflict |
### 5.3 State-Dependent Behavior
| System State | Feature Behavior |
|-------------|------------------|
| **INIT** | Initialize components, load configuration, establish communication |
| **RUNNING** | Normal state management, full HMI functionality, engineering access |
| **WARNING** | Enhanced monitoring, diagnostic display, limited operations |
| **FAULT** | Minimal operations, fault display, engineering access only |
| **OTA_UPDATE** | Suspend normal operations, display update progress |
| **MC_UPDATE** | Suspend operations, reload configuration after update |
| **TEARDOWN** | Execute teardown sequence, display progress |
| **SERVICE** | Engineering mode, enhanced diagnostic access |
| **SD_DEGRADED** | Continue operations without persistence, display warning |
## 6. Feature Constraints
### 6.1 Timing Constraints
- **State Transition:** Maximum 5 seconds for normal transitions
- **Teardown Sequence:** Maximum 30 seconds for complete teardown
- **HMI Response:** Maximum 200ms for button response
- **Engineering Command:** Maximum 10 seconds for command execution
### 6.2 Resource Constraints
- **Memory Usage:** Maximum 16KB for state management data
- **Display Update:** Maximum 50ms for screen refresh
- **GPIO Resources:** Centralized allocation, no conflicts allowed
- **Session Limit:** Maximum 2 concurrent engineering sessions
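
The session limit above can be enforced with a fixed pool of slots, which also avoids dynamic allocation. The sketch below assumes hypothetical names (`session_slot_acquire`, `session_slot_release`, `INVALID_SLOT`); only the limit of two concurrent sessions comes from this section.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ENGINEERING_SESSIONS 2 /* limit stated in the resource constraints */
#define INVALID_SLOT (-1)

static bool slot_in_use[MAX_ENGINEERING_SESSIONS];

/* Returns a free slot index, or INVALID_SLOT when the limit is reached. */
int session_slot_acquire(void)
{
    for (int i = 0; i < MAX_ENGINEERING_SESSIONS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = true;
            return i;
        }
    }
    return INVALID_SLOT;
}

/* Marks a previously acquired slot as free again. */
void session_slot_release(int slot)
{
    assert(slot >= 0 && slot < MAX_ENGINEERING_SESSIONS);
    slot_in_use[slot] = false;
}
```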
### 6.3 Security Constraints
- **Authentication:** All engineering access must be authenticated
- **Command Validation:** All commands validated before execution
- **Audit Logging:** All engineering activities logged
- **Access Control:** Role-based access to system functions
## 7. Interface Specifications
### 7.1 State Manager Public API
```c
// State management
system_state_t stm_getCurrentState(void);
bool stm_requestStateTransition(system_state_t target_state, transition_reason_t reason);
bool stm_isTransitionValid(system_state_t from_state, system_state_t to_state);
const char* stm_getStateName(system_state_t state);
// Teardown coordination
bool stm_initiateTeardown(teardown_reason_t reason);
bool stm_isTeardownInProgress(void);
bool stm_isTeardownComplete(void);
teardown_status_t stm_getTeardownStatus(void);
// Component registration
bool stm_registerStateListener(state_change_callback_t callback);
bool stm_unregisterStateListener(state_change_callback_t callback);
bool stm_registerTeardownParticipant(teardown_participant_t* participant);
// System control
bool stm_requestSystemReboot(reboot_reason_t reason);
bool stm_requestSystemReset(reset_reason_t reason);
```
### 7.2 HMI Controller Public API
```c
// Display management
bool hmi_initialize(void);
bool hmi_updateMainScreen(const system_status_t* status);
bool hmi_displayMessage(const char* message, uint32_t duration_ms);
bool hmi_displayMenu(const menu_item_t* items, size_t item_count);
// Button handling
bool hmi_processButtonInput(button_event_t event);
bool hmi_setButtonCallback(button_callback_t callback);
// Menu navigation
bool hmi_enterMenu(menu_id_t menu_id);
bool hmi_exitMenu(void);
bool hmi_navigateMenu(navigation_direction_t direction);
bool hmi_selectMenuItem(void);
// Status display
bool hmi_showSystemStatus(const system_status_t* status);
bool hmi_showDiagnostics(const diagnostic_summary_t* diagnostics);
bool hmi_showSensorStatus(const sensor_status_t* sensors);
```
### 7.3 Engineering Session API
```c
// Session management
session_handle_t eng_createSession(session_type_t type, const auth_credentials_t* creds);
bool eng_authenticateSession(session_handle_t session, const auth_token_t* token);
bool eng_closeSession(session_handle_t session);
bool eng_isSessionValid(session_handle_t session);
// Command execution
bool eng_executeCommand(session_handle_t session, const command_t* cmd, command_result_t* result);
bool eng_querySystemInfo(session_handle_t session, system_info_t* info);
bool eng_getDiagnosticLogs(session_handle_t session, diagnostic_log_t* logs, size_t* count);
// Configuration access
bool eng_getMachineConstants(session_handle_t session, machine_constants_t* mc);
bool eng_updateMachineConstants(session_handle_t session, const machine_constants_t* mc);
bool eng_validateConfiguration(session_handle_t session, validation_result_t* result);
```
## 8. Testing and Validation
### 8.1 Unit Testing
- **State Machine:** All state transitions and edge cases
- **Teardown Logic:** Sequence execution and timeout handling
- **HMI Components:** Display updates and button handling
- **GPIO Management:** Resource allocation and conflict detection
### 8.2 Integration Testing
- **State Coordination:** Cross-component state awareness
- **Event System Integration:** State change notifications
- **HMI Integration:** Real hardware display and buttons
- **Engineering Access:** Authentication and command execution
### 8.3 System Testing
- **Full State Machine:** All states and transitions under load
- **Teardown Scenarios:** OTA, MC update, fault conditions
- **HMI Usability:** Complete menu navigation and display
- **Security Testing:** Authentication bypass attempts
### 8.4 Acceptance Criteria
- All 11 system states implemented and functional
- State transitions complete within timing constraints
- Teardown sequences preserve data integrity
- HMI provides complete system visibility
- Engineering access properly authenticated and logged
- GPIO conflicts prevented and resolved
## 9. Dependencies
### 9.1 Internal Dependencies
- **Event System:** State change notifications and coordination
- **Data Persistence:** State and configuration storage
- **Diagnostics Task:** System health monitoring
- **Security Manager:** Authentication and access control
### 9.2 External Dependencies
- **ESP-IDF Framework:** GPIO, I2C, UART drivers
- **FreeRTOS:** Task scheduling and synchronization
- **Hardware Components:** OLED display, buttons, GPIO pins
- **Network Stack:** Engineering session communication
## 10. Future Enhancements
### 10.1 Planned Improvements
- **Advanced HMI:** Graphical status displays and charts
- **Remote Management:** Web-based engineering interface
- **Predictive State Management:** AI-based state prediction
- **Enhanced Security:** Biometric authentication support
### 10.2 Scalability Considerations
- **Multi-Hub Management:** Coordinated state management
- **Cloud Integration:** Remote state monitoring and control
- **Advanced Diagnostics:** Predictive maintenance integration
- **Mobile Interface:** Smartphone app for field service
---
**Document Status:** Final for Implementation Phase
**Component Dependencies:** Verified against architecture
**Requirements Traceability:** Complete (SR-SYS, SWR-SYS)
**Next Review:** After component implementation

@@ -0,0 +1,89 @@
# Features Directory
# ASF Sensor Hub (Sub-Hub) System Features
**Document Type:** Feature Organization Index
**Version:** 1.0
**Date:** 2025-01-19
## Overview
This directory contains the complete feature specifications for the ASF Sensor Hub system. Each feature is documented with:
- Feature description and behavior
- Covered System Requirements (SR-XXX)
- Covered Software Requirements (SWR-XXX)
- Component implementation mapping
- Feature-level constraints
- Mermaid diagrams showing component interactions
## Feature Organization
### Feature Categories
| Category | Feature ID Range | Description |
|----------|------------------|-------------|
| **Sensor Data Acquisition** | F-DAQ-001 to F-DAQ-005 | Environmental sensor data collection and processing |
| **Data Quality & Calibration** | F-DQC-001 to F-DQC-005 | Sensor validation, calibration, and quality assurance |
| **Communication** | F-COM-001 to F-COM-005 | Main Hub and peer communication capabilities |
| **Diagnostics & Health Monitoring** | F-DIAG-001 to F-DIAG-004 | System health monitoring and diagnostic reporting |
| **Persistence & Data Management** | F-DATA-001 to F-DATA-005 | Data storage and persistence management |
| **Firmware Update (OTA)** | F-OTA-001 to F-OTA-005 | Over-the-air firmware update capabilities |
| **Security & Safety** | F-SEC-001 to F-SEC-004 | Security enforcement and safety mechanisms |
| **System Management** | F-SYS-001 to F-SYS-005 | System state management and control |
| **Power & Fault Handling** | F-PWR-001 to F-PWR-004 | Power management and fault handling |
| **Hardware Abstraction** | F-HW-001 to F-HW-003 | Hardware interface abstraction |
### Feature Files
| Feature File | Features Covered | Component Dependencies | Status |
|--------------|------------------|----------------------|--------|
| `F-DAQ_Sensor_Data_Acquisition.md` | F-DAQ-001 to F-DAQ-005 | Sensor Manager, Sensor Drivers, Event System | ✅ Complete |
| `F-DQC_Data_Quality_Calibration.md` | F-DQC-001 to F-DQC-005 | Machine Constant Manager, Sensor Manager | ✅ Complete |
| `F-COM_Communication.md` | F-COM-001 to F-COM-005 | Main Hub APIs, Network Stack, Event System | ✅ Complete |
| `F-DIAG_Diagnostics_Health.md` | F-DIAG-001 to F-DIAG-004 | Diagnostics Task, Error Handler, Persistence | ✅ Complete |
| `F-DATA_Persistence_Management.md` | F-DATA-001 to F-DATA-005 | Data Pool, Persistence, Storage Drivers | ✅ Complete |
| `F-OTA_Firmware_Update.md` | F-OTA-001 to F-OTA-005 | OTA Manager, State Manager, Security | ✅ Complete |
| `F-SEC_Security_Safety.md` | F-SEC-001 to F-SEC-004 | Security Manager, Boot System, Encryption | ✅ Complete |
| `F-SYS_System_Management.md` | F-SYS-001 to F-SYS-005 | State Manager, HMI, Event System | ✅ Complete |
| `F-PWR_Power_Fault_Handling.md` | F-PWR-001 to F-PWR-002 | Power Manager, Error Handler, Persistence | ✅ Complete |
| `F-HW_Hardware_Abstraction.md` | F-HW-001 to F-HW-002 | Sensor Abstraction Layer, GPIO Manager, Drivers | ✅ Complete |
## Traceability
### Requirements Coverage
- **System Requirements:** All 45 SR-XXX requirements are covered by features
- **Software Requirements:** All 122 SWR-XXX requirements are mapped to features
- **Components:** All components are mapped to implementing features
### Component Integration
Each feature document includes:
- **Component Interaction Diagrams:** Mermaid diagrams showing how components work together
- **Interface Definitions:** Clear specification of component interfaces
- **Data Flow:** How data flows between components within the feature
- **State Dependencies:** How the feature behaves in different system states
## Usage
1. **For Requirements Analysis:** Use feature documents to understand how requirements are implemented
2. **For Architecture Review:** Use component mappings to understand system structure
3. **For Implementation Planning:** Use component interfaces and interactions for development
4. **For Testing:** Use feature behaviors and constraints for test case development
## Document Standards
All feature documents follow:
- The ISO/IEC/IEEE 29148:2018 requirements engineering standard
- Consistent formatting and structure
- Complete traceability to requirements
- Mermaid diagrams for visual representation
- Clear component interface specifications
---
**Next Steps:**
1. Review individual feature documents for completeness
2. Validate component mappings against architecture
3. Ensure all requirements are properly traced
4. Update component specifications based on feature requirements