Feature Specification: Communication

Feature ID: F-COM (F-COM-001 to F-COM-005)

Document Type: Feature Specification
Version: 1.0
Date: 2025-01-19
Feature Category: Communication

1. Feature Overview

1.1 Feature Purpose

The Communication feature handles data exchange between the ASF Sensor Hub and external entities: the Main Hub and peer Sensor Hubs. It provides reliable, secure, and deterministic transfer of sensor data, diagnostics, configuration updates, and control commands.

1.2 Feature Scope

In Scope:

  • Bidirectional communication with Main Hub via MQTT over TLS 1.2
  • On-demand data broadcasting and request/response handling
  • Peer-to-peer communication between Sensor Hubs via ESP-NOW
  • Long-range fallback communication options (LoRa/Cellular)
  • Communication protocol management and error handling

Out of Scope:

  • Main Hub broker implementation and configuration
  • Cloud communication protocols and interfaces
  • Internet connectivity and routing management
  • Physical network infrastructure design

2. Sub-Features

2.1 F-COM-001: Main Hub Communication

Description: Primary bidirectional communication channel with the Main Hub using MQTT over TLS 1.2 for secure and reliable data exchange.

Protocol Stack:

graph TB
    subgraph "Communication Protocol Stack"
        APP[Application Layer - CBOR Messages]
        MQTT[MQTT Layer - QoS 1, Topics, Keepalive]
        TLS[TLS 1.2 Layer - mTLS, X.509 Certificates]
        TCP[TCP Layer - Reliable Transport]
        IP[IP Layer - Network Routing]
        WIFI[Wi-Fi 802.11n - 2.4 GHz Physical Layer]
    end
    
    APP --> MQTT
    MQTT --> TLS
    TLS --> TCP
    TCP --> IP
    IP --> WIFI

MQTT Configuration:

  • Broker: Main Hub / Edge Gateway
  • QoS Level: QoS 1 (At least once delivery)
  • Keepalive: 60 seconds with 30-second timeout
  • Max Message Size: 8KB per message
  • Payload Format: CBOR (Concise Binary Object Representation)
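
To make the CBOR payload format concrete, the following is a minimal sketch of how an unsigned integer (for example a raw sensor reading or a timestamp) is encoded as a CBOR major-type-0 item. The function name `cbor_encode_uint` is hypothetical and not part of this specification; a production build would more likely use an existing CBOR library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode an unsigned integer as a CBOR major-type-0 item (RFC 8949).
 * Returns the number of bytes written, or 0 if the buffer is too small.
 * Hypothetical helper for illustration only. */
size_t cbor_encode_uint(uint8_t *out, size_t out_size, uint64_t value)
{
    assert(out != NULL);

    size_t extra;     /* number of big-endian value bytes after the initial byte */
    uint8_t initial;  /* initial byte: major type 0 plus additional information */

    if (value < 24u)                { initial = (uint8_t)value; extra = 0; }
    else if (value <= 0xFFu)        { initial = 0x18; extra = 1; }
    else if (value <= 0xFFFFu)      { initial = 0x19; extra = 2; }
    else if (value <= 0xFFFFFFFFu)  { initial = 0x1A; extra = 4; }
    else                            { initial = 0x1B; extra = 8; }

    if (out_size < 1u + extra) {
        return 0;
    }

    out[0] = initial;
    for (size_t i = 0; i < extra; i++) {
        /* Most significant byte first (network byte order). */
        out[1u + i] = (uint8_t)(value >> (8u * (extra - 1u - i)));
    }
    return 1u + extra;
}
```

Small values cost a single byte and larger values grow in steps of 1, 2, 4, or 8 bytes, which is what keeps typical sensor payloads well below the 8KB message limit.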

Topic Structure:

/farm/{site_id}/{house_id}/{node_id}/data/{sensor_type}
/farm/{site_id}/{house_id}/{node_id}/status/heartbeat
/farm/{site_id}/{house_id}/{node_id}/status/system
/farm/{site_id}/{house_id}/{node_id}/cmd/{command_type}
/farm/{site_id}/{house_id}/{node_id}/diag/{severity_level}
/farm/{site_id}/{house_id}/{node_id}/ota/{action}
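
As an illustration of the topic layout above, the sketch below assembles a topic string from its segments. `build_topic` and `segment_ok` are hypothetical names, and the rule that segments must not contain `/`, `+`, or `#` is an assumption based on MQTT's topic separator and wildcard characters rather than a requirement stated in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* A segment must be non-empty and free of the MQTT separator and wildcards. */
static int segment_ok(const char *s)
{
    return s != NULL && s[0] != '\0' && strpbrk(s, "/+#") == NULL;
}

/* Hypothetical helper: builds
 *   /farm/{site_id}/{house_id}/{node_id}/{category}/{leaf}
 * Returns the topic length, or -1 on an invalid segment or a too-small buffer. */
int build_topic(char *out, size_t out_size,
                const char *site_id, const char *house_id, const char *node_id,
                const char *category, const char *leaf)
{
    assert(out != NULL && out_size > 0);

    if (!segment_ok(site_id) || !segment_ok(house_id) || !segment_ok(node_id) ||
        !segment_ok(category) || !segment_ok(leaf)) {
        out[0] = '\0';
        return -1;
    }

    int n = snprintf(out, out_size, "/farm/%s/%s/%s/%s/%s",
                     site_id, house_id, node_id, category, leaf);
    if (n < 0 || (size_t)n >= out_size) {
        out[0] = '\0';  /* never hand a truncated topic to the MQTT client */
        return -1;
    }
    return n;
}
```

For example, `build_topic(buf, sizeof buf, "s1", "h2", "n3", "data", "temperature")` yields `/farm/s1/h2/n3/data/temperature`.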

2.2 F-COM-002: On-Demand Data Broadcasting

Description: Real-time data request/response mechanism allowing the Main Hub to query current sensor data without waiting for periodic updates.

Request/Response Flow:

sequenceDiagram
    participant MH as Main Hub
    participant API as Main Hub APIs
    participant DP as Data Pool
    participant SM as Sensor Manager
    
    Note over MH,SM: On-Demand Data Request
    
    MH->>API: REQUEST_SENSOR_DATA(sensor_ids)
    API->>DP: getLatestSensorData(sensor_ids)
    DP-->>API: sensor_data_records
    
    alt Data available and fresh
        API->>API: formatCBORResponse(data)
        API->>MH: SENSOR_DATA_RESPONSE(data)
    else Data stale or unavailable
        API->>SM: requestImmediateSample(sensor_ids)
        SM->>SM: performSampling()
        SM->>DP: updateSensorData(fresh_data)
        DP-->>API: fresh_sensor_data
        API->>MH: SENSOR_DATA_RESPONSE(fresh_data)
    end
    
    Note over MH,SM: Response time < 100ms

Response Characteristics:

  • Maximum Response Time: 100ms from request to response
  • Data Freshness: Timestamp included with all data
  • Validity Status: Data quality indicators included
  • Batch Support: Multiple sensors in single response

2.3 F-COM-003: Peer Sensor Hub Communication

Description: Limited peer-to-peer communication between Sensor Hubs using ESP-NOW for coordination and status exchange.

ESP-NOW Configuration:

  • Protocol: ESP-NOW (IEEE 802.11 vendor-specific)
  • Range: ~200m line-of-sight, ~50m through walls
  • Security: Application-layer AES-128 encryption
  • Max Peers: 20 concurrent peer connections
  • Acknowledgment: Application-layer retry mechanism

Peer Message Types:

typedef enum {
    PEER_MSG_PING = 0x01,           // Connectivity check
    PEER_MSG_PONG = 0x02,           // Connectivity response
    PEER_MSG_TIME_SYNC_REQ = 0x03,  // Time synchronization request
    PEER_MSG_TIME_SYNC_RESP = 0x04, // Time synchronization response
    PEER_MSG_STATUS_UPDATE = 0x05,  // Status information exchange
    PEER_MSG_EMERGENCY = 0x06       // Emergency notification
} peer_message_type_t;

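typedef struct __attribute__((packed)) {
    uint8_t message_type;           // One of peer_message_type_t
    uint8_t source_id[6];           // MAC address
    uint8_t sequence_number;
    uint16_t payload_length;        // Number of valid bytes in payload
    // 10 header bytes + 1 checksum byte = 11 bytes of overhead, so the
    // whole frame fits within ESP_NOW_MAX_DATA_LEN
    uint8_t payload[ESP_NOW_MAX_DATA_LEN - 11];
    uint8_t checksum;
} peer_message_t;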
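
One way to put such a message on the wire is to serialize it field by field into a byte buffer, which avoids any dependence on struct padding. The sketch below makes several assumptions that this document does not specify: the checksum is a simple XOR over all preceding bytes, the length field is little-endian, and the frame limit is 250 bytes. The header fields occupy 10 bytes and the checksum one more, so at most 239 payload bytes fit. All function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PEER_FRAME_MAX   250u  /* assumed ESP-NOW frame limit */
#define PEER_HEADER_LEN  10u   /* type(1) + source MAC(6) + sequence(1) + length(2) */
#define PEER_PAYLOAD_MAX (PEER_FRAME_MAX - PEER_HEADER_LEN - 1u)  /* 1 checksum byte */

/* XOR of all bytes; the real checksum algorithm is an open design choice. */
static uint8_t xor_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum ^= data[i];
    }
    return sum;
}

/* Serialize one peer message. Returns the frame length, or 0 on error. */
size_t peer_frame_pack(uint8_t *frame, size_t frame_size, uint8_t type,
                       const uint8_t mac[6], uint8_t seq,
                       const uint8_t *payload, uint16_t payload_len)
{
    assert(frame != NULL && mac != NULL);

    size_t total = PEER_HEADER_LEN + (size_t)payload_len + 1u;
    if (payload_len > PEER_PAYLOAD_MAX || total > frame_size) return 0;
    if (payload_len > 0 && payload == NULL) return 0;

    frame[0] = type;
    memcpy(&frame[1], mac, 6);
    frame[7] = seq;
    frame[8] = (uint8_t)(payload_len & 0xFFu);  /* little-endian length (assumed) */
    frame[9] = (uint8_t)(payload_len >> 8);
    if (payload_len > 0) {
        memcpy(&frame[PEER_HEADER_LEN], payload, payload_len);
    }
    frame[total - 1u] = xor_checksum(frame, total - 1u);
    return total;
}

/* A frame is accepted when its declared length matches and the XOR over
 * the entire frame, checksum byte included, is zero. */
int peer_frame_valid(const uint8_t *frame, size_t len)
{
    if (frame == NULL || len < PEER_HEADER_LEN + 1u || len > PEER_FRAME_MAX) return 0;
    size_t declared = (size_t)frame[8] | ((size_t)frame[9] << 8);
    return len == PEER_HEADER_LEN + declared + 1u && xor_checksum(frame, len) == 0;
}
```

An XOR checksum only catches simple corruption; since peer traffic is AES-128 encrypted at the application layer, an authenticated mode or a CRC may be a better fit in practice.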

2.4 F-COM-004: Heartbeat and Status Reporting

Description: Continuous system health and status reporting to maintain connection awareness and system monitoring.

Heartbeat Message Structure:

typedef struct {
    uint32_t uptime_seconds;
    char firmware_version[16];
    uint32_t free_heap_bytes;
    int8_t wifi_rssi_dbm;
    uint32_t error_bitmap;
    system_state_t current_state;
    uint8_t sensor_count_active;
    uint8_t sensor_count_total;
    uint32_t last_data_timestamp;
    uint16_t communication_errors;
} heartbeat_payload_t;

Status Reporting Schedule:

  • Heartbeat Interval: 10 seconds (configurable)
  • Status Update: On state changes (immediate)
  • Error Reporting: On fault detection (immediate)
  • Performance Metrics: Every 5 minutes

2.5 F-COM-005: Long-Range Fallback Communication

Description: Optional long-range communication capability for farm-scale distances where Wi-Fi coverage is insufficient.

Fallback Options:

  1. LoRa Module (Optional):

    • External LoRa transceiver (SX1276/SX1262)
    • LoRaWAN or proprietary protocol
    • Use cases: Emergency alerts, basic status
    • Data rate: Low (not suitable for OTA updates)
  2. Cellular Module (Alternative):

    • LTE-M or NB-IoT modem
    • Higher data rate than LoRa
    • Suitable for OTA updates
    • Higher power consumption and cost

Fallback Activation Logic:

graph TD
    START[Communication Start] --> WIFI{Wi-Fi Available?}
    WIFI -->|Yes| CONNECT[Connect to Wi-Fi]
    WIFI -->|No| FALLBACK{Fallback Enabled?}
    
    CONNECT --> MQTT{MQTT Connected?}
    MQTT -->|Yes| NORMAL[Normal Operation]
    MQTT -->|No| RETRY[Retry Connection]
    
    RETRY --> TIMEOUT{Timeout Exceeded?}
    TIMEOUT -->|No| MQTT
    TIMEOUT -->|Yes| FALLBACK
    
    FALLBACK -->|Yes| LORA[Activate LoRa/Cellular]
    FALLBACK -->|No| OFFLINE[Offline Mode]
    
    LORA --> LIMITED[Limited Communication]
    OFFLINE --> STORE[Store Data Locally]
    
    NORMAL --> MONITOR[Monitor Connection]
    LIMITED --> MONITOR
    MONITOR --> WIFI
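
The decision points in the flowchart above reduce to a small pure function. The following sketch mirrors the diagram's branches; the type and function names (`link_status_t`, `comm_mode_t`, `comm_select_mode`) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    COMM_MODE_NORMAL,    /* Wi-Fi up and MQTT connected */
    COMM_MODE_RETRY,     /* Wi-Fi up, MQTT down, retry timeout not yet exceeded */
    COMM_MODE_FALLBACK,  /* LoRa/Cellular active, limited communication */
    COMM_MODE_OFFLINE    /* no link, store data locally */
} comm_mode_t;

typedef struct {
    bool wifi_available;
    bool mqtt_connected;
    bool retry_timeout_exceeded;
    bool fallback_enabled;
} link_status_t;

/* Select the communication mode for the current link status. */
comm_mode_t comm_select_mode(const link_status_t *s)
{
    assert(s != NULL);

    if (s->wifi_available) {
        if (s->mqtt_connected) {
            return COMM_MODE_NORMAL;
        }
        if (!s->retry_timeout_exceeded) {
            return COMM_MODE_RETRY;
        }
    }
    /* Reached when Wi-Fi is missing or the MQTT retry budget is used up. */
    return s->fallback_enabled ? COMM_MODE_FALLBACK : COMM_MODE_OFFLINE;
}
```

Keeping the decision free of side effects makes every branch of the diagram straightforward to unit test; the caller is responsible for acting on the returned mode and for re-evaluating it while monitoring the connection.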

3. Requirements Coverage

3.1 System Requirements (SR-XXX)

| Feature | System Requirements | Description |
|---|---|---|
| F-COM-001 | SR-COM-001, SR-COM-002, SR-COM-003 | MQTT over TLS communication with Main Hub |
| F-COM-002 | SR-COM-004, SR-COM-005 | On-demand data requests and responses |
| F-COM-003 | SR-COM-006, SR-COM-007 | ESP-NOW peer communication |
| F-COM-004 | SR-COM-008, SR-COM-009 | Heartbeat and status reporting |
| F-COM-005 | SR-COM-010, SR-COM-011 | Long-range fallback communication |

3.2 Software Requirements (SWR-XXX)

| Feature | Software Requirements | Implementation Details |
|---|---|---|
| F-COM-001 | SWR-COM-001, SWR-COM-002, SWR-COM-003 | MQTT client, TLS implementation, topic management |
| F-COM-002 | SWR-COM-004, SWR-COM-005, SWR-COM-006 | Request handling, data formatting, response timing |
| F-COM-003 | SWR-COM-007, SWR-COM-008, SWR-COM-009 | ESP-NOW driver, peer management, encryption |
| F-COM-004 | SWR-COM-010, SWR-COM-011, SWR-COM-012 | Status collection, heartbeat scheduling, error reporting |
| F-COM-005 | SWR-COM-013, SWR-COM-014, SWR-COM-015 | Fallback protocols, activation logic, data prioritization |

4. Component Implementation Mapping

4.1 Primary Components

| Component | Responsibility | Location |
|---|---|---|
| Main Hub APIs | MQTT communication, message handling, protocol management | application_layer/business_stack/main_hub_apis/ |
| Network Stack | Wi-Fi management, TCP/IP, TLS implementation | drivers/network_stack/ |
| Peer Communication Manager | ESP-NOW management, peer coordination | application_layer/peer_comm/ |
| Communication Controller | Protocol coordination, fallback management | application_layer/comm_controller/ |

4.2 Supporting Components

| Component | Support Role | Interface |
|---|---|---|
| Event System | Message routing, status notifications | application_layer/business_stack/event_system/ |
| Data Pool | Latest sensor data access | application_layer/DP_stack/data_pool/ |
| Security Manager | Certificate management, encryption | application_layer/security/ |
| Diagnostics Task | Communication error logging | application_layer/diag_task/ |

4.3 Component Interaction Diagram

graph TB
    subgraph "Communication Feature"
        API[Main Hub APIs]
        NS[Network Stack]
        PCM[Peer Comm Manager]
        CC[Communication Controller]
    end
    
    subgraph "Core System"
        ES[Event System]
        DP[Data Pool]
        SEC[Security Manager]
        DIAG[Diagnostics Task]
    end
    
    subgraph "Hardware Interfaces"
        WIFI[Wi-Fi Radio]
        ESPNOW[ESP-NOW Interface]
        LORA[LoRa Module]
        CELL[Cellular Module]
    end
    
    subgraph "External"
        MAINHUB[Main Hub]
        PEERS[Peer Hubs]
    end
    
    API <--> NS
    API <--> ES
    API <--> DP
    API <--> SEC
    
    PCM <--> ESPNOW
    PCM <--> ES
    PCM <--> SEC
    
    CC --> API
    CC --> PCM
    CC --> NS
    
    NS --> WIFI
    NS --> SEC
    NS --> DIAG
    
    WIFI <--> MAINHUB
    ESPNOW <--> PEERS
    LORA -.-> MAINHUB
    CELL -.-> MAINHUB
    
    ES -.->|Status Events| API
    DP -.->|Sensor Data| API
    DIAG -.->|Error Events| API

4.4 Communication Flow Sequence

sequenceDiagram
    participant SM as Sensor Manager
    participant ES as Event System
    participant API as Main Hub APIs
    participant NS as Network Stack
    participant MH as Main Hub
    
    Note over SM,MH: Sensor Data Communication Flow
    
    SM->>ES: publish(SENSOR_DATA_UPDATE, data)
    ES->>API: sensorDataEvent(data)
    
    API->>API: formatMQTTMessage(data)
    API->>NS: publishMQTT(topic, payload)
    NS->>NS: encryptTLS(payload)
    NS->>MH: MQTT_PUBLISH(encrypted_data)
    
    MH-->>NS: MQTT_PUBACK
    NS-->>API: publishComplete()
    
    alt Communication Error
        NS->>DIAG: logCommError(error_details)
        NS->>ES: publish(COMM_ERROR, error)
        ES->>API: commErrorEvent(error)
        API->>API: handleCommError()
    end
    
    Note over SM,MH: Heartbeat Flow
    
    loop Every 10 seconds
        API->>API: collectSystemStatus()
        API->>NS: publishHeartbeat(status)
        NS->>MH: MQTT_PUBLISH(heartbeat)
    end

5. Feature Behavior

5.1 Normal Operation Flow

  1. Connection Establishment:

    • Initialize Wi-Fi connection with configured credentials
    • Establish TLS session with Main Hub broker
    • Authenticate using device certificate (mTLS)
    • Subscribe to command and configuration topics
  2. Data Communication:

    • Publish sensor data on acquisition completion
    • Send heartbeat messages at regular intervals
    • Handle on-demand data requests from Main Hub
    • Process configuration and command messages
  3. Peer Communication:

    • Maintain ESP-NOW peer list and connections
    • Exchange status information with nearby hubs
    • Coordinate time synchronization when needed
    • Handle emergency notifications from peers
  4. Error Recovery:

    • Detect communication failures and timeouts
    • Implement exponential backoff for reconnection
    • Switch to fallback communication if available
    • Store data locally during communication outages
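
The exponential backoff mentioned under Error Recovery can be sketched as follows. The function name and its parameters are hypothetical, and the base delay and cap are left to the caller because this document does not fix them. Random jitter, which is often added so that many hubs do not reconnect at the same instant, is omitted here for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Delay before reconnection attempt number `attempt` (0-based):
 * base_ms * 2^attempt, capped at max_ms. Hypothetical helper. */
uint32_t backoff_delay_ms(uint32_t attempt, uint32_t base_ms, uint32_t max_ms)
{
    assert(base_ms > 0 && max_ms >= base_ms);

    uint32_t delay = base_ms;
    while (attempt-- > 0) {
        if (delay > max_ms / 2u) {
            return max_ms;  /* doubling again would exceed the cap */
        }
        delay *= 2u;        /* cannot overflow: delay <= max_ms / 2 */
    }
    return delay;
}
```

With a 1 second base and a 60 second cap, successive attempts wait 1, 2, 4, 8, 16, and 32 seconds, then 60 seconds from the seventh attempt onward.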

5.2 Error Handling

| Error Condition | Detection Method | Response Action |
|---|---|---|
| Wi-Fi Disconnection | Link status monitoring | Attempt reconnection, activate fallback |
| MQTT Broker Unreachable | Connection timeout | Retry with backoff, store data locally |
| TLS Certificate Error | Certificate validation failure | Log security event, request new certificate |
| Message Timeout | Acknowledgment timeout | Retry message, escalate if persistent |
| Peer Communication Failure | ESP-NOW transmission failure | Remove peer, attempt rediscovery |

5.3 State-Dependent Behavior

| System State | Feature Behavior |
|---|---|
| INIT | Establish connections, authenticate, subscribe to topics |
| RUNNING | Full communication functionality, all protocols active |
| WARNING | Continue communication, increase error reporting |
| FAULT | Emergency communication only, diagnostic data priority |
| OTA_UPDATE | OTA-specific communication, suspend normal data flow |
| TEARDOWN | Send final status, gracefully close connections |
| SERVICE | Engineering communication enabled, diagnostic access |
| SD_DEGRADED | Continue communication, no local data buffering |

6. Feature Constraints

6.1 Timing Constraints

  • Connection Establishment: Maximum 30 seconds for initial connection
  • Message Transmission: Maximum 5 seconds for MQTT publish
  • On-Demand Response: Maximum 100ms from request to response
  • Heartbeat Interval: 10 seconds ±1 second tolerance

6.2 Resource Constraints

  • Memory Usage: Maximum 64KB for communication buffers
  • Bandwidth Usage: Maximum 1 Mbps average, 5 Mbps peak
  • Connection Limit: 1 Main Hub + 20 peer connections maximum
  • Message Queue: Maximum 100 pending messages
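
The 100-message limit suggests a statically allocated queue that rejects new entries when full instead of growing. The ring buffer below is a minimal sketch of that idea; the `pending_msg_t` fields and all function names are hypothetical, and the code is not thread-safe, so a real implementation would need a lock or a single owning task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_QUEUE_CAPACITY 100u  /* maximum pending messages */

typedef struct {
    uint16_t topic_id;       /* hypothetical handle for the MQTT topic */
    uint16_t length;         /* payload length in bytes */
    const uint8_t *payload;  /* owned by the caller in this sketch */
} pending_msg_t;

typedef struct {
    pending_msg_t slots[MSG_QUEUE_CAPACITY];
    size_t head;   /* index of the oldest message */
    size_t count;  /* number of queued messages */
} msg_queue_t;

void msg_queue_init(msg_queue_t *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Returns false when the queue is full; the caller decides what to drop. */
bool msg_queue_push(msg_queue_t *q, pending_msg_t msg)
{
    assert(q != NULL);
    if (q->count == MSG_QUEUE_CAPACITY) {
        return false;
    }
    q->slots[(q->head + q->count) % MSG_QUEUE_CAPACITY] = msg;
    q->count++;
    return true;
}

/* Removes the oldest message (FIFO). Returns false when the queue is empty. */
bool msg_queue_pop(msg_queue_t *q, pending_msg_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0) {
        return false;
    }
    *out = q->slots[q->head];
    q->head = (q->head + 1u) % MSG_QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Returning `false` on overflow leaves the policy to the caller, for example dropping the oldest sensor sample while always keeping diagnostic messages.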

6.3 Security Constraints

  • Encryption: All Main Hub communication must use TLS 1.2 or higher; ESP-NOW peer traffic uses application-layer AES-128 encryption (see 2.3)
  • Authentication: Mutual TLS required for Main Hub communication
  • Certificate Validation: Full certificate chain validation required
  • Key Management: Automatic key rotation support required

7. Interface Specifications

7.1 Main Hub APIs Public Interface

// Connection management
bool mainHubAPI_initialize(const comm_config_t* config);
bool mainHubAPI_connect(void);
bool mainHubAPI_disconnect(void);
bool mainHubAPI_isConnected(void);

// Message publishing
bool mainHubAPI_publishSensorData(const sensor_data_record_t* data);
bool mainHubAPI_publishHeartbeat(const heartbeat_payload_t* heartbeat);
bool mainHubAPI_publishDiagnostic(const diagnostic_event_t* event);
bool mainHubAPI_publishStatus(const system_status_t* status);

// Message handling
bool mainHubAPI_subscribeToCommands(command_handler_t handler);
bool mainHubAPI_subscribeToConfig(config_handler_t handler);
bool mainHubAPI_handleOnDemandRequest(const data_request_t* request);

// Status and statistics
comm_status_t mainHubAPI_getConnectionStatus(void);
comm_stats_t mainHubAPI_getStatistics(void);
bool mainHubAPI_resetStatistics(void);

7.2 Peer Communication Manager API

// Peer management
bool peerComm_initialize(void);
bool peerComm_addPeer(const uint8_t* mac_address);
bool peerComm_removePeer(const uint8_t* mac_address);
bool peerComm_getPeerList(peer_info_t* peers, size_t* count);

// Message transmission
bool peerComm_sendPing(const uint8_t* peer_mac);
bool peerComm_sendTimeSync(const uint8_t* peer_mac, uint64_t timestamp);
bool peerComm_sendStatus(const uint8_t* peer_mac, const peer_status_t* status);
bool peerComm_broadcastEmergency(const emergency_msg_t* emergency);

// Message reception
bool peerComm_registerMessageHandler(peer_message_handler_t handler);
bool peerComm_setEncryptionKey(const uint8_t* key, size_t key_length);

7.3 Network Stack Interface

// Network management
bool networkStack_initialize(void);
bool networkStack_connectWiFi(const wifi_config_t* config);
bool networkStack_disconnectWiFi(void);
wifi_status_t networkStack_getWiFiStatus(void);

// MQTT operations
bool networkStack_connectMQTT(const mqtt_config_t* config);
bool networkStack_publishMQTT(const char* topic, const uint8_t* payload, size_t length);
bool networkStack_subscribeMQTT(const char* topic, mqtt_message_handler_t handler);
bool networkStack_disconnectMQTT(void);

// TLS management
bool networkStack_loadCertificate(const uint8_t* cert, size_t cert_length);
bool networkStack_loadPrivateKey(const uint8_t* key, size_t key_length);
bool networkStack_validateCertificate(const uint8_t* cert);

8. Testing and Validation

8.1 Unit Testing

  • Protocol Implementation: MQTT, TLS, ESP-NOW protocol compliance
  • Message Formatting: CBOR encoding/decoding validation
  • Error Handling: Network failure and recovery scenarios
  • Security: Certificate validation and encryption testing

8.2 Integration Testing

  • Main Hub Communication: End-to-end MQTT communication testing
  • Peer Communication: ESP-NOW multi-device testing
  • Fallback Systems: LoRa/Cellular fallback activation
  • Event Integration: Communication event publication and handling

8.3 System Testing

  • Load Testing: High-frequency data transmission under load
  • Reliability Testing: 48-hour continuous communication
  • Security Testing: Penetration testing and certificate validation
  • Interoperability: Communication with actual Main Hub systems

8.4 Acceptance Criteria

  • Successful connection establishment within timing constraints
  • 99.9% message delivery success rate under normal conditions
  • On-demand responses within 100ms requirement
  • Secure communication with proper certificate validation
  • Graceful handling of all communication error conditions
  • Peer communication functional with multiple concurrent peers

9. Dependencies

9.1 Internal Dependencies

  • Event System: Message routing and status notifications
  • Data Pool: Access to latest sensor data for transmission
  • Security Manager: Certificate management and encryption
  • State Manager: System state awareness for communication control

9.2 External Dependencies

  • ESP-IDF Framework: Wi-Fi, TCP/IP, TLS, ESP-NOW drivers
  • Main Hub Broker: MQTT broker availability and configuration
  • Network Infrastructure: Wi-Fi access points and internet connectivity
  • Certificate Authority: X.509 certificates for device authentication

10. Future Enhancements

10.1 Planned Improvements

  • Adaptive QoS: Dynamic quality of service based on network conditions
  • Mesh Networking: Sensor Hub mesh for extended coverage
  • Edge Computing: Local data processing and filtering
  • 5G Integration: 5G connectivity for high-bandwidth applications

10.2 Scalability Considerations

  • Protocol Optimization: Compressed protocols for bandwidth efficiency
  • Load Balancing: Multiple Main Hub connections for redundancy
  • Cloud Integration: Direct cloud connectivity that bypasses the Main Hub
  • IoT Platform Integration: Standard IoT platform protocol support

Document Status: Final for Implementation Phase
Component Dependencies: Verified against architecture
Requirements Traceability: Complete (SR-COM, SWR-COM)
Next Review: After component implementation