T2TRG                                               Hong, Choong Seon
Internet-Draft                                   Kyung Hee University
Intended status: Standards Track                     Minh, Nguyen H N
Expires: August 09, 2020                         Kyung Hee University
                                                   Pandey, Shashi Raj
                                                 Kyung Hee University
                                                      Chit Wutyee Zaw 
                                                 Kyung Hee University
                                                        Seung Il Moon
                                                 Kyung Hee University
                                                        December 2018
                                                   
                                                   
              Distributed Fault Management for IoT Networks
                        draft-hongcs-t2trg-dfm-00

Abstract

Recent advances in the Internet of Things (IoT) have increased
the use of sensing technologies in IoT applications. However,
monitoring sensor nodes remains challenging in distributed remote
environments, especially wireless ones. In contrast to conventional
centralized mechanisms, Fog Computing plays an essential role in a
scalable IoT system. A Fog Node can control and monitor the devices
in its subdomain and perform aggregation tasks to support the
central server in the cloud. Because faulty nodes can strongly
affect the performance and accuracy of most IoT analysis
applications, a fault detection mechanism should be integrated into
IoT Networks. This document describes a distributed fault detection
and node replacement mechanism that detects faulty nodes from the
sensory values observed in a domain and replaces them with other
available nodes in the same domain.

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF).  Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six 
months and may be updated, replaced, or obsoleted by other 
documents at any time.  It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as 
"work in progress." 




Hong, et al.          Expires  August 09, 2020                  [Page 1]


Internet-Draft      Fault Management for IoT               December 2018

This Internet-Draft will expire on August 09, 2020.

Copyright Notice 

Copyright (c) 2018 IETF Trust and the persons identified as the
document authors.  All rights reserved.

 This document is subject to BCP 78 and the IETF Trust's Legal
 Provisions Relating to IETF Documents
 (http://trustee.ietf.org/license-info) in effect on the date of
 publication of this document.  Please review these documents
 carefully, as they describe your rights and restrictions with respect
 to this document.  Code Components extracted from this document must
 include Simplified BSD License text as described in Section 4.e of
 the Trust Legal Provisions and are provided without warranty as
 described in the Simplified BSD License.

Table of Contents

 1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  2
      1.1.  Terminology and Requirements Language  . . . . . . . . .  3
 2.  Communication Process . . . . . . . . . . . . . . . . . . . . .  3
      2.1.  Constrained Application Protocol (CoAP) message exchange  4
      2.2.  Network Setup  . . . . . . . . . . . . . . . . . . . . .  5
      2.3.  Message description in Fiesta-IOT  . . . . . . . . . . .  6
 3.  Distributed Fault Management  . . . . . . . . . . . . . . . . .  7
      3.1.  Abnormal Value Fault Detection . . . . . . . . . . . . .  7
      3.2.  Sensor Replacement for Fault Sensors . . . . . . . . . .  9
 4.  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 10
 5.  Security Considerations . . . . . . . . . . . . . . . . . . . . 10
 6.  References  . . . . . . . . . . . . . . . . . . . . . . . . . . 10
      6.1.  Normative References . . . . . . . . . . . . . . . . . . 10
      6.2.  Informative References . . . . . . . . . . . . . . . . . 10
 Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . . 11


1.  Introduction

IoT Networks are composed of a massive number of small, low-cost
sensor nodes deployed in a scattered manner. The sensory data
collected by these IoT nodes is delivered to IoT applications
through fog nodes. Accordingly, the central server and the fog
nodes need a fault detection mechanism to monitor the sensor nodes
in their domain, because failed nodes may degrade the quality of
service (QoS) of IoT analysis applications. Fault detection is an
important feature of IoT management systems since faults occur
often in IoT Networks for the following reasons:

. Sensor nodes fail often because massive numbers of low-cost
  sensor nodes are deployed on low-cost IoT platforms.

  
. Critical applications are very sensitive to the quality
  of sensory values.

. Faults can occur due to battery depletion in
  battery-powered nodes.

. The wireless link can be disconnected, so that the sensory
  values cannot be updated at the central server.

Faults in an IoT domain can be classified into two types, as in [a]:

.  A 'hard fault' occurs when a sensor node cannot communicate
   with the monitoring server (e.g., because of a failure of the
   communication module, energy depletion of the node, or the node
   moving out of the communication range of the entire mobile
   network).

.  A 'soft fault' occurs when the failed node can still communicate
   with the monitoring server but the data it senses or transmits
   is not correct.



1.1.  Terminology and Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].


2.   Communication Process

The machine-to-machine (M2M) interaction model known as the
Constrained Application Protocol (CoAP), which is similar to the
client/server model of HTTP, is adopted. It provides a flexible
interaction environment for handling message exchanges between
client and server nodes. Unlike HTTP, CoAP exchanges messages
asynchronously over UDP.

One complete message exchange for the application is handled in
three stages (see Figure 1). In Stage-1, the sensor node (client)
initiates the device registration process by forwarding the device
information in JSON format. The server stores the device
information in TDB and, in Stage-2, triggers an application-driven
control message (Start/Stop observation). In Stage-3, a sensor node
that is not yet sending observation data starts publishing its
sensory data (e.g., temperature, humidity) to the server once it
receives a Start control message. The sensor node may also be
required to stop sending observation data; this is triggered by
the application running on the server.



   +---------------+           +------------+
   | Sensor Node   |           |   Server   |
   |   (Client)    |           |            |
   +---------------+           +------------+
         |                            |
        ,|                            |
        ||   +---------------------+  |
Stage-1 ||   | device registration |  |
process ||   |      process        |  |
        ||   +---------------------+  |
        `|                            |
         |                            |
      -----------------------------------
         |                            |
         |   +------------------+     |`
         |   |  Control message |     || Stage -2
         |   |     exchange     |     ||
         |   +------------------+     |,
         |                            |
     ------------------------------------
        ,|                            |
        ||   +---------------------+  |
Stage-3 ||   | observation message |  |
process ||   |      passing        |  |
        ||   +---------------------+  |
        `|                            |                           
                    
      Figure 1: Basic communication process
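
As an illustration only, the three stages of Figure 1 can be
sketched as a small state machine in Python. The class names, the
state names, and the "start"/"stop" strings are assumptions made
for this sketch and are not defined by this document. The server is
reduced to two in-memory containers rather than a CoAP endpoint
backed by TDB.

```python
from enum import Enum


class State(Enum):
    UNREGISTERED = "unregistered"   # before Stage-1
    IDLE = "idle"                   # registered, not observing
    OBSERVING = "observing"         # Stage-3 in progress


class Server:
    """Hypothetical stand-in for the server: in-memory storage only."""

    def __init__(self):
        self.devices = {}
        self.observations = []


class SensorNode:
    """Hypothetical client-side view of the three stages."""

    def __init__(self, device_info):
        self.device_info = device_info
        self.state = State.UNREGISTERED

    def register(self, server):
        # Stage-1: forward the device information to the server.
        server.devices[self.device_info["id"]] = self.device_info
        self.state = State.IDLE

    def on_control(self, command):
        # Stage-2: Start/Stop control message from the application.
        if self.state is State.UNREGISTERED:
            raise RuntimeError("control message before registration")
        if command == "start":
            self.state = State.OBSERVING
        elif command == "stop":
            self.state = State.IDLE
        else:
            raise ValueError(f"unknown control message: {command!r}")

    def publish(self, server, value):
        # Stage-3: observations are sent only while observing.
        if self.state is not State.OBSERVING:
            return False
        server.observations.append((self.device_info["id"], value))
        return True
```

In this sketch a node that has registered but has not yet received
a Start control message silently refuses to publish, which mirrors
the ordering of Stage-2 before Stage-3.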



2.1.  Constrained Application Protocol (CoAP) message exchange

CoAP messages are exchanged asynchronously between CoAP
endpoints [b]. In an M2M interaction implemented with CoAP, a node
acts in both the server and the client role. Using a Method Code,
a client sends a CoAP request for a resource (identified by a URI)
on a server. Correspondingly, the server uses a Response Code
to send a response, which may include a resource representation.
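
To make the notions of Method Code and Response Code concrete, the
following Python sketch packs the fixed 4-byte CoAP header defined
in RFC 7252 [b] (version, type, token length, code, message ID).
The helper names encode_header and decode_code are hypothetical and
do not belong to any CoAP library; options and payload are
deliberately left out.

```python
import struct

# Message types and a few codes from RFC 7252, as (class, detail).
CON, NON, ACK, RST = 0, 1, 2, 3
GET, POST, PUT, DELETE = (0, 1), (0, 2), (0, 3), (0, 4)
CREATED, CONTENT = (2, 1), (2, 5)


def encode_header(msg_type, code, message_id, token=b""):
    """Pack the 4-byte CoAP header followed by the token."""
    if len(token) > 8:
        raise ValueError("token length must be 0..8 bytes")
    code_class, code_detail = code
    first = (1 << 6) | (msg_type << 4) | len(token)  # Ver=1, T, TKL
    code_byte = (code_class << 5) | code_detail      # 3-bit class, 5-bit detail
    return struct.pack("!BBH", first, code_byte, message_id) + token


def decode_code(message):
    """Return the code of an encoded message as (class, detail)."""
    code_byte = message[1]
    return code_byte >> 5, code_byte & 0x1F
```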

The CoAP message exchange with a JSON payload is illustrated in
Figure 2. A Fog Node acts as an agent to support the distributed
scenario. The interaction between the Sensor Node and the Server is
either managed by the Fog Node or proceeds directly as in
Figure 1. The Sensor Node sends a registration message to
register itself, then waits for the control message before it
starts sending observation data.


      +---------------+        +------------+        +------------+
      | Sensor Node   |        |  Fog Node  |        |  Server    |
      |               |        |            |        |            |
      +---------------+        +------------+        +------------+
             |                       |                     |
Device off-->|                       |                     |    
             |                       |                     |
Device on -->|                       |                     |
            ,|                       |                     |
            || +-------------------+ |                     |
Stage-1     || |device registration| |     Relay Message   |
process     || |      process      | |-------------------->| 
            || +-------------------+ |                     |        
            `|                       |                     |    
      ---------------------------------------------------------------
             |                       | +-----------------+ |`
             |                       | | Control message | ||Stage-2
             |      Forward Control  | |     exchange    | ||
             |           Message     | +-----------------+ |,
             |<----------------------|                     |            
             |                       |                     |
      ---------------------------------------------------------------
            ,|                       |                     |
            || +-------------------+ |                     |
Stage-3     || |observation message| |                     |
process     || |      passing      | |  Store and forward  |
            || +-------------------+ |-------------------->|                
            `|                       |                     |

                     Figure 2: CoAP message exchange
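
The role of the Fog Node in Figure 2 can be sketched in Python as
follows. This is a minimal illustration under stated assumptions:
the class and method names are hypothetical, messages are plain
Python values rather than CoAP messages, and dropping observations
from sensors that have not been started is a choice made for this
sketch, not a requirement of this document.

```python
class Server:
    """Hypothetical stand-in for the central server."""

    def __init__(self):
        self.registered = []
        self.observations = []


class FogNode:
    """Hypothetical Fog Node between the sensor nodes and the server."""

    def __init__(self, server):
        self.server = server
        self.active = set()   # sensors that received a Start message
        self.buffer = []      # observations stored before forwarding

    def on_registration(self, sensor_id, info):
        # Stage-1: relay the registration message to the server.
        self.server.registered.append((sensor_id, info))

    def on_control(self, sensor_id, command):
        # Stage-2: forward the server's control message to the sensor.
        if command == "start":
            self.active.add(sensor_id)
        elif command == "stop":
            self.active.discard(sensor_id)
        return (sensor_id, command)   # what is sent on to the sensor

    def on_observation(self, sensor_id, value):
        # Stage-3: store locally; ignore sensors that were not started.
        if sensor_id in self.active:
            self.buffer.append((sensor_id, value))

    def flush(self):
        # Forward the stored observations to the server in one batch.
        self.server.observations.extend(self.buffer)
        self.buffer.clear()
```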



2.2. Network Setup

A tree topology is used as shown in Figure 3. 

 (Sensor Node a)------+                  
                       \               
 (Sensor Node b)--------+(Fog Node1)      (Fog Node2)
                       /        \             /
 (Sensor Node c)------+          \           / 
                                  \         /
                                   \       /
             (Sensor Node d)-------+(Server)


                      Figure 3: A tree topology 




All nodes MUST be CoAP endpoints for message exchange.
A CoAP endpoint is capable of both the client and the server role.
A sensor node can interact with the server either directly or via
a Fog Node.
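
As a small illustration, the tree of Figure 3 can be written as a
parent map, from which the path a message takes to the server
follows directly. The dictionary PARENT and the function
path_to_server are hypothetical names used only in this sketch.

```python
# Parent of each node in the tree of Figure 3; the server is the root.
PARENT = {
    "Sensor Node a": "Fog Node1",
    "Sensor Node b": "Fog Node1",
    "Sensor Node c": "Fog Node1",
    "Sensor Node d": "Server",
    "Fog Node1": "Server",
    "Fog Node2": "Server",
}


def path_to_server(node):
    """Return the list of nodes from `node` up to the server."""
    path = [node]
    while path[-1] != "Server":
        path.append(PARENT[path[-1]])
    return path
```

Sensor Node a reaches the server through Fog Node1, whereas
Sensor Node d interacts with the server directly.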

          
2.3.  Message description in Fiesta-IOT

The message format complies with the Fiesta-IOT ontology [c].
It maintains three levels of service description, as shown
in Figure 4. The observation data is carried in the
sensor-level description.


               +------------------------------------------------------+
               | <platform>                                           |
               |     <location>                                       |
               |        <lat></lat>                                   |
               |        <long></long>                                 |
               |     </location>                                      |
               |  +------------------------------------------------+  |
               |  | <system>                                       |  |
               |  |     <coverage></coverage>                      |  |
               |  |     <service></service>                        |  |
               |  |  +-------------------------------------------+ |  |
               |  |  |  <sensor>                                 | |  |
               |  |  |       <quantityKind></quantityKind>       | |  |  
               |  |  |       <unit></unit>                       | |  | 
               |  |  |       <observation>                       | |  |
               |  |  |        <measurementType></measurementType>| |  |
               |  |  |        <time></time>                      | |  |
               |  |  |        <sensorOutput>                     | |  |
               |  |  |          <value></value>                  | |  |
               |  |  |          <unit></unit>                    | |  |
               |  |  |        </sensorOutput>                    | |  |
               |  |  |       </observation>                      | |  |
               |  |  |  </sensor>                                | |  |
               |  |  +-------------------------------------------+ |  |
               |  | </system>                                      |  |
               |  +------------------------------------------------+  |
               | </platform>                                          |
               +------------------------------------------------------+

                     Figure 4: Message description
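
The nesting of Figure 4 (platform, system, sensor) can be
reproduced with the Python standard library as sketched below. The
element names are taken from Figure 4. The function name
build_message, its parameters, and the sample values used with it
are assumptions for illustration and are not prescribed by the
ontology [c].

```python
import xml.etree.ElementTree as ET


def build_message(lat, lon, quantity_kind, unit, time, value):
    """Build the three-level description of Figure 4 as an element tree."""
    # Platform level: location of the deployment.
    platform = ET.Element("platform")
    location = ET.SubElement(platform, "location")
    ET.SubElement(location, "lat").text = str(lat)
    ET.SubElement(location, "long").text = str(lon)

    # System level: coverage and service are left empty here.
    system = ET.SubElement(platform, "system")
    ET.SubElement(system, "coverage")
    ET.SubElement(system, "service")

    # Sensor level: carries the observation data.
    sensor = ET.SubElement(system, "sensor")
    ET.SubElement(sensor, "quantityKind").text = quantity_kind
    ET.SubElement(sensor, "unit").text = unit
    observation = ET.SubElement(sensor, "observation")
    ET.SubElement(observation, "measurementType")
    ET.SubElement(observation, "time").text = time
    output = ET.SubElement(observation, "sensorOutput")
    ET.SubElement(output, "value").text = str(value)
    ET.SubElement(output, "unit").text = unit
    return platform
```

Calling ET.tostring() on the returned element yields the serialized
message.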






3.  Distributed Fault Management

After sensory data has been collected from the sensors through the
Fog Node and the Central Server, the distributed fault management
MUST run the abnormal value fault detection algorithm to detect
the faulty sensors at a particular location. The detected faulty
sensor nodes SHOULD then be replaced with supplementary sensor
nodes that are currently off. These algorithms MAY be implemented
in a centralized server (domain) and in a fog node at a particular
location such as a room or a building.


3.1.  Abnormal Value Fault Detection

Abnormal value fault detection is an important kind of fault
detection in the IoT domain. To determine whether the values
observed by an IoT sensor are faulty, the observation values of
that sensor SHOULD be compared with the values from the
neighboring sensor nodes. A flow chart for abnormal value fault
detection at a particular location is shown in Figure 5. The
detection MAY be based on two parameters: the current distance
between observation values, and the difference between the
previous and the current distance. If the observation value of a
sensor is similar to the observation values of the majority of
sensors, that is, the distances are within the predefined
threshold, that sensor MAY be classified as normal. Otherwise, it
is classified as a faulty sensor.


                        
                 
     
           +----------------------+     
           |  Calculate distances |
           |    between sensors   |                                 
           +----------------------+                                 
                      |                                             
                      |                                             
                      v                                             
                     /\                                             
                    /  \                                            
                   /    \                                           
                  /      \                                          
           Yes   / Are all\                                     
    (End) <-----/  sensors \ <--------------------------------------+
                \ detected?/                                        |
                 \        /                                         |
                  \      /                                          |
                   \    /                                           |
                    \  /                                            |
                     \/                                             |
                      |No                                           |
                      v                                             |
     +-----------------------------+                                |
     |For each sensor, count the   |                                |
     |number of sensors where their|                                |
     |distances are above threshold|                                |
     +-----------------------------+                                |
                     |                                              |
                     v                                              |
                    / \                                             |
                   /   \                                            |
                  /     \                                           |
                 / Is the\         +---------------------------+    |
                /  count  \    No  | Update the sensor status  |    |
               / below the \ ----->|         as Fault          |--->|
               \half of the/       +---------------------------+    |
                \number of/                                         |
                 \sensors/                                          |
                  \  ?  /                                           |
                   \   /                                            |
                    \ /                                             |
                     |                                              |
                     |Yes                                           |                                       
                     v                                              |
       +----------------------------+                               |
       |  Update the sensor status  |-------------------------------+
       |          as Normal         |
       +----------------------------+                               

       Figure 5: Flow chart of abnormal fault detection
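
The flow chart in Figure 5 can be sketched in Python as follows.
This is a minimal sketch under stated assumptions: the distance is
taken to be the absolute difference of two scalar readings, only
the first parameter (the current distance) is used, and "the number
of sensors" is taken to include the sensor being checked. The
function name detect_faults and the status strings are
hypothetical.

```python
def detect_faults(values, threshold):
    """Classify each sensor as 'Normal' or 'Fault' by majority comparison.

    values: dict mapping a sensor id to its current observation value.
    threshold: largest distance at which two readings still agree.
    """
    status = {}
    n = len(values)
    for sensor, value in values.items():
        # Count the other sensors whose reading is too far from this one.
        far = sum(
            1
            for other, other_value in values.items()
            if other != sensor and abs(value - other_value) > threshold
        )
        # Figure 5: Normal if the count is below half the number of sensors.
        status[sensor] = "Normal" if far < n / 2 else "Fault"
    return status
```

For example, with the readings a=21.0, b=21.4, c=20.8, d=35.0 and a
threshold of 2.0, sensors a, b, and c each disagree with one other
sensor (d) and are classified as Normal, while d disagrees with
three of the four sensors and is classified as Fault.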
           

3.2.  Sensor Replacement for Fault Sensors

The detected faulty sensors SHOULD be replaced with supplementary
sensor nodes that are currently off, so that analysis or
monitoring can continue. A flow chart for sensor replacement is
shown below. For each faulty sensor, an off sensor SHOULD be
turned on as its replacement. If the off sensor can be turned on,
it replaces the faulty sensor. Otherwise, the off sensor SHOULD be
marked as a malfunctioning sensor. Finally, the faulty sensor MAY
be turned off.

                     /\                                             
                    /  \                                            
                   /    \                                           
                  /      \                                          
           Yes   / Are all\                                     
    (End) <-----/   fault  \ <-------------------------------------+
                \  sensors /                                       |
                 \checked?/                                        |
                  \      /                                         |
                   \    /                                          |
                    \  /                                           |
                     \/                                            |
                      |No                                          |
                      v                                            |
           +--------------------+                                  |                      
           |      Turn on       |                                  |
           |   an off sensor    |                                  |
           +--------------------+                                  |
                     |                                             |
                     v                                             |
                    / \                                            |
                   /   \                                           |
                  /     \         +---------------------------+    |
                 /Was the\    No  | Update the sensor status  |    |
                / sensor  \ ----->|      as Malfunction       |--->|
                \ turned  /       +---------------------------+    |
                 \  on?  /                                         |
                  \     /                                          |
                   \   /                                           |
                    \ /                                            |
                     |                                             |
                     |Yes                                          |                                       
                     v                                             |
            +--------------------+                                 |  
            |     Turn off       |---------------------------------+
            |  the fault sensor  |
            +--------------------+

          Figure 6: Flow chart of sensor replacement
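
The replacement procedure can be sketched in Python as follows. The
flow chart does not state whether a faulty sensor is retried after
a spare fails to turn on; this sketch assumes that the next spare
is tried. The function name replace_fault_sensors, the turn_on
callback, and the status strings are hypothetical.

```python
def replace_fault_sensors(fault_sensors, off_sensors, turn_on):
    """Pair each faulty sensor with a spare that could be turned on.

    turn_on: callable taking a sensor id and returning True on
             success (hypothetical; it stands in for sending a Start
             control message to the spare sensor).
    Returns (replacements, status): replacements maps a faulty
    sensor to its spare, status records the outcome per sensor.
    """
    spares = list(off_sensors)
    replacements = {}
    status = {}
    for fault in fault_sensors:
        while spares:
            spare = spares.pop(0)
            if turn_on(spare):
                replacements[fault] = spare
                status[spare] = "On"
                status[fault] = "Off"      # faulty sensor MAY be turned off
                break
            status[spare] = "Malfunction"  # spare could not be turned on
    return replacements, status
```

A faulty sensor that finds no working spare is left out of the
returned mapping, so the caller can tell which faults remain
unreplaced.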




4.  IANA Considerations

There are no IANA considerations related to this document.

5.  Security Considerations

The security considerations of this document are those of M2M
communication in general and of CoAP [b] in particular.

6.  References

6.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [a]  Elhadef, M., Boukerche, A., and H. Elkadiki, "Performance
        analysis of a distributed comparison based self-diagnosis
        protocol for wireless ad hoc networks", Proceedings of the
        9th ACM International Symposium on Modeling, Analysis and
        Simulation of Wireless and Mobile Systems, pp. 165-172,
        ACM, NY, USA, 2006.

   [b]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
        Application Protocol (CoAP)", RFC 7252, June 2014.

   [c]  Agarwal, Rachit, David Gomez Fernandez, Tarek Elsaleh,
        Amelie Gyrard, Jorge Lanza, Luis Sanchez, Nikolaos
        Georgantas, and Valerie Issarny, "Unified IoT ontology to
        enable interoperability and federation of testbeds", 2016
        IEEE 3rd World Forum on Internet of Things (WF-IoT),
        pp. 70-75, IEEE, 2016.
   
6.2.  Informative References
    
    

Authors' Addresses


Choong Seon Hong
Computer Science and Engineering Department, Kyung Hee University
Yongin, South Korea
Phone: +82 (0)31 201 2532
Email: cshong@khu.ac.kr

Minh N H Nguyen
Computer Science and Engineering Department, Kyung Hee University
Yongin, South Korea
Phone: +82 (0)31 201 2987
Email: minhnhn@khu.ac.kr

Shashi Raj Pandey
Computer Science and Engineering Department, Kyung Hee University
Yongin, South Korea
Phone: +82 (0)31 201 2987
Email: shashiraj@khu.ac.kr

Chit Wutyee Zaw
Computer Science and Engineering Department, Kyung Hee University
Yongin, South Korea
Phone: +82 (0)31 201 2987
Email: cwyzaw@khu.ac.kr

Seung Il Moon 
Computer Science and Engineering Department, Kyung Hee University
Yongin, South Korea
Phone: +82 (0)31 201 2987
Email: moons85@khu.ac.kr

