Internet Engineering Task Force S. Gros, Ed.
Internet-Draft L. Jelenkovic
Intended status: Informational D. Skvorc
Expires: December 17, 2016 University of Zagreb
June 15, 2016

PvD support in Linux
draft-sgros-pvd-support-in-linux-00

Abstract

The purpose of this draft is to document two implementations of parts of the PvD architecture. One implementation was done from scratch (PvD-manager), while the other extends an existing component (NetworkManager). The server-side component was also implemented, but only in an IPv6 router; DHCP was not considered due to IPR claims.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 17, 2016.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

This draft documents experiences gained while trying to implement support for multiple provisioning domains in a modern Linux desktop distribution. Two implementations were done. One implementation was written from scratch in the Python programming language; this implementation and the experiences gained are described in Section 3. The other implementation was done in NetworkManager [NMSrc]. NetworkManager is used in all popular Linux distributions and is a mature component, so the goal of the prototype implementation was to see how hard it would be to add PvD support to existing applications. The experiences gained are described in Section 4. Before the implementations are described, Section 2 presents some elements common to both of them. Finally, for testing purposes it was necessary to have a server component. For that purpose, the radvd router advertisement daemon [radvd-src] was extended to send PvD container options. This modification is described in Section 5.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Common implementation mechanisms

In this section, we describe some common implementation concepts. Basically, they follow from the preparation phase when different approaches were analysed.

First, the implementations were all done on a modern Linux operating system, more specifically the Fedora desktop distribution. The main characteristic of this distribution relevant here is that its main IPC mechanism is D-Bus [dbus]. Consequently, D-Bus is also used by the two implementations described in this draft.

Next, in the analysis phase, an assessment was done to determine which mechanisms to use, with the main goal of having control over the use of provisioning domains (PvDs) by applications. In other words, the main issue was how to control which PvD an application will use in the presence of multiple provisioning domains. The problems we tried to address are well described in RFC 6418 [RFC6418]. The conclusion was that network namespaces are the most appropriate mechanism to control application use of provisioning domains.

Still, while network namespaces were found to be the most appropriate mechanism for the separation of PvD-related network configurations, there is a big issue in that it is not possible to have the same PvD instance in separate network namespaces. To get around that issue, multiple PvD instances could be used by a single node. However, this is not without problems either. First, from the operational perspective it leads to a proliferation of IP addresses and to issues with accountability. It is an even bigger issue on IPv4 networks, which have a restricted number of PvD instances per PvD.

3. PvD-manager Implementation

PvD-manager is a client-side component for IPv6 network auto-configuration based on multiple provisioning domains (PvDs), as described in RFC 6418 [RFC6418] and RFC 7556 [RFC7556]. PvD-manager is orthogonal to the existing system, which means it does not interfere with regular network behavior: none of the services and settings used by NetworkManager and similar components are changed or affected. Source code and instructions on how to run PvD-manager are available in its git repository [PvD-manager].

3.1. Architecture

PvDs are implemented through Linux network namespaces. For each coherent PvD information set received on a network interface, PvD-manager creates a separate network namespace and configures the received network parameters within that namespace. Since each network namespace uses a separate IP stack which is isolated from other namespaces, potentially conflicting network parameters received from different network providers can safely coexist on a single host. PvD-manager manages only the newly created namespaces associated with the PvDs and their network settings and leaves the default network namespace intact. That way, all the existing network management components, such as NetworkManager, continue to work undisturbed.

Figure 1 presents an overview of a system and its components that use PvD-manager.

+---------------------------------------------------+ +------------+
| +-----------------------------------------------+ | |   +-------+|
| | PvD-manager                                   | | | +-+ radvd ||
| | +-----------+   +-----------+   +-----------+ | | | | +-------+|
| | | pvdserver +->-+   pvdman  +-<-+ ndpclient | | | | |          |
| | +-----+-----+   +-------+---+   +-----+-----+ | | | | +-------+|
| +-------|-----------------|-------------|-------+ | | +-+ httpd ||
|         |d-bus      create|configure    |RA/RS    | | | +-------+|
| get_pvds|           delete|             |HTTP     | | |          |
| +-------|-------+    +----+---------+   |         | | |          |
| | +-----+-----+ |join|   network    |  -+---------+-+-+--       -+-
| | |  MIF API  +------+  namespace   |             | |            |
| | +-----------+ |    |  operations  |             | |            |
| |               |    +--------------+             | |            |
| |   PvD aware   |                                 | |            |
| |  application  |                                 | |            |
| +---------------+                    Client (PC)  | |     Router |
+---------------------------------------------------+ +------------+

Figure 1: PvD prototype architecture overview

PvD-manager receives network configurations through Router Advertisement (RA) messages. A modified version of the PvD-aware radvd daemon [radvd] is used; the modifications made to radvd to support the MIF architecture are described in Section 5. Each RA may contain one or more network configurations, which are classified as either explicit or implicit PvDs. An explicit PvD is a set of coherent NDP options explicitly labeled with a unique PvD identifier and nested within a special NDP option called the PvD container, as described in draft-ietf-mif-mpvd-ndp-support-02 [I-D.ietf-mif-mpvd-ndp-support]. Multiple explicit PvDs may appear in a single RA, each within a different PvD container option, as long as they are labeled with different PvD identifiers. An implicit PvD is just another name for the top-level NDP options placed outside the PvD container option, as in regular PvD-unaware router advertisements. Since implicit PvDs are not labeled with a PvD identifier, PvD-manager automatically generates an identifier for internal use and configures the implicit PvD on the host in the same way as if it were explicit. Only one implicit PvD is allowed per RA. In the current prototype, a UUID is used as the PvD identifier.

Each PvD, either explicit or implicit, is associated with a network namespace with a single virtual network interface (besides the loopback) of macvlan type, on which the PvD-related network parameters are configured. To establish connectivity to the outside world, the virtual interface is connected to the physical interface on which the related PvD information was received through the RA. Each virtual interface is assigned a link-local IPv6 address (fe80::/64) and one or more addresses derived from Prefix Information options, if present in the received RA. Besides the IP addresses, PvD-manager configures the routing tables and DNS records within the namespace. By default, a link-local route and a default route via the announcing router are added to the routing table, regardless of the routing information received in the RA. Additional routing information is configured if Route Information options are received in the RA. Finally, for each RDNSS and DNSSL option received in the RA, PvD-manager creates a record in /etc/netns/NETNS_NAME/resolv.conf, where NETNS_NAME is the name of the network namespace associated with the PvD.
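
As an illustration, the following sketch, written with the pyroute2 module mentioned in Section 3.4, performs a setup of the kind described above; the namespace and interface names as well as the addresses are example values, and the exact calls made by PvD-manager may differ.

   import socket
   from pyroute2 import IPRoute, NetNS, netns

   NS_NAME, PHYS_IF, MVLAN_IF = 'pvd-example', 'eth0', 'pvd0'

   netns.create(NS_NAME)                 # namespace for this PvD

   ipr = IPRoute()
   phys = ipr.link_lookup(ifname=PHYS_IF)[0]
   # macvlan on top of the physical interface the RA arrived on
   ipr.link('add', ifname=MVLAN_IF, kind='macvlan', link=phys)
   mv = ipr.link_lookup(ifname=MVLAN_IF)[0]
   ipr.link('set', index=mv, net_ns_fd=NS_NAME)  # move into namespace
   ipr.close()

   ns = NetNS(NS_NAME)
   mv = ns.link_lookup(ifname=MVLAN_IF)[0]
   ns.link('set', index=mv, state='up')
   # address derived from a Prefix Information option (example value)
   ns.addr('add', index=mv, address='2001:db8:1::100', prefixlen=64)
   # default route via the announcing router (example link-local address)
   ns.route('add', dst='default', gateway='fe80::1', oif=mv,
            family=socket.AF_INET6)
   ns.close()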

A PvD-aware client application uses the PvD API to get a list of available PvDs configured on the local host, and activates a chosen PvD to use it for communication. Information about configured PvDs is exposed to applications by a special PvD service running on the local host; D-Bus is used to connect applications to this PvD service. Upon PvD activation, the client application is switched to the network namespace associated with the selected PvD. Further network operations (socket creation, sending and receiving data) are performed within that namespace. Once obtained by the application, socket handles are linked to the network namespace they were originally obtained from and continue to work in that namespace, regardless of whether the application switches to another namespace at some later time. This enables the application to use multiple PvDs simultaneously. The only requirement is that the application is running within the proper network namespace while obtaining a socket.
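
The binding of sockets to the namespace in which they were created can be illustrated with the following sketch; it uses pyroute2's setns() helper instead of the C API provided with PvD-manager, and the namespace names are example values.

   import socket
   from pyroute2 import netns

   netns.setns('pvd-a')     # enter the namespace of the first PvD
   s1 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)

   netns.setns('pvd-b')     # switch to the namespace of another PvD
   s2 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)

   # s1 keeps operating in 'pvd-a' although the process is now in
   # 'pvd-b', so both PvDs can be used simultaneously.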

PvD unaware clients operate as before. Although they are not able to use the PvD API to select a certain PvD, they can still be forced to use a specific PvD by starting them in the network namespace associated with that PvD. To run a program within a given namespace, it should be started with:

   ip netns exec <namespace-name> <application> [arguments]

or it can be started with one of the provided launchers ("pvd_run" and "pvd_prop_run").

As per RFC 7556 [RFC7556], the implemented PvD system provides basic, intermediate and advanced PvD support (in APIs) for client applications. The only difference is that our basic support doesn't provide automatic selection for PvD-unaware applications: such an application must be started with a PvD launcher with manual selection of the PvD. Intermediate and advanced PvD support require some additional properties (metadata) provided with a PvD. The next section describes the mechanism used to provide such information to PvD-manager, which in turn provides it to client applications.

3.2. PvD Properties

With RA messages, routers provide network-related parameters for PvDs. Other parameters that describe a particular PvD in more detail (an application can use them to better select a PvD) are in this draft called "PvD properties", or just "properties".

In this prototype implementation, PvD properties are also provided by the router, but only on request, over HTTP on the router's link-local address, port 8080. The router's link-local address is saved by PvD-manager when the RA is received.

Upon receiving PvD information from a router, PvD-manager tries to get a file with PvD properties from the same router. If such a file exists, the network-related PvD parameters are extended with the properties from the received file.
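
A minimal sketch of such a retrieval is shown below; the file name pvd.json is hypothetical (this draft does not prescribe a path), and the scope identifier is needed because the request is sent to a link-local address.

   import json
   import socket

   def fetch_pvd_properties(router_lladdr, ifname, port=8080):
       # A link-local address needs a scope (the receiving interface).
       scope = socket.if_nametoindex(ifname)
       with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as s:
           s.connect((router_lladdr, port, 0, scope))
           s.sendall(b'GET /pvd.json HTTP/1.0\r\nHost: router\r\n\r\n')
           data = b''
           while True:
               chunk = s.recv(4096)
               if not chunk:
                   break
               data += chunk
       body = data.split(b'\r\n\r\n', 1)[1]
       return json.loads(body)       # list of property dictionaries

   props = fetch_pvd_properties('fe80::1', 'eth0')   # example values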

The client application receives all these additional properties from PvD-manager and may select an appropriate PvD based on them.

The current implementation is very rudimentary: the files on the router are in JSON format. PvD-manager interprets them and creates a dictionary from them, but only because PvD-manager is written in Python, where handling JSON is easy, while client applications are written in C. In a real implementation this should be reversed: only the client should interpret the file with the PvDs' properties.

The properties used in this prototype ("name", "type", "bandwidth", "pricing") are just examples, not intended for use in a protocol specification. We present one mechanism to provide additional PvD properties obtained by some means other than RAs, and let the client application decide what to do with them.

Figure 2 presents example properties for several PvDs obtained from two routers (R1 and R2 from the test scenarios described in Section 3.5).

From R1:
[
    {
        "name": "Home internet access",
        "type": ["internet", "wired"],
        "id": "implicit",
        "bandwidth": "10 Mbps",
        "pricing": "free"
    },
    {
        "name": "TV",
        "type": ["iptv", "wired"],
        "id": "f037ea62-ee4f-44e4-825c-16f2f5cc9b3e",
        "bandwidth": "10 Mbps",
        "pricing": "free"
    }
]
From R2:
[
    {
        "name": "Cellular internet access",
        "type": ["internet", "cellular"],
        "id": "implicit",
        "bandwidth": "1 Mbps",
        "pricing": "0,01 $/MB"
    },
    {
        "name": "Phone",
        "type": ["voice", "cellular"],
        "id": "f037ea62-ee4f-44e4-825c-16f2f5cc9b3f",
        "bandwidth": "0,1 Mbps",
        "pricing": "0,01 $/MB"
    }
]

Figure 2: PvD property examples

3.3. Deployment

The PvD architecture assumes the presence of at least one router running the modified version of the radvd daemon [radvd] described in Section 5. Through RA messages, the router conveys network-related parameters to the client host (prefixes, routes, DNS servers and domains). The router should also provide PvD properties, using an HTTP server on port 8080 bound to the router's link-local IP address.

DNS servers aren't part of the PvD architecture, but they could be used to demonstrate that different PvDs can use different DNS servers.

PvD-manager is a daemon running on the client host. It currently consists of several modules. The main module maintains PvD information and creates, updates and deletes namespaces. The NDP module listens for RA messages, parses them, and forwards them to the main module. The API server module listens for client application requests (over D-Bus) and responds to them; it also sends signals to clients when a change occurs in the PvDs.

Before starting a network connection, a PvD-aware client application should first request the list of PvDs from PvD-manager. Next, one PvD should be selected (activated). In order to bind a new network connection to a given PvD, the connection must be created after that PvD has been activated, but before another PvD is activated. If other PvDs are required (later or in parallel), the same procedure must be followed: select the PvD first and then create connections before activating another PvD. In other words, all network connections created between two PvD activations are bound to the PvD that was selected first. A connection continues to operate within the PvD in which it was created, regardless of PvDs selected later.

A PvD-unaware application should be started with a PvD launcher to use a certain PvD. Otherwise, such an application behaves as if PvDs were not present ("as usual").

3.4. Implementation Details

The proposed PvD architecture uses Linux network namespaces as the PvD isolation mechanism. Namespaces resolve many issues with overlapping and conflicting network parameters for different PvDs. However, they also impose some requirements that may limit the usage of the proposed implementation in certain environments, especially ones based on public IPv4 addresses. One of the main problems with namespaces is that each namespace requires its own IP address (since a namespace emulates the network stack from the link layer up).

Only IPv6 is used in this implementation. The main reasons are the use of RA messages as the PvD information carrier and the ability to generate IPv6 addresses per PvD without restriction.

To expose PvD-related operations to applications, a library with the API was created, currently only for programs written in the C programming language. For communication between a client application (using the provided API) and PvD-manager, a D-Bus service is used. The API proposed in this implementation includes PvD retrieval methods (pvd_get_by_id, pvd_get_by_properties), a PvD selection method (pvd_activate), and registration for events when PvDs change their state (pvd_register_signal). Sample test applications which demonstrate API usage and the PvD system's possibilities are provided in the PvD-manager repository [PvD-manager].

PvD-manager is implemented in Python 3 because it allows rapid prototyping using network management modules. Two modules from the Python package repository were helpful during the implementation: pyroute2, a Python-based interface to the netlink service, and netaddr, for manipulation of network-related data structures (IP addresses, masks, prefixes, etc.).

More details about the implementation of PvD-manager are available in its documentation [PvD-manager].

3.5. Test Scenarios

The test scenarios used for validating the implementations include a system with one client host, two routers and two hosts that act as servers, as presented in Figure 3. All hosts are running as virtual machines.


             fd01::1/64 +----+                                +----+
       2001:db8:1::1/64 |    |        2001:db8:10::1/32       |    |
                        |    |        2001:db8:10::2/32       |    |
           +----------o-+ R1 +-o----------------------------o-+ S1 |
+------+   |            |    |      :       [VMnet3]          |    |
|      |   |            |    |      :                         |    |
|      |   |            +----+      : (for some tests         +----+
|Client+-o-+ [VMnet2]               :  this link is 
|      |   |            +----+      :  established)           +----+
|      |   |            |    |      :                         |    |
+------+   |            |    |      :       [VMnet4]          |    |
           +----------o-+ R2 +-o----------------------------o-+ S2 |
                        |    |        2001:db8:20::1/32       |    |
       2001:db8:2::1/64 |    |        2001:db8:20::2/32       |    |
             fd02::1/64 +----+                                +----+

Figure 3: Network configuration used in test scenarios

All server hosts, including the routers, have HTTP and DNS servers configured, providing many possibilities for testing. In RAs, the routers advertise the prefixes shown in Figure 3. Local addresses are used in explicit PvDs (simulating some specific service), while public addresses are used in implicit PvDs.

Example network configurations from RFC 7556 [RFC7556] are simulated with the topology in Figure 3 and various applications on the Client and the servers S1 and S2. S1 is accessible by the client only through the implicit PvD provided by R1, while S2, similarly, is accessible only over the PvD provided by R2.

If S1 simulates one service and S2 another, a client application can select a PvD based on the required service and connect to S1 or S2, or a PvD-aware application can use both in parallel.

A VPN was simulated by a tunnel between the Client and S2, created within the implicit PvD provided by R2. The tunnel was then added as another PvD. In this scenario, S2 had a local address (and prefix) from the same range as the local addresses on the R2 network, so there were two PvDs with the same prefix on the Client. However, client applications running in those two different PvDs and connecting to the same IP address (fd02::1) got connected to different servers: one to R2, and the other (using the "VPN" PvD) to S2.

In some scenarios, both S1 and S2 were connected to both routers R1 and R2. In these scenarios a better PvD could be chosen for connecting to the servers, provided that PvD properties are available. Also, when a connection fails (when some PvD loses connectivity), the application can reset its connections, refresh PvD availability from PvD-manager, and select another PvD among the active ones.

More details on the described scenarios (and some others) are provided in the demonstration test cases [PvD-manager].

3.6. Experiences gained

Since the main mechanism used to separate different PvD configurations from each other is Linux network namespaces, in this section we discuss their suitability as a solution to the given problem. Linux namespaces were selected as the main implementation mechanism during the design phase, based on their theoretical properties and limited hands-on experience from our previous projects. The comments that follow reflect opinions formed during the implementation and testing phase.

Using Linux namespaces still seems the best option for PvD realization despite their drawbacks. The isolation and ease of use they offer, from the PvD-manager and client application perspective, can hardly be matched by other solutions. Besides the already mentioned need for a separate IP address per namespace, there are several more issues with namespaces.

Managing namespaces (creation, deletion, modification) requires root privileges, as expected. However, even switching an application from one namespace to another is possible only if the application has root privileges. This currently limits the namespace approach exclusively to applications run by root. To lift this limitation, changes in the Linux kernel's namespace handling are required. Some sort of permission system should also be applied to namespaces (e.g., similar to permissions on files and other system objects).

Switching namespaces from within the application with the setns() system call doesn't update the DNS-related configuration as expected. When an application is started with the command "ip netns exec <namespace-name> <application> [arguments]", the DNS configuration is updated (/etc/resolv.conf is the one from /etc/netns/<namespace-name>/). However, setns() doesn't replicate that behavior, and those manipulations have to be done separately (by mounting certain directories/files).
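
The following sketch shows one way to replicate what "ip netns exec" does in this respect; the mount flags follow the Linux mount(2) interface, and the use of pyroute2's setns() helper is an assumption of this example rather than part of PvD-manager.

   import ctypes
   import os
   from pyroute2 import netns

   libc = ctypes.CDLL('libc.so.6', use_errno=True)
   CLONE_NEWNS = 0x00020000
   MS_BIND, MS_REC, MS_SLAVE = 4096, 16384, 1 << 19

   def enter_pvd_namespace(name):
       netns.setns(name)                 # switch network namespace
       # Like "ip netns exec": a private mount namespace with the
       # per-namespace resolv.conf bind-mounted over /etc/resolv.conf.
       if libc.unshare(CLONE_NEWNS) != 0:
           raise OSError(ctypes.get_errno(), 'unshare(CLONE_NEWNS) failed')
       libc.mount(b'', b'/', b'none', MS_REC | MS_SLAVE, None)
       src = '/etc/netns/%s/resolv.conf' % name
       if os.path.exists(src):
           libc.mount(src.encode(), b'/etc/resolv.conf', b'none',
                      MS_BIND, None)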

When a namespace is created, a virtual device is created and linked to a physical device (and gets assigned its own MAC address). However, if a particular physical device cannot be linked with a virtual one (e.g., a VPN device), then either the physical device must be moved into a certain PvD, or some sort of bridge has to be created with devices attached to it that can be moved into the namespace. Sometimes it is better to move the physical device into a particular namespace (PvD) and allow only some applications to use it (as with a VPN).

Namespace management operations (creating, deleting, adding a device, adding IP addresses and routes) are performed by PvD-manager and they aren't instant. Maybe that is expected and "normal", since such operations aren't meant to be performed frequently (only when something in the network changes). However, when testing frequent changes in PvDs (routers being connected and disconnected), significant delays in PvD-aware applications were detected. Sometimes PvD-manager's API server module (responsible for client communication) became unresponsive for at least several seconds. The reason could be a suboptimal implementation of PvD-manager, but also a side effect of the inherently slow network reconfiguration that PvD-manager has to apply in the Linux kernel.

A recently added Linux kernel feature (January 2016), Virtual Routing and Forwarding (VRF), seems like a promising alternative to network namespaces. However, more thorough research is needed to draw valid conclusions.

4. Implementing PvDs in NetworkManager

NetworkManager is a software component used in modern Linux distributions to control network connections. It runs as a daemon tracking and reacting to network-related events, either those coming from the network (for example, Router Advertisements) or those from the local system (e.g., a user adding a new network device, such as a USB modem). Furthermore, it exposes certain methods and properties over a D-Bus interface so that it can be controlled by different clients, the most prominent being the network manager applet and the nmcli command line tool.

Due to its importance in modern Linux distributions, it was a valuable experience to try to implement PvDs within it. Yet, NetworkManager is a very complex piece of software that wasn't designed with PvDs in mind, so implementing them wasn't a straightforward task and there were some difficulties to deal with. In the following text, we first describe how NetworkManager currently behaves in general and with respect to multiple PvDs, and then we describe one approach to adding support for multiple PvDs.

4.1. NetworkManager's Current Behavior

In this section we describe how unmodified NetworkManager behaves with respect to multiple provisioning domains.

First, we have to state that NetworkManager doesn't use network namespaces, that is, everything is kept in one network namespace that we'll call "root network namespace" or "main network namespace". So, all devices and all the configuration parameters are in one place.

NetworkManager already allows multiple configurations, that is, provisioning domains, to be received and activated/used in parallel. The following sources of configuration data are possible:

VPN Connections.

When a user activates a VPN connection, NetworkManager receives a PvD instance. In some cases, after the VPN is established a virtual device is present (e.g., OpenVPN), while in other cases there are no new devices but only appropriate additions to the existing IP packet processing path (e.g., IPsec). In any case, the interface and the associated PvD instance are assigned and present in the root network namespace. Furthermore, DNS data is merged with existing data (by modifying /etc/resolv.conf).
Concurrent active wired and wireless connections.

Again, as in the previous case, all the settings are mixed, and as a consequence of this mixing the two connections cannot be used in parallel. The only policy currently supported by NetworkManager is to prefer the wired connection for the default route. In other words, when both wired and wireless connections are active, the default route is installed to point to the wired connection. So, unless there are specific routes that use the wireless connection, it will not be used at all.
Multiple IPv6 routers on a local network.

This is an interesting use case even though it is currently not so common. Namely, there may be two IPv6-capable routers on the local network that overlap in connectivity. What NetworkManager does in this case is merge all PvDs received in RAs per device: all addresses are installed, all routes end up in the routing table, and the DNS configuration is merged into the /etc/resolv.conf configuration file.
Multiple concurrent DHCP servers on a local network.

This final use case isn't possible with the current protocol design. Namely, received DHCP offers are treated as alternatives, and there is no way for a client to use multiple concurrent DHCP servers on a single local network.

It isn't hard to see that NetworkManager mixes multiple PvD instances, which leads to very complex situations in which it is hard to control which PvD instance will be used by applications. There are standard mechanisms, like source address selection, but no DNS selection nor destination address selection. Some control is possible through default route installation. Namely, it is possible to instruct NetworkManager not to install a default route for specific connections (for example, a VPN) even though a default route was received as part of the configuration. NetworkManager uses this mechanism to enforce some simple policies, such as giving preference to a wired over a wireless connection.

Applications don't have any control over the network configurations they use, and there is no standard way for applications to find out what a specific connection offers in terms of quality of service, cost and reachability.

4.2. The Architecture and Components

We can describe NetworkManager in terms of static architecture and runtime behavior. The static architecture is reflected by the source code organization and the libraries and external programs used. The purpose of this draft is only to present the core of NetworkManager; keep in mind that there are other components which we will not cover.

When the source code archive of NetworkManager is unpacked, there is a src/ subdirectory of the top-level directory. This is the main part of NetworkManager, and there you'll find the following subdirectories and files:

devices/

Objects that allow management of networking devices. By management we mean tracking the status of devices and the network configuration data applied to devices, but also creating and removing devices of certain types. The main class/object defined here is NMDevice.
dhcp-manager/

DHCP manager (object that controls DHCP clients) and an object/class for each DHCP client supported. NetworkManager supports several DHCP clients: internal, ISC's dhclient, and systemd. By default, ISC's client is used.
platform/

Platform specific code that isolates NetworkManager from the specifics of a certain operating system. The only platform supported at the time this draft was written was Linux. The platform code is used to communicate with the network-specific parts of the Linux kernel.
rdisc/

Objects that are used to send/receive RA/RS messages. There is a platform independent object and platform dependent objects. Platform dependent objects do real receive/send operations and Linux is currently the only supported platform. NetworkManager uses libndp library here to send, receive and parse RA and RS messages.
settings/

In this subdirectory are objects that represent connections (NMSettingsConnection) and a singleton object NMSettings that manages all connection objects.
vpn-manager/

Object to manage VPN connections and NMVPNConnection object that is instantiated for each VPN established. Handling of external daemons is placed in separate software not part of the NetworkManager core archive.
nm-manager

This is the main object, NMManager, that controls everything.

Regarding run-time behavior, when NetworkManager is started the appropriate objects are created. The main objects are singletons, like NMManager, NMSettings and NMPlatform, but there are also non-singleton objects like NMDevice, which is instantiated as many times as there are devices on a system. Since objects, in general, run concurrently, one important aspect of initialization is connecting the different objects. After initialization finishes, NetworkManager waits for external events and reacts appropriately.

4.3. Existing NetworkManager API

NetworkManager has an API that is targeted primarily at management applications. It could also be used by networking applications, but this is currently not the case. The reason is probably the restricted functionality of the API, in the sense that applications couldn't gain much by using it.

There are two important concepts necessary to understand the API provided by NetworkManager:

  1. NetworkManager exposes objects over D-Bus. By exposing we mean that applications can find out about them, query them, invoke methods on them, etc. The objects are parts of NetworkManager, some of which we mentioned in the previous section. Some objects are permanent throughout the life cycle of NetworkManager, like the main NMManager object, while some are dynamic, like NMDevice and NMActiveConnection objects that come and go depending on the current state of networking on a particular node.
  2. Each object implements one or more interfaces, usually only one. An interface contains methods, properties (variables) and signals. Methods allow callers to initiate some action on an object, properties hold the state of the object, while signals are an asynchronous mechanism used to notify interested parties that something changed on an object.

The API is accessible over D-Bus in raw form, but there are wrappers for the C programming language (libnm) and for Python that allow easier use of the API. Because any application can invoke methods on objects, some of which are system wide, there is an authorization framework within the API that controls who is allowed to access certain objects.
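
As an illustration of using the raw D-Bus API, the following sketch (based on the dbus-python binding) lists the currently active connections; the object path and interface names used are the standard NetworkManager ones.

   import dbus

   bus = dbus.SystemBus()
   nm = bus.get_object('org.freedesktop.NetworkManager',
                       '/org/freedesktop/NetworkManager')
   nm_props = dbus.Interface(nm, 'org.freedesktop.DBus.Properties')

   # ActiveConnections (see below) is a list of D-Bus object paths.
   for path in nm_props.Get('org.freedesktop.NetworkManager',
                            'ActiveConnections'):
       ac = bus.get_object('org.freedesktop.NetworkManager', path)
       ac_props = dbus.Interface(ac, 'org.freedesktop.DBus.Properties')
       print(ac_props.Get(
           'org.freedesktop.NetworkManager.Connection.Active', 'Id'))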

Some more interesting objects exposed through this API are:

NMManager

The main object, NetworkManager itself. It has methods to activate and deactivate connections, add new connections, check connectivity, etc.
NMSettingsConnection

A specific connection in NetworkManager is represented with this object. It consists of settings necessary to establish, maintain, and terminate a connection. There are many objects of this type (class) and they are initialized from system files. Note that this object is used for wired, wireless, VPN and any other connection.
NMSettings

Singleton object that manages all the connection settings, that is NMSettingsConnection objects in a NetworkManager. Through this object it is possible to get a list of all configured connections on a host.
NMDevice

Object, one per network interface on a system, that stores information about network interface and allows the interface to be managed (activated, deactivated, etc.).
NMActiveConnection

This object represents each activated connection on a node.

Some more interesting methods exposed through this API are:

method:ActivateConnection

This is a method on the main NMManager object that is used to activate a connection. So, for example, when a user selects an option to activate a WiFi connection, the user action ends with a call to this method on NetworkManager.
method:AddAndActivateConnection

This method allows a new connection to be added and immediately activated. Also available on NMManager object.
property:ActiveConnections

Property that contains a list of active connections. Exposed on NMManager object.

4.4. Options to support network namespaces

The intention of implementing support for network namespaces was to allow applications to be isolated so that they use specific network connections. Note that network namespaces in the Linux kernel have some characteristics that anyone who uses them must be aware of:

  1. The only way a process can be moved between network namespaces is for the process to move itself using the setns() system call. There is no way for one process to move another process between network namespaces.
  2. To move between network namespaces it is necessary to have appropriate permissions. Unprivileged user processes cannot move themselves between network namespaces.
  3. A single device can be present in only one network namespace at a time. If, for some reason, it is necessary to have a device present in two or more network namespaces, then clones (or bridges) have to be used.
  4. PvD instances can be present in only one network namespace at a time. So, if a node receives a PvD instance, and this instance is already active in one network namespace, then there is no way this PvD instance can be activated in any other network namespace. PvD instances are received via DHCPv4/DHCPv6 or VPN connections. In such cases, if multiple applications must use the PvD instance, then they have to be in the same network namespace. If, on the other hand, the node receives a PvD (via an RA message), then it can create as many PvD instances as necessary and place them in different network namespaces. The only potential drawback of this approach is the explosion in the number of IPv6 addresses on the network.

So, the idea is the following. An application is started in some network namespace. This can be done easily (e.g., see the 'ip netns exec' command). Then, connectivity within this network namespace is manipulated either by the application itself or by some third application (which could be nmcli). Manipulation means that requests are sent to NetworkManager via D-Bus to make changes to the network namespace. The changes can be activation and deactivation of certain connections. NetworkManager, based on those requests and on the specifics of the connections and the devices those connections are bound to, determines what to do. For example, it can create a virtual device in the network namespace, or it can move a physical device. Basically, this part isn't important to the application itself; the only thing that is important is that the application is assigned the requested connections.

Due to the restrictions mentioned at the beginning of this section, implementation must behave in the following ways:

  1. Every process that is started will be started in some network namespace and will stay there until it terminates, as will any children it creates during execution.
  2. If a PvD instance is received, it is assigned to a single network namespace. Note that this doesn't mean it is the only PvD instance in the given network namespace; it only means it isn't available anywhere else. Received PvDs, on the other hand, can be made available in any network namespace by creating as many PvD instances as necessary.
  3. Some combinations won't be possible to achieve. For example, suppose some process P1 should only use PvD1 and is thus placed within the network namespace NS1. If a process P2 then comes along that should use PvD1 AND PvD2, it won't be possible to start P2 without violating P1's restrictions.

One additional requirement for the network namespace implementation in NetworkManager is to be compatible with network namespaces as implemented by the iproute2 package [iproute2]. The ip command, which is part of that package, creates a file in the /var/run/netns directory for each network namespace created. It also allows commands to be run within a specific network namespace and allows the /etc/resolv.conf file to differ between network namespaces.

When analysing how to modify NetworkManager to support network namespaces, different options were discussed. One of them was to have a separate instance of NetworkManager for each network namespace, which translates to creating as many NMManager objects as necessary. This approach was abandoned very early because it was immediately obvious that it is too inefficient: too much code duplication, unnecessary duplication of functionality, and very difficult communication and exchange of resources between network namespaces.

The conclusion was to start by introducing two new objects: one that allows management of network namespaces (NMNetnsController) and another (NMNetns) that is instantiated for each new network namespace and represents that namespace. Additionally, when NetworkManager starts, it creates one object of type NMNetns to represent the root network namespace.

4.5. Options to support PvDs in NetworkManager

As always, the same goal can be achieved in multiple ways, so here are the options for how PvDs can be implemented within NetworkManager. Basically, there are two main approaches: existing objects can be enhanced so that they can represent PvDs, or a completely new object can be introduced.

4.5.1. Using NMSettingsConnection object to store PvD and PvD instance

Each network connection (which is not the same as a PvD or PvD instance) is stored in an NMSettingsConnection object. Those objects are generated from static files or dynamically during NetworkManager's execution. NMSettingsConnection objects are initialized from the following sources:

Distribution configuration files. System dependent network configuration files (e.g. /etc/sysconfig/network-scripts for RHEL based systems) are read by NM via plugins and NMSettingsConnection objects are created as a result.

Network manager specific configuration. NetworkManager has its own configuration files that are stored in /etc/NetworkManager/system-connections/.

Dynamically created configurations. While running, NetworkManager allows new configurations to be created via D-Bus interface.

Note that NetworkManager has a concept of profiles that are used in the case of wired networks. Basically, those are settings which are not bound to any specific network interface. Profiles can have 802.1X type credentials assigned to them.

So, the idea for integrating PvDs into NetworkManager is to create a new NMSettingsConnection object for each new PvD or PvD instance. The NMSettingsConnection object would be extended with a PvD ID parameter.

There are several potential problems with this approach:

  • There is a difference between NMSettingsConnection on the one hand, and PvD and PvD instance on the other hand. For example, some NMSettingsConnection objects define a network connection that should be configured using DHCP, and in that case the NMSettingsConnection is neither a PvD nor a PvD instance. On the other hand, an NMSettingsConnection can be the same as a PvD instance; this is the case with static IPv4 configurations in which a user specifies concrete IP addresses. Finally, an NMSettingsConnection can be a PvD only in the case of IPv6 when the host part is generated from the MAC address.
  • When PvDs and PvD instances are received, they are valid only for the interface on which they were received. But a user can request any NMSettingsConnection object to be activated on any interface, which isn't possible in that case.
  • Also, this can create confusion. Take, for example, a preconfigured NMSettingsConnection which is now treated as a PvD with a specific PvD ID and is defined to use DHCP for configuration. Obviously, this PvD ID is expected to be valid on a certain interface at a specific attachment point. But due to the way the interface is configured (DHCP), it can actually be activated on any interface on any network that supports DHCP. Thus, it might easily happen that a user by mistake activates this particular NMSettingsConnection on a "wrong" network, making the user believe the network is active while in reality it is not.
  • Note that even NMSettingsConnection objects that contain credential information aren't guaranteed to retrieve the same PvD every time the connection is made. Namely, there are AAA servers and infrastructure that allow clients with the same credentials to connect to multiple networks, and thus to potentially receive multiple PvDs.
  • Finally, the problem is that on a single network interface only one NMSettingsConnection object might be activated, and this prevents having multiple PvDs on a single interface.

Those problems are not unsolvable, i.e., they could be solved by modifying certain aspects of NetworkManager in general, and NMSettingsConnection in particular.

4.5.2. Treating NMActiveConnection object as PvD instance and PvD

Whenever a connection is made in NetworkManager, an object is created. There are two classes for this object, both of which inherit from the NMActiveConnection base class; which class is used depends on the type of connection. The only distinction made is between VPN connections, which are represented by NMVPNConnection objects, and other connections, which are represented by NMActRequest objects. The main task of NMActiveConnection is to bind NMSettingsConnection objects with NMDevice objects.

The idea in this case is to treat NMActiveConnection as a PvD or a PvD instance, i.e., for each new PvD or PvD instance received, a new NMActiveConnection is created.

But, there are still some problems:

  • Since NMActiveConnection objects are transient, there would be no history of PvDs used. This might or might not be a problem, depending on whether we need this history or not.

    One case where the history would be necessary is if we cache some information for the next time we connect to the given PvD. A second case is if there are processes still using a PvD through the API, so that the information about the PvD must live until the process dies. Note that this latter problem could be solved by delayed removal of NMActiveConnection objects or by some asynchronous mechanism informing applications that a specific NMActiveConnection isn't available any more.
  • The second problem is whether two NMActiveConnection objects could exist that were created from the same NMSettingsConnection object, i.e., whether NMSettingsConnection objects can be shared.
  • The third problem is that two or more PvDs may be received within a single NMActiveConnection, which requires that NMActiveConnection be a factory for itself.

4.5.3. Using NMIP4Config and NMIP6Config objects for PvDs and PvD instances

NetworkManager has objects/classes for storing IPv4 (libnm-core/nm-setting-ip4-config.c) and IPv6 (libnm-core/nm-setting-ip6-config.c) settings. More precisely, those objects are used to expose the network settings of devices to the rest of NetworkManager. So, in some way they are PvDs, in the sense that each of them contains enough information to allow a connection to the network.

The problem is that internally NetworkManager keeps a single IPv4/IPv6 configuration object per device and in addition it merges all received configuration data on a single interface.

Specifically, in the case of configuration data received in RAs, everything is kept in the NMRdisc object defined in src/rdisc/nm-rdisc.h, where you'll find arrays of received configuration data. NetworkManager assumes that a single router sends all the configuration data. This assumption is not valid on a multihomed network, or on a network whose routers send multiple provisioning domains within each RA. What would be necessary is to change this structure so that configuration data is kept separate for each router and provisioning domain.

The problems in this case are:

  • NMIPxConfig objects were not intended to keep information about available IPv4 and IPv6 addresses, but to expose the addresses already configured on a device. So, this reverses the purpose of those objects, which isn't well received.
  • Again, those are transient objects and thus there is no history. It is possible to keep every object alive, but NetworkManager isn't designed to behave in such a way.
  • It seems that in libnm there is no way to obtain a list of IPv4 and IPv6 configuration objects.

4.5.4. PvD specific objects

This is the final alternative and the most intrusive one. The idea is that settings, active connections, and IPv4 and IPv6 objects/classes stay as they are; instead, when each new connection is established, a new PvD data structure is created. The PvD is either inferred from the configuration settings, or NetworkManager received an explicit PvD.

This would solve the problem that some settings might be used to obtain different PvDs, which isn't known until a connection is established. For example, if we are using DHCP to configure the interface, then the PvD received depends on the point of attachment (PoA).

It would also solve the problem that the user might try to instantiate one PvD while some other one is actually in use. This way, after the connection is established, the appropriate PvD is searched for, or a new one is created.

This is the most intrusive change; it would require changes in the APIs and thus break compatibility with existing applications (or require a completely new API).

4.6. Implementation

To understand and modify NetworkManager, it is necessary to understand the GObject system. GObject was created to make object-oriented programming possible in the C programming language and to allow easy integration of objects with D-Bus. But because GObject isn't strictly related to PvDs, nor is it necessary to know it to understand the following text, it won't be explained further in this document. Still, in order to make changes to NetworkManager, knowledge of GObject is mandatory.

The implementation of PvD support within NetworkManager was done with the following requirements in mind:

  • For backwards compatibility, the root network namespace is left as is and is handled by NetworkManager as before, i.e., it contains all received configurations merged into one.
  • NetworkManager doesn't touch network namespaces created by other applications, such as different virtualization solutions.

The NetworkManager sources described by this draft are available in a GitHub repository [nmpvdsrc]. Additionally, NetworkManager uses the libndp library to parse and create RA and RS messages. This library was also modified as part of this project to support the PvD container option [I-D.ietf-mif-mpvd-ndp-support]. The modified libndp code can also be found on GitHub [libndp].

The first step towards a PvD implementation in NetworkManager was to add support for network namespaces. Once network namespaces were implemented, PvD support was added. So, in the following text we first describe how network namespaces were added and then how PvDs were implemented.

4.6.1. Network namespace management

This section documents the process of implementing support for network namespaces in NetworkManager. When network namespaces are available, NetworkManager can manage different network namespaces and the connections within them. Applications can then be started in certain namespaces in which only specific connectivity is available. This allows control over what applications can use.

The best use case for network namespaces is VPN connection (PvD instance) isolation, and it was implemented as an example of how the network namespace implementation within NetworkManager can be used. When a VPN connection is activated, it is isolated within a special network namespace where the only connectivity is via the VPN connection. Then, only specific applications are started within this network namespace; those applications are able to access VPN resources, while all other applications, being in different network namespaces, do not see the VPN connection and thus cannot use it.

Note that at the time this draft was written, Thomas Haller, one of the developers of NetworkManager, had implemented basic support for namespaces in NetworkManager that takes a somewhat different approach with respect to some implementation details, but nevertheless intends to use certain functionality produced as part of this project. This means that there is a good chance of having support for network namespaces in an official release of NetworkManager.

The following changes were made to NetworkManager in order to introduce support for network namespaces.

First, a new object, NMNetnsController, was added to NetworkManager; its purpose is to allow management of all network namespaces controlled by NetworkManager. This object implements the interface org.freedesktop.NetworkManager.NetworkNamespacesController, which has methods that allow users to create a new network namespace or to remove an existing one. It is also possible to obtain a list of existing network namespaces. Network namespaces created by NetworkManager are compatible with iproute2 network namespaces.

Then, a new class, NMNetns, was added that represents a single network namespace. When a new network namespace is created, a new NMNetns object is created and exposed on D-Bus. This object allows manipulation of the network namespace via the interface org.freedesktop.NetworkManager.NetNsInstance. For example, it is possible to get a list of all devices within the network namespace, take a certain device from some other network namespace, and activate a connection.
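
A hypothetical sketch of how a client might use these interfaces over D-Bus is shown below; the object path and the method names AddNetworkNamespace and ListNetworkNamespaces are assumptions made for illustration only, the authoritative names being those in the modified sources [nmpvdsrc].

   import dbus

   bus = dbus.SystemBus()

   # The object path and method names below are hypothetical.
   ctrl = bus.get_object(
       'org.freedesktop.NetworkManager',
       '/org/freedesktop/NetworkManager/NetnsController')
   ctrl_if = dbus.Interface(
       ctrl, 'org.freedesktop.NetworkManager.NetworkNamespacesController')

   ns_path = ctrl_if.AddNetworkNamespace('vpn-ns')  # hypothetical method
   print(ctrl_if.ListNetworkNamespaces())           # hypothetical method

   ns = bus.get_object('org.freedesktop.NetworkManager', ns_path)
   ns_if = dbus.Interface(
       ns, 'org.freedesktop.NetworkManager.NetNsInstance')
   # TakeDevice is the method mentioned later in this section; its
   # exact signature is not specified in this draft.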

NMSettings is now a singleton object. This wasn't a significant change, because there was only one object of this type before, but now it is more explicitly exposed as such.

NMPlatform, NMDefaultRouteManager and NMRouteManager aren't singleton objects any more. They are now instantiated for each new network namespace that is created.

When the infrastructure work was finalized, it was time to add the first user of network namespaces as a proof of concept that the approach is feasible. As mentioned, VPN is the most straightforward case of network isolation, because of the assumption that, for security reasons, applications either have access to the VPN or they don't.

At the beginning of the implementation of VPN isolation, there was a doubt about where to place the knowledge, and thus control, of the network namespace. Two places were candidates: the NMActiveConnection and NMVPNConnection classes. NMActiveConnection is the base class of the NMVPNConnection class, but also of the NMActivateConnection class that is used for all other connections. Thus, implementing the knowledge in the NMActiveConnection class would make it available to all connections in NetworkManager. In the end, it turned out that modifying NMVPNConnection is the better approach, because it was necessary to introduce new configuration parameters in the configuration file, specific to the VPN type of connection, that specify that isolation is required. Some additional options were also necessary. In the end, the following options were added:

netns-isolate

A boolean parameter (yes/no) which defines whether the VPN connection should be isolated within a network namespace or not. For backwards compatibility reasons, isolation is not enabled by default.
netns-persistent

Whether the network namespace should be persistent (yes) or not (no). A persistent namespace is retained when the VPN connection is terminated, while a non-persistent one is removed.
netns-name

The network namespace name. The special value "uuid" means the connection's UUID should be used, and the special value "name" requests that the connection's name be used. Finally, any other string is taken as-is and used as the network namespace name.
netns-timeout

How long to wait (in milliseconds) for the device to appear in the target namespace. Namely, when the VPN activates, the virtual device appears in the network namespace of the interface used to connect to the Internet. It is therefore first moved to the target network namespace (an asynchronous operation), and then all the received configuration parameters are applied.
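
A hypothetical connection keyfile fragment using these options might look as follows; the section in which the keys are placed and the service-type value are assumptions of this sketch.

   [vpn]
   service-type=org.freedesktop.NetworkManager.openvpn
   netns-isolate=yes
   netns-persistent=no
   netns-name=uuid
   netns-timeout=5000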

Basically, the implementation is such that when the device appears in the root network namespace, it is taken from there (using the TakeDevice method, called directly rather than via D-Bus). When the device appears in the target network namespace, network parameters are assigned to the interface. This was tested with the OpenVPN type of VPN.

The implementation has two problems. First, it does not handle VPN connections that don't create virtual devices but instead just modify packet processing rules in the Linux kernel (i.e., XFRM). Second, hostname and name resolution parameters aren't assigned, because the infrastructure is lacking in that respect.

4.6.2. Experimental PvD Implementation

The first implementation of PvDs was done using NMIP6Config as a PvD container. Before describing the implementation, we have to state that the only mechanism currently able to carry explicit PvDs is RA messages. For that reason, NMIP4Config wasn't changed. NMIP6Config objects were extended with a PvD ID field. At first, there was support for different types of PvD IDs, and the first implemented type was a UUID stored in ASCII format. Later in the development process, support for multiple PvD ID types was removed and the only possible type is UUID. It seems that this doesn't make the implementation less flexible, and at the same time it substantially reduces complexity.

When an RA is received from a router that doesn't support PvDs, it is first processed as usual, and then a new implicit PvD is created from the data in the RA. The ID of the implicit PvD is calculated by concatenating the data unique to the PvD (network prefix, network prefix length, DNS servers, DNS search domains) and hashing it using MD5. The result is then used as the UUID of the given PvD. Note that if there are two or more routers on the local network, each sending its own configuration data, the PvD IDs will differ and a separate PvD will be created for each RA.
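
A sketch of this derivation is shown below; the exact concatenation order and separator are assumptions made here for illustration, only the general scheme (an MD5 hash over the PvD-defining data, reused as a UUID) follows the description above.

   import hashlib
   import uuid

   def implicit_pvd_id(prefix, prefix_len, dns_servers, search_domains):
       material = '|'.join([prefix, str(prefix_len)]
                           + sorted(dns_servers) + sorted(search_domains))
       digest = hashlib.md5(material.encode('ascii')).digest()
       return uuid.UUID(bytes=digest)  # 128-bit MD5 digest reused as UUID

   print(implicit_pvd_id('2001:db8:1::', 64,
                         ['2001:db8:1::53'], ['example.org']))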

When a PvD container option is present within the RA message, its content is parsed as a "normal" RA message would be, but it is then only placed in the list of available PvDs on the given interface. This configuration is not merged with the implicit configuration in the root network namespace.

All PvDs obtained from RAs are then handed to the NMDevice object, which keeps the old behavior of merging data from implicit PvDs to make the configuration available to the host, but now there is also additional processing of PvDs. A hash table with the set of PvDs received on the given interface is added to the device. PvDs are managed in this hash table, i.e., added or removed according to the events received on the network. Note that PvDs bound to a device are only valid on that device, i.e., they cannot be used on any other device that doesn't have the same connectivity.

In order for applications to be able to query received PvDs, they are exposed in two places. First, they are exposed via D-Bus on the device object itself (NMDevice). So, any application can query a device to find out if there are PvDs available. The same information is also exposed on the NMActiveConnection object, which basically acts as a proxy to the NMDevice object that is part of the NMActiveConnection object. When an application queries for available PvDs, what it actually gets is a list of D-Bus paths to objects with the details.

Details about PvDs are available through NMIP6Config objects. When a new PvD is created, an object of class NMIP6Config is created and automatically exposed on D-Bus. By looking at the properties of those objects, details about PvDs can be determined.
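
As an illustration, a client written against dbus-python might enumerate the PvDs of a device along the following lines. The property name "PvDs" is hypothetical, since the final form of the API is still being discussed (see Section 4.7); the service and interface names are the standard NetworkManager ones.

  import dbus

  bus = dbus.SystemBus()
  device = bus.get_object("org.freedesktop.NetworkManager",
                          "/org/freedesktop/NetworkManager/Devices/0")
  props = dbus.Interface(device, "org.freedesktop.DBus.Properties")

  # Hypothetical property holding D-Bus paths of the available PvDs.
  pvd_paths = props.Get("org.freedesktop.NetworkManager.Device", "PvDs")

  for path in pvd_paths:
      pvd = bus.get_object("org.freedesktop.NetworkManager", path)
      pvd_props = dbus.Interface(pvd, "org.freedesktop.DBus.Properties")
      # Each path points to an NMIP6Config object exposing the details.
      print(path, pvd_props.GetAll("org.freedesktop.NetworkManager.IP6Config"))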

4.7. Experience Gained and Future Work

The current work is focused on the design of an appropriate API that will be exposed over D-Bus to allow applications to activate and deactivate specific PvDs. It is not yet finished because there is no mapping between PvDs and current objects - like connections, devices, settings. Yet, two principles emerged:

  • The changes to the existing API should be minimal. In other words, we try to avoid introducing a parallel API alongside the existing one.
  • Applications should be isolated from the specifics of a network configuration, i.e. they shouldn't know or care which devices contain specific PvDs or how they should be composed, etc.

Part of the API design is to determine the most appropriate structure to keep information about PvDs. The current use of the NMIP6Config object isn't good for two reasons. First, it wasn't designed for that purpose, and second, existing API methods don't allow objects of type NMIP6Config to be passed as arguments. In the next iteration the most probable candidate for storing PvD information is the NMSettingsConnection object.

After the API form is resolved, there are at least a few more things that have to be sorted out before the work can be considered finished:

  • The API is still low level. It would be good to allow applications to specify requirements for the connections and then, based on these requirements, NetworkManager selects an appropriate connection for the application.
  • Host name management has to be properly handled. Currently, this is ignored by the PvD handling code within NetworkManager. Still, this won't be easy because it might happen that two PvDs have to be available to an application, each with a different hostname.
  • DNS resolution is also currently lacking, especially considering there are existing RFCs that allow domains to be specified for each DNS server and the new drafts that appeared recently in the MIF WG.

4.8. Unresolved issues

While working on the PvD implementation in NetworkManager, some issues came up that were not resolved during the prototype implementation. They are listed in this section.

Issues related to the content of PvD payload:

  • It is important to know what information belongs to a PvD. Namely, if there is a PvD container option that holds all information belonging to a specific PvD, then it is clear that information contained in the header of the RA message can't be included within container options. Information such as the gateway address, hop limit, router lifetime and timers, and the M and O flags must be common to all content of the RA.
  • A PvD has to contain one or more gateway addresses. The problem is that this information isn't an option and thus cannot be embedded within a PvD container. So, all PvDs sent by some router will have the same gateway address. Note that multiple routers can send RAs with the _same_ PvD in it. In such cases gateway addresses are merged.

Issues related to the difference between PvD and PvD instance:

  • In MIF WG documents the term "PvD" is used with two different meanings. The first one is a concrete set of parameters bound to a specific interface, most notably a specific IP address that should be used by the host for its interface. The second one is more general and encompasses a set of PvD instances, thus creating some kind of a PvD class. The characteristic of a PvD (as opposed to a PvD instance) is that it contains parameters bound to a local network, i.e. it has a network prefix instead of a specific IP address. Obviously there could be other parameters besides IP addresses that belong to a PvD instance, but that analysis wasn't done as part of the work that produced this draft. The distinction between PvD and PvD instance is more visible in IPv6 than in IPv4 due to the way they are configured on end hosts.

4.8.1. Identifying implicit PvDs

In this section we discuss how PvD-aware clients should identify implicit PvDs they receive. "Implicit PvDs" are those that contain no PvD-related identification information, like specific options with identifiers, container options, etc. In this class are all legacy mechanisms, like DHCP, RA, etc.

The idea is to generate a stable identifier, independent of the PvD instance, based on a set of configuration options that uniquely identify a PvD. This would allow PvD-aware applications to be more uniform in the sense that they don't need any special functionality (or it is minimized) to support legacy configurations. The requirements are also:

  • Two independent clients, or a client and a server, generate the same PvD ID, which makes it possible to apply more specific configuration based on the PvD ID. A more specific configuration might, for example, describe the purpose of some specific PvD.
  • The process of generating a PvD ID should allow clients to detect connectivity changes, i.e. to know when they are connected to another network, or when the connectivity of the existing network changed.
  • The process of generating a PvD ID should be such that all PvD-aware nodes receiving the same set of configuration parameters (i.e. the same implicit PvD) would assign it the same PvD ID.

The following elements could comprise a set of parameters that uniquely identify a PvD:

  • All network prefix(es) used on a local network.
  • Routes specific to the gateway that sent the RA.
  • Addresses of DNS servers.
  • Domain names served by DNS servers.

Time values are not taken into account when determining to which PvD given configuration parameters belong. Also, it is still under consideration whether or not the following parameters should be taken into account when determining PvD identities:

  • hop_limit
  • mtu
  • dhcp_level
  • Different lifetimes.

Note that in some cases the same set of network(!) configuration parameters could be used on multiple different networks. This is the case with IPv4 private address space, in which the combination of IP network mask, gateway and DNS server doesn't have to be unique! Take for example residential networks, or different companies. This might pose a problem for determining the ID of an implicit PvD by calculating a hash of network layer parameters. In IPv6 this isn't a problem due to the use of private addresses and/or MAC addresses when generating specific addresses.

To generate a PvD ID from the set of configuration parameters, some form of hashing over a precisely defined set of configuration options might be used. The UUID algorithm might be used as well, as sketched below.
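
One possibility is to canonicalize the chosen parameters (sorting each list so that all nodes serialize the same set identically) and feed the result to the name-based UUID algorithm. The namespace UUID and the serialization format in the sketch below are assumptions, not part of any specification.

  import uuid

  # Hypothetical namespace shared by all PvD-aware nodes; any fixed,
  # well-known UUID would do.
  PVD_NAMESPACE = uuid.NAMESPACE_DNS

  def stable_pvd_id(prefixes, routes, dns_servers, search_domains):
      # Sort each parameter list so that nodes receiving the options in
      # a different order still produce the same canonical string.
      canonical = ";".join(sorted(prefixes) + sorted(routes) +
                           sorted(dns_servers) + sorted(search_domains))
      return uuid.uuid5(PVD_NAMESPACE, canonical)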

4.8.2. Managing PvD lifecycle

The PvD lifecycle consists of the steps in which PvDs are created and communicated to the nodes, used within the nodes, and finally removed when the node loses connectivity. Here are some of the questions that should be answered by defining node behavior:

  • PvDs are received through an interface, and very likely they should be bound to this interface only, or more generally to the network on which they were received. This also means that when the interface stops working, PvDs received on that interface should disappear too. The question is what should happen if the "main" implicit PvD disappears? Should all related PvDs be removed too?
  • Should applications know about the relationship between PvDs and interfaces, or should they only know about PvDs, while some more advanced applications would also know about the connection? In other words, when a PvD-aware application requests information about available PvDs, should it also receive information about the interfaces on which those PvDs were received? Should returned PvDs be grouped by the interfaces on which they were received? Or should they be returned flat, without any information about the interfaces they are bound to?
  • When a node first learns about a PvD, how long is this information retained? Until the connection ends, or forever with some indication that the PvD isn't active? Note that this doesn't mean that a PvD is used after the connection ends, it only means that applications can find out about PvDs even though they are not active. Presently, this is how information about APs in wireless networks behaves, i.e. they are all remembered. But the behavior in wired networks is different.
  • Suppose that on a single link there is an implicit PvD and at least one explicit PvD. What should a node do when the implicit PvD disappears but the explicit one is still present? Could this happen?
  • If we take the approach of sending PvD configuration options via RA messages, how can a router remove a specific PvD? The only possibility at the moment is to send all the options with the lifetime set to zero. The problem is that not all options support a lifetime, so it is not possible to remove only a specific explicit PvD.
  • There might be cases when two PvDs have to be merged, for example, if they have to be used on the same interface. Should there be a set of rules that govern the merger? NetworkManager currently merges all configurations in the root network namespace. Note that there are two different mergers: one that merges interface-specific configuration data and another that merges node-specific configuration data (e.g. DNS addresses, default routers).
  • From the management perspective, there is a possibility of proliferation of IPv6 addresses used by a node. Should this be bounded, or controlled in some way?
  • How to handle multiple DHCP servers on a local network (or accessible via a DHCP relay) serving different PvDs? Currently, a node selects one DHCP server and ignores the others. So, using current mechanisms there is no way a node might configure multiple PvDs from multiple DHCP servers.

Some questions are not specific to PvDs but are also valid for NDP in general, so they might already have an answer. Still, for completeness and until we figure out the answers, here they are:

  • There are multiple lifetimes in an RA message, and in a PvD in particular. What if one element of a PvD times out and the others don't? Should only this particular element be removed, or the whole PvD? For example, what if a prefix disappears and the other elements don't?
  • What if one RA message contains a certain set of parameters, but some of those parameters are missing from the following message? How should an implementation react in that case?

4.9. Comments on MPvD Architecture RFC document

In this section a few comments about the MPvD architecture RFC [RFC7556] are given in light of the implementation of PvDs described earlier.

The MPvD architecture document distinguishes between implicit and explicit PvDs. An additional distinction is necessary between PvDs and PvD instances.

In Section 2.2 the architecture document talks about treating information received on different interfaces as being different PvDs. It might happen that different implicit PvDs are received on a single interface. Thus, this should be better defined, i.e. a more formal process for distinguishing PvDs and PvD instances is needed.

Section 2.3 introduces the possibility that one PvD is present on multiple interfaces. This is something that complicates implementation due to the way routing in operating systems works. Additionally, there are different bridging solutions on L2 that might solve this case much more easily.

Different types of PvD identifiers are introduced in Section 2.4. In this implementation only UUID is used, and to present the user with something meaningful other mechanisms will be used. For example, a connection description might be obtained via HTTP from an appropriate URL, or via a DNS query.

Section 2.5 talks about the relation between IPv6 and IPv4 addresses. From the implementation perspective it seems easier to keep those separated and treat them as separate PvDs. One reason is that those two are usually configured separately. Thus, to keep things simple, it is best to keep them separated.

Section 3.2 talks about the security of PvDs. This is a very complex topic and it wasn't touched in this implementation.

In Section 4, three examples of multiple PvDs being present are given. Of those, only the VPN case (example 4.2) was implemented fully, it being the most straightforward.

Example 4.1 is interesting because it already works. When NetworkManager receives configurations via both wired and wireless networks, it gives precedence to the wired connection by manipulating the default route. Because both networks, wired and wireless, are from the same provider, the DNS servers are very likely the same, so this case is, in some way, already solved. Additionally, it is very questionable whether network namespace separation would be useful in this case. Namely, the two networks are alternatives: if they are separated and an application is using the wired network which at some point disappears, the expectation is that the wireless network could then be used, but because of the separation it cannot. Note that the assumption here is that the wired network should be used first by the application; if that were not the case, then network separation would be necessary.

Example 4.3 applies to a gateway, where devices behind the gateway are not aware of different provisioning domains. Yet, a host might also find itself in this situation, so this example is also interesting for nodes. This functionality is planned for future work.

RFC7556 [RFC7556] sets a requirement in Section 5 that different PvDs not be merged. Applied to the Linux OS, this means that network namespaces are the only way to go. There are other options, but they all seem to be a lot more complex.

As a to-do item, it would be beneficial to write a draft that defines a node's behavior in the presence of multiple PvDs. This can be done in terms of multiple virtual hosts on a single host (which are basically network namespaces).

5. Server component

The server-side component of the MIF architecture implementation described in this document is the Router Advertisement Daemon (radvd). radvd is used to announce PvD-related network configuration data to the clients using the NDP protocol. The PvD-aware radvd extension allows clients to autoconfigure their network parameters related to multiple PvDs simultaneously. For the purpose of the MIF architecture implementation, the existing structure of the Router Advertisement (RA) message is extended with a new NDP option that encapsulates PvD-related network configuration information as suboptions. To configure which PvD-related network configuration information to include in RA messages, the radvd configuration file is extended with a new configuration block where this information is specified.

The proposed extension of the Router Advertisement Daemon is backward compatible with the non-PvD-aware version of radvd. This means that old clients will continue to work normally, except that they will not be able to configure PvD-related network information on their network interfaces. The new option introduced into the RA message is silently ignored by non-PvD-aware clients if they are implemented correctly.

5.1. Router Advertisements (RA) message extension

The PvD-aware version of radvd supports the inclusion of network parameters that belong to explicitly identified provisioning domains into Router Advertisement (RA) messages. An explicitly identified provisioning domain is a set of consistent network configuration parameters identified by a unique provisioning domain identifier (PvD ID).

To carry PvD-related network configuration information, the structure of the RA message is extended with a new NDP option called the Provisioning Domains Container option (PVD_CO), as described in [I-D.ietf-mif-mpvd-ndp-support]. PVD_CO is a container option, which means it encapsulates other NDP options. The NDP options nested within the PVD_CO option describe particular PvD-related network configuration information. All information that belongs to a single PvD and is part of a single network autoconfiguration procedure is nested within the same PVD_CO option. If a single RA message contains network configuration information from multiple sources, multiple PVD_CO options are present in such a message. Therefore, an RA message MAY contain zero or more PVD_CO options, one per PvD announced in the RA message. Figure 4 shows the structure of the PVD_CO option used in the proposed MIF architecture implementation.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Type=63    |    Length     |0|  Reserved=0 |  Name Type=0  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Padding to ensure 8 octets alignment=0             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                  Encapsulated NDP options                     ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          

Figure 4: PVD_CO option

In this experimental implementation, the type of the option is set to 63. The option length is calculated depending on the length of the encapsulated options. The values of the S flag and Name Type field are set to 0, which indicates that the current implementation does not use any security mechanism. The PVD_CO option header ends with four octets of padding filled with zeros. The rest of the option contains encapsulated NDP options that describe particular PvD-related network parameters.
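
A sketch of how such a header could be assembled is shown below in Python. It is not the actual radvd (C) code, and it assumes the encapsulated options are already encoded and padded to a multiple of 8 octets.

  import struct

  PVD_CO_TYPE = 63

  def build_pvd_co(encapsulated_options: bytes) -> bytes:
      # 8-octet header: Type, Length (in units of 8 octets, covering the
      # header and all encapsulated options), S flag and Reserved set to
      # 0, Name Type 0, followed by 4 octets of zero padding.
      total_len = 8 + len(encapsulated_options)
      assert total_len % 8 == 0, "encapsulated options must be 8-octet aligned"
      header = struct.pack("!BBBB4x", PVD_CO_TYPE, total_len // 8, 0, 0)
      return header + encapsulated_options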

The following NDP options are allowed to appear within the PVD_CO option:

  • Provisioning Domain Identifier option (PVD_ID)
  • Prefix Information option
  • Route Information option
  • RDNSS option
  • DNSSL option

PVD_ID is a new option introduced to uniquely identify the PvD, while the others are standard NDP options used in regular non-PvD-aware RA messages.

PVD_ID is a mandatory option and MUST be present exactly once per PVD_CO option. If an RA message contains multiple PVD_CO options, the PvD identifiers included in the PVD_ID options MUST be unique. The general structure of the PVD_ID option is proposed in [I-D.ietf-mif-mpvd-id]. For the purpose of this MIF architecture implementation, a slightly simplified structure is used, as shown in Figure 5.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Type=64    |   Length=5    |   id-type=4   | id-length=36  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                    PVD identity information                   ~
~                       (UUID, 36 octets)                       ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          

Figure 5: PVD_ID option

In this experimental implementation, the type of the option is set to 64. [I-D.ietf-mif-mpvd-id] proposes several different data formats for PvD identifiers, including UUID, UTF-8 string, OID, NAI Realm, FQDN, and ULA Prefix. Here, UUID is chosen as the only supported format for the PvD identifier. Since the length of the UUID is fixed to 36 octets, all fields in the PVD_ID option are of fixed length. The option itself is always 40 octets long, therefore the Length field is set to 5, expressed in units of 8 octets. The id-type field is always set to 4, designating UUID as the only supported PvD identifier data format. Consequently, the id-length field is always set to 36, the length of the UUID.
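
Since all fields are fixed, the option can be serialized with a few lines of Python, shown below as a sketch rather than as the radvd implementation itself.

  import struct
  import uuid

  PVD_ID_TYPE = 64

  def build_pvd_id(pvd_uuid: uuid.UUID) -> bytes:
      identity = str(pvd_uuid).encode("ascii")   # 36-octet ASCII form
      assert len(identity) == 36
      # Type=64, Length=5 (5 * 8 = 40 octets), id-type=4 (UUID),
      # id-length=36, followed by the identity itself.
      return struct.pack("!BBBB", PVD_ID_TYPE, 5, 4, 36) + identity

  # Example:
  # build_pvd_id(uuid.UUID("f5a7f97d-ba83-4fd8-a3e0-839b2c2446ca"))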

Other options nested within the PVD_CO option follow the standardized structure defined in the corresponding NDP documents. Each of the nested options MAY appear zero or more times per PVD_CO option.

5.2. Configuration file (radvd.conf) extension

The behavior of the Router Advertisement Daemon is configured using a textual configuration file, usually called radvd.conf. The configuration file enables system administrators to specify a wide range of configuration parameters that can roughly be divided into two categories: 1) interface-level configuration directives that define whether or not radvd will announce RA messages on a given interface, the timing for sending RA messages, client filters, etc., and 2) option-specific configuration directives that define the content of particular NDP options. For the purpose of the MIF architecture implementation, the radvd configuration file is extended with a new configuration directive which enables definition of the content of the PVD_CO option.

Figure 6 shows the general structure of the radvd configuration file extended with the new configuration section for specifying PvD-related network configuration information.

  interface name {
      list of interface specific options
      list of prefix definitions
      list of route definitions
      list of RDNSS definitions
      list of DNSSL definitions
      list of LoWPANCo definitions
      list of ABRO definitions
    +----------------------------------------+
    | list of PvD definitions                |
    +----------------------------------------+
  };
          

Figure 6: The structure of the radvd.conf extended with PvD-related configuration section

The newly introduced configuration section enables the definition of network configuration parameters for multiple provisioning domains. One PvD definition per provisioning domain MUST be specified in radvd.conf. For each PvD definition, radvd generates a separate PVD_CO option in the RA message.

Figure 7 shows the structure of radvd.conf's configuration directive for a PvD definition. The PvD configuration block starts with the "pvd" keyword followed by a UUID that represents the PvD identifier. The body of the pvd block contains zero or more definitions for prefix, route, RDNSS, and DNSSL. Definitions used inside the pvd block follow the same syntax as when they appear in the upper-level interface block.

  pvd UUID {
      list of prefix definitions
      list of route definitions
      list of RDNSS definitions
      list of DNSSL definitions
  };
          

Figure 7: The structure of the PVD configuration section to use in radvd.conf

5.3. Example Usage

The following example shows a radvd configuration file that combines two types of network configuration information: 1) information that is announced to clients in the traditional way, without specifying a PvD identity (implicit PvD), and 2) information belonging to explicitly identified PvDs (explicit PvDs). In the given example, radvd announces three sets of network configuration information: one belonging to an implicit PvD and two belonging to explicit PvDs. As the implicit PvD, it announces only the prefix information (2001:db8:1111:2222::/64). For the first explicit PvD, radvd announces the prefix information (2001:db8:aaaa:bbbb::/64) and the address of the DNS server (2001:db8:aaaa:bbbb::1). For the second explicit PvD, only the prefix information (2001:db8:cccc:dddd::/64) is announced to the clients.

  interface eth0
  {
      AdvSendAdvert on;
      IgnoreIfMissing on;
      MinRtrAdvInterval 10;
      MaxRtrAdvInterval 20;

      prefix 2001:db8:1111:2222::/64
      {
          AdvOnLink on;
          AdvAutonomous on;
          AdvRouterAddr off;
      };

      pvd f5a7f97d-ba83-4fd8-a3e0-839b2c2446ca
      {
          prefix 2001:db8:aaaa:bbbb::/64
          {
              AdvOnLink on;
              AdvAutonomous on;
              AdvRouterAddr off;
          };

          RDNSS 2001:db8:aaaa:bbbb::1
          {
              AdvRDNSSLifetime 30;
          };
      };

      pvd f5a7f97d-ba83-4fd8-a3e0-839b2c2446cb
      {
          prefix 2001:db8:cccc:dddd::/64
          {
              AdvOnLink on;
              AdvAutonomous on;
              AdvRouterAddr off;
          };
      };
  };
          

Figure 8: Example of radvd.conf with combined configuration parameters for implicit and explicit PvDs

Based on this configuration file, radvd generates the RA message shown in Figure 9.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      134      |       0       |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Cur Hop Limit |0|1|  Reserved |       Router Lifetime         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Reachable Time                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Retrans Timer                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ <-+
|       3       |       4       |       64      |1|1| Reserved1 |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   I
|                         Valid Lifetime                        |   m
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   p
|                       Preferred Lifetime                      |   l
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Reserved2                           |   P
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   v
~                      2001:db8:1111:2222::                     ~   D
~                          (16 octets)                          ~   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ <-+
|  63 (PVD_CO)  |      13       |0|      0      |       0       |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
|            0 (padding to ensure 8 octets alignment)           |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
|  64 (PVD_ID)  |       5       |        4      |      36       |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
|                                                               |   |
~             f5a7f97d-ba83-4fd8-a3e0-839b2c2446ca              ~   E
~                         (36 octets)                           ~   x
|                                                               |   p
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   l
|       3       |       4       |       64      |1|1| Reserved1 |   i
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   c
|                         Valid Lifetime                        |   i
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   t
|                       Preferred Lifetime                      |   
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   P
|                           Reserved2                           |   v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   D
~                      2001:db8:aaaa:bbbb::                     ~   1
~                          (16 octets)                          ~   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
|      25       |       3       |           Reserved            |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
|                              30                               |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
~                      2001:db8:aaaa:bbbb::1                    ~   |
~                          (16 octets)                          ~   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ <-+
|  63 (PVD_CO)  |      10       |0|      0      |       0       |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
|            0 (padding to ensure 8 octets alignment)           |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
|  64 (PVD_ID)  |       5       |        4      |      36       |   E
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   x
|                                                               |   p
~             f5a7f97d-ba83-4fd8-a3e0-839b2c2446cb              ~   l
~                         (36 octets)                           ~   i
|                                                               |   c
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   i
|       3       |       4       |       64      |1|1| Reserved1 |   t
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Valid Lifetime                        |   P
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   v
|                       Preferred Lifetime                      |   D
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   2
|                           Reserved2                           |   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
~                      2001:db8:cccc:dddd::                     ~   |
~                          (16 octets)                          ~   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ <-+
          

Figure 9: Example of RA message with combined configuration parameters for implicit and explicit PvDs

Based on this RA message, PvD-aware clients can configure three different sets of network parameters on their network interfaces. Non-PvD-aware clients will use the prefix information from the implicit PvD only, skipping the content of the two PVD_CO options.
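
The sketch below illustrates how a PvD-aware client could separate the implicit options from the per-PvD option groups in a message such as the one in Figure 9. It assumes the raw option area of the RA (everything after the fixed ICMPv6 RA header) is already available as a byte string; error handling is omitted.

  PVD_CO_TYPE = 63

  def split_ra_options(options: bytes):
      implicit, explicit = [], []
      offset = 0
      while offset + 2 <= len(options):
          opt_type = options[offset]
          opt_len = options[offset + 1]      # length in units of 8 octets
          if opt_len == 0:                   # malformed option, stop parsing
              break
          size = opt_len * 8
          body = options[offset:offset + size]
          if opt_type == PVD_CO_TYPE:
              # Everything after the 8-octet PVD_CO header is the set of
              # encapsulated options for one explicit PvD.
              explicit.append(body[8:])
          else:
              implicit.append(body)
          offset += size
      return implicit, explicit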

6. IANA Considerations

This memo includes no request to IANA.

7. Security Considerations

Due to the complexity of the functionality itself, security was not considered in the implementations.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC6418] Blanchet, M. and P. Seite, "Multiple Interfaces and Provisioning Domains Problem Statement", RFC 6418, DOI 10.17487/RFC6418, November 2011.
[RFC7556] Anipko, D., "Multiple Provisioning Domain Architecture", RFC 7556, DOI 10.17487/RFC7556, June 2015.

8.2. Informative References

, ", ", ", "
[dbus] "D-Bus", March 2016.
[I-D.ietf-mif-mpvd-id] Krishnan, S., Korhonen, J., Bhandari, S. and S. Gundavelli, "Identification of provisioning domains", Internet-Draft draft-ietf-mif-mpvd-id-02, October 2015.
[I-D.ietf-mif-mpvd-ndp-support] Korhonen, J., Krishnan, S. and S. Gundavelli, "Support for multiple provisioning domains in IPv6 Neighbor Discovery Protocol", Internet-Draft draft-ietf-mif-mpvd-ndp-support-03, February 2016.
[iproute2] "iproute2", March 2016.
[NMSrc] "NetworkManager project", March 2016.
[radvd-src] "Linux IPv6 Router Advertisement Daemon (radvd)", March 2016.

8.3. Implementation repositories

[libndp] Gros, S., "libndp - Library for Neighbor Discovery Protocol", January 2016.
[nmpvdsrc] Gros, S., "NetworkManager with network namespaces and PvD extensions", March 2016.
[PvD-manager] Jelenkovic, L. and D. Skvorc, "PvD-manager repository", March 2016.
[radvd] Skvorc, D., "PvD-aware router advertisements daemon (radvd)", February 2016.

Authors' Addresses

Stjepan Gros (editor) University of Zagreb Unska 3 Zagreb, 10000 HR EMail: stjepan.gros@fer.hr
Leonardo Jelenkovic University of Zagreb Unska 3 Zagreb, 10000 HR EMail: leonardo.jelenkovic@fer.hr
Dejan Skvorc University of Zagreb Unska 3 Zagreb, 10000 HR EMail: dejan.skvorc@fer.hr