Internet DRAFT - draft-haynes-nfsv4-flex-filesv2
NFSv4 T. Haynes
Internet-Draft Primary Data
Intended status: Standards Track August 07, 2017
Expires: February 8, 2018
Parallel NFS (pNFS) Flexible File Layout v2
draft-haynes-nfsv4-flex-filesv2-00.txt
Abstract
The Parallel Network File System (pNFS) allows a separation between
the metadata (onto a metadata server) and data (onto a storage
device) for a file. The flexible file layout type is an extension to
pNFS which allows the use of storage devices in a fashion such that
they require only a quite limited degree of interaction with the
metadata server, using already existing protocols. This document
describes two extensions to the flexible file layout type to allow
for multiple stateids for tightly coupled NFSv4 models and an
additional security mechanism for loosely coupled models.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 8, 2018.
Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
Haynes Expires February 8, 2018 [Page 1]
Internet-Draft Flex File Layout v2 August 2017
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
  1.1. Definitions
  1.2. Requirements Language
2. XDR Description of the Flexible File Layout Type
  2.1. Code Components Licensing Notice
3. Flexible File Layout Type v2
  3.1. ffv2_layout4
4. Security Considerations
  4.1. RPCSEC_GSS and Security Services
    4.1.1. Loosely Coupled
5. IANA Considerations
6. References
  6.1. Normative References
  6.2. Informative References
Appendix A. Acknowledgments
Appendix B. RFC Editor Notes
Author's Address
1. Introduction
In the parallel Network File System (pNFS), the metadata server
returns layout type structures that describe where file data is
located. There are different layout types for different storage
systems and methods of arranging data on storage devices.
[flexfiles] defines the flexible file layout type used with file-
based data servers that are accessed using the Network File System
(NFS) protocols: NFSv3 [RFC1813], NFSv4.0 [RFC7530], NFSv4.1
[RFC5661], and NFSv4.2 [RFC7862].
The first version of the flexible file layout type had two issues
which could not be addressed in [flexfiles] because of existing
implementations. The first issue was that under the tightly coupled
model for an NFSv4 implementation, either a global stateid or an
anonymous stateid needed to be used. The second issue was that under
the loosely coupled model, for a secure Remote Procedure Call (RPC)
([RFC5531]) implementation, each of the client, metadata server, and
storage devices needed to implement an RPC-application-defined
structured privilege assertion with RPCSEC_GSS version 3
(RPCSEC_GSSv3) [RFC7861]. The second version of the flexible file
layout type addresses both of these issues.
1.1. Definitions
control communication requirements: defines for a layout type the
details regarding information on layouts, stateids, file metadata,
and file data which must be communicated between the metadata
server and the storage devices.
control protocol: defines a particular mechanism that an
implementation of a layout type would use to meet the control
communication requirement for that layout type. This need not be
a protocol as normally understood. In some cases the same
protocol may be used as a control protocol and data access
protocol.
data file: is that part of the file system object which contains the
content.
fencing: is when the metadata server prevents the storage devices
from processing I/O from a specific client to a specific file.
file layout type: is a layout type in which the storage devices are
accessed via the NFS protocol (see Section 13 of [RFC5661]).
layout: informs a client of which storage devices it needs to
communicate with (and over which protocol) to perform I/O on a
file. The layout might also provide some hints about how the
storage is physically organized.
layout iomode: describes whether the layout granted to the client is
for read or read/write I/O.
layout stateid: is a 128-bit quantity returned by a server that
uniquely defines the layout state provided by the server for a
specific layout that describes a layout type and file (see
Section 12.5.2 of [RFC5661]). Further, Section 12.5.3 of
[RFC5661] describes the difference between a layout stateid and a
normal stateid.
layout type: describes both the storage protocol used to access the
data and the aggregation scheme used to lay out the file data on
the underlying storage devices.
loose coupling: is when the metadata server and the storage devices
do not have a control protocol present.
metadata file: is that part of the file system object which
describes the object and not the content. E.g., it could be the
time since last modification, access, etc.
metadata server (MDS): is the pNFS server which provides metadata
information for a file system object. It also is responsible for
generating layouts for file system objects. Note that the MDS is
responsible for directory-based operations.
recalling a layout: is when the metadata server uses a back channel
to inform the client that the layout is to be returned in a
graceful manner. Note that the client has the opportunity to
flush any writes, etc., before replying to the metadata server.
revoking a layout: is when the metadata server invalidates the
layout such that neither the metadata server nor any storage
device will accept any access from the client with that layout.
stateid: is a 128-bit quantity returned by a server that uniquely
defines the open and locking states provided by the server for a
specific open-owner or lock-owner/open-owner pair for a specific
file and type of lock.
storage device: designates the target to which clients may direct
I/O requests when they hold an appropriate layout. See Section 2.1
of [pNFSLayouts] for further discussion of the difference between
a data store and a storage device.
tight coupling: is when the metadata server and the storage devices
do have a control protocol present.
1.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. XDR Description of the Flexible File Layout Type
This document contains the external data representation (XDR)
[RFC4506] description of the flexible file layout type version 2.
The XDR description is embedded in this document in a way that makes
it simple for the reader to extract into a ready-to-compile form.
The reader can feed this document into the following shell script to
produce the machine readable XDR description of the flexible file
layout type version 2:
<CODE BEGINS>
#!/bin/sh
grep '^ *///' $* | sed 's?^ */// ??' | sed 's?^ *///$??'
<CODE ENDS>
That is, if the above script is stored in a file called "extract.sh",
and this document is in a file called "spec.txt", then the reader can
do:
sh extract.sh < spec.txt > flex_filesv2_prot.x
The effect of the script is to remove leading white space from each
line, plus a sentinel sequence of "///".
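As an illustration only, the same filtering can be sketched in
Python. The function name extract_xdr is hypothetical and is not
part of this document's Code Components; the shell script above
remains the reference way to extract the XDR.

```python
import re


def extract_xdr(text: str) -> str:
    """Mimic: grep '^ *///' | sed 's?^ */// ??' | sed 's?^ *///$??'"""
    out = []
    for line in text.splitlines():
        # grep step: keep only lines that begin with the sentinel
        if not re.match(r"^ *///", line):
            continue
        # first sed: drop leading spaces, the sentinel, and one space
        line = re.sub(r"^ */// ", "", line)
        # second sed: a bare sentinel becomes an empty line
        line = re.sub(r"^ *///$", "", line)
        out.append(line)
    return "\n".join(out) + ("\n" if out else "")


if __name__ == "__main__":
    import sys
    sys.stdout.write(extract_xdr(sys.stdin.read()))
```

Any indentation after the first space following the sentinel is
kept, so the extracted XDR retains its own layout.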
The embedded XDR file header follows. Subsequent XDR descriptions,
with the sentinel sequence, are embedded throughout the document.
Note that the XDR code contained in this document depends on types
from both the flex files version 1 flex_files_prot.x file
([flexfiles]) and the NFSv4.1 nfs4_prot.x file ([RFC5662]). This
includes both NFS types that end with a 4, such as offset4, length4,
etc., as well as more generic types such as uint32_t and uint64_t.
2.1. Code Components Licensing Notice
Both the XDR description and the scripts used for extracting the XDR
description are Code Components as described in Section 4 of "Legal
Provisions Relating to IETF Documents" [LEGAL]. These Code
Components are licensed according to the terms of that document.
<CODE BEGINS>
/// /*
/// * Copyright (c) 2012 IETF Trust and the persons identified
/// * as authors of the code. All rights reserved.
/// *
/// * Redistribution and use in source and binary forms, with
/// * or without modification, are permitted provided that the
/// * following conditions are met:
/// *
/// * o Redistributions of source code must retain the above
/// * copyright notice, this list of conditions and the
/// * following disclaimer.
/// *
/// * o Redistributions in binary form must reproduce the above
/// * copyright notice, this list of conditions and the
/// * following disclaimer in the documentation and/or other
/// * materials provided with the distribution.
/// *
/// * o Neither the name of Internet Society, IETF or IETF
/// * Trust, nor the names of specific contributors, may be
/// * used to endorse or promote products derived from this
/// * software without specific prior written permission.
/// *
/// * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS
/// * AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
/// * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
/// * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
/// * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
/// * EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
/// * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
/// * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
/// * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
/// * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
/// * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
/// * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
/// * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
/// * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
/// * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
/// *
/// * This code was derived from RFCTBD10.
/// * Please reproduce this note if possible.
/// */
///
/// /*
/// * flex_filesv2_prot.x
/// */
///
/// /*
/// * The following include statements are for example only.
/// * The actual XDR definition files are generated separately
/// * and independently and are likely to have a different name.
/// * %#include <nfsv42.x>
/// * %#include <rpc_prot.x>
/// */
///
<CODE ENDS>
3. Flexible File Layout Type v2
This document defines structures associated with the layouttype4
value LAYOUT4_FLEX_FILES_V2 and it presents the minimal XDR changes
necessary from LAYOUT4_FLEX_FILES, which is described in
[flexfiles]. [RFC5661] specifies the loc_body structure as an XDR
type "opaque". The opaque layout is uninterpreted by the generic
pNFS client layers, but is interpreted by the flexible file layout
type implementation. This section defines the structure of this
otherwise opaque value, ffv2_layout4.
3.1. ffv2_layout4
<CODE BEGINS>
/// struct ffv2_data_server4 {
/// deviceid4 ffds_deviceid;
/// uint32_t ffds_efficiency;
/// stateid4 ffds_stateid<>;
/// nfs_fh4 ffds_fh_vers<>;
/// fattr4_owner ffds_user;
/// fattr4_owner_group ffds_group;
/// opaque_auth ffds_auth;
/// };
///
/// struct ffv2_mirror4 {
/// ffv2_data_server4 ffm_data_servers<>;
/// };
///
/// struct ffv2_layout4 {
/// length4 ffl_stripe_unit;
/// ffv2_mirror4 ffl_mirrors<>;
/// ff_flags4 ffl_flags;
/// uint32_t ffl_stats_collect_hint;
/// };
///
<CODE ENDS>
The ffv2_layout4 structure specifies a layout over a set of mirrored
copies of that portion of the data file described in the current
layout segment.
It is possible that the file is concatenated from more than one
layout segment. Each layout segment MAY represent different striping
parameters, applying respectively only to the layout segment byte
range.
The ffl_stripe_unit field is the stripe unit size in use for the
current layout segment. The number of stripes is given inside each
mirror by the number of elements in ffm_data_servers. If the number
of stripes is one, then the value for ffl_stripe_unit MUST default to
zero. The only supported mapping scheme is sparse and is detailed in
Section 6 of [flexfiles]. Note that there is an assumption here that
both the stripe unit size and the number of stripes are the same
across all mirrors.
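The following non-normative sketch shows how a client might locate
the storage device for a byte offset under sparse mapping. It
assumes the usual round-robin rule (stripe unit number modulo the
number of stripes) and that the offset on the storage device equals
the logical offset in the file; Section 6 of [flexfiles] is
authoritative. The function name locate and its parameters are
hypothetical.

```python
from typing import Tuple


def locate(offset: int, stripe_unit: int, stripe_count: int) -> Tuple[int, int]:
    """Return (index into ffm_data_servers, offset on that storage device).

    stripe_unit  -- ffl_stripe_unit
    stripe_count -- number of elements in ffm_data_servers
    """
    if offset < 0 or stripe_count < 1:
        raise ValueError("offset must be >= 0 and stripe_count >= 1")
    if stripe_count == 1:
        # ffl_stripe_unit is zero in this case, so never divide by it
        return 0, offset
    if stripe_unit <= 0:
        raise ValueError("stripe_unit must be positive when striping")
    index = (offset // stripe_unit) % stripe_count
    # Sparse mapping (assumed): the physical offset equals the logical one
    return index, offset
```

For example, with a 4096-byte stripe unit and three stripes, offset
4096 falls on the second element of ffm_data_servers, and offset
12288 wraps back to the first.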
The ffl_mirrors field is the array of mirrored storage devices which
provide the storage for the current stripe; see Figure 1.
                +-----------+
                |           |
                |           |
                |   File    |
                |           |
                |           |
                +-----+-----+
                      |
         +------------+------------+
         |                         |
    +----+-----+             +-----+----+
    | Mirror 1 |             | Mirror 2 |
    +----+-----+             +-----+----+
         |                         |
    +-----------+            +-----------+
    |+-----------+           |+-----------+
    ||+-----------+          ||+-----------+
    +||  Storage  |          +||  Storage  |
     +|  Devices  |           +|  Devices  |
      +-----------+            +-----------+

                   Figure 1
The ffl_mirrors field represents an array of state information for
each mirrored copy of the current layout segment. Each element is
described by an ffv2_mirror4 type.
ffds_deviceid provides the deviceid of the storage device holding the
data file.
ffds_fh_vers is an array of filehandles of the data file matching to
the available NFS versions on the given storage device. There MUST
be exactly as many elements in ffds_fh_vers as there are in both
ffda_versions (see Section 4.1 of [flexfiles]) and ffds_stateid. Each
element of the array corresponds to a particular combination of
ffdv_version, ffdv_minorversion, and ffdv_tightly_coupled provided
for the device. The array allows for server implementations which
have different filehandles for different combinations of version,
minor version, and coupling strength. See Section 5.3 of [flexfiles]
for how to handle versioning issues between the client and storage
devices.
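As a non-normative sketch, a client might check the length
requirement above before using a layout and then select the
filehandle and stateid that share an index. The DataServer class is
a hypothetical in-memory stand-in for a subset of
ffv2_data_server4, and n_versions stands for the number of elements
in ffda_versions for the device.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataServer:
    """Hypothetical subset of ffv2_data_server4."""
    fh_vers: List[bytes]   # ffds_fh_vers, one filehandle per version entry
    stateids: List[bytes]  # ffds_stateid, one stateid per version entry


def check_parallel_arrays(ds: DataServer, n_versions: int) -> None:
    """Raise ValueError unless all three arrays have the same length."""
    if not (len(ds.fh_vers) == len(ds.stateids) == n_versions):
        raise ValueError(
            "ffds_fh_vers, ffds_stateid and ffda_versions differ in length"
        )


def select(ds: DataServer, version_index: int) -> Tuple[bytes, bytes]:
    """Return the (filehandle, stateid) pair for one ffda_versions entry."""
    return ds.fh_vers[version_index], ds.stateids[version_index]
```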
For tight coupling, ffds_stateid provides the stateids to be used by
the client to access the file. For loose coupling and an NFSv4
storage device, the client may use anonymous stateids to perform I/O
on the storage device as there is no use for the metadata server
stateid (no control protocol). In such a scenario, the server MUST
set the elements of ffds_stateid to be anonymous stateids.
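The loosely coupled case can be sketched as follows. This is
non-normative and assumes the anonymous stateid of Section 8.2.3 of
[RFC5661], in which both the seqid and the 12-byte "other" field are
zero. The helper names are hypothetical.

```python
import struct
from typing import List

# XDR stateid4: uint32_t seqid followed by opaque other[12] (16 bytes)
ANONYMOUS_STATEID = struct.pack(">I12s", 0, bytes(12))


def is_anonymous(stateid: bytes) -> bool:
    """True when both seqid and other are zero."""
    seqid, other = struct.unpack(">I12s", stateid)
    return seqid == 0 and other == bytes(12)


def loose_stateids(n_versions: int) -> List[bytes]:
    """One anonymous stateid per ffda_versions entry (loose coupling)."""
    return [ANONYMOUS_STATEID] * n_versions
```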
For loose coupling, ffds_auth provides the RPC credentials needed for
secure access to the storage devices. If secure access is not
needed, i.e., the synthetic ids are sufficient, or in a tight
coupling, the server should use the AUTH_NONE flavor and a zero
length opaque body to minimize the returned structure length. [[AI1:
after the lesson learned from ffds_stateid, we either need to put an
array here or define all of the file handles to share the same
credentials. And as Olga points out in her email, this gets big
fast. Especially if we throw in many mirrored copies! --TH]]
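To make the size argument concrete, the following non-normative
sketch encodes an opaque_auth value. It assumes the encoding of
[RFC5531]: a 4-byte flavor, a 4-byte body length, and a body of at
most 400 bytes padded to a multiple of 4, with AUTH_NONE equal to 0.
The function name is hypothetical.

```python
import struct

AUTH_NONE = 0    # flavor value from [RFC5531]
MAX_BODY = 400   # opaque body<400> in [RFC5531]


def encode_opaque_auth(flavor: int, body: bytes = b"") -> bytes:
    """XDR-encode an opaque_auth: flavor, length, body, zero padding."""
    if len(body) > MAX_BODY:
        raise ValueError("opaque_auth body exceeds 400 bytes")
    pad = (-len(body)) % 4
    return struct.pack(">II", flavor, len(body)) + body + bytes(pad)
```

With AUTH_NONE and a zero-length body, the encoded ffds_auth is 8
bytes per ffv2_data_server4 element, which is the minimum this
structure can contribute to the layout.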
4. Security Considerations
All of the security considerations in [flexfiles] apply here. In
addition, this document addresses how security mechanisms, such as
Kerberos V5 GSS-API [RFC4121], can be applied to the loosely coupled
model.
4.1. RPCSEC_GSS and Security Services
4.1.1. Loosely Coupled
Under this coupling model, the principal used to authenticate the
metadata file is different than that used to authenticate the data
file. For the metadata server, the RPC credentials would be
generated by the same source as the client. For RPC credentials to
the data on the storage device, the metadata server would be
responsible for their generation. Such "credentials" SHOULD be
limited to just the data file being accessed. Using Kerberos V5 GSS-API
[RFC4121], some possible approaches would be:
o a dedicated/throwaway client principal name akin to the synthetic
uid/gid schemes.
o authorization data in the ticket.
o an out-of-band scheme between the client and metadata server.
Depending on the implementation details, fencing would then be
controlled either by expiring the credential or by modifying the
synthetic uid or gid on the data file. I.e., if the credentials are
at a finer granularity than the synthetic ids, it might be possible
to also fence just one client from the file.
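The difference in fencing granularity can be illustrated with a toy
model. This is a non-normative sketch, not a description of any
implementation: the DataFile class, its fields, and the notion of a
per-client credential expiry time are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataFile:
    """Toy model of access checks on one data file (hypothetical)."""
    uid: int  # synthetic uid currently set on the data file
    cred_expiry: Dict[str, float] = field(default_factory=dict)

    def may_io(self, client: str, presented_uid: int, now: float) -> bool:
        """Allow I/O only with the current uid and an unexpired credential."""
        expiry = self.cred_expiry.get(client, float("-inf"))
        return presented_uid == self.uid and expiry > now

    def fence_client(self, client: str) -> None:
        """Fine-grained fencing: expire one client's credential."""
        self.cred_expiry.pop(client, None)

    def fence_all(self, new_uid: int) -> None:
        """Coarse fencing: change the synthetic uid on the data file."""
        self.uid = new_uid
```

Expiring one credential fences a single client, while changing the
synthetic uid fences every client holding the old uid.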
5. IANA Considerations
[RFC5661] introduced the "pNFS Layout Types Registry" and,
as such, new layout type numbers need to be assigned by IANA. This
document defines the protocol associated with the existing layout
type number, LAYOUT4_FLEX_FILES_V2 (see Table 1).
+-----------------------+-------+----------+-----+----------------+
| Layout Type Name | Value | RFC | How | Minor Versions |
+-----------------------+-------+----------+-----+----------------+
| LAYOUT4_FLEX_FILES_V2 | 0x6 | RFCTBD10 | L | 1 |
+-----------------------+-------+----------+-----+----------------+
Table 1: Layout Type Assignments
6. References
6.1. Normative References
[LEGAL] IETF Trust, "Legal Provisions Relating to IETF Documents",
November 2008, <http://trustee.ietf.org/docs/
IETF-Trust-License-Policy.pdf>.
[RFC1813] IETF, "NFS Version 3 Protocol Specification", RFC 1813,
June 1995.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4121] Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos
Version 5 Generic Security Service Application Program
Interface (GSS-API) Mechanism Version 2", RFC 4121, July
2005.
[RFC4506] Eisler, M., "XDR: External Data Representation Standard",
STD 67, RFC 4506, May 2006.
[RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol
Specification Version 2", RFC 5531, May 2009.
[RFC5661] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1
Protocol", RFC 5661, January 2010.
[RFC5662] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1
External Data Representation Standard (XDR) Description",
RFC 5662, January 2010.
[RFC7530] Haynes, T. and D. Noveck, "Network File System (NFS)
version 4 Protocol", RFC 7530, March 2015.
[RFC7862] Haynes, T., "NFS Version 4 Minor Version 2", RFC 7862,
November 2016.
[flexfiles]
Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible
File Layout", draft-ietf-nfsv4-flex-files-13 (Work In
Progress), July 2017.
[pNFSLayouts]
Haynes, T., "Requirements for pNFS Layout Types", draft-
ietf-nfsv4-layout-types-05 (Work In Progress), July 2017.
6.2. Informative References
[RFC7861] Adamson, W. and N. Williams, "Remote Procedure Call (RPC)
Security Version 3", RFC 7861, November 2016.
Appendix A. Acknowledgments
Dave Noveck inspired the need for multiple stateids for the tightly
coupled model in [flexfiles].
Olga Kornievskaia inspired the need for another security mechanism
for the loosely coupled model in [flexfiles].
Appendix B. RFC Editor Notes
[RFC Editor: please remove this section prior to publishing this
document as an RFC]
[RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
RFC number of this document]
Author's Address
Thomas Haynes
Primary Data, Inc.
4300 El Camino Real Ste 100
Los Altos, CA 94022
USA
Phone: +1 408 215 1519
Email: thomas.haynes@primarydata.com