Network Working Group M. V. Kerwin
Internet-Draft June 24, 2013
Intended status: Standards Track
Expires: December 26, 2013

The file URI Scheme
draft-kerwin-file-scheme-02

Abstract

This document specifies the file Uniform Resource Identifier (URI) scheme that was originally specified in [RFC1738]. The purpose of this document is to keep the information about the scheme on standards track, since [RFC1738] has been made obsolete.

Status of This Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 26, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.


Table of Contents

1. Introduction

URIs were previously defined in [RFC1738], which was updated by [RFC3986]. Those documents also specify how to define schemes for URIs.

The first definition for many URI schemes appeared in [RFC1738]. Because that document has been made obsolete, this document copies the file URI scheme from it to allow that material to remain on standards track.

1.1. Conventions and Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. History

The file URI scheme was first defined in [RFC1630], and informational RFC which does not specify an Internet standard of any kind. The definition was standardised in [RFC1738], and the scheme was registered with IANA, however the latter definition omitted certain language from former that clarified aspects such as:

The Internet draft [I-D.draft-hoffman-file-uri] was written in an effort to keep the file URI scheme on standards track, but it expired in 2005. That draft enumerated concerns arising from the various, often conflicting implementations.

Despite lacking a living standard, the file URI scheme is used by way of example in [RFC3986] three times:

  1. section 1.1 [p6] uses "file:///etc/hosts" as an example
  2. section 1.2.3 [p10] mentions the "file" scheme regarding relative references
  3. section 3.2.2 [p21] says that '...the "file" URI scheme is defined so that no authority, an empty host, and "localhost" all mean the end-user's machine...'.

Finally the WHATWG defines a living URL standard [WHATWG], which includes algorithms for interpreting file URIs.

3. Scheme Definition

The file URI scheme is used to designate files accessible on a particular host computer. This scheme, unlike most other URI schemes, does not designate a resource that is universally accessible over the Internet.

The file URI scheme has historically had little or no interoperability between platforms. Further, implementers on a single platform have often disagreed on the syntax to use for a particular filesystem. This document does not try to resolve those problems, only to show what has been commonly seen in use on the Internet.

Note that file: and ftp: URIs are not the same, even when the target of the ftp: URI is the local host.

A file URI conforms with the generic syntax presented in [RFC3986], with the following components:

scheme name
The literal value file
authority
If present, either the fully qualified domain name of the system on which the file is accessible; or one of the special values localhost or the empty string, in which case it is interpreted as "the machine from which the URI is being interpreted". An absent authority component SHOULD be interpreted as if it were present and had the value localhost.
path
The hierarchical directory path to the file, using the slash character ("/") to separate directories.

file:///usr/local/bin/

Some systems allow URIs to point to directories. In this case, there is usually (but not always) a terminating slash character, such as in:

file:///c:/windows/example.ini
file:///c|/windows/example.ini

On systems running some versions of Microsoft Windows, the local drive specification is sometimes specified with a colon character (":") and sometimes with a pipe ("|"). The two SHOULD be considered equivalent. For example:

Note that some systems running some versions of Microsoft Windows are known to omit the slash before the drive letter.

file:////department/example.doc

For Windows shares, there is an additional slash prepeded to the path. Thus, the file "example.doc" on the shared directory "departments" would have the URI

The file URL scheme is unusual in that it does not specify an Internet protocol or access method for such files; as such, its utility in network protocols between hosts is limited.

3.1. Examples

DISK$USER:[MY.NOTES]NOTE123456.TXT

file://vms.host.edu/disk$user/my/notes/note12345.txt

For example, a VMS file

C:\Users\kerwin\Documents\temp.txt

file:///C:/Users/kerwin/Documents/temp.txt

On a Windows system, the file

/usr/local/lib/ruby/2.0.0/uri.rb

file://localhost/usr/local/lib/ruby/2.0.0/uri.rb
file:///usr/local/lib/ruby/2.0.0/uri.rb
file:/usr/local/lib/ruby/2.0.0/uri.rb

On a Linux system, the file

4. Implementation Notes

4.1. Hierarchical Structure

Most implementations of the file URI scheme do a reasonable job of mapping the hierarchical part of a directory structure into the "/" delimited hierarchy of the URI syntax, independent of what the native platform delimiter is.

For example, on Windows platforms, it is typical that the file system presents backslash "\" as the file delimeter for file names, yet the URI's forward slash "/" can be used in file: URIs. Similarly, on (some) Macintosh OS versions, at least in some contexts, the colon (":") is used as the delimiter in the native presentation of file path names. Unix systems natively use the same forward slash "/" delimiter for hierarchy, so there is a closer mapping between file paths and native path names.

4.2. Drives, drive letters, mount points, file system root

Historically there has been considerable difference, in practice, for handling of the syntax for the "top" of the hierarchy. The file URI syntax provides one simple place for designating the root of the file hierachy, and implementations have diverged, even on the same platform, sometimes even within a single application.

For example, DOS- and Windows-based systems support the notion of a "drive letter", a single character which represents a (virtual) drive, mount point, or device. Native representations of file paths start with the drive letter, a colon, and then the path; e.g., "c:\tmp\test.txt".

Drive letters are mapped into the top of a file URI in various ways, depending on the implementation; some applications substitute vertical bar ("|") for the colon after the drive letter, yielding "file:///c|/tmp/test.txt". In some cases, the colon is left unchanged, as in "file:///c:/tmp/test.txt". In other cases, the colon is simply omitted, as in "file:///c/tmp/test.txt".

4.3. Character sets and encodings

Local file systems sometimes use many different encodings for representing file names. The URI syntax defined in [RFC3986] provides a method of encoding data, presumably for the sake of identifying a resource, as a sequence of characters. The URI characters are, in turn, frequently encoded as octets for transport or presentation. This specification does not mandate any particular character encoding for mapping between URI characters and the octets used to store or transmit those characters, however for interoperability sake, it would be preferable for file: URI libraries to translate the native character encoding for file names to and from Unicode.

5. Security Considerations

There are many security considerations for URI schemes discussed in [RFC3986].

File access and the granting of privileges for specific operations are complex topics, and the use of file: URIs can complicate the security model in effect for file privileges. Under no circumstance should software using file: URIs grant greater access than would be available for other file access methods.

6. IANA Considerations

This document does not modify the existing entry in the URI Schemes registry [IANA], except by updating its reference RFC.

7. References

7.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3986] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.

7.2. Informative References

[RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web", RFC 1630, June 1994.
[RFC1738] Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform Resource Locators (URL)", RFC 1738, December 1994.
[I-D.draft-hoffman-file-uri] Hoffman, P., "The file URI Scheme", Internet-Draft draft-hoffman-file-uri-03, January 2005.
[IANA] IANA, "Uniform Resource Identifier (URI) Schemes registry", June 2013.
[WHATWG] WHATWG, "URL Living Standard", May 2013.

Author's Address

Matthew Kerwin EMail: matthew@kerwin.net.au