Internet DRAFT - draft-golovinsky-cloud-services-log-format

draft-golovinsky-cloud-services-log-format






Network Working Group                                      G. Golovinsky
Internet-Draft                                               S. Johnston
Intended status: Experimental
Expires: April 12, 2013                                          D. Birk
                                           Ruhr University Bochum; Horst
                                                Goertz Institute for IT
                                                                Security
                                                         October 9, 2012


        Syslog Extension for Cloud Using Syslog Structured Data
             draft-golovinsky-cloud-services-log-format-03

Abstract

   This document provides an open and extensible log format to be used
   by any cloud entity or cloud application to log and trace activities
   that occur in the cloud.  The logs and traces can be utilized for
   billing, charging, and debugging purposes.  In addition, these logs
   and traces are equally applicable for cloud infrastructure (IaaS),
   platform (PaaS), and application (SaaS) services.  CloudLog is
   different in content, but not in nature from the traditional logging
   as it takes in account transient nature of Identities and resources
   in the cloud.


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 12, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.




Golovinsky, et al.       Expires April 12, 2013                 [Page 1]

Internet-Draft                  Cloud Log                   October 2012


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Conventions Used in This Document  . . . . . . . . . . . . . .  3
   3.  Problem Statement  . . . . . . . . . . . . . . . . . . . . . .  3
     3.1.  Scope of the application . . . . . . . . . . . . . . . . .  3
     3.2.  The Traditional Logging and its Applications . . . . . . .  3
     3.3.  Challenges with the cloud deployment . . . . . . . . . . .  4
       3.3.1.  SaaS Use Case  . . . . . . . . . . . . . . . . . . . .  4
       3.3.2.  PaaS Use Case  . . . . . . . . . . . . . . . . . . . .  5
       3.3.3.  IaaS Use Case  . . . . . . . . . . . . . . . . . . . .  5
   4.  Cloud Log Structured Data Definitions  . . . . . . . . . . . .  6
     4.1.  SD-ELEMENT context . . . . . . . . . . . . . . . . . . . .  6
       4.1.1.  SD-PARAM aid - Mandatory . . . . . . . . . . . . . . .  6
       4.1.2.  SD-PARAM provider - Optional . . . . . . . . . . . . .  7
       4.1.3.  SD-PARAM rid - Optional  . . . . . . . . . . . . . . .  7
       4.1.4.  SD-PARAM eid - Optional  . . . . . . . . . . . . . . .  7
     4.2.  SD-ELEMENT transit . . . . . . . . . . . . . . . . . . . .  7
       4.2.1.  SD-PARAM client - Mandatory  . . . . . . . . . . . . .  7
       4.2.2.  SD-PARAM gw - Optional . . . . . . . . . . . . . . . .  8
   5.  Log Format Samples . . . . . . . . . . . . . . . . . . . . . .  8
     5.1.  Log Sample of Simple Non-Authenticated Request . . . . . .  8
     5.2.  Successful Authenticated User Request  . . . . . . . . . .  8
     5.3.  Log Sample of Successful Request on Behalf of Another
           Identity . . . . . . . . . . . . . . . . . . . . . . . . .  9
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . .  9
   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . .  9
     7.1.  SD-IDs . . . . . . . . . . . . . . . . . . . . . . . . . . 10
   8.  Normative References . . . . . . . . . . . . . . . . . . . . . 10
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 10










Golovinsky, et al.       Expires April 12, 2013                 [Page 2]

Internet-Draft                  Cloud Log                   October 2012


1.  Introduction

   This document describes a standard for syslog structured data
   elements in messages generated by services that may be running on
   different physical or virtual machines when those services are
   processing information generated by a single request.  The purpose of
   which is to provide an audit trail that allows correlation of such
   messages.  In addition, this document defines a number of parameters
   that MUST or SHOULD be included in these structured data elements so
   these messages can be used to identify users of such services, when
   the real and/or effective identities of users is known.


2.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].


3.  Problem Statement

3.1.  Scope of the application

   The three service models proposed by the NIST differ in the way the
   single cloud services are offered to the customers.  Hence, besides
   the usage of general logging concepts which can be applied to all
   three service models alike, individual logging measures for each
   single service model with its specific circumstances have to be taken
   into account.

3.2.  The Traditional Logging and its Applications

   Practically all hardware and software entities deployed on the
   network log their activities.  Network elements such as routers,
   servers, firewalls and switches log information about their
   activities using mostly Syslog (except for Windows).  Applications
   running on the network also log activities, but often using
   proprietary mechanisms.  While logging mechanisms are inconsistent
   between different entities - Syslog, Windows events, proprietary
   files - they generally carry enough information to identify type of
   the activity, time of the occurrence, physical entity involved in the
   event, and often user(s) that participated in the event.
   Availability of this information is crucial for accomplishing
   multiple business objectives ranging from assuring security and
   performing forensics to adhering to compliance regulations (SOX, PCI,
   etc.).  The existence of logs and information in them is necessary,
   but not sufficient for achieving security, compliance and other



Golovinsky, et al.       Expires April 12, 2013                 [Page 3]

Internet-Draft                  Cloud Log                   October 2012


   business objectives.  The process of collecting, processing,
   searching and even simply interpreting information in logs is
   exceptionally labor and time consuming process and often cannot even
   be done on any meaningful scale without appropriate tools in place.
   Log Management tools used to solve the problem of scale and
   interpretation heavily depend on the fact that format of logs is
   largely well defined and understood.

3.3.  Challenges with the cloud deployment

   In cloud deployments the situation with availability of logs in
   reliability of information in them is drastically different.  By
   definition, cloud resources are shared.  A piece of hardware is now
   running multiple Virtual Instances of "it".  They can be brought up
   and down within very short period of time and at any given moment the
   hardware can be shared not just by different users but by different
   users from different companies.  Even if Linux or Windows VMs
   continue to log their activity the information in these logs is very
   likely to be irrelevant since you cannot really tie logs to the
   physical entity.  Moreover, even if one managed to map logs to a
   physical entity, there is absolutely no guarantee that the same VM
   image will be running on the same hardware in its next reincarnation.
   And there is really no clear way to determine how many users share
   the hardware and what are their identities and roles.  Tracing
   environmental changes is practically impossible task unless there is
   traceability between physical and virtual entities.  As a result,
   achieving such business objectives as adhering to compliance
   regulations or performing regular security auditing is very difficult
   if not an impossible task.

   Generally, logging mechanisms for cloud environments do not differ in
   the way traditional logging mechanisms work.  However, the
   environmental circumstances of the cloud presuppose additional
   measurements.  Customers mostly rely on the CSP if logging data is
   required.  In SaaS scenarios, the customers have almost no chance to
   prepare the application with additional logging features.  This
   situation slightly changes for PaaS and also states in IaaS a
   tremendous problem.  Hence, logging standards should be applied by
   the CSP in order to improve this situation for the customers.  The
   following use cases underline the need for an additional standard and
   the differentiation between the various cloud services.

3.3.1.  SaaS Use Case

   In SaaS scenarios, the CSP obtains all the power over the application
   itself and the offered services.  The customer mainly uses a client
   device for communicating with a specific API offered by the CSP.  In
   most of the cases, the user agent on the client is a web browser



Golovinsky, et al.       Expires April 12, 2013                 [Page 4]

Internet-Draft                  Cloud Log                   October 2012


   communicating with a web application located on the server
   infrastructure of the CSP.  Unfortunately, the customer does not
   obtain the ability to manage or control the underlying cloud
   infrastructure, network components, servers, operating systems etc.
   Hence, the CSP has to provide additional logging mechanisms to
   improve this situation.  In case of a web-based email service, the
   customer has almost no chance to figure out whether his account has
   been compromised or accessed from an unknown IP address.  Even some
   providers provide some of the last IP addresses which accessed the
   application, this procedure does not solve the problem of NAT or used
   proxies.  Furthermore, if the customer's account has been
   compromised, he can't determine which emails have been edited or
   accessed by the adversary.  Additional, fine granular logging
   mechanisms could improve this situation for the customer and even
   forensic investigations in case of an account compromise could be
   possible.

3.3.2.  PaaS Use Case

   The logging situation in PaaS scenarios slightly changes compared to
   SaaS.  The CSP decides which system-specific logging information is
   provided to the customers, however, the application deployed by the
   customer can contain hard-coded logging features.  This unfortunately
   requires the underlying OS environment to support that.  For
   instance, the application could contain mechanisms which transfer
   encrypted and signed logging data to third party logging servers in
   real-time.  CSP claim that the transfer of data between the PaaS
   instance and the corresponding database backend is encrypted.  This
   can hardly be confirmed by the customer.  Hence, customers should not
   rely on such promises but apply their own logging mechanisms as far
   as possible.  This logging information could be improved by
   information provided by CSP which cannot directly been extracted by
   the customer application.

3.3.3.  IaaS Use Case

   In IaaS cloud environments the situation with availability of logs in
   reliability of information in them has somewhat been improved.  The
   customers can prepare their VM for logging purposes and control the
   single instance.  Therefore, crucial application specific logging
   information can be collected by the customer itself under the
   theoretical reserve, that the CSP can theoretically maliciously or
   unintentionally modify this logging information.  Unfortunately, by
   definition, cloud resources are shared.  This means, the customer
   could share the same physical host with an potential adversary.
   Hence, it is of greater importance whether the customer shares the
   physical host with any other tenant or is the only virtual instance.
   This information cannot be obtained by the customer without the help



Golovinsky, et al.       Expires April 12, 2013                 [Page 5]

Internet-Draft                  Cloud Log                   October 2012


   of the CSP.  This situation is further complicated by the flexibility
   of the cloud.  Within a short range of time, virtual instances are
   transferred to other physical hosts without the knowledge of the
   customer.  These transactions cannot be detected and logged by the
   customer without the assistance of the CSP.  IaaS cloud environments
   should provide the ability to detect and log the bounding of the
   virtual instance to a specific hardware.  For an exhaustive forensic
   analysis of an incident, this information is however of greater
   importance.  Moreover network components containing important
   information about the network in which the instance is deployed,
   cannot be accessed by the customer without the help of the CSP.  As a
   result, achieving such business objectives as adhering to compliance
   regulations or performing regular security auditing is very difficult
   if not an impossible task.


4.  Cloud Log Structured Data Definitions

   1.  RUI - real user identity, the identity of the user that has
       authenticated to the entity.

   2.  EUI - effective or impersonated user identity, the identity of
       the user that the real user identity is acting for.  For example,
       an administrator account could have the ability to impersonate
       another user account.

   3.  Provider - is the domain, service, application, or other entity
       providing the user identities.

   Structured data elements, defined in RFC 5424 [RFC5424], provides a
   mechanism for adding data to syslog messages.  Since additional data
   is necessary to trace user identities and their activities in the
   cloud we use the mechanism of structured data elements to provide
   this additional information in the syslog messages.

4.1.  SD-ELEMENT context

   The SD-ELEMENT identified by the SD-ID "context" defines the context
   of the external request that causes for the activity to take place.
   The syslog message that is generated as a result of this activity
   should be identified by this "context".

4.1.1.  SD-PARAM aid - Mandatory

   The parameter "aid" represents the audit identifier, which uniquely
   identifies an external request for activity.  The value is a UTF-8-
   STRING representation of the UUID generated by the entity when
   request is received.



Golovinsky, et al.       Expires April 12, 2013                 [Page 6]

Internet-Draft                  Cloud Log                   October 2012


   This parameter MUST be present within the SD-ELEMENT "context".

4.1.2.  SD-PARAM provider - Optional

   The parameter "provider" represents the provider of the identity for
   the Real User Identity - 'rid' and Effective User Identity - 'eid',
   User identities are not always exist or available.  In cases that
   they are, either "rid" or "eid" MUST be present in the syslog
   messages.

   The parameter "provider" is not required, but SHOULD be present
   within the SD-ELEMENT "context" when either the 'rid' or 'eid'
   identifiers are present.

4.1.3.  SD-PARAM rid - Optional

   The parameter "rid" represents the real user identity.

   This parameter SHOULD be present within the SD-ELEMENT "context" when
   the real user identity is availbale.

4.1.4.  SD-PARAM eid - Optional

   The parameter "eid" represents the effective user identity.  This
   parameter SHOULD be present within the SD-ELEMENT "context" when user
   impersonation has happened and the effective user identity is
   available.

   The 'eid' parameter represents the effective user identity.

   This parameter SHOULD be present within the 'context' SD-ELEMENT when
   the effective user identity is known.

4.2.  SD-ELEMENT transit

   The SD-ELEMENT identified by the SD-ID "transit" defines logical
   gateway entities which were traversed while request for activity was
   routed to the final destination entity that would satisfy the
   request.

4.2.1.  SD-PARAM client - Mandatory

   The parameter "client" represents the IP address or Fully Qualified
   Domain Name (FQDN) of the client entity on behalf of which the
   request is being made.  This is different from SD-ID 'ip' in RFC 5424
   that defines IP of the entity producing the log message itself.  IPv4
   or IPv6 addresses MUST be represented as STRING-UTF-8 .




Golovinsky, et al.       Expires April 12, 2013                 [Page 7]

Internet-Draft                  Cloud Log                   October 2012


   The parameter "client" represents the IP address or FQDN of the
   client on behalf of which the request is being made.

4.2.2.  SD-PARAM gw - Optional

   The parameter "gw" represents a gateway entity through which the
   request for activity passes before arriving to the final destination
   entity actually responsible processing of the request.  The value of
   the parameter is comprised of the STRING-UTF-8 representation of UUID
   of the entity , identifying the gateway, a colon character (i.e.
   ':'), and finally the STRING-UTF-8 representation of IP address or
   FQDN of the gateway through which the request has been routed.

   This parameter MAY appear more than once within the SD-ELEMENT
   "transit" as request may pass through multiple gateway entities.
   Each occurrence represents a different gateway through which the
   request passed.


5.  Log Format Samples

5.1.  Log Sample of Simple Non-Authenticated Request

   Here is an example of a log produced as a result of simple non-
   authenticated request to a web service.  Only the mandatory
   parameters "aid" and "client" are represented.

   Jul 7 09:01:40 [context aid="9BE817EB-8ACC-1004-D9DF-
   00000A00065E"][transit client="56.2.222.83"] Initializing request to
   /example_api/index

   Jul 7 09:01:40 [context aid="9BE817EB-8ACC-1004-D9DF-
   00000A00065E"][transit client="56.2.222.83"] "64.39.0.40" - "1023"
   ""GET /example_api/index HTTP/1.1"" 200 2543 -- performed in 600 ms

5.2.  Successful Authenticated User Request

   Here is an example of a simple request including user authentication.
   Note that the 'provider' and 'rid' SD-PARAMs are added to the message
   after the user has authenticated to the service, and that those
   parameters are included in each subsequent message.

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-
   00000A000152"][transit client="172.16.1.82"] Initializing request to
   /api/example:instance/1

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   provider="example.com" rid="1:123"][transit client="172.16.1.82"]



Golovinsky, et al.       Expires April 12, 2013                 [Page 8]

Internet-Draft                  Cloud Log                   October 2012


   User authentication successful for 1:123

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
   "172.16.1.82" - "-" ""GET /api/example:instance/1 HTTP/1.1"" 200 119
   -- performed in 2 ms

5.3.  Log Sample of Successful Request on Behalf of Another Identity

   Here is a request made by an authenticated user on behalf of another
   identity.  Note that the parameter "eid" is added after the user
   authentication takes place and the effective user identity is
   validated.  This parameter is included in each subsequent message.

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-
   00000A000152"][transit client="172.16.1.82"] Initializing request to
   /api/example:instance/1

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
   User authentication successful for 1:123

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   eid="2:456" provider="example.com" rid="1:123"][transit
   client="172.16.1.82"] User impersonation successful for 1:123 to
   2:456

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   eid="2:456" provider="example.com" rid="1:123"][transit
   client="172.16.1.82"] "172.16.1.82" - "-" ""GET /api/
   example:instance/1 HTTP/1.1"" 200 119 -- performed in 2 ms


6.  Security Considerations

   In addition to general syslog security considerations discussed in
   RFC 5424 [RFC5424], he information contained in these messages may
   provide information about how services interact, user identities, and
   other information about network or service inventory.

   Users should not have access to these messages if they would not have
   access to this information through other authenticated means.


7.  IANA Considerations






Golovinsky, et al.       Expires April 12, 2013                 [Page 9]

Internet-Draft                  Cloud Log                   October 2012


7.1.  SD-IDs

   ANA is requested to register the syslog structured data element SD-
   IDs and PARAM-NAMEs shown below:

                   +---------+------------+-----------+
                   | SD-ID   | PARAM-NAME |           |
                   +---------+------------+-----------+
                   | context |            | OPTIONAL  |
                   |         | aid        | MANDATORY |
                   |         | eid        | OPTIONAL  |
                   |         | provider   | OPTIONAL  |
                   |         | rid        | OPTIONAL  |
                   | transit |            | OPTIONAL  |
                   |         | client     | MANDATORY |
                   |         | gw         | OPTIONAL  |
                   +---------+------------+-----------+

                                  Table 1


8.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", RFC 2119.

   [RFC5424]  Gerhards, R., "The Syslog Protocol", RFC 5424.


Authors' Addresses

   Gene Golovinsky
   Redwood City, CA  94065
   US

   Phone: (650)8016259
   Email: ggolovinsky@qualys.com
   URI:   NA


   Sam Johnston


   Phone:
   Email: samj@samj.net






Golovinsky, et al.       Expires April 12, 2013                [Page 10]

Internet-Draft                  Cloud Log                   October 2012


   Dominik Birk
   Ruhr University Bochum; Horst Goertz Institute for IT Security
   Bochum,   44780
   Germany

   Phone: +49(0)234-32-26740
   Email: dominik.birk@rub.de
   URI:











































Golovinsky, et al.       Expires April 12, 2013                [Page 11]