1.2.  GE Multi Cluster Management

The natural candidate for proving the concept of service domain management (as implemented in Hedeby) is the management of multiple Sun Grid Engine clusters (Sun Grid Engine is abbreviated as SGE or GE in the following paragraphs). Thanks to the in-house development of GE, we were able to deliver an implementation that manages two or more GE clusters simultaneously.

1.2.1.  Components

Some of the Hedeby components (or building blocks) possess special functionality or restrictions related to the management of GE clusters. The following paragraphs describe these special cases in greater detail.

1.2.1.1.  Resources

Hedeby defines a resource as an aggregation of properties (each property is a key/value pair that represents a required attribute of the resource; see Section 1.1.1, “Resource”). Currently only one resource type (as defined by Hedeby) can be used with a GE that is managed by Hedeby: the "HOST" type.

A resource of type "HOST" has properties that describe a host computer (a machine that communicates via a network). See Section 2.2.4.1, “Resource Definition Overview” for more details about the current host resource implementation.

From the GE perspective, a host resource can be a computer that runs an instance of the GE Queue Master, an instance of the GE Execution Daemon, or simply any computer (node) that is part of a GE cluster. Thanks to Hedeby, a host resource for a managed GE can be almost any computer: if there is a need (expressed by the GE cluster instance) and a possibility (evaluated by the Agent/Executor running on the host), Hedeby can add that host to the GE cluster.
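
The idea of a resource as a typed bag of key/value properties can be illustrated with a minimal sketch. The class name, the `matches` helper and the property keys (`hostname`, `arch`) are assumptions made for illustration; they are not Hedeby's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Resource:
    """A resource as an aggregation of key/value properties (hypothetical model)."""
    resource_type: str
    properties: Dict[str, str] = field(default_factory=dict)

    def matches(self, required: Dict[str, str]) -> bool:
        """True if every required key is present with the requested value."""
        return all(self.properties.get(k) == v for k, v in required.items())


# Property names and values below are illustrative, not Hedeby's real keys.
host = Resource("HOST", {"hostname": "node01", "arch": "sol-sparc64"})
```

A request for a host of a given architecture could then be checked with `host.matches({"arch": "sol-sparc64"})`.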

The way resources are managed (hosts added to or removed from a GE cluster, daemons started or stopped) is determined by service level objectives (SLOs). Hedeby supports several SLOs (see Section 1.2.1.5, “SLO/SLA”).

Sun GE cluster resources

1.2.1.2.  GE Service Adapter

The component responsible for communication between Hedeby and a managed GE is the GE service adapter, which acts as a bridge between the Resource Provider and the managed GE cluster. In general, the GE service adapter is a special kind of driver that translates the "generic" management functionality needed by Hedeby into the "native" functionality of GE. For example, it translates a "remove resource" message originating from the Resource Provider into a set of GE commands that result in:

  1. Stopping a GE daemon (if running) on the host resource.

  2. Removing the host (node) configuration from GE.
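
The translation of a generic "remove resource" request into the two GE-level steps above can be sketched as follows. The class and the `is_daemon_running` probe are hypothetical; the sketch only records the steps it would take and does not talk to a real GE.

```python
from typing import Callable, List


class GridEngineAdapterSketch:
    """Translates a generic 'remove resource' request into GE-level steps."""

    def __init__(self, is_daemon_running: Callable[[str], bool]):
        self._is_daemon_running = is_daemon_running  # hypothetical probe
        self.log: List[str] = []  # records steps instead of touching a real GE

    def remove_resource(self, host: str) -> List[str]:
        steps = []
        # Step 1 applies only if a daemon is actually running on the host.
        if self._is_daemon_running(host):
            steps.append(f"stop execution daemon on {host}")
        # Step 2 is always required.
        steps.append(f"remove host configuration for {host}")
        self.log.extend(steps)
        return steps


adapter = GridEngineAdapterSketch(is_daemon_running=lambda host: host == "node01")
```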

The GE service adapter is also responsible for providing the information needed by the Resource Provider. It reports CPU load values, memory consumption, the number of pending jobs and so on, that is, everything that helps the Resource Provider make qualified decisions to satisfy the needs of the managed services.

Communication between service adapter and GE

The current implementation of the GE service adapter uses a custom Java interface to obtain vital information from the underlying GE (this API is known as JGDI).

There is no 1:1 or similar relation between the GE service adapter and individual GE components (queue master, execution daemon, shadow daemon etc.). Instead, the GE service adapter uses a combination of GE components, which together provide the information and functionality needed to manage a GE cluster.

The responsibilities of the GE service adapter currently include:

Table 1.1. GE service adapter functionality

Feature | Details
Installing a GE execution daemon on a host resource | Used when adding a new host resource to the GE service. See Section 1.2.1.4, “Executor” and Section 1.2.2.1, “Use cases”.
Uninstalling a GE execution daemon from a host resource | Used when removing a host resource from the GE service. See Section 1.2.1.4, “Executor” and Section 1.2.2.1, “Use cases”.
Adding a resource to GE | Only adding a new host resource to GE is currently supported. See Section 1.2.1.4, “Executor” and Section 1.2.2.1, “Use cases”.
Removing a resource from GE | Only removing a host resource from GE is currently supported. See Section 1.2.1.4, “Executor” and Section 1.2.2.1, “Use cases”.
Listening for host related events dispatched from GE (host added, removed, modified) | Resource management in the GE service adapter is asynchronous. The adapter does not wait until a host resource management task finishes. Instead, it listens for the event that GE itself dispatches when the task is finished (host added to GE, removed from GE or modified).
Computing GE needs | The GE service adapter periodically gathers the data needed to evaluate whether the configured SLOs are met. See Section 1.2.1.5, “SLO/SLA” for more details about SLOs.
Dispatching service related events to service adapter | All changes in the GE characteristics are reported as events back to the service adapter. This covers all host related events and SLO non-compliance events.
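
The asynchronous handling of host operations described in the table can be sketched roughly as follows. This is an illustration under stated assumptions: the class, its methods and the event names `HOST_ADDED` and `HOST_REMOVED` are invented here and do not reflect the JGDI event API.

```python
from typing import Dict, List, Tuple


class AsyncHostTracker:
    """Tracks host operations that complete only when GE reports an event."""

    def __init__(self) -> None:
        self.pending: Dict[str, str] = {}           # host -> expected event type
        self.completed: List[Tuple[str, str]] = []  # (host, event type)

    def request_add(self, host: str) -> None:
        # Returns immediately; the work is confirmed later by an event.
        self.pending[host] = "HOST_ADDED"

    def request_remove(self, host: str) -> None:
        self.pending[host] = "HOST_REMOVED"

    def on_ge_event(self, event_type: str, host: str) -> bool:
        """Called for every host event from GE; True if it closed a pending task."""
        if self.pending.get(host) == event_type:
            del self.pending[host]
            self.completed.append((host, event_type))
            return True
        return False  # unsolicited event, e.g. a manual change by an administrator


tracker = AsyncHostTracker()
tracker.request_add("node01")
```

The request methods never block; completion is recorded only in `on_ge_event`, which is also the place where unsolicited changes (a node added manually, for example) would be noticed.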

Service adapter concepts are described in greater detail in Section 1.1.2, “Service”. More information about configuration can be found in Section 2.4.3.3, “Service Adapters, Grid Engine Adapter”.

1.2.1.3.  Resource Provider

Although the Resource Provider (RP in the following paragraphs) can be considered a component that is independent of other components, its functionality has a direct impact on managed GE clusters. With regard to a GE managed by Hedeby, the RP is responsible for the following actions:

Table 1.2. GE related Resource Provider functionality

Feature | Details
Listen for normalized SLO non-compliance events from a managed GE cluster | An SLO non-compliance event is dispatched by the GE service adapter when the managed GE cluster needs a resource because one of its requirements is not met. These requirements (called service level objectives) can be, for example, a minimum number of nodes or a maximum number of pending jobs. Whether an SLO is met is evaluated by the GE service adapter. For a complete listing of supported SLOs see Section 1.2.1.5, “SLO/SLA”.
Locate resources to address reported SLO non-compliance events | If any of the managed GE clusters expresses a need for a resource, the Resource Provider looks among the known resources for a "free" one. For the Resource Provider, the known resources are all resources gathered in the Spare Pool and all resources assigned to other managed GE clusters (or other services). In practice, the Resource Provider looks for any resource that can be taken and assigned to the GE cluster that expressed the need; the GE clusters only provide a list of their nodes. See Section 1.1.3, “Resource Provider” for more details about the Resource Provider and Section 1.2.2.1, “Use cases” for more details about satisfying the needs of GE.
Order GE to add/remove resources | Hedeby resource management of a GE cluster shows up as starting/stopping GE daemons on hosts and adding/removing hosts to/from the GE cluster (cell). More details can be found in Section 1.2.1.2, “GE Service Adapter” and Section 1.2.2.1, “Use cases”.
Listen for error notifications from a GE cluster | If the components that handle GE cluster management behave unexpectedly, an error notification is dispatched (originating from the executor or the service adapter). Typical cases are problems with starting the GE daemons or with adding a new node configuration.

1.2.1.4.  Executor

The GE adapter uses the executor for the following actions:

Table 1.3. GE related Executor functionality

Feature | Details
Installing GE execution daemon on a host resource | Used when there is no execd running on the host resource and the resource was assigned to become part of a GE cluster that is managed by Hedeby. See Section 1.2.2.1, “Use cases”.
Uninstalling GE execution daemon on a host resource | Used when there is an execd running on the host resource and the resource was assigned to be removed from a GE cluster that is managed by Hedeby. See Section 1.2.2.1, “Use cases”.

The executor commands that handle GE related functionality (installation and uninstallation of execd, etc.) are currently implemented as shell commands or shell scripts (GE has a high quality command line interface that makes this possible).

Service Adapter and executor on a host resource

Once the executor daemon is installed on the host resource, the GE service adapter uses the GE Java API to retrieve information from the resource (or to pass information to it).

Service Adapter and executor on a managed host resource

See Section 1.1.5, “Executor” for the general concept of the executor and Section 2.4.3.4, “Executors” for configuration details.

1.2.1.5. SLO/SLA

According to the Hedeby project specification, an SLO defines the acceptable state of a given service attribute (with appropriate metrics). A very simple example of an SLO for a GE service is "the number of pending jobs in a GE cluster should be less than 100". More about SLOs can be found in Section 1.1.2.4, “Service Level Objectives (SLO)”.

The original plan was to support the following SLOs for managed grids:

Table 1.4. Originally planned SLOs

SLO | Details
Minimum number of hosts | Sets the minimum number of hosts
Minimum number of hosts of a given architecture | Sets the minimum number of hosts of a certain architecture
Maximum number of pending jobs | Sets the maximum number of pending jobs
Maximum number of pending jobs over a given time period | Sets the maximum number of pending jobs over a specified time period
Maximum number of pending jobs for a given architecture | Sets the maximum number of pending jobs for hosts of a certain architecture
Maximum average number of pending jobs over a given period for a given architecture | Sets the maximum average number of pending jobs for hosts of a certain architecture over a specified time period
Minimum number of free slots | Sets the minimum number of free slots
Minimum average number of free slots over a given time period | Sets the minimum average number of free slots over a specified time period
Minimum average number of free slots for a given architecture | Sets the minimum average number of free slots for hosts of a certain architecture
Minimum average number of free slots over a given time period for a given architecture | Sets the minimum average number of free slots for hosts of a certain architecture over a specified time period
Fixed host | Defines a set of static hosts

Because priority has been given to other core Hedeby features, only a subset of the SLOs listed above has been implemented so far. The currently supported SLOs are summarized in the next table (with short comments).

Table 1.5. Currently implemented SLOs

SLO | Details
Minimum number of hosts | Sets the minimum number of hosts. The current implementation allows a filter on hardware architecture, so this SLO also implements the "Minimum number of hosts of a given architecture" SLO.
Maximum number of pending jobs | Sets the maximum number of pending jobs. The current implementation allows a filter on hardware architecture, so this SLO also implements the "Maximum number of pending jobs for a given architecture" SLO.
Fixed host | Defines a set of static hosts

Each of the currently implemented SLOs has several attributes that can be customized. The following tables list them with short explanations.

Table 1.6. Minimum number of hosts

Attribute | Details
Attribute | The GE attribute name to monitor
Value | The value the GE attribute needs to have to be part of the SLO
Urgency | The urgency of the resources that are part of the minimum amount
Min | The minimum value for the specified attribute/value pair
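
How the four attributes of this SLO could work together is shown in the following sketch. It is an illustration only: the class, the shape of the returned "need" and the sample architecture strings are assumptions, not the Hedeby SLO implementation or its configuration format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MinHostsSlo:
    attribute: str  # GE attribute name to monitor
    value: str      # value a host must have to count toward the SLO
    urgency: int    # urgency reported together with the need
    minimum: int    # minimum number of matching hosts ("Min" in the table)

    def evaluate(self, hosts: List[Dict[str, str]]) -> Optional[Dict[str, int]]:
        """Return a need (missing count and urgency) or None if the SLO is met."""
        matching = sum(1 for h in hosts if h.get(self.attribute) == self.value)
        missing = self.minimum - matching
        if missing <= 0:
            return None
        return {"quantity": missing, "urgency": self.urgency}


# Two hosts are known, only one of which has the requested architecture.
hosts = [{"arch": "sol-sparc64"}, {"arch": "lx24-amd64"}]
slo = MinHostsSlo(attribute="arch", value="sol-sparc64", urgency=50, minimum=3)
```

With these sample values, `slo.evaluate(hosts)` reports that two more matching hosts are needed, together with the configured urgency.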

Table 1.7. Maximum number of pending jobs

Attribute | Details
Attribute | The GE attribute name to monitor
Value | The value the GE attribute needs to have to be part of the SLO
Urgency | The urgency the resources should have that are part of the min amount
Max | The maximum number of pending jobs for the specified attribute/value pair

Table 1.8. Fixed host

Attribute | Details
Host | Static resource identifier: the hostname of the static host

1.2.2.  Primary goals

The primary goal of Hedeby's support for GE was to deliver an automatic resource management tool for GE clusters. The tool should allow unattended configuration of a GE cluster in order to deliver optimal job throughput and utilization of the resources assigned to the cluster. This chapter describes how Hedeby manages GE clusters, and covers its performance and usability.

1.2.2.1.  Use cases

Because only host resources are currently supported, the core Hedeby functionality related to GE cluster management can be described by a relatively short set of use cases. These use cases are compiled in the following sections.

1.2.2.1.1.  GE cluster without Hedeby

A single GE cluster instance (regardless of the total number of nodes in the cluster) that is not managed by Hedeby is the simplest case. Because the GE cluster instance is not managed by Hedeby, Hedeby does nothing. All resources (the only supported resource type is "HOST", therefore resources equal nodes) are statically assigned to the GE cluster.

Sun GE cluster managed by administrator

Only the GE administrator can add nodes to or remove nodes from the cluster, or start and stop GE daemons on nodes. To increase the performance of a GE cluster, the administrator would have to perform the following steps.

Procedure 1.1.  Increasing the performance of a GE cluster

  1. Add the node configuration to the GE cluster.

    Administrator would use GE administrator interface (qconf or qmon).

  2. Start the grid daemon (execution daemon) on the node.

    Administrator would use shell script to start the daemon (sge_execd).

A resource added by GE administrator - sequence diagram

A similar procedure is needed when the administrator has to remove a node from a GE cluster (to use it in a different GE cluster, for example).

Procedure 1.2.  Decreasing the performance of a GE cluster

  1. Stop the grid daemon (execution daemon) on the node.

    Administrator would use shell command to kill the daemon (pkill sge_execd or qconf -ke).

  2. Remove the node configuration from the GE cluster.

    Administrator would use GE administrator interface (qconf or qmon).

A resource removed by GE administrator - sequence diagram

1.2.2.1.2.  Two GE clusters without Hedeby

The follow-up to the previous use case is the situation with two (or three, four, many) GE cluster instances (again, regardless of the total number of nodes in each cluster), none of which is managed by Hedeby. Again, because none of the instances is managed by Hedeby, Hedeby does nothing in this use case. All resources (the only supported resource type is "HOST", therefore resources are GE nodes) are statically assigned to the corresponding GE cluster.

Two Sun GE clusters managed by administrator

Only the GE administrator can add nodes to or remove nodes from a cluster, or start and stop GE daemons on nodes. To increase the performance of GE cluster #2 using a node taken from GE cluster #1 (assuming that cluster #1 has spare capacity), the administrator would have to perform the following steps.

Procedure 1.3.  Increasing the performance of a GE cluster #2 using a spare resource from a GE cluster #1

  1. Stop the grid daemon (execution daemon) on a node that is part of cluster #1.

    Administrator would use shell command to kill the daemon (pkill sge_execd or qconf -ke).

  2. Remove the node configuration from the GE cluster #1.

    Administrator would use GE administrator interface (qconf or qmon).

  3. Add the node configuration to the GE cluster #2.

    Administrator would use GE administrator interface (qconf or qmon).

  4. Start the grid daemon (execution daemon) on the node that is now part of the GE cluster #2.

    Administrator would use shell script to start the daemon (sge_execd).
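
The four manual steps of Procedure 1.3 can be written down as a dry-run plan. The following sketch only builds the command lists and executes nothing. Several points are assumptions: that the two clusters are distinguished by the `SGE_CELL` environment variable, that the host configuration is loaded from a file with `qconf -Ae`, and that the host is removed with `qconf -de`; these flags are given as understood from the `qconf` manual page and should be verified against the installed GE version. The function name and the file path are hypothetical.

```python
from typing import Dict, List, Tuple

Step = Tuple[Dict[str, str], List[str]]  # (environment overrides, argv)


def plan_node_move(host: str, source_cell: str, target_cell: str,
                   host_config_file: str) -> List[Step]:
    """Return the manual steps of Procedure 1.3 as (env, argv) pairs; nothing is executed."""
    src = {"SGE_CELL": source_cell}   # assumption: clusters differ by cell name
    dst = {"SGE_CELL": target_cell}
    return [
        (src, ["qconf", "-ke", host]),              # 1. stop execd in cluster #1
        (src, ["qconf", "-de", host]),              # 2. remove host config from cluster #1
        (dst, ["qconf", "-Ae", host_config_file]),  # 3. add host config to cluster #2
        (dst, ["sge_execd"]),                       # 4. start execd (run on the node itself)
    ]


steps = plan_node_move("node07", "cell1", "cell2", "/tmp/node07.conf")
```

Keeping the plan as data makes the ordering explicit: the daemon is stopped before the configuration is removed, and the configuration is added to the second cluster before the daemon is started again.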

A resource shuffled between two grids by GE administrator - sequence diagram

1.2.2.1.3.  Two GE clusters with Hedeby

The logical "upgrade" of the previous use case is the situation where those GE cluster instances are managed by Hedeby. From Hedeby's point of view, the GE clusters are managed services. Each resource (the only supported resource type is "HOST", therefore a resource equals a node) is dynamically assigned to the GE cluster that expresses a need for an additional resource (the only exception being static/fixed hosts). Hedeby is therefore responsible for automatically adding nodes to or removing nodes from the relevant cluster, and for starting or stopping the GE daemons on those nodes.

Two Sun GE clusters managed by Hedeby

If a managed GE cluster #2 expresses a need for a resource and there is another managed GE cluster #1, Hedeby could take the following set of actions.

Procedure 1.4.  Hedeby satisfies need of a managed GE cluster #2

  1. SLO of the GE cluster #2 is not met

    Service Adapter for the GE cluster #2 gathers information and finds that at least one of the SLOs configured for the GE cluster #2 is not met.

  2. Resource provider receives non-satisfied SLO event from GE cluster #2

    Resource provider looks for any resource that can be taken and assigned to the GE cluster #2. Resource provider considers any resource that:

    • is assigned to a service (GE cluster) with lower priority than the GE cluster #2

    • is not marked as fixed resource

    • is owned by a Spare Pool

    Depending on the details of the need, some of the resources are left out (if they do not match the needed host architecture, operating system version etc.).

  3. Resource provider finds an appropriate resource.

    (Assuming that the only available resource matching expressed need for a resource is assigned to a GE cluster #1) Resource provider asks the related service container to release the resource.

  4. The GE cluster #1 removes the node.

    The service container dispatches a message to its service adapter saying that the resource should be released from the managed GE cluster #1. The GE service adapter contacts the executor running on the node and orders it to stop the grid daemon (execution daemon). The service adapter then removes the node configuration from the GE cluster #1.

  5. Resource provider assigns the resource to GE cluster #2.

    Once the resource (node) is released from GE cluster #1, it is temporarily marked as "RESERVED", which reserves it for assignment to GE cluster #2. Resource provider asks the service container related to GE cluster #2 to add the resource. The service container delegates the message to its service adapter, which first adds a configuration for the node to the GE cluster #2. The service adapter then again contacts the executor running on the node and orders it to start the GE daemon (execution daemon); this time the node becomes part of the GE cluster #2.
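
The candidate selection in step 2 of the procedure above can be sketched as follows. The sketch makes several assumptions that the text does not settle: a resource is eligible if it is not fixed and is either owned by the Spare Pool or assigned to a service with lower priority; a lower number means lower priority; and Spare Pool resources are preferred because taking them disturbs no running service. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

SPARE_POOL = "spare_pool"


@dataclass
class HostResource:
    name: str
    owner: str          # service currently holding the resource
    fixed: bool = False
    arch: str = ""


def pick_resource(resources: List[HostResource], priorities: Dict[str, int],
                  requester: str, arch: Optional[str] = None) -> Optional[HostResource]:
    """Pick a resource for `requester`, preferring the Spare Pool (an assumption)."""
    def eligible(r: HostResource) -> bool:
        if r.fixed or r.owner == requester:
            return False
        if arch is not None and r.arch != arch:
            return False  # left out: does not match the details of the need
        return r.owner == SPARE_POOL or priorities[r.owner] < priorities[requester]

    candidates = [r for r in resources if eligible(r)]
    # Stable sort: Spare Pool resources (key False) come before all others.
    candidates.sort(key=lambda r: r.owner != SPARE_POOL)
    return candidates[0] if candidates else None


prio = {"ge1": 10, "ge2": 50}
pool = [
    HostResource("a", "ge1", fixed=True, arch="x"),
    HostResource("b", "ge1", arch="x"),
    HostResource("c", SPARE_POOL, arch="y"),
]
```

With this data, a request from `ge2` for architecture `x` yields host `b` (host `a` is fixed), while a request without an architecture constraint yields the Spare Pool host `c`.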

Although the above procedure looks rather complicated, it is worth noting that it does not involve a single interaction with a human administrator.

A resource shuffled between two grids by Hedeby - sequence diagram

Moving resources back and forth between managed GE clusters is not always enough to satisfy their needs. Sometimes it is more convenient to add new resources to the managed grids (a typical example is buying new hardware). With Hedeby there is no need to first add such a resource manually to one of the managed GE clusters. It is possible to add the resource to the Spare Pool and let Hedeby decide what to do with it.

Procedure 1.5.  Adding a resource to Hedeby Spare Pool

  1. Administrator starts the executor on the resource

    A running executor is required to manage any host. The executor can be deployed on the resource in several ways (see Section 2.1, “Install Hedeby”).

  2. Administrator adds the resource to the Spare Pool

    The administrator would use the Hedeby administrator's interface, typically the "sdmadm add_resource" shell command. Once the command finishes, the resource is available to be assigned to any of the managed GE clusters.

Adding a host resource to Spare Pool by GE administrator - sequence diagram

1.2.2.2.  Target performance

Hedeby is intended to manage two or more GE clusters (although in combination with the Spare Pool it can also be used to manage a single GE cluster), and therefore it has to scale and perform well. Test runs with 3 managed GE clusters and approximately 3000 host resources showed very small memory and CPU consumption of the Hedeby components, while the overall response time of the Hedeby system was very fast.

The following table lists basic actions and the time Hedeby needs to perform them.

Table 1.9. Performance and response times

Action | Details | Time (includes CLI execution overhead)
Startup | JVM running CS component, JVM running RP and CA component, all JVMs running on one host | 7 s
Assigning the GE cluster to be managed | This is an automatic action | Less than 1 s
Adding a new host resource to Spare Pool | - | 2 s
Removing a host resource from Spare Pool | - | 2 s
Adding a host resource to a GE cluster | GE adapter performs automatic execution daemon installation on the host | 4 s
Removing a host resource from a GE cluster | GE adapter performs automatic execution daemon uninstallation on the host | 4 s
Showing list of managed services | System was managing 3 GE clusters and 1 Spare Pool | Less than 3 s
Showing a resource's details | - | 2 s
Restarting the Resource Provider | - | 8 s
Restarting the GE service | - | 3 s
Shutdown | JVM running CS component, JVM running RP, Spare Pool and Reporter, JVM running Executor and CA component, all JVMs running on the same host | 12 s
Shutdown | JVM running GE service and JVM running Executor, both JVMs running on the same host | 3 s
Shutdown | 14 JVMs running various components on 9 hosts (on some hosts more than 1 JVM was running) | 16 s

The data presented in this section were captured using the following configuration:

Table 1.10. Hedeby testing system setup

Subcomponent | Details
Hedeby | Managing 2 GE clusters, both clusters having 2 execution daemons, 1 queue master and 1 shadow daemon. Hedeby used 2 JVMs: the first one was running all components except an executor, the second one was running just an executor.
GE clusters | Each component (2 execution daemons, queue master, scheduler) was running in a separate Solaris 10 U2 zone on a machine equipped with 2 x SPARC at 1.5 GHz and 8 GB RAM.
GE version | Sun GE 6.2 maintrunk was used for testing
Java | JDK 1.5_011 was used

1.2.2.3.  Usability

The very first real-life application of Hedeby will be its use as an automatic resource manager for several Sun GE clusters. This premise has driven the implementation effort and is the reason why some features have been given priority (while others are not implemented yet).

Here is a brief summary of the aspects that affect the usability of Hedeby (in relation to Sun GE management), with short comments.

Table 1.11. Hedeby usability summary

Feature | Details
Adding a resource to a GE cluster | Either create a new host resource using the Hedeby CLI or add a new node to the GE cluster manually (Hedeby will find out that a new node was added to the managed GE cluster).
Removing a resource from a GE cluster | Either remove a host resource using the Hedeby CLI or remove a node from the GE cluster manually (Hedeby will find out that a node was removed from the managed GE cluster).
Supported GE resources | Only "host" resources are supported. Support for general GE resources (Complex Types) is planned.
Policy support | Currently only a priority policy is available for the decision making process. Other policy types will be introduced in later releases of Hedeby.
Static resources | Any GE node that is running a Queue master instance is automatically marked as static. In addition, any GE node that is running an Exec daemon but not an Executor (Hedeby component) is marked as static, too.
Resource error state | A resource can be reset either using the Hedeby CLI or by manually removing the node from the GE cluster and adding it again (Hedeby will find out that the node was removed from the managed GE cluster and added again).
SLO definition | SLOs can be created and modified using the CLI.
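
The "Static resources" rule from Table 1.11 is a simple predicate over what runs on a node. A minimal sketch (the function and parameter names are hypothetical):

```python
def is_static(runs_qmaster: bool, runs_execd: bool, runs_executor: bool) -> bool:
    """Apply the 'Static resources' rule from Table 1.11 to one GE node."""
    if runs_qmaster:
        return True  # a Queue master node is always static
    # An Exec daemon without a Hedeby Executor cannot be managed, so it is static.
    return runs_execd and not runs_executor
```

For example, a node that runs an Exec daemon together with an Executor is not static, so Hedeby may move it between clusters.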

1.2.3.  Requirements

1.2.3.1.  OS and HW platforms

As a Java application, Hedeby is able to run on virtually any platform that complies with Section 1.2.3.2, “Java”.

Hedeby does not rely on any special hardware platform (although it contains a few lines of native code). In general, it aims to support all platforms that are supported by GE 6.2.

Table 1.12. Tested OS and HW platforms

Operating system | Status
Solaris 9 (with applied updates) | Not tested yet
Solaris 10 (with applied updates) | Tested, works fine
Linux kernel 2.4 | Tested, works fine
Linux kernel 2.6 | Tested, works fine

Table 1.13. Known OS and HW platform limitations

Operating system | Status
Windows (any version) | Windows support is planned for later releases of Hedeby.
MacOS | Not tested yet.
Linux kernel xxx running on Solaris Sparc | Not tested yet.

1.2.3.2. Java

Hedeby makes heavy use of Java features that were introduced in JDK 1.5 (generics, concurrency utilities etc.), which makes any Java version older than 1.5 unusable for it. JDK 6 was released during Hedeby's implementation phase, and Hedeby works fine with Java 6, too.

Table 1.14. Required Java

Version | Details
pre JRE 1.5 | Not supported
Sun JRE 1.5 | Tested, recommended to use
JRE 1.5 provided by other vendors (not Sun) | Not tested, should be working.
JRE 1.6 | Tested
Non Sun JRE 1.6 | Tested, minor issues. Not recommended to use yet.
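
The table above can be read as a lookup on the JRE version and vendor. The following sketch encodes it; the function name, the `sun_jre` flag and the handling of versions newer than 1.6 are assumptions, and the parser only understands version strings of the form `1.x...`.

```python
def java_support_status(version: str, sun_jre: bool = True) -> str:
    """Map a JRE version string such as '1.5.0_11' to the status in Table 1.14."""
    minor = int(version.split(".")[1])  # '1.5.0_11' -> 5
    if minor < 5:
        return "Not supported"
    if minor == 5:
        return "Tested, recommended" if sun_jre else "Not tested, should work"
    if minor == 6:
        return "Tested" if sun_jre else "Tested, minor issues, not recommended yet"
    return "Not covered by the table"
```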

1.2.3.3.  GE

To manage GE, Hedeby uses a set of GE features that were introduced only in the latest updates to Sun GE, as well as some features that are part of the Sun GE development maintrunk (not yet available for public use). The main restriction applies to JGDI (a Java wrapper for GDI), a private Java GE API that provides vital functionality for Hedeby and is not part of any Sun GE distribution.

Despite the restrictions stated in the paragraph above, support for most Sun GE releases is planned with the release of Hedeby reloaded.

Table 1.15. Supported GE releases

Version | Details
Sun GE 6.2 | Tested, JGDI over JMX required
Releases other than 6.2 | Not supported