Hedeby can be installed in two ways, which we call preferences. Preferences can be set for a particular user (USER preferences) or for the whole system platform (SYSTEM preferences); an installation with SYSTEM preferences has to be performed by the superuser. SYSTEM preferences make features such as autostart/SMF support available for hosts within the Hedeby system. Hedeby uses its own implementation of Java preferences. SYSTEM preferences are located in /etc/sdm/bootstrap/<system_name>. USER preferences are located in <USER_HOME>/.sdm/bootstrap/<system_name>. More information about Java preferences can be found in the Java documentation. The file structure for preferences looks like this:
<system_name> --\
   |
   |-- hosts --\
   |     |-- <host_name> --\
   |     |     |-- smf --\
   |     |     |     |
   |     |     |     \---- prefs.properties
   |     |     |
   |     |     \---- prefs.properties
   |     |
   |     \-- prefs.properties
   |
   \-- prefs.properties
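The two preference locations described above can be sketched as a small path helper. This is only an illustration of the documented layout, not Hedeby code: the function names `bootstrap_dir` and `host_prefs_file` and the example values `mySystem`, `/home/alice` and `foo` are hypothetical.

```python
from pathlib import PurePosixPath


def bootstrap_dir(system_name, scope, user_home=""):
    """Return the preferences directory for a system, following the documented layout."""
    if scope == "SYSTEM":
        return PurePosixPath("/etc/sdm/bootstrap") / system_name
    if scope == "USER":
        if not user_home:
            raise ValueError("USER preferences need the user's home directory")
        return PurePosixPath(user_home) / ".sdm" / "bootstrap" / system_name
    raise ValueError(f"unknown preferences type: {scope!r}")


def host_prefs_file(base, host_name):
    """Return the host-specific prefs.properties path shown in the tree above."""
    return base / "hosts" / host_name / "prefs.properties"


system_base = bootstrap_dir("mySystem", "SYSTEM")
user_base = bootstrap_dir("mySystem", "USER", "/home/alice")
print(system_base)
print(user_base)
print(host_prefs_file(system_base, "foo"))
```

Pure path objects are used so that the sketch never touches the file system.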
Example content of the prefs.properties file in the <system_name> directory. It holds the main bootstrap information about the system:
version=0.1
localspool=/var/spool/sdm/localspool/
csInfo=foo\:2324
smf=true
ssl_disable=false
dist=/net/foo/sdm_dist
auto_start=false
Table 2.5. Description of the file content:
Example content of the prefs.properties file in the <system_name>/host/<host_name> directory. It holds the host-specific information about the system:
localspool=/var/spool/sdm/localspool/
master=false
dist=/net/foo/sdm_dist
Table 2.6. Description of the file prefs.properties content. File located in <system_name>/host/<host_name>
Example content of the prefs.properties file in the smf directory. It holds the SMF information about the system:
rp_vm=svc\:/application/management/sdm/mySystem/jvm\:rp_vm
executor_vm=svc\:/application/management/sdm/mySystem/jvm\:executor_vm
Table 2.7. Description of the file prefs.properties content. File located in <system_name>/host/<host_name>/smf
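The prefs.properties examples above use Java properties syntax, in which a colon inside a value is written as `\:`. The following is a minimal sketch of reading such a file and splitting csInfo into host and port. It is a simplified reader written for illustration (the name `parse_prefs` is hypothetical), not a full implementation of the Java properties format and not Hedeby's own preferences code.

```python
import re

SAMPLE = r"""
version=0.1
localspool=/var/spool/sdm/localspool/
csInfo=foo\:2324
smf=true
ssl_disable=false
dist=/net/foo/sdm_dist
auto_start=false
"""


def parse_prefs(text):
    """Parse simple key=value lines and drop the backslash of escapes such as '\\:'."""
    prefs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line; ignored in this sketch
        prefs[key.strip()] = re.sub(r"\\(.)", r"\1", value.strip())
    return prefs


prefs = parse_prefs(SAMPLE)
cs_host, cs_port = prefs["csInfo"].rsplit(":", 1)
print(cs_host, int(cs_port))
```

Splitting on the last colon keeps the sketch usable if the host part itself ever contains a colon.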
This directory contains the Hedeby system installation files: binaries, installation scripts, libraries, manuals and state files that are used by the Installer.
sdm_dist --\
   |
   |-- bin --\
   |     \-- sdmadm
   |
   |-- util --\
   |     |-- arch
   |     |-- arch_variables
   |     |-- sdmsmf.sh
   |     |-- smf_sdmsvc
   |     |-- supportRc.sh
   |     |
   |     |-- sdmST --\
   |     |     |-- sdm_st
   |     |     \-- st_settings.sh
   |     |
   |     |-- templates --\
   |           |-- sdm.env.template
   |           |-- jaas.config.template
   |           |-- java.policy.template
   |           |-- sdmsvc
   |           |-- sdm_template.xml
   |           |-- sdm_template_masterhost.xml
   |           |-- start_sh.template
   |           |-- logging.properties.template
   |           \-- ge-adapter --\
   |                 |-- install_execd.conf
   |                 |-- install_execd.sh
   |                 |-- uninstall_execd.conf
   |                 |-- uninstall_execd.sh
   |
   |-- lib --\
   |     |-- ext --\
   |     |     |-- endorsed --\
   |     |     |     \-- *.jar
   |     |     \-- *.jar
   |     |
   |     |-- <PLATFORM_ARCHITECTURE> --\
   |     |     \-- libplatform.so
   |     |-- sdm-common.jar
   |     |-- sdm-starter.jar
   |     |-- sdm-ge-adapter.jar
   |     |-- sdm-ge-adapter-impl.jar
   |     |-- sdm-security.jar
   |     \-- sdm-security-impl.jar
   |
   \-- man --\
         \-- man1 --\
               |-- sdmadm.1
Table 2.8. Description of Dist directory content:
The local spool directory has to be specified on the local file system of each managed host, and it can be different for each host. It stores information about the running components, but only for the JVMs on that host, together with the logs of those JVMs. It is also the place for spooling local data and for security information such as certificates and keystores.
localSpool --\
   |-- log --\
   |     \-- log files
   |
   |-- run --\
   |     \-- files with the pids of running components
   |         on local host
   |
   |-- security --\
   |     |-- ca --\
   |     |     \ (files for Grid Engine CA)
   |     |
   |     |-- deamons --\
   |     |     \-- keystores for JVMs
   |     |
   |     \-- users --\
   |           \-- keystores for Hedeby users
   |
   |-- spool --\
   |     \-- cs spool directory for configuration
   |     \-- ... spool directories for Hedeby components
   |
   |-- logging.properties
   |
   |
   \-- tmp --\
         \-- tmp directories for Hedeby components
Table 2.9. Description of Local spool directory content:
The typical starting point for configuring the Hedeby system is to define the Java virtual machines (JVMs) in which the components should run. This also includes defining on which host a component should run. When Hedeby components are started on the local host (for example with the sdmadm start command), the global configuration is used to find out which components have to be started in which JVMs on that host. One important task in the JVM configuration is to define the environment variables for the JVM. The current configuration supports setting the LD_LIBRARY_PATH environment variable. All components started in a JVM share this environment setting. The global configuration file is loaded and stored by the Configuration Service (CS). It is stored in the local spool directory of the host where CS is running (<local spool>/spool/cs/global.xml).
The configuration file is written in XML format. The structure of the file is defined in the XML schema.
Example 2.3. JVM configuration
<?xml version="1.0" encoding="UTF-8"?>
<global ...>
<jvm
Inside the JVM configuration it is possible to define the components for this JVM. There are two different types of components:
Component types
Example 2.4. Component configuration
<?xml version="1.0" encoding="UTF-8"?>
<global ...>
<jvm ...>
<component xsi:type="MultiComponent"
The Configuration Service (CS) is a special component. It is not explicitly defined in a component tag. The Hedeby system automatically detects the JVM which hosts CS by comparing the host name and the port with the CS URL stored in the preferences. The CS JVM must have a static port.
% sdmadm -s system1 show_configs -f
system   type    host         port   properties
--------------------------------------------
system1  SYSTEM  master_host  31006
This chapter describes the configuration of the Hedeby components.
The Resource Provider is the main component of the Hedeby system. It provides information about resources and processes all information from the managed services. There is exactly one Resource Provider in a Hedeby system, and it should run in a JVM started as the admin user (no root privileges required). Once the Resource Provider component has been started, the command line clients can get information about resources and about which resources are assigned to services. It is necessary to define the following component configurations before the Resource Provider can be started:
A typical component configuration entry in the system configuration file for the Resource Provider looks as follows:
Example 2.5. Example for Hedeby Resource Provider system component configuration
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<common:global name="mySystem"
    xmlns:executor="http://hedeby.sunsource.net/hedeby-executor"
    xmlns:reporter="http://hedeby.sunsource.net/hedeby-reporter"
    xmlns:security="http://hedeby.sunsource.net/hedeby-security"
    xmlns:resource_provider="http://hedeby.sunsource.net/hedeby-resource-provider"
    xmlns:common="http://hedeby.sunsource.net/hedeby-common"
    xmlns:ge_adapter="http://hedeby.sunsource.net/hedeby-gridengine-adapter">
    <common:jvm port="0"
                user="root"
                name="rp_vm">
        <common:component xsi:type="common:Singleton"
                          host="foo"
                          autostart="true"
                          classname="com.sun.grid.grm.resource.impl.ResourceProviderImpl"
                          name="resource_provider"
                          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
        ...
    </common:jvm>
    ...
</common:global>
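To see how the JVM and component entries in such a file relate to each other, the example above can be read with a generic XML parser. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the example (unused namespace declarations, the XML declaration and the "..." placeholders are left out). The helper name `components_by_jvm` is hypothetical and this is not how Hedeby itself loads the configuration.

```python
import xml.etree.ElementTree as ET

COMMON = "http://hedeby.sunsource.net/hedeby-common"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Shortened copy of the example configuration shown above.
GLOBAL_XML = """
<common:global name="mySystem"
    xmlns:common="http://hedeby.sunsource.net/hedeby-common"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <common:jvm port="0" user="root" name="rp_vm">
    <common:component xsi:type="common:Singleton"
        host="foo"
        autostart="true"
        classname="com.sun.grid.grm.resource.impl.ResourceProviderImpl"
        name="resource_provider"/>
  </common:jvm>
</common:global>
"""


def components_by_jvm(xml_text):
    """Map each JVM name to a list of (component name, host, xsi:type) tuples."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for jvm in root.findall(f"{{{COMMON}}}jvm"):
        result[jvm.get("name")] = [
            (comp.get("name"), comp.get("host"), comp.get(f"{{{XSI}}}type"))
            for comp in jvm.findall(f"{{{COMMON}}}component")
        ]
    return result


layout = components_by_jvm(GLOBAL_XML)
for jvm_name, components in layout.items():
    print(jvm_name, components)
```

ElementTree addresses namespaced elements and attributes as `{namespace-uri}name`, which is why the xsi:type attribute is looked up with the full schema-instance URI.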
The Resource Provider component configuration is stored as a component configuration in the config service.
The configuration file is written in XML format. The structure is defined in the XML schema.
Example 2.6. Example for Resource Provider configuration
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<common:componentConfig xsi:type="resource_provider:ResourceProviderConfig"
The Reporter component is a logging and monitoring tool for Hedeby. Its role is to intercept and gather information about what is going on in the system. The administrator can specify what kind of data is of interest. Currently the Reporter is able to store information and notifications that come from the Configuration Service, the Resource Provider and all services that are installed in the system. The Reporter component is prepared to store data in an Arco database: it writes a special file in Arco format that holds data suitable for Arco. Because the data in the Arco file is not easy for a normal user to read, the administrator can retrieve it and print it on the screen using CLI commands. The data can be filtered using the available filters.
The Reporter component configuration is stored as a component configuration
in the config service. The path to the configuration is defined in the
The configuration file is written in XML format. The structure is defined in the XML schema.
Example 2.7. Example for Reporter configuration
<?xml version="1.0" encoding="UTF-8"?>
<reporter:reporter
xmlns:reporter="http://hedeby.sunsource.net/hedeby-reporter"
xmlns:common="http://hedeby.sunsource.net/hedeby-common"
filePattern="report-%g.log"
fileCount="4"
fileSize="5242880"/>
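The three attributes in the example above resemble the pattern, count and limit settings of java.util.logging.FileHandler, where %g stands for the generation number of a rotated file and the limit is given in bytes. Assuming the Reporter follows that convention (the text above does not say so), the file names and the upper bound on disk usage can be sketched as follows. The function names are hypothetical.

```python
def report_files(file_pattern, file_count):
    """Expand %g to generation numbers 0..file_count-1 (assumed FileHandler-like semantics)."""
    return [file_pattern.replace("%g", str(generation)) for generation in range(file_count)]


def max_disk_usage(file_count, file_size):
    """Upper bound in bytes, assuming fileSize is the per-file limit in bytes."""
    return file_count * file_size


names = report_files("report-%g.log", 4)
total = max_disk_usage(4, 5242880)
print(names)
print(total // (1024 * 1024), "MiB")
```

Under these assumptions the example configuration keeps at most four files of 5 MiB each.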
The Service Adapter is a component representing a service that is managed by the Hedeby system. Currently Hedeby supports only two types of services: Spare Pools and GE Adapter (managing Grid Engine).
The Service Adapter is capable of starting and stopping the currently running Service. Starting a Service Adapter without starting a Service means connecting the Service Adapter to an already running Service. Stopping a Service Adapter without stopping the associated Service means disconnecting from the Service but leaving it running. It also means that any resources assigned to that Service are effectively lost to the Hedeby system.
A Service Adapter talks to a specific Service. As Hedeby supports only Grid Engine (GE), only the Grid Engine Adapter exists. The specifics of gathering information in order to evaluate SLOs, and the process of preparing and adding a new resource or releasing a current resource, are handled by the Service Adapter. To achieve this, the GE Adapter uses JGDI and the Executor (see Section 2.4.3.4, “Executors”).
The Service Adapter acts as the container for the SLOs associated with its Service. It interacts with the Service to maintain a current perspective on the Service's SLOs. When an SLO is not being met, the Service Adapter must normalize that SLO into an urgency, an integer from 0 to 99. The Service Adapter also maintains a list of SLO priorities. When SLO non-compliance is raised to the Resource Provider, the Service Adapter first applies the SLO priority to the event before sending it to the Resource Provider.
A Grid Engine service in Hedeby is defined by an entry in the global configuration. Normally it is not necessary to modify this configuration; it is created automatically by the sdmadm add_ge_service command. The following sample shows a typical definition of a Grid Engine service.
Example 2.8. Example for Hedeby service system component configuration
<?xml version="1.0" encoding="UTF-8"?>
<global name="new_hedeby_system"
xmlns='http://hedeby.sunsource.net/hedeby-common'
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
...
<jvm name="ge_jvm" user="grm_admin" port="0">
<component name="ge_service"
classname="com.sun.grid.grm.service.impl.ge.GEServiceImpl">
Like all other configurations, the Grid Engine Service configuration is defined in an XML structure. It can be modified with the sdmadm modify_component -c <component name> command, which opens the configuration in an editor. The following section describes the configuration of a GEAdapter:
Example 2.9. Example for Grid Engine service configuration
<?xml version="1.0" encoding="UTF-8"?>
<componentConfig xsi:type="ge:GEServiceConfig"
<?xml version="1.0" encoding="UTF-8"?>
<componentConfig ...>
<slos>
<slo name="<name of SLO>"
It further defines that all resources required by this SLO have a usage that is equal to or higher than the urgency of the SLO.
<?xml version="1.0" encoding="UTF-8"?>
<componentConfig ...>
<slos>
<slo name="minHostResourceSample"
xsi:type="MinHostResourceSLOConfig"
It is also possible to define an urgency and a resource type that is wanted by the service which is using this SLO.
<?xml version="1.0" encoding="UTF-8"?>
<common:slos>
<common:slo xsi:type="common:PermanentRequestSLOConfig
<?xml version="1.0" encoding="UTF-8"?>
<componentConfig ...>
<slos>
<slo name="fixUsageSample"
xsi:type="FixedUsageSLOConfig"
Each host resource which matches the request filter and which runs jobs matching the job filter will have at least the usage of the MaxPendingJobsSLO.
<?xml version="1.0" encoding="UTF-8"?>
<componentConfig xmlns:ge="http://hedeby.sunsource.net/hedeby-gridengine-adapter"
...>
<slos>
<slo name="maxPendingJobs"
xsi:type="ge:MaxPendingJobsSLOConfig"
The configuration of the Hedeby components allows the definition of filters in several places. All these filters use the same filter language. The following listing shows the generic definition of a filter in Backus–Naur form (BNF):
filter: orExpr <EOF>
orExpr: andExpr ("|" andExpr)*
andExpr: expr ("&" expr)*
expr: "(" orExpr ")" | "!" orExpr | booleanExpr
booleanExpr: compareExpr | matchExpr
compareExpr: value ( "<"|"<="|"="|"!="|">="|">" ) value
matchExpr: value "matches" string_literal
value: identifier | constant
constant: int_literal | float_literal | string_literal | bool_literal | null_literal
identifier: ["a"-"z","A"-"Z","_"] ( ["a"-"z","A"-"Z","_","0"-"9","."] )*
int_literal: integer literal (e.g. 10, 1G, 1g, xFF)
float_literal: floating point literal as in java (e.g. 12.0E3G)
string_literal: string literal as in java (e.g. "a")
bool_literal: "true" | "false"
null_literal: "null"
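The grammar above can be sketched as a small recursive-descent evaluator. This is an illustrative reading of the BNF, not Hedeby's parser: it assumes that a match expression has the form value "matches" string (as in the examples of this section), that the pattern must match the whole value, and that ordered comparisons between different types are simply false. Literal forms such as 1G or xFF are not handled. The names `evaluate`, `Parser` and `tokenize` are hypothetical.

```python
import operator
import re

# Tokens: comparison operators, brackets/boolean operators, strings, plain numbers, identifiers.
TOKEN = re.compile(r'''\s*(?:
      (?P<op><=|>=|!=|[<>=])
    | (?P<sym>[()|&!])
    | (?P<str>"(?:[^"\\]|\\.)*")
    | (?P<num>\d+(?:\.\d+)?)
    | (?P<id>[A-Za-z_][A-Za-z_0-9.]*)
    )''', re.VERBOSE)
COMPARE = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
           "!=": operator.ne, ">=": operator.ge, ">": operator.gt}


def tokenize(text):
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def unquote(literal):
    return re.sub(r"\\(.)", r"\1", literal[1:-1])


class Parser:
    def __init__(self, text, props):
        self.tokens, self.index, self.props = tokenize(text), 0, props

    def peek(self):
        return self.tokens[self.index] if self.index < len(self.tokens) else (None, None)

    def take(self):
        token = self.peek()
        self.index += 1
        return token

    def or_expr(self):                      # orExpr: andExpr ("|" andExpr)*
        result = self.and_expr()
        while self.peek() == ("sym", "|"):
            self.take()
            rhs = self.and_expr()           # always parse the right side
            result = result or rhs
        return result

    def and_expr(self):                     # andExpr: expr ("&" expr)*
        result = self.expr()
        while self.peek() == ("sym", "&"):
            self.take()
            rhs = self.expr()
            result = result and rhs
        return result

    def expr(self):                         # expr: "(" orExpr ")" | "!" orExpr | booleanExpr
        if self.peek() == ("sym", "("):
            self.take()
            result = self.or_expr()
            if self.take() != ("sym", ")"):
                raise ValueError("missing ')'")
            return result
        if self.peek() == ("sym", "!"):
            self.take()
            return not self.or_expr()       # as written in the grammar
        left = self.value()
        kind, text = self.take()
        if (kind, text) == ("id", "matches"):
            kind, pattern = self.take()
            if kind != "str":
                raise ValueError("string literal expected after 'matches'")
            return left is not None and re.fullmatch(unquote(pattern), str(left)) is not None
        if kind != "op":
            raise ValueError(f"operator expected, got {text!r}")
        try:
            return COMPARE[text](left, self.value())
        except TypeError:
            return False                    # no type conversion in this sketch

    def value(self):                        # value: identifier | constant
        kind, text = self.take()
        if kind == "str":
            return unquote(text)
        if kind == "num":
            return float(text) if "." in text else int(text)
        if kind == "id":
            if text in ("true", "false"):
                return text == "true"
            return None if text == "null" else self.props.get(text)
        raise ValueError(f"value expected, got {text!r}")


def evaluate(filter_text, props):
    parser = Parser(filter_text, props)
    result = parser.or_expr()
    if parser.peek() != (None, None):
        raise ValueError("unexpected trailing input")
    return bool(result)


# "&" binds more tightly than "|": a = 1 | (b = 2 & c = 3)
print(evaluate('a = 1 | b = 2 & c = 3', {"a": 1, "b": 0, "c": 0}))   # True
print(evaluate('arch matches "sol-.*"', {"arch": "sol-sparc64"}))    # True
```

Identifiers are looked up in a plain dictionary of properties, so an unknown identifier evaluates to the null value.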
The basic expression of the filter language is the
The basic expressions can be combined with or/and expressions. The and expression binds more tightly than the or expression. The binding can be influenced by setting brackets. The basic expressions can also be negated:
Based on the constants, the filter language knows four different data types (string, boolean, number and null). If a compare expression has to compare different types of data, it tries to convert the data following these rules:
"a" matches "[a-z]*" => true
The filter expression uses the standard Java implementation of regular expressions. For more details please have a look at http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html
Example 2.10. Job filters for the
% qstat -j 22377
==============================================================
job_number: 22377
exec_file: job_scripts/8
...
hard resource_list: arch=sol-sparc64, num_proc=2, lic=1
...
With the following job filter the job with id 22377 would be considered if the
arch = "sol-amd64" & lic = "1"
The above filter will match if a Grid Engine job is submitted with qsub -l arch=sol-amd64 -l lic=1.
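As a rough illustration, the job filter above can be hand-translated into a predicate over the hard resource list of a job. The sketch assumes that the resource requests are available as name/value strings, as they appear in the qstat output and on the qsub command line; the helper names are hypothetical.

```python
def parse_resource_list(text):
    """Turn 'arch=sol-sparc64, num_proc=2, lic=1' into a dict of strings."""
    return dict(item.strip().split("=", 1) for item in text.split(","))


def job_filter(resources):
    """Hand-translated form of: arch = "sol-amd64" & lic = "1"."""
    return resources.get("arch") == "sol-amd64" and resources.get("lic") == "1"


# Requests as given by: qsub -l arch=sol-amd64 -l lic=1
submitted = parse_resource_list("arch=sol-amd64, lic=1")
print(job_filter(submitted))
```

A job whose hard resource list requests a different arch value, or no lic at all, is not selected by this predicate.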
arch matches "sol-.*"
Example 2.11. Resource Request Filter for SLOs
If an SLO has a need, it sends a request to the Resource Provider. This request contains a request filter which is used to find matching resources. The request filter has access to every property of the resources.
(hardwareCpuArchitecture = amd64 & hardwareCpuCount >= 2) |
(hardwareCpuArchitecture = sol-sparc64 & hardwareCpuCount >= 4)
The above resource filter matches all resources with more than one amd64 CPU, and all sol-sparc64 hosts with more than 3 CPUs.
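The effect of this request filter on a set of resources can be sketched as a plain predicate. The resource dictionaries and host names below are made-up sample data, and the function name `request_filter` is hypothetical; only the two property names and the thresholds come from the filter above.

```python
def request_filter(resource):
    """Hand-translated form of the resource request filter shown above."""
    arch = resource.get("hardwareCpuArchitecture")
    cpus = resource.get("hardwareCpuCount", 0)
    return (arch == "amd64" and cpus >= 2) or (arch == "sol-sparc64" and cpus >= 4)


resources = [
    {"name": "host_a", "hardwareCpuArchitecture": "amd64", "hardwareCpuCount": 1},
    {"name": "host_b", "hardwareCpuArchitecture": "amd64", "hardwareCpuCount": 2},
    {"name": "host_c", "hardwareCpuArchitecture": "sol-sparc64", "hardwareCpuCount": 3},
    {"name": "host_d", "hardwareCpuArchitecture": "sol-sparc64", "hardwareCpuCount": 4},
]
matching = [resource["name"] for resource in resources if request_filter(resource)]
print(matching)
```

The sample data sits on both sides of each threshold, which shows that ">= 2" is the same as "more than one" for whole CPU counts.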
state != "ERROR"
The last example shows that the request filter can also filter resources according to their state. Only resources which are not in the error state will match.
GEAdapter automatically updates the properties of the assigned host resource. With the "Complex to Resource Property Mapping" the administrator can define which complex values are used. After the installation of the first Grid Engine Service, the default mapping is installed. It can be displayed with the sdmadm show_ge_complex_mapping command. For a complex mapping the administrator has to define the name of the complex, the value of the complex and the list of resource properties which will be used.
<ge:mapping>
<ge:resource>
<source name="arch">sol-sparc64</source>
Whenever a resource is assigned to a Grid Engine service, the Grid Engine Adapter tries to install an execution daemon (execd) on the managed host. Whenever a resource is removed, it uninstalls the execd from the managed host. This section describes how the install/uninstall is done. By default, Grid Engine's auto installation feature is used to perform an execd installation/uninstallation. The installation is done by executing an install script with a configuration file on the executor of the managed host. GEAdapter expects the following exit values from the install/uninstall scripts:
The installation script and the configuration file are generated from templates. The paths
to the templates can be defined in the
# Remember the directory that holds the generated configuration file
BASEDIR=`pwd`
SGE_ROOT="@@@SGE_ROOT@@@"

# Exit with code 2 if a prerequisite is missing
if [ ! -d "$SGE_ROOT" ]; then
    echo "SGE_ROOT directory $SGE_ROOT does not exist"
    exit 2
fi
if [ ! -f "$SGE_ROOT/inst_sge" ]; then
    echo "inst_sge script in directory $SGE_ROOT not found"
    exit 2
fi
if [ ! -x "$SGE_ROOT/inst_sge" ]; then
    echo "inst_sge script in directory $SGE_ROOT is not executable"
    exit 2
fi
if [ ! -f "$BASEDIR/install_execd.conf" ]; then
    echo "auto config file $BASEDIR/install_execd.conf not found"
    exit 2
fi

# Run inst_sge with the generated auto configuration file and pass on its exit code
cd "$SGE_ROOT"
./inst_sge -x -noremote -auto "$BASEDIR/install_execd.conf"
res=$?
exit $res
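The script above is a template: the @@@SGE_ROOT@@@ marker is replaced before the script is run. How Hedeby performs this replacement is not described here, so the following is only a sketch of the general idea. The function name `render_template`, the placeholder pattern (upper-case names between triple at-signs) and the value /opt/sge are assumptions made for the example.

```python
import re

# Assumed placeholder syntax: an upper-case name enclosed in "@@@".
PLACEHOLDER = re.compile(r"@@@([A-Z_]+)@@@")


def render_template(template, values):
    """Replace every @@@NAME@@@ marker with values[NAME]; fail on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder {name}")
        return values[name]
    return PLACEHOLDER.sub(substitute, template)


template = 'SGE_ROOT="@@@SGE_ROOT@@@"\ncd "$SGE_ROOT"\n'
script = render_template(template, {"SGE_ROOT": "/opt/sge"})
print(script)
```

Failing on an unknown placeholder is a deliberate choice in this sketch, so that a half-filled install script is never produced.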
A general overview of the Executor component can be found in Section 1.2.1.4, “Executor” and Section 1.1.5, “Executor”. The general configuration of the Executor component is specified in the Java Virtual Machines section; for detailed information about specific fields see Section 2.4.2, “Java Virtual Machines”.
Example 2.12. Executor configuration
<?xml version="1.0" encoding="UTF-8"?>
<executor:executor
idleTimeout="60"
Note
To be fully operative, the Executor has to run in a JVM that is started as the "root" user (uid 0). Otherwise user switching does not work. An Executor should be defined on each resource host.
Example 2.13. Typical executor component definition
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<common:global name="mySystem"
xmlns:executor="http://hedeby.sunsource.net/hedeby-executor"
xmlns:reporter="http://hedeby.sunsource.net/hedeby-reporter"
xmlns:security="http://hedeby.sunsource.net/hedeby-security"
xmlns:resource_provider="http://hedeby.sunsource.net/hedeby-resource-provider"
xmlns:common="http://hedeby.sunsource.net/hedeby-common"
xmlns:ge_adapter="http://hedeby.sunsource.net/hedeby-gridengine-adapter">
<common:jvm port="0"
user="root"
name="executor_vm">
<common:c |