CAO Tien Duc

Reference repository generators
This directory contains the input files and scripts needed for generating:


The Reference API, i.e. the JSON files served by the Grid'5000 API to describe nodes, network equipment, topology, etc.
Ex: $ curl -k https://api.grid5000.fr/sid/sites/nancy/clusters/graoully/nodes/graoully-1.json?pretty
See also: https://www.grid5000.fr/mediawiki/index.php/API_all_in_one_Tutorial


The OAR properties, i.e. the node information that is registered in the OAR databases and
allows users to select resources matching their experiment requirements.
Ex: $ oarsub -p "wattmeter='YES' and gpu='YES' and eth10g='Y'"


The configuration files of the following puppet modules: bindg5k, conmang5k, dhcpg5k, kadeployg5k and lanpowerg5k.


See also: https://www.grid5000.fr/mediawiki/index.php/Reference_Repository.

General Design
For the general design discussion, see:


https://www.grid5000.fr/mediawiki/index.php/CT-114_DesignRepoAPI (25/10/2015)
[CT-grid5000] Proposition de réorganisation des outils autour de la ref API (16/10/2015)
git clone git@gitolite.g5kadmin:slide -> 2016-03-29-IJD-Seminar-JG/


Requirements
Ruby 2.1 (+ HashDiff and Net/SSH for the oar-generator; + peach and ruby-cute for run-g5kchecks; + hash_validator for the input validators)
Here is an example of creating a Ruby setup with RVM (https://rvm.io/), a gemset and Bundler:
$ \curl -sSL https://get.rvm.io | bash -s stable --ruby
$ source ~/.rvm/scripts/rvm
$ rvm install 2.1
$ rvm gemset create ref-repo-dev
$ rvm gemset use ref-repo-dev
$ bundle install

Input files
Input files are stored in the input/ directory.
A Ruby script loads the YAML files into a global hash table. The file paths are used as entry points into the hash table.

sites/nancy/clusters/pomme/pomme.yaml.erb
nodes:
  pomme-1:
    ib0:
      ip: 172.18.70.11
becomes:
hash = {"sites"=>
  {"nancy"=>
    {"clusters"=>
      {"pomme"=>
        {"nodes"=>
          {"pomme-1"=>
            {"ib0"=> {"ip"=> "172.18.70.11"} }
} } } } } }
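The mapping from file path to hash keys can be sketched as follows. This is only an illustration of the idea, not the code of lib/input_loader.rb: the helper names nest_under_path and deep_merge! are hypothetical, and dropping the file name while keeping the directory components is an assumption inferred from the example above.

```ruby
require 'yaml'

# Hypothetical helper: recursively merge source into target.
def deep_merge!(target, source)
  source.each do |key, value|
    if target[key].is_a?(Hash) && value.is_a?(Hash)
      deep_merge!(target[key], value)
    else
      target[key] = value
    end
  end
  target
end

# Hypothetical helper: nest the parsed YAML under the directory components
# of its path (assumption: the file name itself is not used as a key).
def nest_under_path(global_hash, file_path, yaml_text)
  keys = File.dirname(file_path).split('/')
  node = keys.inject(global_hash) { |h, k| h[k] ||= {} }
  deep_merge!(node, YAML.load(yaml_text))
  global_hash
end

yaml_text = <<YAML
nodes:
  pomme-1:
    ib0:
      ip: 172.18.70.11
YAML

hash = nest_under_path({}, 'sites/nancy/clusters/pomme/pomme.yaml.erb', yaml_text)
puts hash['sites']['nancy']['clusters']['pomme']['nodes']['pomme-1']['ib0']['ip']
# prints 172.18.70.11
```

Merging rather than overwriting lets several files of the same directory contribute to the same subtree.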
The input file loader (lib/input_loader.rb) also supports coalesced key names:
  pomme-[1-4]:
    performance:
      core_flops: 3929000000
      node_flops: 7440000000
and, for more complex use cases, YAML can be generated with ERB:

input/sites/nancy/clusters/pomme.yaml.erb
nodes:
<% (1..16).each { |i| %>
  pomme-<%= i %>:
    ib0:
      ip: 172.18.70.<%= i + 10 %>
<% } %>
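The two mechanisms above (ERB templating and coalesced key names) can be combined in a short sketch: render the template, parse the resulting YAML, then expand the ranges. The helper expand_coalesced_keys is hypothetical and only handles a single numeric range per key; the real loader may behave differently.

```ruby
require 'erb'
require 'yaml'

# Hypothetical helper: expand "prefix[a-b]suffix" keys into one key per index.
# Keys without a range are copied unchanged.
def expand_coalesced_keys(hash)
  hash.each_with_object({}) do |(key, value), out|
    match = /\A(.*)\[(\d+)-(\d+)\](.*)\z/.match(key.to_s)
    if match
      prefix, first, last, suffix = match.captures
      (first.to_i..last.to_i).each { |i| out["#{prefix}#{i}#{suffix}"] = value }
    else
      out[key] = value
    end
  end
end

template = <<TEMPLATE
nodes:
<% (1..2).each { |i| %>
  pomme-<%= i %>:
    ib0:
      ip: 172.18.70.<%= i + 10 %>
<% } %>
  pomme-[3-4]:
    performance:
      core_flops: 3929000000
TEMPLATE

rendered = ERB.new(template).result(binding) # step 1: evaluate the ERB tags
data = YAML.load(rendered)                   # step 2: parse the YAML output
nodes = expand_coalesced_keys(data['nodes']) # step 3: expand pomme-[3-4]
puts nodes.keys.inspect
# prints ["pomme-1", "pomme-2", "pomme-3", "pomme-4"]
```

Note that in this sketch the expanded keys share the same value object; a real loader would probably copy it before merging node-specific data.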

Updating the node input files using g5k-checks
Node files (input/grid5000/sites/<site_uid>/clusters/<cluster_uid>/nodes/*.yaml) are generated by g5k-checks and should not be edited manually.

You can use run-g5kchecks/run-g5kchecks.rb to update those files.
The node files in input/ are slightly edited compared to the raw output of g5k-checks -m api.
=> Use run-g5kchecks/postprocessing.rb to apply the mandatory modifications to the g5k-checks output.

See also:

https://www.grid5000.fr/mediawiki/index.php/G5k-checks
https://github.com/grid5000/g5k-checks/


Reference API Generator
The Reference API generator reads the input/ YAML files and generates the data/ JSON files.
Usage: cd reference-api/; ruby reference-api.rb

Updating the OAR properties
The generator can show the differences between the reference-repo and the OAR server databases:
workstation$ ruby oar-properties.rb -d -n graphene-105 -vv
Output format: ['', 'key', 'old value', 'new value']
graphene-105:
["", "disktype", "sata", "SATA II"]
["", "eth10g", nil, false]
["", "ib10g", false, true]
["", "ib10gmodel", "none", "MT26418"]
["", "ib40g", nil, false]
["~", "ib40gmodel", nil, "none"]
You can run the script per site, cluster or node using the -s, -c and -n options. There are several verbose levels (-v, -vv, -vvv).
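To make the diff output format concrete, here is a minimal pure-Ruby sketch that compares two flat property hashes. It is not the implementation used by oar-properties.rb: the helper diff_properties is hypothetical, and using the "~" marker for every difference is an assumption made for illustration.

```ruby
# Hypothetical helper, not the script's actual implementation: compare the
# properties registered in OAR with those computed from the reference-repo
# and report each difference as [marker, key, old value, new value].
def diff_properties(oar_props, ref_props)
  changes = []
  (oar_props.keys | ref_props.keys).sort.each do |key|
    old_value = oar_props[key]
    new_value = ref_props[key]
    # nil means the property is absent on that side.
    changes << ['~', key, old_value, new_value] unless old_value == new_value
  end
  changes
end

oar_props = { 'disktype' => 'sata', 'ib10g' => false, 'ib10gmodel' => 'none' }
ref_props = { 'disktype' => 'SATA II', 'eth10g' => false, 'ib10g' => true, 'ib10gmodel' => 'MT26418' }

changes = diff_properties(oar_props, ref_props)
changes.each { |change| puts change.inspect }
```

Properties that are identical on both sides produce no line, which keeps the report short enough to review before applying it.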
After reviewing the changes, you can update the configuration of the OAR server:
workstation$ ruby oar-properties.rb -d -n graphene-105 -e
The previous command will execute the oarnodesetting/oar_resources_add/oarproperty commands needed for updating the OAR database via SSH. For example, it will execute:
g5kadmin@oar.nancy$ oarnodesetting -h graphene-104 -p disktype='SATA II' -p eth10g='NO' -p ib10gmodel='MT26418'
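As an illustration of how such a command line could be assembled from a property hash, here is a small sketch. The helper oarnodesetting_command is hypothetical; mapping booleans to 'YES'/'NO' and quoting only string values are assumptions inferred from the example above, not a description of the actual script.

```ruby
# Hypothetical helper: format one oarnodesetting call for a node.
def oarnodesetting_command(host, properties)
  args = properties.map do |key, value|
    formatted =
      case value
      when true   then "'YES'"        # assumption: booleans become YES/NO
      when false  then "'NO'"
      when String then "'#{value}'"   # strings are single-quoted
      else value.to_s                 # numbers are passed as-is
      end
    "-p #{key}=#{formatted}"
  end
  "oarnodesetting -h #{host} #{args.join(' ')}"
end

properties = { 'disktype' => 'SATA II', 'eth10g' => false, 'ib10gmodel' => 'MT26418' }
command = oarnodesetting_command('graphene-104', properties)
puts command
# prints oarnodesetting -h graphene-104 -p disktype='SATA II' -p eth10g='NO' -p ib10gmodel='MT26418'
```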
If you do not want to run the script directly on the OAR server, you can also print the commands out by using the -o (--output) option in place of -e (--execute).
With the -d (--diff) option, the script fetches the description of the OAR resources from the OAR API. If you remove this option, the generator
simply creates a script that sets every node property (even those that might already be set correctly). There is a safeguard against adding a node twice to an OAR database.
You can also use cache files (-d oarnodes-%s.yaml) if you do not want to retrieve the OAR configuration each time you run the generator.
See ruby oar-properties.rb --help for more information.
If you need to add new OAR properties, see get_node_properties and diff_node_properties in oar-properties/lib/lib-oar-properties.rb.
This script can be tested with the oar-vagrant box (https://github.com/oar-team/oar-vagrant).
See also:

https://www.grid5000.fr/mediawiki/index.php/OAR_properties
https://www.grid5000.fr/mediawiki/index.php/OAR_properties_2.0


Generating the DHCP, DNS, Kadeploy, Conman and Lanpower configurations
Principles:

The source code and the templates of the generators are located in the reference-repo in the generators/puppet/ directory.
Additional configuration files (ex: console passwords, kadeploy tuning options etc.) are located within the puppet-repo in puppet-repo/modules/<module_name>/generators/.
By default, the output files of the generators are created in /tmp/puppet-repo. It is possible to create them directly at the right place in the puppet-repo directories by passing the option "-o puppet_repo_path" to the generator
(i.e. you can use git diff to display the changes made by the generators before committing them).

Usage example:
$ (cd /tmp; git clone ssh://g5kadmin@git.grid5000.fr/srv/git/repos/puppet-repo) # or use your existing local copy of the repository
$ cd reference-repo/generators/puppet
$ rake puppet_repo=/tmp/puppet-repo # run every generator and output files to /tmp/puppet-repo
or
$ rake puppet_repo=/tmp/puppet-repo sites=SITE1,SITE2 # run every generator and output files to /tmp/puppet-repo for only SITE1 and SITE2
You can also run the generators one by one:
$ ruby bindg5k.rb [options]
$ ruby conmang5k.rb [options]
$ ruby dhcpg5k.rb [options]
$ ruby kadeployg5k.rb [options]
$ ruby lanpowerg5k.rb [options]
You can run ruby %generator%.rb -h to see available options for a particular generator.
Each generator can be restricted to generate configuration files for specific Grid'5000 sites only:
ruby %generator%.rb -s SITE1,SITE2 [options]
By default, generators will execute for all sites.

Configuration directory
The lanpowerg5k, conmang5k and kadeployg5k generators require an additional parameter: the path of the directory containing the additional configuration used to generate the configuration files:
ruby %generator%.rb -o puppet_repo_path -c configuration_directory
This parameter is not mandatory if the '-o' option is given. In that case, the configuration directory is set to 'puppet_repo_path/modules/<module_name>/generators/'.
To enable the standalone release of the generators, the conf-examples/ directory includes examples of configuration files and the generators can be run without access to the puppet-repo:
ruby %generator%.rb -c ./conf-examples/
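The option handling described above can be sketched with Ruby's standard OptionParser. This is an illustration only: the function parse_generator_options and the long option names (--sites, --output-dir, --conf-dir) are hypothetical, and raising an error when neither -o nor -c is given is an assumption. Only the short flags and the default paths come from the text above.

```ruby
require 'optparse'

# Hypothetical option parsing for one generator (module_name is for example
# 'conmang5k'). Only -s, -o and -c come from the README; the rest is assumed.
def parse_generator_options(argv, module_name)
  options = {}
  OptionParser.new do |opts|
    opts.on('-s', '--sites SITE1,SITE2', Array) { |v| options[:sites] = v }
    opts.on('-o', '--output-dir DIR') { |v| options[:output_dir] = v }
    opts.on('-c', '--conf-dir DIR') { |v| options[:conf_dir] = v }
  end.parse(argv)

  if options[:conf_dir].nil?
    # Without -c, the configuration directory is derived from -o.
    raise ArgumentError, 'either -c or -o is required' if options[:output_dir].nil?
    options[:conf_dir] = File.join(options[:output_dir], 'modules', module_name, 'generators')
  end
  options[:output_dir] ||= '/tmp/puppet-repo' # default output location
  options
end

with_repo = parse_generator_options(%w[-s nancy,rennes -o /srv/puppet-repo], 'kadeployg5k')
puts with_repo[:conf_dir]
# prints /srv/puppet-repo/modules/kadeployg5k/generators

standalone = parse_generator_options(%w[-c ./conf-examples/], 'conmang5k')
puts standalone[:output_dir]
# prints /tmp/puppet-repo
```

Leaving :sites unset when -s is absent matches the documented default of running for all sites.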

Notes about the conmang5k/lanpowerg5k generators

The configuration is generated for both the clusters and servers entries of the reference-api.

Example:
sites:
  rennes:
    clusters:
      parapide:
        nodes:
          parapide-[1-25]:
    servers:
      ceph:
        nodes:
          ceph[0-3]:

The configuration is shared between the conmang5k and lanpowerg5k modules
** Passwords are stored in puppet-repo/modules/conmang5k/generators/console-password.yaml
** Additional configuration options are located in puppet-repo/modules/conmang5k/generators/console-password.yaml
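Walking over both the clusters and servers entries of a site can be sketched as below. The method each_console_entry is hypothetical, and the sample data uses already-expanded node names rather than coalesced keys to keep the example short.

```ruby
# Hypothetical traversal: yield every node found under both the 'clusters'
# and 'servers' keys of a site description.
def each_console_entry(site)
  %w[clusters servers].each do |kind|
    (site[kind] || {}).each do |group_uid, group|
      (group['nodes'] || {}).each_key { |node_uid| yield kind, group_uid, node_uid }
    end
  end
end

site = {
  'clusters' => { 'parapide' => { 'nodes' => { 'parapide-1' => {}, 'parapide-2' => {} } } },
  'servers'  => { 'ceph' => { 'nodes' => { 'ceph0' => {} } } }
}

entries = []
each_console_entry(site) do |kind, group_uid, node_uid|
  entries << "#{kind}/#{group_uid}/#{node_uid}"
end
puts entries
```

A site without a servers key is handled by the `|| {}` fallbacks, so the same loop works for every site.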


Notes about the kadeployg5k generator

It generates both the configuration of the kadeploy and kadeploy-dev servers.
It generates the <site_uid>/clusters.conf and the <site_uid>/<cluster_uid>-clusters.conf files for each site.
The template of the <site_uid>/<cluster_uid>-clusters.conf files is located in reference-repo/generators/puppet/templates/kadeployg5k.conf.erb.
It includes default parameters.
Cluster specific tuning options are located in puppet-repo/modules/kadeployg5k/generators/kadeployg5k.yaml and kadeployg5k-dev.yaml


Notes about the DNS and DHCP generators

Those generators are solely based on the data of the reference API.
DHCP and DNS entries are generated for nodes (including MIC accelerators), network equipment, PDUs and dom0.
In addition, the DNS also has kavlan entries and the DHCP includes admin laptops.
The DNS script generates the global-<site_uid>.conf files and the zones/<site_uid>/*.db files.
DNS CNAME records are generated for the main ethernet interface of the nodes.
DNS entries can be added manually to the configuration:
** A new zone can be added directly to the ./zones/<site_uid>/ directory. Re-run the generator after adding the file to add the zone to the global-<site_uid>.conf files.
** For an existing zone, add a ./zones/<site_uid>/<zone_name>-manual.db file and re-run the generator.
This file will be included automatically in the generated zone file called ./zones/<site_uid>/<zone_name>.db.
See bindg5k/files/zones/nancy/ for examples. <zone_name>-manual.db files do not need the usual headers ($TTL directives etc.) of zone files.


How to add a new cluster to Grid'5000
The generators can be used to ease the integration of a new cluster into the platform.
The general idea is to:

manually add a minimal description of the new cluster to the reference-api input files
use the generators to create the (puppet) configuration files needed to boot the cluster
enrich the reference-api input files with information retrieved by g5k-checks on the nodes

Detailed steps:


Manually get the list of node MAC addresses (ex: QR code scan)


Add the cluster to the reference-api:


See input/grid5000/sites/nancy/clusters/graoully/graoully_ip.yaml.erb as an example for bootstrapping a cluster configuration. Create a similar file for the new cluster. This file includes the IP/MAC information that will be used by the configuration file generators.
You do not have to run the reference-api generator yet (generators/reference-api/reference-api.rb), but the cluster information must be available locally in the input directory.


Run the DHCP, Kadeploy, Conman and Lanpower generators:
$ cd generators/puppet
$ puppet_repo=path_to_your_local_puppet_repo rake


First boot of the cluster :-)


Retrieve the hardware information using g5k-checks and add it to the reference repository
$ cd generators/run-g5kchecks; ruby run-g5kchecks.rb -f -s <site_uid> -c <cluster_uid>


Manually add the information that is not provided by g5k-checks (ex: the main <cluster_uid>.yaml file, PDU information ...). Use graoully.yaml as an example.


Note that it is mandatory to specify some of the network_adapter properties (enabled: true, mountable: true etc.). Those properties are used to detect the name of the main interface (eth0? eth1?).
You can use the input validator to check the new cluster configuration:
$ cd generators/input-validators; ruby yaml-input-schema-validator.rb


Run the OAR properties generator. This generator will add the nodes to the OAR configuration.
$ cd generators/oar-properties/
$ ruby oar-properties.rb -s <site_uid> -c <cluster_uid> -d -vv
$ ruby oar-properties.rb -s <site_uid> -c <cluster_uid> -d -e


Add the cluster to the reference API:
$ cd generators/reference-api; ruby reference-api.rb # and then commit the data/ directory + wait 10-15 minutes


Boot and check the nodes with g5k-checks
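
The steps above note that the network_adapter properties are used to detect the main interface. The sketch below shows one way such a detection could work. It is purely illustrative: the method main_interface, the layout of the adapters hash (keyed by interface name) and the rule "first enabled and mountable adapter in name order" are all assumptions, not the generators' actual logic.

```ruby
# Hypothetical selection rule: the first adapter (in name order) that is
# both enabled and mountable is treated as the main interface.
def main_interface(network_adapters)
  candidates = network_adapters.sort_by { |name, _props| name }
  name, _props = candidates.find do |_name, props|
    props['enabled'] && props['mountable']
  end
  name # nil when no adapter qualifies
end

adapters = {
  'eth0' => { 'enabled' => false, 'mountable' => false },
  'eth1' => { 'enabled' => true,  'mountable' => true },
  'ib0'  => { 'enabled' => true,  'mountable' => true }
}

puts main_interface(adapters)
# prints eth1
```

If the properties are missing from the input files, a rule of this kind has nothing to select from, which is why they have to be filled in manually.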