Synopsis

The reference data is stored in a Git repository as JSON files, organized into hierarchical folders. These files can be written by hand, but the Git repository comes with a "/generators" folder containing a script that eases their generation from high-level description files written in Ruby. Given one or more input files that describe the data you want to add, the script generates the required JSON files, directories and symlinks.
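
To make the idea of "JSON files in hierarchical folders" concrete, here is a minimal Ruby sketch that writes one description into a nested directory. The folder names (sites, clusters, nodes), the "uid" key and the write_node helper are illustrative assumptions; they are not taken from the generator and may not match the repository's real layout.

```ruby
require "json"
require "fileutils"
require "tmpdir"

# Hypothetical helper: write one node description as JSON under
# <root>/sites/<site>/clusters/<cluster>/nodes/<uid>.json
def write_node(root, site, cluster, node)
  dir = File.join(root, "sites", site, "clusters", cluster, "nodes")
  FileUtils.mkdir_p(dir) # create the whole folder hierarchy if needed
  path = File.join(dir, "#{node["uid"]}.json")
  File.write(path, JSON.pretty_generate(node))
  path
end

# Work in a temporary directory so nothing is left behind.
Dir.mktmpdir do |root|
  path = write_node(root, "rennes", "paramount", { "uid" => "paramount-1", "type" => "node" })
  puts path.sub(root, "<root>")
end
```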

Requirements

  • Ruby >= 1.8.6
  • Bundler (gem install bundler)
  • Git
  • g5kadmin account

Workflow

The general overview of the workflow between git repositories is as follows:

    ADMIN REPOSITORY <-- pull/push -->  MASTER REPOSITORY
                                               |
                                               |
                                              pull 
                                         (every minute)
                                              / \
                                             /   \
                                            |     |
                   API REPOSITORY SITE X <--|     |--> API REPOSITORY SITE Y

Each site administrator must first clone the remote MASTER REPOSITORY, located on the git.grid5000.fr server, onto a local machine. This local clone is called the ADMIN REPOSITORY, and cloning only has to be done once:

  g5kadmin@host:/somewhere$ git clone ssh://g5kadmin@git.grid5000.fr/srv/git/repos/reference-repository.git

When there is a need for change, the site administrator PULLs from the MASTER REPOSITORY to get the latest changes:

  g5kadmin@host:/somewhere/reference-repository$ git pull

Then she manually adds/edits/removes the raw JSON files or uses the generator (more on that later). When she's done, she COMMITs her changes and PUSHes them to the MASTER REPOSITORY:

  ... editing ...
  g5kadmin@host:/somewhere/reference-repository$ git commit -a -m "list of modifications"
  g5kadmin@host:/somewhere/reference-repository$ git push

Finally, these changes are automatically replicated every minute to each API REPOSITORY (one per site), which are used by the Reference API.

Getting started

First, clone the remote MASTER REPOSITORY if it is not already done:

  g5kadmin@host:/somewhere$ git clone ssh://g5kadmin@git.grid5000.fr/srv/git/repos/reference-repository.git

From the newly created reference-repository folder, run the following command to install the required dependencies:

  g5kadmin@host:/somewhere/reference-repository$ bundle install

Right now, the easiest way to get started is to look at some existing input files in the "generators/input" directory. There you can see how you can define sites, clusters, nodes and environments programmatically. Then you may create a new input file or change an existing one and run it in simulation mode:

  g5kadmin@host:/somewhere/reference-repository$ ./generators/grid5000 generators/input/*.rb -s

or, if you want to explicitly specify the input files:

  g5kadmin@host:/somewhere/reference-repository$ ./generators/grid5000 generators/input/input-file1.rb generators/input/input-file2.rb -s

For more information about the available options and usage of the grid5000 generator, run:

  g5kadmin@host:/somewhere/reference-repository$ ./generators/grid5000 --help

You may also use the rake task available (run $ rake -D to see the list of available tasks):

  g5kadmin@host:/somewhere/reference-repository$ rake g5k:generate

In simulation mode, your changes are not applied, but you can see what would have been changed. Simulation mode is therefore useful to review your changes before committing and to check the Ruby syntax of the input files.

When you are happy with your changes, you can then run the command without the -s flag:

  g5kadmin@host:/somewhere/reference-repository$ ./generators/grid5000 generators/input/*.rb

Please be aware that config files (YAML format) may be passed on the command line, so that the values can be used in the input files via the lookup(config_filename, key) function. To tell the generator to include one or more config files, you must pass them in your command arguments:

  g5kadmin@host:/somewhere/reference-repository$ ./generators/grid5000 generators/input/*.rb generators/input/*.yaml
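
As a rough illustration of how a lookup(config_filename, key) call might resolve values from the YAML files passed on the command line, here is a minimal sketch. The ConfigStore class, the convention of naming a config after its file basename, and the sample data are assumptions made for this example; the generator's actual implementation may differ.

```ruby
require "yaml"

# Hypothetical sketch: the real generator's internals may differ.
class ConfigStore
  def initialize
    @configs = {}
  end

  # Parse a YAML document and register it under a config name.
  def add(name, yaml_text)
    @configs[name] = YAML.safe_load(yaml_text) || {}
  end

  # Load a YAML file; the config name is assumed to be the file's
  # basename without its extension.
  def load_file(path)
    add(File.basename(path, ".*"), File.read(path))
  end

  # Return the value stored under key in the named config,
  # raising if the config or the key is unknown.
  def lookup(config_name, key)
    config = @configs.fetch(config_name) do
      raise ArgumentError, "config '#{config_name}' was not passed to the generator"
    end
    config.fetch(key) do
      raise KeyError, "key '#{key}' not found in '#{config_name}'"
    end
  end
end

store = ConfigStore.new
store.add("rennes", "paramount-1:\n  ip: 10.0.0.1\n") # made-up data
puts store.lookup("rennes", "paramount-1")["ip"]
```

Failing loudly on an unknown config name makes it obvious when a YAML file was forgotten on the command line.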

Finally, commit your changes with a meaningful message in ENGLISH (you SHOULD first review the changes that will be committed by running the git diff command) and push them immediately to the MASTER REPOSITORY:

  g5kadmin@host:/somewhere/reference-repository$ git commit -a -m "[TAGS] message"

  g5kadmin@host:/somewhere/reference-repository$ git push

Synchronizing the OAR database

As of 2010/09/08, a synchronization task has been added that allows you to generate the diff between 2 commits (not necessarily consecutive).

Once you've committed your changes, run the oar:generate rake task to generate the corresponding oaradmin lines (run $ rake -D to see the list of available tasks):

  g5kadmin@host:/somewhere/reference-repository$ rake oar:generate -s FROM=<PREVIOUS-COMMIT-ID> TO=<LATEST-COMMIT-ID>

By default, TO is set to HEAD.

The oaradmin lines are sent to STDOUT, the logging data to STDERR.
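
Before generating the oaradmin lines, it can help to see which files differ between the two commits. The sketch below groups the output of git diff --name-status FROM TO by status letter. It is not part of the rake task; the parse_name_status helper and the sample paths are hypothetical.

```ruby
# Group "git diff --name-status" output by status letter
# (A = added, M = modified, D = deleted, R = renamed, ...).
def parse_name_status(output)
  changes = Hash.new { |hash, key| hash[key] = [] }
  output.each_line do |line|
    status, path = line.chomp.split("\t", 2)
    next if path.nil? # skip blank or malformed lines
    changes[status[0]] << path
  end
  changes
end

# Made-up sample; in practice the text would come from:
#   git diff --name-status <PREVIOUS-COMMIT-ID> <LATEST-COMMIT-ID>
sample = "A\tsites/rennes/clusters/paramount/nodes/paramount-3.json\n" \
         "M\tsites/rennes/clusters/paramount/nodes/paramount-1.json\n" \
         "D\tsites/rennes/clusters/paramount/nodes/paramount-2.json\n"

parse_name_status(sample).each do |status, paths|
  puts "#{status}: #{paths.join(', ')}"
end
```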

Filling the reference - Guidelines

network_adapters

Many machines have several network interfaces, which are not always all configured. We have identified 4 cases in G5K clusters:

  1. The interface is not connected to any cable.
  2. The interface is the admin interface (e.g. IPMI).
  3. The interface is not mounted in the production environment, but users may use it in their own deployed environment.
  4. The interface is mounted in the production environment.

After several discussions within the PS team, we settled on the following attributes. All of them are mandatory, but the ones between square brackets are mandatory only under a condition, which is given in parentheses right after the field name.

  • interface: the type of network interface, ∈ {"Ethernet", "Myrinet", "InfiniBand"}
    NB: There is no need to define values such as "Myrinet 10G" or "Myri-2000", because the rate differentiates them.
  • rate: speed of the interface in b/s
  • mac:
    if interface ∈ {"Ethernet", "Myrinet"}, the MAC address of this interface,
    if interface=="InfiniBand", its GUID.
  • vendor: the company which made the device
  • version: its version according to the company's nomenclature
  • enabled: true if there is any cable connected to this interface
  • [management](if enabled==true): true if this interface is on the administration network (IPMI,...)
  • [mountable](if enabled==true): true if it is usable by any user (even if it possibly requires a customized environment)
  • [driver](if mountable==true): name of the driver for the device in the Linux kernel
  • [mounted](if mountable==true): true if the production environment mounts and configures this interface
  • [network_address](if mounted==true or management==true): the DNS entry of the machine by this interface
  • [device](if mounted==true): name of this interface in the production environment
  • [ip](if enabled==true): the IP of this interface
  • [ip6]: the IPv6 of this interface, for future use...

[EDIT] No objections, no further votes for the alternative proposals, and no other entries proposed => the base version is validated.
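
The conditional rules above can be checked mechanically. The following Ruby sketch reports which mandatory attributes are missing from one network adapter description. The missing_fields helper and the sample values are hypothetical; the generator itself may or may not perform such a check.

```ruby
# Attributes that are always mandatory.
ALWAYS_REQUIRED = %w[interface rate mac vendor version enabled].freeze

# Attributes that become mandatory when the associated condition holds.
# ip6 is left out because no condition is stated for it.
CONDITIONALLY_REQUIRED = {
  "management"      => ->(a) { a["enabled"] == true },
  "mountable"       => ->(a) { a["enabled"] == true },
  "ip"              => ->(a) { a["enabled"] == true },
  "driver"          => ->(a) { a["mountable"] == true },
  "mounted"         => ->(a) { a["mountable"] == true },
  "device"          => ->(a) { a["mounted"] == true },
  "network_address" => ->(a) { a["mounted"] == true || a["management"] == true }
}.freeze

# Return the names of the mandatory attributes absent from the adapter hash.
def missing_fields(adapter)
  missing = ALWAYS_REQUIRED.reject { |field| adapter.key?(field) }
  CONDITIONALLY_REQUIRED.each do |field, required|
    missing << field if required.call(adapter) && !adapter.key?(field)
  end
  missing
end

# Case 1: the interface is not connected to any cable (made-up values).
unplugged = {
  "interface" => "Ethernet", "rate" => 1_000_000_000,
  "mac" => "00:16:3e:00:00:01", "vendor" => "ExampleVendor",
  "version" => "rev1", "enabled" => false
}
p missing_fields(unplugged)

# Case 4: mounted in production, but several attributes were forgotten.
production = unplugged.merge(
  "enabled" => true, "management" => false,
  "mountable" => true, "mounted" => true
)
p missing_fields(production)
```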

Scripts for retrieving the IP/MAC addresses of a cluster

Some scripts have been created to ease retrieving the MAC/IP addresses of a cluster. Get them here

How to retrieve the GUID of an InfiniBand card

Here is a sample ohai plugin (included in the reference-helper gem):

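#
# Author:: Pascal Morillon <pascal.morillon@irisa.fr>
#
# Ohai plugin: exposes the GUID of each InfiniBand (ib*) interface.

provides "infiniband"

infiniband Mash.new

# Keep only the interfaces whose name starts with "ib" (ib0, ib1, ...).
interfaces = Dir['/sys/class/net/*'].collect { |c| File.basename(c) }.select { |s| s =~ /^ib/ }
interfaces.each do |interface|
  infiniband[interface.to_sym] = Mash.new
  address_file = File.join('/sys/class/net', interface, 'address')
  exit 1 unless File.exist?(address_file)

  # The prefix depends on which HCA driver is present (mthca or mlx4).
  if File.exist?('/sys/class/infiniband/mthca0/ports')
    guid_prefix = "20:00:55:04:01:"
  elsif File.exist?('/sys/class/infiniband/mlx4_0/ports')
    guid_prefix = "20:00:55:00:41:"
  else
    next # unknown HCA: no prefix available, so skip this interface
  end
  # Drop the first five colon-separated bytes of the address and prepend the prefix.
  guid_part2 = File.read(address_file).chomp
  infiniband[interface.to_sym][:guid] = guid_prefix + guid_part2.split(":")[5..20].join(":")
end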
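
To show what the plugin above computes without needing InfiniBand hardware, the string handling can be isolated in a small function. The build_guid name and the sample address are made up for this illustration; only the split/join logic comes from the plugin.

```ruby
# Same string handling as in the plugin: keep everything after the
# first five colon-separated bytes and prepend the given prefix.
def build_guid(prefix, address)
  prefix + address.split(":")[5..20].join(":")
end

# Made-up address, in the format read from /sys/class/net/<interface>/address.
address = "80:00:04:04:fe:80:00:00:00:00:00:00:00:02:c9:03:00:01:02:03"
puts build_guid("20:00:55:04:01:", address)
```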

Resources

FAQ

Resolv::ResolvError occurred when I tried to generate

At runtime, the generator resolves the hostname of each node to obtain its IP address. As a consequence it should be run from within Grid5000 so that it can query the Grid5000 DNS.
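
For illustration, this is roughly how such a resolution failure can be caught with Ruby's standard Resolv library. The node_ip helper, the StubResolver class and the hostnames are hypothetical; the stub only exists so that the example runs without access to the Grid5000 DNS.

```ruby
require "resolv"

# Resolve a hostname to an IP address; return nil (with a warning on
# STDERR) when the name cannot be resolved.
def node_ip(hostname, resolver = Resolv)
  resolver.getaddress(hostname)
rescue Resolv::ResolvError => e
  warn "cannot resolve #{hostname}: #{e.message}"
  nil
end

# Stand-in resolver backed by a fixed table (hypothetical data).
class StubResolver
  def initialize(table)
    @table = table
  end

  def getaddress(name)
    @table.fetch(name) { raise Resolv::ResolvError, "no address for #{name}" }
  end
end

stub = StubResolver.new({ "paramount-1.rennes.grid5000.fr" => "10.0.0.1" })
p node_ip("paramount-1.rennes.grid5000.fr", stub)
p node_ip("unknown-node.rennes.grid5000.fr", stub)
```

With the default resolver, the same call queries the DNS servers configured on the machine, which is why running outside Grid5000 produces the error.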

What kind of commit message?

After each modification to the repository, you should immediately commit your changes with a meaningful message, so that people can easily understand what has changed (latest changes will be displayed in a syndication feed). Your commits should also be site-specific, or even cluster-specific to avoid merge conflicts. Try to avoid putting a lot of changes in only one commit.

You should also check that your name and email are correctly configured in your Git configuration:

  $ git config --get user.name
  $ git config --get user.email

Otherwise you can set them by issuing:

  $ git config --global user.name 'John Doe'
  $ git config --global user.email johndoe@example.com

Why has my commit been rejected?

Since users will make queries such as "give me the description of that site at this date", the delay between the date of the commit and the effective replication of the changes to the APIs must be as short as possible. That is why you should push your changes to the remote MASTER REPOSITORY right after your commit. Please note that commits whose commit date is more than 60 seconds old will be rejected (if you encounter an error, check that your system clock is correctly synchronized with a time server).
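
The 60-second rule can be checked locally before pushing. The sketch below is only an illustration of the rule as stated here; the commit_too_old? helper is hypothetical and is not the server-side check itself.

```ruby
MAX_COMMIT_AGE = 60 # seconds, as stated in the rule above

# True when the commit is older than the allowed age at time `now`.
def commit_too_old?(committed_at, now = Time.now, max_age = MAX_COMMIT_AGE)
  (now - committed_at) > max_age
end

# The commit time of HEAD can be obtained as a UNIX timestamp with:
#   git log -1 --format=%ct
# Fixed timestamps are used here so that the example is reproducible.
now = Time.at(1_284_000_000)
puts commit_too_old?(Time.at(1_284_000_000 - 30), now)  # committed 30 s ago
puts commit_too_old?(Time.at(1_284_000_000 - 120), now) # committed 2 min ago
```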