python-grid5000
python-grid5000 is a Python package wrapping the Grid’5000 REST API. You can use it as a library in your Python project, or you can explore the Grid’5000 resources interactively using the embedded shell.
API compatibility
Client version | API version |
---|---|
1.x | 3.x (stable) |
Thanks
The core code is borrowed from python-gitlab with small adaptations to conform with the Grid5000 API models (with an ‘s’!)
Contributing
- To contribute, you can drop me an email or open an issue for a bug report or feature request.
- There are many areas where this can be improved. Some of them are listed here:
  - The complete coverage of the API isn’t finished (yet), but this should be fairly easy to reach. Most of the logic goes in grid5000.objects. To be honest, I only implemented the features that I needed the most.
  - Returned status codes aren’t yet handled well.
Comparison with …
- RESTfully: It consumes REST APIs following the HATEOAS principles. This allows the client to fully discover the resources and actions available. Most of the G5K API follows these principles, but the Storage API, for instance, doesn’t. Thus RESTfully isn’t compatible with all the features offered by the Grid’5000 API. It’s a Ruby library. python-grid5000 borrows its friendly syntax for resource browsing, but in Python.
- Execo: Written in Python. The api module gathers a lot of utility functions leveraging the Grid’5000 API. Resources aren’t exposed in a syntax-friendly manner; instead, functions for some classical operations are exposed (mainly getters). It has a convenient way of caching the reference API. python-grid5000 is a resource-oriented wrapper around the Grid’5000 API that seeks 100% coverage.
- Raw requests: The reference HTTP library in Python. Good for prototyping but low-level. python-grid5000 encapsulates this library.
Installation and examples
- Please refer to https://api.grid5000.fr/doc/4.0/reference/spec.html for the complete specification.
- All the examples are exported in the examples subdirectory so you can easily test and adapt them.
- The configuration is read from a configuration file located in the home directory (should be compatible with the restfully one). It can be created with the following:
- When accessing the API from outside of Grid’5000 (e.g. your local workstation), you need to specify the following configuration file:
echo '
username: MYLOGIN
password: MYPASSWORD
' > ~/.python-grid5000.yaml
- When accessing the API from a Grid’5000 frontend, providing the username and password is optional. Nevertheless you’ll need to deal with SSL verification by specifying the path to the certificate to use:
echo '
verify_ssl: /etc/ssl/certs/ca-certificates.crt
' > ~/.python-grid5000.yaml
- Using a virtualenv is recommended (python 3.5+ is required)
virtualenv -p python3 venv
source venv/bin/activate
pip install python-grid5000
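If you would rather create the configuration file from Python, the following is a minimal sketch under stated assumptions. The helper name write_config and its parameters are hypothetical and not part of python-grid5000; only the username, password and verify_ssl keys come from the text above. Values are written as double-quoted strings, which YAML accepts, and the file is created with owner-only permissions because the password is stored in clear text.

```python
import json
import os
from pathlib import Path


def write_config(path, username=None, password=None, verify_ssl=None):
    """Write a minimal YAML configuration file (hypothetical helper).

    Only the keys mentioned in this README are handled; keys whose value
    is None are left out of the file.
    """
    entries = [
        ("username", username),
        ("password", password),
        ("verify_ssl", verify_ssl),
    ]
    # json.dumps yields a double-quoted string, which is also a valid YAML scalar
    lines = [
        "%s: %s" % (key, json.dumps(value))
        for key, value in entries
        if value is not None
    ]
    path = Path(path)
    # The 0o600 mode is applied only when the file is created by this call
    fd = os.open(str(path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return path
```

For instance, write_config(Path.home() / ".python-grid5000.yaml", username="MYLOGIN", password="MYPASSWORD") would produce a file equivalent to the first echo command above.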
Grid’5000 shell
If you call grid5000 on the command line you should land in an IPython shell. Before starting, the file $HOME/.python-grid5000.yaml will be loaded.
$) grid5000
Python 3.6.5 (default, Jun 17 2018, 21:32:15)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.3.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: gk.sites.list()
Out[1]:
[<Site uid:grenoble>,
 <Site uid:lille>,
 <Site uid:luxembourg>,
 <Site uid:lyon>,
 <Site uid:nancy>,
 <Site uid:nantes>,
 <Site uid:rennes>,
 <Site uid:sophia>]
In [2]: # gk is your entry point
Reference API
Get node information
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
node_info = gk.sites["nancy"].clusters["grisou"].nodes["grisou-1"]
print("grisou-1 has {threads} threads and has {ram} bytes of RAM".format(
    threads=node_info.architecture["nb_threads"],
    ram=node_info.main_memory["ram_size"]))
Get Versions of resources
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
root_versions = gk.root.get().versions.list()
print(root_versions)
rennes = gk.sites["rennes"]
site_versions = rennes.versions.list()
print(site_versions)
cluster = rennes.clusters["paravance"]
cluster_versions = cluster.versions.list()
print(cluster_versions)
node_versions = cluster.nodes["paravance-1"].versions.list()
print(node_versions)
Browse the reference API offline
Note that only GET like requests are accepted on the ref API.
import logging
import json
from pathlib import Path
import os
from grid5000 import Grid5000, Grid5000Offline
logging.basicConfig(level=logging.DEBUG)
# First get a copy of the reference api
# This is a one time and out-of-band process,
# here we get it by issuing a regular HTTP request
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
data = gk.dump_ref_api()
Path("ref.yaml").write_text(json.dumps(data))
# you can dump the data to a file
# and reuse it offline using the dedicated client
# here we reuse directly the data we got (no more HTTP requests will be issued)
ref = Grid5000Offline(json.loads(Path("ref.yaml").read_text()))
print(ref.sites["rennes"].clusters["paravance"].nodes["paravance-1"])
Monitoring API
Get Statuses of resources
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
rennes = gk.sites["rennes"]
site_statuses = rennes.status.list()
print(site_statuses)
cluster = rennes.clusters["paravance"]
cluster_statuses = cluster.status.list()
print(cluster_statuses)
Job API
Job filtering
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# state=running will be placed in the query params
running_jobs = gk.sites["rennes"].jobs.list(state="running")
print(running_jobs)
# get a specific job by its uid
job = gk.sites["rennes"].jobs.get("424242")
print(job)
# or using the bracket notation
job = gk.sites["rennes"].jobs["424242"]
print(job)
Submit a job
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# This is equivalent to gk.sites.get("rennes")
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600"})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(10)
print(job)
print("Assigned nodes : %s" % job.assigned_nodes)
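The polling loop above never ends if the job fails before reaching the running state. As a hedged sketch, the helper below adds a timeout and a set of failure states. The name wait_for_state and all of its parameters are hypothetical, it relies only on the state attribute and the refresh() method used above, and the failure state names passed by the caller (for instance "error") are an assumption to check against the API specification.

```python
import time


def wait_for_state(obj, attribute, expected, failed=(), timeout=600,
                   interval=10, sleep=time.sleep, clock=time.monotonic):
    """Poll obj.refresh() until getattr(obj, attribute) == expected.

    Hypothetical helper, not part of python-grid5000.
    Raises RuntimeError if a state listed in `failed` is observed and
    TimeoutError once `timeout` seconds have elapsed.
    """
    deadline = clock() + timeout
    while True:
        current = getattr(obj, attribute)
        if current == expected:
            return obj
        if current in failed:
            raise RuntimeError("unexpected %s: %s" % (attribute, current))
        if clock() >= deadline:
            raise TimeoutError("still %r after %s seconds" % (current, timeout))
        sleep(interval)
        obj.refresh()
```

With the job created above, this would be called as wait_for_state(job, "state", "running", failed=("error",)); the same helper could wait for a deployment with wait_for_state(deployment, "status", "terminated").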
Deployment API
Deploy an environment
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# This is equivalent to gk.sites.get("rennes")
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "types": ["deploy"]})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(10)
print("Assigned nodes : %s" % job.assigned_nodes)
deployment = site.deployments.create({"nodes": job.assigned_nodes,
                                      "environment": "debian9-x64-min"})
# To get SSH access to your nodes you can pass your public key
#
# from pathlib import Path
#
# key_path = Path.home().joinpath(".ssh", "id_rsa.pub")
#
#
# deployment = site.deployments.create({"nodes": job.assigned_nodes,
#                                       "environment": "debian9-x64-min",
#                                       "key": key_path.read_text()})
while deployment.status != "terminated":
    deployment.refresh()
    print("Waiting for the deployment [%s] to be finished" % deployment.uid)
    time.sleep(10)
print(deployment.result)
Storage API
Get Storage accesses
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
print(gk.sites["rennes"].storage["msimonin"].access.list())
Set storage accesses (e.g. for VMs)
from netaddr import IPNetwork
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "resources": "slash_22=1+nodes=1"})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(5)
subnet = job.resources_by_type['subnets'][0]
ip_network = [str(ip) for ip in IPNetwork(subnet)]
# create access for all ips in the subnet
access = site.storage["msimonin"].access.create({"ipv4": ip_network,
                                                 "termination": {"job": job.uid,
                                                                 "site": site.uid}})
Vlan API
Get vlan(s)
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
# Get all vlans
vlans = site.vlans.list()
print(vlans)
# Get a specific one
vlan = site.vlans.get("4")
print(vlan)
vlan = site.vlans["4"]
print(vlan)
# Get vlan of some nodes
print(site.vlansnodes.submit(["paravance-1.rennes.grid5000.fr", "paravance-2.rennes.grid5000.fr"]))
# Get nodes in vlan
print(site.vlans["4"].nodes.list())
Set nodes in vlan
- Putting primary interface in a vlan
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "resources": "{type='kavlan'}/vlan=1+nodes=1",
                        "types": ["deploy"]})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(5)
deployment = site.deployments.create({"nodes": job.assigned_nodes,
                                      "environment": "debian9-x64-min",
                                      "vlan": job.resources_by_type["vlans"][0]})
while deployment.status != "terminated":
    deployment.refresh()
    print("Waiting for the deployment [%s] to be finished" % deployment.uid)
    time.sleep(10)
print(deployment.result)
- Putting the secondary interface in a vlan
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
def _to_network_address(host, interface):
    """Translate a host to a network address
    e.g:
    paranoia-20.rennes.grid5000.fr -> paranoia-20-eth2.rennes.grid5000.fr
    """
    splitted = host.split('.')
    splitted[0] = splitted[0] + "-" + interface
    return ".".join(splitted)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "resources": "{type='kavlan'}/vlan=1+{cluster='paranoia'}nodes=1",
                        "types": ["deploy"]})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(5)
vlanid = job.resources_by_type["vlans"][0]
# we hard code the interface but this can be discovered in the node info
# TODO: write the code here to discover
nodes = [_to_network_address(n, "eth2") for n in job.assigned_nodes]
print(nodes)
# set in vlan
site.vlans[vlanid].nodes.submit(nodes)
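The example above hard-codes eth2 and leaves interface discovery as a TODO. One possible approach is sketched below under clearly stated assumptions: it supposes that the node description carries a list of network adapters, each a dictionary with device, mounted and kavlan keys. These field names and the sample data are assumptions made for illustration and should be checked against the reference API before use. The helper name secondary_kavlan_devices is hypothetical.

```python
def secondary_kavlan_devices(network_adapters):
    """Return the devices that can be put in a vlan and are not the
    primary (mounted) interface.

    Hypothetical helper; the "device", "mounted" and "kavlan" keys are
    assumed field names, not confirmed by this README.
    """
    return [
        adapter["device"]
        for adapter in network_adapters
        if adapter.get("kavlan") and not adapter.get("mounted")
    ]


# Made-up adapter list, for illustration only
sample_adapters = [
    {"device": "eth0", "mounted": True, "kavlan": True},
    {"device": "eth1", "mounted": False, "kavlan": False},
    {"device": "eth2", "mounted": False, "kavlan": True},
    {"device": "bmc", "mounted": False},
]
candidates = secondary_kavlan_devices(sample_adapters)
print(candidates)
```

If the node objects expose such a list in the same way as architecture and main_memory in the node information example, the first candidate could replace the hard-coded "eth2" passed to _to_network_address.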
Finding vlans of a user, users of a vlan
The Vlan API makes it possible to check which vlans are being used by a user through the vlansusers manager. Additionally, for each individual vlan, it is possible to check whether a user is authorized or not.
import os
from grid5000 import Grid5000
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
#Get list of users using vlans
users = site.vlansusers.list()
#Get list of vlans a user is using.
user = site.vlansusers['msimonin']
print(user.vlans)
#Get list of users using a specific vlan
users = site.vlans['4'].users.list()
#Check if a user has access to a specific vlan.
user = site.vlans['4'].users['msimonin']
print(user.status)
Vlan Stitching
The stitching API allows users to connect a Grid’5000 global vlan to external vlans connected to other testbeds for experiments involving wide area layer2 networks.
import os
from grid5000 import Grid5000
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# List current stitchings
gk.stitcher.list()
# Stitching global vlan 16 to external vlan 1290
gk.stitcher.create({"id":"16", "sdx_vlan_id":"1290"})
# Get a stitching's information by Grid'5000 global vlan id.
stitching = gk.stitcher.get('16')
# Or
stitching = gk.stitcher['16']
# End stitching
stitching = gk.stitcher.get('16')
stitching.delete()
# Or
gk.stitcher.delete('16')
Metrics API
Get the timeseries corresponding to a job
Credits to lturpin.
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
def get_job_consumption(job_id, gk, site):
    metrics = gk.sites[site].metrics
    job = gk.sites[site].jobs[job_id]
    # nodes as list : "cluster-number.site.grid5000.fr"
    nodes_dom = job.assigned_nodes
    # nodes as list : "cluster-number"
    nodes = map(lambda node_dom: node_dom.split('.')[0], nodes_dom)
    # nodes as string : "cluster-number,cluster-number,..."
    nodes_str = ','.join(nodes)
    start = job.started_at
    end = job.stopped_at
    kwargs = {
        "only": nodes_str,
        "resolution": 1,
        "from": start,
        "to": end
    }
    timeseries = metrics["power"].timeseries.list(**kwargs)
    return timeseries
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
timeseries = get_job_consumption("1092446", gk, "lyon")
print(timeseries)
Get some timeseries (and plot them)
For this example you’ll need matplotlib, seaborn and pandas.
import logging
import os
from grid5000 import Grid5000
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import time
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
metrics = gk.sites["lyon"].metrics
print("--- available metrics")
print(metrics.list())
print("---- power metric")
print(metrics["power"])
print("----- a timeserie")
now = time.time()
kwargs = {
    "only": "nova-1,nova-2,nova-3",
    "resolution": 1,
    "from": int(now - 600),
    "to": int(now)
}
timeseries = metrics["power"].timeseries.list(**kwargs)
# let's visualize this
df = pd.DataFrame()
for timeserie in timeseries:
    print(timeserie)
    timestamp = timeserie.timestamps
    value = timeserie.values
    measurement = timeserie.uid
    df = pd.concat([df, pd.DataFrame({
        "timestamp": timestamp,
        "value": value,
        "measurement": [measurement]*len(timestamp)
    })])
sns.relplot(data=df,
            x="timestamp",
            y="value",
            hue="measurement",
            kind="line")
plt.show()
More snippets
Site of a cluster
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
clusters = ["dahu", "parasilo", "chetemi"]
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
sites = gk.sites.list()
matches = []
for site in sites:
    candidates = site.clusters.list()
    matching = [c.uid for c in candidates if c.uid in clusters]
    # a site may host several of the requested clusters
    for uid in matching:
        matches.append((site, uid))
        clusters.remove(uid)
print("We found the following matches %s" % matches)
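When the same lookup is repeated for many clusters, it can help to build a cluster-to-site index once and query it afterwards. The sketch below works on a plain dictionary so that it runs without network access. The helper name build_cluster_index and the sample data are illustrative assumptions; with a live client the input dictionary could be filled from site.uid and site.clusters.list() as in the loop above.

```python
def build_cluster_index(clusters_by_site):
    """Invert a {site uid: [cluster uid, ...]} mapping.

    Hypothetical helper returning {cluster uid: site uid}.
    """
    index = {}
    for site_uid, cluster_uids in clusters_by_site.items():
        for cluster_uid in cluster_uids:
            index[cluster_uid] = site_uid
    return index


# Illustrative data only, not fetched from the API
sample = {
    "grenoble": ["dahu", "yeti"],
    "rennes": ["parasilo", "paravance"],
    "lille": ["chetemi"],
}
index = build_cluster_index(sample)
wanted = ["dahu", "parasilo", "chetemi", "unknown"]
# Clusters missing from the index are skipped
sites_of_wanted = {uid: index[uid] for uid in wanted if uid in index}
print(sites_of_wanted)
```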
Get all jobs with a given name on all the sites
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
NAME = "pyg5k"
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
sites = gk.sites.list()
site = gk.sites["rennes"]
sites = [gk.sites["rennes"], gk.sites["nancy"], gk.sites["grenoble"]]
# creates some jobs
jobs = []
for site in sites:
    job = site.jobs.create({"name": NAME,
                            "command": "sleep 3600"})
    jobs.append(job)
_jobs = []
for site in sites:
    _jobs.append((site.uid, site.jobs.list(name=NAME,
                                           state="waiting,launching,running")))
print("We found %s" % _jobs)
# deleting the jobs
for job in jobs:
    job.delete()
Caching API responses
The Grid’5000 reference API is static, so requests against it can be sped up considerably by caching. Currently python-grid5000 doesn’t do caching out of the box but defers that to the consuming application. There are many ways to implement a cache. Among them, the LRU cache (https://docs.python.org/3/library/functools.html#functools.lru_cache) provides an in-memory caching facility but gives you little control over the cache. The ring library (https://ring-cache.readthedocs.io/en/stable/) is great as it implements different backends for your cache (esp. cross-process caches) and gives you control over the cached objects. Enough talking:
import logging
import threading
import os
import diskcache
from grid5000 import Grid5000
import ring
_api_lock = threading.Lock()
# Keep track of the api client
_api_client = None
storage = diskcache.Cache('cachedir')
def get_api_client():
    """Gets the reference to the API client (singleton)."""
    global _api_client
    with _api_lock:
        if not _api_client:
            conf_file = os.path.join(os.environ.get("HOME"),
                                     ".python-grid5000.yaml")
            _api_client = Grid5000.from_yaml(conf_file)
        return _api_client
@ring.disk(storage)
def get_sites_obj():
    """Get all the sites."""
    gk = get_api_client()
    return gk.sites.list()
@ring.disk(storage)
def get_all_clusters_obj():
    """Get all the clusters."""
    sites = get_sites_obj()
    clusters = []
    for site in sites:
        # should we cache the list as well?
        clusters.extend(site.clusters.list())
    return clusters
if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    clusters = get_all_clusters_obj()
    print(clusters)
    print("Known key in the cache")
    print(get_all_clusters_obj.get())
    print("Calling again the function is now faster")
    clusters = get_all_clusters_obj()
    print(clusters)
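For comparison, the in-memory option mentioned above can be sketched with functools.lru_cache from the standard library. The function fetch_sites is a hypothetical stand-in for an expensive call such as gk.sites.list(), used here so that the snippet runs without network access; the returned site names are sample values only.

```python
import functools

# Counts how many times the expensive call is really made
calls = {"count": 0}


def fetch_sites():
    """Stand-in for an expensive API call (hypothetical)."""
    calls["count"] += 1
    return ("grenoble", "lille", "rennes")


@functools.lru_cache(maxsize=None)
def get_sites():
    """Return the sites, hitting the backend only on the first call."""
    return fetch_sites()


first = get_sites()
second = get_sites()  # served from the in-memory cache
info = get_sites.cache_info()
print(calls["count"], info.hits, info.misses)
```

The cache lives only as long as the process and is emptied with get_sites.cache_clear(), which is the limited control the text above refers to; a disk-backed cache such as the ring example survives across runs.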
Using Grid’5000 client certificates
python-grid5000 can also be used as a trusted client with a Grid’5000 internal certificate. In this mode users can pass the g5k_user argument to most calls to specify which user the API call should be made as. When g5k_user is not specified, API calls are made as the anonymous user, whose access is limited to the Grid’5000 reference API.
In this mode python-grid5000 does not store any login information, so g5k_user must be provided explicitly on every call that requires one.
import logging
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
gk = Grid5000(
uri="https://api-ext.grid5000.fr/stable/",
sslcert="/path/to/ssl/certfile.cert",
sslkey="/path/to/ssl/keyfile.key"
)
gk.sites.list()
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600"},
                       g5k_user="auser1")
# Since the 'anonymous' user cannot inspect jobs, the following call raises an exception
# python-grid5000.exceptions.Grid5000AuthenticationError: 401 Unauthorized
job.refresh()
# Both of the following calls work since any user can request info on any job.
job.refresh(g5k_user='auser1')
job.refresh(g5k_user='auser2')
# Some operations can only be performed by the job's creator.
# The following call raises an exception
# pyg5k.exceptions.Grid5000DeleteError: 403 Unauthorized
job.delete(g5k_user='auser2')
# This call works as expected
job.delete(g5k_user='auser1')
Appendix
How to export this file
- Produce README.rst. To generate the rst file, load the ox-rst.el file from https://github.com/msnoigrs/ox-rst into emacs. Then do C-c C-e r r or M-x org-rst-export-to-rst.
- Produce python example scripts. Do C-c C-v t or M-x org-babel-tangle. The scripts are available under examples.