RemI

Remote Inria
Code here, run there!

A CLI tool for remotely interacting with Inria computing resources.

What is remi?

remi is a tool that aims to ease the workflow of Inria researchers when it comes to performing computations remotely. More precisely, remi is configured separately for each project you want to use it with.
Once your preferences are set, you can run your code either on your desktop at Inria or on one of the cluster nodes.

If you are tired of messing with four-line oarsub commands, or if you find yourself committing and pushing your code each time you want to test it remotely, this tool is for you!

Note: Even though this tool was designed to be used from a personal computer distinct from your Inria desktop, it also works perfectly well when run directly on the Inria desktop.
Most of the remi features are still relevant in this case.

Presentation / tutorial video:
https://odysee.com/@GaetanLepage:6/remote-inria:6

Main features

  • Synchronization: remi creates a remote clone of your project on the Inria storage space (under /scratch) and lets you synchronize it easily.

  • Remote execution: The core idea behind remi is to provide an easy way to execute code on remote computers (regular Inria desktops or cluster servers).

  • Cluster and singularity support: The complex procedure for requesting computing resources is integrated into remi to minimize overhead. Singularity container management (building and running) is also embedded.

  • Jupyter notebook support: This tool lets you run a jupyter notebook server on your Inria workstation and connect to it from your local browser.

Install

Dependencies and prerequisites

First, you need to set up your ssh keys to be able to connect to the Inria computers (learn more here).
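Setting this up usually means declaring the bastion and your desktop in ~/.ssh/config so that a plain `ssh <hostname>` works. A minimal sketch (hostnames are illustrative; the bastion address matches the default shown in the Configuration section):

```
# ~/.ssh/config (illustrative values)
Host inria-bastion
    HostName bastion.inrialpes.fr
    User INRIA_USERNAME

Host mydesktop
    HostName mydesktop.inrialpes.fr
    User INRIA_USERNAME
    ProxyJump inria-bastion
```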

You also need to have rsync and python (>= 3.7) on your system.

Installation

The installation is done on the local PC.

To perform a simple user installation with pip, the following command is enough:

pip install git+ssh://git@gitlab.inria.fr/galepage/remi.git

If you want to make modifications to the code, you can clone this repo and install the package in editable mode (pip's -e option).

git clone git@gitlab.inria.fr:galepage/remi.git
cd remi
pip install -e .

How to use it

The first thing to do is to initialize the project like so:

[me@local-pc:~]$ cd <PATH_TO_MY_PROJECT>
[me@local-pc:project]$ remi init

You will be asked to provide:

  • Project name: A name for your project (can be anything).
  • Inria username: Your Inria login (the one you use to log in to your desktop).
  • Inria hostname: The name of your Inria desktop (alya, scorpio, andromeda...)
  • Virtual environment: If you want to use a virtual environment and if so, its type (virtualenv or conda).

This process leads to the creation of a folder .remi within your project folder.
It contains a configuration file (config.yaml) and an exclude file (exclude.txt).

Once configured, you are ready to go!

File synchronization

One of the main features of remi is syncing.
More precisely, this tool sends your code to the storage of your workstation (/local_scratch).
This MUST NOT be considered a kind of 'backup'.
Its only purpose is to copy the code in order to run it. Each time you push, a one-way synchronization updates the remote copy of your project and deletes any remote files that no longer exist locally.

I strongly recommend using a versioning tool such as git to manage your project, as remi does not intend to replace this functionality.

You are not expected to modify the remote copy of your code (this is exactly what remi is for: edit locally, run remotely).
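Conceptually, remi push behaves like a one-way rsync with deletion; the following is only a sketch to illustrate the semantics (paths and hostname are illustrative, not remi's exact invocation):

```
# Illustrative one-way sync: mirrors the local project to the remote copy,
# deleting remote files that no longer exist locally.
rsync -av --delete \
    --exclude-from=.remi/exclude.txt \
    ./ INRIA_USERNAME@mydesktop:/scratch/mydesktop/INRIA_USERNAME/.remi_projects/PROJECT_NAME/
```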

Exclude mechanism

Some folders and files might not be relevant to sync (remi push) on the remote desktop.
You can edit the exclude file (.remi/exclude.txt) and add patterns to ignore files or folders, as you would in a .gitignore file.
By default, this file is not empty: it contains locations that should not be synced, such as output/, .git/, __pycache__/...

Here are some directories that are excluded by default:

  • .git
  • output/
  • notebooks/
  • ...

This is useful for folders that get filled on the remote side, such as output/ and notebooks/. You can then run remi pull <FOLDER> to fetch this remote data back to your local machine.
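For instance, a typical exclude file might look like this (the *.sif entry is only an illustration of a pattern you might add yourself):

```
# .remi/exclude.txt
.git/
__pycache__/
output/
notebooks/
*.sif
```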

Recap diagram:

╭----------------╮                              ╭--------------------------------╮
| Local computer |                              | Remote storage (local scratch) |
╰----------------╯                              ╰--------------------------------╯
 my_code/   ╮                                   ╭  my_code/
 folder_a/  |                                   |  folder_a/
 folder_b/  |                                   |  folder_b/
 file_a     |---------- `remi push` ----------->|  file_a
 file_b     |                                   |  file_b
 file_c     |                                   |  file_c
 file_d     ╯                                   ╰  file_d
 output/     <------- `remi pull output/`-------   output/
 notebooks/                                        notebooks/
 ignored_folder/                                   other_ignored_folder/


ignored folders (listed in `.remi/exclude.txt`):
- output/
- notebooks/
- ignored_folder/
- other_ignored_folder/

Virtual environments

remi supports the virtualenv and conda virtual environments to run your code.
All you have to do is to configure it (if you wish to use a virtual environment) in the configuration file. Learn more in the Configuration section below.

Once configured, the virtual environment will be automatically enabled on the remote desktop when running code.

If you use a conda environment, conda must be installed and configured on the selected desktop.
You might need to run conda init bash once on the remote workstation (over ssh, for instance).
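For example (the hostname is illustrative):

```
# One-time conda setup on the remote workstation
ssh mydesktop
conda init bash
```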

Inria Grenoble cluster architecture

One of the main features of remi is to easily run your code on the Inria cluster.
Through the configuration file, you can specify resources to request (see more in the Configuration section).

Command Line Interface

Here is a description of all the available cli commands.
They are all meant to be run from the project directory on your local machine.

[me@local-pc:project]$ remi <COMMAND>

Every command is executed from the project root directory, so paths relative to the project root may be used.


Initialization

init

remi init
Initialize the project in the current working directory. Generate the configuration files.

setup

remi setup [(-h | --hostname) HOSTNAME]
Set up the remote project location and install the virtual environment, if any.

Options:

  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.

update-packages

remi update-packages [(-h | --hostname) HOSTNAME]
Update the packages on the remote workstation.

Options:

  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.

prepare

remi prepare [(-h | --hostname) HOSTNAME]
Prepare the remote workstation: set up the remote location and virtual environment, push project files, and update packages.
Equivalent to running setup, push, and update-packages.

Options:

  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.

Transferring and managing files

push

remi push [-f | --force]
Sync the content of the project directory to the remote desktop.
If no changes are detected locally, the file sync will not be attempted.

Options:

  • -f | --force: Run the sync command even if no local changes were detected.

pull

remi pull [REMOTE_PATH] [-f | --force]
Sync the content of the provided REMOTE_PATH directory from the remote computer to the local one.
This can be used to sync back experiment outputs resulting from a computation done remotely.
If no path is specified, output/ will be used as the default value.

Options:

  • -f | --force: Do not ask for a confirmation before pulling.
    Use with caution: any conflicting local files may be overwritten.

clean

remi clean [REMOTE_PATH] [-f | --force]
Clean the content of the provided REMOTE_PATH directory on the remote location.
If no directory is specified, output/ will be used as the default value.

Options:

  • -f | --force: Do not ask for a confirmation before cleaning.
    Use with caution.

Desktop (execution on Inria desktop)

Execute the code on the remote workstation.

Available subcommands:

script (default)

  • remi desktop [script] [(-s | --script) SCRIPT] [(-h | --hostname) HOSTNAME] [-b | --background] [-a | --attach] [-c | --use-container] [--no-push] [--no-build]
    Run a bash script on the remote computer.
    This is the default subcommand (and can thus be run using remi desktop).
    Options:
  • -s | --script: The path to a bash script to run.
    Default: script.sh
  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.
  • -b | --background: Run the script in the background (free the ssh session) by running it inside a screen or tmux session (configured in .remi/config.yaml).
  • -a | --attach: If running the job in background mode, directly attach to it.
  • -c | --use-container: Run the code in the singularity container rather than directly.
  • --no-push: Do not attempt to sync project files to the remote location.
  • --no-build: Do not attempt to (re)-build the singularity container.

Examples:

  • remi desktop: Run script.sh on the default remote desktop.
  • remi desktop -h mensa -c: Run script.sh on mensa within the singularity container.
  • remi desktop -s training_script.sh -b: Run training_script.sh on the default remote desktop in background mode.

command

remi desktop command COMMAND [(-h | --hostname) HOSTNAME] [-b | --background] [-a | --attach] [-c | --use-container] [--no-push] [--no-build]
Run the specified COMMAND on the remote computer.

Options:

  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.
  • -b | --background: Run the script in the background (free the ssh session) by running it inside a screen or tmux session (configured in .remi/config.yaml).
  • -a | --attach: If running the job in background mode, directly attach to it.
  • -c | --use-container: Run the command in the singularity container rather than directly.
  • --no-push: Do not attempt to sync project files to the remote location.
  • --no-build: Do not attempt to (re)-build the singularity container.

Examples:

  • remi desktop command nvidia-smi: Run the command nvidia-smi on the default remote desktop.
  • remi desktop command -h bacchus -c python train_net.py: Run the command python train_net.py on bacchus within the singularity container.
  • remi desktop command -b ./test.sh --number_steps=1000: Run the command ./test.sh --number_steps=1000 on the default remote desktop in background mode.

interactive

remi desktop interactive [(-h | --hostname) HOSTNAME] [--no-push]
Start an interactive session on the remote computer.

Options:

  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.
  • --no-push: Do not attempt to sync project files to the remote location.

Examples:

  • remi desktop interactive: Start an interactive session on the default remote desktop.
  • remi desktop interactive -h hydra --no-push: Start an interactive session on hydra without pushing local changes.

attach-session

remi desktop attach-session [(-h | --hostname) HOSTNAME]
Attach to the screen/tmux running session.

Options:

  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.

Singularity container

build-container

remi build-container [-f | --force]
Build the singularity container on the remote desktop. If no changes are detected locally, the container build will not be attempted.

Options:

  • -f | --force: Run the build command even if no local changes were detected in the recipe (.def file) and even if the image already exists.

Cluster

Execute the code on the Inria cluster. Please note that the cluster request (oarsub command) and singularity container are configurable from the config file.
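For instance, with the defaults shown in the Configuration section, the generated request might resemble the following (the exact flag mapping is an assumption for illustration; refer to the oarsub documentation for the real syntax):

```
# Illustrative oarsub request built from the config defaults
oarsub --name PROJECT_NAME \
       -p "cluster='perception'" \
       -l "/host=1/gpu=1,walltime=72:00:00" \
       -t besteffort \
       ./script.sh
```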

Available subcommands:

script (default)

  • remi cluster [script] [(-s | --script) SCRIPT] [(-n | --job-name) JOB_NAME] [--no-push] [--no-build]
    Run a bash script on the cluster.
    This is the default subcommand (and can thus be run using remi cluster).
    Options:
  • -s | --script: The path to a bash script to run.
    Default: script.sh
  • -n | --job-name: A custom name for the cluster job (oarsub's --name option).
    Default: The project name
  • --no-push: Do not attempt to sync project files to the remote location.
  • --no-build: Do not attempt to (re)-build the singularity container.

Examples:

  • remi cluster: Run script.sh on the cluster.
  • remi cluster -s training_script.sh: Run training_script.sh on the cluster.

command

remi cluster command COMMAND [(-n | --job-name) JOB_NAME] [--no-push] [--no-build]
Run the specified COMMAND on the cluster.

Options:

  • -n | --job-name: A custom name for the cluster job (oarsub's --name option).
    Default: The project name
  • --no-push: Do not attempt to sync project files to the remote location.
  • --no-build: Do not attempt to (re)-build the singularity container.

Example:

  • remi cluster command ./test.sh --number_steps=1000: Run the command ./test.sh --number_steps=1000 on the cluster.

interactive

remi cluster interactive [(-n | --job-name) JOB_NAME] [--no-push]
Start an interactive session on the cluster. This runs oarsub with the --interactive flag.

Options:

  • -n | --job-name: A custom name for the cluster job (oarsub's --name option).
    Default: The project name
  • --no-push: Do not attempt to sync project files to the remote location.

Examples:

  • remi cluster interactive --no-push: Start an interactive session on the cluster without pushing local changes.

Remote servers

Remote servers are applications that run on a remote computer and can be accessed from your local browser thanks to remi.

Two such servers are supported right now:

  • Jupyter notebook
  • TensorBoard

Others could quite easily be added in the future if needed.

Please note that the connection can take a few seconds to start working. Do not hesitate to refresh your browser once the page opens.
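Under the hood, this kind of access relies on standard ssh port forwarding; conceptually, it amounts to something like the following (a sketch, not remi's exact invocation; the hostname is illustrative):

```
# Forward local port 8080 to port 8080 on the remote desktop,
# so that http://localhost:8080 reaches the remote server.
ssh -N -L 8080:localhost:8080 INRIA_USERNAME@mydesktop
```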


Jupyter notebook

Use a jupyter notebook server running on the remote desktop from the local PC.

WARNING: The way synchronization works with remi will delete any modifications made remotely. You should put your notebooks in the notebooks/ directory (which is excluded by default) and get them back to your local machine using remi pull notebooks/.

Available subcommands:

start (default)

remi jupyter [start] [(-p | --port) PORT] [(-h | --hostname) HOSTNAME] [--browser/--no-browser]
Run a jupyter server on the remote desktop.
start is the default sub-command (which can thus be run using remi jupyter).

Options:

  • -p | --port: The port (local and remote) for the server.
    Default: 8080
  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.
  • --browser/--no-browser: If enabled, automatically open the jupyter notebook in the local browser.
    Default: The value of open_browser in the jupyter section of the config file.

stop

remi jupyter stop [(-p | --port) PORT] [(-h | --hostname) HOSTNAME]
Stop the jupyter server on the remote desktop.

Options:

  • -p | --port: The port (local and remote) for the server.
    Default: 8080
  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.

TensorBoard

Look at how your remote experiment is doing thanks to TensorBoard.

Available subcommands:

start (default)

remi tensorboard [start] [(-p | --port) PORT] [(-h | --hostname) HOSTNAME] [(-d | --logdir) LOGDIR] [--browser/--no-browser]
Run a TensorBoard server on the remote desktop.
start is the default sub-command (which can thus be run using remi tensorboard).

Options:

  • -p | --port: The port (local and remote) for the server.
    Default: 9090
  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.
  • -d | --logdir: Log directory.
    Default: The value of logdir in the tensorboard section of the config file (output/ by default).
  • --browser/--no-browser: If enabled, automatically open TensorBoard in the local browser.
    Default: The value of open_browser in the tensorboard section of the config file.

stop

remi tensorboard stop [(-p | --port) PORT] [(-h | --hostname) HOSTNAME]
Stop TensorBoard on the remote desktop.

Options:

  • -p | --port: The port (local and remote) for the server.
    Default: 9090
  • -h | --hostname: The hostname of an Inria computer.
    Default: The value of pc_name in the config file.

Configuration

Configuration file

Here are the different options that can be configured within the config.yaml file (located in the .remi/ folder).

Don't worry: most of these options are set to sane defaults when the project is initialized (remi init). In practice, you will likely only change a few of them at some point.

Some options are missing from the automatically generated config file but are still supported: you can add them manually if needed.

# Name for your project
project_name: PROJECT_NAME


# Inria username
username: INRIA_USERNAME


# Name of your Inria workstation
pc_name: INRIA_DESKTOP_HOSTNAME


# Location of the project on the remote computer
project_remote_path: /scratch/INRIA_DESKTOP_HOSTNAME/INRIA_USERNAME/.remi_projects/PROJECT_NAME


# Bastion used to ssh into Inria resources
bastion:
  hostname: bastion.inrialpes.fr
  username: INRIA_USERNAME


# Desktop background jobs
background:
    # Which backend to use (`screen` or `tmux`)
    backend: screen

    # Whether to keep the session alive after the job has ended.
    # It lets you attach to the session to see the program output.
    # If 'false', the session will be closed when the job is over and stdout/stderr will be lost.
    # CAUTION: If true, you will have to manually re-attach and close the session.
    keep_session_alive: false



# Virtual environment
virtual_env:
  # Which virtual environment backend to use (`conda` or `virtualenv`)
  type: virtualenv

  # For `virtualenv` or `conda` virtual environments, you can specify a custom path.
  path: venv/

  # The name of your virtual environment (for `conda` environments)
  name: my_conda_env

  # For `conda` environments, path to a `yaml` configuration path
  conda_env_file: environment.yaml

  # For `conda` environments, you may specify a python version
  python_version: 3.9


# Singularity container options
singularity:
  # The name of the 'recipe' file (`.def`) to build the singularity container.
  def_file_name: container.def

  # The name of the singularity image.
  output_sif_name: container.sif

  # A dictionary of binds for the singularity container.
  # If the value is empty (''), the mount point is the same as the path on the host.
  # By default, the project folder is bound within the singularity container: This configuration
  # then allows you to add extra locations.
  # Example:
  #     /path_on_host/my_data: /path_in_container/my_data
  bindings:


# Oarsub options (for more details on `oarsub`, please refer to
# https://oar.imag.fr/docs/latest/user/commands/oarsub.html).
oarsub:

  # Job name
  job_name: PROJECT_NAME

  # Number of cpus requested.
  num_cpus: 1

  # Number of cpu cores requested.
  # If the value is 0, all the cores for the requested cpus will be used.
  num_cpu_cores: 0

  # Number of GPUs requested.
  # If the value is 0, no GPU will be requested (CPU only).
  num_gpus: 1

  # The maximum allowed duration for your job.
  walltime: '72:00:00'

  # The name of the requested cluster (perception, mistis, thoth...)
  cluster_name: perception

  # Optionally specify the id of a specific node (gpu3, node2...)
  host_id:

  # If the options above are too restrictive for your use-case, you may
  # directly provide a property list that will be provided to `oarsub` with the
  # `-p` flag.
  custom_property_query:

  # Whether to schedule the job in the besteffort queue.
  besteffort: true

  # Whether to set the job as idempotent (see oarsub documentation for more details).
  idempotent: false


# Remote servers
# Remote servers are applications that run on a remote computer and can be accessed from your local
# browser thanks to remi.
# Two such servers are supported right now:
# - Jupyter notebook
# - TensorBoard
remote_servers:
    # The command to run for opening the local browser (`<browser_cmd> <url>`)
    browser_cmd: firefox

    # Jupyter notebook
    jupyter:
        # The port (local and remote) for the server
        port: 8080

        # If true, automatically open the jupyter notebook in the local browser.
        open_browser: true

    # TensorBoard
    tensorboard:
        # The port (local and remote) for TensorBoard
        port: 9090

        # Directory from where to run tensorboard.
        logdir: 'output/'

        # If true, automatically open TensorBoard in the local browser.
        open_browser: true
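To make the mapping from the oarsub section to an actual command concrete, here is a small Python sketch. The mapping of config keys to oarsub flags is an assumption made for illustration, not remi's actual implementation:

```python
def build_oarsub_command(cfg: dict) -> str:
    """Assemble an illustrative oarsub call from the 'oarsub' config section.

    The flag mapping here is an assumption for illustration; refer to the
    oarsub documentation for the real syntax.
    """
    # Per the config comments: if num_gpus is 0, no GPU is requested.
    gpu_part = f"/gpu={cfg['num_gpus']}" if cfg.get("num_gpus") else ""
    resources = f"/host=1{gpu_part},walltime={cfg['walltime']}"
    parts = [
        "oarsub",
        f"--name {cfg['job_name']}",
        f"-p \"cluster='{cfg['cluster_name']}'\"",
        f"-l \"{resources}\"",
    ]
    if cfg.get("besteffort"):
        parts.append("-t besteffort")
    if cfg.get("idempotent"):
        parts.append("-t idempotent")
    return " ".join(parts)


# Defaults from the config file above
cfg = {
    "job_name": "PROJECT_NAME",
    "num_gpus": 1,
    "walltime": "72:00:00",
    "cluster_name": "perception",
    "besteffort": True,
    "idempotent": False,
}
print(build_oarsub_command(cfg))
```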

Contributing

Any help improving remi is more than welcome.
You can open an issue to report a bug or request a feature.

Acknowledgement

This project was inspired by LabML Remote.
The latter does not support ssh through a bastion because it relies on the paramiko library; RemI, on the other hand, uses plain calls to the ssh command.
The main motivation for starting a separate project was to design a tool specifically with Inria computing needs in mind (use of the Inria clusters, native use of the available resources, etc.).