1. 29 Jan, 2018 2 commits
  2. 08 Jan, 2018 3 commits
  3. 21 Nov, 2017 2 commits
    • always display the memory limit · 0aa7a6aa
      BAIRE Anthony authored
      but disable the widget for non-admins
    • fix out of memory message · 06420260
      BAIRE Anthony authored
      this variant is closer to the actual meaning
      (the fact that the limit was reached does not automatically imply
       that the process is starving, we cannot decide how much memory
       a process needs without doing some profiling)
  4. 20 Nov, 2017 11 commits
  5. 16 Nov, 2017 1 commit
  6. 14 Nov, 2017 10 commits
    • update the job container command · 2cfd30b8
      BAIRE Anthony authored
      - to have SIGTERM forwarded to the process
      - to propagate the exit code of the process
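The commit does not show the updated command itself. As a minimal sketch of the two behaviours it names, assuming a Python wrapper (the helper name `run_forwarding_sigterm` is hypothetical, and the real change may just as well be a shell-level `exec`), a parent process can relay SIGTERM to its child and then report the child's exit status as its own:

```python
import signal
import subprocess


def run_forwarding_sigterm(argv):
    """Run argv as a child, relay SIGTERM to it, and return its exit code.

    Hypothetical helper for illustration; not taken from the repository.
    Must be called from the main thread (a signal.signal requirement).
    """
    child = subprocess.Popen(argv)

    def forward(signum, frame):
        # Relay the signal to the child instead of terminating the wrapper
        child.send_signal(signum)

    previous = signal.signal(signal.SIGTERM, forward)
    try:
        returncode = child.wait()
    finally:
        # Restore whatever handler was installed before
        signal.signal(signal.SIGTERM, previous)

    # Popen reports "killed by signal N" as -N; map it to the shell
    # convention 128 + N so that the caller sees a valid exit status
    return 128 - returncode if returncode < 0 else returncode
```

A wrapper script would end with something like `sys.exit(run_forwarding_sigterm(sys.argv[1:]))`, so that the container exits with the status of the job process.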
    • order jobs by id in the job list · 9a4b2818
      BAIRE Anthony authored
      (should be less confusing)
    • add the 'rescheduled' future · 7d5df09e
      BAIRE Anthony authored
      to let the task implementations detect that they are being rescheduled
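The commit message does not describe the API of the 'rescheduled' future. The sketch below only illustrates the general idea with asyncio, under the assumption that a task receives a future that is resolved when the task is about to be rescheduled; the names `task` and `main` and the way the future is passed in are hypothetical:

```python
import asyncio


async def task(rescheduled):
    """Hypothetical task body: do some work, but stop early if the
    'rescheduled' future resolves first."""
    work = asyncio.ensure_future(asyncio.sleep(10))  # stand-in for real work
    done, _pending = await asyncio.wait(
        {work, rescheduled}, return_when=asyncio.FIRST_COMPLETED)
    if rescheduled in done:
        # The task is being rescheduled: abandon the current attempt
        work.cancel()
        return "rescheduled"
    return "finished"


async def main():
    loop = asyncio.get_running_loop()
    rescheduled = loop.create_future()
    # Simulate a manager that reschedules the task shortly after it starts
    loop.call_later(0.01, rescheduled.set_result, None)
    return await task(rescheduled)
```

Running `asyncio.run(main())` drives the example; the task notices the resolved future and abandons its work instead of waiting for it to complete.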
    • extend shutdown case (in test_manager_futures) · 9f37db99
      BAIRE Anthony authored
      - add a pending task
      - add a timeout
    • factorisation · 5173f5e1
      BAIRE Anthony authored
    • add --verbose · 6a2a886f
      BAIRE Anthony authored
      console log level:
       (default)    ->  WARNING
       -v/--verbose ->  INFO
       -d/--debug   ->  DEBUG
      log files:
        /vol/log/controller.log   -> INFO
        /vol/log/debug.log        -> DEBUG  (disabled unless -d/--debug or
                                     unless env var DEBUG is set)
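The level mapping above can be sketched with the standard `argparse` and `logging` modules. This is an illustration only, not the controller's actual code: the function names, the logger name `controller` and the `log_dir` parameter are assumptions, while the flags, levels and file names come from the commit message.

```python
import argparse
import logging
import os


def parse_log_options(argv, environ):
    """Return (console_level, debug_file_enabled) for the given arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-d", "--debug", action="store_true")
    args = parser.parse_args(argv)

    if args.debug:
        console_level = logging.DEBUG
    elif args.verbose:
        console_level = logging.INFO
    else:
        console_level = logging.WARNING

    # debug.log is written only with -d/--debug or when DEBUG is in the env
    debug_file = args.debug or "DEBUG" in environ
    return console_level, debug_file


def configure_logging(console_level, debug_file, log_dir="/vol/log"):
    """Attach one console handler and the file handler(s) to a logger."""
    log = logging.getLogger("controller")  # hypothetical logger name
    log.setLevel(logging.DEBUG)            # the handlers do the filtering

    console = logging.StreamHandler()
    console.setLevel(console_level)
    log.addHandler(console)

    main_file = logging.FileHandler(os.path.join(log_dir, "controller.log"))
    main_file.setLevel(logging.INFO)
    log.addHandler(main_file)

    if debug_file:
        dbg = logging.FileHandler(os.path.join(log_dir, "debug.log"))
        dbg.setLevel(logging.DEBUG)
        log.addHandler(dbg)
    return log
```

Setting the logger itself to DEBUG and filtering per handler is what lets the console and the two files each apply a different threshold.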
    • refactor the management of swarm/sandbox resources · 0e301e74
      BAIRE Anthony authored
      - add SwarmAbstractionClient: a class that extends docker.Client and
        hides the API differences between the docker remote API and the
        swarm API. Thus a single docker engine can be used like a swarm
      - add SharedSwarmClient: a class that extends SwarmAbstractionClient,
        monitors the swarm health and its resources (cpu/mem) and manages
        the resource allocation.
        - resources are partitioned in groups (to allow reserving resources
          for higher priority jobs)
        - two SharedSwarmClient instances can work together over TCP in a
          master/slave configuration (to allow the production and
          qualification platforms to use the same swarm without any
          interference)
      - the controller is modified to:
        - use SharedSwarmClient to:
          - wait for the end of a job (in place of DockerWatcher)
          - manage resource reservation (LONG_APPS vs. BIGMEM_APPS vs normal
            apps) and monitor swarm health (fix #124)
          - NOTE: resources of the swarm and sandbox are now managed
            separately (2 instances of SharedSwarmClient), whereas it was
            global before (this was suboptimal)
        - rely on SwarmAbstractionClient to compute the cpu quotas
        - store the container_id of jobs into the DB (fix #128); this is a
          prerequisite to permit renaming apps in the future
        - store the class of the job (normal vs. long app) in the container
          name (for the resource management with SharedSwarmClient)
        - read the configuration from a yaml file (/vol/ro/config.yml) for:
          - cpu/mem quotas
          - swarm resource allocation policy
          - master/slave configuration
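The commit message says that resources are partitioned in groups so that capacity can be reserved for some classes of jobs, but it does not show how SharedSwarmClient does this. The sketch below is a deliberately simplified, hypothetical model of that idea (the `Group` and `PartitionedResources` names, the group names and the units are all assumptions): each group owns a fixed share of cpu and memory, and a reservation succeeds only if it fits in its own group.

```python
from dataclasses import dataclass


@dataclass
class Group:
    cpu: int            # cpus set aside for this group
    mem: int            # memory set aside for this group (arbitrary unit)
    used_cpu: int = 0
    used_mem: int = 0


class PartitionedResources:
    """Hypothetical allocator: one independent budget per job class."""

    def __init__(self, groups):
        self.groups = dict(groups)

    def reserve(self, group, cpu, mem):
        """Reserve resources in a group; return False if they do not fit."""
        g = self.groups[group]
        if g.used_cpu + cpu > g.cpu or g.used_mem + mem > g.mem:
            return False
        g.used_cpu += cpu
        g.used_mem += mem
        return True

    def release(self, group, cpu, mem):
        """Give back resources previously reserved in a group."""
        g = self.groups[group]
        g.used_cpu -= cpu
        g.used_mem -= mem
```

With one group per job class, a burst of normal jobs cannot consume the share set aside for another class; for example, `PartitionedResources({"normal": Group(cpu=2, mem=4), "long": Group(cpu=1, mem=2)})` still accepts a `"long"` reservation after `"normal"` is full.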
    • allow extra params in run-coverage · cfa6bb36
      BAIRE Anthony authored
  7. 13 Nov, 2017 1 commit
  8. 09 Nov, 2017 9 commits
  9. 27 Sep, 2017 1 commit