grid5000 / environments-recipes
Commit 39080e1f, authored Nov 16, 2020 by Baptiste Jonglez
Merge branch 'nvidia_ppc64'
Parents: ee6dcc9a, d35c971f
Showing 11 changed files with 74 additions and 86 deletions
Changed files:
steps/data/setup/puppet/modules/env/files/big/nvidia/cuda-9.0.conf  +0 -1
steps/data/setup/puppet/modules/env/files/big/nvidia/cuda.conf  +1 -2
steps/data/setup/puppet/modules/env/files/big/nvidia/dcgm-exporter.service  +4 -0
steps/data/setup/puppet/modules/env/files/big/nvidia/nvidia-persistenced.service  +5 -0
steps/data/setup/puppet/modules/env/files/big/nvidia/nvidia-smi.service  +10 -0
steps/data/setup/puppet/modules/env/files/big/nvidia/profile  +0 -36
steps/data/setup/puppet/modules/env/manifests/big.pp  +5 -3
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu.pp  +2 -0
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu/cuda.pp  +21 -42
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu/drivers.pp  +11 -2
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu/services.pp  +15 -0
steps/data/setup/puppet/modules/env/files/big/nvidia/cuda-9.0.conf
deleted (100644 → 0)
-/usr/local/cuda/lib64
steps/data/setup/puppet/modules/env/files/big/nvidia/cuda.conf
-/usr/local/cuda-7.0/lib
-/usr/local/cuda-7.0/lib64
+/usr/local/cuda/lib64
steps/data/setup/puppet/modules/env/files/big/nvidia/dcgm-exporter.service
[Unit]
Description=NVIDIA DCGM prometheus exporter service
After=network.target
# Ensure that /dev/nvidia0 is created by first calling nvidia-smi.
# If no GPU is found, nvidia-smi will not create /dev/nvidia0 and we will not run.
Wants=nvidia-smi.service
After=nvidia-smi.service
ConditionPathExists=/dev/nvidia0

[Service]
...
steps/data/setup/puppet/modules/env/files/big/nvidia/nvidia-persistenced-9.0.service → steps/data/setup/puppet/modules/env/files/big/nvidia/nvidia-persistenced.service
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target
# Ensure that /dev/nvidia0 is created by first calling nvidia-smi.
# If no GPU is found, nvidia-smi will not create /dev/nvidia0 and we will not run.
Wants=nvidia-smi.service
After=nvidia-smi.service
ConditionPathExists=/dev/nvidia0

[Service]
Type=forking
...
steps/data/setup/puppet/modules/env/files/big/nvidia/nvidia-smi.service
new file (0 → 100644)
+[Unit]
+Description=Call nvidia-smi once to create /dev/nvidiaX
+
+[Service]
+Type=oneshot
+# Ignore the exit code: the command fails when no GPU is found
+ExecStart=-/usr/bin/nvidia-smi
+
+[Install]
+WantedBy=multi-user.target
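The three unit files above combine a oneshot probe whose exit status is ignored (`ExecStart=-...`) with a `ConditionPathExists=` check in the dependent services. As a rough illustration of that logic outside systemd, here is a minimal shell sketch. The function names `probe_gpu`, `gpu_present`, and `start_if_gpu` and the two overridable variables are hypothetical and not part of the commit; only the default paths come from the unit files.

```shell
#!/bin/sh
# Hypothetical sketch of the ordering and gating that the units express.
# Defaults are taken from the unit files; both can be overridden for testing.
NVIDIA_SMI="${NVIDIA_SMI:-/usr/bin/nvidia-smi}"
GPU_DEV="${GPU_DEV:-/dev/nvidia0}"

# Mirrors "ExecStart=-/usr/bin/nvidia-smi": run the probe, discard its exit status.
probe_gpu() {
    "$NVIDIA_SMI" >/dev/null 2>&1 || true
}

# Mirrors "ConditionPathExists=/dev/nvidia0": succeed only if the device node exists.
gpu_present() {
    [ -e "$GPU_DEV" ]
}

# Mirrors Wants=/After=nvidia-smi.service: probe first, then decide whether to run.
start_if_gpu() {
    probe_gpu
    if gpu_present; then
        echo "start"
    else
        echo "skip"
    fi
}
```

Calling `start_if_gpu` prints `start` or `skip` depending on whether the device node exists after the probe. systemd itself evaluates the condition when the dependent unit is about to start, so this is only an approximation of the real behaviour.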
steps/data/setup/puppet/modules/env/files/big/nvidia/profile
deleted (100644 → 0)
-# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
-# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).
-if [ "`id -u`" -eq 0 ]; then
-  PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/cuda-7.0/bin"
-else
-  PATH="/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/usr/local/cuda-7.0/bin"
-fi
-export PATH
-if [ "$PS1" ]; then
-  if [ "$BASH" ] && [ "$BASH" != "/bin/sh" ]; then
-    # The file bash.bashrc already sets the default PS1.
-    # PS1='\h:\w\$ '
-    if [ -f /etc/bash.bashrc ]; then
-      . /etc/bash.bashrc
-    fi
-  else
-    if [ "`id -u`" -eq 0 ]; then
-      PS1='# '
-    else
-      PS1='$ '
-    fi
-  fi
-fi
-if [ -d /etc/profile.d ]; then
-  for i in /etc/profile.d/*.sh; do
-    if [ -r $i ]; then
-      . $i
-    fi
-  done
-  unset i
-fi
steps/data/setup/puppet/modules/env/manifests/big.pp
...
@@ -17,10 +17,12 @@ class env::big ( $variant = "big", $parent_parameters = {} ){
   class { 'env::big::configure_postfix': }
   # kvm
   class { 'env::big::configure_kvm': }
-  if $env::deb_arch == 'amd64' {
-    # nvidia
+  # nvidia
+  if $env::deb_arch == 'amd64' or $env::deb_arch == 'ppc64el' {
     class { 'env::big::configure_nvidia_gpu': }
-    # beegfs install
+  }
+  # beegfs install
+  if $env::deb_arch == 'amd64' {
     class { 'env::big::install_beegfs': }
   }
   #Allow sshfs
...
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu.pp
...
@@ -6,6 +6,8 @@ class env::big::configure_nvidia_gpu () {
   include 'env::big::configure_nvidia_gpu::modules'
   # Install nvidia drivers
   include 'env::big::configure_nvidia_gpu::drivers'
+  # Install additional services (currently nvidia-smi, needed by cuda and prometheus)
+  include 'env::big::configure_nvidia_gpu::services'
   # Install cuda
   include 'env::big::configure_nvidia_gpu::cuda'
   # Install nvidia ganglia plugins
...
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu/cuda.pp
...
@@ -2,7 +2,20 @@ class env::big::configure_nvidia_gpu::cuda () {
   case "${::lsbdistcodename}" {
     "buster" : {
-      $driver_source = 'http://packages.grid5000.fr/other/cuda/cuda_10.1.168_418.67_linux.run'
+      case "$env::deb_arch" {
+        "amd64": {
+          $driver_source = 'http://packages.grid5000.fr/other/cuda/cuda_10.1.243_418.87.00_linux.run'
+          $libcuda = '/usr/lib/x86_64-linux-gnu/libcuda.so'
+        }
+        "ppc64el": {
+          $driver_source = 'http://packages.grid5000.fr/other/cuda/cuda_10.1.243_418.87.00_linux_ppc64le.run'
+          $libcuda = '/usr/lib/powerpc64le-linux-gnu/libcuda.so'
+        }
+        default: {
+          err "${env::deb_arch} not supported"
+        }
+      }
       $opengl_packages = ['ocl-icd-libopencl1', 'opencl-headers']
       exec{
...
@@ -26,27 +39,7 @@ class env::big::configure_nvidia_gpu::cuda () {
     "stretch" : {
       $driver_source = 'http://packages.grid5000.fr/other/cuda/cuda_9.0.176_384.81_linux-run'
       $opengl_packages = ['ocl-icd-libopencl1', 'opencl-headers']
-      exec{
-        'retrieve_nvidia_cuda':
-          command => "/usr/bin/wget -q $driver_source -O /tmp/NVIDIA-Linux_cuda.run && chmod u+x /tmp/NVIDIA-Linux_cuda.run",
-          timeout => 1200, # 20 min
-          creates => "/tmp/NVIDIA-Linux_cuda.run";
-        'install_nvidia_cuda':
-          command => "/tmp/NVIDIA-Linux_cuda.run --silent --toolkit --samples && /bin/rm /tmp/NVIDIA-Linux_cuda.run",
-          timeout => 2400, # 20 min
-          user => root,
-          require => File['/tmp/NVIDIA-Linux_cuda.run'];
-        'update_ld_conf':
-          command => "/sbin/ldconfig",
-          user => root,
-          refreshonly => true;
-      }
-    }
-    "jessie" : {
-      $driver_source = 'http://packages.grid5000.fr/other/cuda/cuda_9.0.176_384.81_linux-run'
-      $opengl_packages = ['ocl-icd-libopencl1', 'opencl-headers', 'amd-opencl-icd']
+      $libcuda = '/usr/lib/x86_64-linux-gnu/libcuda.so'
       exec{
         'retrieve_nvidia_cuda':
...
@@ -74,7 +67,7 @@ class env::big::configure_nvidia_gpu::cuda () {
           require => Exec['retrieve_nvidia_cuda'];
         '/usr/local/cuda/lib64/libcuda.so':
           ensure => 'link',
-          target => '/usr/lib/x86_64-linux-gnu/libcuda.so',
+          target => $libcuda,
           require => Exec['install_nvidia_cuda'],
           notify => Exec['update_ld_conf'];
         '/etc/ld.so.conf.d/cuda.conf':
...
@@ -82,31 +75,17 @@ class env::big::configure_nvidia_gpu::cuda () {
           owner => root,
           group => root,
           mode => '0644',
-          source => 'puppet:///modules/env/big/nvidia/cuda-9.0.conf',
+          source => 'puppet:///modules/env/big/nvidia/cuda.conf',
           notify => Exec['update_ld_conf'];
         '/etc/systemd/system/nvidia-persistenced.service':
           ensure => file,
           owner => root,
           group => root,
           mode => '0644',
-          source => 'puppet:///modules/env/big/nvidia/nvidia-persistenced-9.0.service';
-      }
-    }
-    "jessie" : {
-      file{
-        '/tmp/NVIDIA-Linux_cuda.run':
-          ensure => file,
-          require => Exec['retrieve_nvidia_cuda'];
-        '/etc/ld.so.conf.d/cuda.conf':
-          ensure => file,
-          owner => root,
-          group => root,
-          mode => '0644',
-          source => 'puppet:///modules/env/big/nvidia/cuda.conf',
-          notify => Exec['update_ld_conf'];
-        '/usr/local/cuda/lib64/libcuda.so':
-          ensure => 'link',
-          target => '/usr/lib/libcuda.so';
+          source => 'puppet:///modules/env/big/nvidia/nvidia-persistenced.service';
+        '/etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service':
+          ensure => link,
+          target => '/etc/systemd/system/nvidia-persistenced.service';
       }
     }
   }
...
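In cuda.pp, the `libcuda.so` link under `/usr/local/cuda/lib64` now points at the per-architecture `$libcuda` path, and `/etc/ld.so.conf.d/cuda.conf` lists `/usr/local/cuda/lib64`. Both resources notify the `update_ld_conf` exec, which runs `/sbin/ldconfig`. The shell sketch below shows roughly what those two file resources amount to on disk. The helper names and the `root` prefix argument are hypothetical, added only so the sketch can be tried in a scratch directory rather than on a real image.

```shell
#!/bin/sh
# Hypothetical helper: create <root>/usr/local/cuda/lib64/libcuda.so as a symlink
# to the architecture-specific library path (the manifest's $libcuda variable).
link_libcuda() {
    libcuda=$1          # e.g. /usr/lib/x86_64-linux-gnu/libcuda.so
    root=${2:-}         # assumed scratch prefix; empty means the real filesystem
    mkdir -p "$root/usr/local/cuda/lib64"
    ln -sf "$libcuda" "$root/usr/local/cuda/lib64/libcuda.so"
}

# Hypothetical helper: write the single-line ld.so.conf.d entry from cuda.conf.
write_cuda_ld_conf() {
    root=${1:-}
    mkdir -p "$root/etc/ld.so.conf.d"
    printf '%s\n' /usr/local/cuda/lib64 > "$root/etc/ld.so.conf.d/cuda.conf"
}
```

On a real system these steps would be followed by `/sbin/ldconfig`, as the manifest does through its `notify` relationship. The sketch leaves that call out so it stays safe to run unprivileged.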
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu/drivers.pp
...
@@ -2,8 +2,17 @@ class env::big::configure_nvidia_gpu::drivers () {
   ### This class exists for gpuclus cluster, that require a recent version of nvidia driver
   # May be changed to a link inside g5k if required
-  $driver_source = 'http://packages.grid5000.fr/other/nvidia//NVIDIA-Linux-x86_64-450.51.05.run'
+  case "$env::deb_arch" {
+    "amd64": {
+      $driver_source = 'http://packages.grid5000.fr/other/nvidia/NVIDIA-Linux-x86_64-450.80.02.run'
+    }
+    "ppc64el": {
+      $driver_source = 'http://packages.grid5000.fr/other/nvidia/NVIDIA-Linux-ppc64le-450.80.02.run'
+    }
+    default: {
+      err "${env::deb_arch} not supported"
+    }
+  }
   package {
     ['module-assistant', 'dkms']:
...
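The driver selection above depends on a naming mismatch: Debian calls the architecture `ppc64el`, while the NVIDIA installer file name uses `ppc64le`. A minimal shell sketch of the same mapping follows. The function name `nvidia_driver_url` and its optional version argument are hypothetical; the base URL, file name pattern, and version `450.80.02` are taken from the diff.

```shell
#!/bin/sh
# Hypothetical helper: print the driver installer URL for a Debian architecture.
# Unsupported architectures produce an error on stderr and a non-zero status,
# similar in spirit to the manifest's "err ... not supported" branch.
nvidia_driver_url() {
    deb_arch=$1
    version=${2:-450.80.02}
    case "$deb_arch" in
        amd64)   nv_arch=x86_64 ;;
        ppc64el) nv_arch=ppc64le ;;   # Debian "ppc64el" vs. upstream "ppc64le"
        *)
            echo "$deb_arch not supported" >&2
            return 1
            ;;
    esac
    echo "http://packages.grid5000.fr/other/nvidia/NVIDIA-Linux-${nv_arch}-${version}.run"
}
```

For example, `nvidia_driver_url ppc64el` prints the same URL that the `"ppc64el"` branch assigns to `$driver_source`.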
steps/data/setup/puppet/modules/env/manifests/big/configure_nvidia_gpu/services.pp
new file (0 → 100644)
+class env::big::configure_nvidia_gpu::services () {
+  # We only install the service but do not enable it.
+  # Services that depend on it can add "Wants=nvidia-smi.service"
+  # and "After=nvidia-smi.service", and this will automatically start
+  # this service.
+  file{
+    '/etc/systemd/system/nvidia-smi.service':
+      ensure => file,
+      owner => root,
+      group => root,
+      mode => '0644',
+      source => 'puppet:///modules/env/big/nvidia/nvidia-smi.service';
+  }
+}
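The comment in services.pp says that any service needing a GPU can pull in `nvidia-smi.service` by declaring `Wants=` and `After=` on it. The commit does this by editing the unit files directly. As an alternative illustration, the shell sketch below writes the same three directives into a systemd drop-in file for an arbitrary unit. The function name `write_gpu_dropin`, the drop-in file name `nvidia-smi.conf`, and the directory argument are assumptions for the example and are not part of the commit.

```shell
#!/bin/sh
# Hypothetical helper: add a drop-in so that <unit> starts after nvidia-smi.service
# and is skipped when /dev/nvidia0 does not exist.
write_gpu_dropin() {
    unit=$1                          # e.g. example.service
    dir=${2:-/etc/systemd/system}    # assumed target directory, overridable for testing
    mkdir -p "$dir/$unit.d"
    cat > "$dir/$unit.d/nvidia-smi.conf" <<'EOF'
[Unit]
Wants=nvidia-smi.service
After=nvidia-smi.service
ConditionPathExists=/dev/nvidia0
EOF
}
```

After writing a drop-in into the real `/etc/systemd/system`, `systemctl daemon-reload` would be needed for systemd to pick it up.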