C++ platform works in SimGrid but not in Batsim
Describe the bug
I was trying to create C++ platforms. I got one working with SimGrid, but loading it in Batsim failed. Further tests showed that I cannot get any C++ platform working in Batsim, whereas XML platforms work fine.
Provide information so the bug can be reproduced
Attached are the nix shells I'm using to compile and run, the platform in C++, .so and XML form, and other relevant files.
default.nix platform_compiling.nix tuto-env.nix
very_small_platform.cpp very_small_platform.so very_small_platform.xml
test_one_delay_job.json s4u-comm-wait.cpp
I compile the platform inside the platform_compiling.nix shell with:
$CXX -shared -o very_small_platform.so $(pkg-config --libs --cflags simgrid) very_small_platform.cpp
To run with SimGrid, I'm using simgrid-template-s4u-master. I compiled s4u-comm-wait.cpp with:
$CXX -o s4u-comm-wait $(pkg-config --libs --cflags simgrid) s4u-comm-wait.cpp
and ran:
./s4u-comm-wait ./very_small_platform.so
and
./s4u-comm-wait ./very_small_platform.xml
which gave very similar results (as expected, see logs).
Then I tried something similar in Batsim. In the nix shell tuto-env.nix, I ran:
batsim -p ./platforms/very_small_platform.xml -w ./workloads/test_one_delay_job.json
and
batsim -p ./platforms/very_small_platform.so -w ./workloads/test_one_delay_job.json
With the XML platform it ran as expected, but with the C++ platform it crashed with this error:
../src/xbt/config.cpp:255: [root/CRITICAL] Refusing to register the config element 'smpi/iprobe' twice.
As far as I can tell, I'm not using SMPI here.
Logs
- Running with SimGrid
[nix-shell:~/proj/simgrid-template-s4u-master]$ ./s4u-comm-wait ./very_small_platform.so
[master_host:sender:(1) 0.000000] [s4u_comm_wait/INFO] sleep_start_time : 5.000000 , sleep_test_time : 0.000000
[Jupiter:receiver:(2) 0.000000] [s4u_comm_wait/INFO] sleep_start_time : 1.000000 , sleep_test_time : 0.100000
[Jupiter:receiver:(2) 1.000000] [s4u_comm_wait/INFO] Wait for my first message
[master_host:sender:(1) 5.000000] [s4u_comm_wait/INFO] Send 'Message 0' to 'receiver'
[master_host:sender:(1) 17.041445] [s4u_comm_wait/INFO] Send 'Message 1' to 'receiver'
[Jupiter:receiver:(2) 17.100000] [s4u_comm_wait/INFO] I got a 'Message 0'.
[master_host:sender:(1) 29.141445] [s4u_comm_wait/INFO] Send 'Message 2' to 'receiver'
[Jupiter:receiver:(2) 29.200000] [s4u_comm_wait/INFO] I got a 'Message 1'.
[master_host:sender:(1) 41.241445] [s4u_comm_wait/INFO] Send 'finalize' to 'receiver'
[Jupiter:receiver:(2) 41.300000] [s4u_comm_wait/INFO] I got a 'Message 2'.
[Jupiter:receiver:(2) 41.400000] [s4u_comm_wait/INFO] I got a 'finalize'.
[nix-shell:~/proj/simgrid-template-s4u-master]$ ./s4u-comm-wait ./very_small_platform.xml
[master_host:sender:(1) 0.000000] [s4u_comm_wait/INFO] sleep_start_time : 5.000000 , sleep_test_time : 0.000000
[Jupiter:receiver:(2) 0.000000] [s4u_comm_wait/INFO] sleep_start_time : 1.000000 , sleep_test_time : 0.100000
[Jupiter:receiver:(2) 1.000000] [s4u_comm_wait/INFO] Wait for my first message
[master_host:sender:(1) 5.000000] [s4u_comm_wait/INFO] Send 'Message 0' to 'receiver'
[master_host:sender:(1) 17.643478] [s4u_comm_wait/INFO] Send 'Message 1' to 'receiver'
[Jupiter:receiver:(2) 17.700000] [s4u_comm_wait/INFO] I got a 'Message 0'.
[master_host:sender:(1) 30.343478] [s4u_comm_wait/INFO] Send 'Message 2' to 'receiver'
[Jupiter:receiver:(2) 30.400000] [s4u_comm_wait/INFO] I got a 'Message 1'.
[master_host:sender:(1) 43.043478] [s4u_comm_wait/INFO] Send 'finalize' to 'receiver'
[Jupiter:receiver:(2) 43.100000] [s4u_comm_wait/INFO] I got a 'Message 2'.
[Jupiter:receiver:(2) 43.200000] [s4u_comm_wait/INFO] I got a 'finalize'.
- Running with Batsim
[nix-shell:~/proj/Batsim]$ batsim -p ./platforms/very_small_platform.xml -w ./workloads/test_one_delay_job.json
[0.000000] [batsim/INFO] Workload 'w0' corresponds to workload file '/home/defryder/proj/Batsim/./workloads/test_one_delay_job.json'.
[0.000000] [batsim/INFO] Batsim version: 4.1.0
[0.000000] [workload/INFO] Loading JSON workload '/home/defryder/proj/Batsim/./workloads/test_one_delay_job.json'...
[0.000000] [workload/INFO] JSON workload parsed sucessfully. Read 1 jobs and 1 profiles.
[0.000000] [workload/INFO] Checking workload validity...
[0.000000] [workload/INFO] Workload seems to be valid.
[0.000000] [workload/INFO] Removing unreferenced profiles from memory...
[0.000000] [xbt_cfg/INFO] Configuration change: Set 'host/model' to 'ptask_L07'
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [machines/INFO] Creating the machines from platform file './platforms/very_small_platform.xml'...
[0.000000] [xbt_cfg/INFO] Switching to the L07 model to handle parallel tasks.
[0.000000] [machines/INFO] Looking for master host 'master_host'
[0.000000] [machines/INFO] The machines have been created successfully. There are 1 computing machines.
[0.000000] [batsim/INFO] Batsim's export prefix is 'out'.
[0.000000] [batsim/INFO] The process 'workload_submitter_w0' has been created.
[0.000000] [batsim/INFO] The process 'server' has been created.
[master_host:Scheduler REQ-REP:(3) 0.000000] [network/INFO] Sending '{"now":0.000000,"events":[{"timestamp":0.000000,"type":"SIMULATION_BEGINS","data":{"nb_resources":1,"nb_compute_resources":1,"nb_storage_resources":0,"allow_compute_sharing":false,"allow_storage_sharing":true,"config":{"redis-enabled":false,"redis-hostname":"127.0.0.1","redis-port":6379,"redis-prefix":"default","profiles-forwarded-on-submission":false,"dynamic-jobs-enabled":false,"dynamic-jobs-acknowledged":false,"profile-reuse-enabled":false,"sched-config":"","forward-unknown-events":false},"compute_resources":[{"id":0,"name":"Jupiter","state":"idle","properties":{"role":""},"zone_properties":{}}],"storage_resources":[],"workloads":{"w0":"/home/defryder/proj/Batsim/./workloads/test_one_delay_job.json"},"profiles":{"w0":{"delay10":{"type":"delay","delay":10}}}}}]}'
^C
[master_host:Scheduler REQ-REP:(3) 0.000000] [ker_engine/INFO] CTRL-C pressed. The current status will be displayed before exit (disable that behavior with option 'debug/verbose-exit').
[master_host:Scheduler REQ-REP:(3) 0.000000] [ker_engine/INFO] 3 actors are still running, waiting for something.
[master_host:Scheduler REQ-REP:(3) 0.000000] [ker_engine/INFO] Legend of the following listing: "Actor <pid> (<name>@<host>): <status>"
[master_host:Scheduler REQ-REP:(3) 0.000000] [ker_engine/INFO] Actor 1 (workload_submitter_w0@master_host) simcall actor::CommIsendSimcall
[master_host:Scheduler REQ-REP:(3) 0.000000] [ker_engine/INFO] Actor 2 (server@master_host) simcall NONE
[master_host:Scheduler REQ-REP:(3) 0.000000] [ker_engine/INFO] Actor 3 (Scheduler REQ-REP@master_host) simcall NONE
Segmentation fault.
Segmentation fault (core dumped)
[nix-shell:~/proj/Batsim]$ batsim -p ./platforms/very_small_platform.so -w ./workloads/test_one_delay_job.json
[0.000000] [batsim/INFO] Workload 'w0' corresponds to workload file '/home/defryder/proj/Batsim/./workloads/test_one_delay_job.json'.
[0.000000] [batsim/INFO] Batsim version: 4.1.0
[0.000000] [workload/INFO] Loading JSON workload '/home/defryder/proj/Batsim/./workloads/test_one_delay_job.json'...
[0.000000] [workload/INFO] JSON workload parsed sucessfully. Read 1 jobs and 1 profiles.
[0.000000] [workload/INFO] Checking workload validity...
[0.000000] [workload/INFO] Workload seems to be valid.
[0.000000] [workload/INFO] Removing unreferenced profiles from memory...
[0.000000] [xbt_cfg/INFO] Configuration change: Set 'host/model' to 'ptask_L07'
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [machines/INFO] Creating the machines from platform file './platforms/very_small_platform.so'...
[0.000000] ../src/xbt/config.cpp:255: [root/CRITICAL] Refusing to register the config element 'smpi/iprobe' twice.
Backtrace (displayed in actor maestro):
(backtrace not set -- did you install Boost.Stacktrace?)
Aborted (core dumped)
Possible fixes
(Share any insight you have about the bug.)
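One thing that may be worth checking, as an assumption rather than a confirmed diagnosis: a "registered twice" error on a config element could appear if two copies of the SimGrid library end up loaded in the same process, for example if very_small_platform.so (built in platform_compiling.nix) is linked against a different SimGrid than the one batsim uses (from tuto-env.nix). The sketch below only prints which libsimgrid each file resolves to, so the two can be compared. The helper name simgrid_libs_of is hypothetical, and it assumes ldd is available and that the batsim found on PATH is the actual ELF binary rather than a wrapper script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the libsimgrid entries that a binary or shared
# object resolves to at load time. Prints nothing if the file has no such
# dependency (or is not a dynamic object).
simgrid_libs_of() {
    # ldd lists dynamic dependencies; keep only the libsimgrid lines.
    # "|| true" keeps the exit status at 0 when grep finds no match.
    ldd "$1" 2>/dev/null | grep -i 'libsimgrid' || true
}

# Possible usage with the paths from this report (run in the tuto-env.nix shell):
#   simgrid_libs_of "$(command -v batsim)"
#   simgrid_libs_of ./platforms/very_small_platform.so
```

If the two calls print different libsimgrid paths or versions, rebuilding the .so in the same shell that provides batsim would be a reasonable next test. If they print the same path, this hypothesis can probably be set aside.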