Segfault with nm_bench_coll_ibcast
mpirun -nodelist henri0,henri1 -n 4 ~/pm2/soft/x86_64/bin/nm_bench_coll_ibcast
Thread 1 "nm_bench_coll_i" received signal SIGSEGV, Segmentation fault.
0x00007ffff7f04ac7 in nm_coll_tree_status_signal (p_status=0x0)
at /home/pswartva/pm2/git/scripts/../nmad/interfaces/coll/include/nm_coll_trees.h:870
870 (*p_notify)(p_ref);
(gdb) bt
#0 0x00007ffff7f04ac7 in nm_coll_tree_status_signal (p_status=0x0) at /home/pswartva/pm2/git/scripts/../nmad/interfaces/coll/include/nm_coll_trees.h:870
#1 0x00007ffff7f04f35 in nm_coll_ibcast_step (p_bcast=0x7fffcc003b60) at /home/pswartva/pm2/git/scripts/../nmad/interfaces/coll/src/nm_coll_bcast.c:85
#2 0x00007ffff7f04f82 in nm_coll_ibcast_req_notifier (event=NM_SR_EVENT_FINALIZED, event_info=0x7fffffffcee0, _ref=0x7fffcc003b60) at /home/pswartva/pm2/git/scripts/../nmad/interfaces/coll/src/nm_coll_bcast.c:96
#3 0x00007ffff7eec288 in nm_sr_event_req_handler (p_event=0x55555574f0f0, _ref=0x0) at /home/pswartva/pm2/git/scripts/../nmad/interfaces/sendrecv/src/nm_sendrecv_interface.c:379
#4 0x00007ffff7e560c1 in nm_core_events_dispatch (p_core=0x5555555aced0) at /home/pswartva/pm2/git/scripts/../nmad/src/nm_core.c:384
#5 0x00007ffff7e765f0 in nm_ltask_core_progress (_p_core=0x5555555aced0) at /home/pswartva/pm2/git/scripts/../nmad/src/nm_piom_ltasks.c:193
#6 0x00007ffff7bed31b in piom_ltask_queue_schedule (queue=0x5555555a69a0, full=0) at /home/pswartva/pm2/git/scripts/../pioman/src/piom_ltask.c:112
#7 0x00007ffff7beddfb in piom_ltask_schedule (point=32) at /home/pswartva/pm2/git/scripts/../pioman/src/piom_ltask.c:408
#8 0x00007ffff7bf44ea in piom_cond_wait (cond=0x7fffcc003d48, mask=1) at /home/pswartva/pm2/git/scripts/../pioman/src/piom_sem.c:158
#9 0x00007ffff7efe171 in nm_cond_wait (p_cond=0x7fffcc003d48, bitmask=1, p_core=0x5555555aced0) at /home/pswartva/pm2/git/scripts/../nmad/include/nm_core_interface.h:523
#10 0x00007ffff7f04a46 in nm_coll_tree_status_wait (p_status=0x7fffcc003b60) at /home/pswartva/pm2/git/scripts/../nmad/interfaces/coll/include/nm_coll_trees.h:853
#11 0x00007ffff7f04fc3 in nm_coll_ibcast_wait (p_bcast=0x7fffcc003b60) at /home/pswartva/pm2/git/scripts/../nmad/interfaces/coll/src/nm_coll_bcast.c:107
#12 0x00005555555577b3 in nm_bench_coll_ibcast_run (p_common=0x55555555f280 <common>, buf=0x555555709b90, len=1) at /home/pswartva/pm2/git/scripts/../nmad/examples/bench-coll/nm_bench_coll_ibcast.c:31
#13 0x000055555555aa55 in main (argc=1, argv=0x7fffffffd3e8) at /home/pswartva/pm2/git/scripts/../nmad/examples/bench-coll/nm_bench_coll_generic.c:161
Valgrind is not happy either:
==472386== Jump to the invalid address stated on the next line
==472386== at 0x5028170: ???
==472386== by 0x4911F34: nm_coll_ibcast_step (nm_coll_bcast.c:85)
==472386== by 0x4911F81: nm_coll_ibcast_req_notifier (nm_coll_bcast.c:96)
==472386== by 0x48F9287: nm_sr_event_req_handler (nm_sendrecv_interface.c:379)
==472386== by 0x48630C0: nm_core_events_dispatch (nm_core.c:384)
==472386== by 0x48835EF: nm_ltask_core_progress (nm_piom_ltasks.c:193)
==472386== by 0x4C1331A: piom_ltask_queue_schedule (piom_ltask.c:112)
==472386== by 0x4C13DE7: piom_ltask_schedule (piom_ltask.c:404)
==472386== by 0x4C1A4E9: piom_cond_wait (piom_sem.c:158)
==472386== by 0x490B170: nm_cond_wait (nm_core_interface.h:523)
==472386== by 0x4911A45: nm_coll_tree_status_wait (nm_coll_trees.h:853)
==472386== by 0x4911FC2: nm_coll_ibcast_wait (nm_coll_bcast.c:107)
==472386== Address 0x5028170 is 272 bytes inside an unallocated block of size 528 in arena "client"
The bug was introduced by commit 29e08fc8.
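
For what it's worth, Valgrind reports that the jump target itself (0x5028170) lies inside an unallocated heap block, which suggests that the function pointer called at nm_coll_trees.h:870 held a stale or uninitialized value when the status was signaled. The actual cause in commit 29e08fc8 is not established here. The sketch below only illustrates one defensive pattern for this kind of callback-on-completion status. All names in it (demo_status, demo_status_init, demo_status_set_notifier, demo_status_signal, demo_count) are hypothetical and are not the nmad API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a collective status object; NOT the nmad struct. */
typedef void (*demo_notify_t)(void *ref);

struct demo_status
{
  int done;               /* set to 1 once the collective has completed */
  demo_notify_t p_notify; /* optional completion callback; NULL when the caller waits instead */
  void *p_ref;            /* opaque argument handed to the callback */
};

/* Initialize every field explicitly, so that a caller who only waits
 * never leaves p_notify with an indeterminate value. */
void demo_status_init(struct demo_status *p_status)
{
  p_status->done = 0;
  p_status->p_notify = NULL;
  p_status->p_ref = NULL;
}

/* Register a completion callback (assumed to be optional). */
void demo_status_set_notifier(struct demo_status *p_status, demo_notify_t p_notify, void *p_ref)
{
  p_status->p_notify = p_notify;
  p_status->p_ref = p_ref;
}

/* Signal completion: copy the callback and its argument into locals,
 * mark the status as done, then call the callback only if one was set. */
void demo_status_signal(struct demo_status *p_status)
{
  assert(p_status != NULL);
  demo_notify_t p_notify = p_status->p_notify;
  void *p_ref = p_status->p_ref;
  p_status->done = 1;
  if(p_notify != NULL)
    {
      (*p_notify)(p_ref);
    }
}

/* Example callback: increments the int counter passed as ref. */
void demo_count(void *ref)
{
  (*(int *)ref)++;
}
```

With this pattern, signaling a status that has no notifier only sets done, and signaling one that has a notifier calls it exactly once. Whether the real nm_coll_tree_status_signal needs such a check, or whether the status is freed or overwritten too early, has to be verified against the diff of 29e08fc8.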