relax sanity check on pending scheduling op while attempting data transfer progression

AUMAGE Olivier requested to merge issue-github-6 into master

Address GitHub issue 6:

  • Assertion failed: (0 && "!worker->state_sched_op_pending"), function __starpu_datawizard_progress, file datawizard.c, line 130.
    In the backtrace below, state_sched_op_pending is set at frame #18, in _starpu_push_task_to_workers(), around the call to the scheduling policy function.
* thread #12, name = 'CPU 0', stop reason = hit program assert
    frame #0: 0x000000019742e868 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000197465cec libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000019739e2c8 libsystem_c.dylib`abort + 180
    frame #3: 0x000000019739d620 libsystem_c.dylib`__assert_rtn + 272
  * frame #4: 0x0000000102b79338 libstarpu-1.4.1.dylib`__starpu_datawizard_progress(may_alloc=_STARPU_DATAWIZARD_DO_NOT_ALLOC, push_requests=1) at datawizard.c:130:2
    frame #5: 0x0000000102b794a8 libstarpu-1.4.1.dylib`_starpu_datawizard_progress(may_alloc=_STARPU_DATAWIZARD_DO_NOT_ALLOC) at datawizard.c:159:2
    frame #6: 0x0000000102b90b70 libstarpu-1.4.1.dylib`_starpu_allocate_interface(handle=0x000000012d814200, replicate=0x000000012d814358, dst_node=0, is_prefetch=STARPU_TASK_PREFETCH, only_fast_alloc=0) at memalloc.c:1604:3
    frame #7: 0x0000000102b910d4 libstarpu-1.4.1.dylib`_starpu_allocate_memory_on_node(handle=0x000000012d814200, replicate=0x000000012d814358, is_prefetch=STARPU_TASK_PREFETCH, only_fast_alloc=0) at memalloc.c:1665:21
    frame #8: 0x0000000102b70044 libstarpu-1.4.1.dylib`_starpu_create_request_to_fetch_data(handle=0x000000012d814200, dst_replicate=0x000000012d814358, mode=STARPU_W, task=0x00000001687af9d0, is_prefetch=STARPU_TASK_PREFETCH, async=1, callback_func=0x0000000000000000, callback_arg=0x0000000000000000, prio=0, origin="task_prefetch_data_on_node") at coherency.c:660:8
    frame #9: 0x0000000102b70bc0 libstarpu-1.4.1.dylib`_starpu_fetch_data_on_node(handle=0x000000012d814200, node=0, dst_replicate=0x000000012d814358, mode=STARPU_W, detached=1, task=0x00000001687af9d0, is_prefetch=STARPU_TASK_PREFETCH, async=1, callback_func=0x0000000000000000, callback_arg=0x0000000000000000, prio=0, origin="task_prefetch_data_on_node") at coherency.c:874:6
    frame #10: 0x0000000102b70d7c libstarpu-1.4.1.dylib`task_prefetch_data_on_node(handle=0x000000012d814200, node=0, replicate=0x000000012d814358, mode=STARPU_W, task=0x00000001687af9d0, prio=0) at coherency.c:897:9
    frame #11: 0x0000000102b7173c libstarpu-1.4.1.dylib`_starpu_prefetch_task_input_prio(task=0x00000001687af9d0, target_node=-1, worker=0, prio=0, prefetch=STARPU_PREFETCH) at coherency.c:1020:4
    frame #12: 0x0000000102b7181c libstarpu-1.4.1.dylib`starpu_prefetch_task_input_prio(task=0x00000001687af9d0, target_node=-1, worker=0, prio=0) at coherency.c:1033:9
    frame #13: 0x0000000102b71b08 libstarpu-1.4.1.dylib`starpu_prefetch_task_input_for_prio(task=0x00000001687af9d0, worker=0, prio=0) at coherency.c:1070:9
    frame #14: 0x0000000102b71b9c libstarpu-1.4.1.dylib`starpu_prefetch_task_input_for(task=0x00000001687af9d0, worker=0) at coherency.c:1078:9
    frame #15: 0x0000000102b48548 libstarpu-1.4.1.dylib`push_task_on_best_worker(task=0x00000001687af9d0, best_workerid=0, predicted=2.0157669999999999, predicted_transfer=0, prio=0, sched_ctx_id=0) at deque_modeling_policy_data_aware.c:373:3
    frame #16: 0x0000000102b49a0c libstarpu-1.4.1.dylib`_dmda_push_task(task=0x00000001687af9d0, prio=0, sched_ctx_id=0, da=1, simulate=0, sorted_decision=0) at deque_modeling_policy_data_aware.c:755:10
    frame #17: 0x0000000102b49cc8 libstarpu-1.4.1.dylib`dmda_push_task(task=0x00000001687af9d0) at deque_modeling_policy_data_aware.c:789:9
    frame #18: 0x0000000102b2d2b8 libstarpu-1.4.1.dylib`_starpu_push_task_to_workers(task=0x00000001687af9d0) at sched_policy.c:778:11
    frame #19: 0x0000000102b2ca58 libstarpu-1.4.1.dylib`_starpu_repush_task(j=0x0000000169c63c00) at sched_policy.c:650:8
    frame #20: 0x0000000102b2c010 libstarpu-1.4.1.dylib`_starpu_push_task(j=0x0000000169c63c00) at sched_policy.c:544:9
    frame #21: 0x0000000102acbb10 libstarpu-1.4.1.dylib`_starpu_enforce_deps_starting_from_task(j=0x0000000169c63c00) at jobs.c:991:8
    frame #22: 0x0000000102af7bd0 libstarpu-1.4.1.dylib`_starpu_notify_cg(pred=0x0000000169c61e00, cg=0x000060001739a800) at cg.c:277:6
    frame #23: 0x0000000102af8084 libstarpu-1.4.1.dylib`_starpu_notify_cg_list(pred=0x0000000169c61e00, successors=0x0000000169c62020) at cg.c:377:3
    frame #24: 0x0000000102b021d0 libstarpu-1.4.1.dylib`_starpu_notify_task_dependencies(j=0x0000000169c61e00) at task_deps.c:66:2
    frame #25: 0x0000000102af8510 libstarpu-1.4.1.dylib`_starpu_notify_dependencies(j=0x0000000169c61e00) at dependencies.c:32:2
    frame #26: 0x0000000102acaa58 libstarpu-1.4.1.dylib`_starpu_handle_job_termination(j=0x0000000169c61e00) at jobs.c:542:3
    frame #27: 0x0000000102c1ff7c libstarpu-1.4.1.dylib`_starpu_cpu_driver_execute_task(cpu_worker=0x0000000102cdc748, task=0x00000001687aeee0, j=0x0000000169c61e00) at driver_cpu.c:576:3
    frame #28: 0x0000000102c20138 libstarpu-1.4.1.dylib`_starpu_cpu_driver_run_once(cpu_worker=0x0000000102cdc748) at driver_cpu.c:614:9
    frame #29: 0x0000000102c20890 libstarpu-1.4.1.dylib`_starpu_cpu_worker(arg=0x0000000102cdc748) at driver_cpu.c:732:3
    frame #30: 0x000000019746606c libsystem_pthread.dylib`_pthread_start + 148
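
The re-entrancy visible in the backtrace can be reduced to a small toy model. This is only a sketch and not StarPU code: `struct toy_worker`, `toy_progress()` and `toy_push_task()` are hypothetical names, and only the flag name `state_sched_op_pending` is taken from the assertion message. The relaxed variant shown here simply skips progression while a scheduling operation is pending; that is one possible way to relax the check, assumed for illustration, and the actual change in this merge request may differ.

```c
#include <assert.h>

/* Hypothetical stand-in for a worker; only the flag name comes from the
 * assertion message above. */
struct toy_worker {
	int state_sched_op_pending; /* nonzero while a scheduling operation is in flight */
	int progress_calls;         /* how many times progression actually ran */
	int progress_skipped;       /* how many times progression was skipped */
};

/* Toy data-transfer progression.
 * strict != 0: assert that no scheduling operation is pending (the failing check).
 * strict == 0: relaxed check, return 0 without progressing when one is pending. */
static int toy_progress(struct toy_worker *w, int strict)
{
	if (w->state_sched_op_pending) {
		if (strict)
			assert(0 && "!worker->state_sched_op_pending");
		w->progress_skipped++;
		return 0;
	}
	w->progress_calls++;
	return 1;
}

/* Toy push: the flag is set around the call into the scheduling policy.
 * The direct call to toy_progress() stands for the chain
 * policy -> prefetch -> allocation -> progression seen in frames #17 to #4. */
static int toy_push_task(struct toy_worker *w, int strict)
{
	w->state_sched_op_pending = 1;
	int progressed = toy_progress(w, strict);
	w->state_sched_op_pending = 0;
	return progressed;
}
```

Calling `toy_push_task(&w, 1)` on a zero-initialised worker aborts on the assertion, as in the report, whereas `toy_push_task(&w, 0)` returns 0 and leaves the flag cleared so that a later call to `toy_progress()` can make progress.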
Edited by THIBAULT Samuel