Feature Request: Allowing usage of hyperthreading

Dear StarPU developers,

Thank you so much for this great library! A group of friends and I use it to distribute code for a coding competition, which requires solving hard problems quickly. Because of that, our kernels are not the usual near-optimal ones, and we think hyperthreading could help in this case. I know StarPU assumes that each worker has at least one entire physical core to itself, but would there be undesired consequences if the user could bind workers to logical cores (hwloc PUs) instead, when explicitly requested?

I tried playing around with hwloc and starpu_conf::use_explicit_workers_bindid, but I can't get the binding I'm looking for. The default topology I get is:

numa 0    pack 0    core 0    PU 0    CPU 0   
                              PU 1    
                    core 1    PU 2    CPU 1    
                              PU 3    
                    core 2    PU 4    CPU 2    
                              PU 5    
                    core 3    PU 6    CPU 3    
                              PU 7

If I set ncpus=4 and workers_bindid = {0, 1, 2, 3}, I get this:

numa 0    pack 0    core 0    PU 0    CPU 0   
                              PU 1    CPU 1    
                    core 1    PU 2    CPU 2
                              PU 3    CPU 3
                    core 2    PU 4   
                              PU 5    
                    core 3    PU 6    
                              PU 7

And when I set ncpus=8 and workers_bindid = {0, 1, 2, 3, 4, 5, 6, 7}, I get this:

numa 0    pack 0    core 0    PU 0    CPU 0    CPU 4   
                              PU 1    CPU 1    CPU 5
                    core 1    PU 2    CPU 2    CPU 6
                              PU 3    CPU 3    CPU 7
                    core 2    PU 4   
                              PU 5    
                    core 3    PU 6    
                              PU 7

I found the last one especially strange: StarPU "refused" to bind to more than 4 PUs, yet it still bound several workers to the same physical core. I understand that this setup breaks some of StarPU's assumptions, so I did not expect it to work out of the box. Still, it gives me the feeling that StarPU is able to bind workers to logical PUs, and that it is only the algorithm looking for available PUs that "refuses" to do so. Again, would allowing this behaviour require changes outside of the initialization code?
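To make the pattern in the last two listings concrete, here is a small C model of what I think I am observing. It is only a guess that happens to reproduce both outputs, not StarPU's actual code: the helper names observed_pu, desired_pu and print_bindings are mine, and the "wrap the bind id modulo the number of physical cores" rule is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model, NOT StarPU code: the PU a worker appears to land on
 * if the requested bind id is wrapped modulo the number of physical cores. */
static unsigned observed_pu(unsigned bindid, unsigned ncores)
{
    assert(ncores > 0);
    return bindid % ncores;
}

/* The binding I am asking for: the requested PU is used as-is,
 * provided it exists on the machine. */
static unsigned desired_pu(unsigned bindid, unsigned npus)
{
    assert(npus > 0 && bindid < npus);
    return bindid;
}

/* Print, for each worker, the requested PU, the PU the model predicts,
 * and the PU I would like to get. */
static void print_bindings(const unsigned *bindids, unsigned nworkers,
                           unsigned ncores, unsigned npus)
{
    for (unsigned w = 0; w < nworkers; w++)
        printf("CPU %u: requested PU %u, model PU %u, desired PU %u\n",
               w, bindids[w],
               observed_pu(bindids[w], ncores),
               desired_pu(bindids[w], npus));
}
```

On my machine (4 cores, 8 PUs), calling print_bindings with bind ids 0 to 7 has the model put workers 4 to 7 back onto PUs 0 to 3, as in the ncpus=8 listing, whereas the binding I would like leaves them on PUs 4 to 7.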

The values set in workers_bindid came directly from hwloc. Just to be clear, the topology I would like to obtain in my case is:

numa 0    pack 0    core 0    PU 0    CPU 0
                              PU 1    CPU 1
                    core 1    PU 2    CPU 2
                              PU 3    CPU 3
                    core 2    PU 4    CPU 4
                              PU 5    CPU 5
                    core 3    PU 6    CPU 6
                              PU 7    CPU 7
