Anton Schreiner



Another neat trick, ubiquitous once you're doing anything moderately complicated, is using a prefix sum plus bisection to distribute work among threads.

It's common to have source buckets that each generate a varying number of work items: meshes, for instance, with different numbers of vertices. You first write num_vertices per mesh into an array, do a scan (prefix sum) over it, and then spawn threads for the total sum. Toy example:
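A minimal CPU-side sketch of that flow in Python, not the original example from the thread. The vertex counts and the helper names `exclusive_scan` and `find_bucket` are made up for illustration, and a plain loop over `thread_idx` stands in for the threads you would actually spawn on a GPU.

```python
from itertools import accumulate

def exclusive_scan(counts):
    """Exclusive prefix sum: offsets[i] is the index of bucket i's first work item."""
    inclusive = list(accumulate(counts))
    return [0] + inclusive[:-1]

def find_bucket(offsets, thread_idx):
    """Bisect for the last bucket whose starting offset is <= thread_idx."""
    lo, hi = 0, len(offsets)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if offsets[mid] <= thread_idx:
            lo = mid      # bucket `mid` starts at or before this thread
        else:
            hi = mid      # bucket `mid` starts after this thread
    return lo

# Hypothetical per-mesh vertex counts (assumed values, for illustration only).
num_vertices = [3, 1, 4, 2]
offsets = exclusive_scan(num_vertices)   # start offset of each mesh's work items
total = sum(num_vertices)                # number of threads to spawn

# Each "thread" independently bisects the scan to find its source bucket.
bucket_of_thread = [find_bucket(offsets, tid) for tid in range(total)]
```

Because the search keeps the last bucket whose offset is at or below `thread_idx`, buckets with zero work items are skipped naturally.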

After you find the bucket_idx for your thread_idx, you can work out which work item within that bucket you need to process. Simple and efficient. The alternative is an array of size total_sum (which could be millions of items) with entry[tid] = bid; it is better to pay for the log2(N) bisect, where N is the number of buckets.
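To make the trade-off concrete, here is a small Python sketch that recovers both the bucket and the item within it, then checks the result against the dense entry[tid] = bid table. The counts and the names `locate` and `dense` are assumptions for illustration; the standard-library `bisect_right` plays the role of the hand-rolled bisection a shader or kernel would use.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical per-mesh vertex counts (assumed values, for illustration only).
num_vertices = [3, 1, 4, 2]
offsets = [0] + list(accumulate(num_vertices))[:-1]   # exclusive scan
total = sum(num_vertices)

def locate(thread_idx):
    """Map a flat thread index to (bucket_idx, item index inside that bucket)."""
    bucket_idx = bisect_right(offsets, thread_idx) - 1
    item_idx = thread_idx - offsets[bucket_idx]
    return bucket_idx, item_idx

work = [locate(tid) for tid in range(total)]

# The dense alternative: one entry per work item, so its size grows with
# total, while the scan array only has one entry per bucket.
dense = [bid for bid, n in enumerate(num_vertices) for _ in range(n)]
```

Here `offsets` has 4 entries and `dense` has 10; with millions of work items spread over a few thousand buckets, the gap in memory is what makes the extra bisection steps worth it.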
