#multithreading

Multithreaded CLI developers: let your users configure the number of threads.

Entire classes of use cases are hiding inside that option, and they will make your life easier as a dev. threads=1 is usually not hard to add.

One example: if your multithreaded tool runs significantly faster on a single file when I force it to use a single thread and parallelize it with GNU parallel (parallel --pipepart --block) instead, then either:

  1. you might decide to develop sharding the I/O of the physical file yourself, or

  2. you might consciously decide to not develop it, and leave that complexity to parallel (which is fine!)

But if your tool has no threads=N option, I have no workaround.

Configurable thread count lets me optimize in the meantime (or instead).
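To make the "threads=1 is usually not hard to add" point concrete, here is a minimal sketch in Python. The line-counting tool, the `--threads` flag name, and the helpers `build_parser`, `count_lines`, and `run` are all hypothetical, chosen only for illustration; a real tool would substitute its own work function.

```python
import argparse
import os
from concurrent.futures import ThreadPoolExecutor


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical line-counting tool."""
    parser = argparse.ArgumentParser(description="Count lines in files (illustrative).")
    parser.add_argument("files", nargs="*", help="input files")
    parser.add_argument(
        "--threads",
        type=int,
        default=os.cpu_count() or 1,
        help="number of worker threads (default: all cores); 1 disables the pool",
    )
    return parser


def count_lines(path: str) -> int:
    """Count newline-terminated lines in one file."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def run(files, threads):
    """Process files with the requested number of threads."""
    if threads < 1:
        raise ValueError("--threads must be >= 1")
    if threads == 1:
        # Single-threaded path: a plain loop, so an outer tool can do the sharding.
        return [count_lines(path) for path in files]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so results line up with `files`.
        return list(pool.map(count_lines, files))


def main(argv=None):
    """Entry point: parse arguments and print one count per file."""
    args = build_parser().parse_args(argv)
    for path, count in zip(args.files, run(args.files, args.threads)):
        print(f"{count}\t{path}")
```

Calling `main(["--threads", "1", "big.log"])` (with `big.log` standing in for any input file) takes the plain-loop path, which is the mode an external parallelizer needs.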

Leslie Lamport, of LaTeX fame, is a very accomplished mathematician and computer scientist with a Turing Award for “fundamental contributions to the theory and practice of distributed and concurrent systems”. He just published a draft of his new book:

"A Science of Concurrent Programs"

lamport.azurewebsites.net/tla/

True to his pedagogic approach to everything he does, "The book assumes only that you know the math one learns before entering a university." Even the appendices are fantastic. I can only wish I'll remain this lucid at 82, his age now.

At what point does setting more threads for OpenBLAS actually help?

For example, I have an SVD operation in #RStats on largish matrices (6000 rows and 6000 columns; doing an inverse), which takes ~ 20 min with the default BLAS on Ubuntu.

OpenBLAS with 1 or 4 threads takes ~ 2 min (10X speedup!). With 4 threads, I can see the additional cores being used, but the overall time is the same as with 1 thread.

Is there some magic size where using more threads for SVD will actually help?
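One way to look for that crossover empirically is to time the same job at several thread counts and matrix sizes. The sketch below is a hypothetical harness, not an answer: it relies on the `OPENBLAS_NUM_THREADS` environment variable, which OpenBLAS reads at load time, and it assumes the child process really is linked against OpenBLAS. The function name `time_with_threads` is made up for illustration.

```python
import os
import subprocess
import sys
import time


def time_with_threads(command, thread_counts, repeats=1):
    """Run `command` once per thread count; return {threads: best wall-clock seconds}."""
    timings = {}
    for threads in thread_counts:
        env = dict(os.environ)
        # Set per child process, because OpenBLAS reads this variable when it loads.
        env["OPENBLAS_NUM_THREADS"] = str(threads)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            subprocess.run(command, env=env, check=True)
            best = min(best, time.perf_counter() - start)
        timings[threads] = best
    return timings
```

For example, `time_with_threads(["Rscript", "svd_bench.R", "6000"], [1, 2, 4, 8])` would time a hypothetical `svd_bench.R` script that takes the matrix size as an argument. The numbers include process startup, so they are only meaningful when the computation dominates the run time.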