Using guild for data parallelization

Yes, this is what I’ve been casually referring to as “summary ops” for quite a while now (with little progress, alas!).

I wrote up my thinking here: Summary operations.

Please feel free to comment there — your input on the feature is most welcome!