Perform an (MPI) AllReduce over the contents of a matrix stored on different processes, overwriting every process’s input with the sum over all processes’ copies.
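The semantics described above can be illustrated with a minimal single-process sketch. It does not use MPI or this library's API: the function name `all_reduce_sum` and the list-of-lists matrix representation are hypothetical, chosen only to show what each process holds before and after the operation. In a real MPI program the same effect is typically obtained with `MPI_Allreduce` using `MPI_IN_PLACE` and `MPI_SUM`.

```python
def all_reduce_sum(local_matrices):
    """Simulate a sum AllReduce (hypothetical helper, not a library routine).

    local_matrices holds one matrix per simulated process, each a list of
    rows with identical dimensions. Every matrix is overwritten in place
    with the entrywise sum over all of them.
    """
    rows = len(local_matrices[0])
    cols = len(local_matrices[0][0])

    # Reduce step: entrywise sum over every process's copy.
    total = [
        [sum(m[i][j] for m in local_matrices) for j in range(cols)]
        for i in range(rows)
    ]

    # Broadcast step: overwrite each process's input with the result.
    for m in local_matrices:
        for i in range(rows):
            m[i][:] = total[i]
    return local_matrices


# Three simulated processes, each holding its own 2 x 2 matrix.
copies = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
]
all_reduce_sum(copies)
# Every entry of `copies` is now [[111, 222], [333, 444]].
```

Note that the input is destroyed: after the call no process retains its original local values, so a copy must be made beforehand if they are still needed.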
AllReduce
TODO