Our algorithms involve a total of $O(n^2)$ atom operations, which are performed in $O(n)$ parallel steps. This improves on previous reconfiguration algorithms, which either use $O(n^2)$ parallel steps [Rus and Vona, 2001, Vassilvitskii et al., 2002, Butler and Rus, 2003] or do not respect the constraints mentioned above [Aloupis et al., 2009b]. In fact, in the setting considered, our algorithms are optimal. A further advantage of our algorithms is that reconfiguration takes place within the union of the source and target configuration space and requires only local communication.