You will lead efforts to build distributed training support into PyTorch and JAX using XLA, the Neuron compiler, and runtime stacks. You will optimize...
Amazon Web Services is a dynamic and rapidly growing business. We are building some of the largest and most complex distributed systems in the world...
This position requires up to 25% travel for training. After training, travel will be minimal.
This position requires access to customer site locations...