Data-Driven Simulations And Policy Gradients For Limit Order Books