Choose the kernel launch configuration so as to maximize occupancy.
Using cudaOccupancyMaxPotentialBlockSize, we calculate the launch configuration (grid size and block size) with which the kernel is launched, with a view to maximizing occupancy. cudaOccupancyMaxPotentialBlockSize
returns a suggested block size, which we then divide up into a 2D configuration.
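
A minimal sketch of the approach described above, not the code from this change. The kernel scale2D, the helper splitBlockSize, and the wrapper launchScale2D are hypothetical names introduced only for illustration. The split heuristic (a block width of at most 32 threads, with the remainder as the block height) is an assumption; the original does not say how the block size is divided.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel (not from the original change): scales a 2D array in place.
__global__ void scale2D(float* data, int width, int height, float factor)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        data[y * width + x] *= factor;
    }
}

// Divide a 1D block size into a 2D block whose thread count does not
// exceed blockSize. Assumed heuristic: width is min(32, blockSize),
// height is blockSize / width (integer division).
inline dim3 splitBlockSize(int blockSize)
{
    if (blockSize < 1) {
        blockSize = 1;  // guard against a degenerate suggestion
    }
    int x = blockSize < 32 ? blockSize : 32;
    int y = blockSize / x;
    return dim3(x, y, 1);
}

// Query the occupancy-based block size, split it into 2D, and launch.
// width and height are assumed to be positive.
cudaError_t launchScale2D(float* d_data, int width, int height, float factor)
{
    int minGridSize = 0;  // minimum grid size for maximum potential occupancy (unused here)
    int blockSize = 0;    // suggested number of threads per block

    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, scale2D,
        0,   // dynamic shared memory per block, in bytes
        0);  // 0 means no upper limit on the block size
    if (err != cudaSuccess) {
        return err;
    }

    dim3 block = splitBlockSize(blockSize);
    // Round up so the grid covers the whole width x height domain.
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);

    scale2D<<<grid, block>>>(d_data, width, height, factor);
    return cudaGetLastError();
}
```

In this sketch the grid is sized from the problem dimensions rather than from minGridSize, since a 2D kernel with bounds checks needs one thread per element. Calling launchScale2D(d_data, width, height, 2.0f) with a device pointer would double every element.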