The goal of this exercise is to give you a chance to try out the command-line profiler, nvprof.
Compile the program.
Run the program through nvprof.
Figure out the data transfer times and the run time of each kernel. Do they all take the same time?
For more of a challenge, try to determine the number of floating-point operations performed.
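
The steps above can be sketched as the following shell session. This is a minimal sketch, not the exercise's official build procedure: the file names program.cu and ./program are hypothetical placeholders, so substitute the actual source file. It assumes nvcc and nvprof are on your PATH, and it skips the profiling steps if they are not.

```shell
#!/bin/sh
# Hypothetical names: replace with the exercise's real source file.
SRC=program.cu
EXE=./program
STATUS=skipped

if command -v nvcc >/dev/null 2>&1 \
   && command -v nvprof >/dev/null 2>&1 \
   && [ -f "$SRC" ]; then
    # Step 1: compile the CUDA source into an executable.
    nvcc -o "$EXE" "$SRC"

    # Step 2: summary mode. Lists each kernel and each memcpy
    # direction ([CUDA memcpy HtoD], [CUDA memcpy DtoH]) with
    # total, average, minimum, and maximum time.
    nvprof "$EXE"

    # Step 3: one line per kernel launch and per transfer, which
    # makes it easier to compare individual kernel run times.
    nvprof --print-gpu-trace "$EXE"

    # Challenge: count single- and double-precision floating-point
    # operations per kernel (a multiply-accumulate counts as 2).
    nvprof --metrics flop_count_sp,flop_count_dp "$EXE"

    STATUS=profiled
else
    echo "nvcc, nvprof, or $SRC not found; skipping profiling steps"
fi
```

The names of the metrics available on your GPU can be listed with nvprof --query-metrics. Collecting metrics replays kernels, so take timings from the summary or trace runs rather than from the metrics run.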