Developed a CUDA version of the finite-difference time-domain (FDTD) method and achieved a 40x speedup. Implemented on an NVIDIA Quadro FX 3800 GPU, which has 192 streaming processors (SPs), 1 GB of global memory, and a memory bandwidth of 51.2 GB/s.
Finite-difference approximations for the first derivative, valid halfway between equidistant gridpoints, are in general much more accurate than the corresponding approximations, which are valid at ...
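
The accuracy claim for halfway-point approximations can be checked numerically. The sketch below is only an illustration under stated assumptions: it covers just the simplest second-order stencils, uses sin(x) as a test function, and the function names, the point x0, and the spacing h are hypothetical choices, not taken from the original work. Both stencils use the same grid spacing h; one is valid at a gridpoint, the other halfway between two gridpoints.

```python
import math


def diff_at_gridpoint(f, x, h):
    """Second-order first-derivative approximation valid at gridpoint x.

    Uses the neighbouring gridpoints x - h and x + h (spacing h).
    """
    return (f(x + h) - f(x - h)) / (2.0 * h)


def diff_at_halfway(f, x, h):
    """Second-order first-derivative approximation valid at x, where x lies
    halfway between the gridpoints x - h/2 and x + h/2 (same spacing h).
    """
    return (f(x + 0.5 * h) - f(x - 0.5 * h)) / h


# Hypothetical test setup: differentiate sin(x) at x0, exact answer cos(x0).
x0, h = 1.0, 0.1
exact = math.cos(x0)

err_grid = abs(diff_at_gridpoint(math.sin, x0, h) - exact)
err_half = abs(diff_at_halfway(math.sin, x0, h) - exact)
ratio = err_grid / err_half

print(f"error at gridpoint:     {err_grid:.3e}")
print(f"error at halfway point: {err_half:.3e}")
print(f"ratio:                  {ratio:.3f}")
```

For these second-order stencils the leading error terms are h²/6 and h²/24 times the third derivative, so the halfway-point error should come out roughly four times smaller. This example does not cover higher-order stencils.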