This patch (patch.5.4.1.14032016) for vasp.5.4.1.05Feb16 improves the mapping between MPI ranks and GPUs on multi-node/multi-GPU systems. It affects only the performance of the GPU-accelerated version of VASP and is not a bug fix.
This patch can be downloaded from the download portal or our wiki.
Gunzip and apply this patch inside the vasp.5.4.1 root directory:
patch -p0 < patch.5.4.1.14032016
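The two steps above (gunzip, then apply inside the root directory) can be sketched as a small shell helper that first checks the patch applies cleanly. This is only an illustration: the function name apply_gz_patch and the paths in the usage line are hypothetical, and the --dry-run check assumes GNU patch is installed.

```shell
#!/bin/sh
# Sketch: apply a gzipped patch from inside a source root, checking it first.
# Assumptions: GNU patch (for --dry-run) and gzip are available;
# apply_gz_patch is a hypothetical helper name, not part of VASP.
apply_gz_patch() {
    root=$1       # source root, e.g. the vasp.5.4.1 root directory
    patch_gz=$2   # absolute path to the gzipped patch file
    (
        cd "$root" || exit 1
        # Verify that every hunk applies before modifying any file.
        gunzip -c "$patch_gz" | patch -p0 --dry-run >/dev/null || exit 1
        # Apply for real; -p0 keeps file paths exactly as written in the patch.
        gunzip -c "$patch_gz" | patch -p0
    )
}

# Hypothetical usage (adjust both paths to your system):
# apply_gz_patch /path/to/vasp.5.4.1 /path/to/patch.5.4.1.14032016.gz
```

Running the dry run first means a patch that does not match the source tree is rejected before any file is changed.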
The VASP team.