Segfault when cuda_mem.cu frees memory

questions related to VASP with GPU support (vasp.5.4.1, version released in Feb 2016)

Moderators: Global Moderator, Moderator

luca
Newbie
Posts: 1
Joined: Wed Mar 23, 2016 4:59 pm
License Nr.: 5-115


#1 Post by luca » Thu Mar 31, 2016 6:04 am

Hi,

I would like to ask whether anybody running VASP on GPUs (Tesla K20X on a Cray XC30 in my case) has seen a similar issue.

When running the simple CeO2 test from Peter Larsson's test suite on a single core, I get a segmentation fault at line 71 of cuda_mem.cu, where the code tries to free the memory (the "#else free(*ptr);" branch):
Process 0:
Thread 1 stopped in free from /lib64/libc.so.6 with signal SIGSEGV (Segmentation fault).
Reason/Origin: address not mapped to object (attempt to access invalid address)

I had to modify cuda_mem.cu because the original version, with the define statements for NVREGISTERSELF and NVPINNED, gave a different error, "Failed to free pinned memory!", which is raised in the lines immediately before the statement on line 71.
I am using the INCAR provided by the VASP test suite, with NCORE=1 and NPAR not set.
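
For readers hitting the same crash: one common way for free() to fault with "address not mapped to object" is that the pointer did not come from malloc(), for example when the allocation and release paths end up on different sides of a compile-time switch. Whether that is what happens in cuda_mem.cu is an assumption, not something the report above confirms. The sketch below is plain host-only C, not the VASP source: every name in it is hypothetical, and mmap()/munmap() only stand in for a pinned allocator such as cudaMallocHost()/cudaFreeHost(). It shows a buffer handle that remembers which allocator produced it, so the release always matches.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical names throughout; this is NOT the code in cuda_mem.cu. */
typedef enum { BUF_NONE = 0, BUF_PAGEABLE, BUF_PINNED } buf_kind;

typedef struct {
    void    *ptr;
    size_t   size;
    buf_kind kind;   /* remembers which allocator produced ptr */
} host_buf;

/* Allocate size bytes. pinned != 0 selects the stand-in for the pinned path.
 * Returns 0 on success, -1 on failure (the handle is then left empty). */
int host_buf_alloc(host_buf *buf, size_t size, int pinned)
{
    assert(buf != NULL);
    buf->ptr = NULL;
    buf->size = 0;
    buf->kind = BUF_NONE;
    if (size == 0)
        return -1;

    if (pinned) {
        /* Stand-in for cudaMallocHost(): memory that free() must never see. */
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        buf->ptr = p;
        buf->kind = BUF_PINNED;
    } else {
        void *p = malloc(size);
        if (p == NULL)
            return -1;
        buf->ptr = p;
        buf->kind = BUF_PAGEABLE;
    }
    buf->size = size;
    return 0;
}

/* Release with the routine that matches the allocator, then clear the
 * handle so that a second call is a harmless no-op. */
void host_buf_release(host_buf *buf)
{
    assert(buf != NULL);
    switch (buf->kind) {
    case BUF_PINNED:   munmap(buf->ptr, buf->size); break; /* cudaFreeHost() in CUDA code */
    case BUF_PAGEABLE: free(buf->ptr);              break;
    case BUF_NONE:     break;
    }
    buf->ptr = NULL;
    buf->size = 0;
    buf->kind = BUF_NONE;
}
```

A caller would declare a host_buf, call host_buf_alloc(&buf, nbytes, use_pinned), and later host_buf_release(&buf). Because the kind travels with the pointer, editing or removing a define in one place cannot silently route a pinned pointer into free().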

I would appreciate any advice from the VASP community.

Thanks,
Luca

guiyang_huang1
Newbie
Posts: 12
Joined: Tue Nov 12, 2019 7:00 pm

Re: Segfault when cuda_mem.cu frees memory

#2 Post by guiyang_huang1 » Wed Nov 13, 2019 3:53 pm

Try removing the pinned memory option from makefile.include.

Or compile the GPU version of VASP with the PGI compiler.

With the GNU or XL compilers, it seems the pinned memory option cannot be used on IBM POWER9. It is unclear whether the same is true for your system.
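
As a rough illustration of the first suggestion: in the makefile.include templates shipped with the VASP 5.4 GPU port, the pinned memory option is, as far as I recall, the -DUSE_PINNED_MEMORY define on the CPP_GPU line. The flag list shown here is an assumption and may differ in your version, so check your own file rather than copying this one.

```makefile
# Assumed original line (verify against your own makefile.include):
# CPP_GPU    = -DCUDA_GPU -DRPROMU_CPROJ_OVERLAP -DUSE_PINNED_MEMORY -DCUFFT_MIN=28

# Same line with the pinned-memory define removed:
CPP_GPU    = -DCUDA_GPU -DRPROMU_CPROJ_OVERLAP -DCUFFT_MIN=28
```

After changing the define, rebuild the GPU executables from a clean state so that every object file is recompiled with the new setting.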
