Discussion:
[Wien] Segmentation Fault after CORE END
Eric Kenney
2018-10-04 21:31:50 UTC
I'm having an issue with segmentation faults during LAPW cycles. I keep
getting segmentation faults after running a standard run_lapw command:

run_lapw -ec 0.001
LAPW0 END
LAPW1 END
LAPW2 END
CORE END
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
mixer 00000000004827DD Unknown Unknown Unknown
libpthread-2.12.s 00000032F8C0F7E0 Unknown Unknown Unknown
mixer 0000000000415054 MAIN__ 999 mixer.F
mixer 000000000040475E Unknown Unknown Unknown
libc-2.12.so 00000032F841ED1D __libc_start_main Unknown Unknown
mixer 0000000000404669 Unknown Unknown Unknown
stop error
Naturally, I've been going through the mailing list and trying to deduce
what the issue is. The error occurs for a wide range of RKmax values
(5-8) and a wide range of k-mesh values (1000-10000). It occurs for -ec
0.001 through -ec 0.0000001, so across a wide range of convergence
criteria. I've tried ulimit, but my cluster does not appear to have that
command installed, nor do I think it is the issue.

Interestingly, I haven't had this issue in the past with other lattices.
This lattice is NaMnO2 in the B2/m setting, imported straight from a .cif
file from Springer Materials. I've done x nn, x sgroup, and x symmetry all
manually, and there seem to be no obvious issues here. For reference, the
lattice parameters are as follows:

a=10.720416 b=11.929841 c=5.397058 90.000000 90.000000 122.340000
NA X=0.50000000 Y=0.50000000 Z=0.50000000
Mn X=0.00000000 Y=0.00000000 Z=0.00000000
O X=0.50170000 Y=0.22940000 Z=0.00000000
4 atoms per unit cell.

Thank you, and have a nice day!
Laurence Marks
2018-10-04 21:38:07 UTC
1) What version of Wien2k are you using?
2) What does "cat *.error" show?

This almost certainly has nothing to do with -ec or the k-points. Changing
ulimit is obsolete (unless you are using an obsolete version of Wien2k).
This might be an error in mixer, but is more likely to be a setup error.
However, without more information it is not possible to say (yet).
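
A quick way to answer the second question is to print only the error files that actually contain something. The sketch below assumes the usual WIEN2k behaviour that a program which finishes cleanly leaves an empty .error file behind; the function name show_errors is made up for this illustration.

```shell
# Print the name and contents of every non-empty *.error file in a directory.
# Assumption: an empty .error file means the corresponding program ended normally.
show_errors() {
    dir="${1:-.}"
    found=0
    for f in "$dir"/*.error; do
        # Skips empty files, and also the literal pattern if nothing matched.
        [ -s "$f" ] || continue
        found=1
        echo "== $f"
        cat "$f"
    done
    [ "$found" -eq 1 ] || echo "no non-empty .error files in $dir"
}
```

Run it as `show_errors .` from inside the case directory right after the failed run_lapw.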
--
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu ; Corrosion in 4D: MURI4D.numis.northwestern.edu
Partner of the CFW 100% program for gender equity, www.cfw.org/100-percent
Co-Editor, Acta Cryst A
Gavin Abo
2018-10-05 02:26:53 UTC
Also, what compiler and version are you using? Errors like that seem to be
common with the 2016/2017/2018 Intel Fortran compilers [
https://www.mail-archive.com/***@zeus.theochem.tuwien.ac.at/msg17542.html
]. You could try gfortran or the pre-built executables just to
see if it is caused by your compiler [
https://www.mail-archive.com/***@zeus.theochem.tuwien.ac.at/msg17982.html
].

For example, I'm using the 2013 Intel Fortran compiler:

***@computername:~/Desktop$ ifort -v
ifort version 14.0.1

I don't know for sure, but it looks like your error message comes not
from WIEN2k 18.2 but from an older version. If so, the latest version
might resolve the error.

I'm using WIEN2k 18.2:

***@computername:~/Desktop$ cat $WIENROOT/WIEN2k_VERSION
WIEN2k_18.2 (Release 17/7/2018)
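
The two checks shown so far (compiler and WIEN2k version) can be collected in one step before posting to the list. This is only a sketch: the function name report_env is hypothetical, and it assumes WIENROOT points at the installation and that the compilers, if installed, accept --version.

```shell
# Report the WIEN2k release and the Fortran compilers found on PATH.
# Assumption: $WIENROOT/WIEN2k_VERSION exists, as in the cat command above.
report_env() {
    root="${WIENROOT:-}"
    if [ -r "$root/WIEN2k_VERSION" ]; then
        echo "WIEN2k: $(cat "$root/WIEN2k_VERSION")"
    else
        echo "WIEN2k: version file not found (is WIENROOT set?)"
    fi
    for fc in ifort gfortran; do
        if command -v "$fc" >/dev/null 2>&1; then
            # Only the first line of the version banner is kept.
            echo "$fc: $("$fc" --version 2>&1 | head -n 1)"
        else
            echo "$fc: not on PATH"
        fi
    done
}
```

Calling `report_env` prints one line for the WIEN2k release and one line per compiler, which is enough to tell whether the mixer was built with one of the Intel releases mentioned above.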

My version also has the ScaleDiag.F fixes to the mixer (i.e.,
ScaleDiag.patch [
https://github.com/gsabo/WIEN2k-Patches/tree/master/18.2 ]).

I tried a quick test calculation with the structure parameters you gave
for a couple of SCF cycles, and it seems to run fine:

***@computername:~/wiendata$ mkdir NaMnO2
***@computername:~/wiendata$ cd NaMnO2
***@computername:~/wiendata/NaMnO2$ makestruct_lapw
...
TITLE :NaMnO2
...
Would you like to enter  Spacegroup or Lattice (S/L)(def=S)? S
SPACE GROUP: (type ENTER or give first LETTER for a list)
give SPACE GROUP as SYMBOL or NUMBER: B2/m
 Info:  space group is : 12 Cxz B2/m -B2x

Units of lattice parameters (Bohr/Angstrom) (b/A) (def=ANG):b
Lattice PARAMETERS as a b c (1 or 3 numbers - if you specify only 1
number, a cubic system is assumed):
10.720416 11.929841 5.397058
ANGLES BETWEEN lattice vectors, as alpha beta gamma (def=90.0 90.0
90.0):90 90 122.34
NUMBER INEQUEVALENT ATOMS :3
ATOM  1 (ELEMENT): Na
POSITION OF ATOM Na as X,Y,Z (def=0 0 0) :0.5 0.5 0.5
ATOM  2 (ELEMENT): Mn
POSITION OF ATOM Mn as X,Y,Z (def=0 0 0) :0 0 0
ATOM  3 (ELEMENT): O
POSITION OF ATOM O as X,Y,Z (def=0 0 0) :0.5017 0.2294 0
...
SPECIFY possible REDUCTION of SPHERE RADII in % (def=0)
2
...
rerun setrmt ?(y,N) (def=N):
N
...
The file   init.struct   has been created

  for modifications of your input you can also edit file datastruct and run
  Tmaker / setrmt init -r X    individually

***@computername:~/wiendata/NaMnO2$ cp init.struct NaMnO2.struct
***@computername:~/wiendata/NaMnO2$ init_lapw -b
...
  init_lapw finished ok
***@computername:~/wiendata/NaMnO2$ run_lapw -ec 0.001
hup: Command not found.
 LAPW0 END
 LAPW1 END
 LAPW2 END
 CORE  END
 MIXER END
ec cc and fc_conv 0 1 1
in cycle 2    ETEST: 0   CTEST: 0
hup: Command not found.
 LAPW0 END
 LAPW1 END
 LAPW2 END
 CORE  END
 MIXER END
ec cc and fc_conv 0 1 1
in cycle 3    ETEST: 0   CTEST: 0
hup: Command not found.
 LAPW0 END
 LAPW1 END
 LAPW2 END
 CORE  END
 MIXER END
ec cc and fc_conv 0 1 1
...
Eric Kenney
2018-10-05 22:35:26 UTC
Based on the recommendation of Gavin Abo, I entered the struct file by
hand. I got the system to run properly for a bit and did some
calculations. However, when I tried to change the potential from PBE to
LDA, the mixer began crashing again. Now it crashes even with PBE.

I'm using version 18.1; I'll look into whether I can get it patched.
Laurence Marks
2018-10-05 23:06:59 UTC
Everything indicates that there are errors in what you are doing. Based
upon the case.scfm and case.outputm files, the prior error was not in the
mixer; it was a general failure due to incorrect usage.

As one example, you cannot just "change" the potential. At a minimum you
must save the prior version, change case.in0, and then run.