Opened 8 years ago

Closed 8 years ago

#33 closed defect (fixed)

Segfault in atmosphere processing

Reported by: dbaack Owned by: somebody
Priority: critical Milestone: milestone:
Component: component1 Version:
Keywords: Cc:

Description

I get the following segmentation fault for some EventIO files and I can't find the reason:

ROOT:Error: segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007fe2859e9b0c in waitpid () from /lib64/libc.so.6
#1  0x00007fe28596e092 in do_system () from /lib64/libc.so.6
#2  0x00007fe28af4a8d9 in TUnixSystem::StackTrace (this=0x1797bc0) at /home/dbaack/root/root_v5.34.36/core/unix/src/TUnixSystem.cxx:2419
#3  0x00007fe28af4c51c in TUnixSystem::DispatchSignals (this=0x1797bc0, sig=kSigSegmentationViolation) at /home/dbaack/root/root_v5.34.36/core/unix/src/TUnixSystem.cxx:1294
#4  <signal handler called>
#5  0x00007fe28c5bf984 in MSimAtmosphere::Process() () from libmars.so
#6  0x00007fe28c195df4 in MTask::CallProcess() () from libmars.so
#7  0x00007fe28c194707 in MTaskList::ProcessTaskList() () from libmars.so
#8  0x00007fe28c194ba8 in MTaskList::Process() () from libmars.so
#9  0x00007fe28c195df4 in MTask::CallProcess() () from libmars.so
#10 0x00007fe28c151e45 in MEvtLoop::Process(unsigned int) () from libmars.so
#11 0x00007fe28c152344 in MEvtLoop::Eventloop(unsigned int, MEvtLoop::Statistics_t) () from libmars.so
#12 0x00007fe28c852d20 in MJSimulation::Process(MArgs const&, MSequence const&) () from libmars.so
#13 0x00000000004043bb in main ()
===========================================================



Simulation Settings:
(Atmosphere should be handled by ceres)

RUNNR 104
EVTNR 1
NSHOW 4000
SEED 1041 0 0
SEED 1042 0 0
SEED 1043 0 0
SEED 1044 0 0
SEED 1045 0 0
THETAP 10 10
PRMPAR 14
ERANGE 100 30000
ESLOPE -2.0
PHIP 0 0
VIEWCONE 0 5
FIXCHI 0
OBSLEV 220000
MAGNET 30.3 24.1
ARRANG -7
ATMOSPHERE 7 T
ATMLAY 775000 1650000 5000000 10500000
RADNKG 20000
ECUTS 0.3 0.3 0.02 0.02
ECTMAP 10000
MUADDI 0
MUMULT 1
CWAVLG 290 900
CERSIZ 1
CERFIL 1
CSCAT 1 60000.0 0.0
DYNSTACK 1000000
TELESCOPE 0 0 0 500
LONGI 0 20 0 0
MAXPRT 0
PAROUT 0 0
DATBAS 0
DEBUG F 6 F 1000000
DIRECT ./
FLUDBG  F
TELFIL  cerdata
USER dominik
EXIT

I can upload the CORSIKA file somewhere if needed.

Change History (8)

comment:1 by tbretz, 8 years ago

Could you enable debugging by adding -g in Makefile.conf.linux at DEBUG, then recompile (make mrproper, make), run everything in a debugger and send me the line number where it crashes?

Thanks.

comment:2 by dbaack, 8 years ago

The fault changed to

#0  0x00007f8930015b0c in waitpid () from /lib64/libc.so.6
#1  0x00007f892ff9a092 in do_system () from /lib64/libc.so.6
#2  0x00007f89355768d9 in TUnixSystem::StackTrace (this=0x1c8dbc0) at /home/dbaack/root/root_v5.34.36/core/unix/src/TUnixSystem.cxx:2419
#3  0x00007f893557851c in TUnixSystem::DispatchSignals (this=0x1c8dbc0, sig=kSigSegmentationViolation) at /home/dbaack/root/root_v5.34.36/core/unix/src/TUnixSystem.cxx:1294
#4  <signal handler called>
#5  0x00007f8936beb984 in CalcOzoneAbsorption (theta=<optimized out>, wavelength=438, h=<optimized out>, this=0x7f8937611010) at MSimAtmosphere.cc:609
#6  GetTransmission (ph=..., this=0x7f8937611010) at MSimAtmosphere.cc:673
#7  MSimAtmosphere::Process (this=0x7ffdfcaa7420) at MSimAtmosphere.cc:815
#8  0x00007f89367c1df4 in MTask::CallProcess (this=0x7ffdfcaa7420) at MTask.cc:287
#9  0x00007f89367c0707 in MTaskList::ProcessTaskList (this=this

So it's line 609.

Last edited 8 years ago by dbaack

comment:3 by tbretz, 8 years ago

Can you please print the values of h, theta, H and T for which that happens? I cannot easily test that from here (I am in Mexico right now). ozone_path is defined as [501][90]. Maybe it is a trivial problem...
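Since ozone_path is a fixed [501][90] table, one way to rule out an out-of-range access is to validate both indices before the lookup. The following is only a sketch: ToIndex and SafeLookup are hypothetical helpers, not part of MARS, and the mapping from h and theta to table bins is assumed to be a plain truncation.

```cpp
#include <cmath>
#include <cstddef>

// Table dimensions taken from the ticket: ozone_path is [501][90].
constexpr std::size_t kNumHeights = 501;
constexpr std::size_t kNumAngles  = 90;

// Hypothetical helper: turn a floating-point bin value into a table index.
// Returns false for NaN, infinity, negative values and values >= size,
// so that such a value never reaches the integer cast.
inline bool ToIndex(double value, std::size_t size, std::size_t &index)
{
    if (!std::isfinite(value))
        return false;
    if (value < 0 || value >= static_cast<double>(size))
        return false;
    index = static_cast<std::size_t>(value);  // truncation (assumed binning)
    return true;
}

// Hypothetical guarded lookup: writes the table entry to 'result' and
// returns true only if both indices are valid.
inline bool SafeLookup(const double (&table)[kNumHeights][kNumAngles],
                       double h, double theta, double &result)
{
    std::size_t ih = 0, it = 0;
    if (!ToIndex(h, kNumHeights, ih) || !ToIndex(theta, kNumAngles, it))
        return false;
    result = table[ih][it];
    return true;
}
```

A caller would check the return value and treat false as "no valid table entry for this photon" instead of reading past the end of the array.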

comment:4 by dbaack, 8 years ago

Here are the numbers before the crash:

Before access (h | theta | H | T) -nan | -nan | 2147483648 | 2147483648
ROOT:Error: segmentation violation

I don't know why it is NaN, and it seems to happen very rarely.
I think it's safe to throw these photons away and maybe print a warning.
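The "throw them away and print a warning" idea could look roughly like the sketch below. The Photon struct and DropInvalidPhotons are hypothetical and only illustrate the filtering step; they do not reflect how MARS stores photons.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical photon record, reduced to the two quantities printed above.
struct Photon
{
    double height;
    double theta;
};

// Remove all photons whose height or angle is NaN or infinite.
// Prints one warning if anything was removed and returns the number removed.
inline std::size_t DropInvalidPhotons(std::vector<Photon> &photons)
{
    const auto invalid = [](const Photon &p) {
        return !std::isfinite(p.height) || !std::isfinite(p.theta);
    };

    // remove_if keeps the valid photons in their original order
    const auto first = std::remove_if(photons.begin(), photons.end(), invalid);
    const std::size_t dropped = static_cast<std::size_t>(photons.end() - first);
    photons.erase(first, photons.end());

    if (dropped > 0)
        std::cerr << "Warning: dropped " << dropped
                  << " photon(s) with non-finite height or angle\n";
    return dropped;
}
```

This only hides the symptom; it does not explain where the NaN comes from.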

Last edited 8 years ago by dbaack

comment:5 by tbretz, 8 years ago

Could you please check in GetTransmission which value is the reason for the NaN?

comment:6 by dbaack, 8 years ago

Maybe it's not important, but I don't use the compact output for EventIO. I add a + in front of the filename to get the normal format.

The last 6 photons all have a sin2 between 0.9998 and 0.99992. The one that triggers the crash has a sin2 greater than 1.

Height: 1.11686e+06
sin2: 1.00002
cost: -nan
theta: -nan
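These numbers can be reproduced in isolation. The sketch below assumes that cost is computed as sqrt(1 - sin2) and theta as acos(cost); the ticket does not show the actual expressions in MSimAtmosphere.cc, and the function names are made up for illustration.

```cpp
#include <cmath>

// Assumed relation: cos(theta) = sqrt(1 - sin^2(theta)).
// For sin2 > 1 the argument of sqrt is negative and the result is NaN.
inline double CosFromSin2(double sin2)
{
    return std::sqrt(1.0 - sin2);
}

// Assumed relation: theta = acos(cos(theta)). NaN in gives NaN out.
inline double ThetaFromSin2(double sin2)
{
    return std::acos(CosFromSin2(sin2));
}
```

With sin2 = 1.00002 both functions return NaN, while sin2 = 0.99992 gives finite values. Converting a NaN to an integer index is undefined behavior in C++, which would be consistent with the 2147483648 printed for H and T in comment 4.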

Last edited 8 years ago by dbaack

comment:7 by tbretz, 8 years ago

This is just a numerical precision problem then. We just have to find the right place for an if (sin2>1) sin2=1.
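A minimal sketch of such a clamp, again assuming that cost is derived as sqrt(1 - sin2). ClampedCosFromSin2 is a hypothetical name, and the ticket does not record where the check was finally placed.

```cpp
#include <algorithm>
#include <cmath>

// Clamp sin2 to at most 1 before taking the square root, so that rounding
// errors slightly above 1 give cos(theta) = 0 instead of NaN.
// Equivalent to: if (sin2 > 1) sin2 = 1;
inline double ClampedCosFromSin2(double sin2)
{
    const double s = std::min(sin2, 1.0);
    return std::sqrt(1.0 - s);
}
```

For the photon from comment 6 (sin2 = 1.00002) this yields cost = 0 and hence theta = pi/2, i.e. a horizontal photon. Whether such a photon should then be processed or discarded is a separate decision.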

comment:8 by tbretz, 8 years ago

Resolution: fixed
Status: new → closed