Trouble Shooting - Computers
Information about the Computers
You can find information about the Computers on La Palma on the computing page.
Restart a computer after a power cut
You can switch on the computers from 10.0.100.234 (see http://fact-project.org/internal.html).
Startup Procedure
- bring up GATE (LDAP, DNS, DHCP, Gateway (Masquerading))
- bring up NEWDAQ (NFS Home, Raid)
- bring up DATA
- bring up DAQ
- bring up the other machines
- make sure that the needed mountpoints are there
- make sure that the needed screen sessions are running (for details, see Trouble Shooting Software)
mountpoints:
- daq, data: /newdaq from newdaq
- data, gate: /daq from daq
- gate: /users from newdaq (home of other machines)
- data,daq,aux,gui: /home from newdaq
If a mountpoint is missing, run sudo mount -a on the corresponding machine.
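The mountpoint check above can be sketched as a small shell helper. This is only an illustration: the function names expected_mounts and missing_mounts are hypothetical, the per-machine lists are derived from the mountpoint list above, and it assumes that the mountpoint utility from util-linux is installed on the machines.

```shell
#!/bin/sh
# Hypothetical helper: expected mountpoints per machine,
# derived from the mountpoint list above.
expected_mounts() {
    case "$1" in
        daq)     echo "/newdaq /home" ;;
        data)    echo "/newdaq /daq /home" ;;
        gate)    echo "/daq /users" ;;
        aux|gui) echo "/home" ;;
        *)       echo "" ;;   # unknown machine: nothing to check
    esac
}

# Print every expected mountpoint that is currently not mounted.
# Assumes the `mountpoint` utility (util-linux) is available.
missing_mounts() {
    for mp in $(expected_mounts "$1"); do
        mountpoint -q "$mp" || echo "$mp"
    done
}
```

On daq, for example, one could call missing_mounts daq; if it prints anything, run sudo mount -a and check again.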
Shutdown Procedure
- shut down aux, gui
- shut down daq
- shut down data
- shut down newdaq
- shut down gate
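The shutdown order above can be written down as a dry-run sketch. The following is not an existing FACT script: the function name shutdown_all, the DRY_RUN switch, ssh access to every machine and the use of sudo shutdown -h now are all assumptions. The sketch also does not wait for one machine to be down before moving on to the next, so in practice you would check that by hand.

```shell
#!/bin/sh
# Shutdown order as listed above.
SHUTDOWN_ORDER="aux gui daq data newdaq gate"

# Hypothetical helper: with DRY_RUN=1 (the default) it only prints
# the commands; with DRY_RUN=0 it would run them via ssh.
shutdown_all() {
    for host in $SHUTDOWN_ORDER; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: ssh $host sudo shutdown -h now"
        else
            # Assumption: ssh login and sudo rights exist on $host.
            ssh "$host" sudo shutdown -h now
        fi
    done
}
```

Calling shutdown_all without setting DRY_RUN only lists the commands in the correct order, which is a safe way to double-check the sequence.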
Restarting a hanging PC
Symptom
- the PC can't be reached via ssh, or shows a similar problem
- be aware that when all computers except gate seem to hang, it is normally newdaq that hangs; the others are only trying to mount their home directory from the RAID on newdaq, so they hang too
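The reasoning above (check newdaq first when several machines seem to hang) can be sketched in shell. The function names is_reachable and suspect are hypothetical, and the 5 second timeout is an assumed value; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/sh
# Return success if the machine answers an ssh connection in time.
# BatchMode avoids hanging on a password prompt.
is_reachable() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Given the names of the unreachable machines, print where to look
# first: newdaq if it is among them, otherwise the machines given.
suspect() {
    for h in "$@"; do
        if [ "$h" = "newdaq" ]; then
            echo "newdaq"
            return 0
        fi
    done
    echo "$*"
}
```

One possible use is to call is_reachable for each of aux, gui, daq, data and newdaq, collect the names that fail, and pass them to suspect.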
Solution
If it's not too late at night, try to call an expert before you power cycle the computers.
When you have to restart more than one PC, be sure to follow the Shutdown and Startup procedures above.
You can switch on the computers from 10.0.100.234 (see http://fact-project.org/internal.html)
or you can power cycle the hanging computer from any other computer on the FACT internal network:
- go to /usr/local/bin
- execute one of the following scripts: aux_off, gui_off, gate_off, daq_off, data_off
- wait a few minutes
- execute one of the following scripts: aux_ON, gui_ON, gate_ON, daq_ON, data_ON
- rebooting takes a few minutes for aux, gui and gate, and about 10 minutes each for daq and data
or power cycle the hanging computer manually from the FACT-container.
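The script-based power cycle described above can be sketched as a small wrapper. The function names power_script and power_cycle and the WAIT_SECONDS variable are hypothetical; the script names and the directory /usr/local/bin are taken from the list above, and the default of 180 seconds is only an assumed reading of "a few minutes".

```shell
#!/bin/sh
# Build the path of the off/ON script for one machine.
# usage: power_script <machine> <off|ON>
power_script() {
    case "$1" in
        aux|gui|gate|daq|data) echo "/usr/local/bin/${1}_${2}" ;;
        *) echo "no power script known for: $1" >&2; return 1 ;;
    esac
}

# Switch a machine off, wait, and switch it on again.
power_cycle() {
    off=$(power_script "$1" off) || return 1
    on=$(power_script "$1" ON) || return 1
    "$off"
    sleep "${WAIT_SECONDS:-180}"   # assumed value for "a few minutes"
    "$on"
}
```

For example, power_cycle daq would run daq_off, wait, and then run daq_ON; afterwards allow about 10 minutes for daq to come back before checking it again.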