Wednesday, 18 April 2012


Troubleshooting of SAP system:-
Steps to follow if SAP system goes down and is not coming up:-
1. File System check:-
Use the command “df -gt” on an AIX system to check the size and usage of all file systems (FSs). Make sure that no FS is 100% full. If one is, free up or add space to bring its usage back below the threshold (at the very least, it must no longer be at 100% for the system to function properly).
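The file system check can be scripted. The following is only a minimal sketch: the function name full_filesystems and the default threshold of 90 are assumptions made for illustration. It reads df output on standard input, takes the first field of the form NN% as the usage figure and the last field as the mount point, so it does not depend on the exact column order (mount points containing spaces are not handled).

```shell
# Hypothetical helper: list file systems at or above a usage threshold.
# Reads df output on stdin; $1 is the threshold in percent (default 90).
full_filesystems() {
  threshold="${1:-90}"
  awk -v limit="$threshold" '
    NR > 1 {
      # Take the first field that looks like "NN%" as the usage figure.
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^[0-9]+%$/) {
          pct = $i
          sub(/%/, "", pct)
          # The last field is assumed to be the mount point.
          if (pct + 0 >= limit) print pct "% " $NF
          break
        }
      }
    }'
}

# Usage on the affected host:
#   df -gt | full_filesystems 95
```

An empty result means no file system has reached the chosen threshold.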
2. Network issues:-
Use the command “errpt | more” to see whether there are any issues at the hardware/network level (OS level).
Use the command “errpt -a | more” for a more detailed analysis of the issue.
Use the command “df -gt” to check whether there are any mount issues for any FS or on any server.
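To pick the hardware entries out of a long error report, the summary output can be filtered. This is a sketch under stated assumptions: it assumes the usual errpt summary layout, in which the fourth column (C) holds the error class and H stands for hardware, and the function name hardware_errors is hypothetical.

```shell
# Hypothetical helper: print "<resource>: <description>" for hardware-class entries.
# Reads the errpt summary report on stdin.
hardware_errors() {
  awk '$4 == "H" {
    # Fields 6..NF are assumed to form the description text.
    desc = $6
    for (i = 7; i <= NF; i++) desc = desc " " $i
    print $5 ": " desc
  }'
}

# Usage on the affected host:
#   errpt | hardware_errors
```

Any resource listed here is a candidate for the detailed report from “errpt -a”.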
3. Tablespace check:-
Make sure no tablespace is at 100%.
If you are not able to log in to the SAP system, check the tablespaces using BRTOOLS.



If any tablespace is 100% full, extend it immediately using BRTOOLS, after first checking that the FS from which the tablespace takes its space has enough free space.
4. WorkProcess check:-
Use the command “dpmon pf=<Instance Profile>” in the profile directory (cdpro) as user <sid>adm to check whether all work processes (WPs) are occupied. If all WPs are in “On Hold” or “Running” status, no WP is in waiting state and no new requests will be served. In that case, check whether something is stuck, and then kill or restart those WPs so that the system can accept new requests again.





5. System Start/Stop logs:-
If the system has been stopped or started by somebody, the start/stop logs will be present in the home directory of user <sid>adm. If an attempt to start or stop the system fails, these logs can also be checked for further analysis.

6. DB status check:-
Check whether the DB is up and running fine:-
Check whether the ora processes (Oracle background processes) are running using the command:-
ps -ef | grep ora_

You can also start sqlplus and run the query “select status from v$instance;” to check the status of the DB. A database that is up and available reports the status OPEN.
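The process check can be narrowed down to a single yes/no answer. The sketch below assumes that a running Oracle instance has a background process named ora_pmon_<SID> and that the process name is the last column of the ps output. The function name oracle_pmon_running is hypothetical.

```shell
# Hypothetical helper: succeed if the PMON background process of the given SID
# appears in the ps output read from stdin. $1 is the Oracle SID.
oracle_pmon_running() {
  awk -v p="ora_pmon_$1" '
    $NF == p { found = 1 }
    END { exit !found }'
}

# Usage on the database host:
#   ps -ef | oracle_pmon_running "$ORACLE_SID" && echo "PMON is running"
```

Comparing the whole process name avoids a false match when one SID is a prefix of another.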


7. DB connectivity:-
Check the DB connectivity of the ABAP stack using the command “R3trans -d” as user <sid>adm. If everything is fine, it returns RC 0000. If the RC is 0012 or some other non-zero code, check the trans.log file to find the root cause of the problem. This file is created in the directory where you execute the command.
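The return code handling can be wrapped in a small helper. This is a sketch only: it assumes that the exit status of R3trans mirrors the return code it reports, and the function name interpret_r3trans_rc is hypothetical.

```shell
# Hypothetical helper: turn an R3trans exit status into a short message.
# $1 is the exit status of "R3trans -d".
interpret_r3trans_rc() {
  rc="$1"
  if [ "$rc" -eq 0 ]; then
    echo "DB connection OK"
  else
    echo "DB connection failed (RC $rc), check trans.log in the current directory"
  fi
}

# Usage, as user <sid>adm:
#   R3trans -d; interpret_r3trans_rc $?
```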

8. SAPPFPAR check:-
Check whether all memory parameters on the system are fine using the command “sappfpar check pf=<Instance Profile>” as user <sid>adm. This command checks the memory for the shared pools from which all shared buffers take their space. Warnings in the output should not prevent the system from starting, but if the output contains errors, the system will not come up until the memory parameters are adapted accordingly.


9. Other commonly used SAP system logs for troubleshooting purposes:-
Check the SAP system log files in the directory g:\usr\sap\<sid>\<instance no.>\work. You should also check the contents of the developer traces (dev_ms (message server trace), dev_disp (dispatcher trace), and dev_w<m> (work process trace)) at this point.


Check the trace files of the individual SAP processes:
·         dev_ms:  Developer trace for the message server
·         dev_rd:  Developer trace for the gateway
·         dev_disp: Developer trace for the dispatcher
·         dev_w<m> (m is the work process number): Developer trace for the work processes.
If you can still log on to the SAP system, check the system log of the SAP system using transaction SM21.
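To see quickly which developer traces deserve a closer look, the work directory can be scanned. This is a sketch that assumes error lines in the developer traces contain the string ERROR. The function name traces_with_errors is hypothetical, and the work directory path has to be supplied by the caller.

```shell
# Hypothetical helper: for each dev_* trace in a work directory, print the
# file name and the number of lines containing "ERROR" (files without hits
# are skipped). $1 is the path of the work directory.
traces_with_errors() {
  workdir="$1"
  for f in "$workdir"/dev_*; do
    [ -f "$f" ] || continue
    # grep -c exits non-zero when the count is 0, so neutralise that.
    n=$(grep -c 'ERROR' "$f" || true)
    if [ "$n" -gt 0 ]; then
      echo "$(basename "$f") $n"
    fi
  done
  return 0
}

# Usage:
#   traces_with_errors /usr/sap/<SID>/<instance name>/work
```

The count is only a pointer to the right file. The surrounding lines in the trace still have to be read to understand the error.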
10. Database logs (Oracle):-
All significant events, such as starting and stopping the database, and error messages are recorded in the file \oracle\<SID>\saptrace\background\ALRT.LOG.
This alert log may also have a name such as alert_<sid>.log.
Detailed information about errors is logged in the Oracle trace file:
\oracle\<SID>\saptrace\usertrace\Ora<no>.trc.  The number of the relevant trace file is referenced in the alert log itself.
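Since the alert log can grow large, it helps to pull out only the most recent Oracle error lines. The following sketch assumes that error messages in the alert log carry the usual ORA-<number> prefix. The function name recent_ora_errors and the default of 20 lines are assumptions for illustration.

```shell
# Hypothetical helper: print the last N lines of an alert log that contain
# an ORA- error code. $1 is the alert log file, $2 is N (default 20).
recent_ora_errors() {
  alert_log="$1"
  count="${2:-20}"
  # grep exits non-zero when nothing matches, so neutralise that.
  { grep 'ORA-[0-9]' "$alert_log" || true; } | tail -n "$count"
}

# Usage (path as it applies on your host):
#   recent_ora_errors /path/to/alert_<sid>.log 10
```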


11. Java system logs for troubleshooting purposes:-
Java uses the Startup and Control Framework for its startup. In the case of an error or unexpected behavior of the Startup and Control Framework, check the following trace and log files:
·         dev_jcontrol
·         dev_<node name>, such as dev_dispatcher
·         jvm_<node name>.out, such as jvm_dispatcher.out
·         std_server<X>.out, e.g. std_server0.out
·         std_dispatcher.out
The trace and log files are stored in the work directory of an instance. This directory is called /usr/sap/<SID>/<instance name>/work.
dev_jcontrol : It is the trace file for the JControl process. dev_jcontrol is the most important trace file for problem messages when starting NetWeaver AS Java. The most recent messages are written at the end of the file.
dev_<node name> : It is the trace file for JLaunch processes. The trace file dev_<node name> is written for each started JLaunch process, and therefore for every dispatcher and server process. Examples are dev_dispatcher, dev_server0, dev_server1 etc.
jvm_<node name>.out : It is the output file for the Java Virtual Machine (JVM). Each JLaunch process represents a Java node, such as a dispatcher or a server, and therefore a JVM.
std_server<X>.out and std_dispatcher.out : These are the default output files for the started managers and services of the corresponding nodes.
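Because the most recent messages are at the end of these files, a quick first look is to print the tail of each one that exists. This is a minimal sketch: the function name show_java_startup_traces and the default of 20 lines are assumptions, and the file list covers only the example names mentioned above (add further server nodes as needed).

```shell
# Hypothetical helper: print the last N lines of each Java startup trace
# found in the work directory. $1 is the work directory, $2 is N (default 20).
show_java_startup_traces() {
  workdir="$1"
  lines="${2:-20}"
  for name in dev_jcontrol dev_dispatcher dev_server0 std_dispatcher.out std_server0.out; do
    f="$workdir/$name"
    if [ -f "$f" ]; then
      echo "== $name =="
      tail -n "$lines" "$f"
    fi
  done
}

# Usage:
#   show_java_startup_traces /usr/sap/<SID>/<instance name>/work 50
```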

All these trace and log files are very important for troubleshooting SAP systems, as they help:-
1.    To know why the system went down.
2.    To know why the system is not coming up/starting.
3.    To know the component in which the problem lies.
4.    To know the action that can be taken to bring the system back to a stable state.