Troubleshooting of SAP system:-
Steps to follow if SAP system goes down and is not coming up:-
1. File System check:-
Use the command “df -gt” on an AIX system to check the sizes of all file systems (FSs). Make sure that no FS is 100% full. If one is, free up space or extend the FS so that its usage drops below the threshold; at the very least it must no longer be at 100% for the system to function properly.
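The check above can be scripted. The following is a minimal sketch in Python, assuming the “df -gt” output has one header row, a %Used column that is the first field ending in “%”, and the mount point as the last field. The function name full_filesystems and the sample rows are made up for illustration.

```python
def full_filesystems(df_output, threshold=100):
    """Return (mount_point, percent_used) for file systems at or above threshold."""
    hits = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Assumption: the first field ending in '%' is %Used and the
        # last field is the mount point.
        percents = [f for f in fields if f.endswith("%") and f[:-1].isdigit()]
        if not percents:
            continue  # rows without a numeric usage value are ignored
        used = int(percents[0][:-1])
        if used >= threshold:
            hits.append((fields[-1], used))
    return hits


# Hypothetical sample output, not taken from a real system.
sample = """\
Filesystem    GB blocks      Used      Free %Used Mounted on
/dev/hd4           2.00      1.10      0.90   55% /
/dev/sapmntlv     20.00     20.00      0.00  100% /sapmnt
/dev/oraarchlv    50.00     46.00      4.00   92% /oracle/SID/oraarch
"""

print(full_filesystems(sample))      # file systems that are completely full
print(full_filesystems(sample, 90))  # file systems at 90% or more
```

Lowering the threshold (for example to 90) gives an early warning before a file system actually fills up.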
2. Network issues:-
Use the command “errpt | more” to find out whether there are any issues at the hardware/network level (OS level).
Use the command “errpt -a | more” for a more detailed analysis of the issue.
Use the command “df -gt” to check whether there are any mount issues for any FS on any server.
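To narrow down a long error report, the summary output of “errpt” can be filtered. The sketch below assumes the usual summary columns (IDENTIFIER, TIMESTAMP, T for type, C for class, RESOURCE_NAME, DESCRIPTION) and treats class H with type P as a permanent hardware error. The function names and the sample entries are hypothetical.

```python
def parse_errpt(summary):
    """Parse `errpt` summary lines into a list of dictionaries."""
    entries = []
    for line in summary.splitlines():
        parts = line.split(None, 5)  # the description may contain spaces
        if len(parts) < 6 or parts[0] == "IDENTIFIER":
            continue  # skip the header row and incomplete lines
        ident, stamp, etype, eclass, resource, desc = parts
        entries.append({
            "id": ident,
            "timestamp": stamp,
            "type": etype,      # assumed: P = permanent, T = temporary, I = informational
            "class": eclass,    # assumed: H = hardware, S = software, O = operator
            "resource": resource,
            "description": desc.strip(),
        })
    return entries


def hardware_errors(entries):
    """Keep only permanent errors of class hardware."""
    return [e for e in entries if e["class"] == "H" and e["type"] == "P"]


# Hypothetical sample, identifiers and descriptions are made up.
sample = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
AAAAAAAA   0612093024 P H ent0           ETHERNET DOWN
BBBBBBBB   0612090011 I O errdemon       ERROR LOGGING TURNED ON
"""

for entry in hardware_errors(parse_errpt(sample)):
    print(entry["timestamp"], entry["resource"], entry["description"])
```

Any entry reported here is a candidate for a closer look with “errpt -a”.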
3. Tablespace check:-
Make sure no tablespace is 100% full.
If you are not able to log in to the SAP system, check the tablespaces with BRTOOLS.
If any tablespace is 100% full, extend it immediately using BRTOOLS, after first checking the free space in the FS from which the tablespace takes its space.
4. WorkProcess check:-
Use the command “dpmon pf=<Instance Profile>” in the profile directory (cdpro) as user <sid>adm to check whether all work processes (WPs) are occupied. If all the WPs are in “In Hold” or “Running” status, no WP is in waiting state and no new requests will be processed. In that case, check whether something is stuck, and kill or restart the affected WPs so that the system can accept new requests again.
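The decision rule above (no WP in waiting state means no new requests) can be expressed as a small helper. This is only a sketch: it takes a list of status strings that you have read from the dpmon display and assumes that a free WP has a status beginning with “Wait”. The function name and the status lists are hypothetical.

```python
from collections import Counter


def work_process_summary(statuses):
    """Summarise WP statuses and flag a system with no free work process."""
    counts = Counter(s.strip().lower() for s in statuses)
    # Assumption: free work processes are shown with a status starting with "wait".
    free = sum(n for status, n in counts.items() if status.startswith("wait"))
    return {
        "total": len(statuses),
        "free": free,
        "saturated": len(statuses) > 0 and free == 0,
        "by_status": dict(counts),
    }


# Hypothetical status lists.
print(work_process_summary(["Running", "In Hold", "Running"]))
print(work_process_summary(["Running", "Waiting", "Waiting"]))
```

When "saturated" is true, every WP is busy and the stuck ones need to be identified before killing or restarting anything.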
5. System Start/Stop logs:-
If somebody stops or starts the system, the start/stop logs are written to the home directory of user <sid>adm. If an attempt to start or stop the system fails, check these logs as well for further analysis.
6. DB status check:-
Check whether the DB is up and running:-
Check whether the ora processes (Oracle background processes) are running, using the command:-
ps -ef | grep ora_
You can also go to sqlplus and use the command “select status from v$instance;” to check the status of the DB.
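The process check can be automated by scanning the “ps -ef” output for the core Oracle background processes. The sketch below assumes that the processes are named ora_<name>_<SID> and that pmon, smon, dbw0, lgwr and ckpt are the ones worth checking. The function name, the SID “PRD” and the sample lines are hypothetical.

```python
import re

# Assumption: these background processes must exist for a running instance.
REQUIRED = ("pmon", "smon", "dbw0", "lgwr", "ckpt")


def missing_oracle_processes(ps_output, sid):
    """Return the required background processes not found for the given SID."""
    pattern = r"\bora_([a-z0-9]+)_%s\b" % re.escape(sid)
    running = set(re.findall(pattern, ps_output))
    return [name for name in REQUIRED if name not in running]


# Hypothetical `ps -ef | grep ora_` output.
sample = """\
  oraprd  1001     1   0   Jun 12      -  0:05 ora_pmon_PRD
  oraprd  1002     1   0   Jun 12      -  0:09 ora_smon_PRD
  oraprd  1003     1   0   Jun 12      -  0:21 ora_dbw0_PRD
  oraprd  1004     1   0   Jun 12      -  0:17 ora_lgwr_PRD
  prdadm  2001  1999   0 10:15:02  pts/0  0:00 grep ora_
"""

print(missing_oracle_processes(sample, "PRD"))
```

An empty list means all checked processes are present; the grep process itself is not counted because it does not match the ora_<name>_<SID> pattern.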
7. DB connectivity:-
Check the DB connectivity of the ABAP system using the command “R3trans -d” as user <sid>adm. If everything is fine, the output shows return code 0000. If the return code is 0012 or some other non-zero code, check the trans.log file to find the root cause of the problem. This file is created in the directory from which you run the command.
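For use in a monitoring script, the return code can be read from the command output. The following sketch assumes that the output ends with a line of the form “R3trans finished (0000).” and extracts the four-digit code from it. The function names and the sample strings are illustrative only.

```python
import re


def r3trans_return_code(output):
    """Extract the return code from R3trans output, or None if not found."""
    # Assumption: the final line looks like "R3trans finished (0012)."
    match = re.search(r"R3trans finished \((\d{4})\)", output)
    return int(match.group(1)) if match else None


def db_connection_ok(output):
    """True only when R3trans reported return code 0000."""
    return r3trans_return_code(output) == 0


# Hypothetical outputs of "R3trans -d".
good = "R3trans finished (0000).\n"
bad = "R3trans finished (0012).\n"

print(db_connection_ok(good))
print(r3trans_return_code(bad))
```

If db_connection_ok returns False, trans.log in the current directory is the next place to look.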
8. SAPPFPAR check:-
Check whether all memory parameters on the system are fine using the command “sappfpar check pf=<Instance Profile>” as user <sid>adm. This command checks the memory for the shared pools from which all shared buffers take their space. Warnings in the output should not prevent the system from starting, but if the output contains errors, the system will not come up until the memory parameters are adapted accordingly.
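The warning/error distinction can be checked mechanically. The sketch below assumes that the sappfpar output contains summary lines of the form “Errors detected....: N” and “Warnings detected....: N”; if your release prints the summary differently, the pattern has to be adapted. The function names and the sample text are hypothetical.

```python
import re


def sappfpar_counts(output):
    """Return the error and warning counts from sappfpar output (None if absent)."""
    counts = {}
    for key in ("Errors", "Warnings"):
        # Assumed format: "<key> detected" + filler dots + ":" + number
        match = re.search(r"%s detected\.*\s*:\s*(\d+)" % key, output)
        counts[key.lower()] = int(match.group(1)) if match else None
    return counts


def memory_parameters_ok(output):
    """Warnings are tolerated; only errors block the system start."""
    return sappfpar_counts(output)["errors"] == 0


# Hypothetical summary section of the output.
sample = """\
Errors detected..................:    1
Warnings detected................:    2
"""

print(sappfpar_counts(sample))
print(memory_parameters_ok(sample))
```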
9. Other commonly used SAP system logs for troubleshooting purposes:-
Check the SAP system log files in the directory g:\usr\sap\<sid>\<instance no.>\work. You should also check the contents of the developer traces (dev_ms (message server trace), dev_disp (dispatcher trace), and dev_w<m> (work process trace)) at this point.
Check the trace files of the individual SAP processes:
· dev_ms: Developer trace for the message server
· dev_rd: Developer trace for the gateway
· dev_disp: Developer trace for the dispatcher
· dev_w<m> (m is the work process number): Developer trace for the work processes
If you can still log on to the SAP system, check the system log using transaction SM21.
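When a work directory contains many files, it helps to map each developer trace to its component. The sketch below encodes only the four naming rules listed above; the function name describe_trace is made up, and any other file name is simply reported as unknown.

```python
import re

# Naming rules taken from the list above.
TRACE_COMPONENTS = [
    (re.compile(r"^dev_ms$"), "message server"),
    (re.compile(r"^dev_rd$"), "gateway"),
    (re.compile(r"^dev_disp$"), "dispatcher"),
    (re.compile(r"^dev_w(\d+)$"), "work process"),
]


def describe_trace(filename):
    """Return the component a developer trace belongs to, or None if unknown."""
    for pattern, component in TRACE_COMPONENTS:
        match = pattern.match(filename)
        if match:
            if match.groups():  # dev_w<m>: append the work process number
                return "%s %s" % (component, match.group(1))
            return component
    return None


for name in ["dev_ms", "dev_rd", "dev_disp", "dev_w12", "unknown.log"]:
    print(name, "->", describe_trace(name))
```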
10. Database logs (Oracle):-
All significant events, such as starting and stopping the database, and error messages are recorded in the file \oracle\<SID>\saptrace\background\ALRT.LOG. The alert log may also have a name such as alert_<sid>.log.
Detailed information about errors is logged in the Oracle trace file \oracle\<SID>\saptrace\usertrace\Ora<no>.trc. The number of the relevant trace file is referenced in the alert log itself.
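A quick way to get an overview of a large alert log is to count the Oracle error codes it contains. The sketch below relies only on the standard “ORA-” plus five digits format of Oracle error codes; the function name and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Oracle error codes have the form ORA-NNNNN.
ORA_ERROR = re.compile(r"\bORA-\d{5}\b")


def ora_error_counts(alert_log_text):
    """Count how often each ORA- error code appears in the alert log text."""
    return Counter(ORA_ERROR.findall(alert_log_text))


# Hypothetical alert log excerpt.
sample = """\
ORA-00257: archiver error. Connect internal only, until freed.
ORA-00257: archiver error. Connect internal only, until freed.
ORA-01653: unable to extend table
"""

for code, count in ora_error_counts(sample).most_common():
    print(code, count)
```

In practice the text would be read from the alert log file named above; the most frequent code is usually a good starting point for the analysis.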
11. Java system logs for troubleshooting purposes:-
Java uses the Startup and Control Framework for its startup. In the case of an error or unexpected behavior of the Startup and Control Framework, it is important to check the following trace and log files:
· dev_jcontrol
· dev_<node name>, such as dev_dispatcher
· jvm_<node name>.out, such as jvm_dispatcher.out
· std_server<X>.out, e.g. std_server0.out
· std_dispatcher.out
The trace and log files are stored in the work directory of an instance. This directory is called /usr/sap/<SID>/<instance name>/work.
dev_jcontrol: The trace file for the JControl process. dev_jcontrol is the most important trace file for problem messages when starting NetWeaver AS Java. The most recent messages are written at the end of the file.
dev_<node name>: The trace file for the JLaunch processes. A dev_<node name> trace file is written for each started JLaunch process, and therefore for every dispatcher and server process. Examples are dev_dispatcher, dev_server0, dev_server1 etc.
jvm_<node name>.out: The output file of the Java Virtual Machine (JVM). Each JLaunch process represents a Java node, such as a dispatcher or a server, and therefore a JVM.
std_server<X>.out and std_dispatcher.out: The default output files for the started managers and services of the corresponding nodes.
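Because the most recent messages are at the end of these files, reading only the last lines is usually enough for a first look. The following is a minimal sketch of such a helper; the function name tail_lines is made up, and the demonstration uses a temporary file rather than a real dev_jcontrol.

```python
import os
import tempfile
from collections import deque


def tail_lines(path, count=20):
    """Return the last `count` lines of a text file without trailing newlines."""
    with open(path, "r", errors="replace") as handle:
        # deque with maxlen keeps only the newest lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


# Demonstration with a temporary stand-in for a trace file.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("".join("message %d\n" % i for i in range(1, 101)))
print(tail_lines(tmp.name, 3))
os.remove(tmp.name)
```

On a real system the path would point into the work directory, for example to dev_jcontrol or std_server0.out.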


All these trace and log files are very important for troubleshooting SAP systems, as they help:-
1. To know why the system went down.
2. To know why the system is not coming up/starting.
3. To know the component where the problem lies.
4. To know the action that can be taken to bring the system back to a stable state.