
Using Truss <http://www.kevlo.com/~ebs/docs/truss.html>



 
    

Using Truss

Truss is one of the best debugging commands in the administrator's toolkit. It traces the system calls a process makes and the signals it receives. Here are a few ways to use it.

Note: in very rare circumstances, trussing an active process can be obtrusive and affect that running process.

    Ways to use truss:
  • Actively trussing a running process

    In this situation, you may want to find out what state a process is in. You can do this by running the command:

    # truss -p PID

  • Dumping a truss log while running a process

    This is probably one of the first things you should do if you cannot figure out why a process is aborting or simply not running. Execute truss followed by the command that is failing, e.g.:

    # truss -vall -f -o /tmp/truss.log nslookup sarah

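    In this case, we are capturing to a log file (/tmp/truss.log) everything that the command nslookup is doing. This includes the files it tries to open as well as its reads and writes. The -f option also traces any child processes, and -vall prints the contents of structures passed to system calls.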

    If you examine the log file, you will see certain highlights like:

    1. open("/etc/resolv.conf", O_RDONLY) = 3
    2. open("/home/sextone/.nslookuprc", O_RDONLY) Err#2 ENOENT
    3. open("/dev/udp", O_RDWR) = 3
    4. ioctl(3, I_PUSH, "sockmod") = 0
    5. write(1, " 1 0 . 2 0 . 0 . 5 7".., 14) = 14
    6. _exit(0)

    Explanation of individual lines:

    1. nslookup is trying to open /etc/resolv.conf as read only. This makes sense since this is the location of the name server addresses. In order to have DNS you have to have servers in this file. You know it successfully opened this file because of the = 3 part. This means it opened the file and assigned it a file handle (file descriptor) of 3.
    2. Hey, you learn something new every day. Who would have thought one could have a resource configuration file called .nslookuprc in your home directory! Since I did not make one, it can't find it.

      It is returning an error: Err#2 ENOENT. We better check out what that abbreviation means.

      # grep ENOENT /usr/include/sys/*
      errno.h:#define ENOENT 2 /* No such file or directory */

      Excellent! This makes sense. Remember the errno.h file.

    3. Okay, nslookup is now trying to open /dev/udp. This must mean that it uses UDP rather than TCP for the lookup.
    4. IOCTL, sockmod... hmmm. Sounds like a socket connection to the name server. Not sure though.
    5. nslookup is writing the result of the lookup of sarah, which is 10.20.0.57, to file descriptor 1 (standard output).
    6. _exit(0). The process exits with status 0, meaning success.
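
The Err# lines, like line 2 in the walkthrough above, are often the quickest way into a log. The following is a minimal sketch, assuming the log uses the "Err#<number> <NAME>" format shown in the excerpt; the function name failed_calls and the sample file are illustrative only and are not part of truss:

```shell
#!/bin/sh
# failed_calls LOGFILE: print every truss log line that ended in an error.
# Assumes the "Err#<number> <NAME>" suffix format shown in the excerpt above.
failed_calls() {
    grep 'Err#[0-9]' "$1"
}

# Illustrative sample built from the excerpt above (not a real capture).
sample=${TMPDIR:-/tmp}/truss.sample.$$
cat > "$sample" <<'EOF'
open("/etc/resolv.conf", O_RDONLY) = 3
open("/home/sextone/.nslookuprc", O_RDONLY) Err#2 ENOENT
open("/dev/udp", O_RDWR) = 3
EOF

failed_calls "$sample"
rm -f "$sample"
```

Against a real log you would call it as: failed_calls /tmp/truss.log. Not every failed call is a problem (the missing .nslookuprc above is harmless), so treat the result as a list of candidates.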

    Another useful tactic, especially on a very big truss log, is to grep for open. This way, you can see which files the process is trying to open.
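
One way to do the grep for open is sketched below. It assumes the calls appear in the log as open("path", ...), as in the excerpt earlier; variants such as open64 would need a wider pattern. The function name opened_paths is made up for this illustration:

```shell
#!/bin/sh
# opened_paths LOGFILE: print the path argument of each open() call,
# sorted, with duplicates removed. Both successful and failed opens are listed.
# Assumes lines of the form: open("path", FLAGS) ...
opened_paths() {
    sed -n 's/.*open("\([^"]*\)".*/\1/p' "$1" | sort -u
}
```

For example: opened_paths /tmp/truss.log. If you only want the raw lines, plain "grep open /tmp/truss.log" does the job; the sed form just trims each line down to the file name.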

    To capture a log from a process that is already running, such as the in.named daemon, run "ps -aef | grep in.named" and get the process ID number for in.named. Then, run "truss -vall -f -o /tmp/outfile -p PID" (where PID is the process ID number for in.named).

    The output file (here /tmp/outfile, but you can put it anywhere you want) should contain a list of all system calls made by the in.named process.

    When the process dies, truss should report the failure code at the end of its output. Something like the following:

       Incurred fault #6, FLTBOUNDS  %pc = 0xEF2617FC
           siginfo: SIGSEGV SEGV_MAPERR addr=0x00000000
         Received signal #11, SIGSEGV [caught]
           siginfo: SIGSEGV SEGV_MAPERR addr=0x00000000
    

    You may be able to determine from the truss output alone what's wrong (it could show that the process was trying to access a certain file, or directory, but did not have access, for example). This would probably be listed a few lines above the error/fault section of the truss output.
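
To look at those few lines above the fault without paging through the whole log, something like the following can help. It is a sketch only: it assumes the fault line contains the text "Incurred fault" as in the sample above, and it needs an awk that supports -v (on older Solaris systems that is nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk). The function name before_fault is hypothetical:

```shell
#!/bin/sh
# before_fault LOGFILE [N]: print the N lines (default 5, must be >= 1)
# that precede the first "Incurred fault" line, then the fault line itself.
before_fault() {
    awk -v n="${2:-5}" '
        /Incurred fault/ {
            start = NR - n
            if (start < 1) start = 1
            # Replay the buffered lines in their original order.
            for (i = start; i < NR; i++) print buf[i % n]
            print
            exit
        }
        # Keep only the most recent n lines in a small ring buffer.
        { buf[NR % n] = $0 }
    ' "$1"
}
```

For example: before_fault /tmp/outfile 10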

    If nothing is obvious from the truss output, you can use the "%pc" entry, which is listed as part of the fault/failure code section of the truss output, as input to adb to determine which instruction failed. For example, you could run:

     adb -k /usr/sbin/in.named core
     0xEF2617FC/i
     send_msg+0x44:     save    %sp, -0x68, %sp
    

    You should get output showing the instruction which failed (here a save instruction at send_msg+0x44, that is, inside the send_msg function).

    You'll need to use $q to quit out of adb.
