Question

I ran into a problem with a process that showed the following stack:

6600:   ora_d006_LOOKUP
 ffffffff7addbbd0 __systemcall6 (3, ffffffff7d300440, 0, ffffffff7adc1268, d, fff7) + 24
 ffffffff7adcba74 pthread_sigmask (2000, 0, 0, 0, ffffffff7d300200, d) + 1c4
 00000001068ff3bc sslssalck (ffffffff7fffb138, 2, ffffffff7fffb070, 0, 3e8, 10c24d7e0) + 7c
 00000001069358e8 sltmarm (a00029810, 29810, 10c3f3ab0, 3f9, a00000000, 29810) + 88
 00000001069aa734 ltmdvp (8006689e, 3f9, 0, 10c55ba38, 10c3f8160, 10c3f34d0) + 154
 00000001068ff2a4 sslsstehdlr (e, 0, ffffffff7fffb570, 7fffff84, 10c3ed0d8, 10c24d7e0) + 224
 ffffffff7add7498 __sighndlr (e, 0, ffffffff7fffb570, 1068fcba0, 0, d) + c
 ffffffff7adcb02c call_user_handler (ffffffff7d300200, ffffffff7d300200, ffffffff7fffb570, c, 0, 0) + 3e0
 ffffffff7adcb238 sigacthandler (0, 0, ffffffff7fffb570, ffffffff7d300200, 0, ffffffff7af3e000) + 68
 --- called from signal handler with signal 0 (SIGEXIT) ---
 ffffffff7addad48 ioctl (10c3f80c0, bb8, 400, 10c426810, 10c6aae90, 2001420c) + c
 0000000109e47668 nteveque (10c40c940, bb8, ffffffff7fffca98, 1afbfb85a4, 1c, 98) + 28
 0000000109e3f0c0 ntevque (7, bb8, 10c2cbfd0, 10c40c940, ffffffff7fffca98, 10c2cbfd0) + 80
 0000000109d8e738 nsevwait (0, 0, 10c25cc00, 0, 10c25cc04, 10c3f7a60) + 1b8
 000000010092e7b4 ksnwait (10c25cc00, 6, 10c403fb0, 10c25c000, 10c25c, 10c000) + 54
 000000010072060c ksliwat (0, ffffffff7fffd8e8, 1770, 10c25b, 10c000, 0) + 140c
 0000000100704b28 kslwait (1770, ffffffff7fffd8e8, ffffffff7fffd8e8, ffffffff7fffd8e8, 0, 0) + e8
 00000001065707a0 kmdmai (1b1bfffe00, 10c2628e8, 1b02faf258, 10c26c190, 10c25b, 38000d000) + e40
 00000001063b0400 opirip (10a726000, 0, 380002, 380000, 38002a000, 38002a) + a80
 00000001035c59cc opidrv (32, 4, ffffffff7ffff590, 1ebb90, ffffffff7af45050, ffffffff7ffff9a0) + 30c
 000000010474117c sou2o (ffffffff7ffff568, 32, 4, ffffffff7ffff590, 10c000, 10b800) + 5c
 0000000100604f64 opimai_real (3, ffffffff7ffff838, ffffffff7ffffb60, ffffffff7ffffbb5, 0, 0) + 204
 0000000104757380 ssthrdmain (10c000, 3, 44dc00, 100604d60, 10c27c000, 10c27c) + 140
 0000000100604c74 main (3, ffffffff7ffff948, 0, ffffffff7ffff840, ffffffff7ffff950, ffffffff7d300200) + 134
 0000000100604b1c _start (0, 0, 0, 0, 0, 0) + 17c

This process dispatches requests from clients. During the incident, no new requests could get in, and the process consumed a lot of SYS CPU.

Running man ioctl gives me the prototype of the ioctl system call, but I don't think that is the same ioctl. The ioctl in the pstack output should be a userland function.

In the pstack:

--- called from signal handler with signal 0 (SIGEXIT) ---
ffffffff7addad48 ioctl (10c3f80c0, bb8, 400, 10c426810, 10c6aae90, 2001420c) + c

I wrote a small dtrace script.

pid$target::ioctl:entry
{
        printf("%s", probemod)
}

I get

3  82218                      ioctl:entry libc.so.1

so I think this ioctl comes from libc.so.

But I can't find a manual page for the ioctl in libc.so.

1. Where can I find the manual page for the ioctl in Solaris libc?

2. SIGEXIT is said to be a pseudo signal. How do I set up a signal handler for it, and how do I send SIGEXIT to a process so that I end up with a stack like the following?

  ...  my_handle_signal .... 
  --- called from signal handler with signal 0 (SIGEXIT) ---
  ... xxxx

Solution

Your ioctl on the /devices/pseudo/poll@0:poll device (or /dev/poll) seems to be handled by a kernel function from the common/io/devpoll.c file (online copy: http://fxr.watson.org/fxr/source/common/io/devpoll.c?v=OPENSOLARIS)

More precisely, by the dpioctl function:

dpioctl(dev_t dev, int cmd, intptr_t arg, int mode, cred_t *credp, int *rvalp)

zhihuifan, after checking your stack trace I see that your program had executed:

main() -> ... nteveque() -> ioctl()

Then the signal handler was called. I see no signals being sent from dpioctl, so I think the signal was sent by something external (another function, another program, or a user):

--- called from signal handler with signal 0 (SIGEXIT) ---

Then the user-space signal handler was called:

sigacthandler -> call_user_handler -> __sighndlr -> sslsstehdlr

sslsstehdlr then did a lot of work (ltmdvp, sltmarm, sslssalck, pthread_sigmask). As far as I know, POSIX ("2.4 Signal Concepts" in The Open Group Base Specifications Issue 6; IEEE Std 1003.1, 2004 Edition) allows a signal handler to call, directly or indirectly, only the functions listed in the following table:

The following table defines a set of functions that shall be either reentrant or non-interruptible by signals and shall be async-signal-safe. Therefore applications may invoke them, without restriction, from signal-catching functions:

... a huge list, but pthread_sigmask is not in it ...

All functions not in the above table are considered to be unsafe with respect to signals. .... when a signal interrupts an unsafe function and the signal-catching function calls an unsafe function, the behavior is undefined.
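
To illustrate what the quoted rule allows, here is a minimal sketch of a handler that stays within it: it only stores the signal number in a volatile sig_atomic_t and returns, and the real work happens later in normal code. The names record_signal, install_flag_handler and take_pending_signal are made up for this example; this is not how sslsstehdlr is written, and I am assuming a single-threaded program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Written by the handler, read by normal code. volatile sig_atomic_t is
   the type that may be safely assigned from inside a signal handler. */
static volatile sig_atomic_t pending_signal = 0;

/* Hypothetical handler: records the signal number and returns.
   It takes no locks, allocates nothing and calls no library function. */
static void record_signal(int signo)
{
    pending_signal = signo;
}

/* Install record_signal for signo. Returns 0 on success, -1 on error. */
static int install_flag_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = record_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Called from the normal event loop, outside any handler: returns the
   recorded signal number (0 if none) and clears it. A signal arriving
   between the read and the clear would be missed; a real program would
   block the signal around these two statements. */
static int take_pending_signal(void)
{
    int signo = pending_signal;

    pending_signal = 0;
    return signo;
}
```

The event loop would call install_flag_handler(SIGUSR1) once at startup and then check take_pending_signal() after each wait; any locking or timer work of the kind seen in the stack above would be done there rather than inside the handler.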

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow