Question

I've used Samurai and I can see there is no deadlock and several threads waiting but I can't seem to figure out exactly which lock is stalling the process. Can anyone help me out?

I'm not looking for people to guess what my problem is, more advice on how to alter the code to make it easier to trace the problem. I get through my first send and receive and then the program stalls when it should be sending the second message. I'm new to multithreading and this is my first time pulling a jstack

2013-04-02 23:43:12
Full thread dump OpenJDK Zero VM (22.0-b10 mixed mode):

"Attach Listener" daemon prio=10 tid=0x0037c880 nid=0x105b waiting on condition     [0x00000000]
java.lang.Thread.State: RUNNABLE

"Thread-3" prio=10 tid=0x0037c488 nid=0x1041 waiting on condition [0xa7ddd000]
java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0xab770958> (a     java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at org.dnsdojo.ryanhost.GA.MuPlusOne.RobotInterface.evaluate(RobotInterface.java:38)
    at org.dnsdojo.ryanhost.GA.MuPlusOne.RobotInterface.run(RobotInterface.java:69)
    at java.lang.Thread.run(Thread.java:722)

"Thread-1" prio=10 tid=0x0036ff10 nid=0x1036 waiting on condition [0xa7f5d000]
java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0xab770940> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at   java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at org.dnsdojo.ryanhost.GA.MuPlusOne.TwoWaySerialCommTest$SerialWriter.run(TwoWaySerialCommTest.java:229)
    at java.lang.Thread.run(Thread.java:722)

"Thread-0" daemon prio=10 tid=0x0036a1e8 nid=0x1035 runnable [0xa80dd000]
java.lang.Thread.State: RUNNABLE
    at gnu.io.RXTXPort.eventLoop(Native Method)
    at gnu.io.RXTXPort$MonitorThread.run(RXTXPort.java:1644)

"Service Thread" daemon prio=10 tid=0x00253440 nid=0x102b runnable [0x00000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00251988 nid=0x102a runnable [0x00000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x0020c880 nid=0x1029 in Object.wait() [0xa8ac1000]
java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xab718a88> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0xab718a88> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00209f18 nid=0x1028 in Object.wait() [0xa8c41000]
java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xab718b10> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0xab718b10> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00180fa8 nid=0x1022 in Object.wait() [0xb6848000]
java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xa90716a8> (a org.dnsdojo.ryanhost.GA.MuPlusOne.CandidateTest)
    at java.lang.Object.wait(Object.java:503)
    at org.dnsdojo.ryanhost.GA.MuPlusOne.MuPlusOneAlgorithm.runOnBot(MuPlusOneAlgorithm.java:120)
    - locked <0xa90716a8> (a org.dnsdojo.ryanhost.GA.MuPlusOne.CandidateTest)
    at org.dnsdojo.ryanhost.GA.MuPlusOne.MuPlusOneAlgorithm.initialFitness(MuPlusOneAlgorithm.java:72)
    - locked <0xab718bd0> (a org.dnsdojo.ryanhost.GA.MuPlusOne.MuPlusOneAlgorithm)
    at org.dnsdojo.ryanhost.GA.MuPlusOne.MuPlusOneAlgorithm.main(MuPlusOneAlgorithm.java:138)

"VM Thread" prio=10 tid=0x00204910 nid=0x1027 runnable

"VM Periodic Task Thread" prio=10 tid=0x00255130 nid=0x102c waiting on condition

JNI global references: 36

Adding code as requested The Serial communication class

    package org.dnsdojo.ryanhost.GA.MuPlusOne;

    import gnu.io.CommPort;
    import gnu.io.CommPortIdentifier;
    import gnu.io.SerialPort;
    import gnu.io.SerialPortEvent;
    import gnu.io.SerialPortEventListener;

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    /**
     * This version of the TwoWaySerialComm example makes use of the
     * SerialPortEventListener to avoid polling.
     *
     */
    public class TwoWaySerialComm
    {
        public TwoWaySerialComm()
        {
            super();
        }

        void connect ( String portName ) throws Exception
        {
            listPorts();
            CommPortIdentifier portIdentifier = CommPortIdentifier.getPortIdentifier(portName);
            if ( portIdentifier.isCurrentlyOwned() )
            {
                System.out.println("Error: Port is currently in use");
            }
            else
            {
                CommPort commPort = portIdentifier.open(this.getClass().getName(),2000);

                if ( commPort instanceof SerialPort )
                {
                    SerialPort serialPort = (SerialPort) commPort;
                    serialPort.setSerialPortParams(57600,SerialPort.DATABITS_8,SerialPort.STOPBITS_1,SerialPort.PARITY_NONE);

                    InputStream in = serialPort.getInputStream();
                    OutputStream out = serialPort.getOutputStream();

                    (new Thread(new SerialWriter(out))).start();

                    serialPort.addEventListener(new SerialReader(in));
                    serialPort.notifyOnDataAvailable(true);

                }
                else
                {
                    System.out.println("Error: Only serial ports are handled by this example.");
                }
            }
        }

        static void listPorts()
        {
            java.util.Enumeration<CommPortIdentifier> portEnum = CommPortIdentifier.getPortIdentifiers();
            while(portEnum.hasMoreElements())
            {
                    CommPortIdentifier portIdentifier = portEnum.nextElement();
                    if(portIdentifier == null)
                    {
                            System.out.println("No ports");
                    }
                    System.out.println("Available - " + portIdentifier.getName());
            }
        }

        /**
         * Handles the input coming from the serial port. A new line character
         * is treated as the end of a block in this example.
         */
        public static class SerialReader implements SerialPortEventListener
        {
            private InputStream in;
            private byte[] buffer = new byte[1024];

            public SerialReader ( InputStream in )
            {
                this.in = in;
            }

            public void serialEvent(SerialPortEvent arg0) {
                int data;

                try
                {
                    int len = 0;
                    while ( ( data = in.read()) > -1 )
                    {
                        if ( data == '\n' ) {
                            break;
                        }
                        buffer[len++] = (byte) data;
                    }
                    System.out.print(new String(buffer,0,len));
                }
                catch ( IOException e )
                {
                    e.printStackTrace();
                    System.exit(-1);
                }
            }

        }

        /** */
        public static class SerialWriter implements Runnable
        {
            OutputStream out;

            public SerialWriter ( OutputStream out )
            {
                this.out = out;
            }

            public void run ()
            {
                try
                {
                    int c = 0;
                    while ( ( c = System.in.read()) > -1 )
                    {
                        this.out.write(c);
                    }
                }
                catch ( IOException e )
                {
                    e.printStackTrace();
                    System.exit(-1);
                }
            }
        }



        public static void main ( String[] args )
        {
            try
           {
               (new TwoWaySerialComm()).connect("/dev/ttyS82");
           }
           catch ( Exception e )
           {
              TwoWaySerialComm.listPorts();
               e.printStackTrace();
           }
        }


    }

The RobotInterface class

package org.dnsdojo.ryanhost.GA.MuPlusOne;

import org.apache.log4j.Logger;
import org.apache.log4j.PropertyConfigurator;

public class RobotInterface implements Runnable
{
    // create a serial connection
    // transmit a string and check for response
    // wait for evaluation
    // take evaluation
    private CandidateTest candidate;
    private TwoWaySerialCommTest serialConnection;
    //private Random rng = new Random();

    protected static Logger logger = Logger.getLogger("Thread" + Thread.currentThread().getName());

    public RobotInterface(CandidateTest test, TwoWaySerialCommTest serialConnection)
    {
            this.candidate = test;
            this.serialConnection = serialConnection;
            PropertyConfigurator.configure("log4j.properties");
    }

    public void evaluate (Genome genome)
    {
            //send to robot and return fitness
            //genome.setFitness(rng.nextDouble());
            logger.debug("fitness is " + genome.getFitness());

                    try
        {
                            String s = candidate.getCandidate().toString();
                            System.out.println(s);
                            TwoWaySerialCommTest.lock.lock();
                            System.out.println(s);
            serialConnection.put(s);
            TwoWaySerialCommTest.inputAvailable.await();
            try
            {
                    candidate.getCandidate().setFitness(Float.parseFloat(serialConnection.take()));
            }
            catch(Exception e)
            {
                    e.printStackTrace();
            }


        }
        catch ( Exception e )
        {
            TwoWaySerialCommTest.listPorts();
            e.printStackTrace();
        }
                    finally
                    {
                            TwoWaySerialCommTest.lock.unlock();
                    }
    }




    public void run()
    {
            logger.debug("entering run of Robot Interface");
            logger.debug("Send Genome via serial and wait for a response");
            Genome testSubject = candidate.getCandidate();
            evaluate(testSubject);
            candidate.finished();
    }
}

To summarise the execution - I have a for loop which creates RobotInterface threads, each for a Genome which contains a bitstring. This genome reference is held in a candidateTest which is just a shared class for the RobotInterface and the calling thread to signal when finished.

The bitstring is sent to the robot by RobotInterface using the TwoWaySerialCommTest. The robot evaluates it and returns a fitness function which is parsed as a float by the SerialReader. The next String is then sent to the robot.

The lock is occuring after the first string has returned to the robot and the second string is being sent. I can see the first String makes it to the robot but the second string doesn't. The output is as follows -

initialFitness method
1101010101111111001100000101011100110000100001111111001000001001101101000011100101011000000100000000110110001110
5436 [main] DEBUG org.dnsdojo.ryanhost.GA.MuPlusOne.GeneticAlgorithm  - Testing candidate
1101010101111111001100000101011100110000100001111111001000001001101101000011100101011000000100000000110110001110
1101010101111111001100000101011100110000100001111111001000001001101101000011100101011000000100000000110110001110
5853 [Thread-2] DEBUG Threadmain  - entering run of Robot Interface
5869 [Thread-2] DEBUG Threadmain  - Send Genome via serial and wait for a response
5881 [Thread-2] DEBUG Threadmain  - fitness is 0.0
1101010101111111001100000101011100110000100001111111001000001001101101000011100101011000000100000000110110001110
1101010101111111001100000101011100110000100001111111001000001001101101000011100101011000000100000000110110001110
10
Output buffer after put
1101010101111111001100000101011100110000100001111111001000001001101101000011100101011000000100000000110110001110

Buffer isn't empty

initialFitness method
1110110101110000001100101100110001001010010101011110001101010010100100111011111000011101110000001110100111001111
16593 [main] DEBUG org.dnsdojo.ryanhost.GA.MuPlusOne.GeneticAlgorithm  - Testing candidate
1110110101110000001100101100110001001010010101011110001101010010100100111011111000011101110000001110100111001111
1110110101110000001100101100110001001010010101011110001101010010100100111011111000011101110000001110100111001111
16944 [Thread-3] DEBUG Threadmain  - entering run of Robot Interface
16953 [Thread-3] DEBUG Threadmain  - Send Genome via serial and wait for a response
16964 [Thread-3] DEBUG Threadmain  - fitness is 0.0
1110110101110000001100101100110001001010010101011110001101010010100100111011111000011101110000001110100111001111
1110110101110000001100101100110001001010010101011110001101010010100100111011111000011101110000001110100111001111
10
Output buffer after put
1110110101110000001100101100110001001010010101011110001101010010100100111011111000011101110000001110100111001111

Buffer isn't empty
Était-ce utile?

La solution

My approach is to eliminate the obviously unrelated threads. For you, that's likely to be

  • Attach Listener
  • Service Thread
  • Signal Dispatcher
  • Finalizer
  • Reference Handler
  • VM Thread
  • VM Periodic Task Thread

These are VM managed threads, ie, you didn't create them explicitly or implicitly.

That leaves Thread-x and the Main threads.

Looking at the stack traces, you can infer why things are WAITING based on your knowledge of the code. For example, it looks like the queue is waiting for a Condition to be met;

parking to wait for <0xab770958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

Conditions are signaled to wake so you might think about why a signal message isn't being sent. Because this condition is contained in the queue implementation, it feels unlikely that the problem isn't that you forgot to call signal. However, you may have forgotten to call something that would in turn call it? or inadvertently setup a situation where a condition is never met?

So I'd look at the flow of events with a view to ensuring anything that is locked is unlocked or for any conditions that would cause a live lock (bear in mind that the JVM/samauri can detect deadlocks but not live locks).

That's my general thoughts on how to solve it. If you wanted to post the code somewhere, it'd be helpful in actually debugging it / offering more insight how people might go about solving it.

Good luck!

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top