I hate to beat a dead horse here, but I raised the log level above the default of 3 and this particular machine still will not connect. I can telnet to the ports fine from this machine, and other servers on its LAN are able to reach our LEM device without issue.
I get two consistent error messages. The first is where the client can't seem to establish and keep the connection:
(Fri Aug 30 15:24:37 CDT 2013) DD:DEBUG MODE [NioSelector v23873] {NioComNetworkParent:52} Connecting to server using address: '172.20.0.167' and port: '37892';
(Fri Aug 30 15:24:37 CDT 2013) DD:DEBUG [NioSelector v23873] {NioComNetworkParent:52} Successful binding to port: 37893;
(Fri Aug 30 15:24:38 CDT 2013) DD:DEBUG MODE [NioComNetworkParent v24745] {BBS:DequeueToComm-1:28} Waiting on trigeoauth before sending message.;
(Fri Aug 30 15:24:38 CDT 2013) DD:DEBUG MODE [NioComNetworkParent v24745] {BBS:DequeueToComm-1:28} Setting kill timer for: sendPacketViaDataChannel;
(Fri Aug 30 15:24:45 CDT 2013) DD:DEBUG MODE [NioComNetworkParent v24745] {Timer-2:25} Waiting on trigeoauth before sending message.;
(Fri Aug 30 15:24:45 CDT 2013) DD:DEBUG MODE [NioComNetworkParent v24745] {Timer-2:25} Setting kill timer for: sendPacketViaDataChannel;
(Fri Aug 30 15:24:59 CDT 2013) EE:ERR [NioSelector v23873] {NioComNetworkParent:52} Connection status: Unable to complete nio connection to address 172.20.0.167/37892 Connection timed out: no further information;
(Fri Aug 30 15:24:59 CDT 2013) DD:DEBUG MODE [NioSelector v23873] {NioComNetworkParent:52} EXCEPTION: java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at com.trigeo.core.communications.nio.client.NioSelectorOnClient.initiateConnection(NioSelectorOnClient.java:200)
at com.trigeo.core.communications.nio.NioCenter.connect(NioCenter.java:329)
at com.trigeo.core.communications.nio.NioCenter.run(NioCenter.java:296)
at java.lang.Thread.run(Unknown Source)
at com.trigeo.util.TriGeoThread.run(TriGeoThread.java:57)
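One difference from my telnet test: telnet picks a random source port, while the log above shows the agent binding port 37893 before it connects to 37892, and the timeout comes from finishConnect. A closer reproduction might be to bind the same source port first. Here is a minimal sketch in Python, assuming "Successful binding to port: 37893" refers to the local source port. `try_connect` and its parameters are my own names, not anything from the agent, and the agent service would need to be stopped so the port is free.

```python
import socket

def try_connect(host, port, source_port=None, timeout=20.0):
    """Attempt a TCP connect, optionally from a fixed local source port.

    Returns (True, "connected") on success or (False, <error text>) on failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if source_port is not None:
            # Mimic the agent: bind the local end before connecting.
            sock.bind(("", source_port))
        sock.connect((host, port))
        return True, "connected"
    except OSError as exc:
        # Covers timeouts, refusals, and "address already in use" on bind.
        return False, f"{type(exc).__name__}: {exc}"
    finally:
        sock.close()

# Example, using the address and ports from the log above:
# print(try_connect("172.20.0.167", 37892))                     # like telnet
# print(try_connect("172.20.0.167", 37892, source_port=37893))  # like the agent
```

If the first call connects and the second times out, that would point at something between the two hosts treating source port 37893 differently. It would not prove anything about the agent itself.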
The second is where the agent makes the initial request to the appliance, but for whatever reason the data stream is broken. I can see the connection on the LEM device, but it eventually drops.
$ netstat -ano | grep 172.22.0.41
tcp 0 0 172.20.0.167:37890 172.22.0.41:37893 ESTABLISHED off (0.00/0/0)
(Fri Aug 30 15:30:04 CDT 2013) DD:DEBUG MODE [NioComNetworkParent v24745] {ComModuleSpop:20} bound to local port: 37893;
(Fri Aug 30 15:31:04 CDT 2013) EE:ERR [NioComNetworkParent v24745] {ComModuleSpop:20} EXCEPTION: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at com.trigeo.core.communications.common.ComNetworkParent.writeMessageToCommandChannel(ComNetworkParent.java:1217)
at com.trigeo.core.communications.common.ComNetworkParent.sendParentViaCommandChannelForResponse(ComNetworkParent.java:328)
at com.trigeo.core.communications.common.ComNetworkParent.installRequest(ComNetworkParent.java:247)
at com.trigeo.core.communications.nio.client.NioComNetworkParent.installRequest(NioComNetworkParent.java:107)
at com.trigeo.core.communications.common.ComModule.autoInstall(ComModule.java:550)
at com.trigeo.core.communications.common.ComModule.setUp(ComModule.java:364)
at com.trigeo.core.communications.spop.ComModuleSpop.run(ComModuleSpop.java:172)
at java.lang.Thread.run(Unknown Source)
at com.trigeo.util.TriGeoThread.run(TriGeoThread.java:57)
Is there something Java-related that I'm missing, perhaps? Another strange thing I noticed is that the CommDataQueue fills up with about 320 files of only about 32 KB each, for a total of 9.9 MB, and it seems to choke as well when attempting to send the queued data.
(Fri Aug 30 15:15:58 CDT 2013) WW:WARNING [BuffBytesOneReaderOneWriter v24761] {pool-1-thread-1:38} CommDataQueue 38 Queue file disk space has exceeded 10240 KBs which is the maximum allowed;
(Fri Aug 30 15:15:58 CDT 2013) WW:WARNING [BuffBytesOneReaderOneWriter v24761] {pool-1-thread-1:38} CommDataQueue 38 Buffer dump to queue file cancelled;
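For what it's worth, the numbers roughly line up: about 320 files at about 32 KB each comes to about 10240 KB, right at the limit in the warning. That would suggest the queue is simply hitting its cap because nothing is draining it. Below is a rough sketch for checking usage on disk, assuming the CommDataQueue is just a directory of files. `queue_usage_kb` and the example path are hypothetical; only the 10240 KB figure comes from the log.

```python
import os

MAX_QUEUE_KB = 10240  # limit reported in the agent's warning

def queue_usage_kb(queue_dir):
    """Return (file_count, total_size_in_KB) for regular files in queue_dir."""
    count = 0
    total_bytes = 0
    for entry in os.scandir(queue_dir):
        if entry.is_file():
            count += 1
            total_bytes += entry.stat().st_size
    return count, total_bytes / 1024

# Example (hypothetical path; point it at the agent's queue directory):
# count, used_kb = queue_usage_kb(r"C:\path\to\CommDataQueue")
# print(count, used_kb, used_kb >= MAX_QUEUE_KB)
```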
I'm getting tired of having to uninstall, trash the files from the registry, reboot, and re-install the agent on this machine.