There is a potential deadlock issue with synchronous asyn port drivers (ASYN_CANBLOCK=0) and asyn device support. Thanks to Ambroz Bizjak of CosyLab for finding this problem. The problem occurs under the following conditions:
Note that the deadlock will not occur unless the following 3 conditions are met:
Because the problem only occurs when the above conditions are met, the problem has been very rare in practice.
I don't see a simple solution to this problem. Drivers need to take a mutex to protect data structures when they are accessed for doing callbacks. However, the problem can be worked around by either setting the driver to be asynchronous (ASYN_CANBLOCK=1) or by enabling ring buffers on any stringin or waveform records that are in the same lockset as asyn output records.
The problem can be reproduced using the following files in testAsynPortDriverApp:
asynPortDriver was not returning an error when undefined values were read from the parameter library. This could cause output records to contain undefined data because the initial read of the driver in device support initialization did not return an error when it should have.
As reported by Perter Müller, some NULL-pointer dereferences were added to asyn/asynGpib/asynGpib.c in R4-7. The following change fixes the problem:
--- asyn/asynGpib/asynGpib.c 28 May 2008 18:58:50 -0000 1.43
+++ asyn/asynGpib/asynGpib.c 19 Oct 2009 13:58:50 -0000
@@ -410,12 +410,12 @@
     if(status!=asynSuccess) return status;
     if(pgpibPvt->eoslen==1 && nt>0) {
         if(data[nt-1]==pgpibPvt->eos) {
-            *eomReason |= ASYN_EOM_EOS;
+            if (eomReason) *eomReason |= ASYN_EOM_EOS;
             nt--;
         }
     }
     if(nt<maxchars) data[nt] = 0;
-    if(nt==maxchars) *eomReason |= ASYN_EOM_CNT;
+    if((nt==maxchars) && eomReason) *eomReason |= ASYN_EOM_CNT;
     *nbytesTransfered = (size_t)nt;
     pasynOctetBase->callInterruptUsers(pasynUser,pgpibPvt->pasynPvt,
         data,nbytesTransfered,eomReason);
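The pattern behind this fix is to treat an output pointer as optional and test it before every write. The following is a minimal standalone sketch of that idea, not the asynGpib code itself: finish_read, EOM_CNT and EOM_EOS are hypothetical stand-ins chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the asyn end-of-message flags. */
#define EOM_CNT 0x1
#define EOM_EOS 0x2

/* Strip a trailing one-character terminator from data[0..nt) and record
 * why the read ended. eomReason may be NULL when the caller does not
 * care about the reason, so every write through it is guarded. */
size_t finish_read(char *data, size_t nt, size_t maxchars,
                   char eos, int *eomReason)
{
    if (nt > 0 && data[nt - 1] == eos) {
        if (eomReason) *eomReason |= EOM_EOS;
        nt--;
    }
    /* Terminate the string only when there is room for the NUL. */
    if (nt < maxchars) data[nt] = 0;
    if (nt == maxchars && eomReason) *eomReason |= EOM_CNT;
    return nt;
}
```

A caller that is not interested in the reason can pass NULL for the last argument and the function still returns the stripped length.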
A problem was introduced in R4-11 by not starting the autoconnect process until iocInit(), and operations that do not use the XXXSyncIO functions thus fail before iocInit(). This means, for example, that calls to asynSetOption() to set serial port parameters fail if done in a startup script before iocInit(). This is fixed in R4-12.
There is a bug with autoconnecting to devices. The following change to asynManager.c fixes the problem:
@@ -605,7 +607,7 @@
         epicsMutexUnlock(pport->asynManagerLock);
         connectAttempt(&pdevice->dpc);
         epicsMutexMustLock(pport->asynManagerLock);
-        pport->dpc.autoConnectActive = FALSE;
+        pdevice->dpc.autoConnectActive = FALSE;
     }
     return pdevice->dpc.connected;
 }
The change that was done in R4-9 with direct calls to dbScanLock and process in the interrupt callback functions can lead to deadlocks in some unusual circumstances. This is fixed in R4-10.
There is a null pointer dereference problem for all device support when SCAN=I/O Intr and the asyn port cannot be found.
There is a buffer overflow problem if NRRD is set to more than 40 in ASCII input mode.
Does not yet support new timeout semantics (timeout<0 means wait forever, timeout=0 means return characters immediately available). Support for timeout values less than zero will be part of the next release.
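As an illustration of the timeout semantics described above, the sketch below maps a timeout in seconds onto the millisecond argument of POSIX poll(), where a negative value means wait forever and zero means return immediately. The function names timeout_to_poll_ms and wait_readable are hypothetical and are not part of asyn; this is one possible mapping, not the driver's implementation.

```c
#include <limits.h>
#include <poll.h>

/* Convert a timeout in seconds to a poll() timeout in milliseconds.
 *   timeout <  0 : wait forever (poll() takes -1 for this)
 *   timeout == 0 : return only what is immediately available
 *   timeout >  0 : wait at most that long */
int timeout_to_poll_ms(double timeout)
{
    double ms;

    if (timeout < 0.0) return -1;
    if (timeout == 0.0) return 0;
    ms = timeout * 1000.0;
    /* Do not let a very short positive wait degrade into a non-blocking poll. */
    if (ms < 1.0) return 1;
    if (ms > (double)INT_MAX) return INT_MAX;
    return (int)ms;
}

/* Wait until fd is readable. Returns the poll() result:
 * > 0 readable, 0 timed out, < 0 error. */
int wait_readable(int fd, double timeout)
{
    struct pollfd p;

    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;
    return poll(&p, 1, timeout_to_poll_ms(timeout));
}
```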
The wrong interrupt callback routine was being called.
UDF is not set false when the VAL field is modified.
Index: devAsynFloat64.c
===================================================================
RCS file: /net/phoebus/epicsmgr/cvsroot/epics/modules/soft/asyn/asyn/devEpics/devAsynFloat64.c,v
retrieving revision 1.18
retrieving revision 1.20
diff -u -r1.18 -r1.20
--- devAsynFloat64.c 13 Apr 2006 17:16:46 -0000 1.18
+++ devAsynFloat64.c 11 May 2006 23:25:15 -0000 1.20
@@ -41,13 +41,6 @@
 #include "asynFloat64.h"
 #include <epicsExport.h>
 
-typedef enum {
-    typeAiFloat64,
-    typeAiFloat64Average,
-    typeAiFloat64Interrupt,
-    typeAoFloat64
-}asynAnalogDevType;
-
 typedef struct devPvt{
     dbCommon *pr;
     asynUser *pasynUser;
@@ -56,7 +49,6 @@
     void *float64Pvt;
     void *registrarPvt;
     int canBlock;
-    asynAnalogDevType devType;
     epicsMutexId mutexId;
     asynStatus status;
     int gotValue;
@@ -303,7 +295,7 @@
     dbCommon *pr = pPvt->pr;
 
     asynPrint(pPvt->pasynUser, ASYN_TRACEIO_DEVICE,
-        "%s devAsynFloat64::interruptCallbackInput new value=%lu\n",
+        "%s devAsynFloat64::interruptCallbackAverage new value=%f\n",
         pr->name, value);
     epicsMutexLock(pPvt->mutexId);
     pPvt->numAverage++;
@@ -399,7 +391,7 @@
     devPvt *pPvt;
 
     status = initCommon((dbCommon *)pai,&pai->inp,
-        0,interruptCallbackInput);
+        0,interruptCallbackAverage);
     if (status != asynSuccess) return 0;
     pPvt = pai->dpvt;
     status = pPvt->pfloat64->registerInterruptUser(
@@ -417,13 +409,16 @@
     devPvt *pPvt = (devPvt *)pai->dpvt;
 
     epicsMutexLock(pPvt->mutexId);
-    if (pPvt->numAverage == 0) pPvt->numAverage = 1;
+    if (pPvt->numAverage == 0)
+        pPvt->numAverage = 1;
+    else
+        pai->udf = 0;
     pai->val = pPvt->sum/pPvt->numAverage;
     pPvt->numAverage = 0;
     pPvt->sum = 0.;
     epicsMutexUnlock(pPvt->mutexId);
     asynPrint(pPvt->pasynUser, ASYN_TRACEIO_DEVICE,
-        "%s devAsynAnalog::callbackAiAverage val=%f\n",
+        "%s devAsynFloat64::callbackAiAverage val=%f\n",
         pai->name, pai->val);
     return 2;
 }
NULL pointer dereference. How this one slipped through testing is quite surprising.
Index: asyn/drvAsynSerial/drvAsynIPPort.c
===================================================================
RCS file: /net/phoebus/epicsmgr/cvsroot/epics/modules/soft/asyn/asyn/drvAsynSerial/drvAsynIPPort.c,v
retrieving revision 1.30
retrieving revision 1.31
diff -u -r1.30 -r1.31
--- asyn/drvAsynSerial/drvAsynIPPort.c 25 Apr 2006 17:50:02 -0000 1.30
+++ asyn/drvAsynSerial/drvAsynIPPort.c 11 May 2006 21:12:45 -0000 1.31
@@ -11,7 +11,7 @@
 ***********************************************************************/
 
 /*
- * $Id: KnownProblems.html,v 1.39 2009-10-19 14:05:49 norume Exp $
+ * $Id: KnownProblems.html,v 1.39 2009-10-19 14:05:49 norume Exp $
  */
 
 /* Previous versions of drvAsynIPPort.c (1.29 and earlier, asyn R4-5 and earlier)
@@ -386,7 +386,7 @@
         status = asynError;
     }
 #endif
-    *gotEom = 0;
+    if (gotEom) *gotEom = 0;
 #ifdef USE_POLL
     {
         struct pollfd pollfd;
Index: asyn/drvAsynSerial/drvAsynSerialPort.c
===================================================================
RCS file: /net/phoebus/epicsmgr/cvsroot/epics/modules/soft/asyn/asyn/drvAsynSerial/drvAsynSerialPort.c,v
retrieving revision 1.34
retrieving revision 1.35
diff -u -r1.34 -r1.35
--- asyn/drvAsynSerial/drvAsynSerialPort.c 3 Apr 2006 23:38:19 -0000 1.34
+++ asyn/drvAsynSerial/drvAsynSerialPort.c 11 May 2006 21:12:45 -0000 1.35
@@ -11,7 +11,7 @@
 ***********************************************************************/
 
 /*
- * $Id: KnownProblems.html,v 1.39 2009-10-19 14:05:49 norume Exp $
+ * $Id: KnownProblems.html,v 1.39 2009-10-19 14:05:49 norume Exp $
  */
 
 #include <string.h>
@@ -778,7 +778,7 @@
 #endif
     }
     tty->timeoutFlag = 0;
-    *gotEom = 0;
+    if (gotEom) *gotEom = 0;
     for (;;) {
 #ifdef vxWorks
         /*
If a client calls asynCommon->connect when the asyn port is already connected to the IP port, then the asyn port will be disconnected from the IP port for all clients. The correct behavior in this case is to simply return an asynError status.
The following change can be made to drvAsynIPPort.c to fix the problem:
corvette> cvs diff -rR4-5 drvAsynIPPort.c
Index: drvAsynIPPort.c
===================================================================
RCS file: /net/phoebus/epicsmgr/cvsroot/epics/modules/soft/asyn/asyn/drvAsynSerial/drvAsynIPPort.c,v
retrieving revision 1.27
retrieving revision 1.28
diff -u -r1.27 -r1.28
--- drvAsynIPPort.c 3 Apr 2006 23:38:19 -0000 1.27
+++ drvAsynIPPort.c 17 Apr 2006 15:36:40 -0000 1.28
@@ -11,7 +11,7 @@
 ***********************************************************************/
 
 /*
- * $Id: KnownProblems.html,v 1.39 2009-10-19 14:05:49 norume Exp $
+ * $Id: KnownProblems.html,v 1.39 2009-10-19 14:05:49 norume Exp $
  */
 
 #include <string.h>
@@ -206,6 +206,11 @@
      * Sanity check
      */
     assert(tty);
+    if (tty->fd >= 0) {
+        epicsSnprintf(pasynUser->errorMessage,pasynUser->errorMessageSize,
+            "%s: Link already open!", tty->serialDeviceName);
+        return asynError;
+    }
     asynPrint(pasynUser, ASYN_TRACE_FLOW,
         "Open connection to %s\n", tty->serialDeviceName);
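The guard added by this change can be illustrated outside the driver. The sketch below is a simplified stand-in, assuming a descriptor of -1 means "not connected"; linkStatus, link_t and link_connect are hypothetical names, and the standard snprintf is used in place of epicsSnprintf.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for asynStatus. */
typedef enum { linkSuccess, linkError } linkStatus;

typedef struct {
    int fd;             /* -1 while the link is closed */
    const char *name;   /* device name used in error messages */
} link_t;

/* Attach newFd to the link. A second connect on an open link is rejected
 * with an error message and leaves the existing connection untouched,
 * instead of tearing it down for every other client. */
linkStatus link_connect(link_t *link, int newFd, char *errBuf, size_t errSize)
{
    if (link->fd >= 0) {
        snprintf(errBuf, errSize, "%s: Link already open!", link->name);
        return linkError;
    }
    link->fd = newFd;
    return linkSuccess;
}
```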
This fails for fast processors like the 2700 and 5100.
UDF is not set false when the VAL field is modified.
Device support is not returning 2 (do not convert) for ai records when it should. This means that the VAL field is being set back to 0 by the record after device support writes to it.
The record sometimes does not read the current input and output EOS values from the driver when it connects.
If read reads maxchars, it forces the last character to be 0 and returns asynOverflow if it wasn't.
These do not properly set an error message in asynUser.errorMessage when they return asynError.
This calls setOption for clocal. This only works on vxWorks because vxWorks uses the name CLOCAL for what POSIX calls CRTSCTS.
If a call to a low level driver, which registered itself as canBlock, completes without blocking then the asynchronous completion may never occur. This will be fixed in the next release.
The problem reported for version 4 about segmentation faults on cygwin-x86 has been fixed.
asynRecord (and other code) uses epicsStrSnPrintEscaped. In EPICS 3.14.6 the files epicsVsnprintf on vxWorks (which gets called by several other epicsXXXprintf routines) and epicsStrSnPrintEscaped each have a bug that could cause a buffer overflow. These bugs are guaranteed to lead to corruption in asynRecord if the received string is longer than 40 characters.
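For illustration, the sketch below shows one way to write an escaped copy of a received string into a fixed-size buffer without ever writing past its end. It is a hypothetical helper (escape_bounded) with an assumed escaping scheme, not the EPICS implementation of epicsStrSnPrintEscaped, and it truncates rather than overflows when the destination is too small.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy src[0..len) into dst, replacing newline, carriage return, backslash
 * and non-printable bytes with escape sequences. At most dstSize bytes are
 * written, including the terminating NUL. Returns the number of characters
 * stored, not counting the NUL. */
size_t escape_bounded(char *dst, size_t dstSize, const char *src, size_t len)
{
    size_t used = 0;
    size_t i;

    if (dstSize == 0) return 0;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        char piece[5];   /* longest escape is "\ooo" plus NUL */
        int n;

        if (c == '\n')      n = snprintf(piece, sizeof piece, "\\n");
        else if (c == '\r') n = snprintf(piece, sizeof piece, "\\r");
        else if (c == '\\') n = snprintf(piece, sizeof piece, "\\\\");
        else if (isprint(c)) n = snprintf(piece, sizeof piece, "%c", c);
        else                n = snprintf(piece, sizeof piece, "\\%03o", c);

        /* Stop before the piece plus the final NUL would exceed dstSize. */
        if (used + (size_t)n >= dstSize) break;
        memcpy(dst + used, piece, (size_t)n);
        used += (size_t)n;
    }
    dst[used] = 0;
    return used;
}
```

The size check happens before each copy, so an escape sequence is either stored whole or not at all.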
static const struct asynOctet drvAsynSerialPortAsynOctet = {
must be changed to
static struct asynOctet drvAsynSerialPortAsynOctet = {
i.e. remove the const keyword.
static const struct asynOctet drvAsynIPPortAsynOctet = {
must be changed to
static struct asynOctet drvAsynIPPortAsynOctet = {
i.e. remove the const keyword.
John Sinclair (ORNL) reported that the IOC crashed if an E5810 was power cycled. This could not be reproduced at APS. We will have to see if it is still a problem.
Attempts to provide support for the serial port of an E5810 have not been successful.
If vxiName is specified as "inst" then the driver incorrectly says that it does not block.
If asynRecord is attached to a port that does not implement asynOctet, then asynRecord crashes if it attempts to send/receive a message.
The next release guarantees that when queueRequest is called:
The RPC library on Mac OS X 10.3.3 does not handle device timeouts properly and may cause core dumps. A bug report has been filed with Apple. A workaround is to use the GNU glibc RPC/XDR routines.
Attempting to change the trace file to "stdout" does not work because vxWorks has per-task standard output streams.
When building with EPICS Base R3.14.6 or greater, comment out the epicsString1.h and epicsString1.c lines in asyn/Makefile:
@@ -19,10 +19,10 @@
 SRC_DIRS += $(ASYN)/asynDriver
 INC += asynDriver.h
 INC += epicsInterruptibleSyscall.h
-INC += epicsString1.h
+#INC += epicsString1.h
 asyn_SRCS += asynManager.c
 asyn_SRCS += epicsInterruptibleSyscall.c
-asyn_SRCS += epicsString1.c
+#asyn_SRCS += epicsString1.c
 
 SRC_DIRS += $(ASYN)/asynGpib
 INC += asynGpibDriver.h
The RPC library on Mac OS X 10.3.3 does not handle device timeouts properly and may cause core dumps. A bug report has been filed with Apple. A workaround is to use the GNU glibc RPC/XDR routines.
If a user callback calls a low level driver with an infinite or very long timeout, there is no way to make the call terminate. Is there a generic way to abort the call?
Does not support GPIB specific functions.
This needs to be implemented for asynDriver.
Think about creating generic support for connecting to EPICS records.
Consider generic support for various network protocols: Modbus, etherIP, etc.