Since last week we have spent quite some time debugging Samsung SMRP6400/SMDK64X0 IIC drivers, I thought I might share with one particular example here. It is both a good showcase of the hardware/software synchronization issues and since the bugs are in the latest version of the drivers shipped together with the development platforms, it might save someone from additional headaches.
So, while working on the drivers for IIC one of our automated tests to make sure that IIC still works was to write some data on the bus, do an immediate read-back and verify that both data written and read back matches; something similar to this:
UCHAR outData[3] = { REGISTER, DATA_BYTE_1, DATA_BYTE2 }; UCHAR inData[2] = { 0, 0 }; IIC_Write(SLAVE_ADDRESS, outData, 3); IIC_Read(SLAVE_ADDRESS, REGISTER, inData, 2); if ((inData[0] != DATA_BYTE_1) || (inData[1] != DATA_BYTE_2)) { DEBUGMSG(ERROR_MSG, (TEXT("Immediate write/read data mismatch: data sent [0x%X " \ "0x%X] differs from the data received [0x%X, 0x%X]."), DATA_BYTE_1, DATA_BYTE_2, inData[0], inData[1])); }
However, after we added some IIC read calls from another hardware driver it started spitting fire throwing the following error message:
Immediate write/read data mismatch: data sent [0xCA, 0xFE] differs from the data received [0xCA, 0xFE].
Now this clearly meant we had a serious problem: the error message was saying that the data does not match, which obviously was not the case as shown by the message text!
Since we were pretty confident about our side of things, as well as the readings from the scope, it seemed like a good time to start looking at the Samsung IIC drivers (and especially s3c64X0_iic_lib.cpp). The driver structure there is pretty straightforward: the first read/write byte on the bus triggers the hardware IRQ, which is mapped to SysIntr triggering a transfer event; then any subsequent call to a read/write function blocks waiting for a transfer-done
event, which is triggered when the last byte is read/written in the IST.
Everything looks sane up to the point where a transfer-done
event is signalled from the IST (pseudocode, not the original code below, due to the legal issues):
static HANDLE ghTransferEvent; static HANDLE ghTransferDone; static DWORD IICSysIntr; ... static DWORD IST(LPVOID lpContext) { BOOL bTransferDone = FALSE; while (TRUE) { WaitForSingleObject(ghTransferEvent, INFINITE); switch (IIC_BUS_STATUS) { case MASTER_RECEIVE: // receive bytes and store them in the buffer break; case MASTER_TRANSMIT: // transmit bytes from the buffer in memory if (LAST_BYTE) bTransferDone = TRUE; break; InterruptDone(IICSysIntr); if (bTransferDone) { SetEvent(ghTransferDone); } } } }
Two major problems with this code are:
- The
bTransferDone
variable is never set from the IIC read, and hence thetransfer-done
event for bus reads is never triggered, - After the
bTransferDone
is set from the IIC write, it is never reset, hence thetransfer-done
event is triggered after reading/writing single byte on the bus in all subsequent transactions.
That explains the initial test case failure: during the write/immediate read data comparison the data arrives exactly between the if
statement and the following printout, thus triggering the error message, but printing the correct data to the output due to the early signal of the event.
The way to solve this is also straightforward: make sure that the bTransferDone
variable is cleared after the transfer-done
event is triggered, and make sure that master-receive mode sets the bTransferDone
variable after reading the last byte of the transaction from the IIC bus.
In pseudocode it would look similar to:
static DWORD IST(LPVOID context) { BOOL bTransferDone = FALSE; while (TRUE) { WaitForSingleObject(ghTransferEvent, INFINITE); switch (IIC_BUS_STATUS) { case MASTER_RECEIVE: // receive bytes and store them in the buffer if (LAST_BYTE) bTransferDone = TRUE; break; case MASTER_TRANSMIT: // transmit bytes from the buffer in memory if (LAST_BYTE) bTransferDone = TRUE; break; InterruptDone(IICSysIntr); if (bTransferDone) { bTransferDone = FALSE; SetEvent(ghTransferDone); } } } }
The lesson of the day (quoting my colleague) is: "The first rule about multithreading - you're wrong".