I wrote a python 3 script which tests an SPI link to an FPGA. It runs on an Raspberry Pi 3. The test works like this: after putting the FPGA in test mode (a push switch), send the first byte, which can be any value. Then further bytes are sent indefinitely. Each one increments by the first value sent, truncated to 8 bits. Thus, if the first value is 37, the FPGA expects the following sequence:
37, 74, 111, 148, 185, 222, 4, 41 ...
Some additional IO pins are used to signal between the devices - RUN (RPi output) starts the test (necessary because the FPGA times out in about 15ms if it expects a byte) and ERR (FPGA output) signals an error. Errors can thus be counted at both ends.
In addition, the RPi script writes a one line summary of bytes sent and number of erros every million bytes.
All of this works just fine. But after running for about 3 days, I get the following error on the RPi:
free(): invalid pointer: 0x00405340
I get this exact same error on two identical test setups, even the same memory address. The last report says "4294M bytes sent, 0 errors"
I seem to have proved the SPI link, but I am concerned that this long-running program crashes for no apparent reason.
Here is the important part of my test code:
def _report(self, msg):
now = datetime.datetime.now()
os.system("echo \"{} : {}\" > spitest_last.log".format(now, msg))
def spi_test(self):
global end_loop
input("Put the FPGA board into SPI test mode (SW1) and press any key")
self._set_run(True)
self.END_LOOP = False
print("SPI test is running, CTRL-C to end.")
# first byte is sent without LOAD, this is the seed
self._send_byte(self._val)
self._next_val()
end_loop = False
err_flag = False
err_cnt = 0
byte_count = 1
while not end_loop:
mb = byte_count % 1000000
if mb == 0:
msg = "{}M bytes sent, {} errors".format(int(byte_count/1000000), err_cnt)
print("\r" + msg, end="")
self._report(msg)
err_flag = True
else:
err_flag = False
#print("sending: {}".format(self._val))
self._set_load(True)
if self._errors and err_flag:
self._send_byte(self._val + 1)
else:
self._send_byte(self._val)
if self.is_error():
err_cnt += 1
msg = "{}M bytes sent, {} errors".format(int(byte_count/1000000), err_cnt)
print("\r{}".format(msg), end="")
self._report(msg)
self._set_load(False)
# increase the value by the seed and truncate to 8 bits
self._next_val()
byte_count += 1
# test is done
input("\nSPI test ended ({} bytes sent, {} errors). Press ENTER to end.".format(byte_count, err_cnt))
self._set_run(False)
(Note for clarification : there is a command line option to artifically create an error every million bytes. Hence the " err_flag" variable.)
I've tried using python3 in console mode, and there seems to be no issue with the size of the byte_count variable (there shouldn't be, according to what I have read about python integer size limits).
Anyone have an idea as to what might cause this?
In Python, we use the try and except statements to handle exceptions. Whenever the code breaks down, an exception is thrown without crashing the program. Let's modify the add number program to include the try and except statements. Python would process all code inside the try and except statement.
A segfaulting program might be the symptom of a bug in C code–or it might be that your process is running out of memory. Crashing is just one symptom of running out of memory. Your process might instead just run very slowly, your computer or VM might freeze, or your process might get silently killed.
This is known as a 'fork bomb'. It will open more and more copies of itself, and those copies will open yet more copies. Your computer will be unusable within a couple of seconds after clicking the file and it will crash completely within a minute or two.
This issue is connected to spidev versions older than 3.5 only. The comments below were done under assumption that I was using the upgraded version of spidev.
#############################################################################
I can confirm this problem. It is persistent with both RPi3B and RPi4B. Using python 3.7.3 at both RPi3 and RPi4. The version of spidev which I tried were 3.3, 3.4 and the latest 3.5. I was able to reproduce this error several times by simply looping through this single line.
spidevice2.xfer2([0x00, 0x00, 0x00, 0x00])
It takes up to 11 hours depending on the RPi version. After 1073014000 calls (rounded to 1000), the script crashes because of "invalid pointer". The total amount of bytes sent is the same as in danmcb's case. It seems as if 2^32 bytes represent a limit.
I tried different approaches. For example, calling close() from time to time followed by open(). This did not help.
Then, I tried to create the spiDev object locally, so it would re-created for every batch of data.
def spiLoop():
spidevice2 = spidev.SpiDev()
spidevice2.open(0, 1)
spidevice2.max_speed_hz = 15000000
spidevice2.mode = 1 # Data is clocked in on falling edge
for j in range(100000):
spidevice2.xfer2([0x00, 0x00, 0x00, 0x00])
spidevice2.close()
It still crashed at after approx. 2^30 calls of xfer2([0x00, 0x00, 0x00, 0x00]) which corresponds to approx. 2^32 bytes.
EDIT1
To speed up the process, I was sending in blocks of 4096 bytes using the code below. And I repeatedly created the SpiDev object locally. It took 2 hours to arrive at 2^32 bytes count.
def spiLoop():
spidevice2 = spidev.SpiDev()
spidevice2.open(0, 1)
spidevice2.max_speed_hz = 25000000
spidevice2.mode = 1 # Data is clocked in on falling edge
to_send = [0x00] * 2**12 # 4096 bytes
for j in range(100):
spidevice2.xfer2(to_send)
spidevice2.close()
del spidevice2
def runSPI():
for i in range(2**31 - 1):
spiLoop()
print((2**12 * 100 * (i + 1)) / 2**20, 'Mbytes')
EDIT2
Reloading the spidev on the fly does not help either. I tried this code on both RPi3 and RPi4 with the same result:
import importlib
def spiLoop():
importlib.reload(spidev)
spidevice2 = spidev.SpiDev()
spidevice2.open(0, 1)
spidevice2.max_speed_hz = 25000000
spidevice2.mode = 1 # Data is clocked in on falling edge
to_send = [0x00] * 2**12 # 4096 bytes
for j in range(100):
spidevice2.xfer2(to_send)
spidevice2.close()
del spidevice2
def runSPI():
for i in range(2**31 - 1):
spiLoop()
print((2**12 * 100 * (i + 1)) / 2**20, 'Mbytes')
EDIT3
Executing the code snippet did not isolate the problem either. It crashed after the 4th chuck of 1Gbyte-data was sent.
program = '''
import spidev
spidevice = None
def configSPI():
global spidevice
# We only have SPI bus 0 available to us on the Pi
bus = 0
#Device is the chip select pin. Set to 0 or 1, depending on the connections
device = 1
spidevice = spidev.SpiDev()
spidevice.open(bus, device)
spidevice.max_speed_hz = 250000000
spidevice.mode = 1 # Data is clocked in on falling edge
def spiLoop():
to_send = [0xAA] * 2**12
loops = 1024
for j in range(loops):
spidevice.xfer2(to_send)
return len(to_send) * loops
configSPI()
bytes_total = 0
while True:
bytes_sent = spiLoop()
bytes_total += bytes_sent
print(int(bytes_total / 2**20), "Mbytes", int(1000 * (bytes_total / 2**30)) / 10, "% finished")
if bytes_total > 2**30:
break
'''
for i in range(100):
exec(program)
print("program executed", i + 1, "times, bytes sent > ", (i + 1) * 2**30)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With