Discussion:
pci_abort messages from cx88 driver
John Sager
2008-12-17 14:59:16 UTC
This seems to have cropped up sporadically on mailing lists and fora,
with no real resolution indicated. I have just bought a Hauppauge
WinTV-NOVA-HD-S2 card (recognised as HVR4000(Lite)) which exhibits
this problem in my system. I'm running Mythbuntu 8.10 on a quad core
Intel-based system - P35/ICH9 chipset - with the v4l-dvb drivers
cloned on 16th December. I don't get the problem on first start-up,
but if I change channels it starts to appear. However it does seem to
stop sometimes on channel change. I suspect the problem is either some
kind of race condition between the Intel & Conexant PCI controllers, or
some kind of missed or wrong step in chip reconfiguration after a channel
change.

When this error occurs, the standard behaviour of the code in cx88-mpeg.c
is to stop the current DMA transfer & then restart the queue. This drops
data, leading to blocky visuals & sound glitches. As an experiment, I
changed the test for general errors in cx8802_mpeg_irq() to ignore the
pci_abort error (change 0x1f0100 to 0x170100), and this completely
eliminates the dropped data problem. This suggests that the pci transfers
complete properly and the pci_abort status is a spurious indication.
I also fixed the mask in the test for cx88_print_irqbits() to stop these
messages filling up the log (change ~0xff to ~0x800ff).
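For readers following the mask arithmetic, here is a minimal sketch in plain C of what the two changes do. It is not the cx88 source: the macro and function names are made up for illustration, and the only values taken from the post are the four masks. Both changes differ from the original by the same single bit, 0x080000 (bit 19), which is therefore taken to be the pci_abort bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask values quoted in the post. All names are hypothetical. */
#define TS_ERR_MASK_ORIG   0x1f0100u   /* original general-error test */
#define TS_PCI_ABORT_BIT   0x080000u   /* bit 19, inferred from the two changes */
#define TS_ERR_MASK_PATCH  (TS_ERR_MASK_ORIG & ~TS_PCI_ABORT_BIT)

#define PRINT_MASK_ORIG    (~0xffu)     /* original logging test */
#define PRINT_MASK_PATCH   (~0x800ffu)  /* same test with bit 19 also excluded */

/* Clearing bit 19 from 0x1f0100 gives the patched value from the post. */
static_assert(TS_ERR_MASK_PATCH == 0x170100u, "patched mask mismatch");

/* True if the status word would trigger the stop-and-restart path. */
static bool needs_dma_restart(uint32_t status, uint32_t err_mask)
{
    return (status & err_mask) != 0;
}

/* True if the status word would be printed to the log. */
static bool should_log(uint32_t status, uint32_t print_mask)
{
    return (status & print_mask) != 0;
}
```

With the patched masks, a status word that has only bit 19 set neither restarts the queue nor gets logged, while the other error bits in 0x1f0100 still do both.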

It may be worth fixing this in the main code to hide the problem for
unfortunate users of this & related cards until the real problem is
found. Unfortunately I doubt I can help there as a detailed knowledge
of the Conexant PCI interface device is probably required to pursue it.

regards,

John
Andy Walls
2008-12-17 22:28:20 UTC
Post by John Sager
This seems to have cropped up sporadically on mailing lists and fora,
with no real resolution indicated. I have just bought a Hauppauge
WinTV-NOVA-HD-S2 card (recognised as HVR4000(Lite)) which exhibits
this problem in my system. I'm running Mythbuntu 8.10 on a quad core
Intel-based system - P35/ICH9 chipset - with the v4l-dvb drivers
cloned on 16th December. I don't get the problem on first start-up,
but if I change channels it starts to appear. However it does seem to
stop sometimes on channel change. I suspect the problem is either some
kind of race condition between the Intel & Conexant PCI controllers, or
some kind of missed or wrong step in chip reconfiguration after a channel
change.
When this error occurs, the standard behaviour of the code in cx88-mpeg.c
is to stop the current DMA transfer & then restart the queue. This drops
data, leading to blocky visuals & sound glitches. As an experiment, I
changed the test for general errors in cx8802_mpeg_irq() to ignore the
pci_abort error (change 0x1f0100 to 0x170100), and this completely
eliminates the dropped data problem. This suggests that the pci transfers
complete properly and the pci_abort status is a spurious indication.
You've logically leaped too far. You can only say that the aborted PCI
transfers, if any actually happened, didn't matter to apparent proper
operation of the device in its current mode of operation.

That said, maybe the best course of action is to ignore PCI aborts when
a capture is ongoing. It may not, however, be the best idea to ignore
such errors when setting up for a capture or controlling an I2C device
through the chip.
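
A rough sketch of that idea in plain C, under stated assumptions: the phase enum and function names are hypothetical and do not exist in the cx88 driver, and the mask values are the ones quoted above, with 0x080000 taken to be the pci_abort bit because it is the only bit the quoted change removes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device phases, for illustration only. */
enum dev_phase { PHASE_SETUP, PHASE_CAPTURING };

#define ERR_MASK_ALL    0x1f0100u  /* error test value quoted in the thread */
#define ERR_PCI_ABORT   0x080000u  /* assumed pci_abort bit (bit 19) */

/* Pick the error mask by phase: tolerate pci_abort only while capturing. */
static uint32_t error_mask_for(enum dev_phase phase)
{
    assert(phase == PHASE_SETUP || phase == PHASE_CAPTURING);
    if (phase == PHASE_CAPTURING)
        return ERR_MASK_ALL & ~ERR_PCI_ABORT;
    return ERR_MASK_ALL;
}

/* True if the status word should be treated as an error in this phase. */
static bool is_fatal(uint32_t status, enum dev_phase phase)
{
    return (status & error_mask_for(phase)) != 0;
}
```

The point of the split is that a pci_abort seen during setup or I2C control is still reported, while the same bit during an active capture is let through.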
Post by John Sager
I also fixed the mask in the test for cx88_print_irqbits() to stop these
messages filling up the log (change ~0xff to ~0x800ff).
It may be worth fixing this in the main code to hide the problem for
unfortunate users of this & related cards until the real problem is
found. Unfortunately I doubt I can help there as a detailed knowledge
of the Conexant PCI interface device is probably required to pursue it.
Maybe not. Look at the cx18 driver where a similar issue was
confronted.

1) All the PCI MMIO accesses were wrapped in functions defined in
cx18-io.[ch]

2) All PCI writes were double checked for a proper readback & retried;
PCI reads were checked for being 0xffffffff and retried; and statistics
were collected on how often this happened and what actions
mattered/helped.

3) The read retries were eliminated - they never helped fix anything.
Some of the write retries were modified slightly: some registers will
never read back what you just wrote to them, by the very nature of their
operation (e.g. clearing interrupt masks).

4) The statistics gathering was removed.


It was a lot of work that touched almost every file in the driver and was a
real pain to implement. It was needed for reliable operation of the
device, especially in older systems.
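
The write-readback-retry pattern from steps 1 and 2 can be sketched roughly as follows. This is a user-space simulation, not the code in cx18-io.[ch]: the sim_reg type and every function name here are hypothetical, and a real driver would use the kernel's MMIO accessors on a mapped register instead of a struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one device register. */
typedef struct {
    uint32_t value;
    unsigned drop_writes;  /* simulation only: the next N writes are lost */
} sim_reg;

static void sim_write(sim_reg *r, uint32_t v)
{
    if (r->drop_writes > 0) {
        r->drop_writes--;  /* pretend the bus transaction was lost */
        return;
    }
    r->value = v;
}

static uint32_t sim_read(const sim_reg *r)
{
    return r->value;
}

/*
 * Write v, read it back, and retry until the readback matches.
 * Returns the number of attempts used, or 0 if all max_tries failed.
 */
static unsigned write_checked(sim_reg *r, uint32_t v, unsigned max_tries)
{
    assert(max_tries > 0);
    for (unsigned attempt = 1; attempt <= max_tries; attempt++) {
        sim_write(r, v);
        if (sim_read(r) == v)
            return attempt;
    }
    return 0;
}

/* An all-ones value is the pattern the post says reads were checked for. */
static bool read_looks_failed(uint32_t v)
{
    return v == 0xffffffffu;
}
```

The attempt count returned by write_checked is the kind of figure the statistics in step 2 could have been built from. As step 3 notes, a readback check like this cannot be applied blindly to registers that never return what was written to them.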

So much for a "software transparent IO bus" that PCI was supposed to be.

Regards,
Andy
Post by John Sager
regards,
John
John Sager
2008-12-18 14:38:54 UTC
Andy,

Thanks for the reply.
Post by Andy Walls
You've logically leaped too far. You can only say that the aborted PCI
transfers, if any actually happened, didn't matter to apparent proper
operation of the device in its current mode of operation.
That said, maybe the best course of action is to ignore PCI aborts when
a capture is ongoing. It may not, however, be the best idea to ignore
such errors when setting up for a capture or controlling an I2C device
through the chip.
The interrupts in question are specifically related to the transport
stream.
Post by Andy Walls
Post by John Sager
It may be worth fixing this in the main code to hide the problem for
unfortunate users of this & related cards until the real problem is
found. Unfortunately I doubt I can help there as a detailed knowledge
of the Conexant PCI interface device is probably required to pursue it.
Maybe not. Look at the cx18 driver where a similar issue was
confronted.
Yuk. I see what you mean. I don't think I really want to go that far.
The fix is minor so I don't mind patching kernels for my own use
when they get upgraded. If I start getting other problems as a consequence
I may just give up on the card but fingers-crossed it's working OK
for now.

regards,

John