Page 1 of 1

UDP receive latency on NANO54415

Posted: Fri Nov 08, 2013 10:58 am
by roman959
Hello,

We are running a latency-sensitive application on a NANO54415 module and are experiencing very strange periodic spikes in UDP receive latency. The problem can be demonstrated with a following test setup:

1. NANO54415 is running a trivial loop that receives UDP packets and measures the time spent waiting for the next packet. This code looks like this (where RunningStat object keeps a running average of the wait time using Welford’s method):

Code: Select all

       //  Initialization code is here …

	DEBUGTimer = HiResTimer::getHiResTimer();
	DEBUGTimer->init();
	DEBUGTimer->start();

	while (1) {
		RunningStat stats;
		int i = 0;
		while (i < 10000) {
			const double udpReadBlock = 1.0e6 * DEBUGTimer->readTime();
			UDPPacket upkt(&fifo, 0);
			const double udpReadUnblock = 1.0e6 * DEBUGTimer->readTime();
			if (upkt.Validate())
			{
				++i;
				const double delta = udpReadUnblock - udpReadBlock;
				stats.push(delta);
				
			}
		} 

		// stats are printed to the terminal here…
	}

2. The Netburner is connected to another embedded device via a crossover Ethernet cable and this device is spitting a single 15-byte UDP packet at 100Hz. We are keeping track of the elapsed time between successive sends using the same RunningStat object.

When running with the default TICKS_PER_SECOND value of 20, a 10000-sample test run produces the following output (all times are in microseconds):
On the sender device:
mean: 10077.3
sigma: 3.7036
min: 10050.8
max: 10106.8
On the Netburner:
mean: 10077.068004
sigma: 541.726130
min: 48.128000
max: 59706.088000
If we increase the TICKS_PER_SECOND value to 100, the latency improves as follows:

On the sender device:
mean: 10077.3
sigma: 3.8715
min: 10058.7
max: 10107.8
On the Netburner:
mean: 10077.189543
sigma: 120.278000
min: 1990.520000
max: 18856.000000

These results are completely repeatable. We have tried switching to non-blocking UDP receive by modifying the UDPPacket::UDPPacket( OS_FIFO *pfifo, DWORD wait ) constructor in udp.cpp to use:
m_p = ( pool_buffer * ) OSFifoPendNoWait( pfifo);

However, this did not alter the timing results.


Any ideas as to what may be causing the latency spike will be greatly appreciated.

Thanks,

-Roman

Re: UDP receive latency on NANO54415

Posted: Fri Nov 08, 2013 3:36 pm
by pbreed
What NNDK version are you running?

We had some latency issues with the NANO and MOD54 early on due to cache prefetch and null ponters...

Re: UDP receive latency on NANO54415

Posted: Fri Nov 08, 2013 3:46 pm
by roman959
NNDK 2.6.2

Re: UDP receive latency on NANO54415

Posted: Fri Nov 08, 2013 4:35 pm
by pbreed
2.6.2 still has the latency bug.

Run the latest beta and this should fix your issue.

I wrote a personal blog post about finding this bug...

http://unreasonablerocket.blogspot.com/ ... e-bug.html

Paul

Re: UDP receive latency on NANO54415

Posted: Mon Nov 11, 2013 11:48 am
by roman959
Unfortunately, switching to the latest NNDK-2.6.3 did not substantially change the stats. And the dependence on TICKS_PER_SECOND is still there. There is another source of latency lurking somewhere...

Is there anything else we can try?

Re: UDP receive latency on NANO54415

Posted: Mon Nov 11, 2013 1:15 pm
by pbreed
Are you 100% sure you rebuilt everything?

Re: UDP receive latency on NANO54415

Posted: Mon Nov 11, 2013 1:59 pm
by roman959
I am, actually. I created a new workspace and a new project in NBEclipse 2.6.3, verified that correct paths are used by the compiler, and even moved the old 2.6.2 install to a new location just to be sure..

Re: UDP receive latency on NANO54415

Posted: Tue Nov 12, 2013 9:54 am
by pbreed
Can you submit a support ticket on this.

Looking at recent checkins, its possible there is a TCP issue causing the
a one tick lock, but I need to know a lot more about what is going on and what other things are runnign besides UDP.

Paul

Re: UDP receive latency on NANO54415

Posted: Tue Nov 12, 2013 11:52 am
by roman959
Just realized that the problem goes away if we stay with default UserMain priority of 10, instead of changing it to 50. So something is indeed preempting our code for 1 tick when main is running at lower priority. Any idea what that might be?

Created ticket 33347 and attached our test code.

Thanks!