Strange issue encountered when clocking down 16-bit-wide DRAM with a 32-bit MCU

I am working on a custom board containing a 32-bit MCU (Cortex-A5) and a 16-bit-wide DRAM chip (LPDDR2). The MCU has an on-board DRAM controller which supports both DDR3 and LPDDR2, and I do have a working setup using LPDDR2.

Now I am trying to halve the clock rate at boot time on both the MCU and the DRAM (they both use the same PLL) due to power restrictions, and this is where my troubles begin.

As mentioned, I do have a working setup using the full frequency (DRAM: 400MHz, MCU: 396MHz), so one would expect that halving the frequency and updating the timings according to the DRAM datasheet should yield another working setup, but it does not.

The DRAM init runs at boot time from MCU intram, as do all tests. The whole procedure is handled by a board-specific version of U-Boot 2015.04.

I have a collection of tests that run at MCU boot to verify DRAM integrity. One of these is a so-called "walking bit" test, where I take a 32-bit unsigned integer and set each bit in sequence, reading back after each write to verify.

What I found was that, when reading back, the lower 16 bits have not been touched, while the upper 16 bits seem altered. After some investigation, I found the following pattern (assuming a watermark "0xaa"):

   write    ->  readback
0x8000_0000 -> 0x0000_aaaa
0x4000_0000 -> 0x0000_aaaa
0x2000_0000 -> 0x0000_aaaa
0x1000_0000 -> 0x0000_aaaa
[...]
0x0008_0000 -> 0x0000_aaaa
0x0004_0000 -> 0x0000_aaaa
0x0002_0000 -> 0x0000_aaaa
0x0001_0000 -> 0x0000_aaaa

0x0000_8000 -> 0x8000_aaaa
0x0000_4000 -> 0x4000_aaaa
0x0000_2000 -> 0x2000_aaaa
0x0000_1000 -> 0x1000_aaaa
[...]
0x0000_0008 -> 0x0008_aaaa
0x0000_0004 -> 0x0004_aaaa
0x0000_0002 -> 0x0002_aaaa
0x0000_0001 -> 0x0001_aaaa

The watermark is present, although I suspect it got there from a previous debugging session. I will address that later; my primary focus at the moment is getting the "walking bit" test to pass.

Here is a memory dump:

(gdb) x/16b addr  
0x80000000:     0x00    0x00    0x55    0x55    0x55    0x55    0x00    0x80
0x80000008:     0xaa    0xaa    0xaa    0xaa    0xaa    0xaa    0x00    0x55
(gdb) p/x *addr
$59 = 0x55550000
(gdb) set *addr = 0xaabbccdd
(gdb) p/x *addr 
$60 = 0xccdd0000
(gdb) x/16b addr
0x80000000:     0x00    0x00    0xdd    0xcc    0xbb    0xaa    0x00    0x80
0x80000008:     0xaa    0xaa    0xaa    0xaa    0xaa    0xaa    0x00    0x55

Can anyone tell me what might cause this type of behaviour?

Cheers

Note: I have intentionally left out the MCU and DRAM specifications, as I believe the question can be addressed with only JEDEC/DFI in mind.

Edit: Added memory dump.

Edit: Here is the source of the "walking bit" test. It runs from MCU intram on a memory area located in DRAM. Assumed bug-free:

static u32 __memtest_databus(volatile u32 * const addr)
{
  /* Walking bit */

  u32 pattern = (1u << 31);
  u32 failmask = 0;

  for(; pattern; pattern >>= 1)
  {
    *addr = pattern;

    if(*addr != pattern)
      failmask |= pattern;
  }

  return failmask;
}
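
To check the test logic separately from the hardware, the same loop can be run on a host against ordinary RAM, where it should report no failing bits. The following is only a minimal sketch, assuming a hosted C compiler: `memtest_databus_host` is a hypothetical rename for a host build, and mapping `u32` to `uint32_t` is an assumption about the U-Boot typedef.

```c
#include <stdint.h>

/* Assumption: u32 is a plain 32-bit unsigned type. */
typedef uint32_t u32;

/* Same walking-bit loop as above, renamed for a host build. */
static u32 memtest_databus_host(volatile u32 *const addr)
{
    u32 failmask = 0;

    for (u32 pattern = (u32)1 << 31; pattern; pattern >>= 1) {
        *addr = pattern;          /* write a word with a single bit set */
        if (*addr != pattern)     /* read it back and compare           */
            failmask |= pattern;  /* record the bit that failed         */
    }

    return failmask;
}
```

Called on an ordinary variable, for example `volatile u32 cell; memtest_databus_host(&cell);`, this is expected to return 0. For the rows listed in the table above, no readback equals the value written, so each of those patterns would be flagged in the mask; the per-pattern readback is therefore more informative than the mask alone.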

Edit: The PLL and VCO have been checked, and the settings are correct. The PLL is stable and the DRAM PHY does obtain a lock.

Link to DRAM Data Sheet

arm
embedded
cortex-a
asked on Stack Overflow May 10, 2016 by Tom • edited May 13, 2016 by Tom

2 Answers

The bytes look like they have been shifted, not altered.

quote

(gdb) x/16b addr
0x80000000:     0x00    0x00    *0xdd    0xcc    0xbb    0xaa*    0x00    0x80
0x80000008:     0xaa    0xaa    0xaa    0xaa    0xaa    0xaa    0x00    0x55

unquote
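
The shift can be reproduced on a host with a small model of the dump. This is only a sketch of the hypothesis that a 32-bit store lands two bytes (one 16-bit beat) later than intended; `shifted_store32` and `load32` are hypothetical helpers for illustration, and nothing here claims to describe what the DRAM controller actually does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: a little-endian 32-bit store aimed at byte offset
 * `off` of `mem` ends up two bytes further on.
 */
static void shifted_store32(uint8_t *mem, size_t off, uint32_t value)
{
    for (size_t i = 0; i < 4; i++)
        mem[off + 2 + i] = (uint8_t)(value >> (8 * i));
}

/* Plain little-endian 32-bit load from byte offset `off`. */
static uint32_t load32(const uint8_t *mem, size_t off)
{
    return (uint32_t)mem[off]
         | ((uint32_t)mem[off + 1] << 8)
         | ((uint32_t)mem[off + 2] << 16)
         | ((uint32_t)mem[off + 3] << 24);
}
```

Starting from the first eight bytes of the dump (`00 00 55 55 55 55 00 80`), storing `0xaabbccdd` at offset 0 with `shifted_store32` leaves `00 00 dd cc bb aa 00 80`, and `load32` at offset 0 returns `0xccdd0000`, which matches the values gdb printed.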

answered on Stack Overflow May 10, 2016 by Mrunmoy

You have one severe bug here: u32 pattern = (1 << 31);.

The integer constant 1 is of type int, which is 32 bits on your ARM system.

You left shift this signed number out of bounds and invoke undefined behavior; anything could happen. The variable pattern can get any value.

Correct code would be u32 pattern = (u32)1 << 31; or u32 pattern = 1u << 31;
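
As a minimal sketch of the unsigned form, the helper below builds a single-bit mask with the operand converted to a 32-bit unsigned type before shifting. `bit32` is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/*
 * Returns a word with only bit n set (n must be in 0..31). The operand is
 * unsigned, so n == 31 never shifts a 1 into the sign bit of an int.
 */
static uint32_t bit32(unsigned n)
{
    return (uint32_t)1 << n;
}
```

With this, `bit32(31)` yields `0x80000000` and `bit32(0)` yields `1`, both with fully defined behaviour.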

answered on Stack Overflow May 10, 2016 by Lundin

User contributions licensed under CC BY-SA 3.0