ARM Cortex A7 returning PMCCNTR = 0 in kernel mode, and Illegal instruction in user mode (even after PMUSERENR = 1)

3

I want to read the cycle count register (PMCCNTR) on a Raspberry Pi 2, which has an ARM Cortex A7 core. I compile a kernel module for it as follows:

#include <linux/module.h>
#include <linux/kernel.h>

int init_module()
{
  volatile u32 PMCR, PMUSERENR, PMCCNTR;

  // READ PMCR
  PMCR = 0xDEADBEEF;
  asm volatile ("mrc p15, 0, %0, c9, c12, 0\n\t" : "=r" (PMCR));
  printk (KERN_INFO "PMCR = %x\n", PMCR);

  // READ PMUSERENR 
  PMUSERENR = 0xDEADBEEF;
  asm volatile ("mrc p15, 0, %0, c9, c14, 0\n\t" : "=r" (PMUSERENR));
  printk (KERN_INFO "PMUSERENR = %x\n", PMUSERENR);

  // WRITE PMUSERENR = 1
  asm volatile ("mcr p15, 0, %0, c9, c14, 0\n\t" : : "r" (1));

  // READ PMUSERENR AGAIN
  asm volatile ("mrc p15, 0, %0, c9, c14, 0\n\t" : "=r" (PMUSERENR));
  printk (KERN_INFO "PMUSERENR = %x\n", PMUSERENR);

  // READ PMCCNTR
  PMCCNTR = 0xDEADBEEF;
  asm volatile ("mrc p15, 0, %0, c9, c13, 0\n\t" : "=r" (PMCCNTR));
  printk (KERN_ALERT "PMCCNTR = %x\n", PMCCNTR);
  return 0;
}

void cleanup_module()
{
}

MODULE_LICENSE("GPL");

and, after insmod, I observe the following in /var/log/kern.log:

PMCR = 41072000
PMUSERENR = 0
PMUSERENR = 1
PMCCNTR = 0

When I try to read PMCCNTR from user mode, I get an illegal instruction error, even after PMUSERENR has been set to 1.

Why does PMCCNTR read as 0 in kernel mode, and why does reading it raise an illegal instruction in user mode? Is there something else I need to do to enable PMCCNTR?

Update 1

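Partly solved. PMUSERENR is a per-core register, so the write in init_module only took effect on whichever core happened to run insmod; the user-mode read still faulted on every other core. The solution to this multi-core issue is to call on_each_cpu like so: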

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/smp.h>   // declares on_each_cpu()

static void enable_ccnt_read(void* data)
{
  // WRITE PMUSERENR = 1
  asm volatile ("mcr p15, 0, %0, c9, c14, 0\n\t" : : "r" (1));
}

int init_module()
{
  on_each_cpu(enable_ccnt_read, NULL, 1);
  return 0;
}

void cleanup_module()
{
}

MODULE_LICENSE("GPL");

I can now read PMCCNTR from userland:

#include <iostream>

unsigned ccnt_read ()
{
  volatile unsigned cc;
  asm volatile ("mrc p15, 0, %0, c9, c13, 0" : "=r" (cc));
  return cc;
}

int main() {
  std::cout << ccnt_read() << std::endl;
}

To run a userland program on a specific core you can use taskset like so (example, run on core 2):

$ taskset -c 2 ./ccnt_read
0

PMCCNTR is still not incrementing. The cycle counter needs to be "switched on" somehow.

linux
arm
raspberry-pi
raspberry-pi2
asked on Stack Overflow Jul 24, 2015 by Andrew Tomazos • edited Jul 24, 2015 by Andrew Tomazos

3 Answers

7

Here is the working solution for posterity:

The kernel module:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/smp.h>   // declares on_each_cpu()

static void enable_ccnt_read(void* data)
{
  // PMUSERENR = 1
  asm volatile ("mcr p15, 0, %0, c9, c14, 0" :: "r"(1));

  // PMCR.E (bit 0) = 1
  asm volatile ("mcr p15, 0, %0, c9, c12, 0" :: "r"(1));

  // PMCNTENSET.C (bit 31) = 1 (unsigned literal: 1 << 31 overflows a signed int)
  asm volatile ("mcr p15, 0, %0, c9, c12, 1" :: "r"(1U << 31));
}

int init_module()
{
  on_each_cpu(enable_ccnt_read, NULL, 1);
  return 0;
}

void cleanup_module()
{
}

MODULE_LICENSE("GPL");

The client program:

#include <iostream>

unsigned ccnt_read ()
{

  volatile unsigned cc;
  asm volatile ("mrc p15, 0, %0, c9, c13, 0" : "=r" (cc));
  return cc;
}

int main() {
  std::cout << ccnt_read() << std::endl;
}
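
A note on using the value: PMCCNTR is a 32-bit register here, so it wraps around fairly quickly (every few seconds at typical Raspberry Pi 2 clock rates). The sketch below shows one way to take the difference of two reads so that a single wrap is handled correctly, and to convert cycles to nanoseconds. The helper names ccnt_delta and cycles_to_ns are hypothetical, and the clock frequency is a parameter you have to supply yourself; nothing here reads it from the hardware.

```c
#include <stdint.h>

/* Hypothetical helper: cycles elapsed between two PMCCNTR reads.
 * Unsigned 32-bit subtraction is modulo 2^32, so the result stays
 * correct if the counter wrapped at most once between the reads. */
uint32_t ccnt_delta(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/* Hypothetical helper: convert a cycle count to nanoseconds.
 * cpu_hz is an assumption supplied by the caller (for example
 * 900000000 for a core running at 900 MHz). The multiplication is
 * done in 64 bits so it cannot overflow. */
uint64_t cycles_to_ns(uint32_t cycles, uint32_t cpu_hz)
{
    return ((uint64_t)cycles * 1000000000ull) / cpu_hz;
}
```

In the client program you would call ccnt_read() before and after the code under test and pass both values to ccnt_delta(). Intervals longer than one full wrap of the counter cannot be recovered this way.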
answered on Stack Overflow Jul 27, 2015 by Andrew Tomazos • edited Jul 27, 2015 by Andrew Tomazos
1

What you have done is enable user-level access to the counter. You have not enabled the counter itself. In addition to enabling access, you have to set bit 31 (the C bit) of PMCNTENSET to enable counting. This, along with your on_each_cpu() changes, should give you the functionality you are looking for.
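
The PMCR value logged in the question (41072000) is consistent with this diagnosis. As a rough illustration, the following sketch decodes that value using the ARMv7 PMCR field layout; the struct and function names are made up for this example. It shows that the global enable bit PMCR.E (bit 0) is clear, so no counter runs until both PMCR.E and PMCNTENSET.C are set.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative container for a few PMCR fields (names are hypothetical). */
struct pmcr_fields {
    unsigned imp;     /* bits [31:24]: implementer code          */
    unsigned idcode;  /* bits [23:16]: identification code       */
    unsigned n;       /* bits [15:11]: number of event counters  */
    unsigned e;       /* bit 0: global enable for all counters   */
};

/* Split a raw PMCR value into the fields above. */
struct pmcr_fields decode_pmcr(uint32_t pmcr)
{
    struct pmcr_fields f;
    f.imp    = (pmcr >> 24) & 0xFFu;
    f.idcode = (pmcr >> 16) & 0xFFu;
    f.n      = (pmcr >> 11) & 0x1Fu;
    f.e      = pmcr & 0x1u;
    return f;
}

/* Print the decoded fields in a readable form. */
void print_pmcr(uint32_t pmcr)
{
    struct pmcr_fields f = decode_pmcr(pmcr);
    printf("IMP=0x%02x IDCODE=0x%02x N=%u E=%u\n",
           f.imp, f.idcode, f.n, f.e);
}
```

Calling print_pmcr(0x41072000) reports E=0, which matches PMCCNTR reading back as 0 in the question.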

A word of caution: each core has its own cycle counter, so your measurements will be meaningless if the process migrates to a different core between PMCCNTR reads.
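
One way to avoid migration without relying on taskset is to pin the thread from inside the program before taking the first reading. This is only a sketch: pin_to_cpu is a hypothetical helper, it uses the Linux-specific sched_setaffinity call, and it assumes the chosen core is one the process is allowed to run on.

```c
#define _GNU_SOURCE   /* needed for cpu_set_t, CPU_SET and sched_setaffinity */
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to a single core.
 * Returns 0 on success and -1 on failure (errno is set by the kernel). */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);       /* start with an empty CPU mask   */
    CPU_SET(cpu, &set);   /* allow only the requested core  */

    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling pin_to_cpu(2) at the start of main has roughly the same effect as launching the program with taskset -c 2.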

answered on Stack Overflow Jul 27, 2015 by Arun Valiaparambil
0

I am running this chip in simulation, and found a further problem beyond those described above. The cycle counter must be reset when it is enabled; otherwise the simulator raises assertions because the counter holds undefined values. This means that the PMCR register should be set as follows:

// PMCR.E (bit 0) = 1, PMCR.C (bit 2) = 1
asm volatile ("mcr p15, 0, %0, c9, c12, 0" :: "r"(5));
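
For readers who prefer named bits over the literal 5, the value can be built from masks. The sketch below is illustrative only: the macro and function names are hypothetical, and the read-modify-write variant is an assumption about how one might preserve the other writable PMCR bits rather than something the original code does.

```c
#include <stdint.h>

/* PMCR control bits used in this thread (macro names are illustrative). */
#define PMCR_E (1u << 0)   /* enable all counters            */
#define PMCR_P (1u << 1)   /* reset the event counters       */
#define PMCR_C (1u << 2)   /* reset the cycle counter        */

/* PMCNTENSET bit that enables the cycle counter. */
#define PMCNTENSET_C (1u << 31)

/* Hypothetical helper: given the current PMCR value, return the value
 * to write back so that the counters are enabled and the cycle counter
 * is reset, leaving every other bit as it was read. */
uint32_t pmcr_enable_and_reset(uint32_t old_pmcr)
{
    return old_pmcr | PMCR_E | PMCR_C;
}
```

PMCR_E | PMCR_C evaluates to 5, the constant used above. Applied to the PMCR value from the question, pmcr_enable_and_reset(0x41072000) yields 0x41072005.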
answered on Stack Overflow May 5, 2017 by PatB

User contributions licensed under CC BY-SA 3.0