OpenMP: Access violation and other errors

Preface

Recently, I added OpenMP to our group's project code. main runs two nested for loops: the outer loop controls the 'run', while the inner loop controls the 'generation'. Generations in different runs are completely independent of each other, though each generation depends on other generations in the same run.

The idea is to parallelize the outer 'run' loop, while letting each thread handle the evolution of generations for whatever run number it was assigned.

The Problem

When setting OMP_THREADS = 1, i.e. letting the program run with only one thread, it runs without a hitch. If this number is any higher, I get the following error:

Unhandled exception at 0x00F5C4C3 in projectc.exe: 0xC0000005: Access violation writing location 0x00000072.

with the following appearing in the "Autos" section of Visual Studio:

[Screenshot: Visual Studio error]

(Note: t, t->active_cells, and t->cellx are "error red" while the rest are white when I get this error)

If I change default(none) to default(shared) in the #pragma right above the outer loop, and remove t, s, and bn from threadprivate (these are structures initialized in external files), then the program runs normally for a generation on each thread before freezing (though CPU activity shows that both threads are still running with the same intensity as before).

Attempts at Solutions

I cannot figure out what is going wrong. A simple #pragma omp parallel for outside the outer loop of course doesn't work, but I have also tried declaring all of main as #pragma omp parallel and the outer loop as #pragma omp for. I tried a few other subtle variations like this as well, which leads me to conclude that the problem must lie in the way the variables are shared between threads. Because all runs, and therefore all threads, are independent, really all of the variables could be set as private, though there is some overlap, which you see reflected in shared(..).

The code is attached below.

main.c

/* General Includes */
#include <stdio.h> 
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <omp.h>

/* Project Includes */
#include "main.h"
#include "randgen.h"
#include "board7.h"
#include "tissue.h"
#include "io.h"

#define BitFlp(arg,posn) ((arg) ^ (1L << (posn)))
#define BitClr(arg,posn) ((arg) & ~(1L << (posn)))

#define display_dbg 1 //Controls whether print statements in main.c are displayed.
#define display_time 1 //Controls whether timing print statements are executed.
#define BILLION 1000000000L

#define num_runs 10 //Controls number of runs per simulation
#define num_gens 4000//Controls number of generations per run

#define OMP_THREADS 1 // Max number of threads used if OpenMP is enabled

int n, i, r, j, z, x, sxa, y, flagb, m;
int j1, j2;
char a;
int max_fit_gen, collect_data, lb_run, w, rn, sx;
float f, max_fitness;
tissuen *fx;
input_vec dx;
calookup ra;

#pragma omp threadprivate(n, r, j, x, z, sxa, y, flagb, m, \
        j1, j2, a, max_fit_gen, collect_data, lb_run, w, \
    rn, sx, f, max_fitness, fx, dx, ra, run_data, t, s, bn)

int main(int argc, char *argv[])
{

    int* p = 0x00000000;   // pointer to NULL
    char sa[256];
    char ss[10];
    long randn;
    boardtable ba;
    srand((unsigned)time(NULL));
    init_mm();
    randn = number_range(1, 100);

    #ifdef OS_WINDOWS
    // Timing parameters
    LARGE_INTEGER clk_freq;
    LARGE_INTEGER t1, t2, t3;
    #endif

    #ifdef OS_UNIX
    struct timespec clk_freq, t1, t2, t3;
    #endif

    double avg_gen_time, avg_run_time, run_time, sim_time, est_run_time, est_sim_time;

    // File System and IO Parameters
    char cwd[FILENAME_MAX];
    getcwd(cwd, sizeof(cwd));
    char curState[FILENAME_MAX];
    char recState[FILENAME_MAX];
    char recMode[FILENAME_MAX];
    char curGen[FILENAME_MAX];
    char curRun[FILENAME_MAX];
    char genTmp[FILENAME_MAX];

    strcpy(curState, cwd);
    strcpy(recState, cwd);
    strcpy(recMode, cwd);
    strcpy(curGen, cwd);
    strcpy(curRun, cwd);
    strcpy(genTmp, cwd);

    #ifdef OS_WINDOWS
    strcat(curState, "\\current.txt");
    strcat(recState, "\\recover.txt");
    strcat(recMode, "\\recovermode.txt");
    strcat(curGen, "\\gen.txt");
    strcat(curRun, "\\run");
    strcat(genTmp, "\\tmp\\gentmp");
    #endif

    #ifdef OS_UNIX
    strcat(curState, "/current.txt");
    strcat(recState, "/recover.txt");
    strcat(recMode, "/recovermode.txt");
    strcat(curGen, "/gen.txt");
    strcat(curRun, "/run");
    strcat(genTmp, "/tmp/gentmp");
    #endif

    //Read current EA run variables (i.e. current run number, generation, recover mode status)
    z = readorcreate(curState);
    x = readorcreate(recState);
    sxa = readorcreate(recMode);
    y = readorcreate(curGen);

    //Initialize simulation parameters
    s.count = 0;

    s.x[0] = 0;
    s.y[0] = 0;

    s.addvec[0] = 0;

    s.bestnum = 0;
    s.countb = 0;
    s.count = 0;
    initialize_sim_param(&s, 0, 200);

    collect_data = 0;

    //Build a collection of experiment initial conditions
    buildboardcollection7(&bn);

    //Determine clock frequency.
    #ifdef OS_WINDOWS
    if (display_time)   get_frequency(&clk_freq);
    #endif

    #ifdef OS_UNIX
    if (display_time)   get_frequency(CLOCK_REALTIME, &clk_freq);
    #endif


//Start simulation timer
    #ifdef OS_WINDOWS
    if (display_time)   read_clock(&t1);
    #endif

    #ifdef OS_UNIX
    if (display_time)   read_clock(CLOCK_REALTIME, &t1);
    #endif

#pragma omp parallel for schedule(static) default(none) num_threads(OMP_THREADS) \
            private(sa, ss, randn, ba, t2, t3, avg_gen_time, avg_run_time, sim_time, \
        run_time, est_run_time, est_sim_time) \
            shared(i, cwd, recMode, curRun, curGen, curState, genTmp, clk_freq, t1)
for (i = z; i < num_runs; i++)
{


    // randomly initialize content of  tissue population
    initialize_tissue_pop_s2(&(t.tgen[0]), &s);
    initialize_tissue_pop_s2(&(t.tgen[1]), &s);

    max_fit_gen = 0;
    max_fitness = 0.0;
    flagb = 0;

    if ((i == z) && (x == 1))
    {
        w = y;
    }
    else
    {
        w = 0;
    }

    rn = 200;
    j1 = 0;

    s.run_num = i;
    s.maxfitness = 0.0;

    //Start run timer
        #ifdef OS_WINDOWS
    if (display_time)   read_clock(&t2);
        #endif

        #ifdef OS_UNIX
    if (display_time)   read_clock(CLOCK_REALTIME, &t2);
        #endif

        #if defined(_OPENMP)
    printf("\n ======================================= \n");
    printf("  OpenMP Status Message \n");
    printf("\n --------------------------------------- \n");
    printf("| RUN %d : \n", i);
    printf("|   New Thread Process (Thread %d) \n", omp_get_thread_num());
    printf("|   Available Threads: %d of %d \n", omp_get_num_threads(), omp_get_max_threads());
    printf(" ======================================= \n\n");
        #endif

    for (j = w; j < num_gens; j++)
    {

        // Flips on lightboard data collection. See board7.h.
        if (enable_collection == 1) {
            if ((i >= run_collect) && (j >= gen_collect)) { collect_data = 1; }
        }

        sx = readcurrent(recMode);

        // Pseudo loop code. Uses bit flipping to cycle through boards.
        j2 = ~(j1)& 1;
        if (display_dbg)    printf("start evaluation...\n");

        // evaluate tissue
        // Most of the problems in the code happen here.
        evaluatepopulation_tissueb(&(t.tgen[j1]), &ra, &bn, &s, j, i);
        if (display_dbg)    printf("\n");

        // display fitness stats to screen
        printmaxfitness(&(t.tgen[j1]), i, j, j1, &cwd);

        if (display_dbg)    printf("start tournament...\n");

        // Perform tournament selection and have children ready for evaluation
        // Rarely have to touch. Figure out best parents. Crossover operator.
        // Create a subgroup. Randomly pick individuals from the population.
        // Pick fittest individuals out of the random group.
        // 2 parents and 2 children. Children replace parents.
        tournamentsel_tissueb(&(t.tgen[j1]), &(t.tgen[j2]), &s);

        printf("Tournament selection complete.\n");

        // keep track of best fitness during run
        if (t.tgen[j1].fit_max > max_fitness)
        {
            max_fitness = t.tgen[j1].fit_max;
            max_fit_gen = j;
        }

        if ((t.tgen[j1].fit_max > 99.0) && (flagb == 0))
        {
            flagb = 1;
            run_data.fit90[i] = t.tgen[j1].fit_max;
            run_data.gen90[i] = j;
        }

        sa[0] = 0;
        strcat(sa, curRun);
        sprintf(ss, "%d", i);
        strcat(sa, ss);
        strcat(sa, ".txt");

        printf("Write fitness epc...\n");

        // write fitness stats to file
        writefitnessepc(sa, &(t), j1, j);

        printf("Write fitness complete.\n");

        // trunk for saving population to disk
        if (sx != 0)
        {
            sa[0] = 0;
            strcat(sa, genTmp);
            sprintf(ss, "%d", 1);
            strcat(sa, ss);
            strcat(sa, ".txt");

            if (display_dbg)    printf("Saving Current Run\n");
        }

        //update current generation to file
        writecurrent(curGen, j + 1);

        if (display_time && j > 0 && (j % 10 == 0 || j % (num_gens - 1) == 0))
        {
            #ifdef OS_WINDOWS
            read_clock(&t3);
            sim_time = (t3.QuadPart - t1.QuadPart) / clk_freq.QuadPart;
            run_time = (t3.QuadPart - t2.QuadPart) / clk_freq.QuadPart;
            #endif

            #ifdef OS_UNIX
            read_clock(CLOCK_REALTIME, &t3);
            sim_time = (double)(t3.tv_sec - t1.tv_sec);
            run_time = (double)(t3.tv_sec - t2.tv_sec);
            #endif

            avg_gen_time = run_time / (j + 1);
            est_run_time = avg_gen_time * (num_gens - j);
            avg_run_time = est_run_time + run_time;
            est_sim_time = (est_run_time * (num_runs - i)) / (i + 1);
            printf("\n============= Timing Data =============\n");
            printf("Time in Simulation: %.2fs\n", sim_time);
            printf("Time in Run: %.2fs\n", run_time);
            printf("Est. Time to Complete Run: %.2fs\n", est_run_time);
            printf("Est. Time to Complete Simulation: %.2fs\n\n", est_sim_time);
            printf("Average Time Per Generation: %.2fs/gen\n", avg_gen_time);
            printf("Average Time Per Run: %.2fs/run\n", avg_run_time);
            printf("=======================================\n\n");

            if (j % (num_gens - 1) == 0) {

            }
        }

            //Display Position Board
            //displayboardl(&bn.board[0]);

            j1 = j2;
        }
    }
}

Structures

typedef struct boardcollectionn
{
    boardtable board[boardnumb];

} boardcollection;

boardcollection bn;

typedef struct tissue_gent
{
    tissue_population tgen[2]; 

} tissue_genx;

typedef struct sim_paramt   //struct for storing simulation parameters
{
int penalty;
int addnum[cell_numz];
int x[9];
int y[9];
uint8_t addvec[9];
uint8_t parenta[50];
uint8_t parentb[50];
int errorstatus;
int ones[outputnum][5000];
int zeros[outputnum][5000];
int probcount;
int num;
int numb;
int numc;
int numd;
int nume;
int numf;
int bestnum;
int count;
int col_flag;
int behaviour[outputnum];
int memm[4];
int sel;
int seldecnum;
int seldec[200];
int selx[200];
int sely[200];
int selz[200];
int countb;
float maxfitness;
float oldmaxfitness;
int run_num;
int collision;

} sim_param;

tissue_genx t;
sim_param s;

Tags: c, visual-studio, parallel-processing, openmp
asked on Stack Overflow Oct 7, 2015 by Daniel R. Livingston • edited Jan 17, 2016 by halfer

1 Answer

The code is too big to test properly, and the use of global variables really doesn't help in figuring out the data dependencies. However, I can make a few remarks:

  • i is declared shared even though it is the index of the parallelised loop. This is wrong! If there is one variable that you really want to be private in an omp for loop, it is the loop index. I didn't find anything clear about that in the OpenMP standard for C and C++, whereas for Fortran, the loop index (and those of all enclosed loops) is implicitly privatised. Nonetheless, the Intel compiler gives an error when you attempt to explicitly declare such an index shared:

    sharedi.cc(11): warning #2555: static control variable for parallel loop
          for ( i=0; i<10; i++ ) {
                           ^
    sharedi.cc(10): error: index variable "i" of for statement following an OpenMP for pragma must be private
          #pragma omp parallel for shared(i) schedule(static)
          ^
    compilation aborted for sharedi.cc (code 2)
    

    Meanwhile, gcc version 5.1.0 doesn't emit any warning or error for the same code, and acts as if the variable had been declared private. I tend to find the Intel compiler's behaviour more reasonable, but I'm not 100% sure which one is correct. What I do know, however, is that declaring i shared is definitely a very bad idea (and even a bug, as far as I'm concerned). So I feel like this is a grey area where your compiler may or may not do something sensible, which could by itself explain most of your problems.

  • You seem to write your data to files whose names might conflict across threads. Be careful with that, as you might end up with a big mess.

  • Your printing is very likely to be garbled. I don't know how much importance you attach to that, but the output won't be pretty the way it is written now.

In summary, your code is just too tangled for me to get a clear view of what's happening. Try to address at least the first two points I mentioned; that might be sufficient to get it to "work". However, I can't encourage you enough to clean the code up and to get rid of your global variables. Likewise, try to declare your variables as late in the source as possible, since this reduces the need to declare them private for OpenMP and greatly improves readability.

Good luck with your debugging.

answered on Stack Overflow Oct 7, 2015 by Gilles

User contributions licensed under CC BY-SA 3.0