Channel: LabWindows/CVI topics

Multithreading and partitioned shared memory


Hi All,

 

I'm having no success with this (simple?) multithreading problem on my Core i7 processor, using CVI 9.0 (32-bit compiler).

 

In the code snippets below, each node is a structure of five unsigned integers. I call calloc() 32 times to allocate 32 blocks of 128*128 (16K) nodes each, and store the returned pointers in a global array.

Node size = 20 bytes, block size = approx. 328 KB (128*128*20 = 327,680 bytes), total allocated size = approx. 10.5 MB (32 blocks).
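
The figures above can be double-checked with a few lines of C. This is only a minimal sketch: `NodeSketch`, `block_bytes()` and `total_bytes()` are names made up for illustration, and it assumes `unsigned int` is 4 bytes, as it is with the 32-bit compiler mentioned in this post.

```c
#include <stddef.h>

/* Same layout as the Nodes struct in this post: five unsigned ints.
   Assumption: unsigned int is 4 bytes, so there is no internal padding. */
typedef struct {
    unsigned int a, b, c, d, e;
} NodeSketch;

enum { NODES_PER_BLOCK = 128 * 128, BLOCK_COUNT = 32 };

/* Bytes in one block of 128*128 nodes. */
size_t block_bytes(void)
{
    return (size_t)NODES_PER_BLOCK * sizeof(NodeSketch);
}

/* Bytes across all 32 blocks. */
size_t total_bytes(void)
{
    return (size_t)BLOCK_COUNT * block_bytes();
}
```

With 4-byte integers this gives 20 bytes per node, 327,680 bytes per block and 10,485,760 bytes in total. That is exactly 10 MiB, or roughly 10.5 MB in decimal units.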

 

I then spawn 32 threads, each of which is passed a unique index into the "node_space" pointer array (see code below), so that each thread reads and writes its own separate 16K block of nodes.

 

This should be thread safe, and I expected the speed to scale with the number of threads, because each thread addresses a different memory block with no overlap. Instead, the multithreaded version runs no faster (maybe slightly) than a single thread.

 

I've tried various thread pool sizes and padding the nodes to 16-byte and 64-byte boundaries, all to no avail.
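
For reference, one portable way to pad a 20-byte node out to 64 bytes is a union with a byte array. This is a sketch of the general idea, not necessarily the padding used in the tests described above. `PaddedNode`, `CACHE_LINE_BYTES` and `padded_node_size()` are hypothetical names, and 64 bytes is assumed as the cache-line size.

```c
#include <stddef.h>

/* Same five-integer node as in this post. */
typedef struct {
    unsigned int a, b, c, d, e;
} Nodes;

#define CACHE_LINE_BYTES 64   /* assumed cache-line size */

/* The union is as large as its largest member, so every PaddedNode
   occupies exactly 64 bytes and array elements sit 64 bytes apart. */
typedef union {
    Nodes         node;
    unsigned char pad[CACHE_LINE_BYTES];
} PaddedNode;

size_t padded_node_size(void)
{
    return sizeof(PaddedNode);
}
```

Padding the element size only fixes the spacing between nodes. Whether each node actually starts on a 64-byte boundary still depends on the address returned by calloc(), which this sketch does not control.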

 

Is this a memory bandwidth problem due to the size of the arrays? Does each thread somehow load all 32 blocks? Any help appreciated.

 

struct Nodes
   {
   unsigned int a;
   unsigned int b;
   unsigned int c;
   unsigned int d;
   unsigned int e;
   };
typedef struct Nodes  Nodes;
typedef Nodes        *Node_Ptr;

 

Node_Ptr            node_space[32];          /* pointer array into 32 separate blocks ( loaded via individual calloc calls for each block) */
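
The allocation code is not shown in this post, so the following is only a sketch of what filling `node_space` with 32 separate calloc() calls might look like. `allocate_node_space()` and `free_node_space()` are hypothetical helper names, and the two `#define` constants are introduced here for readability.

```c
#include <stdlib.h>

#define BLOCK_COUNT      32
#define NODES_PER_BLOCK  (128 * 128)

typedef struct Nodes {
    unsigned int a, b, c, d, e;
} Nodes;
typedef Nodes *Node_Ptr;

Node_Ptr node_space[BLOCK_COUNT];   /* one pointer per block */

/* Release every block; safe to call on a partially filled array
   because free(NULL) is a no-op. */
void free_node_space(void)
{
    int i;
    for (i = 0; i < BLOCK_COUNT; ++i) {
        free(node_space[i]);
        node_space[i] = NULL;
    }
}

/* Allocate 32 zero-initialised blocks, one calloc() call per block.
   Returns 0 on success, -1 if any allocation fails. */
int allocate_node_space(void)
{
    int i;
    for (i = 0; i < BLOCK_COUNT; ++i) {
        node_space[i] = calloc(NODES_PER_BLOCK, sizeof(Nodes));
        if (node_space[i] == NULL) {
            free_node_space();
            return -1;
        }
    }
    return 0;
}
```

Because each block comes from its own calloc() call, the blocks do not overlap, but nothing here guarantees how they are placed relative to each other in memory.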

 

.... Thread Spawning  ....

         /* each call passes the address of the loop counter as the thread function's data */
         for (index = 0; index < 32; ++index)
             CmtScheduleThreadPoolFunction(my_thread_pool_handle, My_Thread_Function, &index, NULL);
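
One thing worth checking in the loop above: every scheduled function receives the address of the same `index` variable, so the value a thread reads depends on when it starts running. The sketch below illustrates this without real threads. `schedule()` and `run_all()` are a hypothetical stand-in for a thread pool, not the CVI API, and they simply defer every task until the scheduling loop has finished. In real threaded code the outcome is timing dependent; the deferred queue just makes the worst case repeatable.

```c
#include <stddef.h>

#define TASK_COUNT 32

typedef int (*TaskFn)(void *data);

/* Hypothetical stand-in for a thread pool: tasks are recorded and run later. */
static TaskFn queued_fn[TASK_COUNT];
static void  *queued_data[TASK_COUNT];
static int    queued_count;

static int seen[TASK_COUNT];   /* seen[k] = number of tasks that read block index k */

static void schedule(TaskFn fn, void *data)
{
    if (queued_count < TASK_COUNT) {
        queued_fn[queued_count]   = fn;
        queued_data[queued_count] = data;
        ++queued_count;
    }
}

static void run_all(void)
{
    int i;
    for (i = 0; i < queued_count; ++i)
        queued_fn[i](queued_data[i]);
    queued_count = 0;
}

/* Task body: read the block index through the pointer, as a thread function would. */
static int record_index(void *data)
{
    int block = *(int *)data;
    if (block >= 0 && block < TASK_COUNT)
        ++seen[block];
    return 0;
}

static void reset_seen(void)
{
    int i;
    for (i = 0; i < TASK_COUNT; ++i)
        seen[i] = 0;
}

static int count_distinct(void)
{
    int i, distinct = 0;
    for (i = 0; i < TASK_COUNT; ++i)
        if (seen[i] > 0)
            ++distinct;
    return distinct;
}

/* Same pattern as the loop in this post: every task gets &index. */
int distinct_with_shared_index(void)
{
    int index;
    reset_seen();
    for (index = 0; index < TASK_COUNT; ++index)
        schedule(record_index, &index);
    run_all();   /* index is already TASK_COUNT here, so no task sees a valid block */
    return count_distinct();
}

/* Alternative: each task gets the address of its own, stable array slot. */
int distinct_with_stable_indices(void)
{
    static int task_index[TASK_COUNT];
    int i;
    reset_seen();
    for (i = 0; i < TASK_COUNT; ++i) {
        task_index[i] = i;
        schedule(record_index, &task_index[i]);
    }
    run_all();
    return count_distinct();
}
```

In this deferred model the shared-counter version reaches 0 distinct valid blocks, while the per-task array version reaches all 32. Applied to the loop above, the equivalent change would be an `int` array of 32 indices that outlives the threads, passing the address of one element to each CmtScheduleThreadPoolFunction call. Whether this accounts for the timing described in this post is not certain, but it would rule out several threads working on the same block.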

 

