Re: thread static

Stefan Monnier
Mon, 21 Aug 1995 08:31:19 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: Stefan Monnier
Keywords: parallel, C, comment
Organization: Ecole Polytechnique Federale de Lausanne
References: 95-08-078 95-08-128
Date: Mon, 21 Aug 1995 08:31:19 GMT

In article 95-08-128,
Michael McNamara wrote:
] I recognize that one can dedicate a register to hold one's
] thread number, thus avoid the os call; but then consider the cost of
] removing a register from register allocator's pool.

Registers are not *that* scarce!

] Moreover, one still incurs the cost of the array index to get
] one's own foo, and potentially the false cache line sharing problem
] if one packs the array of thread local data in a data major order,
] rather than a thread major order.

Oh, come on! You wouldn't keep a thread number in your register, but
a thread-object pointer, with all the thread-local objects being part of
the thread object. So you don't need your array (I hate arrays, because I
fear you might set an arbitrary limit on the number of threads just
so that you can statically allocate your array). Also don't forget
that the extra register doesn't always have to be used: it would just be
an additional parameter to the main function of the thread and would
only be passed on to the functions that need it.
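
To make the thread-object idea concrete, here is a minimal sketch in C
with POSIX threads. All names (struct thread_obj, bump, square, worker,
run_workers) are made up for illustration, and using pthreads is an
assumption, not something the scheme depends on. Each thread object is
allocated separately, so no array fixes the number of threads, and only
the function that needs thread-local state receives the pointer.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical thread object: each thread-local variable is a field. */
struct thread_obj {
    pthread_t tid;            /* bookkeeping for the creator */
    struct thread_obj *next;  /* linked list, so no fixed-size array */
    long counter;             /* a "thread-local" variable */
};

/* Needs thread-local state, so it takes the thread-object pointer. */
static void bump(struct thread_obj *self, long n)
{
    self->counter += n;
}

/* Needs no thread-local state, so it gets no extra parameter. */
static long square(long x)
{
    return x * x;
}

/* Main function of a thread: the thread object is its one argument. */
static void *worker(void *arg)
{
    struct thread_obj *self = arg;
    for (long i = 1; i <= 3; i++)
        bump(self, square(i));  /* adds 1 + 4 + 9 */
    return NULL;
}

/* Start up to nthreads workers, join them, and sum their counters. */
long run_workers(int nthreads)
{
    struct thread_obj *head = NULL;
    long total = 0;

    for (int i = 0; i < nthreads; i++) {
        struct thread_obj *t = calloc(1, sizeof *t);
        if (t == NULL)
            break;
        if (pthread_create(&t->tid, NULL, worker, t) != 0) {
            free(t);
            break;
        }
        t->next = head;
        head = t;
    }
    while (head != NULL) {
        struct thread_obj *t = head;
        head = t->next;
        pthread_join(t->tid, NULL);
        total += t->counter;
        free(t);
    }
    return total;
}
```

Whether a compiler would actually keep the pointer in a register is up to
its register allocator; the sketch only shows that the pointer is an
ordinary parameter and that square() never sees it.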

And don't forget a few details with your scheme:
- taking the address of a thread-local variable has to be done
    carefully, since this address cannot be passed to another thread
    (well, it can, but there it refers to the receiving thread's own copy
    of the variable. Very subtle bugs expected!)
- thread creation/destruction and context switches have to go through the
    kernel: this imposes a minimum weight on your threads. But that's OK, since
    most thread packages need to go into the kernel anyway in order to set up
    the extensible stack.
- going through the kernel is one thing, but your scheme also requires
    changing the page table at every context switch (and thread
    creation, etc.). This can be expensive, especially if it requires
    some cache flushes. These threads are looking real fat!
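
For contrast with the first point in the list, here is a small sketch of
how addresses behave under the thread-object approach. The names
(struct owner_obj, helper, share_address) are hypothetical and POSIX
threads are assumed. Since the thread-local lives at an ordinary address
in the shared address space, a pointer to it handed to another thread
still designates the owner's variable, not a per-thread copy mapped at
the same address.

```c
#include <pthread.h>

/* Hypothetical thread object of the owning thread. */
struct owner_obj {
    long counter;  /* a "thread-local" variable of the owner */
};

/* Runs in a second thread and writes through the pointer it was given. */
static void *helper(void *arg)
{
    long *p = arg;  /* address of the owner's thread-local */
    *p = 42;
    return NULL;
}

/* Hands the address to another thread; returns the value seen afterwards. */
long share_address(void)
{
    struct owner_obj self = { 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, helper, &self.counter) != 0)
        return -1;
    pthread_join(tid, NULL);  /* join orders the helper's write before the read */
    return self.counter;
}
```

With per-thread page mappings as described in the list above, the same
store in helper() would land in the helper's own copy and the owner would
still read 0.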

I'm not saying fat threads are bad, but your neat trick can make your
system slower than one using an additional register that points to
threadlocals, depending on the grain of the parallelism.

[It's certainly true that if all threads share the same address
space, it's possible to switch threads without a kernel context
switch, which can be a performance boon in some cases. -John]
