Re: Instrumenting multithreaded applications

nmm1@cus.cam.ac.uk (Nick Maclaren)
25 Jan 2003 00:43:47 -0500

          From comp.compilers

From: nmm1@cus.cam.ac.uk (Nick Maclaren)
Newsgroups: comp.compilers
Date: 25 Jan 2003 00:43:47 -0500
Organization: University of Cambridge, England
References: 03-01-118
Keywords: performance, testing, parallel
Posted-Date: 25 Jan 2003 00:43:47 EST

Matthew Legendre <legendre@u.arizona.edu> writes:
|> Does anyone have any experience with or know of any work related to
|> instrumenting a multithreaded application?
|>
|> We've got a situation where we want to insert instrumentation into a
|> multithreaded application (for profiling purposes). We need to use some
|> form of mutual exclusion since the instrumentation writes to a global data
|> structure.


That is standard technology, if not easy to do efficiently and
portably.


|> The problem is that we may insert instrumentation into a signal handler.
|> So a thread may enter the instrumentation code and open the appropriate
|> lock. A signal then fires and transfers control to the signal handler.
|> The thread then re-enters the instrumentation and deadlocks on a lock it
|> already owns.


You are probably in DEAD trouble. See below.


|> I've already seen a paper by people on the Paradyn project at the Univ. of
|> Wisconsin called 'Dynamic Instrumentation of Thread Applications' in which
|> they suggest creating a locking mechanism that won't let a thread block on
|> or enter a lock that it already has open. Unfortunately this involves
|> knowledge of the underlying threading package, and I'd like to avoid
|> opening that can of worms.


You have just opened a much worse one, I am afraid.
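
For what it is worth, the self-deadlock on its own can be sidestepped without any knowledge of the threading package, simply by never waiting inside the instrumentation. The following is a minimal sketch in C11 with POSIX signals, and is not the Paradyn mechanism. All the names (probe_busy, probe_record, demo_reentry) are hypothetical, and it assumes that dropping an occasional sample is acceptable. It relies on the C11 rule that operations on lock-free atomic objects may be used in a signal handler, and it does nothing about the deeper problems with signals.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* The sketch assumes atomic_long is always lock-free on the target. */
static_assert(ATOMIC_LONG_LOCK_FREE == 2, "atomic_long must be lock-free");

/* Hypothetical instrumentation state; none of these names come from the post. */
static atomic_flag probe_busy = ATOMIC_FLAG_INIT; /* guard for the shared data */
static atomic_long event_count;                   /* stands in for the global structure */
static atomic_long dropped_count;                 /* probes skipped because the guard was held */

/* Record one event without ever blocking. Returns 1 if recorded, 0 if dropped. */
static int probe_record(void)
{
    if (atomic_flag_test_and_set(&probe_busy)) {
        /* Guard already held, possibly by this very thread: do not wait. */
        atomic_fetch_add(&dropped_count, 1);
        return 0;
    }
    /* A real table update would go here. ISO C only defines handler access
       to lock-free atomics and volatile sig_atomic_t objects, so anything
       richer than this counter depends on the implementation. */
    atomic_fetch_add(&event_count, 1);
    atomic_flag_clear(&probe_busy);
    return 1;
}

/* Instrumented signal handler. */
static void on_signal(int sig)
{
    (void)sig;
    probe_record();
}

/* Simulates a signal arriving while the thread is inside the instrumentation. */
static void demo_reentry(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    atomic_flag_test_and_set(&probe_busy); /* thread enters the instrumentation */
    raise(SIGUSR1);                        /* handler finds the guard held and drops */
    atomic_flag_clear(&probe_busy);        /* thread leaves the instrumentation */
    raise(SIGUSR1);                        /* handler finds the guard free and records */
}
```

Calling demo_reentry() once leaves one dropped probe and one recorded event. The design choice is to trade completeness of the profile for the guarantee that the handler never waits on a guard its own thread holds.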


To avoid writing an essay, I won't go into the history or into the
technical problems in any depth, but the executive summary is that
most standard languages and most current implementations do not
support the handling of 'real' signals at all. Even worse, this is
often not just because the implementation is missing a few tricks or
has bugs, but can sometimes be due to deep underlying problems with
the operating system or hardware.


The point here is that the run-time system needs to be able to trap
all exceptions in such a way that it can restore the language's
environment with all relevant global data to a clean, well-defined
state before calling the handler. And then it needs to be able to
return transparently, except for changes to the global data. At best,
this is very hard to achieve.


One common myth is that trappability is a property of the signal or
exception, but it rarely is - it is a property of the code
being executed and the program state at the time the trap is taken.
Now, POSIX attempts to resolve this by providing controllable blocking
of signals, but that is not a good approach for most applications (it
is more appropriate for codes like device drivers), and it doesn't
help anyway. The problem cannot be resolved by ANY library, as it
is at a different level.
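
To make the POSIX mechanism concrete, here is a minimal sketch of a probe that blocks a signal around its critical section. The names (table_lock, table_hits, demo_blocked_delivery) are hypothetical, and it assumes that SIGUSR1 is the only signal with an instrumented handler; a real tool would have to mask every such signal. It shows the mechanism only: delivery is postponed until the mask is restored, at the price of two pthread_sigmask calls per probe, and nothing is done for synchronous exceptions raised by the code itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical instrumentation state; none of these names come from the post. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static long table_hits;                    /* stands in for the global structure */
static volatile sig_atomic_t handler_runs; /* set to 1 once the handler has run */

static void on_usr1(int sig)
{
    (void)sig;
    handler_runs = 1;
}

/* Runs one probe with SIGUSR1 blocked and raises the signal in the middle
   of the critical section. Returns the value of handler_runs as seen
   inside the critical section. */
static int demo_blocked_delivery(void)
{
    struct sigaction sa = {0};
    sigset_t blocked, saved;
    int ran_inside;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    sigemptyset(&blocked);
    sigaddset(&blocked, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &blocked, &saved); /* enter: mask the signal */

    pthread_mutex_lock(&table_lock);
    table_hits++;
    raise(SIGUSR1);            /* the signal stays pending while it is masked */
    ran_inside = handler_runs; /* still 0: the handler has not run yet */
    pthread_mutex_unlock(&table_lock);

    pthread_sigmask(SIG_SETMASK, &saved, NULL); /* leave: pending signal is delivered */
    return ran_inside;
}
```

On a conforming system demo_blocked_delivery() returns 0, and handler_runs becomes 1 only once the mask has been restored. Older toolchains may need the program to be linked with -pthread.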


What is needed is a COMPILER facility to synchronise the compiled
code's data (registers, floating-point flags etc.) with the abstract
model so that clean trapping is possible. And few languages have
such a feature. Most standards say or imply that the occurrence of
any real signal or exception is undefined behaviour, which
implementation vendors usually interpret to mean that they need not
handle all such cases, document precisely what is supported, or even
accept bug reports.


Of course, this can be done by explicit constructions in the program,
by constructions automatically inserted at suitable points, or by a
combination of the two. Just
like any other feature of compiled code. But, without it, you are
wrestling with dragons deep in a pool of mud.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

