Re: Machine Learning in Optimization ?

Björn Franke <bfranke@inf.ed.ac.uk>
Tue, 10 Jun 2008 16:51:14 +0100

          From comp.compilers

Related articles
Machine Learning in Optimization ? linuxkaffee@gmx.net (Stephan Ceram) (2008-06-02)
Re: Machine Learning in Optimization ? bfranke@inf.ed.ac.uk (Björn Franke) (2008-06-04)
Re: Machine Learning in Optimization ? linuxkaffee@gmx.net (Stephan Ceram) (2008-06-09)
Re: Machine Learning in Optimization ? bfranke@inf.ed.ac.uk (Björn Franke) (2008-06-10)

From: Björn Franke <bfranke@inf.ed.ac.uk>
Newsgroups: comp.compilers
Date: Tue, 10 Jun 2008 16:51:14 +0100
Organization: Edinburgh University
References: 08-06-003 08-06-007 08-06-019
Keywords: optimize
Posted-Date: 14 Jun 2008 08:49:30 EDT

Hi Stephan,


> I'm still interested in two other issues: 1) how many lines of code
> (or benchmarks) were required in your paper to achieve good results
> for machine learning, i.e. how much code had to be passed through
> the training phase before the learned database provided reliable
> results for the classification of new benchmarks?


For this particular paper we have only considered fairly small
programs. In fact, these programs are more like individual
compute-intensive kernels from larger applications. Hence, a
comparatively small number of training examples (a few dozen or so)
sufficed. Please also keep in mind that all training and test
programs were taken from the same application domain, i.e. digital
signal processing. We haven't evaluated whether the training results
are transferable to a different domain. More important than the
number of programs has been the number of training sample points per
program, i.e. the number of different transformation sequences
applied to each program. Here we have generated and measured several
thousand to ten thousand versions of each program to collect the
training data.
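
To give you a concrete picture, here is a rough Python sketch of what
collecting those per-program sample points can look like. This is not
our actual harness; the flag pool, the gcc invocation and the sample
count are placeholders for whatever transformations your compiler
exposes.

    import random
    import subprocess
    import time

    # Placeholder pool of optimisation flags; substitute the
    # transformations your compiler actually exposes.
    FLAG_POOL = ["-funroll-loops", "-ftree-vectorize",
                 "-fomit-frame-pointer", "-finline-functions"]

    def measure(binary):
        """Time one run of the compiled benchmark (wall clock)."""
        start = time.perf_counter()
        subprocess.run([binary], check=True)
        return time.perf_counter() - start

    def collect_samples(source, n_samples=5000):
        """Build and time n_samples random flag combinations for one
        benchmark; each (flags, runtime) pair is one training point."""
        samples = []
        for _ in range(n_samples):
            flags = random.sample(FLAG_POOL,
                                  random.randint(1, len(FLAG_POOL)))
            subprocess.run(["gcc", "-O2", *flags, source, "-o", "bench"],
                           check=True)
            samples.append((tuple(flags), measure("./bench")))
        return samples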


More recently, I have looked into using "dynamic program features"
such as instruction frequencies and some micro-architectural event
counters (cache hits/misses etc.) to estimate the execution time of a
new program after some training on a set of benchmarks. In the course
of this work I have also looked into how much training is needed and
what features should be considered. Here is the reference to this
specific paper:


Björn Franke.
Fast Cycle-Approximate Instruction Set Simulation.
Proceedings of the Workshop on Software & Compilers for Embedded Systems
(SCOPES 2008), March 2008, Munich, Germany.
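
The core of that approach is a regression from profiled feature
counts to cycle counts. Purely as an illustration of the idea (the
model in the paper may differ), a plain least-squares version could
look like this:

    import numpy as np

    def fit_cycle_model(X, y):
        """Least-squares fit of cycles ~ w . features. Each row of X
        holds one training run's dynamic features (instruction
        frequencies, cache misses, ...); y holds measured cycles."""
        w, *_ = np.linalg.lstsq(X, y, rcond=None)
        return w

    def predict_cycles(w, features):
        """Estimate the cycle count of a new program from its
        profiled feature vector."""
        return float(features @ w)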


> 2) The selection of static features that describe a particular C
> construct (as is used in your paper to learn "good" optimizations for
> different classes of programs) seems to be a crucial part. Can you
> give me some advice on how such "good" features for learning can
> be found? Is it promising to specify as many static features of
> the C programs in the training phase as possible (intuitively, more
> features should allow one to describe a program more precisely), or
> might too large a feature set have a negative impact on machine
> learning aimed at pattern recognition?


Some of my colleagues are looking into exactly this problem. Maybe these
two papers answer your questions:


* Enabling dynamic selection of good optimization passes in MILEPOST GCC.
  Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru,
  Elad Yom-Tov, Ayal Zaks, Bilha Mendelson, Edwin Bonilla, John Thomson,
  Hugh Leather, Chris Williams, Michael O'Boyle, Phil Barnard, Elton
  Ashton, Eric Courtois and Francois Bodin.
  Accepted for GCC Summit 2008.


* Automatic Feature Generation for Setting Compilers Heuristics.
  Hugh Leather, Elad Yom-Tov, Mircea Namolaru and Ari Freund.
  SMART Workshop, January 2008.


You can download these papers from here:


http://homepages.inf.ed.ac.uk/hleather/index.xhtml
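
On your intuition that more features are always better: a very large
feature set can in fact hurt, because irrelevant features add noise
and the learner needs more training data to cover the enlarged space.
The papers above address this properly; just to illustrate the basic
idea, a naive correlation-based filter that keeps the k features most
correlated with the target could look like this (my own toy sketch,
not taken from those papers):

    import numpy as np

    def top_k_features(X, y, k):
        """Rank features by absolute correlation with the target y and
        return the indices of the k strongest; X has one row per
        program, one column per static feature."""
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
        corr = np.abs(Xc.T @ yc) / np.where(denom == 0.0, 1.0, denom)
        return np.argsort(corr)[::-1][:k]

More sophisticated wrapper methods and automatic feature generation
(the second paper above) usually do better than such a filter, but it
shows why pruning the feature set can pay off.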


Cheers,


        Bjoern


