
How to run TMVA

TMVA is a toolkit for running a multivariate analysis on a ROOT tree. It is included in ROOT from release 5.34/11 on. Among many other methods, it provides boosted decision trees (BDT) and neural networks (MLP). More information can be found in the TMVA documentation.

When using the neural network method MLP, you might need ROOT 5.34.0.0 or newer to have a larger buffer for the XML reader, for example:

. /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.26/x86_64-slc6-gcc48-opt/root/bin/thisroot.sh
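To check whether a given installation meets a version requirement such as the 5.34/11 mentioned above, the version string can be compared numerically. The following is a minimal sketch; the helper names `parse_root_version` and `is_at_least` are made up for illustration, and the version string is hard-coded so that the example runs without ROOT.

```python
def parse_root_version(version):
    """Turn a version string such as '5.34/26' or '5.34.26' into (5, 34, 26)."""
    parts = version.replace("/", ".").split(".")
    return tuple(int(p) for p in parts)


def is_at_least(version, minimum):
    """Compare two version strings numerically, field by field."""
    return parse_root_version(version) >= parse_root_version(minimum)


# With ROOT set up, the running version could be read with:
#   import ROOT
#   current = ROOT.gROOT.GetVersion()
current = "5.34/26"  # hard-coded here so the sketch runs without ROOT
print(is_at_least(current, "5.34/11"))
```

A plain string comparison would not be reliable here, because "5.34/9" sorts after "5.34/11" alphabetically.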

Simple example

To run the MVA in Python on a signal tree treeS and a background tree treeB with a list of variables, varNames, that are available in both trees, you first need to set up the factory:

from ROOT import TFile, TMVA, TCut
# Output file in which TMVA stores its results
f_out = TFile("MVA.root","RECREATE")
TMVA.Tools.Instance()
factory = TMVA.Factory( "TMVAClassification", f_out, "" )
# Declare every input variable as a float ('F')
for name in varNames:
  factory.AddVariable(name,'F')
factory.AddSignalTree(treeS)
factory.AddBackgroundTree(treeB)
# Empty cuts: no selection is applied to either tree
cut_S = TCut("")
cut_B = TCut("")
factory.PrepareTrainingAndTestTree( cut_S, cut_B, "" )

In the empty quotation marks you can add cuts or options. More information can be found in the Factory Class Reference.

Then you can book multiple methods like the BDT and MLP:

factory.BookMethod( TMVA.Types.kBDT, "BDT", "!H:!V" )
factory.BookMethod( TMVA.Types.kBDT, "BDTTuned",
                    "!H:!V:NTrees=2000:MaxDepth=4:BoostType=AdaBoost:"+\
                    "AdaBoostBeta=0.1:SeparationType=GiniIndex:nCuts=80" )
factory.BookMethod( TMVA.Types.kMLP, "MLPTanh",
                    "!H:!V:LearningRate=0.01:NCycles=200:NeuronType=tanh:"+\
                    "VarTransform=N:HiddenLayers=N,N:UseRegulator" )

Finally train, test and evaluate all the booked methods:

factory.TrainAllMethods()
factory.TestAllMethods()
factory.EvaluateAllMethods()
f_out.Close()
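The option strings passed to BookMethod are colon-separated lists, so when a string is split over several lines every fragment has to end or start with a colon, otherwise two options merge into one. One way to avoid this is to build the string from a list. This is a minimal sketch; `tmva_options` is a hypothetical helper, not part of TMVA.

```python
def tmva_options(flags, settings):
    """Join flags and (name, value) pairs into one colon-separated string."""
    parts = list(flags)
    parts += ["%s=%s" % (name, value) for name, value in settings]
    return ":".join(parts)


bdt_opts = tmva_options(
    ["!H", "!V"],
    [("NTrees", 2000), ("MaxDepth", 4),
     ("BoostType", "AdaBoost"), ("AdaBoostBeta", 0.1)],
)
print(bdt_opts)
# The result could then be used when booking a method:
#   factory.BookMethod(TMVA.Types.kBDT, "BDTTuned", bdt_opts)
```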

Parameters to tune

The parameters and options of each MVA method can be tuned away from the default settings for better performance, see the reference page.

For the BDT, important parameters are the learning rate, the number of boost steps and the maximum tree depth:

AdaBoostBeta=0.5: learning rate, smaller (~0.1) is better, but takes longer
NTrees=800: number of boost steps, too large mainly costs time and can cause overtraining
MaxDepth=3: maximum tree depth, ~2-5 depending on interaction of the variables
nCuts=20: grid points in variable range to find the optimal cut in node splitting
MinNodeSize=5%: minimum percentage of training events required in a leaf node
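To compare several BDT settings in one training run, one option is to book one method per parameter combination. The sketch below only generates the method names and option strings; the grid values are illustrative, and `bdt_configurations` is a hypothetical helper, not part of TMVA.

```python
from itertools import product

# Illustrative values only; sensible ranges depend on the analysis.
grid = {
    "AdaBoostBeta": [0.1, 0.5],
    "NTrees": [400, 800],
    "MaxDepth": [2, 3, 4],
}


def bdt_configurations(grid, base="!H:!V:BoostType=AdaBoost"):
    """Yield (method name, option string) for every combination in the grid."""
    names = sorted(grid)
    for values in product(*(grid[n] for n in names)):
        settings = ["%s=%s" % (n, v) for n, v in zip(names, values)]
        title = "BDT_" + "_".join(str(v).replace(".", "p") for v in values)
        yield title, ":".join([base] + settings)


configs = list(bdt_configurations(grid))
print(len(configs))  # 2 * 2 * 3 = 12 combinations
# Each pair could then be booked with:
#   factory.BookMethod(TMVA.Types.kBDT, title, options)
```

Every booked method is trained, so the training time grows with the size of the grid.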

Important MLP parameters to tune are the number of neurons on each hidden layer, learning rate and the activation function.

HiddenLayers=N,N-1: number of nodes in each hidden layer for N variables
- N = one hidden layer with N nodes
- N,N = two hidden layers
- N+2,N = two hidden layers, with N+2 nodes in the first
LearningRate=0.02: learning rate of the network
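To see what a HiddenLayers specification means for a concrete number of input variables, the node counts can be worked out by hand. The sketch below follows the description in the list above and is not TMVA's own parser; `hidden_layer_sizes` is a hypothetical helper, and plain integers are assumed to be allowed as layer sizes.

```python
def hidden_layer_sizes(spec, n_vars):
    """Translate a spec such as 'N+2,N' into node counts for n_vars inputs."""
    sizes = []
    for token in spec.split(","):
        token = token.strip()
        if token[:1] in ("N", "n"):
            offset = token[1:]  # '', '+2' or '-1'
            sizes.append(n_vars + (int(offset) if offset else 0))
        else:
            sizes.append(int(token))  # a plain number of nodes
    return sizes


# Node counts for an example with 5 input variables
for spec in ("N", "N,N", "N+2,N", "N,N-1"):
    print(spec, hidden_layer_sizes(spec, 5))
```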
