mva:mva

Differences

This shows you the differences between two versions of the page.

mva:mva [2016/08/06 12:43] – [Parameters to tune] iwn
mva:mva [2023/06/01 13:29] (current) – [Other information] iwn
Line 8: Line 8:
   * [[https://root.cern.ch/doc/v606/classTMVA_1_1Factory.html|Factory Class Reference]]   * [[https://root.cern.ch/doc/v606/classTMVA_1_1Factory.html|Factory Class Reference]]
   * [[https://root.cern.ch/doc/v606/classTMVA_1_1Reader.html|Reader Class Reference]]   * [[https://root.cern.ch/doc/v606/classTMVA_1_1Reader.html|Reader Class Reference]]
 +  * [[https://root.cern.ch/doc/master/group__tutorial__tmva.html|Official examples (C++)]]
  
 When using the neural network method MLP, you might need ROOT 5.34.0.0 or newer to get a larger buffer for the XML reader, for example: When using the neural network method MLP, you might need ROOT 5.34.0.0 or newer to get a larger buffer for the XML reader, for example:
-<code>+<code bash>
 . /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.26/x86_64-slc6-gcc48-opt/root/bin/thisroot.sh . /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.26/x86_64-slc6-gcc48-opt/root/bin/thisroot.sh
 </code> </code>
- 
  
  
Line 20: Line 20:
 To run the MVA in python on a signal tree ''treeS'' and background tree ''treeB'' with a list of variables, ''varNames'', that are available in the tree, you first need: To run the MVA in python on a signal tree ''treeS'' and background tree ''treeB'' with a list of variables, ''varNames'', that are available in the tree, you first need:
  
-<code>+<code python>
 from ROOT import TFile, TTree, TMVA, TCut from ROOT import TFile, TTree, TMVA, TCut
 f_out = TFile("MVA.root","RECREATE") f_out = TFile("MVA.root","RECREATE")
Line 34: Line 34:
 </code> </code>
  
-In the empty quote marks you can add cuts or options. More information on can be found on the [[https://root.cern.ch/doc/v606/classTMVA_1_1Factory.html|Factory Class Reference]].+In the empty quote marks you can add cuts or options. More information can be found in section 3.1 of the [[http://tmva.sourceforge.net/docu/TMVAUsersGuide.pdf|manual]] and in the [[https://root.cern.ch/doc/v606/classTMVA_1_1Factory.html|Factory Class Reference]].
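
As a minimal sketch of what goes into those quote marks: TMVA option strings are colon-separated lists of ''key=value'' settings and boolean flags, where a leading ''!'' switches a flag off (as in ''"!H:!V"''). The helper ''make_options'' below is hypothetical and not part of TMVA. The option names ''nTrain_Signal'', ''nTrain_Background'', ''SplitMode'', ''NormMode'' and ''V'' are taken from the manual and serve only as an illustration; check the manual for the options each call actually accepts.

```python
def make_options(**settings):
    """Join settings into a TMVA-style option string (hypothetical helper, not a TMVA function)."""
    parts = []
    for key, value in settings.items():
        if value is True:
            parts.append(key)                     # boolean flag switched on, e.g. "V"
        elif value is False:
            parts.append("!" + key)               # boolean flag switched off, e.g. "!V"
        else:
            parts.append("%s=%s" % (key, value))  # ordinary key=value setting
    return ":".join(parts)

# Example values only; 0 training events is assumed to mean "use the default split".
split_options = make_options(nTrain_Signal=0, nTrain_Background=0,
                             SplitMode="Random", NormMode="NumEvents", V=False)
print(split_options)
```

The resulting string can then be passed in place of an empty option string. Building it from keyword arguments keeps long option lists readable and makes it easy to vary one setting at a time.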
  
 Then you can book multiple methods like the BDT and MLP for different parameters: Then you can book multiple methods like the BDT and MLP for different parameters:
  
-<code>+<code python>
 factory.BookMethod( TMVA.Types.kBDT, "BDT", "!H:!V" ) factory.BookMethod( TMVA.Types.kBDT, "BDT", "!H:!V" )
 factory.BookMethod( TMVA.Types.kBDT, "BDTTuned", factory.BookMethod( TMVA.Types.kBDT, "BDTTuned",
Line 50: Line 50:
 Finally train, test and evaluate all the booked methods: Finally train, test and evaluate all the booked methods:
  
-<code>+<code python>
 factory.TrainAllMethods() factory.TrainAllMethods()
 factory.TestAllMethods() factory.TestAllMethods()
Line 56: Line 56:
 f_out.Close() f_out.Close()
 </code> </code>
- 
  
  
Line 63: Line 62:
 The factory will output the weights in an XML file, which you can apply to a tree that contains the same variables. The factory will output the weights in an XML file, which you can apply to a tree that contains the same variables.
  
-<code>+<code python>
 reader = TMVA.Reader() reader = TMVA.Reader()
 vars = [ ] vars = [ ]
Line 69: Line 68:
     vars.append(array('f',[0]))     vars.append(array('f',[0]))
     reader.AddVariable(name,vars[-1])     reader.AddVariable(name,vars[-1])
-reader.BookMVA(TMVAClassification.weights.xml")+reader.BookMVA("TMVAClassification.weights.xml")
     for i in range(len(config.varNames)):     for i in range(len(config.varNames)):
         tree.SetBranchAddress(config.varNames[i],vars[i])         tree.SetBranchAddress(config.varNames[i],vars[i])
Line 89: Line 88:
   * ''MaxDepth=3'': maximum tree depth, ~2-5 depending on interaction of the variables   * ''MaxDepth=3'': maximum tree depth, ~2-5 depending on interaction of the variables
   * ''nCuts=20'': grid points in variable range to find the optimal cut in node splitting   * ''nCuts=20'': grid points in variable range to find the optimal cut in node splitting
-  * ''SeparationType=GiniIndex'': separating criterion at each splitting node to select best variable. The [[https://en.wikipedia.org/wiki/Gini_coefficient|Gini index]] is one measure+  * ''SeparationType=GiniIndex'': separating criterion at each splitting node to select best variable. The [[https://en.wikipedia.org/wiki/Gini_coefficient|Gini index]] is one often used measure
   * ''MinNodeSize=5%'': minimum percentage of training events required in a leaf node   * ''MinNodeSize=5%'': minimum percentage of training events required in a leaf node
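
To make ''nCuts'' and ''SeparationType=GiniIndex'' more concrete, the following pure-Python sketch scans a grid of cut values on one variable and picks the cut with the largest separation gain. For a tree node, the Gini index is p(1-p), with p the signal purity. This only illustrates the idea and is not TMVA's implementation: the function names ''gini'' and ''best_cut'', the placement of the grid points and the toy numbers are all assumptions made for this example.

```python
def gini(n_sig, n_bkg):
    """Gini index p*(1-p) of a node, with p the signal purity."""
    total = n_sig + n_bkg
    if total == 0:
        return 0.0
    p = n_sig / total
    return p * (1.0 - p)

def best_cut(signal, background, n_cuts=20):
    """Scan n_cuts grid points; return (cut, gain) with the largest separation gain."""
    lo = min(signal + background)
    hi = max(signal + background)
    n_parent = len(signal) + len(background)
    parent = n_parent * gini(len(signal), len(background))
    best = (None, 0.0)
    for i in range(1, n_cuts + 1):
        # Assumed grid: equally spaced points strictly inside the variable range
        cut = lo + (hi - lo) * i / (n_cuts + 1)
        s_left = sum(1 for x in signal if x < cut)
        b_left = sum(1 for x in background if x < cut)
        s_right = len(signal) - s_left
        b_right = len(background) - b_left
        left = (s_left + b_left) * gini(s_left, b_left)
        right = (s_right + b_right) * gini(s_right, b_right)
        gain = parent - left - right   # drop in event-weighted Gini index
        if gain > best[1]:
            best = (cut, gain)
    return best

# Toy values of a single variable for signal and background events
signal = [2.1, 2.5, 3.0, 3.2, 3.8]
background = [0.2, 0.7, 1.1, 1.5, 2.0]
cut, gain = best_cut(signal, background)
print(cut, gain)
```

A finer grid (larger ''nCuts'') can find a better cut at the cost of more computation per node.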
  
Line 107: Line 106:
   * [[https://aholzner.wordpress.com/2011/08/27/a-tmva-example-in-pyroot/|TMVA in PyRoot tutorial]]   * [[https://aholzner.wordpress.com/2011/08/27/a-tmva-example-in-pyroot/|TMVA in PyRoot tutorial]]
   * [[mva:mvaexample|This python example]] shows how you can run over multiple trees and different sets of variables. It also includes functions to apply the MVA output to trees and make background rejection vs. signal efficiency plots and correlation plots.   * [[mva:mvaexample|This python example]] shows how you can run over multiple trees and different sets of variables. It also includes functions to apply the MVA output to trees and make background rejection vs. signal efficiency plots and correlation plots.
 +  * [[https://root.cern/doc/master/group__tutorial__tmva.html|ROOT official TMVA tutorials]]
 +
 +
 + ===== Other information =====
 +
 +  * The working principles of a [[https://www.physik.uzh.ch/~grazzini/teaching/higgsnotes/lecture6.pdf|neural network]] and a [[https://www.physik.uzh.ch/~grazzini/teaching/higgsnotes/lecture10.pdf|BDT]] are visually explained in the UZH's [[https://www.physik.uzh.ch/~grazzini/teaching/higgs.html|Higgs Physics course]] by Mauro Donegà.
 +  * [[https://arogozhnikov.github.io/2016/07/05/gradient_boosting_playground.html|Gradient Boosting Interactive Playground]] with interactive visuals
 +  * TMVA BibTeX reference:
 +<code latex>
 +@article{TMVA, 
 +  title         = {TMVA: Toolkit for Multivariate Data Analysis},
 +  author        = {Hoecker, Andreas and Speckmayer, Peter and
 +                   Stelzer, Joerg and Therhaag, Jan and
 +                   von Toerne, Eckhard and Voss, Helge},
 +  journal       = {PoS},
 +  volume        = {ACAT},
 +  year          = {2007},
 +  month         = {Mar},
 +  pages         = {040},
 +  url           = {http://inspirehep.net/record/746087/},
 +  reportNumber  = {CERN-OPEN-2007-007},
 +  eprint        = {physics/0703039},
 +  archivePrefix = {arXiv},
 +  primaryClass  = {physics},
 +  SLACcitation  = {%%CITATION = arXiv:physics/0703039;%%},
 +}
 +</code>
mva/mva.1470480198.txt.gz · Last modified: 2016/08/06 12:43 by iwn