btag:btag

Differences

This shows you the differences between two versions of the page.

btag:btag [2014/04/22 16:50] – [Useful TWikis] taarre
btag:btag [2014/07/02 17:39] (current) – [Future plans for boosted H->bb CSV tagging] taarre
Line 1: Line 1:
-====== Analysis notes ======
 +====== Future plans for boosted H->bb CSV tagging ====== 
 +To move from applying the standard CSV algorithm to jet substructure towards a dedicated boosted-jet b-tagging algorithm, the current plan is to perform a training using fat H->bb jets containing two reconstructed secondary vertices.\\ 
 +We will start off by performing a dedicated training that separates jets containing one b quark from jets containing two. To do this we create two flat trees from fat jets (starting with CA8, then perhaps moving to AK8): one containing jets with 1 RecoVertex matched to 1 true B hadron (RecoVertex_B) and one containing jets with 2 RecoVertices matched to 2 true B hadrons (RecoVertex_BB). We will then perform a dedicated training to try to distinguish b from bb jets.\\ 
 +The obvious next step is then to look at cases where you do not necessarily have two RecoVertices. This implies performing the training in 5 different vertex categories: 
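The split into the two flat trees described above can be sketched as follows. This is only a minimal illustration: the `Jet` container and its fields `n_reco_vertices` and `n_matched_b_hadrons` are hypothetical names, and the vertex-to-hadron matching itself is assumed to have been done upstream.

```python
from collections import namedtuple

# Hypothetical per-jet summary; the field names are illustrative assumptions,
# not variables of the actual variable extractor.
Jet = namedtuple("Jet", ["n_reco_vertices", "n_matched_b_hadrons"])


def sample_for(jet):
    """Return the name of the flat tree a fat jet would go into, or None."""
    if jet.n_reco_vertices == 1 and jet.n_matched_b_hadrons == 1:
        return "RecoVertex_B"
    if jet.n_reco_vertices == 2 and jet.n_matched_b_hadrons == 2:
        return "RecoVertex_BB"
    # Every other combination is left out of this first training.
    return None


def split_jets(jets):
    """Group jets by target tree, dropping jets that fit neither sample."""
    samples = {"RecoVertex_B": [], "RecoVertex_BB": []}
    for jet in jets:
        name = sample_for(jet)
        if name is not None:
            samples[name].append(jet)
    return samples
```

Jets with any other vertex/hadron combination are simply skipped here; whether they should instead be kept for a later training is left open.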
 +  - Reco+Reco 
 +  - Reco+Pseudo 
 +  - Reco 
 +  - Pseudo 
 +  - No 
 +Petra Van Mulders and group are currently working on creating these different categories. \\ 
 +You then have the question of whether or not to also perform dedicated BB vs UDSG, BB vs CC, BB vs CB etc. trainings. In the end this requires several different training categories. One might also consider a 2D discriminant where the user can choose the required efficiency vs purity cut in 2D space; here b vs light would be on the x axis and b vs bb on the y axis.\\ 
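One possible way to assign a jet to the five vertex categories listed above is sketched below. The inputs `n_reco` (number of reconstructed secondary vertices) and `has_pseudo` (whether a pseudo vertex could be built) are hypothetical, and the ordering of the checks is an assumption rather than the definition used by the group.

```python
def vertex_category(n_reco, has_pseudo):
    """Map vertex information of a fat jet onto one of the five categories.

    Assumed logic (illustrative only): reconstructed vertices take priority,
    and a pseudo vertex only counts as the second or the only vertex.
    """
    if n_reco >= 2:
        return "Reco+Reco"
    if n_reco == 1 and has_pseudo:
        return "Reco+Pseudo"
    if n_reco == 1:
        return "Reco"
    if has_pseudo:
        return "Pseudo"
    return "No"
```

A separate training would then be run on the jets falling into each returned category.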
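The idea of a user-chosen cut in the 2D discriminant plane could look roughly like the sketch below. It assumes a simple rectangular cut and that both scores are defined so that larger values are more signal-like; the function names and the `(b_vs_light, b_vs_bb, is_signal)` tuple layout are hypothetical.

```python
def passes_2d_cut(b_vs_light, b_vs_bb, x_cut, y_cut):
    """Rectangular cut: x axis is b vs light, y axis is b vs bb."""
    return b_vs_light > x_cut and b_vs_bb > y_cut


def efficiency_and_purity(jets, x_cut, y_cut):
    """Compute signal efficiency and purity for one 2D working point.

    jets is a list of (b_vs_light, b_vs_bb, is_signal) tuples.
    """
    selected = [j for j in jets if passes_2d_cut(j[0], j[1], x_cut, y_cut)]
    n_signal = sum(1 for j in jets if j[2])
    n_selected_signal = sum(1 for j in selected if j[2])
    efficiency = float(n_selected_signal) / n_signal if n_signal else 0.0
    purity = float(n_selected_signal) / len(selected) if selected else 0.0
    return efficiency, purity
```

Scanning `x_cut` and `y_cut` over a grid would give the efficiency vs purity map from which a working point can be picked; a non-rectangular boundary is equally possible and not excluded by this sketch.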
 +Steps for the future: 
 +  * Reproduce previous training results by Petra et al. 
 +  * Implement CA8 as default jet algorithm in variable extractor 
 +  * Create BB flag in variable extractor. This should be enough for running a first training with current variables. 
 +  * Add new variables (take multiple SVs into account) and redo training. Improvement? 
 +  * Repeat study for AK8 
 +  * Compare with subjet b-tagging
  
-  * A Combined Secondary Vertex Based B-Tagging Algorithm in CMS: http://cds.cern.ch/record/927399/files/NOTE2006_014.pdf 
-  * Algorithms for b Jet Identification in CMS: http://cms-physics.web.cern.ch/cms-physics/public/BTV-09-001-pas.pdf 
-  * Performance of b tagging at sqrt s=8 TeV in multijet, tt and boosted topology events: http://cds.cern.ch/record/1581306/files/BTV-13-001-pas.pdf 
-  * Performance Measurement of b-tagging Algorithms Using Data containing Muons within Jets: http://cms-physics.web.cern.ch/cms-physics/public/BTV-07-001-pas.pdf 
-  * Implementation and training of the Combined Secondary Vertex MVA b-tagging algorithm in CMSSW: http://cms.cern.ch/iCMS/jsp/openfile.jsp?tp=draft&files=AN2012_441_v3.pdf 
  
-====== Useful TWikis ======
 +====== Analysis notes ======
  
-  * SFrame tutorial [[https://wiki-zeuthen.desy.de/ATLAS/Projects/TopPhysicsInternal/AnalysisFramework/Tutorial]] 
-  * Documentation on BTag MVA Trainings [[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagMVATrainerDocumentation]] 
-  * Particle PDG id's [[http://pdg.lbl.gov/2002/montecarlorpp.pdf]] 
-  * BTV activities [[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagSoftware#Post_BTV_13_001_activities]] 
-  * BTV indico page [[https://indico.cern.ch/category/1309/]] 
-====== Using batch submission for SFrame jobs ======
 +[[http://cds.cern.ch/record/927399/files/NOTE2006_014.pdf|A Combined Secondary Vertex Based B-Tagging Algorithm in CMS]]\\ 
 +[[http://cms-physics.web.cern.ch/cms-physics/public/BTV-09-001-pas.pdf|Algorithms for b Jet Identification in CMS]]\\ 
 +[[http://cds.cern.ch/record/1581306/files/BTV-13-001-pas.pdf|Performance of b tagging at sqrt s=8 TeV in multijet, tt and boosted topology events]]\\ 
 +[[http://cms-physics.web.cern.ch/cms-physics/public/BTV-07-001-pas.pdf|Performance Measurement of b-tagging Algorithms Using Data containing Muons within Jets]]\\ 
 +[[http://cms.cern.ch/iCMS/jsp/openfile.jsp?tp=draft&files=AN2012_441_v3.pdf|Implementation and training of the Combined Secondary Vertex MVA b-tagging algorithm in CMSSW]]\\ 
 +====== Meetings and activities ====== 
 +[[https://indico.cern.ch/category/1309/|BTV indico page]]\\ 
 +[[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagSoftware#Post_BTV_13_001_activities|BTV activities]]\\ 
 +[[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagPerformanceGroup|BTV POG Performance/Validation Subgroup]]\\ 
 +[[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BoostedBTagCommissioning|Commissioning of b tagging in boosted event topologies]]\\ 
 +====== Useful TWikis ======
  
 +[[https://wiki-zeuthen.desy.de/ATLAS/Projects/TopPhysicsInternal/AnalysisFramework/Tutorial|SFrame tutorial]]\\ 
 +[[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagMVATrainerDocumentation|Documentation on BTag MVA Trainings]]\\ 
 +[[http://pdg.lbl.gov/2002/montecarlorpp.pdf|Particle PDG id's]]\\ 
 +[[https://twiki.cern.ch/twiki/bin/view/TMVA/WebHome|TMVA tutorial]]\\ 
 +[[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagSoftwareMVATrainer|MVA Trainer in CMSSW]]\\ 
 +[[https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagPerformanceOP|Btag OP points]] 
 +====== Useful tools ====== 
 +[[btag:BatchSub|Using batch submission to split SFrame jobs]]
-  * Copy **BatchSubmission** to your analysis directory\\ 
-''cp -../../clange/ExoVV/Analysis/BatchSubmission/''\\ 
-  * Create directories AnalysisOutput and AnalysisTemp parallel to BatchSubmission\\ 
-''mkdir AnalysisOutput AnalysisTemp''\\ 
-  * Make sure you have an up-to-date version of Python (Python 2.6 or later). You can use the Python version shipped with CMSSW by running\\ 
-''cd ...CMSSW_5_3_13/src/\\ 
-cmsenv'' 
-  * Create your list of input files in an .xml file and store it under **BatchSubmission/xmls/**. Use only the name of the input file and the lumi (see test.xml):\\ 
-''<In FileName="dcap://t3se01.psi.ch:22125//pnfs/psi.ch/cms/trivcat/store/user/jngadiub/Thea/FLATtuple/HH4b_1000_newCones8/flatTuple_Graviton_1000_newCones8_1.root" Lumi="1.0"/>\\ +
-      <In FileName="dcap://t3se01.psi.ch:22125//pnfs/psi.ch/cms/trivcat/store/user/jngadiub/Thea/FLATtuple/HH4b_1000_newCones8/flatTuple_Graviton_1000_newCones8_2.root" Lumi="1.0"/>\\ +
-      <In FileName="dcap://t3se01.psi.ch:22125//pnfs/psi.ch/cms/trivcat/store/user/jngadiub/Thea/FLATtuple/HH4b_1000_newCones8/flatTuple_Graviton_1000_newCones8_3.root" Lumi="1.0"/>\\ +
-      ......''\\ +
-  * Create new or edit **BatchSubmission/MyTestAnalysisOptions.py**. This will be the same for all jobs (signal and background) and should only contain global variables and the libraries used (see comments below).\\ +
-''loadLibs=[\\ +
-  "libMyTestPackage",                     # libraries you are using, order matters (separate by comma)\\ +
-  ]\\ +
-\\ +
-\\ +
-loadPacks=["SFrameCore.par",\\ +
-    "MyTestPackage.par",           # name of your SFrame package + .par\\ +
-    ]\\ +
-\\ +
-compilePacks=[\\ +
-  "../AnalysisPackage",                     # name of your SFrame package\\ +
-  ]\\ +
-\\ +
-AddUserItems = [\\ +
-  ["InputTreeName" ,"tree"], # your //global// user items\\ +
-  ]\\ +
-\\ +
-#End''\\ +
-  * Create a Python script with job-specific configurations (one for signal, one for background); see **BatchSubmission/test.py**. Here you add all job-specific item names, the names of the input and output files, etc. (see comments below).\\ +
-''#!/usr/bin/python\\ +
-# -*- coding: utf-8 -*-\\ +
-\\ +
-path2xml="$HOME/ExoVV/Analysis/BatchSubmission/xmls" # path to xmls\\ +
-path2tmp="$HOME/ExoVV/Analysis/AnalysisTemp" # path to temporary directory (create this if you have not done so already)\\ +
-outDir="$HOME/ExoVV/Analysis/AnalysisOutput" # path to output directory\\ +
-jobName="clTestJob" # name of job (free choice)\\ +
-cycleName="MyTestAnalysis" # **important!** must match SFrame cycle name (see sframe config file)\\ +
-nEventsMax=-1 # number of events\\ +
-nProcesses=2\\ +
-nFiles=2 # number of files per job\\ +
-hCPU="00:30:00"\\ +
-hVMEM="3000M"\\ +
-postFix=""\\ +
-\\ +
-dataSets=[\\ +
-        ["Test", ["test"]], # output name and name of xml containing the input file names\\ +
-        ]\\ +
-\\ +
-userItems=[\\ +
-#               ["InputTreeName", "tree"], # job-specific item names\\ +
-            ]\\ +
-\\ +
-jobOptionsFile2=open("MyTestAnalysisOptions.py", 'r') # name of file containing global item names and libraries\\ +
-command2=""\\ +
-for i in [o for o in jobOptionsFile2.readlines()]:\\ +
-  if ("#E" + "nd") in i : break\\ +
-  command2+=i\\ +
-jobOptionsFile2.close()\\ +
-exec command2\\ +
-userItems += AddUserItems\\ +
-\\ +
-inputTrees=["ntuplizer/tree"] # name of input tree\\ +
-outputTrees=["analysis"] # name of output tree\\'' +
-  * Compile and run from **BatchSubmission** with\\ +
-''python submitSFrame.py -j test.py --batch''\\ +
-  * Output files stored under **AnalysisOutput**+
  
 ====== nTuples ======
 dcap://t3se01.psi.ch:22125//pnfs/psi.ch/cms/trivcat/store/user/jngadiub/Thea/FLATtuple
 +====== Notes ======
 +[[btag:mvaTrainer|btag:mvaTrainer]]
btag/btag.1398178214.txt.gz · Last modified: 2014/04/22 16:50 by taarre