This shows you the differences between two versions of the page.
btag:tmva [2014/12/17 14:14] – vlambert | btag:tmva [2014/12/17 14:43] – vlambert | ||
---|---|---|---|
Line 5: | Line 5: | ||
Below are the subsequent steps for preparing the training samples for the TMVA: | Below are the subsequent steps for preparing the training samples for the TMVA: | ||
- | 1) You want to make the trees really flat without vectors and set variables that are not defined for a given vertex category to a default value. For this, run your ntuples through | ||
+ | **1)** The samples will most likely need to be skimmed to avoid a memory allocation error in the TMVA training. First skim the samples, selecting 20,000 events in each pt/eta bin for each flavour/ | ||
+ | |||
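The skim in step 1 can be sketched as follows. This is a minimal illustration in plain Python, not the skimming code used for these samples: the bin edges, the dictionary layout of a jet, and the function names are all hypothetical, and the source sentence is cut off after "flavour/", so any further split it names would simply become another element of the bin key.

```python
import bisect
from collections import defaultdict

# Hypothetical bin edges; the real pt/eta binning is not given in the text.
PT_EDGES = [15.0, 40.0, 60.0, 90.0, 150.0, 400.0, 600.0]
ETA_EDGES = [0.0, 1.2, 2.1, 2.4]
MAX_PER_BIN = 20000  # cap quoted in step 1


def bin_index(value, edges):
    """Return the bin index of value, or None if it lies outside the edges."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect.bisect_right(edges, value) - 1


def skim(jets, max_per_bin=MAX_PER_BIN):
    """Keep at most max_per_bin entries per (flavour, pt bin, |eta| bin)."""
    counts = defaultdict(int)
    kept = []
    for jet in jets:
        ipt = bin_index(jet["pt"], PT_EDGES)
        ieta = bin_index(abs(jet["eta"]), ETA_EDGES)
        if ipt is None or ieta is None:
            continue  # outside the binned range
        key = (jet["flavour"], ipt, ieta)
        if counts[key] < max_per_bin:
            counts[key] += 1
            kept.append(jet)
    return kept
```

Capping each bin separately keeps the sample size bounded while leaving sparsely populated bins untouched.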
+ | **2)** Flatten the trees so that they contain no vectors, and set variables that are not defined for a given vertex category to a default value. To do this, run your ntuples through **createNewTree.py**, which produces sets of new flat ntuples split by event range, such as // | ||
+ | |||
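The flattening in step 2 amounts to turning per-event vectors into one row per jet and filling undefined variables with a placeholder. The sketch below shows that idea in plain Python. It is not the content of **createNewTree.py**: the default value, the `nJets` key, and the variable names in the usage comment are assumptions made for illustration.

```python
DEFAULT_VALUE = -99.0  # hypothetical default; the text does not state the value


def flatten_event(event, jet_vars, n_jets_key="nJets"):
    """Turn one event holding per-jet lists into one flat row per jet."""
    rows = []
    for i in range(event[n_jets_key]):
        row = {}
        for name in jet_vars:
            values = event.get(name)
            # A variable can be missing, too short, or None when it is
            # not defined for this jet's vertex category.
            if values is None or i >= len(values) or values[i] is None:
                row[name] = DEFAULT_VALUE
            else:
                row[name] = float(values[i])
        rows.append(row)
    return rows


# Usage with made-up variable names:
# event = {"nJets": 2, "pt": [30.0, 45.0], "vertexMass": [1.5, None]}
# flatten_event(event, ["pt", "vertexMass"])
```

Every row then has the same set of scalar columns, which is what a flat training tree needs.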
+ | *For the training, one can either combine these ntuples with hadd or leave them as is for the rest of the processing. | ||
+ | |||
+ | **3)** Produce the category normalization weights for the training sample with **Normalization_Weights.C** and save the output to a text file such as // | ||
+ | |||
+ | **4)** Assuming the evaluation sample vertex category weights have already been produced (see the procedures for Evaluation Samples), add the normalization and category weight branches to the flat ntuples with **addWeightBranch.py**. Combined, these weights remove the vertex category information of the training sample and match it to that of the evaluation sample. | ||
+ | |||
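One way to read steps 3 and 4 together is sketched below. It assumes, and this is an assumption rather than something the page states, that the normalization weight is the inverse of a category's fraction in the training sample and that the category weight is that category's fraction in the evaluation sample, so their product reweights the training sample to the evaluation composition. The function names and the category labels in the comment are illustrative only; the actual definitions live in **Normalization_Weights.C** and **addWeightBranch.py**.

```python
from collections import Counter


def category_fractions(categories):
    """Fraction of entries in each vertex category."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}


def norm_weights(train_categories):
    """Assumed normalization weight: 1 / (training fraction of the category)."""
    fractions = category_fractions(train_categories)
    return {cat: 1.0 / frac for cat, frac in fractions.items()}


def combined_weight(cat, weight_norm, weight_category):
    """Product of the two per-category weights for one jet."""
    return weight_norm[cat] * weight_category[cat]


# Illustrative labels: a training sample that is 60% "RecoVertex" and an
# evaluation sample that is 50% "RecoVertex" give a combined weight of
# 0.5 / 0.6 for every "RecoVertex" jet.
```

Under this assumption the weighted training sample has the same category fractions as the evaluation sample, whatever the fractions were before weighting.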
+ | **5)** Create 2D Pt/Eta Histograms for the weighted ntuples with **createEtaPtWeightHists.py** (make sure " | ||
+ | |||
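For step 5, the sketch below fills a weighted 2D pt/eta histogram and looks up a per-jet weight from it. It is written in plain Python rather than ROOT and is not the logic of **createEtaPtWeightHists.py**. In particular, taking the weight as the inverse of the bin content (which flattens the pt/eta distribution) is an assumption, as are the bin edges, the jet dictionary layout, and the function names.

```python
import bisect


def find_bin(value, edges):
    """Return the bin index of value, or None if it is out of range."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect.bisect_right(edges, value) - 1


def fill_hist2d(jets, pt_edges, eta_edges):
    """Weighted 2D histogram stored as hist[pt bin][|eta| bin]."""
    hist = [[0.0] * (len(eta_edges) - 1) for _ in range(len(pt_edges) - 1)]
    for jet in jets:
        ipt = find_bin(jet["pt"], pt_edges)
        ieta = find_bin(abs(jet["eta"]), eta_edges)
        if ipt is None or ieta is None:
            continue
        # Jets without an existing weight count as 1.
        hist[ipt][ieta] += jet.get("weight", 1.0)
    return hist


def eta_pt_weight(jet, hist, pt_edges, eta_edges):
    """Assumed pt/eta weight: inverse of the weighted bin content."""
    ipt = find_bin(jet["pt"], pt_edges)
    ieta = find_bin(abs(jet["eta"]), eta_edges)
    if ipt is None or ieta is None:
        return 0.0
    content = hist[ipt][ieta]
    return 1.0 / content if content > 0 else 0.0
```

Filling with the existing per-jet weight matters here, since the histograms are described as being built from the already weighted ntuples.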
+ | **6)** Make the final weighted ntuples, making sure that **addWeightBranch.py** points to the new Pt/Eta histogram files. Six new branches should be created:\\ | ||
+ | -**weight_etaPt** | ||
+ | -**weight_etaPtInc**: | ||
+ | -**weight_category**: | ||
+ | -**weight_norm** | ||
+ | -**weight_flavour** : the ratio of the flavour prevalences in the evaluation process\\ | ||
+ | -**weight** | ||