====== CRAB3 ======
See:
  * Tutorial: https://
  * Configuration: https://
  * Commands: https://
  * Example:
\\
====== CRAB2 ======
CRAB2 has been superseded by CRAB3.
===== Setup local environment =====
In order to submit jobs to the Grid, you must have access to an LCG User Interface (LCG UI); it allows you to use WLCG-affiliated resources in a fully transparent way. Then set up the CMSSW software and source the CRAB environment, in this order. Remember to create a proxy certificate for CMS.
<code>
kinit YourCERNAFSName@CERN.CH
aklog cern.ch
</code>
- | |||
===== CRAB setup =====
===== CRAB configuration file for Monte Carlo data =====
The CRAB configuration file (default name ''crab.cfg'') should be located in the same directory as the CMSSW parameter set to be used by CRAB, with the following content:
<code>
[CMSSW]
</code>
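For orientation, a minimal ''crab.cfg'' for Monte Carlo might look like the sketch below. Every value is an invented placeholder (the dataset, parameter-set, and scheduler names are not taken from this page); adapt them to your analysis.

```ini
[CRAB]
jobtype   = cmssw
# placeholder: use the scheduler appropriate for your site
scheduler = glite

[CMSSW]
# placeholder dataset and CMSSW parameter set
datasetpath            = /MyPrimaryDataset/MyProcessedDataset/GEN-SIM-RECO
pset                   = my_cmssw_cfg.py
total_number_of_events = 10000
number_of_jobs         = 10
output_file            = output.root

[USER]
return_data = 0
copy_data   = 1
```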
===== Analyse published results =====
To analyse results that have been published in a local DBS, you may use a CRAB configuration identical to any other, with the addition that you must specify the DBS instance to which the data was published: the ''datasetpath'' name of your dataset and the ''dbs_url''. To do this, modify the ''[CMSSW]'' section of your CRAB configuration file, e.g.
<code>
[CMSSW]
dbs_url=url_local_dbs
</code>
Note: As ''dbs_url'' use:
  * Writing: https://
  * Reading: http://
 |  | ||
===== Local jobs =====
It is possible to run CRAB jobs on the T3 only if the dataset used is also on the T3. In this case you need ''…''.

==== Example of local jobs ====
https://

Note: This type of job cannot be used to process a dataset that is not on the Tier-3: the network connection to the T3 is not fast enough to sustain a useful write speed in the stage-out step, and the jobs will fail at the very end, i.e. when trying to copy the results.
===== Non local jobs =====
To run "non local" jobs, …

==== Example of remote jobs (change the ''…'') ====
https://

Note: This is the recommended solution in case the dataset is not stored on the Tier-3: stage out to the T2 and then copy the files (using lcg-cp or data_replica) to the T3.

If there are only very few jobs (say <50 or so) with small output files, it is also possible to stage out directly to the T3 (just put ''…''):
<code>
# CRAB cfg file used for tH analysis
/
publish_data = 0
</code>
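The stage-out switches live in the ''[USER]'' section of ''crab.cfg''. A hedged sketch follows; the site name and remote directory are placeholders, not this site's actual values, so check your site's documentation before using them.

```ini
[USER]
copy_data       = 1
# placeholder: the destination site/SE for direct stage-out
storage_element = T3_CH_PSI
# placeholder remote output directory
user_remote_dir = my_analysis
publish_data    = 0
```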
===== List of configuration parameters =====
The list of the main parameters you need to specify in your ''crab.cfg'':
  * ''pset'': the CMSSW configuration file name;
  * ''output_file'': the output file names; if the output is defined in TFileService, there is no need to specify it in this parameter;
  * ''datasetpath'': the name of the dataset to be analysed, as registered in DBS;
  * Job splitting:
    * by event (only for MC data); you need to specify 2 of these parameters: ''total_number_of_events'', ''number_of_jobs'', ''events_per_job'':
      * specify the ''total_number_of_events'' and the ''number_of_jobs'': CRAB will assign to each job total_number_of_events/number_of_jobs events;
      * specify the ''total_number_of_events'' and the ''events_per_job'': CRAB will assign to each job ''events_per_job'' events and will calculate the number of jobs as total_number_of_events/events_per_job;
      * or you can specify the ''number_of_jobs'' and the ''events_per_job'';
    * by lumi (required for real data); you need to specify 2 of these parameters: ''total_number_of_lumis'', ''lumis_per_job'', ''number_of_jobs'':
      * because jobs in split-by-lumi mode process entire rather than partial files, you will often end up with fewer jobs processing more lumis than expected;
      * specify the ''lumis_per_job'' and the ''number_of_jobs'';
      * or you can specify the ''total_number_of_lumis'' and the ''number_of_jobs'';
  * ''lumi_mask'': the filename of a JSON file that describes which runs and lumis to process;
  * ''copy_data'': this can be 0 or 1; if it is 1, your output files will be copied to a remote Storage Element;
  * ''local_stage_out'': …;
  * ''publish_data'': …;
  * ''use_server'': the usage of the CRAB server is deprecated now, so by default this parameter is set to 0;
  * ''scheduler'': the name of the scheduler you want to use;
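The by-event splitting arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not CRAB code; CRAB performs the equivalent calculation internally when it reads ''crab.cfg''.

```python
import math

def split_by_events(total_number_of_events, number_of_jobs=None, events_per_job=None):
    """Derive the missing by-event splitting parameter from the two given,
    mirroring the crab.cfg rules described above (illustrative sketch only)."""
    if number_of_jobs is not None and events_per_job is None:
        # total_number_of_events + number_of_jobs -> events assigned to each job
        events_per_job = total_number_of_events // number_of_jobs
    elif events_per_job is not None and number_of_jobs is None:
        # total_number_of_events + events_per_job -> number of jobs (rounded up)
        number_of_jobs = math.ceil(total_number_of_events / events_per_job)
    return number_of_jobs, events_per_job

print(split_by_events(10000, number_of_jobs=8))     # -> (8, 1250)
print(split_by_events(10000, events_per_job=3000))  # -> (4, 3000)
```

Note that rounding up in the second case matches the observation above that you can end up with more (or fewer) jobs than a naive division suggests.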