GridStorageAccess: Difference between revisions

Line 1: Line 1:
THIS IS MOSTLY DEPRECATED ! PLEASE USE lcg-* COMMAND RATHER THAN srm* ONES !! (lcg-+TAB & man is your friend !)
 


This page describes how to handle data stored on grid storage.
=== Before starting ===
[[Image(UsefulImages:Exclamation_mark.png,height=50)]]Before being able to run these commands you first need to make a valid proxy <br>
*'''voms-proxy-init --voms cms'''


=== SRM ===
== Before starting ==
The SRMv2 interface to most grid storage completely replaces most of the EDG commands.
[[File:Exclamation-mark.jpg|left|40x30px|line=1|]] Before being able to run these commands you first need to make a valid proxy [[File:Exclamation-mark.jpg|40x30px|line=1|]] <br>
==== srm-commands ====
::: <pre>voms-proxy-init --voms cms</pre>
*''srmls'': get information on a file
 
*''srmmv'': rename a file
== GFAL ==
*''srmrmdir'': remove a directory
GFAL is a wrapper around the latest grid commands. Learning to use it means that whatever middleware requires to be used in the future, you don't need to learn new commands (like srm, lcg, etc)
*''srmmkdir'': create a directory
 
*''srmrm'': remove a file
 
*''srmcp'': copy files (still uses protocol v1!!)
<br><br>
==== usage ====
=== gfal-commands ===
*newer srm clients support transparent usage of the srm version
If you want more information on the options that can be used, please use '''man gfal-command''' !
**in case this doesn't work, please use the equivalent command as listed in the OLD usage section below
 
*<tt>srm<command> --help</tt> will print out very detailed info on how to use the commands
Here are all the commands that can be used:
**example usage can be found at the end of the help output
*''gfal-ls'': get information on a file
*an srm-url has following possible forms
*''gfal-mkdir'': create a directory
**<tt>srm://<name_of_server>:<port>/some/path</tt>
*''gfal-rm'': removes a file. To remove an entire directory, use -r
**remote access to file/directory
*''gfal-copy'': copy files.
**this is what you will be using
 
**<tt>file:////some/path</tt>
<br><br>
**local access to file/directory 
=== Usage ===
**the 4 ''/'' are really needed
There are 2 types of file url:
==== examples ==== 
* '''Distant files''': their url is of the type srm://<name_of_server>:<port>/some/path, eg for IIHE:
*list contents of a directory ''/pnfs/iihe/cms'' on machine ''maite.iihe.ac.be''
srm://maite.iihe.ac.be:8443/pnfs/iihe/
<pre>
* '''Local files''': their url is of the type file://path_of_the_file, eg for IIHE:
  srmls srm://maite.iihe.ac.be:8443/pnfs/iihe/cms
file:///user/$USER/MyFile.root
</pre>  
 
*create a directory  
[[File:Exclamation-mark.jpg|left|40x30px|line=1|]] Be careful, the number of '''/''' is very -very- important [[File:Exclamation-mark.jpg|40x30px|line=1|]]
<pre>
 
  srmmkdir srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/stdweird/kkllddsrm
 
</pre>  
*To get a list of all distant urls for all the Storage Elements, one can do:
::<pre> lcg-infosites --vo cms se </pre>
 
 
=== Examples ===
*To list the contents of a directory ''/pnfs/iihe/cms'' :
::<pre> gfal-ls srm://maite.iihe.ac.be:8443/pnfs/iihe/cms </pre>
 
* To create a directory:
::<pre> gfal-mkdir srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/NewDir </pre>
 
*copy file from local disk to remote server  
<pre>
::<pre> gfal-copy file:///user/$USER/MyFile.root srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/ </pre>
  srmcp file:////bin/bash srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/stdweird/kkllddsrm/file
</pre>  
*delete a file on remote server
<pre>
  srmrm srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/stdweird/kkllddsrm/file
</pre>
*remove a directory on remote server   
<pre>
  srmrmdir srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/stdweird/kkllddsrm
</pre>   
  The directory must be empty. There is a "recursive" option, but this only applies if the directory to delete only contains empty sub-directories :
<pre>
  srmrmdir -recursive=true srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/stdweird/kkllddsrm 
</pre>


==== Bulk file transfers ====
* To copy a file from remote server to our Storage Element:
::<pre> gfal-copy srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms/store/group/comm_trigger/L1TrackTrigger/BE5D_620_SLHC6/singleMu/NoPU/reDIGI_SLHC6-TrackTrigger_muon_pgun-0499.root srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/odevroed/eosTransfer.root </pre>


There is an elegant way to run srmcp through several files. This is done using the copyjobfile option within a srmcp command.
* To delete a file on remote server
::<pre> gfal-rm srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/MyFile.root </pre>


Here are the details on how to use multiple srm instance in one command.
* To remove a directory and its entire content on remote server ?!? not working for now ?): 
Notice that the order it will process files is not necessarily the one listed in your input file.
::<pre> gfal-rm -r srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/NewDir </pre> 


Syntax:
=== Bulk file transfers ===
<pre>
srmcp -copyjobfile=datafile
</pre>
where datafile is a file where every line is a source + destination in srm url syntax as followed (in other words source and destination '''IN ONE LINE'''):


<pre>
There is an elegant way to run gfal-copy through several files. This is done using the '''--from-file''' option.
      source                                    destination
</pre>


<pre>
srm://maite.iihe.ac.be:8443/pnfs/iihe/blabla file:///$PWD/blalbla
</pre>
or, directly from a remote srm storage to our dCache:


<pre>
Syntax:
srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms/store/group/comm_trigger/L1TrackTrigger/BE5D_620_SLHC6/singleMu/NoPU/reDIGI_SLHC6-TrackTrigger_muon_pgun-0499.root srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/odevroed/eosTransfer.root
<pre> gfal-copy -f --from-file files.txt file://$PWD </pre>
</pre>
where files.txt is a file where every line is a source in srm url syntax.




Make some tests with one line in datafile and make sure the srm url is OK for both source and destination before running over several files.


If you have any issue try with <tt>-debug</tt> and try to force <tt>-streams_num=1</tt> and put this options as well <tt>-srm_protocol_version=2 -globus_tcp_port_range=21000,25000</tt>




==== copy directories from and pnfs within the IIHE ====
=== Copy directories from and to pnfs within the IIHE ===
A script to copy full directories to and from pnfs exists on the slc6 UI's:


Line 108: Line 93:




=== dCache ===
== dCache ==
Direct dcache access to files is only possible if the software supports it.  
PNFS (the directory structure seen under /pnfs/) is NOT a real filesystem, so normal system commands will mmostly not work.
PNFS (the directory structure seen under /pnfs/) is NOT a real filesystem: it is an '''immutable''' filesystem, and is mounted read-only.
*Commands that work
 
**ls
*Therefore, here is a list of the commands that work:
*replacemnt command:
'''ls'''
**dccp (to copy files from /pnfs to local disk or reverse)
* And replacement commands:
*ROOT has dcache support
** '''dccp''':  to copy files from /pnfs to local disk. Example:
**to open files in dcache using root, use eg
dccp dcap://maite.iihe.ac.be/pnfs/iihe
<pre>
 
  root dcap://maite.iihe.ac.be/pnfs/iihe/some/file.root
* To open files using root, use eg
</pre>
<pre> root dcap://maite.iihe.ac.be/pnfs/iihe/some/file.root </pre>


When reading out the rootfiles is rather slow or it doesn't work at all and nothing is wrong with the root file (e.g. in an interactive analysis on beo or msa) you can increase your dCache readahead buffer. Don't make the buffer larger than 50MB! To enlarge the buffer set this in you environment for csh:
Line 132: Line 117:
</pre>
</pre>
See the dChache [http://www.dcache.org/ fanpage] for further reading.
=== EDG ===
Older edg-gridftp-* commands.
*uses <tt>gsiftp://<server>/path/to/file</tt> protocol for remote files
{{TracNotice|{{PAGENAME}}}}

Revision as of 15:22, 13 July 2016


This page describes how to handle data stored on grid storage.

Before starting

Before running these commands, you first need to create a valid proxy:

voms-proxy-init --voms cms

GFAL

GFAL is a wrapper around the underlying grid storage commands. Once you know it, you will not need to learn a new set of commands (srm, lcg, etc.) whenever the underlying middleware changes.




gfal-commands

For more information on the available options, see the man page of each command (man gfal-command, e.g. man gfal-copy).

Here are the most commonly used commands:

  • gfal-ls: list a directory or get information on a file
  • gfal-mkdir: create a directory
  • gfal-rm: remove a file. To remove an entire directory, use -r
  • gfal-copy: copy files



Usage

There are two types of file URL:

  • Distant files: their URL is of the type srm://<name_of_server>:<port>/some/path, e.g. for IIHE:
srm://maite.iihe.ac.be:8443/pnfs/iihe/
  • Local files: their URL is of the type file://<absolute_path_of_the_file>, e.g. for IIHE:
file:///user/$USER/MyFile.root

Be careful: the number of / characters matters. A local URL is file:// followed by an absolute path, which itself starts with /, giving three slashes in total (file:///user/...).
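
As a small illustration of the two URL forms, the shell sketch below builds them from plain absolute paths, which makes it harder to get the number of slashes wrong. The function names srm_url and local_url are hypothetical helpers written for this page, not part of GFAL or any other grid tool; the server name and port are the IIHE values quoted above.

```shell
# Hypothetical helpers (not part of GFAL) that build the two URL forms.
SRM_SERVER="maite.iihe.ac.be"   # IIHE storage element, as quoted above
SRM_PORT="8443"

# srm_url /pnfs/iihe/cms  ->  srm://maite.iihe.ac.be:8443/pnfs/iihe/cms
srm_url() {
    case "$1" in
        /*) printf 'srm://%s:%s%s\n' "$SRM_SERVER" "$SRM_PORT" "$1" ;;
        *)  echo "srm_url: need an absolute path, got '$1'" >&2; return 1 ;;
    esac
}

# local_url /user/me/MyFile.root  ->  file:///user/me/MyFile.root
local_url() {
    case "$1" in
        /*) printf 'file://%s\n' "$1" ;;
        *)  echo "local_url: need an absolute path, got '$1'" >&2; return 1 ;;
    esac
}
```

Both helpers reject relative paths, since a relative path after file:// or after the port would silently produce a malformed URL.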


  • To get a list of the distant URLs of all the Storage Elements, run:
 lcg-infosites --vo cms se 


Examples

  • To list the contents of a directory /pnfs/iihe/cms :
 gfal-ls srm://maite.iihe.ac.be:8443/pnfs/iihe/cms 
  • To create a directory:
 gfal-mkdir srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/NewDir 
  • To copy a file from local disk to the remote server:
 gfal-copy file:///user/$USER/MyFile.root srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/ 
  • To copy a file from a remote server to our Storage Element:
 gfal-copy srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms/store/group/comm_trigger/L1TrackTrigger/BE5D_620_SLHC6/singleMu/NoPU/reDIGI_SLHC6-TrackTrigger_muon_pgun-0499.root srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/odevroed/eosTransfer.root 
  • To delete a file on the remote server:
 gfal-rm srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/MyFile.root 
  • To remove a directory and its entire content on the remote server (this may not be working for now):
 gfal-rm -r srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/NewDir 
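
The examples above can be chained into a small round trip. The sketch below is only an illustration: run and upload_and_list are hypothetical helper names, and the DRY_RUN switch is an assumption of this sketch, not a GFAL feature. With DRY_RUN=1 (the default here) the commands are printed instead of executed, so the URLs can be checked before anything touches the storage.

```shell
# With DRY_RUN=1 (default in this sketch) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# upload_and_list <local absolute path> <remote directory URL>
# Creates the remote directory, copies the file into it, then lists it.
upload_and_list() {
    src="$1"
    dest_dir="${2%/}"                 # strip a trailing slash, if any
    # -p: do not fail if the directory already exists (see man gfal-mkdir)
    run gfal-mkdir -p "$dest_dir" &&
    run gfal-copy "file://$src" "$dest_dir/" &&
    run gfal-ls "$dest_dir"
}
```

For example, upload_and_list /user/$USER/MyFile.root srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER/NewDir prints the three commands; setting DRY_RUN=0 beforehand runs them for real, stopping at the first one that fails.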


Bulk file transfers

gfal-copy can copy several files in a single command, using the --from-file option.


Syntax:

 gfal-copy -f --from-file files.txt file://$PWD 

where files.txt is a text file in which each line is one source in SRM URL syntax.


First test with a single line in files.txt, and make sure the SRM URL works for both source and destination, before running over many files.
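
One possible way to prepare and sanity-check the input file is sketched below. Both make_file_list and check_file_list are hypothetical helpers written for this page; they only manipulate text and do not contact any storage element.

```shell
# Hypothetical helper: print one SRM source URL per line for the given
# file names located under a common remote directory URL.
make_file_list() {
    base="${1%/}"     # remote directory URL, trailing slash stripped
    shift
    for name in "$@"; do
        printf '%s/%s\n' "$base" "$name"
    done
}

# Hypothetical check: succeed only if every non-empty line of the given
# file starts with srm://
check_file_list() {
    ! grep -v '^$' "$1" | grep -qv '^srm://'
}
```

For instance, make_file_list srm://maite.iihe.ac.be:8443/pnfs/iihe/cms/store/user/$USER a.root b.root > files.txt writes two lines (a.root and b.root are placeholder names). Running head -n 1 files.txt > test.txt then gives a one-line file for the initial test with gfal-copy --from-file test.txt, before the full list is used.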


Copy directories from and to pnfs within the IIHE

A script to copy full directories to and from pnfs exists on the SLC6 UIs:

copyDirectoryPnfs.py
Move all files in a directory to or from pnfs
This script assumes that you copy within the IIHE
The script does not do recursive copying
Make sure you have a valid proxy, made with voms-proxy-init --voms cms:/cms/becms

Mandatory options:
--in=                 : directory to copy from
--out=                : directory to copy to
Both directories need to be complete (i.e. including the /pnfs or /user part

example:
copyDirectoryPnfs.py --out=/user/odevroed/newfile --in=/pnfs/iihe/cms/store/user/odevroed/newdir

Optional:
-h, --help             : print this help message


dCache

Direct dCache access to files is only possible if the software supports it. PNFS (the directory structure seen under /pnfs/) is NOT a real filesystem: it is an immutable filesystem, and is mounted read-only.

  • Therefore, here is a list of the commands that work:
ls
  • And replacement commands:
    • dccp: to copy files from /pnfs to local disk. It takes a source followed by a destination. Example of a source URL:
dccp dcap://maite.iihe.ac.be/pnfs/iihe
  • To open files using ROOT, use e.g.:
 root dcap://maite.iihe.ac.be/pnfs/iihe/some/file.root 

If reading ROOT files is very slow or fails altogether although nothing is wrong with the file itself (e.g. in an interactive analysis on beo or msa), you can increase your dCache read-ahead buffer. Do not make the buffer larger than 50 MB! To enlarge the buffer, set the following in your environment. For csh:

setenv DCACHE_RAHEAD 1
setenv DCACHE_RA_BUFFER 50000000

For bash:

export DCACHE_RAHEAD=true
export DCACHE_RA_BUFFER=50000000
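
To avoid exceeding the 50 MB limit by accident, the bash settings above can be wrapped in a small function. This is only a sketch: set_dcache_readahead is a hypothetical name, and the only things it relies on are the two environment variables shown above.

```shell
# Hypothetical helper: enable dCache read-ahead with a buffer size given in
# bytes, refusing anything above the 50 MB limit recommended above.
set_dcache_readahead() {
    max=50000000
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: set_dcache_readahead <bytes>" >&2
            return 2 ;;
    esac
    if [ "$1" -gt "$max" ]; then
        echo "refusing buffer of $1 bytes (limit is $max)" >&2
        return 1
    fi
    export DCACHE_RAHEAD=true
    export DCACHE_RA_BUFFER="$1"
}
```

Calling set_dcache_readahead 20000000 in the shell that will start ROOT sets both variables; a value above the limit, or a non-numeric argument, leaves the environment unchanged and returns a non-zero status.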

See the dCache website (http://www.dcache.org/) for further reading.