Policies: Difference between revisions


Revision as of 11:40, 21 September 2015

Policies concerning the usage of local computing resources

  • The following rules are in place to ensure a fair share of resources between users. If you violate these rules, your account may be disabled.
  • If you have specific needs concerning storage or CPU, please contact the site administrators on T2bSupport.

User Interface policy

  • All user interfaces are running Scientific Linux 6, except m3.iihe.ac.be which is still running SL5.
  • The following machines are for light tasks: m0.iihe.ac.be, m1.iihe.ac.be, m2.iihe.ac.be, m3.iihe.ac.be. Light tasks include:
    • CRAB submission
    • small interactive ROOT processes
    • building code
    • debugging code
  • The following machines are available for CPU-intensive and long tasks: m5.iihe.ac.be, m6.iihe.ac.be, m7.iihe.ac.be, m8.iihe.ac.be and m9.iihe.ac.be
  • In the near future we will provide a simple system to distinguish the compilation machines from the ones where interactive jobs can run.

This policy is enforced: processes using more than 30 minutes of CPU time on m1 to m3 will be killed by the operating system. CPU time is unlimited on m5 to m9.
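
As an illustration of the CPU limit above, here is a minimal Python sketch that a long-running script could use to see how much CPU time it has consumed. It assumes the 30-minute limit counts user plus system CPU time of a single process; how the limit is actually implemented on the UIs is not stated here. The function names (`cpu_seconds_used`, `remaining_cpu_seconds`, `needs_heavy_machine`) are hypothetical helpers, not part of any site tool.

```python
import resource

# 30 minutes of CPU time, taken from the policy above.
CPU_LIMIT_SECONDS = 30 * 60


def cpu_seconds_used():
    """Return the CPU time (user + system) consumed so far by this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime


def remaining_cpu_seconds(used, limit=CPU_LIMIT_SECONDS):
    """Return the CPU seconds left before the limit, never below zero."""
    return max(0, limit - used)


def needs_heavy_machine(expected_cpu_seconds, limit=CPU_LIMIT_SECONDS):
    """Return True if a task is expected to exceed the light-machine limit."""
    return expected_cpu_seconds > limit


if __name__ == "__main__":
    used = cpu_seconds_used()
    print(f"CPU used so far: {used:.1f} s")
    print(f"Remaining under the 30-minute limit: {remaining_cpu_seconds(used):.1f} s")
```

A task whose expected CPU time makes `needs_heavy_machine` return True is better started on one of the machines for CPU-intensive and long tasks.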

Disk space usage policy

  • Users have several locations to store their files, analysis code, final results, etc.
    • The /user partition on the UIs (m-machines) is limited to 500 GB per user.
    • This space should be used as a working environment, e.g. to check out code or store results. It should not be used to store large datasets.
    • The /localgrid partition on the UIs is also mounted on the worker nodes. Its quota is shared with /user, so /localgrid and /user together may not exceed 500 GB.
    • This space serves as a sandbox for the input/output of jobs sent to the local batch queue.
    • The /pnfs area has a limit of 2 TB per user.
    • This area should contain the (sometimes large) datasets needed for physics analysis.
    • If you need more space, please contact the site admins here.
  • Semi-automatic removal of old files on /pnfs is done every 3 months.
    • All files not accessed in 1 year must be explicitly un-flagged by the user in order to be kept.
    • All other files can be marked by the user for deletion.
    • Several mails will be sent to remind all users to do this.
    • These mails will be sent over a span of about 1 month, after which the admins will proceed with the deletion of all flagged files.
  • More information is found on the deletion page: http://mon.iihe.ac.be/OldPnfsFiles
    • If you need an account on this page, please ask the admins (grid_adminNOSPAM@listserv.vub.ac.be)
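
The shared 500 GB quota for /user and /localgrid described above can be estimated with a short script. This is a rough sketch under stated assumptions: it sums file sizes by walking the directories, which may differ from what the filesystem quota actually counts, and it takes 1 GB as 10^9 bytes, which the policy does not specify. The helper names and the example paths are hypothetical.

```python
import os

GB = 10**9  # assumption: decimal gigabytes
USER_LOCALGRID_QUOTA = 500 * GB  # shared quota for /user + /localgrid


def directory_size(path):
    """Sum the sizes in bytes of all regular files and links under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return total


def quota_report(user_bytes, localgrid_bytes, quota=USER_LOCALGRID_QUOTA):
    """Combine the two usages and compare them against the shared quota."""
    used = user_bytes + localgrid_bytes
    return {"used": used, "quota": quota, "over": used > quota}


if __name__ == "__main__":
    # Hypothetical per-user directories; adapt to your own account.
    name = os.environ.get("USER", "")
    report = quota_report(
        directory_size(os.path.join("/user", name)),
        directory_size(os.path.join("/localgrid", name)),
    )
    print(f"{report['used'] / GB:.1f} GB used of {report['quota'] / GB:.0f} GB")
```

Walking a large directory tree can itself be slow, so a check like this is best run occasionally rather than in a loop.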

Back-up procedures of files

  • There is a local backup (snapshot mechanism) of the user home directories (for more detailed info, see Backup).
    • This backup is made every day, and we can go back day by day up to one week. It is meant to address, e.g., small accidental deletions by users.
    • Users are strongly advised not to rely solely on this backup. Using a versioning system (SVN or CVS) should prevent accidental removal of files and allows a user to go back to a previous version of a file when it was messed up. We don't maintain a CVS repository ourselves, but the CMS one should be used, more info here
  • The entire user home directories are backed up every week on physically separate hardware. This is to address catastrophe scenarios.

Memory usage on the grid

  • To protect the grid, there is an upper memory limit per job of 2.0 GB (this is larger than what is asked by CMS) for both physical and virtual memory.
  • If your job exceeds this limit, it will be killed by the queueing system.
    • In your CRAB error log, you will find error code 271.
    • If you use direct submission, the reason will be clearly stated in your error log.
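
To get a feel for how close a job comes to the 2.0 GB limit, the peak memory of a test run can be inspected before submitting to the grid. The sketch below is only an approximation: it reads the peak resident set size, so it covers physical memory and not virtual memory, and it assumes 1 GB means 10^9 bytes. The helper names are hypothetical.

```python
import resource
import sys

MEMORY_LIMIT_BYTES = 2 * 10**9  # 2.0 GB per job; decimal GB is an assumption


def peak_rss_bytes():
    """Return the peak resident set size of this process in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        return peak
    return peak * 1024


def memory_headroom(peak_bytes, limit=MEMORY_LIMIT_BYTES):
    """Return the bytes left below the limit (negative if exceeded)."""
    return limit - peak_bytes


def exceeds_limit(peak_bytes, limit=MEMORY_LIMIT_BYTES):
    """Return True if the given peak usage is above the per-job limit."""
    return peak_bytes > limit


if __name__ == "__main__":
    peak = peak_rss_bytes()
    print(f"Peak RSS: {peak / 10**6:.1f} MB")
    print(f"Headroom: {memory_headroom(peak) / 10**6:.1f} MB")
```

Calling `peak_rss_bytes()` at the end of a local test job gives an early warning before the queueing system kills the real job.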