Request to reduce usage of /aptmp/(username)/

(Update Date: October 25th, 2023)

Target User: Users of the computing server who are using /aptmp/(username).
Cleanup Period: Until early November.
Details: Starting around mid-November, we plan to copy /aptmp/ on the computing server to /aptmp/ on the new system over a period of 4 to 5 weeks. If /aptmp/(username) contains a large number of files or uses a large amount of disk space, the copy may not finish within the migration period. We therefore ask users to reduce their usage of /aptmp/(username)/ by early November.
If you have already backed up your /aptmp/ data yourself, or if you do not need it migrated, please contact us at spradm@scl.kyoto-u.ac.jp.

Reducing usage of /aptmp/

  • Please remove unnecessary files and directories.
  • Compress large files that you do not plan to use in the near future with the gzip command.
    • We recommend using the pigz command, which uses multiple cores for fast gzip compression.
  • Compress directories that contain many files you do not plan to use in the near future into tar.gz format.
    • The rsync command copies small files (a few tens of kB each) much more slowly, so please bundle directories that hold a large number of small files into a tar.gz archive.
    • The targz command uses multiple cores for fast tar.gz compression of a directory.

Current Status Verification

Checking Disk Usage of /aptmp/


You can check the disk usage and file count of /aptmp/(username) using the lfs-quota command.
$ lfs-quota
Disk quotas for usr ****** (uid ****):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
       /lustre/ 867585824       0       0       -   41408       0       0       -
Using the -h option displays disk usage in units such as kB, MB, GB, and TB.
$ lfs-quota -h
Disk quotas for usr ****** (uid ****):
     Filesystem    used   quota   limit   grace   files   quota   limit   grace
       /lustre/  827.4G      0k      0k       -   41408       0       0       -

Checking Disk Usage of Directories


$ du -hs (directory)
  • Example 1: You can specify multiple directories using '*'.
    $ du -hs /aptmp/xxx/*
  • If there are a large number of files, the command may take some time to execute.
  • If the command takes more than one hour to execute, please consider reducing the number of files by compressing directories, etc.
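
To decide where to start cleaning up, it can help to rank the entries of a directory by size. The following is a minimal sketch, assuming a Linux environment with du, sort, and awk. The demo directory is a throwaway created with mktemp; in practice you would set "target" to a directory under /aptmp/(username).

```shell
#!/bin/sh
# Sketch: rank the immediate children of a directory by disk usage.
# The demo tree is hypothetical; replace "$target" with your own directory.
target=$(mktemp -d)
mkdir "$target/small" "$target/large"
head -c 4096 /dev/urandom > "$target/small/a.dat"
head -c 4194304 /dev/urandom > "$target/large/b.dat"

# du -sk prints usage in KiB; sort -rn puts the largest entry first.
ranking=$(cd "$target" && du -sk -- * | sort -rn)
echo "$ranking"

# Name of the largest entry (second column of the first line).
largest=$(echo "$ranking" | head -n 1 | awk '{print $2}')
echo "largest: $largest"

rm -rf "$target"
```

The entry reported first is usually the best candidate for removal or compression.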

Checking File Size Distribution within a Directory


$ hist_filesize (directory)
  • If there are a large number of files, the command may take some time to execute.
  • If the command takes more than one hour to execute, please consider reducing the number of files by compressing directories, etc.
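
For readers who want to see what a file size distribution looks like, the following sketch counts files per size class with find and awk. It is not the implementation of hist_filesize, and its output format is not meant to match that command. It assumes GNU find (for -printf), and the bin edges and demo files are illustrative.

```shell
#!/bin/sh
# Sketch: count files per size class with GNU find and awk.
# This is NOT the hist_filesize implementation; bin edges are illustrative.
dir=$(mktemp -d)
head -c 100 /dev/urandom > "$dir/tiny1"
head -c 500 /dev/urandom > "$dir/tiny2"
head -c 50000 /dev/urandom > "$dir/medium"
head -c 2000000 /dev/urandom > "$dir/big"

# find prints one size in bytes per file; awk sorts each size into a bin.
hist=$(find "$dir" -type f -printf '%s\n' | awk '
    $1 < 10240   { n["<10KiB"]++; next }
    $1 < 1048576 { n["10KiB-1MiB"]++; next }
                 { n[">=1MiB"]++ }
    END {
        printf "%s %d\n", "<10KiB", n["<10KiB"]
        printf "%s %d\n", "10KiB-1MiB", n["10KiB-1MiB"]
        printf "%s %d\n", ">=1MiB", n[">=1MiB"]
    }')
echo "$hist"

rm -rf "$dir"
```

A large count in the smallest bin suggests that the directory is a good candidate for tar.gz compression.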

Estimating File Copy Time

You can estimate the time it takes to copy files using the rsync command.
Provide the result of the hist_filesize command to the copytime command.

$ hist_filesize (directory) > hist_filesize.out
$ copytime hist_filesize.out
Alternatively,
$ hist_filesize (directory) | copytime
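
To illustrate why the number of files matters as much as the total size, the sketch below evaluates a simple two-term model: a fixed cost per file plus the time to transfer the bytes. This is not how the copytime command works internally, and both parameters are hypothetical placeholders rather than measured values for this system. The function name "estimate" is also an assumption made for this example.

```shell
#!/bin/sh
# Sketch of a two-term copy-time model (NOT the copytime implementation):
#   seconds = files * per_file_overhead + bytes / throughput
# Both parameters below are hypothetical placeholders, not measured values.
per_file_overhead=0.01      # seconds per file (assumed)
throughput=100000000        # bytes per second (assumed)

estimate() {  # usage: estimate <number of files> <total bytes>
    awk -v n="$1" -v b="$2" -v o="$per_file_overhead" -v t="$throughput" \
        'BEGIN { printf "%.0f\n", n * o + b / t }'
}

# Same 10 GB of data: one million 10 kB files versus a single archive.
many_small=$(estimate 1000000 10000000000)
one_archive=$(estimate 1 10000000000)
echo "1,000,000 small files: ${many_small} s"
echo "single archive:        ${one_archive} s"
```

Under these assumed parameters the per-file term dominates for the small files, which is the effect that bundling them into one tar.gz archive is meant to avoid. For an estimate that reflects the actual system, use the copytime command.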

Searching for Files with Specified Conditions

  • Searching for files larger than 1GiB. Display file sizes as well.
    $ find (directory)/ -type f -size +1G -exec ls -lh {} \;
  • Searching for .fasta files larger than 100MiB. Display file sizes as well.
    $ find (directory)/ -type f -size +100M -name "*.fasta" -exec ls -lh {} \;
  • Searching for files larger than 1GiB that are not in .sra, .bam, or .gz format. Display file sizes as well.
    $ find (directory)/ -type f -size +1G ! -name "*.sra" ! -name "*.bam" ! -name "*.gz" -exec ls -lh {} \;
  • Searching for files smaller than 2 kB (2000 bytes).
    $ find (directory)/ -type f -size -2000c
  • Searching for files larger than 1GiB and older than 365 days. Display file sizes as well.
    $ find (directory)/ -type f -size +1G -mtime +365 -exec ls -lh {} \;
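
When a search matches many files, a count and a total size are often more useful than a long listing. The following sketch, assuming GNU find (for -printf), sums the sizes of the matched files with awk. The demo directory, file names, and the 1000-byte threshold are illustrative only.

```shell
#!/bin/sh
# Sketch: count the files matched by a find condition and sum their sizes.
# Assumes GNU find (-printf); the demo directory and threshold are illustrative.
dir=$(mktemp -d)
head -c 3000 /dev/urandom > "$dir/a.fasta"
head -c 5000 /dev/urandom > "$dir/b.fasta"
head -c 100  /dev/urandom > "$dir/c.fasta"
head -c 9000 /dev/urandom > "$dir/d.txt"

# .fasta files larger than 1000 bytes: print each size, then total with awk.
summary=$(find "$dir" -type f -name "*.fasta" -size +1000c -printf '%s\n' |
    awk '{ n++; s += $1 } END { printf "%d files, %.0f bytes\n", n, s }')
echo "$summary"

rm -rf "$dir"
```

The same pipeline works with any of the conditions listed above, for example -size +1G -mtime +365.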

Commands for File and Directory Cleanup

Using the pigz Command for Fast File Compression


$ pigz -p (number of cores) (file)
  • The pigz command performs fast gzip compression using multiple cores.
  • Files compressed with the pigz command can be decompressed using the standard gunzip command.
  • Please avoid gzip compression for .sra and .bam files commonly used in bioinformatics, as it is generally not effective.
  • Example 1: PBS job script to compress .fastq files in a specified directory
    #!/bin/sh
    #PBS -q APC
    #PBS -l select=1:ncpus=10:mem=32gb
    
    find (directory) -type f -name "*.fastq" -print -exec pigz -p 10 {} \;
    
  • Example 2: PBS job script to compress .fastq files in a specified directory with before-and-after file sizes displayed
    #!/bin/sh
    #PBS -q APC
    #PBS -l select=1:ncpus=10:mem=32gb
    
    find (directory) -type f -name "*.fastq" -exec ls -lh {} \; -exec pigz -p 10 {} \; -exec ls -lh {}.gz \;
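
Since pigz replaces the original file with the .gz file, you may want to confirm that the compressed file is readable. The sketch below compresses a demo file, checks it with gzip -t, and compares the sizes. It falls back to gzip when pigz is not installed, and the demo file and its contents are hypothetical.

```shell
#!/bin/sh
# Sketch: compress a file, then verify the .gz and compare sizes.
# Falls back to gzip when pigz is not installed; the demo file is hypothetical.
dir=$(mktemp -d)
# Build a compressible text file of repeated FASTQ-like lines.
yes "@read1 ACGTACGTACGTACGT" | head -n 20000 > "$dir/sample.fastq"
before=$(wc -c < "$dir/sample.fastq")

if command -v pigz >/dev/null 2>&1; then
    pigz -p 2 "$dir/sample.fastq"      # replaces sample.fastq with sample.fastq.gz
else
    gzip "$dir/sample.fastq"
fi

# gzip -t reads the whole file and exits non-zero if it is corrupted.
if gzip -t "$dir/sample.fastq.gz"; then status=ok; else status=corrupt; fi
after=$(wc -c < "$dir/sample.fastq.gz")
echo "$status: $before -> $after bytes"

rm -rf "$dir"
```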
    

Compressing a Directory Using the targz Command in tar.gz Format


$ targz -p (number of cores) (directory)
$ targz -p (number of cores) --rm (directory)  # Use the --rm option to remove (directory) after compression
  • The targz command bundles the directory using the tar command and performs fast gzip compression using the pigz command.
  • Running "targz -p 20 --rm /a/b/c/" behaves as follows:
    cd /a/b/                              # Move to the directory containing c/
    tar cvf - c/ | pigz -p 20 > c.tar.gz  # Compress the c/ directory
    tar atf c.tar.gz                      # Check that c.tar.gz is not corrupted
    /bin/rm -rf c/                        # Remove the c/ directory
    
    (Note) targz does not run tar cvf - /a/b/c/ directly; it first moves to /a/b/ and then compresses c/. The resulting file is /a/b/c.tar.gz.
  • Example 1: PBS job script to compress and remove a directory
    #!/bin/sh
    #PBS -q APC
    #PBS -l select=1:ncpus=10:mem=32gb
    
    targz -p 10 --rm (directory)
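
On a machine where the targz command is not available, the steps described above can be approximated by hand. The following is a sketch only, not the actual targz source. The helper name "archive_dir" is hypothetical, it falls back to gzip when pigz is not installed, and the demo directory is a throwaway.

```shell
#!/bin/sh
# Sketch of the steps described for "targz --rm" (NOT the actual targz source).
# archive_dir is a hypothetical helper name; it falls back to gzip without pigz.
archive_dir() {  # usage: archive_dir <cores> <directory>
    cores=$1
    parent=$(dirname "$2")
    name=$(basename "$2")
    if command -v pigz >/dev/null 2>&1; then
        compressor="pigz -p $cores"
    else
        compressor="gzip"
    fi
    (
        cd "$parent" || exit 1
        # Archive relative to the parent, then list the archive as a check.
        # The directory is removed only if compression and listing both succeed.
        tar cf - "$name" | $compressor > "$name.tar.gz" &&
            tar tzf "$name.tar.gz" >/dev/null &&
            rm -rf "$name"
    )
}

work=$(mktemp -d)
mkdir -p "$work/b/c"
echo "hello" > "$work/b/c/file.txt"
archive_dir 2 "$work/b/c/"

listing=$(tar tzf "$work/b/c.tar.gz")
echo "$listing"
if [ -d "$work/b/c" ]; then removed=no; else removed=yes; fi
echo "original directory removed: $removed"

rm -rf "$work"
```

Chaining the steps with && keeps the original directory in place if compression or the listing check fails. Note that in plain sh the exit status of the pipeline is that of the compressor, so an error from tar itself is not detected by this sketch.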
    

Change History

Date Description
2023/08/09 Initial version
2023/10/25 Due to a schedule delay, the start date for copying is changed.