Request to reduce usage of /aptmp/(username)/
(Updated: October 25, 2023)
Target User: Users of the computing server who are using /aptmp/(username).
Cleanup Period: Until early November.
Details: Starting around mid-November, we plan to copy /aptmp/ of the computing server to the new system's /aptmp/ over a period of 4 to 5 weeks.
If /aptmp/(username) contains a large number of files or uses a large amount of disk space, the copy may not finish within the migration period.
Therefore, we kindly ask users to reduce their usage of /aptmp/(username)/ by early November.
If you have already backed up your /aptmp/ data yourself, or if you do not need your /aptmp/ data to be migrated, please contact us at spradm@scl.kyoto-u.ac.jp.
Reducing Usage of /aptmp/
- Please remove unnecessary files and directories.
- Compress large files that you do not plan to use in the near future with the gzip command.
- We recommend using the pigz command, which uses multiple cores for fast gzip compression.
- Compress directories that contain many files you do not plan to use in the near future into tar.gz format.
- The rsync command copies small files (a few tens of kB each) much more slowly, so please compress directories that contain a large number of small files into tar.gz format.
- Use the targz command, which uses multiple cores for fast tar.gz compression of directories.
Current Status Verification
Checking Disk Usage of /aptmp/
You can check the disk usage and file count of /aptmp/(username) using the lfs-quota command.
$ lfs-quota
Disk quotas for usr ****** (uid ****):
Filesystem kbytes quota limit grace files quota limit grace
/lustre/ 867585824 0 0 - 41408 0 0 -
Using the -h option displays disk usage in units such as kB, MB, GB, and TB.
$ lfs-quota -h
Disk quotas for usr ****** (uid ****):
Filesystem used quota limit grace files quota limit grace
/lustre/ 827.4G 0k 0k - 41408 0 0 -
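The two sample outputs above are consistent if one kbyte is taken as 1024 bytes and "G" as 1024 x 1024 kbytes. The following is a minimal sketch of that conversion; the function name `kb_to_gib` is hypothetical and is not part of lfs-quota.

```shell
#!/bin/sh
# Convert a kbytes value to the "G" unit with one decimal place,
# assuming 1 kbyte = 1024 bytes and 1 G = 1024 * 1024 kbytes.
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.1fG\n", kb / 1024 / 1024 }'
}

# kbytes value taken from the lfs-quota example output
kb_to_gib 867585824   # prints 827.4G
```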
Checking Disk Usage of Directories
$ du -hs (directory)
- Example 1: Use '*' to check every file and directory directly under a directory.
$ du -hs /aptmp/xxx/*
- If there are a large number of files, the command may take some time to execute.
- If the command takes more than one hour to execute, please consider reducing the number of files by compressing directories, etc.
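Since the number of files matters as much as disk usage for the migration, it can help to see which subdirectories hold the most files. The sketch below uses only standard find, wc, and sort; the function name `count_files_per_dir` is hypothetical, and file names containing newlines would be miscounted.

```shell
#!/bin/sh
# Print "<file count> <subdirectory>" for each immediate subdirectory of $1,
# sorted with the largest file count first.
count_files_per_dir() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        # Count regular files recursively; strip padding some wc versions add.
        n=$(find "$d" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$d"
    done | sort -rn
}

# Usage (path is a placeholder): count_files_per_dir /aptmp/xxx
```

As with du, this may take a long time on directories with a very large number of files.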
Checking File Size Distribution within a Directory
$ hist_filesize (directory)
- If there are a large number of files, the command may take some time to execute.
- If the command takes more than one hour to execute, please consider reducing the number of files by compressing directories, etc.
Estimating File Copy Time
You can estimate how long it will take to copy files with the rsync command.
Pass the output of the hist_filesize command to the copytime command.
$ hist_filesize (directory) > hist_filesize.out
$ copytime hist_filesize.out
Alternatively, pipe the output directly:
$ hist_filesize (directory) | copytime
Searching for Files with Specified Conditions
- Searching for files larger than 1GiB. Display file sizes as well.
$ find (directory)/ -type f -size +1G -exec ls -lh {} \;
- Searching for .fasta files larger than 100MiB. Display file sizes as well.
$ find (directory)/ -type f -size +100M -name "*.fasta" -exec ls -lh {} \;
- Searching for files larger than 1GiB that are not in .sra, .bam, or .gz format. Display file sizes as well.
$ find (directory)/ -type f -size +1G ! -name "*.sra" ! -name "*.bam" ! -name "*.gz" -exec ls -lh {} \;
- Searching for files smaller than 2 kB (2000 bytes).
$ find (directory)/ -type f -size -2000c
- Searching for files larger than 1GiB and older than 365 days. Display file sizes as well.
$ find (directory)/ -type f -size +1G -mtime +365 -exec ls -lh {} \;
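To judge how much space a given search condition covers, the matching files can be counted and their sizes summed. This is a minimal sketch that assumes GNU find (for the -printf option) and awk; the function name `sum_matching` is hypothetical.

```shell
#!/bin/sh
# Print the number and total size (in bytes) of regular files under $1
# that match the remaining arguments, which are passed to find as conditions.
# Assumes GNU find, which provides -printf.
sum_matching() {
    dir=$1
    shift
    find "$dir" -type f "$@" -printf '%s\n' |
        awk '{ n++; total += $1 } END { printf "%d files, %.0f bytes\n", n, total }'
}

# Usage (path is a placeholder):
#   sum_matching /aptmp/xxx -size +1G -mtime +365
```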
Commands for File and Directory Cleanup
Using the pigz Command for Fast File Compression
$ pigz -p (number of cores) (file)
- The pigz command performs fast gzip compression using multiple cores.
- Files compressed with the pigz command can be decompressed using the standard gunzip command.
- Please avoid gzip compression for .sra and .bam files commonly used in bioinformatics, as it is generally not effective.
- Example 1: PBS job script to compress .fastq files in a specified directory
#!/bin/sh
#PBS -q APC
#PBS -l select=1:ncpus=10:mem=32gb
find (directory) -type f -name "*.fastq" -print -exec pigz -p 10 {} \;
- Example 2: PBS job script to compress .fastq files in a specified directory with before-and-after file sizes displayed
#!/bin/sh
#PBS -q APC
#PBS -l select=1:ncpus=10:mem=32gb
find (directory) -type f -name "*.fastq" -exec ls -lh {} \; -exec pigz -p 10 {} \; -exec ls -lh {}.gz \;
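After a large compression job, it may be worth confirming that the resulting .gz files are intact. The sketch below relies on the standard `gzip -t` integrity test, which also works on files produced by pigz; the function name `check_gz` is hypothetical.

```shell
#!/bin/sh
# Test every .gz file under $1 and report the ones that fail the
# gzip integrity check. Prints nothing if all files are intact.
check_gz() {
    find "$1" -type f -name '*.gz' -exec sh -c '
        for f in "$@"; do
            gzip -t "$f" 2>/dev/null || echo "CORRUPT: $f"
        done' sh {} +
}

# Usage (path is a placeholder): check_gz /aptmp/xxx/fastq
```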
Compressing a Directory Using the targz Command in tar.gz Format
$ targz -p (number of cores) (directory)
$ targz -p (number of cores) --rm (directory) # Use the --rm option to remove (directory) after compression
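For reference, multi-core tar.gz compression can also be done with standard tools by piping tar into pigz. The sketch below illustrates that general technique only; it is not necessarily how the targz command is implemented, and the function name `pack_dir` is hypothetical. It falls back to single-core gzip when pigz is not installed, and it never deletes the original directory.

```shell
#!/bin/sh
# Pack directory $1 into $1.tar.gz using $2 cores (default 1).
# Uses pigz when available, otherwise plain gzip.
pack_dir() {
    dir=${1%/}
    cores=${2:-1}
    if command -v pigz >/dev/null 2>&1; then
        compress="pigz -p $cores"
    else
        compress="gzip"
    fi
    # Archive relative to the parent so paths inside start with the directory name.
    tar -cf - -C "$(dirname "$dir")" "$(basename "$dir")" | $compress > "$dir.tar.gz"
    # List the archive to confirm it is readable; the exit status reports the result.
    tar -tzf "$dir.tar.gz" >/dev/null
}

# Usage (path is a placeholder): pack_dir /aptmp/xxx/old_project 10
```

Removing the original directory is left as a separate manual step, to be done only after the archive has been checked.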
Change History
Date | Description
2023/08/09 | Initial version
2023/10/25 | Due to a schedule delay, the start date for copying was changed.