Top tips to keep your home folder on a server tidy

Anyone who uses or administers a bioinformatics server knows how easy it is to fill a home directory with huge amounts of data, especially when analysing deep sequencing data on a daily basis.

Here is a list of a few useful commands and tips that can help to keep your home directory tidy.

  • Do not copy fastq files to multiple locations. Instead, create soft links to them in your working directories with ln -s.
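
As a minimal sketch of the soft-link tip, assuming the raw reads live in a hypothetical directory /data/raw_reads and the file is called sample1.fastq (both names are placeholders, not taken from the original post):

```shell
# Hypothetical location of the original fastq file; replace with your own path.
raw_dir=/data/raw_reads

# Create a soft link in the current working directory instead of copying the file.
ln -s "$raw_dir/sample1.fastq" sample1.fastq

# Show where the link points.
ls -l sample1.fastq
```

The link itself takes up almost no space, and deleting it does not touch the original file.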

  • Always convert .sam files to .bam files, which are compressed and much smaller.
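
One way to do the conversion, sketched here on the assumption that samtools is installed and on your PATH. The helper name sam_to_bam is hypothetical, and removing the .sam file afterwards is my own suggestion rather than part of the original tip:

```shell
# Convert one SAM file to BAM, then remove the SAM only if a non-empty BAM exists.
# sam_to_bam is a hypothetical helper name; it assumes samtools is on PATH.
sam_to_bam() {
    local sam="$1"
    local bam="${sam%.sam}.bam"    # sample1.sam -> sample1.bam
    samtools view -b -o "$bam" "$sam" && [ -s "$bam" ] && rm "$sam"
}

# Usage (hypothetical file name):
#   sam_to_bam sample1.sam
```

If you would rather keep the .sam file until you have checked the .bam, drop the final rm step.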

  • Zip any data files/folders that are not going to be used for the next few weeks.

  • To compress a directory and everything in it, use tar with gzip compression (tar -czf).
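
The two compression tips above can be sketched as follows. The names old_project and big_table.txt are hypothetical, and the first two lines only create small demo data so the commands can be tried safely:

```shell
# Demo data standing in for a real project folder and a large file (hypothetical names).
mkdir -p old_project && echo "example" > old_project/notes.txt
echo "example" > big_table.txt

# Compress a directory and everything in it into one archive.
tar -czf old_project.tar.gz old_project

# Remove the original directory only after the archive lists cleanly.
tar -tzf old_project.tar.gz > /dev/null && rm -r old_project

# Compress a single file in place (big_table.txt becomes big_table.txt.gz).
gzip big_table.txt

# To restore later:
#   tar -xzf old_project.tar.gz
#   gunzip big_table.txt.gz
```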

  • Organise your home directory well.

      • Keep all reference sequences in one folder.

      • Keep all indexes in one folder (this could be the same folder as the references for simplicity).

  • Always delete temporary and intermediate files and keep a log of deleted files in a text file.
  • Empty the trash folder if you use a GUI or Virtual Desktop Environment
  • Use ncdu, tree or baobab (GUI) to see what is consuming disk space.
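
For the tip about logging deleted files, a minimal sketch assuming GNU find. The folder analysis/tmp, the file patterns and the log name deleted_files.log are all hypothetical, and the first line only creates demo files to delete:

```shell
# Demo intermediate files standing in for real ones (hypothetical names).
mkdir -p analysis/tmp && touch analysis/tmp/a.sam analysis/tmp/b.tmp

log=deleted_files.log

# Record the date, then print each matching file into the log as it is deleted.
echo "# $(date +%F) removed from analysis/tmp" >> "$log"
find analysis/tmp -type f \( -name '*.tmp' -o -name '*.sam' \) -print -delete >> "$log"
```

Running the find command without -delete first is a cheap way to preview what would be removed.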

  • Find out the size of your home directory with the du command, for example du -sh ~
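
A short sketch of the du tip, assuming GNU coreutils (the --max-depth option and sort -h are GNU features):

```shell
# Total size of the home directory in human-readable units.
du -sh "$HOME"

# Size of each top-level folder (hidden ones included); the last lines are the largest.
du -h --max-depth=1 "$HOME" 2>/dev/null | sort -h | tail -n 10
```

The second command is usually the more useful one, because it shows which folders are worth cleaning up first.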

  • For advanced users:

As discussed in a Stack Overflow thread, you can list duplicate copies of files in your directory by comparing file checksums.
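
The exact commands from that thread are not reproduced here. The following is one common checksum-based approach, sketched on the assumption that GNU find, xargs, md5sum and uniq are available. The dup_demo folder and its file names are hypothetical and are only created so the pipeline has something to find:

```shell
# Demo folder with two identical files and one different file (hypothetical names).
mkdir -p dup_demo
echo "ACGT" > dup_demo/reads_a.fastq
cp dup_demo/reads_a.fastq dup_demo/reads_copy.fastq
echo "TTTT" > dup_demo/other.fastq

# Checksum every non-empty file, sort by checksum, and keep only the lines whose
# first 32 characters (the MD5 sum) occur more than once.
find dup_demo -type f ! -empty -print0 \
    | xargs -0 -r md5sum \
    | sort \
    | uniq -w32 --all-repeated=separate > duplicates.txt

cat duplicates.txt
```

Each group of identical files is listed together, with groups separated by a blank line. Replace dup_demo with your own directory, and bear in mind that checksumming many large sequencing files can take a while.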

 

Bioinformatician at CVR.
http://bioinformatics.cvr.ac.uk/sejal.php