Here are a few essential awk command-line scripts for next-generation sequence analysis. Users need a recent version of gawk to run the commands that use bitwise operations. Most Linux distributions come with gawk; OS X users, however, have to install it from here: http://rudix.org/packages/gawk.html Count the number of reads in a FASTQ file: awk 'END{print NR/4}' […]
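As a minimal sketch of the read-counting one-liner above: it relies on the assumption that every FASTQ record occupies exactly four lines (header, sequence, `+` separator, qualities), so the read count is the total line count divided by four. The sample file and the variable names `fastq` and `n_reads` are hypothetical and exist only for illustration.

```shell
# Build a tiny two-record FASTQ file (hypothetical sample data).
fastq=$(mktemp)
printf '%s\n' \
  '@read1' 'ACGT' '+' 'IIII' \
  '@read2' 'TTGA' '+' 'IIII' > "$fastq"

# NR holds the number of lines read so far; in the END block it is the
# total line count. Four lines per record, so reads = NR / 4.
n_reads=$(awk 'END { print NR / 4 }' "$fastq")
echo "$n_reads"
```

If the result is not a whole number, the file probably contains wrapped sequence lines or a truncated record, and the four-line assumption does not hold.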
Short command lines for manipulating FASTQ and FASTA sequence files
- February 23, 2015
I thought it was time for me to compile all the short commands that I use on a more or less regular basis to manipulate sequence files. Convert a multi-line FASTA to a single-line FASTA: awk '!/^>/ { printf "%s", $0; n = "\n" } /^>/ { print n $0; n = "" } END […]
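The excerpt cuts the one-liner off at its `END` block, so the following is only a sketch of how the multi-line to single-line conversion can work, not necessarily the author's exact command. The `END { printf "%s", n }` rule is an assumption about the truncated part, and the sample file and the variable names `fasta` and `single` are hypothetical.

```shell
# Build a small FASTA file whose first sequence is wrapped over two lines
# (hypothetical sample data).
fasta=$(mktemp)
printf '%s\n' '>seq1' 'ACGT' 'TTGA' '>seq2' 'GGCC' > "$fasta"

# Sequence lines are printed without a newline so they join together.
# n remembers that a newline is still owed before the next header.
single=$(awk '
  !/^>/ { printf "%s", $0; n = "\n" }   # sequence line: append, owe a newline
  /^>/  { print n $0; n = "" }          # header: close previous sequence first
  END   { printf "%s", n }              # assumed ending: close the last sequence
' "$fasta")
echo "$single"
```

With the sample above, `seq1` comes out as the single line `ACGTTTGA`, while each header stays on its own line.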