Awk is a very powerful scripting tool that is particularly useful for file manipulation, for example in data cleansing exercises.

We have used awk (and nawk) extensively in many of the data cleansing exercises we've been involved in, and have also used it to generate SQL updates, for example to put names and addresses into "nice case" (mixed case) format.

Nice Case Conversion Script

The following awk script converts names to the preferred nice case format:

#------------------------------------------------------------------------------
# FILE: correctcase.sh
# Shell script case conversion... If you want it to generate SQL then simply
# have the keyfield as say field 1 and then include the appropriate:
# "UPDATE table SET field = \"" casey($fld) "\" WHERE key = \"" $1 "\";"
#------------------------------------------------------------------------------
inputfile=$1

nawk -F"|" '
# Nice case convert string. The extra parameters (i, ch, prev, ret) act as
# local variables, so the loop counter i does not clash with any global.
function casey( str,    i, ch, prev, ret )
{
   ret = ""; str = tolower( str ); prev = " "
   for (i=1;i<=length(str);i++) {
       ch = substr( str, i, 1 )
       # Upper-case any character that follows a non-letter
       if (prev < "a" || prev > "z")
          ret = ret toupper( ch )
       else
          ret = ret ch
       prev = ch
   }
   return ret
}
BEGIN { OFS="|" }
# Convert fields 3 to NF-1, then print the rebuilt record once.
# Records with 3 or fewer fields are not printed.
NF > 3 {
        for (fld=3;fld<NF;fld++) $fld = casey( $fld )
        print
}' "$inputfile"
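
The header comment of the script mentions generating SQL updates by treating field 1 as the key. The following is a minimal sketch of that idea, not a definitive implementation. It assumes a two-field, pipe-delimited layout (key, then name); the table name customers, the columns id and name, and the function names gen_updates and sqlquote are all hypothetical. It calls plain awk rather than nawk, and it emits single-quoted SQL literals with embedded quotes doubled, which is a choice made here rather than something taken from the original script.

```shell
#!/bin/sh
# Sketch: emit one SQL UPDATE per input record.
# Assumed layout (hypothetical): field 1 = key, field 2 = name, "|" delimited.
# Table "customers" and columns "id" / "name" are illustrative only.
gen_updates() {
    awk -F"|" -v q="'" '
    # Nice case conversion; the extra parameters act as local variables.
    function casey(str,    i, ch, prev, ret) {
        ret = ""; str = tolower(str); prev = " "
        for (i = 1; i <= length(str); i++) {
            ch = substr(str, i, 1)
            # Upper-case any character that follows a non-letter.
            ret = ret ((prev < "a" || prev > "z") ? toupper(ch) : ch)
            prev = ch
        }
        return ret
    }
    # Wrap a value in single quotes, doubling any embedded single quote
    # so the SQL string literal stays valid.
    function sqlquote(s) { gsub(q, q q, s); return q s q }
    NF >= 2 {
        printf "UPDATE customers SET name = %s WHERE id = %s;\n",
               sqlquote(casey($2)), sqlquote($1)
    }' "$@"
}
```

With the function defined, something like gen_updates names.txt > updates.sql (both file names are placeholders) writes one statement per record; with no file argument it reads standard input. An input record of 17|JOHN O'BRIEN yields UPDATE customers SET name = 'John O''Brien' WHERE id = '17';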

Training Notes

As well as writing awk and other shell scripts, we also deliver training courses. Attached is a training document that we prepared (quite a long time ago) for an awk training session; please feel free to use it.

AWK Training Notes
