I. Formatting output with printf
printf gives you flexible yet simple control over the format in which results are printed.
Syntax:
printf "print format", variable1,variable2,etc.
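Special characters in printf:
The format string may contain escape sequences such as \n (newline) and \t (tab). printf does not use OFS or ORS; it prints data exactly as the format string specifies, so a trailing newline has to be written explicitly.
printf format specifiers:
%s string, %c single character, %d decimal integer, %e exponential notation, %f floating point, %g the shorter of %e and %f, %o octal, %x hexadecimal, %% a literal percent sign.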
Example 1:
[root@localhost ~]# cat pri.awk
BEGIN {
    printf "s--> %s\n", "String"
    printf "c--> %c\n", "String"
    printf "s--> %s\n", 101.23
    printf "d--> %d\n", 101.23
    printf "e--> %e\n", 101.23
    printf "f--> %f\n", 101.23
    printf "g--> %g\n", 101.23
    printf "o--> %o\n", 0x8
    printf "x--> %x\n", 16
    printf "percentage--> %%\n", 17
}
[root@localhost ~]# awk -f pri.awk
s--> String
c--> S
s--> 101.23
d--> 101
e--> 1.012300e+02
f--> 101.230000
g--> 101.23
o--> 10
x--> 10
percentage--> %
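printf modifiers:
Modifier: #[.#]  The first number sets the minimum field width; the second number, after the dot, sets the precision (digits after the decimal point).
-  Left-align (the default is right-align), e.g. %-15s
+  Show the sign of a numeric value, e.g. %+d; zero is also printed with a plus sign
$  Not a modifier: to put a dollar sign in front of a price, write a literal $ in the format string immediately before the %, e.g. $%.2f
0  Pad on the left with zeros instead of spaces when a width is given, e.g. use "%05d" instead of "%5d"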
Example 2:
[root@localhost ~]# awk 'BEGIN { printf "|%6s%7.3f|\n", "Good","2.1" }'
|  Good  2.100|
[root@localhost ~]# awk 'BEGIN { printf "|%-6s%-7.3f|\n", "Good","2.1" }'
|Good  2.100  |
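The modifiers listed above can be tried side by side. The following is only a small sketch: the function name printf_modifiers and the sample values are made up for illustration.

```shell
# Hypothetical helper; the format strings are the point of the example.
printf_modifiers() {
    awk 'BEGIN {
        printf "|%-15s|\n", "left"     # left-aligned in 15 columns
        printf "|%15s|\n", "right"     # right-aligned, which is the default
        printf "|%+d|%+d|\n", 7, 0     # explicit sign, also for zero
        printf "|%05d|\n", 42          # zero-padded to width 5
        printf "|$%.2f|\n", 19.5       # literal $ before a price with 2 decimals
    }'
}
printf_modifiers
```

The vertical bars are there only to make the padding visible.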
Redirecting output to a file:
Inside awk, the output of a print or printf statement can be redirected to a named file.
Example 3:
[root@localhost ~]# awk 'BEGIN{a=5;printf "%3d\n",a > "report.txt"}'
[root@localhost ~]# cat report.txt
  5
Alternatively, redirect the output of the whole script from the shell: awk -f script.awk file > redirectfile
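Redirection inside awk is most useful when the target file depends on the data. The sketch below assumes comma-separated input whose third field is a category; the function name split_by_category and the output directory argument are hypothetical.

```shell
# Hypothetical helper: appends the second field of every input line to
# a file named after the third field, inside the directory given as $1.
# Usage (the directory must already exist):
#   split_by_category /tmp/out < items.txt
split_by_category() {
    awk -F"," -v dir="$1" '{
        file = dir "/" $3 ".txt"    # build the file name in a variable first
        print $2 >> file            # >> appends instead of truncating
        close(file)                 # keeps the number of open files small
    }'
}
```

Because the file is closed after every line, >> is used here; with > each reopen would truncate the file again.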
Ways to run an awk script: besides awk -f, a script whose first line is #!/bin/awk -f can be made executable and run directly.
Example 4:
[root@localhost ~]# cat fz.awk
#!/bin/awk -f
BEGIN {
    FS=",";
    OFS=",";
    total1 = total2 = total3 = total4 = total5 = 10;
    total1 += 5; print total1;
    total2 -= 5; print total2;
    total3 *= 5; print total3;
    total4 /= 5; print total4;
    total5 %= 5; print total5;
}
[root@localhost ~]# chmod +x fz.awk
[root@localhost ~]# ./fz.awk
15
5
50
2
0
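II. awk built-in functions and user-defined functions
Numeric functions:
rand() function
rand() returns a random number n in the range 0 <= n < 1: it may return 0 but never 1. The numbers vary within a single awk run, but because the seed is fixed unless srand() is called, every run produces the same sequence.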
Example 1: generate 1000 random integers (from 0 to 99) and count how often each one occurs
[root@localhost ~]# cat occ.awk
BEGIN {
    while(i<1000)
    {
        n = int(rand()*100);
        rnd[n]++;
        i++;
    }
    for(i=0;i<100;i++)
    {
        print i,"Occurred",rnd[i],"times";
    }
}
[root@localhost ~]# awk -f occ.awk
0 Occurred 11 times
1 Occurred 8 times
2 Occurred 9 times
3 Occurred 15 times
4 Occurred 16 times
5 Occurred 5 times
6 Occurred 8 times
7 Occurred 9 times
8 Occurred 7 times
9 Occurred 7 times
10 Occurred 11 times
11 Occurred 7 times
12 Occurred 10 times
13 Occurred 9 times
14 Occurred 6 times
15 Occurred 18 times
16 Occurred 10 times
17 Occurred 10 times
18 Occurred 9 times
19 Occurred 8 times
20 Occurred 11 times
21 Occurred 13 times
22 Occurred 10 times
23 Occurred 9 times
24 Occurred 15 times
25 Occurred 8 times
26 Occurred 3 times
27 Occurred 17 times
28 Occurred 9 times
29 Occurred 13 times
30 Occurred 11 times
31 Occurred 9 times
32 Occurred 12 times
33 Occurred 12 times
34 Occurred 9 times
35 Occurred 6 times
36 Occurred 13 times
37 Occurred 15 times
38 Occurred 6 times
39 Occurred 9 times
40 Occurred 7 times
41 Occurred 8 times
42 Occurred 6 times
43 Occurred 8 times
44 Occurred 10 times
45 Occurred 7 times
46 Occurred 10 times
47 Occurred 8 times
48 Occurred 16 times
49 Occurred 12 times
50 Occurred 6 times
51 Occurred 15 times
52 Occurred 6 times
53 Occurred 12 times
54 Occurred 8 times
55 Occurred 13 times
56 Occurred 6 times
57 Occurred 16 times
58 Occurred 5 times
59 Occurred 7 times
60 Occurred 11 times
61 Occurred 12 times
62 Occurred 14 times
63 Occurred 11 times
64 Occurred 9 times
65 Occurred 6 times
66 Occurred 7 times
67 Occurred 10 times
68 Occurred 8 times
69 Occurred 12 times
70 Occurred 13 times
71 Occurred 9 times
72 Occurred 10 times
73 Occurred 11 times
74 Occurred 7 times
75 Occurred 13 times
76 Occurred 13 times
77 Occurred 10 times
78 Occurred 5 times
79 Occurred 12 times
80 Occurred 17 times
81 Occurred 8 times
82 Occurred 7 times
83 Occurred 10 times
84 Occurred 12 times
85 Occurred 12 times
86 Occurred 11 times
87 Occurred 14 times
88 Occurred 4 times
89 Occurred 8 times
90 Occurred 15 times
91 Occurred 10 times
92 Occurred 15 times
93 Occurred 8 times
94 Occurred 11 times
95 Occurred 5 times
96 Occurred 12 times
97 Occurred 11 times
98 Occurred 7 times
99 Occurred 11 times
Note: since 1000 draws are spread over only 100 possible values, every value is bound to repeat; on average each one appears about 10 times.
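srand(n) function
srand(n) initializes the random number generator with the seed n, so awk produces the same sequence every time it is started with the same n. Called without an argument, srand() uses the current time of day as the seed.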
Example 2: generate 5 distinct random numbers between 5 and 49
[root@localhost ~]# cat srand.awk
BEGIN {
    # Initialize the seed with 5.
    srand(5);
    # Generate 5 numbers in total.
    total = 5;
    # int(rand()*max) yields 0 .. max-1, so the largest value is 49.
    max = 50;
    count = 0;
    while (count < total) {
        rnd = int(rand()*max);
        # Count only new values that fall inside the printed range.
        if (rnd >= 5 && array[rnd] == 0) {
            count++;
            array[rnd]++;
        }
    }
    for (i = 5; i <= max; i++) {
        if (array[i]) print i;
    }
}
[root@localhost ~]# awk -f srand.awk
14
16
23
33
35
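A common variation is to scale rand() so that both ends of an integer range can occur. The sketch below is one way to do it; the function name rand_between and its four arguments (minimum, maximum, count, seed) are assumptions made for this example.

```shell
# Hypothetical helper: rand_between MIN MAX COUNT SEED
# Prints COUNT random integers, each between MIN and MAX inclusive.
rand_between() {
    awk -v min="$1" -v max="$2" -v n="$3" -v seed="$4" 'BEGIN {
        srand(seed)    # same seed, same sequence
        for (i = 0; i < n; i++)
            # rand() is in [0,1), so the product is in [0, max-min+1)
            print min + int(rand() * (max - min + 1))
    }'
}
rand_between 5 50 5 5
```

Running the last line twice prints the same five numbers both times, because the seed is fixed; the actual values depend on the awk implementation.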
Common string functions:
length function:
length([s]) returns the length of the string s; without an argument it returns the length of $0.
Example 1: the length function
[root@bash ~]# awk 'BEGIN{print length("young")}'
5
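sub function:
sub(r,s,[t]) searches the string t for the pattern r (a regular expression) and replaces the first match with the string s. If t is omitted, $0 is used. The return value is the number of replacements made (0 or 1).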
Example 1:
[root@bash ~]# awk 'BEGIN{a="geek young";sub("young","xixi",a);print a}'
geek xixi
# Note: string arguments must be enclosed in quotes
Example 2:
[root@bash ~]# echo "geek young hahahaha"|awk '
>{sub(/\<young\>/,"xixi",$2);    # a regex literal goes between slashes, without quotes
>print $2}'
xixi
Example 3:
[root@bash ~]# echo "2008:08:08:08 08:08:08" | awk 'sub(/:/,"",$1)'
200808:08:08 08:08:08
Example 4:
[root@bash ~]# cat sub.awk
BEGIN {
    state="CA is California"
    sub("C[Aa]","KA",state);
    print state;
}
[root@bash ~]# awk -f sub.awk
KA is California
gsub function:
gsub(r,s,[t]) searches the string t for the pattern r (a regular expression) and replaces every match with s. If t is omitted, $0 is used.
Example 1:
[root@bash ~]# echo "2008:08:08:08 08:08:08" | awk 'gsub(/:/,"",$1)'
2008080808 08:08:08
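Two details of sub and gsub are easy to miss: both return the number of replacements they made, and an & in the replacement string stands for the matched text. The following sketch shows both on the same sample line; the function name gsub_demo is made up.

```shell
# Hypothetical helper demonstrating the return value and the & placeholder.
gsub_demo() {
    echo "2008:08:08:08 08:08:08" | awk '{
        n = gsub(/:/, "-")           # no third argument, so $0 is changed
        print n, $0                  # number of replacements, then the new record
        gsub(/[0-9]+/, "<&>", $1)    # & is replaced by whatever matched
        print $1
    }'
}
gsub_demo
```

This prints "5 2008-08-08-08 08-08-08" followed by "<2008>-<08>-<08>-<08>".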
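split function:
split(s,array,[r]) splits the string s on the separator r and stores the pieces in array: the first piece at index 1, the second at index 2, and so on. If r is omitted, FS is used. The return value is the number of pieces.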
Example 1:
[root@bash ~]# echo "192.168.1.1:80"|awk '
>{split($1,ip,":");
>print ip[1],"----",ip[2]}'
192.168.1.1 ---- 80
Example 2:
[root@bash ~]# netstat -tan | awk '
>/^tcp\>/{split($5,ip,":");
>count[ip[1]]++}    # using a value from one array as the index of another and incrementing it is a common way to count occurrences
>END{for (i in count){print i,count[i]}}'
116.211.167.193 3
0.0.0.0 4
192.168.1.116 1
Example 3:
[root@bash ~]# cat items-sold1.txt
101:2,10,5,8,10,12
102:0,1,4,3,0,2
103:10,6,11,20,5,13
104:2,3,4,0,6,5
105:10,2,5,7,12,6
[root@bash ~]# cat split.awk
BEGIN {
    FS=":"
}
{
    split($2,quantity,",");
    total=0;
    for(x in quantity)
        total=total+quantity[x];
    print "Item",$1,":",total,"quantities sold";
}
[root@bash ~]# awk -f split.awk items-sold1.txt
Item 101 : 47 quantities sold
Item 102 : 10 quantities sold
Item 103 : 65 quantities sold
Item 104 : 20 quantities sold
Item 105 : 42 quantities sold
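A for (x in array) loop visits the elements in an unspecified order, which is fine for a sum but not when the position matters. A sketch of the ordered alternative, using the return value of split as the loop bound (the function name split_ordered and the "entry" label are illustrative only):

```shell
# Hypothetical helper: walks the pieces of a split in index order.
split_ordered() {
    echo "101:2,10,5,8,10,12" | awk -F":" '{
        n = split($2, quantity, ",")    # n is the number of pieces
        for (i = 1; i <= n; i++)        # indexes run from 1 to n
            printf "entry %d: %d\n", i, quantity[i]
    }'
}
split_ordered
```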
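substr function
Syntax:
substr(input-string,location,length)
substr extracts the specified part (a substring) of a string. In the syntax above:
- input-string: the string that contains the substring
- location: the starting position of the substring (the first character is position 1)
- length: the number of characters to take, starting at location. This argument is optional; if it is omitted, everything from location to the end of the string is returned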
Example 1: print each line from its 5th character to the end
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk '{ print substr($0,5) }' items.txt
HD Camcorder,Video,210,10
Refrigerator,Appliance,850,2
MP3 Player,Audio,270,15
Tennis Racket,Sports,190,20
Laser Printer,Office,475,5
Example 2: print 5 characters of the 2nd field, starting at its 1st character
[root@localhost ~]# awk -F"," '{ print substr($2,1,5) }' items.txt
HD Ca
Refri
MP3 P
Tenni
Laser
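When the start position is not known in advance, substr is often combined with index(), a standard awk function not covered above that returns the position of one string inside another. A minimal sketch, with the made-up function name split_at_comma:

```shell
# Hypothetical helper: cuts a line at its first comma using index and substr.
split_at_comma() {
    echo "101,HD Camcorder,Video,210,10" | awk '{
        pos = index($0, ",")             # position of the first comma, 0 if none
        print substr($0, 1, pos - 1)     # everything before the comma
        print substr($0, pos + 1, 12)    # the 12 characters after it
    }'
}
split_at_comma
```

For this line pos is 4, so the output is "101" followed by "HD Camcorder".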
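Calling shell commands
Two-way pipe |&
gawk can communicate with an external process through "|&", and the communication works in both directions. This operator is a gawk extension and is not available in other awk implementations.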
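Example 1:
[root@localhost ~]# cat doub.awk
BEGIN {
    command = "sed 's/Awk/Sed and Awk/'"
    print "Awk is Great!" |& command
    close(command,"to");    # close the write end so that sed sees end of input
    command |& getline tmp
    print tmp;
    close(command);
}
[root@localhost ~]# awk -f doub.awk
Sed and Awk is Great!
Notes: "|&" marks a two-way pipe: the output of the statement on its left becomes the input of the command on its right. close(command,"to") closes the "to" end of the pipe once all input has been sent, so the command sees end of file and can finish. command |& getline tmp then reads the command's output and stores it in the variable tmp. Finally, close(command) closes the command.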
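system function
system() accepts an arbitrary string and runs it as an operating system command, exactly as written. The command's output goes straight to standard output, and system() returns the command's exit status (unlike the two-way pipe, which hands the output back to awk).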
Example 1:
[root@localhost ~]# awk 'BEGIN{system("hostname");}'    # no print statement is needed
localhost.localdomain
[root@localhost ~]# awk 'BEGIN{system("pwd")}'
/root
[root@localhost ~]# awk 'BEGIN{system("date")}'
Fri Jan 20 23:57:55 CST 2017
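The value returned by system() is the exit status of the command, which makes it usable as a success check. The sketch below assumes nothing beyond the standard test utility; the function name system_status and the path /no/such/dir are made up.

```shell
# Hypothetical helper: uses the exit status returned by system().
system_status() {
    awk 'BEGIN {
        rc = system("test -d /")               # succeeds, exit status 0
        print "test -d / returned", rc
        rc = system("test -d /no/such/dir")    # fails, non-zero exit status
        print "second call succeeded?", (rc == 0 ? "yes" : "no")
    }'
}
system_status
```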
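getline function
The getline command lets you control how awk reads data from the input file (or from other files). Note that a plain getline, once it completes, updates the built-in variables $0, NF, NR and FNR.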
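Example 1:
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk -F"," '
>{getline;print $0;}' items.txt    # like the n command in sed, getline changes awk's flow of execution
102,Refrigerator,Appliance,850,2
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
Notes:
- When the body block starts, before any statement runs, awk reads the first line of items.txt into $0
- getline: forces awk to read the next line into $0, overwriting the previous contents
- print $0: $0 now holds the second line, so print $0 prints the second line of the file (not the first)
- The body block keeps executing this way, so only the even-numbered lines are printed. (The last line, 105, is printed as well: at end of file getline reads nothing and leaves $0 unchanged)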
Besides storing the line read by getline in $0, you can store it in a variable.
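Example 2: print the odd-numbered lines
[root@localhost ~]# awk -F"," '{getline tmp; print $0;}' items.txt
101,HD Camcorder,Video,210,10
103,MP3 Player,Audio,270,15
105,Laser Printer,Office,475,5
Notes:
- When the body block starts, before any statement runs, awk reads the first line of items.txt into $0
- getline tmp: forces awk to read the next line and store it in the variable tmp
- print $0: $0 still holds the first line, because getline tmp does not overwrite $0, so the first line is printed (not the second)
- The body block keeps executing this way, so only the odd-numbered lines are printed.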
Example 3: use getline to read lines from another file into a variable
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# cat items-sold.txt
101 2 10 5 8 10 12
102 0 1 4 3 0 2
103 10 6 11 20 5 13
104 2 3 4 0 6 5
105 10 2 5 7 12 6
[root@localhost ~]# awk -F"," '{
>print $0;
>getline tmp < "items-sold.txt";
>print tmp;}' items.txt
101,HD Camcorder,Video,210,10
101 2 10 5 8 10 12
102,Refrigerator,Appliance,850,2
102 0 1 4 3 0 2
103,MP3 Player,Audio,270,15
103 10 6 11 20 5 13
104,Tennis Racket,Sports,190,20
104 2 3 4 0 6 5
105,Laser Printer,Office,475,5
105 10 2 5 7 12 6
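getline returns 1 when it has read a line, 0 at end of file and -1 on error, so a loop that reads a whole file should compare the result with 0. A sketch under that assumption; the function name read_whole_file is made up and the file name is passed as its first argument.

```shell
# Hypothetical helper: reads every line of the file named in $1.
read_whole_file() {
    awk -v file="$1" 'BEGIN {
        # "> 0" also stops the loop on -1 (for example, a missing file);
        # without it a read error would loop forever.
        while ((getline line < file) > 0)
            count++
        close(file)
        print count + 0, "lines; last line:", line
    }'
}
```

Called as read_whole_file items-sold.txt, it would report the number of lines and echo the final one, since line keeps its last value when end of file is reached.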
Example 4: use getline to run an external command
[root@localhost ~]# cat get.awk
BEGIN {
    FS=",";
    "date" | getline
    close("date")
    print "Timestamp:" $0
}
{
    if ( $5 <= 5)
        print "Buy More:Order",$2,"immediately!"
    else
        print "Sell More:Give discount on",$2,"immediately!"
}
[root@localhost ~]# cat items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@localhost ~]# awk -f get.awk items.txt
Timestamp:Sat Jan 21 00:23:53 CST 2017
Sell More:Give discount on HD Camcorder immediately!
Buy More:Order Refrigerator immediately!
Sell More:Give discount on MP3 Player immediately!
Sell More:Give discount on Tennis Racket immediately!
Buy More:Order Laser Printer immediately!
Example 5: instead of $0, the command's output can be stored in any awk variable
[root@localhost ~]# cat get2.awk
BEGIN {
    FS=",";
    "date" | getline timestamp
    close("date")
    print "Timestamp:" timestamp
}
{
    if ( $5 <= 5)
        print "Buy More: Order",$2,"immediately!"
    else
        print "Sell More: Give discount on",$2,"immediately!"
}
[root@localhost ~]# awk -f get2.awk items.txt
Timestamp:Sat Jan 21 00:26:29 CST 2017
Sell More: Give discount on HD Camcorder immediately!
Buy More: Order Refrigerator immediately!
Sell More: Give discount on MP3 Player immediately!
Sell More: Give discount on Tennis Racket immediately!
Buy More: Order Laser Printer immediately!
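A single getline reads only one line of a command's output. When the command prints several lines, the same construct can be placed in a loop; the sketch below uses a made-up function name, count_command_lines, and a stand-in command built from echo.

```shell
# Hypothetical helper: reads all output lines of a command, one per iteration.
count_command_lines() {
    awk 'BEGIN {
        cmd = "echo one; echo two; echo three"    # stand-in for any command
        while ((cmd | getline line) > 0)          # stops at end of output or on error
            n++
        close(cmd)                                # lets the command be run again later
        print n, "lines, last:", line
    }'
}
count_command_lines
```

This prints "3 lines, last: three".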
awk user-defined functions
Format:
function name ( parameter, parameter, … ) {
statements
return expression
}
Example 1:
[root@localhost ~]# cat fun.awk
function max(v1,v2) {
    return v1>v2 ? v1 : v2
}
BEGIN{a=3;b=2;print max(a,b)}
[root@localhost ~]# awk -f fun.awk
3
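Variables used inside an awk function are global unless they appear in the parameter list, so the usual convention is to declare extra parameters that the caller never passes and use them as locals. A sketch of that convention; the names max_of_fields and maxfield are made up for this example.

```shell
# Hypothetical helper: prints the largest field of a sample line.
max_of_fields() {
    echo "3 9 4" | awk '
    # i and best are extra parameters that serve as local variables
    function maxfield(n,    i, best) {
        best = $1
        for (i = 2; i <= n; i++)
            if ($i > best) best = $i
        return best
    }
    { print maxfield(NF) }'
}
max_of_fields
```

This prints 9; the extra spaces before i in the parameter list are only a visual hint that the names after them are locals.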
Permanent link to this article: http://www.linuxidc.com/Linux/2017-02/140274.htm