Using the Linux command-line tools grep and awk to analyze the wordpress.log access log on a Mac

Author: admin  Category: Command line  Published: 2019-03-20 23:22  Views: 418

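How do you inspect a website's access log from the Linux command line?

If you run a personal website, you may want to know which IP addresses and locations are visiting it. Third-party analytics tools such as Baidu Tongji (百度统计) or cnzz can tell you. If you would rather not use them, you have to do the work yourself. For a WordPress template site on an Alibaba Cloud server, the access log is usually at /alidata/log/nginx/access/wordpress.log. Download it first; you can then open it in a text editor or inspect it from the command line.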

  • First, look at what the first ten lines of the log contain
head -n 10 wordpress.log

IP | Time | Request method | Request path | Protocol version | Status code | Response size (bytes) | Referer | Client (User-Agent)
97.74.228.115 - - [14/Mar/2019:15:46:07 +0800] "POST /xmlrpc.php HTTP/1.1" 200 414 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; fr; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8"
97.74.228.115 - - [14/Mar/2019:15:46:07 +0800] "" 400 0 "-" "-"
185.211.245.169 - - [14/Mar/2019:15:46:30 +0800] "POST /xmlrpc.php HTTP/1.1" 200 414 "http://deathearth.com/xmlrpc.php" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/535.24.77 (KHTML, like Gecko) Chrome/54.8.3571.8843 Safari/531.94"
103.130.201.5 - - [14/Mar/2019:15:47:39 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:41 +0800] "POST /wp-login.php HTTP/1.1" 200 1622 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:42 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:43 +0800] "POST /wp-login.php HTTP/1.1" 200 1601 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:47 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:49 +0800] "POST /wp-login.php HTTP/1.1" 200 1636 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:51 +0800] "POST /wp-login.php HTTP/1.1" 200 1600 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
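
The sample lines follow what looks like nginx's combined log format, so individual fields can be picked out by position. Below is a minimal sketch of doing that with awk; the helper name show_fields is invented for this example, and the field positions are an assumption based on the lines shown above.

```shell
# Hypothetical helper: print IP, timestamp, method, path, and status per line.
# Assumes whitespace-separated fields laid out as in the samples above; a line
# with an empty request ("" 400 0 ...) has fewer fields and will not line up.
show_fields() {
  awk '{
    ts = substr($4, 2)        # strip the leading [
    method = substr($6, 2)    # strip the leading double quote
    print $1, ts, method, $7, $9
  }' "$@"
}

# Demo on one line copied from the output above.
sample='103.130.201.5 - - [14/Mar/2019:15:47:39 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"'
fields=$(printf '%s\n' "$sample" | show_fields)
echo "$fields"
```

On the real log this would be run as show_fields wordpress.log | head -n 10.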

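These are almost all login-page requests. Next, see which .html pages were visited.

  • Exclude the content you do not want to see, then look again
grep -v 'js\|php\|css\|png\|jpg\|login' --color=auto -m 10 wordpress.log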

97.74.228.115 - - [14/Mar/2019:15:46:07 +0800] "" 400 0 "-" "-"
47.101.204.96 - - [14/Mar/2019:15:48:33 +0800] "GET / HTTP/1.1" 200 37308 "-" "Mozilla/5.0 (Linux; Android 4.1.1; Nexus 7 Build/JRO03D))"
122.224.233.150 - - [14/Mar/2019:15:48:50 +0800] "GET / HTTP/1.1" 200 11701 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36"
123.125.71.24 - - [14/Mar/2019:15:49:15 +0800] "GET /360.html HTTP/1.1" 200 19884 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
203.208.60.79 - - [14/Mar/2019:15:52:10 +0800] "GET /page/9 HTTP/1.1" 200 8545 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
203.208.60.82 - - [14/Mar/2019:15:53:47 +0800] "GET /355.html HTTP/1.1" 200 10084 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
42.72.105.140 - - [14/Mar/2019:15:56:09 +0800] "GET /323.html HTTP/1.1" 200 19733 "https://www.google.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
203.208.60.88 - - [14/Mar/2019:16:02:52 +0800] "GET /382.html HTTP/1.1" 200 9025 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
218.2.97.91 - - [14/Mar/2019:16:10:30 +0800] "GET /499.html HTTP/1.1" 200 12012 "https://www.baidu.com/link?url=qOaUMBLZUaDyaQzUVlZ-WidqZNUH5EU3DlbInCPTpu8FkkG6PMFqIXaPsOWFDQld&wd=&eqid=90ba627100009750000000035c8a0c47" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
218.2.97.91 - - [14/Mar/2019:16:10:45 +0800] "GET /wp-content/themes/JieStyle-Two/images/favicon.ico HTTP/1.1" 200 1150 "https://www.deathearth.com/499.html" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
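
To narrow the log down to requests for .html pages only, keep in mind that an unescaped dot in a grep pattern matches any single character, and that crawler user agents often mention an .html URL of their own. A rough sketch that ties the match to the request path instead; html_requests is a made-up name, and the pattern assumes plain GET requests without a query string.

```shell
# Hypothetical helper: keep only GET requests whose path ends in .html.
# The dot is escaped so it matches a literal "." rather than any character.
html_requests() {
  grep -E '"GET [^ ]*\.html ' "$@"
}

# Demo on two lines copied from the output above: only the first requests an
# .html page; the second merely mentions bot.html in its user agent.
html_hits=$(html_requests <<'EOF'
123.125.71.24 - - [14/Mar/2019:15:49:15 +0800] "GET /360.html HTTP/1.1" 200 19884 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
203.208.60.79 - - [14/Mar/2019:15:52:10 +0800] "GET /page/9 HTTP/1.1" 200 8545 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
EOF
)
echo "$html_hits"
```

On the real log: html_requests wordpress.log | head -n 10.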

 

  • Next, keep only the spider (crawler) records from what remains (first 10 lines)
grep -v 'js\|php\|css\|png\|jpg\|login' --color=auto wordpress.log | grep -m 10 'spider'

123.125.71.24 - - [14/Mar/2019:15:49:15 +0800] "GET /360.html HTTP/1.1" 200 19884 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
220.181.108.101 - - [14/Mar/2019:16:59:17 +0800] "GET /419.html HTTP/1.1" 200 9211 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
106.38.241.133 - - [14/Mar/2019:17:08:50 +0800] "GET /studyremark/page/6 HTTP/1.1" 200 7418 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
106.38.241.133 - - [14/Mar/2019:17:09:10 +0800] "GET /studyremark/page/2 HTTP/1.1" 200 9487 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
106.38.241.133 - - [14/Mar/2019:17:09:33 +0800] "GET /studyremark/page/4 HTTP/1.1" 200 9241 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
106.38.241.133 - - [14/Mar/2019:17:09:59 +0800] "GET /studyremark/page/5 HTTP/1.1" 200 9286 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
221.5.46.239 - - [14/Mar/2019:17:22:47 +0800] "GET / HTTP/1.1" 200 9809 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
123.125.71.88 - - [14/Mar/2019:17:32:37 +0800] "GET /333.html HTTP/1.1" 200 12713 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
220.181.108.156 - - [14/Mar/2019:18:05:57 +0800] "GET /333.html HTTP/1.1" 200 12596 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
220.181.108.100 - - [14/Mar/2019:18:39:17 +0800] "GET /486.html HTTP/1.1" 200 12331 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
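
To see which crawlers account for these hits, the user-agent string can be tallied instead of printing whole lines. A small sketch, assuming the user agent is the sixth field when a line is split on double quotes, as it is in the samples above; spider_agents is a name invented here.

```shell
# Hypothetical helper: count spider hits per user-agent string, busiest first.
spider_agents() {
  awk -F'"' '/spider/ { seen[$6]++ }
             END { for (ua in seen) print seen[ua] "\t" ua }' "$@" |
    sort -rn
}

# Demo on three lines copied from the output above (two Baidu, one Sogou).
agents=$(spider_agents <<'EOF'
123.125.71.24 - - [14/Mar/2019:15:49:15 +0800] "GET /360.html HTTP/1.1" 200 19884 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
220.181.108.101 - - [14/Mar/2019:16:59:17 +0800] "GET /419.html HTTP/1.1" 200 9211 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
106.38.241.133 - - [14/Mar/2019:17:08:50 +0800] "GET /studyremark/page/6 HTTP/1.1" 200 7418 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
EOF
)
echo "$agents"
```

On the real log: spider_agents wordpress.log.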
  • Then print only the first column, the IP address
grep -v 'js\|php\|css\|png\|jpg\|login' --color=auto wordpress.log | grep 'spider' | awk -F '[ ]' '{print $1}'

106.38.241.133
123.125.71.58
123.125.71.53
106.38.241.133
123.125.71.13
180.76.15.19
180.76.15.13
180.76.15.19
180.76.15.25
220.181.108.110
106.38.241.133
...
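  • Deduplicate the IPs and count the requests from each one
## The log fields are separated by spaces, which awk splits on by default
awk '{a[$1]+=1;} END {for(i in a){print a[i]" "i;}}' wordpress.log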

20 47.92.133.31
25 133.237.7.82
1 116.128.128.79
2 42.236.99.130
7 162.243.27.98
4 172.104.108.109
1 117.206.245.206
23 40.83.179.173
1 202.46.58.20
11 5.188.62.5
2 89.115.238.22
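
An awk for-in loop visits keys in no particular order, which is why the counts above look shuffled. Piping the result through sort puts the busiest addresses first. A sketch of that, with top_ips as an illustrative name that takes the number of rows to show followed by the log file(s); the demo lines are shortened copies of the entries above.

```shell
# Hypothetical helper: top_ips N [FILE...] prints the N busiest client IPs.
top_ips() {
  n=$1
  shift
  awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$@" |
    sort -rn | head -n "$n"
}

# Demo on three shortened lines: two requests from one IP, one from another.
busiest=$(top_ips 1 <<'EOF'
103.130.201.5 - - [14/Mar/2019:15:47:39 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "-"
97.74.228.115 - - [14/Mar/2019:15:46:07 +0800] "" 400 0 "-" "-"
103.130.201.5 - - [14/Mar/2019:15:47:41 +0800] "POST /wp-login.php HTTP/1.1" 200 1622 "-" "-"
EOF
)
echo "$busiest"
```

On the real log: top_ips 10 wordpress.log.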
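  • Then I suddenly wanted the entries from one particular hour of today
## This first attempt fails: grep takes sed and the sed script as file names, and the empty pattern '' matches every line, so after two errors it just prints the first 10 lines. The 2019-03-20 16:00:00 timestamps also do not match the log's 14/Mar/2019:15:46:07 format.
grep '' -h sed '/2019-03-20 16:00:00/,/2019-03-20 17:00:00/p' -m 10 wordpress.log 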
grep: sed: No such file or directory
grep: /2019-03-20 16:00:00/,/2019-03-20 17:00:00/p: No such file or directory
97.74.228.115 - - [14/Mar/2019:15:46:07 +0800] "POST /xmlrpc.php HTTP/1.1" 200 414 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; fr; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8"
97.74.228.115 - - [14/Mar/2019:15:46:07 +0800] "" 400 0 "-" "-"
185.211.245.169 - - [14/Mar/2019:15:46:30 +0800] "POST /xmlrpc.php HTTP/1.1" 200 414 "http://deathearth.com/xmlrpc.php" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/535.24.77 (KHTML, like Gecko) Chrome/54.8.3571.8843 Safari/531.94"
103.130.201.5 - - [14/Mar/2019:15:47:39 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:41 +0800] "POST /wp-login.php HTTP/1.1" 200 1622 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:42 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:43 +0800] "POST /wp-login.php HTTP/1.1" 200 1601 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:47 +0800] "GET /wp-login.php HTTP/1.1" 200 1212 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:49 +0800] "POST /wp-login.php HTTP/1.1" 200 1636 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
103.130.201.5 - - [14/Mar/2019:15:47:51 +0800] "POST /wp-login.php HTTP/1.1" 200 1600 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
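
One way to get such an hour-long window is to match the day and hour as plain text. This is a sketch only, assuming the log keeps the timestamp layout seen above ([14/Mar/2019:15:46:07 +0800]). hour_of_log is a hypothetical helper name, and the day and hour are passed in the log's own notation.

```shell
# Hypothetical helper: hour_of_log DAY HOUR [FILE...]
# DAY is written as in the log (e.g. 14/Mar/2019), HOUR is 00-23.
# grep -F treats the pattern as a fixed string, so "[" and "/" need no escaping.
hour_of_log() {
  day=$1
  hour=$2
  shift 2
  grep -F "[$day:$hour:" "$@"
}

# Demo on four lines copied from the outputs above; two fall in the 16:00 hour.
in_hour=$(hour_of_log 14/Mar/2019 16 <<'EOF'
97.74.228.115 - - [14/Mar/2019:15:46:07 +0800] "" 400 0 "-" "-"
203.208.60.88 - - [14/Mar/2019:16:02:52 +0800] "GET /382.html HTTP/1.1" 200 9025 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
220.181.108.101 - - [14/Mar/2019:16:59:17 +0800] "GET /419.html HTTP/1.1" 200 9211 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
106.38.241.133 - - [14/Mar/2019:17:08:50 +0800] "GET /studyremark/page/6 HTTP/1.1" 200 7418 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
EOF
)
echo "$in_hour"
```

For today's date, the day argument can be generated with LC_ALL=C date '+%d/%b/%Y', for example: hour_of_log "$(LC_ALL=C date '+%d/%b/%Y')" 16 wordpress.log | head -n 10.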

 

This is probably crude; I will add more later.


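   Original article. When reposting, please credit it with a link to this post: Using the Linux command-line tools grep and awk to analyze the wordpress.log access log on a Mac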
