
Apache - Logs - Extract all user agents from Apache logs

awk -F\" '{print $6}' test.log | sort | uniq -c | sort -n

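(where “test.log” is the access log file you want to analyse; this assumes the Apache combined log format, in which the user agent is the sixth field when each line is split on double quotes).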
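
The pipeline can be read one stage at a time. The following is a minimal sketch, assuming the log uses the Apache combined log format; the function name `count_user_agents` is hypothetical and only used for illustration.

```shell
# Hypothetical helper: count requests per user agent.
# Assumes the combined log format, where the User-Agent is the
# last quoted string on each line.
count_user_agents() {
    # -F'"' splits each line on double quotes, so field 6 is the
    # User-Agent (field 2 is the request, field 4 the referer).
    awk -F'"' '{print $6}' "$1" |
        sort |      # group identical user agents together
        uniq -c |   # collapse each group, prefixing it with its count
        sort -n     # order by count, busiest agents last
}
```

It would be called as `count_user_agents test.log`.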

Returns

51916 MetaURI API/2.0 +metauri.com
59899 Twitterbot/1.0
87819 Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
111261 Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
187812 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)
189834 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
390477 facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

The first number on each line is the number of times that user agent has accessed your site. Beware: not all of these are crawlers, because the data is intermixed with actual human traffic and other useful traffic.
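
Because the list mixes crawlers with ordinary browsers, a rough first pass is to keep only the lines whose user agent contains a typical crawler keyword. This is a heuristic sketch, not a reliable classifier: the function name `likely_crawlers` and the keyword list are assumptions chosen for illustration, and crawlers that do not identify themselves this way will be missed.

```shell
# Hypothetical filter: reads "count user-agent" lines on stdin and keeps
# those whose user agent contains a common crawler keyword.
# The keyword list is an assumption; extend it to suit your traffic.
likely_crawlers() {
    grep -Ei 'bot|crawl|spider|externalhit'
}
```

It can be appended to the end of the pipeline, for example `... | sort -n | likely_crawlers`.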

NOTE: In the example above, the “facebookexternalhit” user agent accessed the site 390,477 times in one month.

  • That is roughly 540 times per hour. Excessive!
  • On the kill list, you go!
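
The hourly rate can be checked with shell arithmetic. This is a small sketch, assuming the log covers a 30-day month (the page does not state the exact period); the function name `hits_per_hour` is hypothetical. With these assumptions the result is 542, in line with the rough figure above.

```shell
# Hypothetical helper: average hits per hour for a count collected
# over a given number of days (integer division, so the result is
# rounded down).
hits_per_hour() {
    count=$1
    days=$2
    echo $(( count / (days * 24) ))
}

hits_per_hour 390477 30   # assumes the log spans a 30-day month
```
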
apache/logs/extract_all_user_agents_from_apache_logs.1689595890.txt.gz · Last modified: 2023/07/17 12:11 by peter
