Log Analysis - tr, awk, uniq, sort

Log analysis is useful not only when investigating security incidents but also when troubleshooting service failures.

Most of the analysis work comes down to how well you strip out unnecessary data or gather the data you actually want.

A log holds every message the system produced, so there is a huge amount of data. Depending on how you approach the analysis, you can find a clue quickly, or you can get lost in a maze.

Therefore, rather than reading the full data set, start from the important log entries, look for clues there, and narrow the scope step by step.

Depending on the nature of the case, the most important piece of information is usually the time: it tells you when each event occurred and how the system reacted, and it forms the basis of the analysis.

Second, how effectively you can extract information from the massive log data matters, and the built-in Windows commands are limited for this kind of filtering.

Therefore, it pays to set up an environment in which delimited text logs can be filtered and extracted effectively.

 
 

Get ready for analysis

For effective log analysis, a text-based CLI (command prompt) environment is better suited than a GUI. However, the standard Windows CLI offers very few ways to filter text by condition.

So we will borrow the commands used on Linux, which provide the filtering we need for analysis. Let's look at the useful ways to make these Linux tools available on Windows.

 
 

NewLinux

http://blog.naver.com/allmnet/ is the blog of the creator, pbi12 (Park Byung-IK); his personal site is no longer operating, so the tool is now distributed through this blog. NewLinux lets you use Linux shell-style commands on Windows. Officially it supports up to Windows 2003, but it also works fine when installed on Vista.

 
 

coLinux

http://www.colinux.org provides coLinux, a Linux system that runs inside the Windows operating system much like a virtual machine, so Windows and Linux can be used at the same time. Installing it gives you the UNIX commands we need, and it is also handy if you frequently want Linux's gcc and g++ development environment alongside Windows. It can access the Windows file system directly, which makes it an attractive option.

 
 

Linux Subsystem

Added to Windows 10 as an optional feature, it can be turned on from Add/Remove Programs (Windows Features) and used as built-in functionality. Because it is provided by Microsoft itself, it is well integrated with Windows and convenient to use.

[Content] Environments that provide UNIX tools for log analysis
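If you choose the Linux Subsystem route, the exact steps vary by Windows 10 build, but as a rough sketch the feature can usually be enabled from an elevated command prompt and a distribution then installed from the Microsoft Store:

//Enable the Windows Subsystem for Linux feature (run as administrator, then reboot).
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart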

 
 

tr

tr reads standard input, translates the specified characters into the given replacement characters, and writes the result to standard output.

It is most often used for removing characters, but its conversion features are just as powerful.

It is used as follows.

 
 

tr [option] [string1] [string2]

[Content] tr command usage

 
 

Note that tr cannot change only a specific occurrence in a file: every character in the input that matches is translated.

Basic usage is as follows.

 
 

//Convert 1, 2, 3, 4 in the file to a, b, c, d.

tr '1234' 'abcd' < file

//A range of consecutive characters can be specified with a hyphen (-).

tr '1-4' 'abcd'

[Content] tr basic usage
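As a small illustrative example (the input string here is made up, not taken from any log above), the same range syntax is handy for normalizing case before comparing or counting entries:

//Convert uppercase letters to lowercase.
echo "GET /Index.HTML" | tr 'A-Z' 'a-z'
//Output: get /index.html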

 
 

There are also a few options that can be combined with tr to provide the following features.

-d

Deletes the specified characters from the input. Usage is as follows.

 
 

//Remove the character '[' from the input.

tr -d '['

[Content] tr -d example
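For example, a hypothetical log line wrapped in brackets can be cleaned up in one pass by listing both characters in the set (the sample input is invented for illustration):

//Remove both '[' and ']' in a single call.
echo "[2012-02-24 13:43:01] login ok" | tr -d '[]'
//Output: 2012-02-24 13:43:01 login ok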

-s

-s squeezes consecutive repetitions of the specified characters so that only a single occurrence remains.

Usage is as follows.

 
 

//This is the default output of dir.

C:\Users\juhan>dir

 Volume in drive C is WIN7

 Volume Serial Number is 0C91-5CC4

 
 

 Directory of C:\Users\juhan

 
 

2012-02-24 pm 01:43 <DIR> .

2012-02-24 pm 01:43 <DIR> ..

2012-02-16 pm 07:41 <DIR> .cache

2011-12-29 pm 05:57 218 .recently-used.xbel

2012-02-24 am 10:28 <DIR> .VirtualBox

2012-02-16 am 10:32 <DIR> .zenmap

2011-03-23 pm 05:11 <DIR> AcunetixScanner

2011-01-27 pm 06:55 <DIR> AppData

2012-02-17 am 09:16 <DIR> Contacts

2012-02-22 am 10:57 <DIR> Daum cloud

2012-02-24 pm 01:46 <DIR> Desktop

2012-02-17 am 11:04 110,246 dlllist.txt

2012-02-24 pm 01:43 <DIR> Documents

2012-02-24 pm 01:43 <DIR> Downloads

2012-02-24 pm 01:43 <DIR> Favorites

2012-02-14 pm 05:11 23,470 fciv.err

2012-02-16 am 10:01 15 get.txt

2011-09-23 am 11:37 <DIR> keel

2012-02-17 am 09:16 <DIR> Links

2012-02-24 pm 03:33 85 mm.cfg

2012-02-17 am 09:16 <DIR> Music

2011-12-17 pm 05:49 <DIR> oni

2012-02-24 pm 01:43 <DIR> Pictures

2012-02-17 am 09:16 <DIR> Saved Games

2012-02-17 am 09:16 <DIR> Searches

2012-01-05 am 11:44 16 test.txt

2011-11-10 pm 04:21 <DIR> Tracing

2012-02-17 am 09:16 <DIR> Videos

2012-01-17 pm 04:49 8,521 _viminfo

2012-02-07 am 09:00 <DIR> Naver share

7 File(s) 142,571 bytes

23 Dir(s) 15,368,720,384 bytes free

//Convert every character that is not a lowercase letter (a-z) to '\012' (newline) and squeeze repeats, so only the lowercase fragments remain, as shown below.

C:\Users\juhan>dir | tr -cs a-z '\012'

‘sers’juhan’cache’recently’used’xbel’irtual’ox’zenmap’cunetix’canner’pp’ata’onta

cts’aum’esktop’dlllist’txt’ocuments’ownloads’avorites’fciv’err’get’txt’keel’inks

‘mm’cfg’usic’oni’ictures’aved’ames’earches’test’txt’racing’ideos’viminfo’

[Content] tr -cs conversion example
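To see what -s does on its own, a simpler sketch is squeezing repeated spaces, which helps when log columns are padded with a variable amount of whitespace (the sample line is made up):

//Squeeze runs of spaces down to a single space.
echo "127.0.0.1     GET     /index.asp" | tr -s ' '
//Output: 127.0.0.1 GET /index.asp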

-c

-c complements [string1]: every character in the input that is not in [string1] is converted to [string2]. Unlike the -cs combination used above, nothing is squeezed out, so every non-matching character is converted, as the example below shows.

 
 

//The difference from the previous example: repeated conversion characters are not squeezed, so each converted character remains in place.

C:\Users\juhan>dir | tr -c a-z _

_____________________________________________________________sers_juhan________________________________________________________________________________________________________________________________________________cache_____ _______________________________________recently_used_xbel_____________________________________________irtual_ox____________________________________________zenmap____________________________________________cunetix_canner______ ______________________________________pp_ata____________________________________________ontacts____________________________________________aum_____________________________________________________esktop________________________ ___________________dlllist_txt____________________________________________ocuments____________________________________________ownloads____________________________________________avorites_______________________________________ ____fciv_err___________________________________________get_txt___________________________________________keel____________________________________________inks___________________________________________mm_cfg___________________ _________________________usic___________________________________________oni____________________________________________ictures____________________________________________aved__ames____________________________________________ earches___________________________________________test_txt____________________________________________racing____________________________________________ideos____________________________________________viminfo_________________ ___________________________________________________________________________________________________________________________________________________

[Content] tr -c conversion example

 
 

Instead of listing individual characters, a character class can be specified with the syntax below (see the short example after the list).

[:alnum:] all letters and digits

[:alpha:] all letters

[:blank:] all horizontal whitespace

[:cntrl:] all control characters

[:digit:] all digits

[:graph:] all printable characters, not including space

[:lower:] all lowercase letters

[:print:] all printable characters, including space

[:punct:] all punctuation characters

[:space:] all horizontal and vertical whitespace

[:upper:] all uppercase letters

[:xdigit:] all hexadecimal digits

[=CHAR=] all characters equivalent to CHAR
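As a quick illustration of the class syntax (the input is again an invented example), everything that is not a digit can be deleted in one step:

//Keep only the digits by deleting the complement of [:digit:] (the newline is removed too, since it is not a digit).
echo "error code = 404" | tr -cd '[:digit:]'
//Output: 404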

 
 

awk

awk is generally used to process record-oriented data in files or streams. It is a full text-processing language with a C-like syntax and a wide range of operators, and its performance is excellent. Because it offers too much to cover in full, we will only look at the features that are useful here.

(NewLinux does not provide awk, so we will use coLinux for these examples.)

The feature most useful for log analysis is the ability to pull out individual fields or columns, and awk makes this easy. Let's look at the following example.

 
 

# Print the fifth field to standard output (the default field separator is whitespace).

awk '{print $5}'



# Print the first, third, and fifth fields.

awk '{print $1, $3, $5}'

[Content] awk usage examples
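The default separator is whitespace, but many logs use other delimiters; the -F option changes the field separator. The CSV-style line below is only an assumed example:

# Use a comma as the field separator and print the second field.
echo "2012-02-24,127.0.0.1,/login.asp" | awk -F',' '{print $2}'
# Output: 127.0.0.1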

 
 

awk also provides built-in functions that let you extract data more precisely and reuse it, instead of simply printing fields. Let's look at a few of these functions.

 
 

split

The split function breaks a value into pieces and stores them in an array variable. It is used as split(string, array, delimiter): the string is divided at each delimiter and the pieces are stored in the array.

Let's look at an example.

 
 

cat b.txt | awk '{ split($2, arr, ":"); if (arr[2] == "fruit") print $0 }'

[Content] split
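To make the behaviour concrete, here is a self-contained sketch with two invented input lines whose second field is a category:value pair; only the line whose value is "fruit" survives the filter:

# Splitting $2 on ":" gives arr[1]="kind" and arr[2] as the value after the colon.
printf 'apple kind:fruit\ncarrot kind:vegetable\n' | awk '{ split($2, arr, ":"); if (arr[2] == "fruit") print $0 }'
# Output: apple kind:fruit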

 
 

match

The match function checks whether a string satisfies a condition. It is used as match(string, regular expression) and returns the position of the match (0 when there is none), so it works well inside an if condition for printing matching lines.

 
 

cat a.txt | awk '{ if (match($2, "NN")) print $0 }'

[Content] match

 
 

substr

The substr function extracts part of a string and can likewise be used in a condition. It is used as substr(string, starting position, [length]) and returns the portion of string that begins at the starting position and runs for length characters (positions start at 1).

cat a.txt | awk '{ if (substr($2, 1, 2) == "NN") print $0 }'

[Content] substr
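As with split, an invented input makes the behaviour clearer; both the match and substr filters above keep a line whose second field starts with "NN":

# Only the line whose second field begins with "NN" is printed.
printf '1 NN100 ok\n2 XX200 fail\n' | awk '{ if (substr($2, 1, 2) == "NN") print $0 }'
# Output: 1 NN100 ok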

 
 

uniq

uniq collapses consecutive duplicate lines into a single line. Because it only compares adjacent lines, it is usually combined with sort. The -c option shows how many times each line occurred.

C:\>cat juhan

HAN

HAN

JU

JU

SEONG

seong

C:\>cat juhan | uniq

HAN

JU

SEONG

seong

 
 

C:\>cat juhan | uniq -c

2 HAN

2 JU

1 SEONG

1 seong

[Content] uniq example usage
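Because uniq only compares adjacent lines, sort the input first when duplicates are scattered. If the GNU version of uniq is available in the environment, the -i option also ignores case, so SEONG and seong above would be counted together; a small sketch:

//Ignore case when counting, so SEONG and seong collapse into one entry.
cat juhan | uniq -ci
//Expected output: 2 HAN, 2 JU, 2 SEONG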

 
 

sort

sort orders its input in ascending or descending order, and together with uniq -c it can count identical values. This is useful when you want to work out statistics such as how often particular requests appear.

 
 

//The original file contents are as follows.

C:\>cat abcd.txt

f

F

C

E

A

D

E

C

Z

B

A

//Sorted in ascending order

C:\>cat abcd.txt | sort

A

A

B

C

C

D

E

E

F

f

Z

//Sorted in reverse (descending) order

C:\>cat abcd.txt | sort /R

Z

F

f

E

E

D

C

C

B

A

A

//Count how many times each item appears

C:\>cat abcd.txt | sort | uniq -c

2 A

1 B

2 C

1 D

2 E

1 F

1 f

1 Z

[Content] sort example usage
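The combination used most often in log analysis chains these commands twice: sort first so that uniq -c can count, then sort the counts numerically in reverse to rank by frequency (the -n and -r options of the Linux sort are assumed here, not the Windows sort):

//Rank items by how often they appear, most frequent first.
cat abcd.txt | sort | uniq -c | sort -rn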

 
 

Now let's use the commands covered so far to analyze a web log.

 
 

//When you want to count the number of web accesses per IP address

juhan@andLinux:~/windows$ cat ex111213.log | tr -d '[' | tr -d ']' | awk 'split($4, dt, " ") {print dt[1]}' | uniq -c | sort -rn

43 127.0.0.1

37 127.0.0.1

3 127.0.0.1

3 127.0.0.1

1 s-sitename

1 s-sitename

1 s-sitename

1 s-sitename

1 Information

1 Information

1 Information

1 Information

//When you want to see only the log lines whose fourth field contains the IP address 127.0.0.1

juhan@andLinux:~/windows$ cat ex111213.log | awk '{ if (match($4, "127.0.0.1")) print $0 }'

2011-12-13 04:26:15 W3SVC1 127.0.0.1 POST /loginCheck.asp - 80 - 127.0.0.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+.NET4.0C;+.NET4.0E) 200 00

2011-12-13 04:26:19 W3SVC1 127.0.0.1 POST /loginCheck.asp - 80 - 127.0.0.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+.NET4.0C;+.NET4.0E) 200 00

2011-12-13 07:03:14 W3SVC1 127.0.0.1 POST /loginCheck.asp - 80 - 127.0.0.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+.NET4.0C;+.NET4.0E) 200 00

2011-12-13 07:03:18 W3SVC1 127.0.0.1 POST /loginCheck.asp - 80 - 127.0.0.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+.NET4.0C;+.NET4.0E) 200 00

2011-12-13 07:03:26 W3SVC1 127.0.0.1 POST /loginCheck.asp - 80 - 127.0.0.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+.NET4.0C;+.NET4.0E) 200 00

[Content] Web log filtering example

 
 

Using the commands above, you can extract only the access log entries related to 127.0.0.1 and work with results like the ones shown.
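The same pattern can be pointed at other columns. Assuming the W3C fields are laid out as in the log above, with the requested URI in the sixth field, a rough sketch for ranking the most requested pages would be:

//Count requests per URI (field positions depend on the configured IIS log fields).
cat ex111213.log | awk '{print $6}' | sort | uniq -c | sort -rn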

 


[Picture] Log analysis in coLinux makes this kind of work easy to carry out.

 
 

Find content from multiple files

When logs are collected in bulk from multiple servers over various paths, opening each directory and scanning it by hand is inefficient.

Instead, use the find command: it searches subdirectories as well, so you can easily locate the access log files you want.

 
 

find /weblog/ -name '*' -exec grep 61.33.3.156 {} \;

This prints every line under /weblog/ that contains 61.33.3.156.
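If the grep being used supports them, the -H and -n options make matches easier to trace back to their source, and limiting find to *.log files avoids scanning unrelated files (a sketch assuming GNU find and grep):

//Search only .log files and print the file name and line number of each match.
find /weblog/ -name '*.log' -exec grep -Hn 61.33.3.156 {} \;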


 
