IIS logs are basically CSV-style text files (the fields are separated by spaces), except that a single file may contain several header blocks. R is a powerful open-source analytics language, and in this quick demo I will show how to use R for ad hoc analysis of IIS logs.
In an IIS log file, the comment lines begin with #, and so do the header lines. R has a built-in data importer for CSV-style files, and it takes a lot of options, including comment.char.
For an IIS log file, we only need to uncomment one header line. That line tells R the column names so it can parse the log correctly, and setting comment.char to # makes R skip the remaining comment lines. So you can just uncomment the 4th line of the log file by removing the #Fields: prefix at the beginning of the line.
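If you would rather not edit the file by hand, the same change can be scripted. Here is a minimal sketch; the log lines are made-up sample data with a reduced field set, not a real log, and the variable names are my own.

```r
# Made-up sample log with a reduced field set (assumption, not a real log).
log_lines <- c(
  "#Software: Microsoft Internet Information Services 7.5",
  "#Version: 1.0",
  "#Date: 2013-01-01 00:00:00",
  "#Fields: date time cs-method cs-uri-stem c-ip sc-status",
  "2013-01-01 00:00:01 GET /index.html 10.0.0.1 200"
)

# Strip the "#Fields: " prefix from the first header line only, so any
# repeated headers further down stay commented out.
first_fields <- grep("^#Fields: ", log_lines)[1]
log_lines[first_fields] <- sub("^#Fields: ", "", log_lines[first_fields])

print(log_lines[first_fields])
```

For a real file you would get log_lines from readLines() and write the result back with writeLines().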
Then open your favorite R IDE; I will use RStudio.
Load the raw file into a list named iis, then run names to get the column names. This confirms that the file was parsed correctly.
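A minimal sketch of this step, assuming read.table is the importer and using a made-up log written to a temporary file (the field set is reduced for brevity). Note that R turns the hyphens in the IIS field names into dots, so cs-uri-stem becomes cs.uri.stem.

```r
# Made-up sample log: the first "#Fields:" line is already uncommented,
# a second header block further down is left as a comment.
log_path <- tempfile(fileext = ".log")
writeLines(c(
  "#Software: Microsoft Internet Information Services 7.5",
  "#Version: 1.0",
  "#Date: 2013-01-01 00:00:00",
  "date time cs-method cs-uri-stem c-ip sc-status",
  "2013-01-01 00:00:01 GET /index.html 10.0.0.1 200",
  "2013-01-01 00:00:02 GET /site.css 10.0.0.2 200",
  "#Fields: date time cs-method cs-uri-stem c-ip sc-status",
  "2013-01-01 00:00:03 POST /login 10.0.0.1 302"
), log_path)

# header = TRUE takes the first non-comment line as the column names;
# comment.char = "#" skips every line that starts with #.
iis <- read.table(log_path, header = TRUE, sep = " ",
                  comment.char = "#", stringsAsFactors = FALSE)

print(names(iis))
```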
Then you can run typeof(iis) to see that it is a list object, and nrow and ncol to get the record count and the column count.
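A quick sketch of those checks on a small hand-made data frame that stands in for the parsed log (the values are invented for illustration):

```r
# Stand-in for the parsed log; the rows are made up.
iis <- data.frame(
  cs.uri.stem = c("/index.html", "/site.css", "/login"),
  c.ip        = c("10.0.0.1", "10.0.0.2", "10.0.0.1"),
  sc.status   = c(200L, 200L, 302L),
  stringsAsFactors = FALSE
)

print(typeof(iis))  # a data frame is stored as a list of columns
print(class(iis))   # its class is "data.frame"
print(nrow(iis))    # record count
print(ncol(iis))    # column count
```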
Now, let's do some basic analysis.
Q1: Group the results by response code and plot them.
Or group by request:
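One way to sketch both groupings is with table(). The data frame below is made up, and I am assuming "group by request" means grouping by the HTTP method column (cs.method); swap in another column if you meant something else.

```r
# Stand-in for the parsed log; the rows are made up.
iis <- data.frame(
  cs.method = c("GET", "GET", "POST", "GET", "GET"),
  sc.status = c(200L, 200L, 302L, 404L, 200L),
  stringsAsFactors = FALSE
)

# Count requests per response code and plot the counts.
by_status <- table(iis$sc.status)
print(by_status)
barplot(by_status, xlab = "status code", ylab = "request count")

# Same idea, grouped by request method (assumed meaning of "request").
by_method <- table(iis$cs.method)
print(by_method)
```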
Q2: Get the top 5 URLs by request count.
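A minimal sketch, assuming the URL lives in the cs.uri.stem column and using invented rows:

```r
# Stand-in for the parsed log; the URLs are made up.
iis <- data.frame(
  cs.uri.stem = c("/", "/a.css", "/", "/b.js", "/", "/a.css",
                  "/c.png", "/d.html", "/e.html"),
  stringsAsFactors = FALSE
)

# Count requests per URL, sort descending, keep the first five.
top5 <- head(sort(table(iis$cs.uri.stem), decreasing = TRUE), 5)
print(top5)
```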
Q3: Count all the .css requests and get the top 10.
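Something along these lines should work, assuming again that the URL is in cs.uri.stem. The rows are made up, and the regular expression is anchored at the end so that a page like /css-guide.html is not counted.

```r
# Stand-in for the parsed log; the URLs are made up.
iis <- data.frame(
  cs.uri.stem = c("/css/site.css", "/index.html", "/css/site.css",
                  "/css/print.css", "/css-guide.html", "/css/site.css",
                  "/index.html"),
  stringsAsFactors = FALSE
)

# Keep only URLs that end in .css (case-insensitive).
css_urls <- iis$cs.uri.stem[grepl("\\.css$", iis$cs.uri.stem, ignore.case = TRUE)]

# Count per stylesheet and keep at most the ten most requested.
top10_css <- head(sort(table(css_urls), decreasing = TRUE), 10)
print(top10_css)
```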
Q4: Combine all the logs in one folder and put all the data together.
basedir = "F:/inetpub/logs/LogFiles/W3SVC1/"
data = data.frame()
# pattern is a regular expression, not a glob, so match the .log suffix explicitly
files = list.files(basedir, pattern = "\\.log$")
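Here is one possible way to do the combining itself, as a sketch rather than the exact code from this post. It writes two made-up log files into a temporary folder (so it does not touch the real log folder), assumes the header line of each file is already uncommented, and uses a helper I named read_one.

```r
# Two made-up log files in a temporary folder (assumption: each file's
# header line has already been uncommented).
log_dir <- file.path(tempdir(), "iis_logs_demo")
dir.create(log_dir, showWarnings = FALSE)
header <- "date time cs-uri-stem c-ip sc-status"
writeLines(c("#Version: 1.0", header,
             "2013-01-01 00:00:01 / 10.0.0.1 200"),
           file.path(log_dir, "u_ex130101.log"))
writeLines(c("#Version: 1.0", header,
             "2013-01-02 00:00:01 /a.css 10.0.0.2 200",
             "2013-01-02 00:00:05 /missing 10.0.0.2 404"),
           file.path(log_dir, "u_ex130102.log"))

# full.names = TRUE returns paths that can be passed straight to read.table.
files <- list.files(log_dir, pattern = "\\.log$", full.names = TRUE)

# Hypothetical helper: parse one log file into a data frame.
read_one <- function(path) {
  read.table(path, header = TRUE, sep = " ",
             comment.char = "#", stringsAsFactors = FALSE)
}

# Read every file and stack the rows into one data frame.
data <- do.call(rbind, lapply(files, read_one))
print(nrow(data))
```

Building the list first and calling rbind once avoids growing a data frame inside a loop, which gets slow with many files.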
Q5: Get the distribution of request counts by client (to identify who sends the most requests).
reqbyiptop10 = head(sort(table(iis$c.ip), decreasing = TRUE), 10)
ba = barplot(reqbyiptop10, col = rainbow(length(reqbyiptop10)), ylim = c(0, max(reqbyiptop10) * 1.2), ylab = "req Count")
This gives a bar chart of the ten clients with the most requests.
# Here's how to read the log file and assign the column names.
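logfile = "W3C_log_filename"
# Read the 4th line (the #Fields: header) as data; comment.char = "" keeps R from skipping it.
logcols = read.table(logfile, header = FALSE, sep = " ", skip = 3, nrows = 1, comment.char = "")
# Read the records, skipping every line that starts with #.
iislog = read.table(logfile, header = FALSE, sep = " ", comment.char = "#")
# Drop the leading "#Fields:" entry, then use the remaining values as column names.
logcols[,1] <- NULL
names(iislog) <- unlist(logcols[1,])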