Mapping your web server visitors in Google Earth, Part 2

Part 1 of this series talked at a high level about the goal: viewing your web server visitors geographically on a map. It has a cool factor, but it is also genuinely useful for some folks.

This part of the series shows how to take a log file in the Apache web server's combined log format and pull out the IP addresses that have visited your site. The code simply reads the file line by line, takes the first token (the text before the first space), and writes it to an output file, one address per line. Instead of writing each address as you go, you could append them to a StringBuilder and write everything out at the end, though the BufferedWriter already buffers the output, so the gain is likely to be small. A more useful variation would be to add them to a concrete implementation of java.util.Set (such as HashSet) so you only see unique IP addresses. Anyhow, here's the starter code.


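import java.io.*;

public class ParseLogFile {
    public static void main(String[] args) {
        // Expect the input log and the output file as arguments
        if (args.length < 2) {
            System.err.println("Usage: java ParseLogFile <access_log> <output_file>");
            return;
        }
        // try-with-resources closes both files even if an exception is thrown
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]));
             BufferedWriter out = new BufferedWriter(new FileWriter(args[1]))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // In the combined log format the client address is the first field
                String[] fields = line.split(" ");
                out.write(fields[0]);
                out.newLine();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}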

This reads in the Apache log file and writes out a file containing a long list of IP addresses (or host names, if Apache is configured to log them). The code is very simple and, strictly speaking, unnecessary: you could easily use OS utilities to accomplish the same thing, and the same approach works for any log format where the address is the first field on each line. This was just an example.
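
The unique-address idea mentioned earlier could be sketched roughly as follows. This is only an illustration, not part of the starter code: the class name UniqueAddresses and the helper uniqueFirstTokens are made up for this example, and it assumes the log is small enough to read into memory and is UTF-8 text. A LinkedHashSet is used so duplicates are dropped while the first-seen order is kept.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class for illustration; not part of the original post.
public class UniqueAddresses {

    // Returns the first space-delimited token of each non-blank line,
    // without duplicates, in the order the tokens were first seen.
    public static Set<String> uniqueFirstTokens(Iterable<String> lines) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            int space = trimmed.indexOf(' ');
            // If there is no space, the whole line is the token
            seen.add(space < 0 ? trimmed : trimmed.substring(0, space));
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Usage: java UniqueAddresses <access_log>");
            return;
        }
        // Assumption: the whole log fits in memory and decodes as UTF-8
        Set<String> addresses = uniqueFirstTokens(Files.readAllLines(Paths.get(args[0])));
        for (String address : addresses) {
            System.out.println(address);
        }
    }
}
```

Run against a log, this prints each visitor address once, which should cut down the number of lookups needed when it comes time to geolocate them.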

The last step is to take the output file, geolocate each address, and generate some KML from the results. That's being saved for Part 3. See you soon.
