Tommi's Scribbles

Using Ruby For FreeBSD Log Analysis And Penetration Detection

  • Published on 2023-08-14

If you're running any kind of connected server, someone is sure to be poking at the door. While a good cybersecurity posture provides a certain level of protection, having visibility into what is going on is vital. In an enterprise environment you likely operate a commercial SIEM solution, but creating your own system using Ruby is not difficult.

Aggregating logs on FreeBSD 13

Security Information and Event Management (SIEM) systems provide visibility into what takes place on your systems. In essence, they work by aggregating log files from your machines, normalizing the logs, and then letting you wade through them for interesting events.

So for our little custom SIEM, let's start by setting up log aggregation. I recently decided to migrate all my servers from Linux to FreeBSD (apart from this web server, as a zero-downtime migration is a bit more effort). One of the nice things about the migration was that log aggregation became super simple.

For a server you want to send logs from, you just add a file to the /etc/syslog.d/ directory. To do this you obviously need root or root-like access. You can name the file whatever works for you; I called mine client.conf. This is what goes in it:

*.*	@mylogserver

The above means: send all logs to mylogserver. You should obviously replace that with your log server's IP address or, if you have a nameserver or hosts entry for it, with the server name. You should also configure things so that only your servers can communicate with each other on the syslog port (UDP 514 by default). That is out of the scope of this write-up, though.

For the log aggregator, you add a file in the exact same directory. The contents are slightly different:

+mylogclient
*.* /var/log/mylogclient.log

The above means: from mylogclient, save all logs to /var/log/mylogclient.log. If you have multiple servers, you just repeat the pattern. And again, replace mylogclient with the IP address or hostname of the client that is sending the logs.
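For example, with two clients the aggregator file could look like this (the hostnames here are invented for illustration):

```
+mylogclient
*.* /var/log/mylogclient.log

+myotherclient
*.* /var/log/myotherclient.log
```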

In addition, the syslog daemon must be enabled on the server. It usually is by default after install. To check, your /etc/rc.conf should have the line:

syslogd_enable="YES"

And that's it. You can obviously go more granular with your logging by using syslog selectors instead of the *.* used above. The concept remains the same.
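As an illustration, a more selective client configuration might forward only authentication-related messages (the selector uses the standard facility.level syntax; adjust the facilities to your needs):

```
auth.*	@mylogserver
```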

Analyzing Aggregated Logs with Ruby

Now that we have a server that aggregates all our log files, we get to the fun stuff. That is, normalizing and analyzing the log files.

For this, we'll create a simple Ruby application to serve as the base for a future super-SIEM. The fun part with syslog is that the format of the logs is pretty normalized already. As Ruby is very powerful with files and strings, we have an easy time with this.

Log format

As mentioned, syslog on FreeBSD has a pretty standard format. On my systems, I have seen two variants: one where each line starts with the timestamp, source host, and process, and one that additionally carries a <facility.level> channel field between the timestamp and the host.

So let's start by creating a simple class for our normalized log structure, and a simple class for holding the aggregate.

class Aggregate
  attr_accessor :logs
end

class Log
  attr_accessor :channel
  attr_accessor :message
  attr_accessor :meta
  attr_accessor :process
  attr_accessor :source
  attr_accessor :time
end

Simple and crude, but functional. Now let's create an app class with a simple file processor.

class App
  def process(f, e = nil)
    return nil unless File.file?(f)
    log = Aggregate.new
    log.logs = Hash.new
    mode = e ? "r:#{e}" : "r"
    File.open(f, mode) { |content| parse content, log }
    log
  end

  def parse(c, t)
    return unless c && t
    c.each_line do |line|
      log = parse_line line
      # Your aggregation or filtering step goes here
    end
  end

  def parse_line(l)
    parts = l.chomp.split
    return nil if parts.length < 5
    entry = Log.new
    if parts[3].include? "<"
      # Variant with a <facility.level> channel field
      entry.meta = parts[0..5]
      entry.message = parts[6..].to_a.join " "
      entry.channel = entry.meta[3]
      entry.source = entry.meta[4]
      entry.process = entry.meta[5]
    else
      entry.meta = parts[0..4]
      entry.message = parts[5..].to_a.join " "
      entry.source = entry.meta[3]
      entry.process = entry.meta[4]
    end
    entry.time = entry.meta[0..2].join " "
    entry
  end
end

The parse_line method takes advantage of the fact that, on my servers, the fourth field of a log line never contains a < unless it denotes a channel. That lets me tell the two log entry styles apart: one with a channel and one without.
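To see the distinction in action, here is a standalone condensation of the same parsing logic. The sample log lines and the Struct-based Entry class are invented for demonstration; the real app uses the Log class above.

```ruby
# Condensed sketch of the two-format parser; sample lines are made up.
Entry = Struct.new(:channel, :message, :process, :source, :time)

def parse_line(l)
  parts = l.chomp.split
  entry = Entry.new
  if parts[3].include?("<")
    # Variant with a <facility.level> channel field after the timestamp
    entry.channel, entry.source, entry.process = parts[3], parts[4], parts[5]
    entry.message = parts[6..].to_a.join(" ")
  else
    entry.source, entry.process = parts[3], parts[4]
    entry.message = parts[5..].to_a.join(" ")
  end
  entry.time = parts[0..2].join(" ")
  entry
end

plain  = parse_line("Aug 14 03:12:45 myhost sshd[1234]: Failed password for root from 203.0.113.7")
tagged = parse_line("Aug 14 03:12:45 <auth.info> myhost sshd[1234]: Failed password for root")
```

Both lines yield the same source and process; only the tagged one carries a channel.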

The algorithm is simple enough to tweak should improvements be needed. However, you can probably tell the app doesn't currently aggregate anything. This is on purpose. How you want to store, group, index, or whatnot is your choice. But when you know what you want to do, add your final step after log = parse_line line.

Filtering and analyzing log files

To keep things simple, I will show how to add some filtering and analysis in that same spot, right after log = parse_line line. The server whose logs I am analyzing runs no open services apart from SSH, so I am interested in seeing where any breach attempts on the SSH service are coming from.

if log && log.process.include?("ssh")
  ip = log.message[/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/, 1]
  if ip
    t.logs[ip] = t.logs.fetch(ip, 0) + 1
  end
end

The process is simple. Since the normalized log entries all have a field for the process, we look for any process related to SSH. Next, we use a naive IP address regex to find SSH log items whose message contains an IP address. We then map these into the Aggregate's logs hash, using the IP address as the key and the running count as the value.
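In isolation, the extract-and-count step works like this (the sample messages are invented, using documentation IP ranges):

```ruby
# Invented sample SSH log messages
messages = [
  "Failed password for root from 203.0.113.7 port 61000 ssh2",
  "Invalid user admin from 203.0.113.7 port 61002",
  "Failed password for invalid user admin from 198.51.100.9 port 2200 ssh2"
]

counts = Hash.new
messages.each do |m|
  # String#[] with a regex and capture index extracts the first IP-like token
  ip = m[/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/, 1]
  counts[ip] = counts.fetch(ip, 0) + 1 if ip
end
```

The regex is naive on purpose: it will happily match malformed addresses like 999.999.999.999, which is fine for a first pass over trusted log data.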

For a final step, let's sort and print the final hash. For this, I'll add a quick summarize function.

def summarize(l)
  puts l.logs.length
  sorted = l.logs.sort_by { |_ip, count| count }.reverse
  sorted.each { |ip, count| puts "#{ip}: #{count}" }
end

And boom. We have a simple way to see which IP addresses our server sees the most SSH-related events from.
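The sort-and-reverse trick orders the hash entries by count, descending (the data here is invented):

```ruby
# Hypothetical counts keyed by IP address
logs = { "203.0.113.7" => 5, "198.51.100.9" => 2, "192.0.2.44" => 9 }

# sort_by yields [key, value] pairs; sorting by value ascending and
# reversing puts the noisiest IP first
sorted = logs.sort_by { |_ip, count| count }.reverse
```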

Closing Thoughts

The purpose of this article was to show how easy it is to start building a nice custom SIEM solution using Ruby. With less than 100 lines of code, we have the ability to aggregate, normalize, and analyze system log files.

While far from complete, this post hopefully serves as a nice foundational example of how you can bring visibility to events taking place in your systems.