Tommi's Scribbles
Using Ruby For FreeBSD Log Analysis And Penetration Detection
- Published on 2023-08-14
If you're running any kind of connected server, someone is sure to be poking at the door. While a good cybersecurity posture keeps most intruders out, visibility into what is going on is vital. In an enterprise environment you likely operate a commercial SIEM solution, but creating your own system using Ruby is not difficult.
Aggregating logs on FreeBSD 13
Security Information and Event Management (SIEM) systems provide visibility into what takes place on your systems. In essence, they work by aggregating log files from your systems, normalizing the logs, and then letting you wade through them for interesting things.
So for our little custom SIEM, let's start by setting up log aggregation. I recently decided to migrate all my servers from Linux to FreeBSD (apart from this web server, as a zero-downtime migration is a bit more effort). One of the nice things about this migration was that log aggregation is super simple.
For a server you want to send logs from, you just add a file to /etc/syslog.d/ directory. To do this you obviously need to be root or have root-like access. You can name the file whatever works for you. I called mine client.conf. This is what goes in it:
*.* @mylogserver
The above means: send all logs to mylogserver. You should replace that with either your log server's IP address or, if you have a nameserver or hosts entry for it, the server name. You should also configure things so that only your servers can communicate with each other on the syslog port (UDP 514 by default). That is out of the scope of this write-up, though.
For the log aggregator, you add a file in the exact same directory. The contents are slightly different:
+mylogclient
*.*	/var/log/mylogclient.log
The above means: from mylogclient, save all logs to /var/log/mylogclient.log. If you have multiple servers, you just repeat the pattern. And again, you replace mylogclient with either the IP address or the hostname of the client that is sending the logs.
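For example, with two hypothetical clients named web1 and db1, the aggregator's file might look like this (syslog.conf(5) expects each +hostname specification on its own line, applying to the selector lines that follow):

```
+web1
*.*	/var/log/web1.log
+db1
*.*	/var/log/db1.log
```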
In addition, you should enable the syslog daemon on the server. This is often enabled by default during install. To check, your rc.conf should have the line:
syslogd_enable="YES"
And that's it. You can go more granular with your logging using syslog selectors instead of the *.* used above. The concept remains the same.
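For instance, to forward only authentication and kernel messages of notice level or higher, rather than everything, a client file could read (a hypothetical narrowing of the catch-all above, using the same mylogserver name):

```
auth.notice;kern.notice	@mylogserver
```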
Analyzing Aggregated Logs with Ruby
Now that we have a server that aggregates all our log files, we get to the fun stuff. That is, normalizing and analyzing the log files.
For this, we'll create a simple Ruby application to serve as the base for a future super-SIEM. The fun part with syslog is that the format of the logs is pretty normalized already. As Ruby is very powerful with files and strings, we have an easy time with this.
Log format
As mentioned, syslog on FreeBSD has a pretty standard format. On my systems, I have seen two different versions:
- timestamp source process message
- timestamp channel source process message
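For illustration, here are two made-up entries matching those shapes (the hostname, process, and message are invented; your output will differ):

```
Aug 14 03:22:41 myserver sshd[1234]: Failed password for root from 203.0.113.5 port 49812 ssh2
Aug 14 03:22:41 <auth.info> myserver sshd[1234]: Failed password for root from 203.0.113.5 port 49812 ssh2
```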
Let's model these with two small classes: one for a single log entry, and one to hold the aggregate results.

class Aggregate
  attr_accessor :logs
end

class Log
  attr_accessor :channel
  attr_accessor :message
  attr_accessor :meta
  attr_accessor :process
  attr_accessor :source
  attr_accessor :time
end
Simple and crude, but functional. Now let's create an app class with a simple file processor.
class App
  def process(f, e = nil)
    return nil unless File.file?(f)

    log = Aggregate.new
    log.logs = Hash.new
    # Open with an explicit encoding if one was given
    mode = e ? "r:#{e}" : "r"
    File.open(f, mode) { |content| parse content, log }
    log
  end

  def parse(c, t)
    return unless c && t

    c.each_line do |line|
      log = parse_line line
    end
  end

  def parse_line(l)
    line = l.chomp
    parts = line.split
    entry = Log.new
    if parts[3].include? "<"
      # timestamp channel source process message
      entry.meta = parts[0..5]
      entry.message = parts[6..].join " "
      entry.time = entry.meta[0..2].join " "
      entry.channel = entry.meta[3]
      entry.source = entry.meta[4]
      entry.process = entry.meta[5]
    else
      # timestamp source process message
      entry.meta = parts[0..4]
      entry.message = parts[5..].join " "
      entry.time = entry.meta[0..2].join " "
      entry.source = entry.meta[3]
      entry.process = entry.meta[4]
    end
    entry
  end
end
The parse_line method takes advantage of the fact that, on my servers, the fourth field only ever contains angle brackets when it is a channel, never when it is a source. That allows me to tell apart the two log entry styles I have: one with a channel and one without.
The algorithm is simple enough to tweak should improvements be needed. However, you can probably tell the app doesn't currently aggregate anything. This is on purpose. How you want to store, group, index, or whatnot is your choice. But when you know what you want to do, add your final step after log = parse_line line.
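As one sketch of such a final step (say, per-process counts), you could tally each parsed entry into the logs hash. Shown here standalone, with a stub Struct and made-up process names standing in for real parse_line output:

```ruby
# Hypothetical aggregation step: count log entries per process name.
# Entry is a stand-in for the Log objects produced by parse_line.
Entry = Struct.new(:process)

entries = [
  Entry.new("sshd[123]:"),
  Entry.new("cron[77]:"),
  Entry.new("sshd[124]:")
]

# Strip the "[pid]:" suffix so repeated runs of a daemon count together
counts = Hash.new(0)
entries.each { |e| counts[e.process.sub(/\[\d+\]:?/, "")] += 1 }

counts  # => {"sshd"=>2, "cron"=>1}
```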
Filtering and analyzing log files
To keep things simple, I will show how to add some filtering and analyzing in that same spot, right after log = parse_line line. The server whose log I am analyzing runs no open services apart from SSH. So I am interested in seeing where any breach attempts on the SSH service are coming from.
if log.process.include? "ssh"
  # Naive IPv4 pattern; good enough for a first pass
  ip = log.message[/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/, 1]
  if ip
    if t.logs.include? ip
      t.logs[ip] = t.logs[ip] + 1
    else
      t.logs[ip] = 1
    end
  end
end
The process is simple. Since the normalized log entries all have a field for the process, we look for any processes related to SSH. Next, we use a naive IP address regex to find any SSH log items whose message contains an IP address. I then map them in the Aggregate's logs hash, using the IP address as the key and keeping the count as the value.
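The extraction itself is easy to try standalone. Here is the same naive regex against a made-up sshd message (the address is from the reserved documentation range):

```ruby
# Made-up sshd log message, illustrative only
msg = "Failed password for root from 203.0.113.5 port 49812 ssh2"

# String#[] with a regex and capture index returns group 1, the address
ip = msg[/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/, 1]
# ip => "203.0.113.5"
```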
For a final step, let's sort and print the final hash. For this, I'll add a quick summarize function.
def summarize(l)
  puts l.logs.length
  # sort_by yields [ip, count] pairs; reverse for descending counts
  l.logs = l.logs.sort_by { |k, v| v }.reverse
  l.logs.each { |x| print x, "\n" }
end
And boom. We have a simple way to see the IP addresses that our server sees the most SSH-related events from.
Closing Thoughts
The purpose of this article was to show how easy it is to start building a custom SIEM solution using Ruby. With fewer than 100 lines of code, we have the ability to aggregate, normalize, and analyze system log files.
While far from complete, this post hopefully serves as a nice foundational example of how you can bring visibility to events taking place in your systems.