Friday, November 29, 2013

Performance Impact of SSH on Large File Transfers

I think by now, we all know that Secure Shell (SSH) should be preferred to TELNET and FTP for a lot of reasons, but especially for the security of encrypting the session and not sending authentication in the clear. This is great for traffic over the Internet or between any two network domains where you don't know who or what may be lurking out there between point A and point B.

But within your own LAN where you can personally vouch for the integrity of every node, the overhead of encryption is stealing a little bit of your session bandwidth. This adds extra time to large file transfers (e.g. moving DVD ISO files around). Suppose your local facility rules dictate that you may not use plain old FTP and must use SFTP. In this particular use-case, we don't care about security, but are forced to use the same tools regardless. So how can we maximize their efficiency? Exactly how much is this overhead and should we be concerned? If we are concerned, can we do anything about it? Let's find out with an experiment.

When you establish an ssh/sftp/scp session between two hosts, a negotiation takes place to establish common parameters, including the algorithms used for bulk data encryption and hash-based message authentication codes (HMAC). Which ones get selected comes down to which algorithms the SSH versions on both sides of the connection support, and then the order of preference. So the final choices are the most preferred algorithms that both client and server support. However, you can override this selection using configuration options.

Let's use OpenSSH for experimenting since that's all I have convenient access to (other than PuTTY on Windows). I'm using a virtual Fedora 19 (running in VirtualBox on a Windows 7 host with a Core i7-860). F19 has this version of SSH:
OpenSSH_6.2p2, OpenSSL 1.0.1e-fips 11 Feb 2013
I have another system on the LAN that is running native Ubuntu 13.04 with SSH version:
OpenSSH_6.1p1 Debian-4, OpenSSL 1.0.1c 10 May 2012
When I establish a session using default settings, the negotiation selects these algorithms. You can see this by using the -v option to get some debug output. 'kex' is short for "key exchange".
debug1: kex: server->client aes128-ctr hmac-md5 none
debug1: kex: client->server aes128-ctr hmac-md5 none
So by default, it prefers 128-bit AES for bulk encryption and MD5 for HMAC.
FYI, the 'none' at the end is the compression algorithm which is not used by default.
If you ask for it with the -C option, you get:
debug1: kex: server->client aes128-ctr hmac-md5 zlib@openssh.com
Compression would reduce the amount of data being sent, but at the expense of more time on the sending and receiving ends to compress and uncompress it. This really only buys you something if you have a very slow (low bandwidth) connection. You generally don't want this on your local LAN transfers.
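To pull just the negotiated algorithms out of the -v debug stream, a small filter is enough. This is only a sketch: the function name show_kex and the host user@ubuntu-box are placeholders of mine, not part of the original setup.

```shell
# show_kex: keep only the negotiated cipher/MAC/compression lines
# from ssh's -v debug output, read on stdin.
show_kex() {
    grep -E '^debug1: kex: (server->client|client->server) '
}

# Example (hypothetical host). ssh writes its debug output to stderr,
# so fold stderr into stdout before filtering:
#   ssh -v user@ubuntu-box true 2>&1 | show_kex
```

Running the remote command "true" just opens the session, completes the negotiation, and exits again.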

Let's first measure the transfer time of a reasonably large file. I just picked a "large" (259 MB) file I had sitting around. I sent it from Fedora to Ubuntu five times in a row and saw the same performance each time, about 12 seconds.
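A repeated measurement like this can be scripted. The sketch below is one possible harness, not the exact procedure used for the numbers in this post. The function name time_runs, the file name, and the host are made up for illustration, and it assumes key-based login so that no time is spent at a password prompt.

```shell
# time_runs N CMD [ARGS...]: run CMD N times and print the elapsed
# wall-clock time of each run in whole seconds.
time_runs() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start)) s"
        i=$((i + 1))
    done
}

# Example (hypothetical file and host names):
#   time_runs 5 scp bigfile.iso.gz user@ubuntu-box:/tmp/
```

Whole-second resolution is coarse for transfers this short; the shell's own "time" keyword gives finer numbers if you need tenths of a second.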

Quick side-bar... the same test using compression took 20 seconds. The file being transferred is already compressed with gzip, so more compression doesn't save much, if anything, but sure takes up more time. Definitely not helping.

So how do we change the selection of these algorithms? And what other choices are available?
The ssh man page shows that the -c option can be used to select the encryption algorithm and the -m option selects the HMAC algorithm. The available choices are documented in the ssh_config man page. You can use the command line options to change it for just the current session or you can use the config file options in $HOME/.ssh/config to permanently save your alternate choices.
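As a rough illustration of both routes, here is a sketch. The host alias "lanbox", the address, and the user name are hypothetical, and the cipher and MAC shown are just example values from the ssh_config lists, not a recommendation.

```shell
# Per-session: choose the cipher (-c) and the MAC (-m) on the ssh command line.
# Host name is a placeholder.
#   ssh -c aes128-cbc -m hmac-sha1 user@ubuntu-box

# Persistent: print a Host stanza suitable for $HOME/.ssh/config.
# "lanbox" and 192.168.1.50 are hypothetical values.
lan_stanza() {
    cat <<'EOF'
Host lanbox
    HostName 192.168.1.50
    Ciphers aes128-cbc
    MACs hmac-sha1
EOF
}

# Review the output first, then append it:
#   lan_stanza >> "$HOME/.ssh/config"
```

Scoping the override to a single Host entry keeps the defaults in place for every other destination.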

The scp and sftp file transfer programs do not support the -m option to change this. We will see if they will still honor the MAC keyword in the config file or not.


So in good scientific method, let us only modify one variable at a time and see what happens.
We will start with the encryption algorithm. See the ssh_config page 'Ciphers' keyword for the choices. Be careful not to confuse it with the 'Cipher' (singular) keyword as that lists the choices for the deprecated SSH Protocol version 1.

Here are all the cipher choices that were compatible with both systems and the resulting transfer times in seconds.
aes128-ctr      12.0
aes192-ctr      13.3
aes256-ctr      14.5
arcfour256       7.2
arcfour128       7.0
aes128-cbc       7.6
3des-cbc        19.5
blowfish-cbc     7.8
cast128-cbc     12.0
aes192-cbc       8.3
aes256-cbc       8.5
arcfour          6.9
Wow, a clear winner: 'arcfour' cuts roughly 43% off the default transfer time (6.9 seconds versus 12.0). I'll take it. :)
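To put those times in more familiar terms, the arithmetic can be done with a few lines of awk. This is a sketch of my own: the function name throughput is invented, and it assumes the 259 MB file size quoted earlier and treats the first row as the baseline.

```shell
# throughput SIZE_MB: read "cipher seconds" pairs on stdin and print the
# cipher, its throughput in MB/s, and its time as a percentage of the
# first row (the baseline).
throughput() {
    awk -v size="$1" '
        NR == 1 { base = $2 }
        { printf "%-12s %5.1f MB/s  %5.1f%% of baseline time\n", $1, size / $2, 100 * $2 / base }
    '
}

# Three rows taken from the table above; the default cipher goes first.
throughput 259 <<'EOF'
aes128-ctr 12.0
arcfour 6.9
3des-cbc 19.5
EOF
```

With these inputs the default works out to about 21.6 MB/s and arcfour to about 37.5 MB/s.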

Does HMAC matter? Let's find out. It turns out that you can modify the choice via the config file using the 'MACs' keyword and scp/sftp both honor it. We'll leave the encryption choice alone (back to the default aes128-ctr) for now.
hmac-md5                12.0
hmac-sha1               12.5
umac-64@openssh.com     12.0
hmac-sha2-256           14.0
hmac-sha2-512           14.5
hmac-ripemd160          13.8
hmac-sha1-96            12.9
hmac-md5-96             12.0
This clearly has a much smaller effect on the transfer time. The default choice matches the best of the other times, so we can just leave this alone.
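If you would rather not edit the config file for every run, scp and sftp accept -o to pass any ssh_config keyword, which covers MACs even though there is no -m flag. The sketch below only prints the commands for a sweep over the MACs listed above. The function name mac_sweep_cmds, the file name, and the host are placeholders, and this is an alternative to the config-file method used for the measurements, not a description of it.

```shell
# mac_sweep_cmds FILE DEST: print one scp command per MAC to test.
# Printing instead of running makes this a dry run; pipe to sh to execute.
mac_sweep_cmds() {
    for mac in hmac-md5 hmac-sha1 umac-64@openssh.com hmac-sha2-256 \
               hmac-sha2-512 hmac-ripemd160 hmac-sha1-96 hmac-md5-96; do
        echo "scp -o MACs=$mac $1 $2"
    done
}

# Example with placeholder file and host names:
#   mac_sweep_cmds bigfile.iso.gz user@ubuntu-box:/tmp/
```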

In both algorithm choices, we see, as expected, that larger key/hash bit sizes for the same algorithm increase the computation time. As stated before, my use case does not require any security, so I would use 'arcfour'. If you follow the literature, however, arcfour (RC4) is considered to be somewhat weakened. Blowfish is almost as fast and is generally considered stronger. Its successor, Twofish, was a finalist in the NIST AES selection process, but the Rijndael algorithm won the competition and was redubbed "AES".

This back-of-the-envelope experiment shows that in an environment where FTP is not allowed but security is not an issue, you can gain considerable bandwidth back by selecting a different encryption algorithm for scp/sftp file transfers.

This post maps to CompTIA SY0-301 exam objectives 1.4 and 6.2.

Time to get started

Time to get started.

The notion of using blog posts was inspired by this post that I found while searching for security-related webinars that would earn CEUs but had low or no cost. Turns out there really aren't many (any?).  I figure it can't be too difficult to write a handful of posts. They just need to be relevant to the Security+ exam objectives. I'll be using the SY0-301 list.

But first a quick rant about the cost of all these certifications and their maintenance.
I am not an "IT guy". My actual job role has always been software developer/analyst/engineer/architect, but during my career (almost 23 years now), out of both necessity and personal interest, I have learned many of what we now collectively refer to as "IT skills". I've always been on small-ish teams and we've rarely had the luxury of someone dedicated to taking care of our IT needs. So I volunteered a lot of such effort over the years and learned all kinds of things. Computer security has always been one of my interest areas.

More recently, our team moved to a new facility that had significantly higher security standards than our previous home. We were short on staff at the time and in order for me to be permitted to keep helping out with the IT tasks, I would have to meet the same criteria as our formal IT guys, i.e. certifications. So I self-studied the CompTIA Security+ and passed the exam. And I have to maintain the certification in order to retain my administrative privileges.

Philosophically, I completely support the notion of certified individuals doing something to maintain their knowledge and skills and presenting some evidence of having done so. (Sometimes while driving, I think folks ought to have to retake their driver's license exam every so often...) In the case of all these IT and security certifications, however, I find a significant financial barrier. If your primary job role is in one of these areas and your employer will pay for the time and expense of training and taking the exams, then that's great for you. But if you fall into my case and you're just doing it out of self-interest or "on the side" as it were, then a lot of these certifications and their maintenance are likely WAY out of your budget.

The CompTIA certifications seem to be some of the least expensive options.
There are usually some good self-study books available for less than $50 and the exam fees are $200 - $300. Not so bad.

But take a look at some of the other stuff, like SANS, Cisco, EC-Council, etc.
Sticker shock! The exam fees are $500+ and you really need to buy either their training material or take one of their training courses, which will run from many hundreds to a couple thousand dollars.
Wow. It would be easy to get cynical and say they're all just taking us for what they can. But I can also see that these certs are not desired by enough people for any kind of Wal-Mart style volume discounts to start happening.

Maybe that will change over time. We'll see. Until then, the unsupported enthusiasts and non-IT people like me will just need to look for the affordable options where we can.

Saturday, November 23, 2013

Why this blog?

I am starting this blog to earn CEU credits towards the renewal of my CompTIA Security+ CE certification. Turns out you can claim credits by authoring topic-relevant blog posts of sufficient length. You can claim up to 16 CEUs, one per post, during your 3-year renewal cycle.