Ruby String XOR Optimizations
As part of the LazyRaid project I spent a non trivial amount of time trying to develop an algorithm to perform an XOR operation in ruby against two blocks of data read from files. My initial try was a pure ruby implementation and was based off of some forum posts. These worked fine for small strings but when used on blocks of data larger than a few kilobytes it took an unacceptably long time.
This next method avoids the bytes method and accesses each string char directly then converts them to their ordinal values performs the XOR and pushes it on to the string. More than a 2x speedup from the first implementation.
Note: Towards the end of preparing this post I came across the ruby gem fast_xor which is based on C like my solution. However fast_xor stands out since it runs at 5x the speed my implementation does. It's included in the benchmarks below and you should probably implement it if you're considering doing any non-trivial XOR work in your ruby application. Otherwise xor6 should be suitable for most use cases.
#Initial XOR method from forums - http://www.ruby-forum.com/topic/175539At this point I started investigating writing a C extension to give the process a boost and eventually succeeded after a night of furious hacking. As you can see in the benchmarks below this was about a 2000x speedup and I decided to move on to other features. It wasn't until after the project was finished that I started wondering what sort of speedup I could get using a pure ruby implementation and if it would even be comparable to my naive first stab at a ruby C extension. So I started trying different methods out.
#0.627 Mb/s (All times are based on a single core of my development machine - i5 750 @ 2.67GHz)
def xor1(second)
self.bytes.zip(second.bytes).map { |a,b| (a^b).chr}.join
end
This next method avoids the bytes method and accesses each string char directly then converts them to their ordinal values performs the XOR and pushes it on to the string. More than a 2x speedup from the first implementation.
#1.589 Mb/sFurther refinement involves using the zip method to combine the strings as byte arrays resulting in some marginal speedups.
def xor2(second)
s = ""
[self.size,second.size].max.times do |i|
s << ((self[i] || 0).ord ^ (second[i] || 0).ord)
end
return s
end
#1.881 Mb/sEnter the pack and unpack methods. Seemingly the quickest way to convert the string into byte arrays.
def xor3(second)
s = ""
#s.force_encoding("ASCII-8BIT")
[self.size,second.size].max.times.zip(self.each_byte.to_a,second.each_byte.to_a) do |i,a,b|
s << (a ^ b).chr
end
return s
end
#2.613 Mb/sI attempted the operation with unpack options larger than 1 byte. This works great for large data sets that have sizes evenly divisible by 8 bytes. The 0-7 bytes at the end will be ignored. Some extra code could probably be written to account for this. As the speedup is substantial and this represents the fastest ruby implementation I was able to produce.
def xor4(second)
s = []
self.unpack('C*').zip(second.unpack('C*')) {|a,b| s.push( (a||0) ^ (b||0) ) }
return s.pack('C*')
end
#13.369 Mb/sAnd finally a more robust method that properly takes into account all sizes and differing lengths of strings. This implementation is essentially the same as xor4.
def xor5(second)
s = []
self.unpack('Q*').zip(second.unpack('Q*')) {|a,b| s.push( (a||0) ^ (b||0) ) }
return s.pack('Q*')
end
#2.605 Mb/sIn the end my fastest ruby implementation was still about 2 orders of magnitude slower than the C solution. I'm sure there's more room for performance improvements and I hope someone out there takes a look at this and finds a more optimal ruby only solution. Until then I'll likely start looking to C more often when I need to do some performance optimization. You can see the code used in this post as well as the C libraries used on github at https://github.com/pcorliss/Ruby-String-XOR-Optimizations
def xor6(second)
self.length > second.length ? (string1=self;string2=second) : (string2=self;string1=second)
s = []
string1.unpack('C*').zip(string2.unpack('C*')) {|a,b| s.push( (a||0) ^ (b||0) ) }
return s.pack('C*')
end
Note: Towards the end of preparing this post I came across the ruby gem fast_xor which is based on C like my solution. However fast_xor stands out since it runs at 5x the speed my implementation does. It's included in the benchmarks below and you should probably implement it if you're considering doing any non-trivial XOR work in your ruby application. Otherwise xor6 should be suitable for most use cases.
pcorliss@hawaii:~/projects/ruby_benchmarks$ ruby benchmark.rb
Ensuring parity calculation is correct.
xor1:1d9d0030fe31415e1fa9dd8b7b782315
xor2:1d9d0030fe31415e1fa9dd8b7b782315
xor3:1d9d0030fe31415e1fa9dd8b7b782315
xor4:1d9d0030fe31415e1fa9dd8b7b782315
xor5:1d9d0030fe31415e1fa9dd8b7b782315
xor6:1d9d0030fe31415e1fa9dd8b7b782315
xorc:1d9d0030fe31415e1fa9dd8b7b782315
xor!:1d9d0030fe31415e1fa9dd8b7b782315
Performing Benchmarks
user system total real
xor1 10 times: 79.040000 0.440000 79.480000 ( 79.689313)
xor2 10 times: 31.410000 0.000000 31.410000 ( 31.467753)
xor3 10 times: 26.280000 0.240000 26.520000 ( 26.583489)
xor4 10 times: 18.690000 0.400000 19.090000 ( 19.136069)
xor5 10 times: 3.720000 0.020000 3.740000 ( 3.740085)
xor6 10 times: 18.650000 0.500000 19.150000 ( 19.191539)
xorc 10 times: 0.040000 0.000000 0.040000 ( 0.041738) #My C implementation
xor! 10 times: 0.000000 0.000000 0.000000 ( 0.011803) #fast_xor gem
xorc 1000 times: 4.370000 1.390000 5.760000 ( 5.785931) #My C implementation
xor! 1000 times: 1.160000 0.000000 1.160000 ( 1.170025) #fast_xor gem