[SOLUTION] Whiteout (#34)


[SOLUTION] Whiteout (#34)

Postby Ryan Leavengood » Mon, 06 Jun 2005 21:50:54 GMT

This was a fun one. If I would consider anything my Ruby forte, text
processing would be it. So this was right up my alley. I learned a good
bit too. For example, Fixnum#to_s can take a radix representing the base
you want the number converted to in the String. String#to_i does the
same thing, just in the opposite direction.
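A minimal sketch of that radix round trip, assuming a current Ruby (where Fixnum has been folded into Integer). The variable names are only illustrative:

```ruby
# Integer#to_s(radix) renders a number in the given base;
# String#to_i(radix) parses such a string back into a number.
letter  = "a".ord                          # 97
ternary = letter.to_s(3)                   # "10121"
newline = "\n".ord.to_s(3).rjust(5, "0")   # "00101", padded to a fixed width

puts "#{letter} -> #{ternary} -> #{ternary.to_i(3).chr.inspect}"
puts "newline -> #{newline} -> #{newline.to_i(3)}"
```

Padding every character to the same number of digits is what lets a decoder cut the stream back into fixed-size groups.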

I first wrote a simple binary conversion that was inspired by what I
could figure out from the original Perl ACME::Bleach (which wasn't too
much since I'm not a Perl hacker and it was somewhat obfuscated.) Then I
thought I could probably one-up that by making a ternary conversion. I
considered trying higher radixes, but found at least on my editor (Vim)
that only spaces, tabs and newlines were truly "invisible." So ternary
it was, as shown in the code below.

Since I had written two conversions, I decided to make things
interesting and randomly choose which one I used when creating the
files. That should thoroughly confuse people who try and decode any
files that have been "whited out" without knowing the code :)

Anyhow, here is the code (if this weren't a Ruby Quiz I would make this
code much more compact and obfuscated):

# Ruby Quiz: Whiteout (#34)
# Solution by Ryan Leavengood
#
# There are two ways of "whiting out", one that uses a binary
# encoding of spaces and tabs on each line (preserving the
# original newlines), and a ternary encoding that makes newlines
# part of the code and encodes any of the original newlines. The
# method of encoding is chosen at random. In theory other
# non-printable characters could be added to increase the radix
# used for encoding, but I think the best cross-platform "whiting
# out" can be had using spaces, tabs and newlines.

REQUIRE_LINE = "require 'whiteout'"

class WhiteoutBinary
attr_reader :id

WHITESPACE = " \t"
DIGITS = '01'

def initialize
@id = " \t\t"
end

def paint_on(paper)
paper.map do |line|
line.chomp.unpack('b*')[0].tr(DIGITS, WHITESPACE)
end.join("\n")
end

def rub_off(paper)
paper.map do |line|
[line.chomp.tr(WHITESPACE, DIGITS)].pack('b*')
end.join("\n")
end
end

class WhiteoutTernary
attr_reader :id

WHITESPACE = " \t\n"
DIGITS = '012'
# This allows up to 22222 ternary, which is 242 decimal, enough
# for most of ASCII
DIGIT_LENGTH = 5
RADIX = 3

def initialize
@id = " \t\t\t"
end

def paint_on(paper)
paper.join.gsub(/./m) do |c|
c[0].to_s(RADIX).rjust(DIGIT_LENGTH,'0')
end.tr(DIGITS, WHITESPACE)
end

def rub_off(paper)
paper.join.tr(WHITESPACE, DIGITS).gsub(/.{#{DIGIT_LENGTH}}/) do |d|
d.to_i(RADIX).chr
end
end
end

bottle_holder = [WhiteoutBinary.new, WhiteoutTernary.new]

if $0 == __FILE__
ARGV.each do |filename|
wo_name = "#{filename}.wo"
File.open(wo_name, 'w') do |file|
whiteout = bottle_holder[rand(2)]
paper = IO.readlines(filename)
if paper[0] =~ /^\s*#!/
file.print paper.shift
end
file.puts REQUIRE_LINE
file.puts whiteout.id
file.print whiteout.paint_on(paper)
end
File.rename(filename, filename+'.bak')
File.rename(wo_name, filename)
end
else
paper = IO.readlines($0)
paper.shift if paper[0] =~ /^\s*#!/
paper.shift if paper[0] =~ /^#{REQUIRE_LINE}/
id = paper.shift.chomp
whiteout = bottle_holde

Re: [SOLUTION] Whiteout (#34)

Postby Christian Neukirchen » Mon, 06 Jun 2005 22:29:09 GMT

Ryan Leavengood < XXXX@XXXXX.COM > writes:


And this is my solution, with no time spent on robustness or error
handling.  It is more space efficient, though, as I use base 4:


unless caller.empty?
  eval File.read($0).           # or extract from caller...
       gsub(/\A.*\0/m, '').
       tr(" \n\t\v", "0123").
       scan(/\d{4}/m).map { |s| s.to_i(4) }.
       pack("c*")
else
  require 'fileutils'
  ARGV.each { |file|
    code = File.read file
    FileUtils.copy file, file + ".dirty"
    File.open(file, "w") { |out|
      code.gsub!(/\A#!.*/) { |shebang|
        out.puts shebang
        ''
      }
      out.puts 'require "whiteout"'
      out.print "\0"
      code.each_byte { |b|
        out.print b.to_s(4).rjust(4).tr("0123", " \n\t\v")
      }
    }
  }
end
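
The base 4 round trip used above can be tried in isolation. This is only a sketch: the helper names `to_ws4` and `from_ws4` and the constants are made up for illustration, and the final `force_encoding` assumes the source was UTF-8 or plain ASCII.

```ruby
WS4     = " \n\t\v"   # the same four whitespace characters as in the solution above
DIGITS4 = "0123"

# Each byte becomes exactly four base 4 digits, then each digit a whitespace character.
def to_ws4(text)
  text.each_byte.map { |b| b.to_s(4).rjust(4, "0") }.join.tr(DIGITS4, WS4)
end

# Reverse the mapping and rebuild the bytes, four digits at a time.
def from_ws4(white)
  white.tr(WS4, DIGITS4).scan(/\d{4}/).map { |s| s.to_i(4) }
       .pack("C*").force_encoding(Encoding::UTF_8)
end

white = to_ws4("puts 42\n")
puts "#{white.length} whitespace characters for 8 bytes of source"
puts from_ws4(white)
```

Since 4**4 is exactly 256, four digits cover every possible byte value with nothing wasted.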


-- 
Christian Neukirchen  < XXXX@XXXXX.COM >   http://www.**--****.com/ 



[SOLUTION] Whiteout (#34)

Postby Ara.T.Howard » Tue, 07 Jun 2005 00:07:48 GMT

my solution tried to strike a balance between being readable and user friendly
(usage message, etc.) and succinctness.  the basic idea is that whiteout.rb is
a self modifying program that stores the whited-out files in its __END__
section as yaml using the expanded path of the original source file as the
key.  this has the nice side effect that all sources remain quite readable
within the whiteout.rb __END__ section.  eg:


   jib:~/tmp > ls
   a.rb  b.rb  whiteout.rb

   jib:~/tmp > cat whiteout.rb
   #!/usr/bin/env ruby
   require 'yaml'

   this, prog, *paths = [__FILE__, $0, ARGV].flatten.map{|x| File::expand_path x}
   usage = "#{ prog } file [files]+"

   f = open this, 'r+'
   s, pos = f.gets, f.pos until s =~ /^__END__$/
   srcs = YAML::load f

   if prog == this
     abort usage if paths.empty?
     abort "#{ prog } must be writable" unless File::stat(this).writable?
     paths.each do |path|
       s, b = IO::read(path).split(%r/(^\s*#\s*!.*\n)/o).reverse.first 2
       srcs[path] = s
       open(path,'w'){|o| o.puts b, "require 'whiteout'\n"}
     end
     f.seek pos and f << srcs.to_yaml and f.truncate f.pos
   else
     eval srcs[prog]
   end

   __END__
   ---
   {}


   jib:~/tmp > cat a.rb
   #!/usr/bin/env ruby
   p 42


   jib:~/tmp > cat b.rb
   #!/usr/bin/env ruby
   p 'forty-two'


   jib:~/tmp > ruby a.rb
   42


   jib:~/tmp > ruby b.rb
   "forty-two"


   jib:~/tmp > whiteout.rb a.rb b.rb


   jib:~/tmp > cat a.rb
   #!/usr/bin/env ruby
   require 'whiteout'


   jib:~/tmp > cat b.rb
   #!/usr/bin/env ruby
   require 'whiteout'


   jib:~/tmp > ruby a.rb
   42


   jib:~/tmp > ruby b.rb
   "forty-two"


   jib:~/tmp > cat whiteout.rb
   #!/usr/bin/env ruby
   require 'yaml'

   this, prog, *paths = [__FILE__, $0, ARGV].flatten.map{|x| File::expand_path x}
   usage = "#{ prog } file [files]+"

   f = open this, 'r+'
   s, pos = f.gets, f.pos until s =~ /^__END__$/
   srcs = YAML::load f

   if prog == this
     abort usage if paths.empty?
     abort "#{ prog } must be writable" unless File::stat(this).writable?
     paths.each do |path|
       s, b = IO::read(path).split(%r/(^\s*#\s*!.*\n)/o).reverse.first 2
       srcs[path] = s
       open(path,'w'){|o| o.puts b, "require 'whiteout'\n"}
     end
     f.seek pos and f << srcs.to_yaml and f.truncate f.pos
   else
     eval srcs[prog]
   end

   __END__
   ---
   "/home/ahoward/tmp/b.rb": "p 'forty-two'\n"
   "/home/ahoward/tmp/a.rb": "p 42\n"


all in all quite fun!
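
a rough sketch of the storage idea on its own, working on strings instead of rewriting a live file. the helper names `read_store` and `write_store` and the path "/tmp/a.rb" are hypothetical and not part of the program above:

```ruby
require "yaml"

# Split a script's text into its code and the hash stored after __END__.
def read_store(script_text)
  code, data = script_text.split(/^__END__\n/, 2)
  [code, YAML.load(data.to_s) || {}]
end

# Re-assemble the script with the updated hash dumped as YAML after __END__.
def write_store(code, store)
  code + "__END__\n" + store.to_yaml
end

script = "puts 'library code here'\n__END__\n--- {}\n"
code, store = read_store(script)
store["/tmp/a.rb"] = "p 42\n"            # key: path of the whited-out file
updated = write_store(code, store)
_, reloaded = read_store(updated)
puts updated
```

the real program does the same thing in place, seeking to the position just after __END__ and truncating the file once the new yaml has been written.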

cheers.

-a
-- 
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple.  My religion is kindness.
| --Tenzin Gyatso
===============================================================================




Re: [SOLUTION] Whiteout (#34)

Postby Dominik Bathon » Tue, 07 Jun 2005 09:22:43 GMT

Here is my solution. It is quite similar to all the others already posted.

I use Zlib::Deflate to compress the source file, then the bytes are  
converted to base 3 and represented by spaces, tabs and newlines.
This way the result is approx. 3 times bigger than the source.

Dominik

The code:

require "zlib"

def encode_to_ws(str)
     str=Zlib::Deflate.deflate(str, 9)
     res=""
     str.each_byte { |b| res << b.to_s(3).rjust(6,"0") }
     res.tr("012", " \t\n")
end

def decode_from_ws(str)
     raise "wrong length" unless str.length%6 == 0
     str.tr!(" \t\n", "012")
     res=""
     for i in 0...(str.length/6)
         res << str[i*6, 6].to_i(3).chr
     end
     Zlib::Inflate.inflate(res)
end

if $0 == __FILE__
     if File.file?(f=ARGV[0])
         str=IO.read(f)
         File.open(f, "wb") { |out|
             if str =~ /\A#!.*/
                 out.puts $&
             end
             out.puts 'require "whiteout"'
             out.print encode_to_ws(str)
         }
     else
         puts "usage #$0 file.rb"
     end
else
     if File.file?($0)
         str=File.read($0)
         str.sub!(/\A(#!.*)?require "whiteout".*?\n/m, "")
         eval('$0=__FILE__')
         eval(decode_from_ws(str))
     else
         raise "required whiteout from non-file"
     end
end
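
The compress-then-encode step can be exercised without touching any files. A small sketch, with hypothetical helper names and a made-up sample string; the size it prints depends entirely on how well the input compresses:

```ruby
require "zlib"

# Deflate the source, then write each compressed byte as six base 3 digits
# mapped onto space, tab and newline. Five digits are not enough for a full
# byte (3**5 = 243 < 256), hence six.
def deflate_to_ws(source)
  Zlib::Deflate.deflate(source, 9)
    .each_byte.map { |b| b.to_s(3).rjust(6, "0") }.join
    .tr("012", " \t\n")
end

# Undo the whitespace mapping, rebuild the bytes, and inflate them again.
def inflate_from_ws(white)
  bytes = white.tr(" \t\n", "012").scan(/.{6}/).map { |d| d.to_i(3) }
  Zlib::Inflate.inflate(bytes.pack("C*"))
end

source = "puts 'hello'\n" * 50
white  = deflate_to_ws(source)
puts "source: #{source.bytesize} bytes, whitened: #{white.bytesize} bytes"
```

A highly repetitive sample like this one compresses far better than typical code, so the ratio it reports is not representative.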



Similar Threads:

1.SOLUTION Whiteout (#34)

I thought of adding more bits by using other whitespace characters, but
in the end stuck with space and tab, since everything will render them
as whitespace (whereas almost any other whitespace I tried showed up as
garbage in one editor/reader or another). Also, I didn't use linefeed
nor carriage return, since non-binary transmission from one system to
another could potentially break the file (ie, eol conversion).

Anyway, here's my solution.  I tried to keep it short and simple, but it
should still be easily understood.


#!/usr/bin/env ruby

Bits  = '01'
Blank = " \t"

def shebang(f)
    f.pos = 0 unless f.gets =~ /^#!/
end

def confuse(fname)
    File.open(fname, File::RDWR) do |f|
        shebang f
        f.pos, data = f.pos, f.read
        f.puts "require '#{File.basename($0, '.*')}'"
        f.write data.unpack('b*').join.tr(Bits, Blank)
        f.truncate f.pos
    end
end

def clarify(fname)
    File.open(fname, File::RDONLY) do |f|
        shebang f
        f.gets # skip require 'whiteout'
        eval [f.read.tr(Blank, Bits)].pack('b*')
    end
end

if __FILE__ == $0
    ARGV.each { |fname| confuse fname }
else
    clarify($0)
end
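
The core of confuse and clarify is a pack/unpack round trip, which can be checked on a plain string. A sketch with hypothetical helper names, assuming ASCII source text:

```ruby
BIT_CHARS   = "01"
BLANK_CHARS = " \t"

# unpack("b*") yields one long string of bits (least significant bit first
# within each byte); tr then swaps the digits for space and tab.
def confuse_text(code)
  code.unpack("b*").first.tr(BIT_CHARS, BLANK_CHARS)
end

# The reverse: whitespace back to bits, bits back to bytes.
def clarify_text(white)
  [white.tr(BLANK_CHARS, BIT_CHARS)].pack("b*")
end

white = confuse_text("p 42\n")
puts "#{white.length} whitespace characters"
puts clarify_text(white)
```

Because the whole file goes through unpack, the original newlines are encoded too, so the whited-out body is one long line of spaces and tabs.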

2.[QUIZ] [Solution] Whiteout (#34)

Hi,


Here's my solution for Ruby Quiz #34. It's my first one :)


Well, it's pretty simple..

To find out whether the file was run directly or required:
- if __FILE__ == $0

When whiteout is run directly, it does the following for each ARGV:
- Leaves the shebang intact
- Adds the "require 'whiteout'"
- Converts the rest of the file to whitespace

The conversion to whitespace is done like this:
- Chars like "\n" and "\r" are ignored
- Each byte is converted to its bit representation
- So we have something like 01100001
- Then, it is converted to whitespace
- 0 results in a " " (space)
- 1 results in a "\t" (tab)

When whiteout was required:
- Opens the file of $0
- Skips to after the require line
- Decodes the whitespace to code
- Runs the code with eval


I don't like the opening of $0 and the eval part.
Also, the encoding to whitespace could be made more efficient in size by 
adding more whitespace characters to the "code table".

I'm curious to see the other, probably cleaner solutions.

Bye,
  Robin

PS: Here's the code:


#!/usr/bin/ruby


#
# This is my solution for Ruby Quiz #34, Whiteout.
# Author::  Robin Stocker
#


#
# The Whiteout module includes all functionality like:
# - whiten
# - run
# - encode
# - decode
#
module Whiteout

   @@bit_to_code = { '0' => " ", '1' => "\t" }
   @@code_to_bit = @@bit_to_code.invert
   @@chars_to_ignore = [ "\n", "\r" ]

   #
   # Whitens the content of a file specified by _filename_.
   # It leaves the shebang intact, if there is one.
   # At the beginning of the file it inserts the require 'whiteout'.
   # See #encode for details about how the whitening works.
   #
   def Whiteout.whiten( filename )
     code = ''
     File.open( filename, 'r' ) do |file|
       file.each_line do |line|
         if code.empty?
           # Add shebang if there is one.
           code << line if line =~ /#!\s*.+/
           code << "#{$/}require 'whiteout'#{$/}"
         else
           code << encode( line )
         end
       end
     end
     File.open( filename, 'w' ) do |file|
       file.write( code )
     end
   end

   #
   # Reads the file _filename_, decodes and runs it through eval.
   #
   def Whiteout.run( filename )
     text = ''
     File.open( filename, 'r' ) do |file|
       decode = false
       file.each_line do |line|
         if not decode
           # We don't want to decode the "require 'whiteout'",
           # so start decoding not before we passed it.
           decode = true if line =~ /require 'whiteout'/
         else
           text << decode( line )
         end
       end
     end
     # Run the code!
     eval text
   end

   #
   # Encodes text to "whitecode". It works like this:
   # - Chars in @@chars_to_ignore are ignored
   # - Each byte is converted to its bit representation,
   #   so that we have something like 01100001
   # - Then, it is converted to whitespace according to @@bit_to_code
   # - 0 results in a " " (space)
   # - 1 results in a "\t" (tab)
   #
   def Whiteout.encode( text )
     white = ''
     text.scan(/./m) do |char|
       if @@chars_to_ignore.include?( char )
         white << char
       else
         char.unpack('B8').first.scan(/./) do |bit|
           code = @@bit_to_code[bit]
           white << code
         end
       end
     end
     return white
   end

   #
   # Does the inverse of #encode, it takes "white"
   # and returns the decoded text.
   #
   def Whiteout.decode( white )
     text = ''
     char = ''
     white.scan(/./m) do |code|
       if @@chars_to_ignore.include?( code )
         text << code
       else
         char << @@code_to_bit[code]
         if char.length == 8
           text << [char].pack("B8")
           char = ''
         end
       end
     end
     return text
   end

end


#
# And here's the logic part of whiteout.
# If it was run directly, whites out the files in ARGV.
# And if it was required, decodes the whitecode and runs it.
#
if __FILE__ == $0
   ARGV.each do |filename|
     Whiteout.whiten( filename )
   end
else
   Whiteout.run( $0 )
end


3.[QUIZ] Whiteout (#34)

4.[SUMMARY] Whiteout (#34)

Does this library have any practical value?  Probably not.  It's been suggested
in the Perl community that hacks like this are a good minor deterrent to those
trying to read source code you would rather keep hidden, but it must be stressed
that this is no form of serious security.  Regardless, it's a fun little toy to
play with.

It was mentioned in the discussion that Perl, where ACME::Bleach comes from,
includes a framework for source filtering.  It can be used to make modules that
modify source code much as we are doing in this quiz.  Perl's Switch.pm is a
good example of this, but ironically ACME::Bleach is not.

That naturally leads to the question, can you build source filters in Ruby? 
Clearly we can build ACME::Bleach, but not all source filters are as simple, I'm
afraid.  Consider this:

	#!/usr/local/bin/ruby -w

	require "fix_my_broken_syntax"

	invalid++

Now the thought here is that fix_my_broken_syntax.rb will read my source, change
it so that it does something valid, eval() it, and exit() before the invalid
code is an issue.  Here's a trivial example of fix_my_broken_syntax.rb:

	#!/usr/local/bin/ruby -w

	puts "Fixed!"
	exit

Does that work?  Unfortunately, no:

	$ ruby invalid.rb 
	invalid.rb:5: syntax error
	invalid++
	         ^

Ruby never gets to loading the library, because it's not happy with the syntax
of the first file.  That makes writing a source filter for anything that isn't
valid Ruby syntax complicated, and if it is valid Ruby syntax, you can probably
just code it up in Ruby to begin with.

Except for whiteout.rb, our version of ACME::Bleach.

You can't build Ruby constructs out of whitespace alone, so some form of source
filtering is required.  Luckily, we can get away with the approach described
above for this source filter, because a bunch of whitespace (with no code) is
valid Ruby syntax.  It just doesn't do anything.  Ruby will skip right over our
whitespace and load the library that restores and runs the code.
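
The difference between the two cases can be checked from Ruby itself. This is a sketch that assumes the standard MRI interpreter, since `RubyVM::InstructionSequence` is specific to it; `syntax_ok?` is a made-up helper name:

```ruby
# Compile (but do not run) a piece of source and report whether it parsed.
def syntax_ok?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

puts syntax_ok?(" \t \n\t\t \n")   # whitespace only: parses fine, does nothing
puts syntax_ok?("invalid++")       # rejected before any library could be loaded
```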

Most people took this approach.  Let's examine one such example by Robin
Stocker:

	#!/usr/bin/ruby

	#
	# This is my solution for Ruby Quiz #34, Whiteout.
	# Author::  Robin Stocker
	#

	#
	# The Whiteout module includes all functionality like:
	# - whiten
	# - run
	# - encode
	# - decode
	#
	module Whiteout

	  @@bit_to_code = { '0' => " ", '1' => "\t" }
	  @@code_to_bit = @@bit_to_code.invert
	  @@chars_to_ignore = [ "\n", "\r" ]

	  #
	  # Whitens the content of a file specified by _filename_.
	  # It leaves the shebang intact, if there is one.
	  # At the beginning of the file it inserts the require 'whiteout'.
	  # See #encode for details about how the whitening works.
	  #
	  def Whiteout.whiten( filename )
	    code = ''
	    File.open( filename, 'r' ) do |file|
	      file.each_line do |line|
	        if code.empty?
	          # Add shebang if there is one.
	          code << line if line =~ /#!\s*.+/
	          code << "#{$/}require 'whiteout'#{$/}"
	        else
	          code << encode( line )
	        end
	      end
	    end
	    File.open( filename, 'w' ) do |file|
	      file.write( code )
	    end
	  end
	
	  # ...

First, we can see that the module defines some module variables, which are
really used as constants here.  Their contents hint at the encoding algorithm
we'll see later.

Then we have a method for managing the transformation of the source into
whitespace.  It starts by opening the passed file and reading the code
line-by-line.  If the first line is a shebang line, it's saved in the variable
code.  Next, a "require 'whiteout'" line is added to code.  Finally, all other
lines from the file are appended to code after being passed through an encode()
method we'll examine shortly.  With the contents read and transformed, the
method then reopens the source for writing and dumps the modifications into it.
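
The same transformation can be sketched on a string rather than a file. The helper name `whiten_text` is hypothetical and this is not Robin's code. One difference is deliberate: as far as can be told from reading the listing above, a first line that is not a shebang is neither copied nor encoded, whereas this variant encodes it along with the rest.

```ruby
# Keep a leading shebang line as is, add the require line, and turn every
# other byte except "\n" and "\r" into eight spaces/tabs.
def whiten_text(source)
  lines  = source.lines
  header = lines.first.to_s.start_with?("#!") ? lines.shift : ""
  body = lines.join.each_byte.map do |b|
    if b == 10 || b == 13
      b.chr                                   # line endings stay untouched
    else
      b.to_s(2).rjust(8, "0").tr("01", " \t")  # 0 => space, 1 => tab
    end
  end.join
  header + "require 'whiteout'\n" + body
end

puts whiten_text("#!/usr/bin/ruby\np 42\n").inspect
```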

The next method is the reverse process:

	  # ...
	
	  #
	  # Reads the file _filename_, decodes and runs it through eval.
	  #
	  def Whiteout.run( filename )
	    text = ''
	    File.open( filename, 'r' ) do |file|
	      decode = false
	      file.each_line do |line|
	        if not decode
	          # We don't want to decode the "require 'whiteout'",
	          # so start decoding not before we passed it.
	          decode = true if line =~ /require 'whiteout'/
	        else
	          text << decode( line )
	        end
	      end
	    end
	    # Run the code!
	    eval text
	  end
	
	  # ...

This method again reads the passed file.  It skips over the "require 'whiteout'"
line, then copies the rest of the file into the variable text, after passing it
through decode() line-by-line.  The final line of the method calls eval() on
text, which should now contain the restored program.

On to encode() and decode():

	  #
	  # Encodes text to "whitecode". It works like this:
	  # - Chars in @@chars_to_ignore are ignored
	  # - Each byte is converted to its bit representation,
	  #   so that we have something like 01100001
	  # - Then, it is converted to whitespace according to @@bit_to_code
	  # - 0 results in a " " (space)
	  # - 1 results in a "\t" (tab)
	  #
	  def Whiteout.encode( text )
	    white = ''
	    text.scan(/./m) do |char|
	      if @@chars_to_ignore.include?( char )
	        white << char
	      else
	        char.unpack('B8').first.scan(/./) do |bit|
	          code = @@bit_to_code[bit]
	          white << code
	        end
	      end
	    end
	    return white
	  end

	  #
	  # Does the inverse of #encode, it takes "white"
	  # and returns the decoded text.
	  #
	  def Whiteout.decode( white )
	    text = ''
	    char = ''
	    white.scan(/./m) do |code|
	      if @@chars_to_ignore.include?( code )
	        text << code
	      else
	        char << @@code_to_bit[code]
	        if char.length == 8
	          text << [char].pack("B8")
	          char = ''
	        end
	      end
	    end
	    return text
	  end

	end
	
	# ...

The comments in there detail the exact process we're looking at here, so I'm not
going to repeat them.

Note that @@chars_to_ignore contains "\n" and "\r" so they are not translated. 
The effect of that is that line-endings are untouched by this conversion.  Some
solutions used such characters in their encoding algorithm.  The gotcha there is
that any line-ending translation done to the modified source (say FTP through
ASCII mode) will break the hidden code.  Robin's solution doesn't have that
problem.
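
A small sketch of that gotcha, simulating a transfer that rewrites "\n" as "\r\n". All helper names here are made up for illustration; the ternary pair stands in for the solutions that use newline as a digit, and ASCII source is assumed:

```ruby
# Binary encoding that leaves line endings alone (the approach described above).
def bin_encode(text)
  text.each_char.map { |c| "\r\n".include?(c) ? c : c.unpack("B8").first.tr("01", " \t") }.join
end

def bin_decode(white)
  white.gsub(/[ \t]{8}/) { |group| [group.tr(" \t", "01")].pack("B8") }
end

# Ternary encoding where newline is one of the three digits.
def tern_encode(text)
  text.each_byte.map { |b| b.to_s(3).rjust(5, "0") }.join.tr("012", " \t\n")
end

def tern_decode(white)
  white.tr(" \t\n", "012").scan(/.{5}/).map { |d| d.to_i(3) }.pack("C*")
end

# Simulate an ASCII-mode transfer to a system with CRLF line endings.
def to_crlf(text)
  text.gsub("\n", "\r\n")
end

source = "p 42\n"
puts bin_decode(to_crlf(bin_encode(source))).inspect    # still the program, with CRLF endings
puts tern_decode(to_crlf(tern_encode(source))).inspect  # no longer the original program
```

In the ternary case every inserted "\r" shifts the five-character groups that follow it, so everything after the first encoded newline digit decodes to something else.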

Here's the code that ties all those methods into a solution:

	# ...
	
	#
	# And here's the logic part of whiteout.
	# If it was run directly, whites out the files in ARGV.
	# And if it was required, decodes the whitecode and runs it.
	#
	if __FILE__ == $0
	  ARGV.each do |filename|
	    Whiteout.whiten( filename )
	  end
	else
	  Whiteout.run( $0 )
	end

Again, the comment saves me some explaining.

That was Robin's first solution to a Ruby Quiz, but I never would have known
that from looking at the code.  Thanks for sharing Robin!

Obviously, a conversion of this type grossly inflates the size of the source. 
Each byte becomes eight whitespace characters, so the result is around eight
times the original size.  A couple of solutions used zlib to control the
expansion, which I thought was clever.  By compressing the source before
encoding it (and using a base three conversion), Dominik Bathon got results
around three times the original size instead.
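
The raw expansion factor for each radix is simple arithmetic: it is the number of digits needed to cover all 256 byte values. A quick sketch (the helper name is made up):

```ruby
# Smallest number of base-`radix` digits that can represent any byte (0..255).
def digits_per_byte(radix)
  digits = 1
  digits += 1 while radix**digits < 256
  digits
end

[2, 3, 4].each do |radix|
  puts "base #{radix}: #{digits_per_byte(radix)} whitespace characters per byte"
end
```

Base three on its own still costs six characters per byte, so the rest of the saving in a compressed solution has to come from zlib, and how much that is depends on how well the particular source compresses.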

Ara.T.Howard took a different approach, using whiteout.rb as a database to store
the trimmed files.  That was a very interesting process, demonstrated well in
the submission email.  The advantages to this approach would be no inflation
penalty and the code stays readable (just not in the original location).  The
disadvantage I see is that it requires the exact same library to be present both
at encoding and decoding, which probably makes sharing the altered code
impractical.

As always, my thanks to all who gave this little diversion an attempt.  I'm sure
we'll see tons of whitespace only code on RubyForge in the future, thanks to our
efforts.

Tomorrow begins part one of our first two-part Ruby Quiz.  Stay tuned...

