Archive for the 'Geeky' Category

Shrink

Thursday, October 12th, 2006 at 9:21 pm

I recently wrote a small program that really made me realise how important it is to get all your requirements defined before you start to code. I needed to shrink a text file to no more than 80 characters wide, so I thought I’d write a small Perl script to do this for me. The paragraphs in the text file were separated by the “%” symbol. On the face of it the requirements seemed really straightforward:

  • Read in a text file with sentences longer than 80 characters.
  • Output a text file with sentences no longer than 80 characters, maintaining the “%” between paragraphs.

So I thought about the following algorithm:

  1. Pass two arguments to the program: first the file F you want to split, and second the new length NL that you want the sentences not to exceed.
  2. Take the text file and read it into an array A, putting each paragraph into an element in A.
  3. Loop through A and split each element of A into a new array B, one character per element.
  4. Loop through B and print each character to stdout. Maintain a counter X as you go.
  5. If X reaches NL, print out a newline (\n).
  6. After printing the last character in B, print a “%”.
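As a rough illustration of steps 3 to 6, here is a minimal sketch of the naive version. The name naive_wrap is made up for this example, and resetting the counter after each newline is my assumption, since the steps above do not spell it out.

```perl
use strict;
use warnings;

# Hypothetical helper: break a paragraph every $nl characters,
# paying no attention to word boundaries (steps 3 to 5).
sub naive_wrap {
    my ($paragraph, $nl) = @_;
    my @chars = split //, $paragraph;   # array B: one character per element
    my $out = '';
    my $x   = 0;                        # counter X
    foreach my $char (@chars) {
        $out .= $char;
        $x++;
        if ($x == $nl) {
            $out .= "\n";
            $x = 0;                     # assumption: restart the count on each new line
        }
    }
    return $out;
}

# Step 6: finish the paragraph with the "%" separator.
print naive_wrap("the quick brown fox", 8), "\n%\n";
```

With a limit of 8 the first line comes out as "the quic" and the next begins with "k brown", so words get cut in half.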

The problem with this, as I found, was that it indiscriminately split words in order to maintain the NL character limit. So I thought you probably need to check whether the character at NL is a space. If it is not, backtrack through the array until you find one, then output a newline there.

This introduced a new problem though: if you did not find a space at position NL but found one at, say, NL-10, you would have to start your search for your next newline at NL+(NL-10), not at NL+NL. So I had to maintain a way of remembering where you cut off the previous sentence, and use that to tell you where to start looking for your next NL. Here is the main loop of the program:

#array holds whole text file
foreach(@array){
	$i = 0;
	@small_array = ();
	$prev_end_of_line = 0;
	#read the current paragraph into a new array, one character per element
	@small_array = split(//,$_);
	foreach(@small_array){
		#option 1: space found at standard position
		if(($small_array[$prev_end_of_line+$new_length] =~ /\s/)&&$found!=1){
			$prev_end_of_line = $prev_end_of_line+$new_length;
			$found = 1;
		}
		elsif($found!=1){
			#start backtracking
			for($j=($prev_end_of_line+$new_length)-1; ;$j--){
				#option 2: space found at position $j
				if($small_array[$j] =~ /\s/){
					$prev_end_of_line = $j;
					$found = 1;
					last;
				}
			}
		}
		print $_;
		$i++;
		if($i == $prev_end_of_line){
			print "\n";
			$found = 0;
		}
	}
	print "\n%\n";
	$found = 0;

}

Once I had solved that I started experimenting with NLs other than 80, and found that if you entered a really low one like 2 it broke the program. It seemed I had made another oversight: I had not allowed for the possibility that there was no space between NL and the start of the line. I decided that to deal with this I would simply have to split the array at NL, regardless of whether it cut up a word or not. This is when the real fun and games started. I thought that to deal with sentences without a space I could check whether we had reached the start of the current line while backtracking; if we had, that meant there was no space, so we would just cut our losses and put the newline in at the default position:

#start backtracking
for($j=($prev_end_of_line+$new_length)-1; ;$j--){
	#option 2: space found at position $j
	if($small_array[$j] =~ /\s/){
		$prev_end_of_line = $j;
		$found = 1;
		last;
	}
	#option 3: space not found
	elsif($j==$prev_end_of_line){
		$prev_end_of_line = $prev_end_of_line+$new_length;
		$found = 1;
		last;
	}
}

For the longest time I could not work out why this would not work. Eventually I realised that, because of the way I backtracked to the start of the line looking for a space, a very small NL combined with a space at the start of the line meant there was a chance $j would be set incorrectly. Instead, what I needed to do was this, with the no-space check moved ahead of the space check:

#start backtracking
for($j=($prev_end_of_line+$new_length)-1; ;$j--){
	#option 3: no space found
	if($j==$i&&$i>0){
		$prev_end_of_line =  $prev_end_of_line+$new_length;
		$found = 1;
		last;
	}
	#option 2: space found at position $j
	if($small_array[$j] =~ /\s/){
		$prev_end_of_line = $j;
		$found = 1;
		last;
	}
}
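To see the three options side by side, here is a small sketch that only works out where the line breaks fall rather than printing anything. It is not the code from my script: break_positions is a made-up name, and treating the character at the start of the line as "no space found" is an assumption chosen so that tiny limits always make progress.

```perl
use strict;
use warnings;

# Hypothetical helper: return the character indexes at which a newline
# would be inserted for a paragraph and a line limit $nl (at least 1).
sub break_positions {
    my ($paragraph, $nl) = @_;
    my @chars = split //, $paragraph;
    my @breaks;
    my $prev = 0;                        # where the previous line was cut
    while ($prev + $nl < @chars) {
        my $j = $prev + $nl;             # option 1: the standard position
        # option 2: walk back towards the start of the line looking for a space
        $j-- while $j > $prev && $chars[$j] !~ /\s/;
        # option 3: reached the start of the line, so split mid-word
        $j = $prev + $nl if $j == $prev;
        push @breaks, $j;
        $prev = $j;
    }
    return @breaks;
}

my @breaks = break_positions("the quick brown fox", 8);
print "@breaks\n";
```

Each new search starts from the previous cut rather than from a multiple of the limit, which is the bookkeeping that $prev_end_of_line does in the listings above.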

I hope my slapdash review of the code has not been entirely uneducational. You may get a better picture by downloading the full source and trying it out for yourself. Any problems or comments, please let me know!

Update

As seems to be the way with Perl, there is always more than one way to do something. After consulting a Perl guru, it seems substr() is the way forward when doing anything with strings. This script is a far more elegant solution than mine, but was not half as much fun to create!
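For a flavour of the substr() style, here is a minimal sketch of my own. It is not the guru's script: the name wrap_paragraph is invented, and dropping the space at each break is a design choice made for this example. It leans on the core functions rindex() and substr().

```perl
use strict;
use warnings;

# Hypothetical helper: wrap one paragraph to at most $nl characters per line.
sub wrap_paragraph {
    my ($text, $nl) = @_;
    my @lines;
    while (length($text) > $nl) {
        # last space at or before index $nl, so the line holds at most $nl characters
        my $cut = rindex($text, ' ', $nl);
        if ($cut > 0) {
            push @lines, substr($text, 0, $cut);
            $text = substr($text, $cut + 1);    # skip the space itself
        }
        else {
            # no usable space: cut our losses and split mid-word
            push @lines, substr($text, 0, $nl);
            $text = substr($text, $nl);
        }
    }
    push @lines, $text;
    return join("\n", @lines);
}

print wrap_paragraph("the quick brown fox", 8), "\n%\n";
```

Because each pass chops the finished line off the front of the string, there is no previous cut position to remember between lines.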

Posted in Geeky
by Hopkins

Firefox

Thursday, November 17th, 2005 at 5:11 pm

After setting up a friend’s new Dell computer last night, I was overjoyed to see that it came preinstalled with Firefox! This has renewed my motivation to improve people’s Internet experience and prompted me to put a ‘really big annoying’ link for downloading Firefox at the top of this website. It is only visible to people who use IE.

Update 19/01/2006
I’ve since removed this; if you want, you can still see it here.

Posted in Geeky
by Hopkins

Quotation Plugin

Thursday, August 18th, 2005 at 7:41 pm

I have written a plugin for WordPress that allows you to display a random quote-of-the-day on your site. There already appear to be a few quote-like plugins available, but they all require you to manually create or add the quotations yourself. My plugin has a clear advantage in that it is self-contained. It pulls quotes from an RSS feed supplied by quotationspage.com.

In my opinion, the quotes you get from this site are of a very high standard. Sometimes funny, sometimes political, sometimes philosophical, but always witty! The plugin, once installed, will select quotes at random from a possible twelve, which are updated once every twenty-four hours.

Posted in Geeky
by Hopkins

Googling

Sunday, June 12th, 2005 at 12:46 pm

Bored as I was, I began experimenting with which search queries could be used to find this site. Just click the links to Google and you’ll find me at the top (or second/third). I found some really good ones, but they only showed up on about page 3-4 of the Google results, so they don't count!

If you can find any other good ones, do share!

Posted in Geeky
by Hopkins

Stitch up!

Monday, May 2nd, 2005 at 5:28 pm

Over the years, I have found myself a victim of the bright and flashy lighted machines known to those in the trade as "fruities". You'll find them scattered throughout just about every watering hole in the land, and I myself have, on more occasions than I care to admit, wasted a small fortune on them.

Posted in Geeky
by Hopkins