<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>/var/tmp   </title>
    <link>http://www.vartmp.com/blog</link>
    <description>Android, Linux, FLOSS etc.</description>
    <language>en</language>

  <item>
    <title>Gnome terminal resize info on Ubuntu 13.04 - raring</title>
    <pubDate>Wed, 22 May 2013 20:43:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2013/05/22#20130522</link>
    <description>&lt;P&gt;

Ubuntu disabled the resize info tooltip for Gnome Terminal once again, 
and has once again changed their convoluted method to restore it.
&lt;P&gt;
In this iteration of Ubuntu:&lt;P&gt;

1) &quot;sudo aptitude install compiz-plugins&quot; &lt;BR&gt;
(or &quot;sudo apt-get install compiz-plugins&quot; if you don't have 
aptitude).&lt;P&gt;
2) &quot;sudo aptitude install ccsm&quot;&lt;P&gt;
Then run &quot;ccsm&quot;.  In the filter box, search for &quot;Resize Info&quot;.  The 
box is unchecked with its default tooltip turned off.  Check the box.  
Compiz will then freeze up a little for a few seconds and then go back 
to normal.  You now have Gnome terminal resize info tooltip enabled.</description>
  </item>
  <item>
    <title>VPSs, Nagios, .com domains</title>
    <pubDate>Wed, 17 Apr 2013 11:25:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2013/04/17#20130417</link>
    <description>
&lt;P&gt;

My revenues have been between $900 and $1425 over the past four months, 
so in January I decided to splurge and get VPS instances from two 
providers.
&lt;P&gt;

I read online about what people thought.  A lot of people liked Linode 
so I went with them.  For $20 a month I get 2 TB outbound transfer, 24 
gigs of storage, a priority CPU and a share of eight others, and 1 gig 
of RAM.  In January that was 512MB of RAM and 200 gigs of transfer, but 
there has been competition in the VPS space.

&lt;P&gt;

Rackspace seemed popular as well.  People were less enthused, but it 
was deemed OK.  So I got a VPS with them.  With the lowest price &quot;cloud 
server&quot; you get 20 gigs disk, 1 virtual CPU, and 512MB RAM.  Pricing is 
$16.06 a month but does not include traffic.  With 32-33 gigs going out 
it is $20 a month.  I send out less than 1 gig a month so I am charged 
around $16.18.  Of course, these policies determine how I use the 
servers.  I served 33 gigs of data from Linode in March.

&lt;P&gt;

I'm running Debian 6.0 on both servers.  I run Debian because - what 
else am I going to run?  I've worked with Debian since Vincent Yesue 
introduced Debian to me back in the mid 1990s.  I'm familiar with it.  
I run Ubuntu on my desktop so I'm familiar with dpkg.  I could run 
Fedora or CentOS (can't afford Red Hat at this stage) but Debian seemed 
fine enough.

&lt;P&gt;

I decided to set up a Nagios instance on my desktop and watch 
Dreamhost, Bluehost, Rackspace and Linode.  I knew Dreamhost was flaky; 
now I really know.  Anyhow, I've been slowly shifting everything 
to the VPSs.

&lt;P&gt;

I run BIND 8 on both VPSs for primary and secondary DNS.  I also run 
Apache on both VPSs.  Rackspace is the front end web site.  Linode I 
use for serving epub files, and also to handle search queries.  So I 
run MySQL on Linode as well.

&lt;P&gt;

Last week, Nagios said Linode was slow.  So I began culling down memory 
usage on Apache, BIND and MySQL.  Nagios still said it was slow.  So I 
began timing web page gets from other locations, and Linode was fine.  
The connection from my ISP to Linode was just slow for a few hours.  
It's probably better that I tuned it anyhow.

&lt;P&gt;

I had some domain name ideas while doing this, so I signed up with 
Namecheap and got some domain names.  I will probably be holding most 
of my domain names there henceforth.  The number of dot com names 
registered is in the hundreds of millions.  It keeps going up.  I 
remember back in 1996 when names like proof.com were still 
unregistered; I missed snapping that up by a few days.  Someone just 
e-mailed me offering to sell me a domain name for $350,000.

&lt;P&gt;

So I saw a domain I wanted expiring.  I used snapnames.com to 
scoop it up.  And I got it.  So now I have bookmarkflood.com.  Most of 
the domains I have are either connected to books or bookmarks.

&lt;P&gt;

I want to improve my programming knowledge, more specifically Java, 
more specifically Android.  But programming in general as well.  
Besides, Android is not all about Java - a lot of what I've been doing 
with Android has been C and C++ apps using the NDK.  Or server side 
programs - usually Perl so far.

&lt;P&gt;

I've been reading Structure and Interpretation of Computer Programs.  I 
have been taking my time to go through it.  Right now I am on section 
1.2.3.</description>
  </item>
  <item>
    <title>Mobile rising, Windows falling</title>
    <pubDate>Tue, 16 Apr 2013 19:52:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2013/04/16#20130416</link>
    <description>&lt;!DOCTYPE HTML&gt;
&lt;html&gt;

&lt;head&gt;  
	&lt;script type=&quot;text/javascript&quot;&gt;
	window.onload = function () {
		var chart = new CanvasJS.Chart(&quot;chartContainer&quot;,
		{
			zoomEnabled: true,
			title:{
				text: &quot;Browser OS's seen on Wikipedia&quot;,
			},
			axisY2:{
				valueFormatString:&quot;0 %&quot;,
				
				maximum: 1,
                                minimum: 0,
				interval: .1,
				interlacedColor: &quot;WhiteSmoke&quot;,
				gridColor: &quot;LightGray&quot;,      
	 			tickColor: &quot;Silver&quot;,								
			},
                         theme: &quot;theme2&quot;,
			legend:{
				verticalAlign: &quot;bottom&quot;,
				horizontalAlign: &quot;center&quot;,
				fontSize: 14,
				fontFamily: &quot;Lucida Sans Unicode&quot;

			},
			data: [
			{        
				type: &quot;line&quot;,
				lineThickness:3,
				axisYType:&quot;secondary&quot;,
				showInLegend: true,           
				name: &quot;Windows&quot;, 
				dataPoints: [
				{ x: new Date(2009, 3), y: 0.895 },
				{ x: new Date(2010, 2), y: 0.8705 },
				{ x: new Date(2011, 2), y: 0.8178 },
				{ x: new Date(2012, 2), y: 0.7338 },
				{ x: new Date(2013, 2), y: 0.5573 },


				]
			},
			{        
				type: &quot;line&quot;,
				lineThickness:3,
				showInLegend: true,           
				name: &quot;iOS&quot;,
				axisYType:&quot;secondary&quot;,
				dataPoints: [
				{ x: new Date(2009, 3), y: 0.0093},
				{ x: new Date(2010, 2), y: 0.0139 },
				{ x: new Date(2011, 2), y: 0.0348 },
				{ x: new Date(2012, 2), y: 0.0746 },
				{ x: new Date(2013, 2), y: 0.2519 },


				]
			},
			{        
				type: &quot;line&quot;,
				lineThickness:3,
				showInLegend: true,           
				name: &quot;Android&quot;,        
				axisYType:&quot;secondary&quot;,
				dataPoints: [
				{ x: new Date(2009, 3), y: 0.0003 },
				{ x: new Date(2010, 2), y: 0.0013 },
				{ x: new Date(2011, 2), y: 0.01 },
				{ x: new Date(2012, 2), y: 0.0346 },
				{ x: new Date(2013, 2), y: 0.0619 },


				]
			},
			{        
				type: &quot;line&quot;,
				lineThickness:3,
				showInLegend: true,           
				name: &quot;Mac&quot;,        
				axisYType:&quot;secondary&quot;,
				dataPoints: [
				{ x: new Date(2009, 3), y: 0.0605 },
				{ x: new Date(2010, 2), y: 0.0693 },
				{ x: new Date(2011, 2), y: 0.0773 },
				{ x: new Date(2012, 2), y: 0.0841 },
				{ x: new Date(2013, 2), y: 0.0671 },


				]
			},


			]
		});

chart.render();
}
&lt;/script&gt;
  &lt;script type=&quot;text/javascript&quot; 
src=&quot;/assets/js/canvasjs.min.js&quot;&gt;&lt;/script&gt;
&lt;body&gt;
&lt;P&gt;
Alexa lists Wikipedia as the &lt;a 
href=http://www.alexa.com/siteinfo/wikipedia.org&gt;6th most popular&lt;/a&gt; 
web site in the world.

&lt;P&gt;

One nice thing about Wikipedia is Wikimedia data analyst Erik Zachte 
gives a detailed &lt;a 
href=http://stats.wikimedia.org/wikimedia/squids/SquidReportOperatingSystems.htm&gt; 
public summary&lt;/a&gt; of Wikipedia's web traffic.  We have been hearing 
about the rise of mobile technologies like iOS and Android, and the 
problems Windows has been having, and that is well illustrated on 
Wikipedia.  Windows browser share was at 55.73% last month, down from 89.5%
four years ago.
&lt;P&gt;

	&lt;div id=&quot;chartContainer&quot; style=&quot;height: 400px; width: 80%;&quot;&gt;
	&lt;/div&gt;
&lt;/body&gt;


&lt;/html&gt;
</description>
  </item>
  <item>
    <title>2012 in review</title>
    <pubDate>Thu, 03 Jan 2013 20:34:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2013/01/03#20130103</link>
    <description>&lt;P&gt;
Well, I have had some small success with Android this year.  Here are my 
month-to-month earnings:
&lt;P&gt;
&lt;img src=/blog/images/2012.png&gt;
&lt;P&gt;

I made $747.30 from my Android apps in November, then that number jumped 
to $1234.78 for December.  From December 25th to 28th I made over $62 
every day.  I did not expect that to continue in the short term and it 
has not; today I made about $40 on Android.

&lt;P&gt;

One reason the amount of money I make on it matters is that it is 
self-perpetuating.  The more I make on Android, the more time I can 
devote to programming Android apps.

&lt;P&gt;

</description>
  </item>
  <item>
    <title>Amazon EC2, Bluehost, Dreamhost </title>
    <pubDate>Fri, 14 Dec 2012 20:39:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2012/12/14#20121214</link>
    <description>&lt;P&gt; 

&lt;h5&gt;Dreamhost&lt;/h5&gt;

I have been hosting on Dreamhost since 2005.  For a $10 a month web 
hosting service, I have been happy.&lt;P&gt;

Actually, I seem to have been grandfathered in with the monthly $10 
rate.  I started with a one-year plan in 2005 then moved to monthly.  I 
pay monthly - $10 a month.  It appears the monthly rate is now $11 a 
month with a $50 setup fee.  Yearly is $10 a month, two-year is $9 a 
month.

&lt;P&gt;

Problems over the years...NTP was off by a little on my host, but an 
e-mail to support fixed that.  A few times the host was completely 
unreachable - web servers down, not reachable by ssh.  An e-mail fixed 
that.  Sometimes my web logs would become unreadable or stop rotating; 
an e-mail would fix that.

&lt;P&gt;

The main problem I have faced is load averages.  I have seen the 
15-minute load average stay above 200 for two-hour periods, peaking at 
261.  This on a machine with 4 processor cores.  Of course, the machine 
slows to an absolute crawl when this happens.  I have limited access to 
their machine's /proc directory, so I have no idea what causes these 
surges.  I would say high load averages are my main concern with 
Dreamhost.  As I type, the 15-minute load average is over 18.  The 
machine has 4 processor cores.  This has been a problem on Dreamhost 
since I signed on - in 2005.

&lt;p&gt;

One disconcerting thing with Dreamhost is that the techs seem to have 
grown less concerned over the years.  In 2005 and 2006, support jumped on 
problems.  As time went on, support stopped responding to high load 
issues, or told me factually incorrect information about what a load 
average is.  When I could not access my web logs the way I have for years, 
by doing a cat, the message was more or less &quot;just live with it&quot; 
(thankfully, I can once again cat my web logs).

&lt;P&gt;
&lt;h5&gt;Bluehost&lt;/h5&gt;

I just signed up with Bluehost in October.  I opened the account for the 
use of one of my Android apps.  Which might sound expensive for one app, 
but that app makes enough every four days to pay for a year's worth of 
Bluehost service.
&lt;P&gt;
With Dreamhost, html directories are all separate, which I like.  With 
Bluehost, they're all piled on top of one another, which I dislike.  
&lt;P&gt;
Bluehost also does not let me run cron jobs.  You have to go to the web 
interface and schedule jobs.  I understand, in a sense, why they do this: 
they don't want my account tied to the machine.  But if you're going to 
virtualize scheduled jobs to the web, why not try to virtualize cron as 
well?  I mean, this problem was solved with Unix in the 1970s, why are 
we going backwards?
&lt;P&gt;
I served out 23 gigs worth of files via Bluehost in November without 
much complaint, so so far, so good.
&lt;P&gt;
&lt;h5&gt;Amazon EC2&lt;/h5&gt;

&lt;P&gt;

I wanted to pull some epub's from Gutenberg.org.  So I signed up with 
EC2.  They make a small credit card withdrawal and also call your phone 
and you have to type a PIN.  In less than an hour, I was all signed up.  
I spun up a free micro-server on US East and connected to Gutenberg.  IP 
blocked on the first try!  (Obviously because of someone before me.)  So I 
terminated that instance.  Then I spun up one from a non-US location.  
Success!  I pulled down a few hundred epub's.  Now I'm up to date.  I 
sent an e-mail to Gutenberg.org a month ago and never heard back.

&lt;P&gt;

Anyhow, EC2 seemed cool.  I have heard people talk about how cheap web 
hosting (and databases, and application servers) is getting, but I 
didn't really get how cheap.  Or understand how easy it is to dial up an 
account from a small web server to one handling many hits.  EC2's 
pay-for-what-you-use service, with reasonable prices, works great for me.  
I'm sure I will be looking more at it in the future.</description>
  </item>
  <item>
    <title>Processing large data files</title>
    <pubDate>Fri, 12 Oct 2012 03:39:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2012/10/12#20121012</link>
    <description>
&lt;P&gt;

I guess noticing this thing shows how little I know about programming, 
but I have now seen this come at me in two different directions and am 
now more aware of it.

&lt;/P&gt;&lt;P&gt;

The thing I am talking about is when I am processing a large data file 
with a program.  The large data file is in a certain data format. So 
initially, since it is theoretically the easy way, I try to download the 
entire file into memory into a data structure which fits that data 
format.  Once that is done, I start processing that data structure.  
Usually what I'm doing is in one way or another, translating the data 
from the form it is in, into another type of data structure, and 
outputting the new data structure.

&lt;/P&gt;&lt;P&gt;

The problem is you have this huge data structure in memory, and are 
trying to manipulate portions of it into another data structure, and it 
just takes up a lot of resources.  The work gets done, but it is too 
slow.  Sometimes all of this memory use starts memory paging, and then 
the machine slows to a crawl.

&lt;/P&gt;&lt;P&gt;

My first encounter with this was when I wrote a Java program for my 
blunder suite of tools - pgn2fen.  I would take a PGN (Portable Game 
Notation) file that was 7 megs or so, load it into memory, and then 
convert every move of every game in that PGN into a FEN (Forsyth–Edwards 
Notation) data structure, which represents a chess board position.

&lt;/P&gt;&lt;P&gt;

Initially, I would load the file as a linked list of Strings, and then 
send that entire list as a parameter to various classes.  As the program 
began coming together, I made a big improvement in &lt;a 
href=http://blunderchess.git.sourceforge.net/git/gitweb.cgi?p=blunderchess/blunderchess;a=commit;h=f3b84f7cd259e1e6243438a642b29c6b58542c3b&gt;the 
code&lt;/a&gt;.  Now I created a second shorter linked list alongside the 
large linked list.  I would then slice off a piece of the list, like a 
slice of salami or a banana, and send that slice around to the other 
classes.  The large linked list was rid of the data as soon as it was 
sliced, and the smaller linked list with the slice itself was discarded 
once it was processed.  I would then go on to slice off the next part of 
the large linked list, and repeat.  The code looks like this:
&lt;/P&gt;&lt;P&gt;

&lt;pre&gt;
            for (i=0; i &lt; gameSize; i++) {
                shortList.add(longList.getFirst());
                longList.removeFirst();
            }
            g.doOneGame(shortList);
            shortList.clear();
&lt;/pre&gt;

&lt;/P&gt;
&lt;P&gt;
This change made the program over ten times faster.
&lt;/P&gt;&lt;P&gt;
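A minimal sketch of the same slicing idea, written in Python rather than 
the Java of pgn2fen.  The names process_in_slices, slice_size and 
handle_slice are made up for illustration, and a fixed slice size stands 
in for the per-game gameSize above, so treat it as the shape of the loop 
and not as the actual program.
&lt;/P&gt;&lt;P&gt;

&lt;pre&gt;
```python
from collections import deque


def process_in_slices(lines, slice_size, handle_slice):
    """Feed handle_slice one short list at a time, emptying the long queue as we go."""
    long_queue = deque(lines)
    processed = 0
    while long_queue:
        short_list = []
        # Move up to slice_size items out of the long queue; popleft() removes
        # each item from the queue, so the queue shrinks as slices are cut.
        while long_queue and len(short_list) != slice_size:
            short_list.append(long_queue.popleft())
        handle_slice(short_list)
        processed += len(short_list)
    return processed


# Toy run: the handler only records each slice so the result can be inspected.
seen = []
total = process_in_slices(["e4", "e5", "Nf3", "Nc6", "Bb5"], 2, seen.append)
print(total, seen)
```
&lt;/pre&gt;

&lt;/P&gt;&lt;P&gt;
In this toy run the handler keeps every slice so it can be printed.  A 
real handler would process the slice and let it go, which is what keeps 
memory use down to one slice at a time.
&lt;/P&gt;&lt;P&gt;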

I recently faced a similar problem.  This time it was with a Perl script 
translating an RDF file into an XML data structure.  In this case, my 
machine would start swapping and take hours to process the file.  Maybe 
not surprising that it had a larger effect on the machine, as the PGN 
files were usually less than 10 megs, and this data file is over 240 
megs.  With my desktop GUI, as well as the RDF data structure, necessary 
operations and new data structure, my 4 gigs got swamped and my machine 
started paging.  After a few hours the process was done, but I wanted to 
look into if there was a way to deal with this.

&lt;/P&gt;&lt;P&gt;

Again, if resources are infinite, it's always programmatically easier to 
just load the entire data structure, do the processing, and output the 
new data structure.  But resources are not infinite.  Something I 
certainly learned doing over a decade of systems 
administration professionally.

&lt;/P&gt;&lt;P&gt;

In this case I switched from using the CPAN XML::Simple module, to using 
CPAN's XML::Twig module.  From XML::Twig documentation:&lt;/P&gt;
&lt;P&gt;

[XML::Twig] allows minimal resource (CPU and memory) usage by building 
the tree only for the parts of the documents that need actual 
processing....One of the strengths of XML::Twig is that it let you work 
with files that do not fit in memory (BTW storing an XML document in 
memory as a tree is quite memory-expensive, the expansion factor being 
often around 10).  To do this you can define handlers, that will be 
called once a specific element has been completely parsed...Once the 
element is completely processed you can then flush it, which will output 
it and free the memory. You can also purge it if you don't need to 
output it.

&lt;/P&gt;&lt;P&gt;

Which is what I do.  RDF elements I need I grab with a handler, process, 
and then purge.  RDF elements I do not need I have the handler purge 
immediately.

&lt;/P&gt;&lt;P&gt;

The processing now takes much, much less memory.  It finishes much faster 
as well.  A lot of the time is probably taken by the instant-purging of 
RDF elements that will never be processed.
&lt;/P&gt;&lt;P&gt;
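The handle-then-purge pattern can be sketched in Python as well.  This 
is not the XML::Twig script: it swaps in the standard library's 
xml.etree.ElementTree.iterparse, and the element names (catalog, book, 
title, notes) are invented for the example.  The sample document is 
built in memory as a stand-in for the large RDF file.
&lt;/P&gt;&lt;P&gt;

&lt;pre&gt;
```python
import io
import xml.etree.ElementTree as ET


def build_sample():
    """Build a small XML document in memory (a stand-in for the large RDF file)."""
    root = ET.Element("catalog")
    for number, title in enumerate(["Moby Dick", "Dracula", "Emma"], start=1):
        book = ET.SubElement(root, "book", {"id": str(number)})
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "notes").text = "bulk we never need"
    return ET.tostring(root)


def stream_titles(source):
    """Collect titles one book at a time, clearing each book once it is handled."""
    titles = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "book":
            titles.append(elem.findtext("title"))
            elem.clear()  # drop the book's children and text once we are done
    return titles


titles = stream_titles(io.BytesIO(build_sample()))
print(titles)
```
&lt;/pre&gt;

&lt;/P&gt;&lt;P&gt;
The parser reports each element when its closing tag is seen, so a book 
is complete by the time it is inspected.  Clearing it right away plays 
roughly the role that purge plays in XML::Twig.
&lt;/P&gt;&lt;P&gt;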

Anyhow, I now see I have run into the same problem twice.  It was 
solved more or less the same way both times - I processed the large, 
original data structure one element at a time, and when I was done 
processing that element I would remove it from memory and go on to the 
next element.  Not the easiest way to do things programmatically, but a 
necessity with large data files and limited resources.
&lt;/P&gt;
</description>
  </item>
  <item>
    <title>Getting Scheme to do REPL in Emacs</title>
    <pubDate>Fri, 12 Oct 2012 02:34:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2012/10/12#20121012</link>
    <description>
&lt;P&gt;

I took a college course last year, half of which was learning Lisp and 
functional programming.  I don't feel I learned that much about either 
Lisp or functional programming in the course.  I had taken a previous 
course with the same instructor in graph theory where I felt I did 
learn a lot.  Especially in subsequent courses where I had to learn tree 
data structures and the like.

&lt;/P&gt;&lt;P&gt;

Anyhow, I decided to take another crack at Lisp and functional 
programming.  Some of the great and/or successful programmers have a 
fondness for Lisp and recommend it, even if you don't see it around much 
any more.  As Paul Graham says about his usage of Lisp, &quot;Everyone else 
was writing their software in C++ or Perl. But we also knew that that 
didn't mean anything. If you chose technology that way, you'd be running 
Windows.&quot;

&lt;/P&gt;&lt;P&gt;

Structure and Interpretation of Computer Programs is often touted as a 
must-read book.  When I first browsed through it a few years ago it 
seemed confusing.  I'm not sure why that is; when I look at it now it 
mostly seems simple and clear.  I'm still reading the first of the five 
chapters.  They're very heavy on the &quot;interpretation&quot; part of their 
title, going into evaluation and eval etc.  It's not yet clear to me why 
they're emphasizing this so much, but perhaps I'll understand as I read 
through the book.

&lt;/P&gt;&lt;P&gt;

My college course used Common Lisp.  I understand CL is more of the 
real-world one, with more libraries, but also more cruft and less 
simplicity.

&lt;/P&gt;&lt;P&gt;

Scheme is simpler, more elegant, and easier to understand.  Scheme 
defines functions with the symbol define.  CL defines functions with 
the symbol defun.  That alone tells you a lot about the dialects.

&lt;/P&gt;&lt;P&gt;

One thing I like about Scheme is it seems to have a small number of 
primitive expressions, with a few more derived/library expressions built 
on those primitive expressions.  I like this simplicity.  While these 
Scheme expressions deal with abstraction and things like that, it 
reminds me of how almost all number-theoretic functions on the natural 
numbers derive from three primitive functions - constant, successor 
and projection - by applying the composition and primitive recursion 
operations to those functions.  And the computations that can't be done 
with these three functions and two operations are rather offbeat, like 
Georg Cantor's ones, which do little other than prove you can't do 
every natural number computation with those rules.

&lt;/P&gt;&lt;P&gt;
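To make the three functions and two operations concrete, here is a small 
sketch in Python (not Scheme).  The helper names are my own choices for 
illustration, with zero standing in for the constant function.  It builds 
addition and multiplication out of nothing but those pieces.
&lt;/P&gt;&lt;P&gt;

&lt;pre&gt;
```python
def zero(*args):
    """Constant function: always 0."""
    return 0


def successor(n):
    """Successor function: n + 1."""
    return n + 1


def projection(i):
    """Return the function that picks the i-th argument (0-based)."""
    return lambda *args: args[i]


def compose(outer, *inner):
    """Composition: outer applied to the results of each inner function."""
    return lambda *args: outer(*(f(*args) for f in inner))


def primitive_recursion(base, step):
    """Build h with h(0, xs) = base(xs) and h(n + 1, xs) = step(n, h(n, xs), xs)."""
    def h(n, *xs):
        result = base(*xs)
        for k in range(n):
            result = step(k, result, *xs)
        return result
    return h


# add(n, x): the base case is x itself; each step takes the successor of the previous result.
add = primitive_recursion(projection(0), compose(successor, projection(1)))
# multiply(n, x): the base case is 0; each step adds x to the previous result.
multiply = primitive_recursion(zero, compose(add, projection(1), projection(2)))

print(add(3, 4), multiply(3, 4))
```
&lt;/pre&gt;

&lt;/P&gt;&lt;P&gt;
The loop inside primitive_recursion is only a convenient way to unroll 
the recursion.  Each function it builds is still defined by a base case 
and a step on the previous value.
&lt;/P&gt;&lt;P&gt;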

I also like that Scheme clearly marks non-functional procedures and 
forms with exclamation points.

&lt;/P&gt;&lt;P&gt;

Especially since Lisp is not heavily used nowadays, it seems obvious to 
me that people should first learn Scheme as that seems the best language 
to learn in.  If they want to do some real world stuff and feel CL has 
better libraries or whatnot, they can then shift to CL.

&lt;/P&gt;&lt;P&gt;

So anyhow I've been going through SICP.  The initial expressions could 
mostly be done with the Scheme interpreter guile.  It does not have 
readline support by default like CL does, so I put into my .guile 
profile:
&lt;/P&gt;&lt;P&gt;

(use-modules (ice-9 readline))&lt;br&gt;
(activate-readline)

&lt;/P&gt;&lt;P&gt;

As the programs became more complex, I wanted a more sophisticated REPL.  
Emacs seems to be what Lisp programmers use.  I am not well-acquainted 
with emacs, even though I first started using it twenty years ago!  I 
usually use vi, or nano, or Gnome gedit, or Eclipse's editor, or the 
like.  Anyhow, doing elisp with Emacs is easy enough, but using Scheme 
is a little bit more work.  I spent some time looking at it today and 
got it put together.  Oddly, there's not really one place on the web 
which tells you how to do this.

&lt;/P&gt;&lt;P&gt;

In my emacs init file I now have:
&lt;/P&gt;&lt;P&gt;

(setq scheme-program-name &quot;guile&quot;)&lt;br&gt;
(global-set-key (kbd &quot;C-x C-e&quot;) 'scheme-send-last-sexp)
&lt;/P&gt;&lt;P&gt;

I also have:
&lt;/P&gt;&lt;P&gt;

(split-window-vertically)
&lt;/P&gt;&lt;P&gt;

Just so I don't have to do &quot;Control-x -&gt; 2&quot; when I start Emacs.  If I 
start using Emacs more for editing, perhaps I'll comment that line out. 

&lt;/P&gt;&lt;P&gt; 

So I click the bottom window, type &quot;Escape-x&quot; and then 
&quot;run-scheme&quot;.  Then I click the top window and start typing in 
expressions.  I usually do &quot;Control-x Control-e&quot; after each one to 
evaluate it.  It evaluates in the bottom window which runs guile.  I had 
the scheme-program-name set to scm and was running that for a bit, but 
switched to guile.  Don't know much about either aside from that both 
seem to be copyrighted by the FSF, but the FSF seems to push guile more, 
and also guile has a nice (help) facility. 
&lt;/P&gt;&lt;P&gt; 
Anyhow, it is running well enough for now, though I'd like to improve 
my Scheme REPL environment a little more.
&lt;P&gt;</description>
  </item>
  <item>
    <title>Android, and porting C++ and OpenGL via the JNI</title>
    <pubDate>Wed, 20 Jun 2012 17:06:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2012/06/20#20120620</link>
    <description>&lt;P&gt;

I've been interested in the idea of porting free software to Android 
since I started working with Android.  The first free software programs 
I considered doing an Android port of were written in Java.  The reason 
I looked at Java programs first is Android seems to have a slight 
preference for Java over C and C++.&lt;/P&gt;
&lt;P&gt;
When investigating various Java programs for potential ports, I realized 
that porting the UI portions of the programs over, particularly ones 
that used Java graphical libraries such as awt or swing, would be 
difficult.  Android does not implement these graphical libraries.&lt;/P&gt;
&lt;P&gt;

So then I began investigating free software Java libraries.  One popular 
one which caught my eye was Jackcess, which could read Microsoft Access 
database files.  I wrote a little Android UI wrapper around the library, 
and within a few days was able to release &lt;a 
href=http://play.google.com/store/apps/details?id=com.panaceasupplies.android.panaceadb&gt;Panacea 
Database&lt;/a&gt;.  Since its release, I have added more functionality to the 
program.  I still have not tapped all of the library's functionality, such as 
for database creation.&lt;/P&gt;

&lt;center&gt;&lt;h3&gt;OpenGL&lt;/h3&gt;&lt;/center&gt;

&lt;P&gt;
The idea of porting C and C++ free software programs to Android, 
especially ones using &quot;OpenGL&quot; family graphics, has been in the back of 
my mind for a while.  An informative conversation I had with Katie from 
&lt;a href=http://www.goldenhammersoftware.com&gt;Golden Hammer Software&lt;/a&gt; 
at the 2011 Android Developer Labs pushed me along this route as well, 
not just in learning about the technical aspects of porting C++ apps to 
Android, but seeing how it was feasible.&lt;/P&gt;

&lt;P&gt; 
When you're looking at doing OpenGL work on Android, one of the 
important things to know is that Android does not do OpenGL.  Android 
handles OpenGL ES, which covers only a subset of what OpenGL does.  For 
example, OpenGL ES does not handle the OpenGL glBegin and glEnd 
commands.  You can not directly specify rectangles on OpenGL ES like 
you can on OpenGL.  And so on.
&lt;/P&gt;

&lt;P&gt; Apple iOS uses an implementation of OpenGL ES as well.  Porting C or 
C++ code which uses OpenGL ES from iOS to Android (or vice versa) is not 
that hard.  This in fact is what Golden Hammer Software did.  Porting 
Windows or Linux code that uses a full OpenGL library to Android is a 
much more difficult enterprise.
&lt;/P&gt;

&lt;center&gt;&lt;h3&gt;SDL&lt;/h3&gt;&lt;/center&gt;

&lt;P&gt;

Porting a C or C++ program that directly links to a full OpenGL library 
to Android is going to be a little bit of work.  This brings us to the 
Simple DirectMedia Layer (SDL) library.  The Simple DirectMedia Layer is 
a cross-platform multimedia library designed to provide low level access 
to UI elements of a program (audio, keyboard, mouse, joystick, 3D 
hardware via OpenGL, and 2D video framebuffer).

&lt;/P&gt;&lt;P&gt;

Many programs that directly depend on the SDL library have no direct 
dependencies on OpenGL - the programs use SDL to mediate access to the 
needed lower-level backend libraries.

&lt;/P&gt;&lt;P&gt;

Most programs that depend on SDL were written to depend on the SDL 1.2 
or lower library.  SDL has been rewritten since version 1.3, and is not 
backward compatible with 1.2.  Here, we are only concerned with SDL 1.2 
and lower, which is what the majority of the software out there uses.  
There is an &lt;a 
href=http://github.com/pelya/commandergenius&gt;unofficial&lt;/a&gt; port of SDL 
1.2 to Android, which was mostly done by Sergii Pylypenko (pelya).

&lt;/P&gt;&lt;P&gt;

Pelya has ported 13 SDL games to Android and put them up on Google Play.  
One of the apps, OpenTTD, has had over 100,000 downloads so far, and 
another, Free Heroes 2, also has had over 100,000 downloads.  FH2 
currently has a rating of 4.2 out of 5, so people seem to be happy with 
the port.  With these games done, pelya has said he is finished porting 
games, but he is still maintaining the SDL 1.2 library for 
Android.

&lt;/P&gt;&lt;P&gt;

His library has its own unique little build system.  I am developing on 
an Ubuntu GNU/Linux desktop, and am comfortable with using the command 
line if need be, so it is fine with me.

&lt;/P&gt;&lt;P&gt;

The way he sets things up with his build system, he has the jni 
directory with various libraries a lot of sdl applications will need, 
such as of course sdl itself, the freetype library, the jpg library, the 
physfs library, and other such libraries.  Among these he has an 
application sub-directory named application.  Within it is a link called 
src which points to the application being ported within it - such as 
OpenTTD, or Free Heroes 2, or whatever.

&lt;/P&gt;&lt;P&gt;

I started off by trying to build every application he had within that 
application directory.  He suggested trying ballfield first, as that is easy
to compile and test.  Grafx2, Jooleem, Kobodeluxe, Milkytracker, 
Openal-demo, Opentyrian, Regression, Simplemixer, Test32bpp and 
Testmultitouch all worked OK.  Others failed before compiling for 
various reasons, or did launch but were still broken - perhaps I needed 
to tweak the settings more.

&lt;/P&gt;&lt;P&gt;

He published then unpublished Jooleem.  I thought it was pretty cool and 
e-mailed him saying I wanted to release it, and asked whether there was 
some reason he had unpublished it from Google Play.  He said there wasn't, 
so I did some work on it, then published it.  He may have been right to 
pull it - the game does not have a high download rate, nor does it have a 
high retention rate compared to other SDL ports I did later.

&lt;/P&gt;&lt;P&gt;

Having some experience with working with the stuff he ported, especially 
Jooleem (which I now call Bubble Boxem), I decided to try porting a game 
that pelya had not tried yet.  Circus Linux was a small and simple 
program that used the SDL library, so I decided to port that.  I 
succeeded in porting it as well.

&lt;/P&gt;&lt;P&gt;

Much of what is needed is in pelya's instructions.  First you want to 
compile the program.  The instructions explain how to do that.  If there 
is a data subdirectory, it should be zipped up, moved to AndroidData as 
the instructions explain, split if necessary and if split, the original 
data.zip removed.  You want an icon.png file for the program icon.  Then 
once you get it compiling, you want it to run.  If nothing appears on 
the screen, __android_log_write and __android_log_print can help.  Start 
at the beginning of main(), looking for output in logcat, then continue 
until you find the first problem.  Then the second one.  At some point, 
hopefully, the program will load.

&lt;/P&gt;&lt;P&gt;

Why SDL programs won't compile or run can differ from program to 
program, but I've found common themes.  The first four listed are the 
most important to remember.

&lt;ul&gt;

&lt;li&gt;The C++ code says to run in hardware mode instead of software mode 
when Android can not do so.&lt;/li&gt;

&lt;li&gt;The program looks for directories with configuration files and 
graphics; you have to set those paths up properly.&lt;/li&gt;

&lt;li&gt;Check if it is looking for defines in a config.h file.  These 
defines will have to be set properly for Android.  Also look for similar 
defines not in the config.h files, like a define for LOCALE or the 
like.&lt;/li&gt;

&lt;li&gt;The SDL_SetVideoMode call might have parameters Android can not 
handle.&lt;/li&gt;

&lt;li&gt;Pelya's framework script does not compile C++ files with a suffix of 
cc instead of cpp.&lt;/li&gt;

&lt;li&gt;Stuff from iostream like cout and cerr do not work out of the box.  
Neither do XM audio files&lt;/li&gt;
&lt;/ul&gt;
&lt;P&gt;
The above list covers every problem I've had so far with compiling or 
getting a screen to come up on Android. 
&lt;P&gt;
Now that something is coming up on the screen, you may want to consider 
replacing SDL_UpdateRect calls with SDL_Flip calls, or you may get some 
gibberish on the screen.  The SDL port does not currently handle 
SDL_UpdateRect calls well.
&lt;/P&gt;

&lt;P&gt;You also want to make sure the volume button is working when in the 
SDL app.  If you want to use it, make sure it is not redefined as a key.  
Explicitly listening for KeyEvent.KEYCODE_VOLUME_DOWN or 
KeyEvent.KEYCODE_VOLUME_UP and manually implementing adjustVolume also 
works.
&lt;/P&gt;

&lt;P&gt;Another consideration is the keyboard, and seeing visible text on the 
screen.  With pelya's framework, text appears in an EditText (which I 
sometimes move around on the screen, change colors of etc.)  You can 
have a keyboard pop up on the screen and so forth.  It is something to 
think about.&lt;/P&gt;

&lt;P&gt;Sometimes the game just needs the arrow keys, and maybe a few more 
keys.  Pelya's framework has mechanisms to deal with this.  I use one 
such mechanism in my Ice Blocker game, when a player wants to switch 
from horizontal to vertical (or vice versa).&lt;/P&gt;


&lt;center&gt;&lt;h3&gt;Future plans&lt;/h3&gt;&lt;/center&gt;

&lt;P&gt;
So far I have ported six games to Android using pelya's Android SDL 
library.  I am looking to see if there are any more good free software 
SDL apps to port over.  Most of the games I've ported were primarily 
mouse-based games - they are now touch-based games.  So the aesthetics 
have not changed that much for those particular games.  In addition to 
this, most of the games I've ported have had a fairly simple graphical 
library dependency - on SDL.  In the future I might port games with more 
of a keyboard (or arrow key) dependency.  I also might port games which 
have more of a direct OpenGL dependency.
&lt;/P&gt;

&lt;P&gt; I also am interested in expanding the existing games I have.  I am 
interested in doing more work through the Java/C++ JNI bridge in the 
games I have already done.  I also am thinking about how to handle 
different languages and internationalization.  Android's bionic library 
can not handle locale.  This means gettext and its portable object (po) 
and message object (mo) files do not work out of the box.  Garen 
Torikian has been nice enough to give me some advice about this, and I 
might do translations in something along the lines of how he did it in 
&lt;a href= 
http://github.com/gjtorikian/Neverball-ME/tree/sdl_android/project/jni/application/neverball/android&gt;Neverball 
ME&lt;/a&gt; &lt;/P&gt;</description>
  </item>
  <item>
    <title>Some success...</title>
    <pubDate>Tue, 27 Mar 2012 22:56:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2012/03/27#20120327</link>
    <description>
So I have had a little bit of success.  In December of 2011, I was on 
one ad network, Admob, and made $6.66 for the whole month from them.  I 
am now on three ad networks (Admob/Adsense, Millennial Media, Inmobi), 
and I made $7.62 from them in the past two days, more than the whole 
month of December.  I would like to increase that in the future, but for 
now, $100 a month coming in is great.  Of course, I want to roll as much 
of that as I can back into the business.
&lt;P&gt;

The breakthrough happened in late January.  I have written Android apps 
from scratch like Bouncer and Love Poems, and I ported an open source 
Java library to Android with Panacea Database.  Looking at a 
full-fledged open source Android project, FBReaderJ, I noticed some 
modifications I could make to it to improve it, and that would be for an 
audience without much overlap with the existing FBReaderJ audience.  
FBReaderJ is GPL licensed, which worries some people, but me less.  
Anyhow, I released my version of the app, &quot;Free books to download &amp; 
read&quot; on January 24th.  By the last day of January, 2425 installs a day 
were happening; by February 5th, 11000 installs a day were happening.  
Daily installs ranged from over 8000 to over 11000 a day until February 
20th.  The install rate is still over 2000 a day.  As is normal, the 
percentage of active installs has been going down over time, but it is 
still over 35%.  It currently has over 119600 active device installs.  
There is currently one ad - right before someone goes to a book - it has 
been requested from 13000 to 23000 times a day over the course of the 
past two weeks.

&lt;P&gt;

Having had success with modifying an open source project, I doubled 
down, and on February 12th I released a modified version of OI File 
Manager, another open source Android project.  I chose it because it was 
open source, because I had thought of doing a file manager for a while, 
and because it had a wide appeal - it is not a niche product like 
Panacea Database or Bouncer, many people can find it useful.  I wanted 
to release another app with wide appeal to ride the wave of Book Reader.  
And it did so: it has over 4239 active device installs, which among my 
five apps is second only to Book Reader.  And that has been achieved in six 
weeks, while I have been working on apps like Bouncer for ten months.

&lt;P&gt;

I do have my eye on one more Android open source project, but I have 
turned back to doing an original project.  It uses Andengine, but is 
actually an app, not a game.  It is original as far as I know, nothing 
else on Android does it in the manner mine will, which is much better 
than the handful of existing ones that are related to this app.  I have 
to see how much work I am going to do on it before releasing it.  It is 
more toward a niche product than a general one, but it is not a small 
niche.  Anyhow, much work to be done on it, although I already have a 
decent prototype for one implementation of it.

&lt;P&gt;

Book Reader was making over $20 a day when the downloads were first 
flying.  Also, I had an ad on the page seen when the app was opened for 
the first time, which I now do not have - although I may put that back.  
Anyhow, I rolled $100 of that Admob money into ads.  While I was running 
my ads, Admob dropped their minimum ad bid to $0.01 a bid.  So I dropped 
my bid to that.  The money went mainly to buying ads in Brazil for the 
File Manager.  Ads seem to boost downloads from the target market even 
when they're not running.  I don't know all the variables which cause that, 
although I can guess some of them.  Anyhow, I now have over 1000 active 
users from Brazil for File Manager that I probably would not have had 
otherwise.  Were they worth ten cents a head?  Well, the initial buys were 
overpriced before Admob's price drop.  Also, it was something of a test.  
Also, I want to roll my profits back into the business and couldn't 
think of a better thing to spend it on.  Even with that $100 spent, I'm 
still getting over $350 from February Admob profits for Book Reader.  
That kind of money came from the initial pop; I'm now more at the 
$100 a month level, as I said before.  Although if I had more ads in the 
Book Reader app, I could probably make more.  Although I want to avoid 
having ads over the actual book, as that is annoying.

&lt;P&gt;

In terms of running Admob ads - you can choose the devices to target, 
the SDK version, the country (and sometimes more specific location), 
whether to target mobile, wifi or both, gender and age group.  Transferring 
$50 or more from the money you are owed into running ads gets you a small 
bonus of free ads.  Each campaign is $10 a day minimum.  Minimum bid 
nowadays is 1 cent a click.  You can see conversion rates for app 
installation for app download ads.

&lt;P&gt;

The annoying part for Admob is the approval process.  First you have to 
get approved to be able to transfer money from your balance to ad 
campaign budget.  Then campaigns have to be approved.  After I was 
approved for balance to budget transfers, I transferred $50 and 
submitted a campaign.  A week later it still sat unapproved, so I sent 
them an e-mail, then it was approved.  Contrast this to Millennial 
Media, who approved a campaign for me recently within hours.  You'd 
think Admob would be more responsive to me wanting to give them my 
money.

&lt;P&gt;

So on that Millennial Media campaign - I noticed a few days ago that the 
paltry sum I made in February from Millennial Media had been put into my 
balance.  The sum was paltry because I was not even signed up with 
Millennial in the beginning of February.  Anyhow, I took the dollar or 
two and put it into a campaign in Norway for File Manager.  It was 
approved within hours, which was the good part.  One downside was the 
minimum 5 cent bid - 5 times what Admob does.  Also the targeting is 
not as precise for kinds of device and such.  You can target by country 
though, which I did.  I wonder if &quot;Android&quot; goes out to Kindles, Nooks 
and the like, I hope not as it would be wasted money.  Anyhow, my $1.20 
daily budget was filled and I got 24 clicks.  I'll probably do a bigger 
one next month for MM when my March money clears, maybe for different 
countries.  Another nice thing about MM is I'm not stuck with $10 a day 
campaigns!  But unlike Admob, MM keeps the money you earn for two months 
plus instead of one month plus, so I may as well roll the money back 
into ads.&lt;P&gt;

I signed up for Inmobi as well, but you have to talk to them or 
something to get approved to transfer money from balance to budget.  
It's not worth it at this point.&lt;P&gt;

I also might do Adsense for mobile ads.  I'll have to see.  I should get 
the $350+ by the middle of next month, so I have some ideas for the 
money.  I might spend some money for a contractor to do some work on 
Book Reader - which I plan on using myself and sending back to FBReaderJ 
as well.
&lt;P&gt;

I had used Admob as my sole ad network prior to January.  One reason I 
chose them is they were known to be reliable about sending checks - in 
fact, they already sent me one last year.  Also, they have a low check 
sending threshold - if you make $20 in a month, which I'm now easily 
doing.  They also send the money within one month plus.  If I made money 
on ads on January 1st, or January 30th, that money would get sent to me 
on March 1st and would arrive, usually around March 15th in Paypal.  For 
Millennial Media and Inmobi, the amount of time is longer.
&lt;P&gt;

But anyhow I wanted other ad networks.  For the sake of redundancy for 
one - if there was some problem with Admob, I'd still have two other 
sources of income.  Also, perhaps I'd get some better deals, or extra 
functionality, which I have gotten.  Also, I like the idea of keeping 
some competition open for the ad networks - it benefits developers to 
have a few competing ad networks out there.  I read a report which said 
the four ad packages the top Android developers most often include are 
Adwhirl, Admob, Inmobi and Millennial Media.  That dovetailed 
with what I had heard already so I went with Inmobi and Millennial 
Media.
&lt;P&gt;

Inmobi seems to do everything manually, and even over the phone.  My app 
approval seemed to be in limbo until an e-mail back and forth.  Then I 
had a phone conversation, where the rep said they wanted me to push up 
the number of requests I was getting as they thought it was too low.  
This conversation happened a month ago.  I said my Book Reader got a lot 
of hits so submitted that.  It was pending, then they said they wanted 
more info on my address etc., so I put that in and it is still pending.  
Not that I mind much, I submitted the app at their urging, to some 
extent.  As I said before, to be able to transfer earnings balance to an 
ad budget requires manual intervention as well.  Well, Admob and 
Millennial Media are more responsive without hassle, so I'll deal with 
them more in terms of buying and selling ads for the time being.  Inmobi 
is still the primary target for File Manager ads though, with MM and 
then Admob as fallback, and 80% of traffic is directed to Inmobi via 
Adwhirl right off the bat.  Aside from responsiveness, I'd need to make 
$1.67 a day from Inmobi to get a monthly check from them, and right now 
that is more like 28 cents a day, so I haven't even hit that minimum yet 
with them (or Millennial, which is about $1.03 a day).
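&lt;P&gt;
The payout arithmetic above can be sketched in a few lines of Python.  
This is only an illustration: the $50 threshold is an assumption inferred 
from the $1.67 a day figure (about $50 over a 30-day month), and the 
function names are made up for the example.

&lt;pre&gt;
```python
def daily_revenue_needed(payout_threshold, days_in_month=30):
    """Average daily revenue needed to reach the threshold within a month."""
    return payout_threshold / days_in_month


def days_to_payout(payout_threshold, daily_revenue):
    """Days needed to reach the threshold at a steady daily revenue."""
    return payout_threshold / daily_revenue


# Assumption: a $50 payout threshold, inferred from the $1.67 a day figure.
ASSUMED_THRESHOLD = 50.0

print("Needed per day: $%.2f" % daily_revenue_needed(ASSUMED_THRESHOLD))
print("Days at $0.28 a day: %.0f" % days_to_payout(ASSUMED_THRESHOLD, 0.28))
```
&lt;/pre&gt;

Under that assumed threshold, 28 cents a day would take roughly six months 
to add up to a single check.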

&lt;P&gt;

I suppose eCPM, RPM, CTR, etc. are important in differentiating ad 
networks, but one overriding thing is fill rates.  Admob and Adsense 
integration has been increasing as time goes on.  Other than it taking a 
day for clicks, CTR, eCPM and revenue to update (but not impressions or 
fill rate), the two are very integrated.  And for normal apps, the fill 
rate for this is usually over 98%, if not 99%.  As opposed to this, 
Inmobi has had a 21-54% fill rate for me over the past two weeks.  
Millennial, which is getting a fraction of the direct File Manager 
traffic Inmobi gets, but which does get its run off, has had a 77-86% 
fill rate for the past 9 days.  The major slackoff from them is for 
countries like Brazil and Poland, where they don't yet have the presence 
Google can afford.  But for the US, France, Germany, Japan etc., 
their fill rates have been on par with Admob's.  With Adwhirl, lower 
fill rates are not as big a deal, but it takes seconds for Adwhirl to 
miss an Inmobi ad, and the Millennial ad, and then maybe even an Admob 
ad before putting up an Admob &quot;Adwhirl&quot; ad, and by that time the 
Activity with the ad may have been clicked off.</description>
  </item>
  <item>
    <title>Happy New Year</title>
    <pubDate>Sun, 01 Jan 2012 19:53:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2012/01/01#20120101</link>
    <description>
My New Year's started out the right way: one of my apps, &lt;a 
href=https://market.android.com/details?id=com.panaceasupplies.android.panaceadb&gt;Panacea 
Database&lt;/a&gt;, crossed the 5000 download mark.  It's kept to a 40%+ 
active/net install base as well; hopefully with some of the updates 
coming down the pike it will maintain, or even improve, that percentage.</description>
  </item>
  <item>
    <title>Profit</title>
    <pubDate>Thu, 17 Nov 2011 18:58:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/11/17#20111117</link>
    <description>&lt;P&gt;

Looked at &lt;a href=http://www.admob.com&gt;Admob&lt;/a&gt; today, I finally pushed 
past $25 in payments from my Android applications.  $25 was the one-time 
fee I paid to get on Android Market.  So I've made $25.16 from my three 
mobile apps so far, and am now 16 cents in the black.  Admob sends you 
money when you hit $20 for a month, so in December I should be getting a 
check for October and before.  In addition to the Admob money, Samsung 
was also nice enough to give me a free $500 value 10.1 inch tablet to 
write tablet-sized apps on.  And with my latest update of Bouncer out 
this morning, all three of my apps now handle &quot;extra-large&quot; displays, as 
Android calls them.&lt;P&gt;

I was contemplating that I'm now in the black this morning, and felt 
good about it.  My thought in terms of my business of putting out 
Android apps revolves around having no recurring capital costs, and if 
at all possible, no capital costs at all.  Particularly in terms of some 
web page that an app must contact that I'd have to pay $10 a month or so 
for.  Right now I just code the app, push it to Android Market, and 
collect the ad money.  Aside from the slow wear on my keyboard, mouse, 
screen etc., the only expense is my time.&lt;P&gt;

I wrote a framework for a spreadsheet, and did a number of spreadsheet 
features for it.  Then I worked on getting pre-2007 Excel files onto it, 
which I did.  Then I worked on getting Excel 2007 and 2010 (.xlsx) files 
onto it - and got stuck.  There are two possible paths to fixing this, 
an easier one if I can get things down to fewer than 65,536 methods, and 
a harder one if I can't.  I took a shot at the easier path, and that 
just might not be possible, even though I got rid of a lot of methods.  I may be 
able to pare down a few more.  If not I'll have to take the harder 
route.  Anyhow, I &lt;a 
href=https://github.com/dennis-sheil/android-spreadsheet&gt;put the code 
up&lt;/a&gt; on Github.&lt;P&gt;

A month ago, I finished rewriting the layout of Panacea Database for all 
major (and minor) device sizes and screen densities.  Then I added a 
feature to remember the last file opened.  I did some testing and QA on 
the last file feature, but perhaps not enough, as it seems there have 
been some crashes since then which probably pertain to that.  Which I am 
looking into.  People seem to want column sorting, which I can work on 
implementing.  I might throw in some SQLite stuff, depending on how easy 
it would be.
&lt;P&gt;
So all of my apps have decent layouts for all major (and most minor) 
devices, which I am happy about.  So now I am on to my new apps, as well 
as fixing bugs and implementing new features in Panacea Database.</description>
  </item>
  <item>
    <title>Another Android application</title>
    <pubDate>Sun, 09 Oct 2011 22:11:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/10/09#20111009</link>
    <description>&lt;P&gt;

I released another Android application - &lt;a 
href=https://market.android.com/details?id=com.panaceasupplies.android.lovepoems&gt;Love 
Poems&lt;/a&gt;.  It took off initially - by the fourth day there were 442 
downloads, with 280 of them active installs.  But then that slope of 
adoption leveled off, it fell in the Market rankings etc.  Not sure what 
hurt it - I did an update allowing users to increase or decrease the 
text size, and around then someone gave the app a two-star rating.  It then sank in the 
Market rankings and downloads leveled off.  A few days later I released 
an update with a few more poems, and also adjusted the text sizes a 
little.  I will do updates in the future, in terms of both poems and 
display tweaking.
&lt;P&gt;

Android is continuing to gain market share.  Here is the browser usage 
seen from various mobile operating systems, according to the web logs of 
the Internet's 7th most trafficked site, Wikipedia:&lt;P&gt;
&lt;img src=/blog/images/chart201110.png&gt;
&lt;P&gt;

As the chart shows, the iPhone and iPad are doing well, as are Android 
smartphones.  Windows Phone 7 is moribund - it is only 0.04% of traffic.  
There is more Android Honeycomb traffic on Wikipedia (0.05%) than 
Windows Phone traffic.  I guess we'll see how they do with Windows 8 and Mango 
which is supposed to launch in 2012, but they are way behind Apple and 
Google.  The modern tablet market is newer than the smartphone market, 
so maybe they'll have a shot at competing there.  I downloaded Windows 8 
preview and developer kit and had a look at it.  Their Store is free for 
developers, although applications are approved first.

&lt;P&gt;

I'm currently developing a fourth app.  Won't reveal all details until 
it's released, but it uses Fragments and the ActionBar.  Android's 
compatibility package does backward compatibility for Fragments but not 
ActionBar, so I am using Jake Wharton's ActionBar Sherlock for backward 
compatibility in ActionBar usage.  I have that all implemented already 
actually.  I haven't done all the happy stuff you can do with tablets 
and Fragments yet, we'll see about that, it's not an essential element 
to the project, but with all the usage of ActionBar and Fragments, 
redesigning it to do that will be easier.  This new app may use SQLite 
as well, so I may be looking into SQLite.

&lt;P&gt;

I was invited to the Android Developer Lab in New York on August 24th.  
It was good - I met some interesting people, and they pointed us in the 
direction of where Android is going, which helps me point my development 
in that direction.

&lt;P&gt;

I've been doing a bit of work on Panacea Database's layout.  I moved a 
lot of stuff into XML.  I'm using scale-independent pixels and 
density-independent pixels as much as possible, as well as adjusting the 
size of buttons by layout weight and that sort of thing.

&lt;P&gt;

One thing I've been doing - I change how many rows I display when 
fetching rows from the database, and the scale-independent pixel text 
size of the display, depending on what screen size I have, what 
orientation I am in, and to some extent, how many dpi are on the 
display.  The way I've been doing this is putting a &quot;gone&quot; TextView in 
the XML, and from my code, reading the number of rows to display from 
that.  Not sure if it's best practice, but it works - if I find a better 
way I'll do that.</description>
  </item>
  <item>
    <title>Android development</title>
    <pubDate>Sat, 09 Jul 2011 22:59:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/07/09#20110709</link>
    <description>&lt;P&gt;
According to Alexa.com, Wikipedia is currently the 7th most trafficked 
web site.  They are also one of the few large web sites to allow 
everyone glimpses of their web log analysis.  I mentioned this in a 
previous blog post.  In December 2010, Android devices made up 0.78% of 
Wikipedia's web traffic.  At the end of May 2011 (June numbers are not 
done yet) that was up to 1.16%.  So Android traffic on Wikipedia 
increased about 48% in six months.&lt;P&gt;

Actually, the six month increase of about 48% from December to May was 
more-or-less matched by the one month increase from November 2010 to 
December 2010, which was a 47% increase in traffic.  I guess a lot of 
people got Androids in their Christmas stocking, or next to their 
Hanukkah dreidels...
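&lt;P&gt;
For anyone checking the arithmetic, here is a minimal Python sketch of the 
relative-growth calculation.  It takes December's share as 0.78%, which is 
an assumption: it is the reading of the figure that reproduces the "about 
48%" increase.  The function name is made up for the example.

&lt;pre&gt;
```python
def percent_increase(old, new):
    """Return the relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


# Android's share of Wikipedia traffic, in percent of all requests.
# The December value of 0.78 is an assumption (see the note above).
december_2010 = 0.78
may_2011 = 1.16

growth = percent_increase(december_2010, may_2011)
print("December to May growth: %.1f%%" % growth)
```
&lt;/pre&gt;

Note that this is growth relative to the December share, not the difference 
in percentage points (which would be 0.38).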
&lt;P&gt;

So anyhow, I released my second Android application, &lt;a href= 
http://market.android.com/details?id=com.panaceasupplies.android.panaceadb&gt;Panacea 
Database&lt;/a&gt;, on June 11th.  I definitely followed the Release Early, 
Release Often philosophy for this one - I got the idea for it on June 
7th, and by June 11th it was published. &lt;P&gt;

I guess it helps that another party wrote a nice Java library, and that 
seven months before, someone else posted a bug report against it whose 
fix took care of all the Android bugs.  Thanks Miha Pirnat, wherever 
you are!&lt;P&gt;

What it does is iterate over table rows and run searches in Microsoft 
Access files on Android - Access 2000 to 2007 files, with 
a lot of Access 2010 working as well.  I actually just sent a patch in to the 
library people to fix a bug.  Or implement a kludge to get around the 
bug anyhow - until I'm interested in dealing with Attachment data types, 
they'll have to write a fix.&lt;P&gt;

So both my apps have passed through the 500 download point.  Bouncer has 
a 41% active/total install ratio, Panacea Database has a 57% install 
ratio.  Why is that?  Well to quote a critic on the Android Market, 
Silas, &quot;Move to SD card!!&quot;  Bouncer has a lot of PNGs and JPGs and is 
3.8MB.  Maybe I will move some of that to the SD card, who knows?  It's 
an issue I have to figure out how to deal with.&lt;P&gt;

My Admob revenue for the last week is 79 cents, $1.52 for the week 
before that, and $1.28 for the week before that.  My first goal is $100 
a month in revenues.  Whether that be by ads, sales or whatever, it does 
not matter.&lt;P&gt;

Initially I thought of just tossing out apps left and right and seeing 
what stuck.  But you put an app out and you have to maintain it.  And 
I'm just one person.  For now anyhow.  I don't want lots of one-star 
ratings for my apps on Android Market.  The lowest I've gotten were two 
three-star ratings for Panacea Database.  One wanted me to fix the bug 
where next-lines in a text data type would make a button disappear.  
I've partially patched that already, and have a full patch for that 
(hopefully) that I will release, oops, I mean publish, soon.</description>
  </item>
  <item>
    <title>A Guide for the Android Developer Guide</title>
    <pubDate>Mon, 20 Jun 2011 00:15:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/06/20#20110620</link>
    <description>
I wrote &lt;a href=http://www.vartmp.com/dev/android-development.html&gt;A Guide 
for the Android Developer Guide&lt;/a&gt; which attempts to translate Googlese 
into English</description>
  </item>
  <item>
    <title>Bouncer, my first Android application</title>
    <pubDate>Tue, 31 May 2011 14:13:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/05/31#20110531</link>
    <description>
&lt;P&gt;

So, I have published my first Android &lt;a 
href=https://market.android.com/details?id=com.panaceasupplies.android.bouncer&gt;app&lt;/a&gt; 
(the concept for which someone else described to me).  What have I 
learned about Android development and such since then?

&lt;P&gt;

My first (unpublished) Android app was heavy on ListView.  It was a tree 
of ListViews really - the top ListView went into sub-trees of 
ListViews, until a leaf node on the bottom was reached, which might be 
something else.  I filled out the onCreate method, and an 
onListItemClick method.

&lt;P&gt;

The first screen of my new app was initially going to be a GridView.  I 
then gave up on that.  I then created two activities which could go back 
and forth to one another via clicks (listened to with OnClickListener) 
via Intents.  Then I had them pass information to one another in the 
Bundles.  So now I can pass messages to my sub-trees via the Bundles, 
and they can be separate Activities.
&lt;P&gt;

Having dropped the GridView, I tried out the TableLayout, which I 
eventually went with.  So now I had my grid-like table of letters on the 
first screen, able to pass which letter was pressed via a bundle in the 
Intent to another Activity.  I used Buttons for these letters.

&lt;P&gt;

I then wanted there to be a tab on the front screen, with the table of 
buttons in the primary tab, but with people able to tab over to the 
&quot;About&quot; tab.  So I made the first activity a TabActivity, and opened the 
Activity with the table with an Intent.

&lt;P&gt;

I then wanted to change the color of the buttons, but found out it was 
not all that simple, and learned about 9-patch drawables and the like.  
So I created my own buttons, which needed their corner rounding to be 
specified and the like.

&lt;P&gt;

Google suggests you put an End User License Agreement in the 
application.  There is a standard class to do this, so I put it on the 
application.

&lt;P&gt;

Ultimately, I want my app to cover all 50 of the US states, as well as 
the District of Columbia (Washington D.C.).  Currently, it covers 46 of 
the 50.  I had the current ID for 46 of the states; at this point in 
development I started putting up older licenses that may still be valid.

&lt;P&gt;

Most of this time I was designing for a high density, normal size screen 
in a vertical position.  About 17% of people using Android devices have medium 
density screens, however.  Also, some people flip from vertical to horizontal 
mode; I even encourage this flipping in the application when the full 
image is about to come on the screen.  So I did some work on making it 
at least function with medium density setups, and for high density 
setups when viewed horizontally.  I get the display metrics, and then 
call different layouts depending on what the metrics are.

&lt;P&gt;

When to release is always an open question.  &quot;Release early, release 
often&quot;, agile development and so forth is the popular credo, and I agree 
with it for most applications.  On the other hand, you can't release too 
early, especially since Android Market has a rating system and so forth.  
But at this point, I felt I had enough, and it didn't look like I would 
get anything for the last four holdout states in the next few days, 
so I decided 46 was enough to be useful, that the layout looked decent for 
most phones, and was at least usable for almost all phones.  So I 
released.

&lt;P&gt;

One thing I did not do when releasing was release the initial version 
with ads.  Why?  Because Admob wants to know where the app is on Android 
Market to give you an ad code, and I had nothing up there yet.  I later 
realized I had misunderstood due to my unfamiliarity with all of this; 
I could have put an ad in the initial version.  Within a few hours of 
publishing version 1.0.0, I released 1.0.1, which contained Admob ads.

&lt;P&gt; 

It's been 28 hours since I released the initial version, and 15 hours 
since I released the version with ads.  Thus far I have had 78 downloads 
of the app from Android Market, and have had 55 ad impressions served.

&lt;P&gt;

In subsequent versions I plan to improve the application.  I will work 
to get the four missing states, and the District of Columbia.  I will 
put more information about identification.  I might put a bubble up 
announcing updates, but I wouldn't want it to be too annoying.  I also 
have some kludgey stuff in the layout files which hopefully I can clean 
up, as I learn the Android API better these things can be more smooth.

</description>
  </item>
  <item>
    <title>Android</title>
    <pubDate>Fri, 22 Apr 2011 20:05:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/04/22#20110422</link>
    <description>&lt;P&gt;
I have been looking over Android's API and have been writing an Android 
application with Eclipse.&lt;P&gt;
Android use has started to take off in the past months.  I have looked 
at various metrics, one I like is &lt;a href=
http://stats.wikimedia.org/archive/squid_reports/2011-03/SquidReportOperatingSystems.htm&gt;from&lt;/a&gt; 
the Internet's 8th most trafficked site, Wikipedia.  It shows the 
growth of Android use over the past six months:&lt;P&gt;
&lt;img src=/blog/images/phones.png&gt;&lt;P&gt;
The graph y-axis is the percentage of all browsers coming in - mobile, 
desktop and whatnot.  X-axis is the time period of usage - the past six 
months.  The OS versions are listed in the key, although &quot;Mobile other&quot; 
is a catch-all. 
&lt;P&gt;
In October 2010, 0.47% of all hits to Wikipedia came from Android 
phones.  In March 2011, 0.98% of all hits to Wikipedia came from Android 
phones.  So that has more than doubled within the past six months.</description>
  </item>
  <item>
    <title>Evince, Ubuntu, python, etc.</title>
    <pubDate>Wed, 23 Mar 2011 00:17:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/03/23#20110323</link>
    <description>
&lt;P&gt;

I have been corralled into doing some programming in python.  So at one 
point I decided to write a do-while loop and learned - python has no 
do-while loops.  Terrific.
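&lt;P&gt;
The usual workaround is an infinite loop with the exit test at the bottom.  
Here is a minimal sketch of that idiom; the digit-counting function is just 
a made-up example of a loop whose body has to run at least once.

&lt;pre&gt;
```python
def count_digits(n):
    """Count the decimal digits of a non-negative integer, do-while style."""
    digits = 0
    while True:
        digits += 1        # the body runs at least once, so 0 has one digit
        n //= 10
        if n == 0:         # the "while" test sits at the bottom of the loop
            break
    return digits


print(count_digits(0))
print(count_digits(12345))
```
&lt;/pre&gt;

A plain while loop with the test at the top would report zero digits for 0, 
which is the case a do-while is meant to handle.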
&lt;P&gt;
My &lt;a href= 
http://git.gnome.org/browse/evince/commit/?id=2bcc52f832a05b0d4145940cb501143f2530ad56&gt;patch&lt;/a&gt; 
made it into Evince 2.91.92, I'm officially a Gnome contributor, yay.  I 
patched a bug while chasing down another bug.  Carlos couldn't reproduce 
it - I wonder when it manifests itself.  The bug crept in in December, 
and not many people are running a version of evince released since then, so the pool 
of people who could reproduce it is limited.  Carlos fixed up my patch so that it 
wouldn't cause problems going in.  I still have to fix that original 
bug.  Actually, I already did, but the fix is trivial, and I want to 
look over my code again to make sure it's decent.
&lt;P&gt;
I also &lt;a href=https://bugs.launchpad.net/evince/+bug/708404&gt;patched&lt;/a&gt;
the evince package for the upcoming Ubuntu 11.04.  It was a suggested 
backport of a commit.  Again, my patch had to be massaged in.  I changed 
the Ubuntu documentation for patches so as to point to the complete 
method of doing a patch.
&lt;P&gt;
I know people make use of git branches, but I never really used them until 
recently.  They are very handy, especially if you're doing a lot of work on 
something.  I will surely be using them more in the future.</description>
  </item>
  <item>
    <title>Blunder's PGN-to-FEN converter nearing completion</title>
    <pubDate>Thu, 20 Jan 2011 02:40:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/01/20#20110120</link>
    <description>&lt;P&gt;

The minor re-design, or major refactoring, of 
&lt;a href=http://blunderchess.sf.net&gt;Blunder's&lt;/a&gt; PGN-to-FEN 
converter was finished three days after my last blog post about it.  It 
went very well: the new code which replaced the old code is more 
abstract and flexible, looks better and works better.  Funny how these 
things go together - it seems good coding practices solve a lot of the 
headaches of coding and things begin working automagically.

&lt;P&gt;

I mentioned problems in Lutz Tautenhahn's PGN-to-FEN converter in my 
last blog post.  After writing it, I decided to e-mail him a bug report.  
Within 14 hours he fixed the bug and posted new code, which I tested, 
and both problems were fixed.  So Lutz's converter is now working 
without problem, as far as I can see.

&lt;P&gt;

I've fixed many things in the PGN-to-FEN converter since the 
redesign/refactor.  I check in every (?) manner if a move would put the 
king in check.  I now handle many (all?) en passant scenarios.  I also 
now deal with PGNs where a FEN position in the middle of a game is 
given, and where the subsequent moves are from that position (i.e. we 
start in the middle of the game).  I made other changes as well.
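
&lt;P&gt;

For the en passant part, the FEN field itself is simple to work out.  
Below is a rough Python sketch (not Blunder's Java code; the function 
name and arguments are invented for illustration).  It assumes the 
target square is recorded after every two-square pawn advance, whether 
or not a capture is actually possible:

&lt;pre&gt;
```python
def en_passant_target(from_square, to_square, piece):
    """Return the FEN en passant target square, or "-" if there is none."""
    if piece not in ("P", "p"):
        return "-"
    from_file, from_rank = from_square[0], int(from_square[1])
    to_file, to_rank = to_square[0], int(to_square[1])
    # Only a two-square pawn advance on the same file creates a target.
    if from_file != to_file or abs(to_rank - from_rank) != 2:
        return "-"
    # The target is the square the pawn passed over.
    return from_file + str((from_rank + to_rank) // 2)


print(en_passant_target("e2", "e4", "P"))
```
&lt;/pre&gt;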

&lt;P&gt;

I just made my most satisfying commit since the redesign/refactoring.  
It was the fruit of other commits before.  First I began marking games 
on the linked list as I went, not all in the beginning (which caused an 
initial delay when parsing large PGNs with many games).  Then I pushed 
code into the Game class that I had wanted to push there for a while.  
All of this allowed me to do the latest commit.

&lt;P&gt;

I was reading the entire PGN into a linked list in the PGN class, and 
then pushing the entire linked list into other classes like Game.  As 
Game only needs one game, I created a second, short linked list with 
only one game, and pushed that to Game.  As the original data on the 
first, long list is no longer needed, I removed it.  I am always dealing 
with the head of the linked list in these cases.  Anyhow, this process 
of dealing only with the head of the larger linked list, and shrinking 
it as the program goes, has made the program over ten times faster.
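
&lt;P&gt;

The idea of consuming the big list from its head can be sketched in a 
few lines of Python.  This is only an illustration, not the actual Java 
code: the function name is invented, and it assumes every game starts 
with an Event tag line.

&lt;pre&gt;
```python
from collections import deque


def split_games(lines):
    """Yield one game at a time, consuming lines from the head of the queue."""
    queue = deque(lines)
    while queue:
        game = []
        while queue:
            # An Event tag after we already hold lines starts the next game.
            if game and queue[0].startswith("[Event "):
                break
            # popleft() shrinks the long list as each line is handed over.
            game.append(queue.popleft())
        yield game


pgn = ['[Event "A"]', "", "1. e4 e5 1-0", "", '[Event "B"]', "", "1. d4 d5 0-1"]
for game in split_games(pgn):
    print(len(game), game[0])
```
&lt;/pre&gt;

Each game gets its own short list, and the long list never has to be 
passed around whole.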

&lt;P&gt;

So what more is there to do?  People have their own bizarre 
implementations of the PGN format.  I handle many of them, but there are 
a few more out there I might do.  All of the code is working, but I 
might clean up some of it so that it is easier to read and cleaner.  I 
also might work on a user interface other than running the jar file from 
the command line.  I might also discover some edge cases of the en 
passant sort that I am not dealing with.  I have tested tens of 
thousands of games, and have been looking over FIDE chess rules, the 
specifications and so forth, so I don't think there will be much more of 
this type.  The program is in decent enough shape right now, I guess I 
just want to deal with a few more of the oddball PGN implementations, 
and fix up the UI a little bit, before I feel this is fully formed in 
its first generation.  But it is working pretty well as it is.</description>
  </item>
  <item>
    <title>Refactoring...</title>
    <pubDate>Tue, 04 Jan 2011 21:30:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2011/01/04#20110104</link>
    <description>&lt;P&gt;

Since I have a long way to go before becoming a good programmer, I 
sometimes refer to Code Complete, The Mythical Man-Month and the like to 
keep me on the right track.

&lt;P&gt;

I think I have reached that point, of throwing away the first one 
built, with the &lt;a href=&quot;http://blunderchess.sf.net&quot;&gt;Blunder&lt;/a&gt; PGN to 
FEN chess translation component I have been programming for the past 
month.

&lt;P&gt;

To be honest with myself, I foresaw these design problems back when I 
originally did the design.  I knew I would have to deal with many of the 
things I am dealing with now way back when I was doing the original 
design (although not totally - checking that a piece is pinned to the 
king is more important than I thought it would be, if I thought of it 
at all).  The thing is, designing the program with all of that in mind 
would be &quot;boring&quot;.  It would be too abstract initially, it wouldn't DO 
anything until quite a lot of the program was coded.  The way I 
programmed this, it worked right off the bat - at least with the first 
PGN I used as a basis.  It translated the first ply of the first move 
correctly, and then the next ply of the first move, then the first ply 
of the second move and so on.  After that all worked, I tried another 
PGN.  As I sought to get it working for my various PGNs, I added more 
and more functionality to the program.

&lt;P&gt;

The method functionality I need now seems rather abstract, or at least 
more abstract than the functionality I have now.  &quot;Check to see if piece 
(rook or queen) is pinned to king horizontally&quot;, &quot;Check to see if piece 
(bishop or queen) is pinned to king diagonally&quot;, and so on.  Things are 
a little more abstract than I'd like, but if I try to keep things very 
specific, I will have much, much more coding to do.
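
&lt;P&gt;

To give an idea of what one of these checks looks like, here is a 
Python sketch of the horizontal case only.  The representation (one 
rank as a list of eight characters, &quot;.&quot; for empty) and the function 
name are assumptions made for the example, not Blunder's design:

&lt;pre&gt;
```python
def pinned_along_rank(rank, piece_col, white):
    """Check whether the piece at piece_col is pinned to its king on this rank."""
    king = "K" if white else "k"
    attackers = ("r", "q") if white else ("R", "Q")
    if king not in rank:
        return False
    king_col = rank.index(king)
    if king_col == piece_col:
        return False
    # Direction from the king towards the piece: +1 or -1.
    step = (piece_col - king_col) // abs(piece_col - king_col)
    # Squares between the king and the piece must all be empty.
    for col in range(king_col + step, piece_col, step):
        if rank[col] != ".":
            return False
    # Walk outward past the piece; the first occupied square decides.
    col = piece_col + step
    while col in range(8):
        if rank[col] != ".":
            return rank[col] in attackers
        col += step
    return False


print(pinned_along_rank(list("K..N..r."), 3, True))
```
&lt;/pre&gt;

The diagonal check would follow the same pattern with a step on both 
row and column, and bishops in place of rooks.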

&lt;P&gt;

The program currently does over 95% of PGNs correctly, but there are too 
many possible corner cases to deal with.  The functionality that deals 
with plies (half-moves), which is most of the program, has to be 
rewritten.

&lt;P&gt;

The main thing I focused on with the initial design was the data 
structures.  I did change things around a bit, especially the Board 
class, which is my half-way class between the translation of the PGN to 
FEN.  I also realized while programming that I needed a Move class.  
When functionality got to where over nine out of ten PGNs parsed, I 
wanted to handle PGN files that had multiple games within them - and thus a 
Game class was created as well.

&lt;P&gt;

One nice thing is, aside from the edge cases I have to redesign for, my 
PGN to FEN converter has some aspects that are superior to the two other 
converters I've found out there - Lutz Tautenhahn's PGN-to-FEN converter 
and 7th Sun Green Light Chess's pgn2fen.exe program for Windows (or 
Linux, with WINE).  Tautenhahn's program I tested out more - I saw two 
problems.  First, castling ability which is disabled due to a rook move is 
re-enabled if the rook moves back to the square.  I'm fairly sure this 
is not legal with FIDE rules.  Secondly, if a pawn move results in pawn 
promotion, Tautenhahn's converter does not reset the half-move clock due 
to the pawn move, but in fact increments it.  I believe this is not the 
case with FIDE rules, but am less sure.  As far as the Green Light chess 
converter, I have not looked at it as much as Tautenhahn's, but I do know 
it does not mark en passant squares in the FEN.
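
&lt;P&gt;

The two bookkeeping rules in question can be sketched as follows.  This 
is a simplified Python illustration with invented names, not any of the 
converters' real code, and it leaves out the case where a rook is 
captured on its home square:

&lt;pre&gt;
```python
def update_fen_state(castling, halfmove, piece, from_square, is_capture):
    """Return (castling, halfmove) after one ply; castling is like "KQkq"."""
    # Leaving a king or rook home square removes the right for good.
    lost = {"e1": "KQ", "a1": "Q", "h1": "K", "e8": "kq", "a8": "q", "h8": "k"}
    for flag in lost.get(from_square, ""):
        castling = castling.replace(flag, "")
    # Pawn moves (promotions included) and captures reset the clock.
    if piece in ("P", "p") or is_capture:
        halfmove = 0
    else:
        halfmove += 1
    return (castling or "-"), halfmove


state = update_fen_state("KQkq", 3, "R", "h1", False)
print(state)
# Moving the rook back later does not restore the lost right.
print(update_fen_state(state[0], state[1], "R", "g1", False))
```
&lt;/pre&gt;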

&lt;P&gt;

Blunder's converter marks en passant squares, disables castling 
availability properly, and resets the halfmove clock on all pawn moves - 
even pawn promotions, which I believe is the correct behavior.  Now I 
just have to redesign and abstract the methods that deal with converting 
a ply to a new configuration for my Board object.  Which is most of the 
methodology for the program.  I might tinker a little more with the data 
structures, perhaps making them a bit more robust.</description>
  </item>
  <item>
    <title>Blunder, Chess, Java, Architecture and Construction</title>
    <pubDate>Mon, 06 Dec 2010 13:25:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/12/06#20101206</link>
    <description>
So, I put Blunder &lt;a href=http://blunderchess.sourceforge.net&gt;up on&lt;/a&gt;  
Sourceforge.&lt;P&gt;
Blunder is a suite of chess-related tools.  Primarily, it helps you go 
over your games, and see where you made mistakes or missed 
opportunities.  You keep looking at the boards where you made your 
biggest and/or most recent mistakes, and keep testing that you now know 
what to do correctly.  Most chess teachers say this is one of the main 
ways to improve your game, and with Blunder it is automated.
&lt;P&gt;

Anyhow, the program has been out for almost a year, particularly the 
main LAMP (Linux, Apache, MySQL, PHP) component.  However, one necessary 
component has been converting files in PGN format (records of games) to 
FEN format (pictures of individual boards at a set point).  I give 
pointers on how to do this, but have not been happy with any of the 
existing tools, and have begun writing my own in Java, with GPL 
version 3 licensing.  This was the impetus to put it on Sourceforge 
actually.

&lt;P&gt;

As I said, Blunder is functional already, particularly the LAMP package 
for going over games.  One necessary component for that to work is PGN 
to FEN conversion, for which there are tools out there.  I am unhappy 
with them, so I am writing my own in Java.  If any Java developers want 
to send git patches, I'd be happy to get them.  This second package 
within the Blunder project is in pre-alpha right now.

&lt;P&gt;

While this has all been done pretty loosely, I decided to try for a 
little bit stricter good practices in the pre-construction part of the 
project.  I have to report - it worked out very well!  I began by 
cheating on the good practices a little - I coded a method that read the 
file into an array.  It was just a detail I didn't want to bother with 
once requirements and architecture were done, as I'd want to get right 
into the construction beyond that first method.
&lt;P&gt;
My requirements were:
&lt;br&gt;
Read in a PGN (from, say, a file), output a series of FENs 
for every move, or a specific FEN for one move.  I might tweak the input 
or output requirements later, but the middle part, converting one to the 
other, will remain the heart of the program.
&lt;P&gt;
I then did architecture.  I sketched out the major classes, their 
responsibilities and their interactions.  Initially there were three 
classes - Pgn, Board and Fen.  I thought about it and realized Pgn 
should have a helper class, Move, and Pgn would have an array of Move 
objects.  Board is primarily an array of characters representing the 
board, and Fen is an output String representing the FEN.  I think it was 
helpful thinking about all of this beforehand more than I usually would 
have.  It saved time in the long run.  Every minute I spent doing this 
right off the bat probably saved a multiple of itself so far.
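
&lt;P&gt;

As a rough picture of the Board-to-Fen step, here is a Python sketch 
that turns an 8x8 array of characters into the piece-placement field of 
a FEN.  The layout (row 0 is rank 8, &quot;.&quot; for an empty square) and the 
function name are assumptions for the example, not the real classes:

&lt;pre&gt;
```python
def placement_field(board):
    """Build the piece-placement field of a FEN from an 8x8 character array."""
    rows = []
    for rank in board:
        row, empties = "", 0
        for square in rank:
            if square == ".":
                empties += 1
                continue
            # Flush the run of empty squares as a single digit.
            if empties:
                row += str(empties)
                empties = 0
            row += square
        if empties:
            row += str(empties)
        rows.append(row)
    return "/".join(rows)


start = ["rnbqkbnr", "pppppppp", "........", "........",
         "........", "........", "PPPPPPPP", "RNBQKBNR"]
print(placement_field(start))
```
&lt;/pre&gt;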


&lt;P&gt;

One mistake I made is instead of making the Board array something that 
would be intuitive to me, I tried to fit its data structure to the other 
existing data structures.  I thought this would make &quot;less work&quot;.  The 
problem is, Board's data structure then became inscrutable to me, and I 
had to bend my mind to figure out what it was, and kept making mistakes.  
I then decided to rearrange Board's data structure to something I could 
intuitively understand, and then use methods to do the conversion 
between it and the other two major data structures.  This has worked out 
much better for me.

&lt;P&gt;

Most of the work left to get the program from pre-alpha to alpha is 
doing the logic (methods) for the various chess moves.  I already have 
methods for PawnMoveNoCapture and KnightMoveNoCapture.  My next method 
will probably be PawnMoveWithCapture - a move where the pawn captures a 
piece.  The program needs methods for all the various moves - Queen move 
(capture and no capture), Bishop move (capture and no capture), Castling 
(Queenside or Kingside) and so on.  This will be the bulk of the work to 
get the program into alpha.
&lt;P&gt;

I am plowing ahead with those methods right now.  There is some code 
duplication within existing methods, but my concern is not with that but 
code duplication between methods - I already created a method to convert 
the letters from the Pgn moves to the numbers the array in Board uses, 
which both existing chess move methods use.  I would like to complete 
moves for all the pieces, when capturing or not.  Anyone who wants to 
send in git patches for these Java methods should feel free, the two 
existing methods can serve as a base.
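&lt;P&gt;
For anyone wondering what that shared conversion amounts to, it is 
roughly the following, shown here in Python rather than Java.  The 
function name and the choice of row 0 for rank 8 are assumptions for 
the sketch; the real Board array may be laid out differently:

&lt;pre&gt;
```python
def square_to_indices(square):
    """Convert algebraic coordinates such as "e4" into (row, col) indices."""
    file_letter, rank_digit = square[0], square[1]
    col = "abcdefgh".index(file_letter)
    # Row 0 is rank 8, so the array prints the way a diagram reads.
    row = 8 - int(rank_digit)
    return row, col


print(square_to_indices("e4"))
```
&lt;/pre&gt;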
&lt;P&gt;
You can grab it from the &lt;a 
href=http://sourceforge.net/scm/?type=git&amp;group_id=381224&gt;project's git 
page&lt;/a&gt; on Sourceforge.</description>
  </item>
  <item>
    <title>&lt;b&gt;Linux desktop/smartphone penetration&lt;/b&gt;</title>
    <pubDate>Mon, 22 Nov 2010 21:15:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/11/22#20101123</link>
    <description>
&lt;p&gt; This Wikipedia &lt;a href=http://en.wikipedia.org/wiki/Usage_share_of_web_browsers&gt;article&lt;/a&gt;
tells you the share of web browsers from different sources,
but clicking through the links you can see what penetrations
OS's running web browsers have as well.  These web
sites give an accounting from their logs of what the OS's
are for the people they're serving pages to.
&lt;P&gt;
W3counter has 1.49% running Linux and 0.25% running Android
in October 2010.&lt;P&gt;
Clicky gives a daily tally, which is 1.25% for Linux today,
and has
been hovering around 1.25% for the past weeks.&lt;P&gt;
Statcounter has 0.78% running Linux since September.  Not
sure what they're counting as Linux or why their Linux count
is so much lower than the others.&lt;P&gt;
Most interesting is Wikimedia, which really &lt;a href=http://stats.wikimedia.org/archive/squid_reports/2010-10/SquidReportOperatingSystems.htm&gt;breaks
down&lt;/a&gt; the statistics.  They sample 1/1000 of their logs,
so every hit they show can be assumed to be multiplied by
about 1000.  They count Linux, for which they include
Android, as 2.04%.  The breakdown is 0.75% Ubuntu, 0.47%
Android, 0.07% SUSE, 0.06% Fedora, 0.05% Debian, and by the
time it gets to Gentoo it is down to 0.02%.  Red Hat, CentOS
and &quot;Linux Motor&quot; (whatever Wikimedia means by that) make
up the rest.  There's even a breakdown of the different
Ubuntu, Fedora and Android versions.  Cool.  It gives you a
general idea of what the penetration rate is anyway.</description>
  </item>
  <item>
    <title>Epdfview patch</title>
    <pubDate>Wed, 17 Nov 2010 08:47:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/11/17#20101117</link>
    <description>
My &lt;a href=http://trac.emma-soft.com/epdfview/changeset/357&gt;patch&lt;/a&gt; 
got committed to the epdfview trunk, cool.</description>
  </item>
  <item>
    <title>Ubuntu and user-focus</title>
    <pubDate>Wed, 03 Nov 2010 23:25:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/11/03#20101104</link>
    <description>
There's a lot of chatter about Canonical, and Unity and the Gnome shell 
and all of that.  There's one thing I love about Ubuntu though and 
that's the user focus.
&lt;P&gt;
I am installing Maverick Meerkat 10.10 into a KVM guest from the ISO.  It gives 
the option to allow network updating while it copies files from CD to 
disk - smart, save the user some time later, very thoughtful.  It also 
pops up a slideshow (which is browsable) showing features of Ubuntu 
while it is copying from the CD to the disk - nice, if the user is in a 
situation where they don't have much to do while waiting for the install to 
finish, show them the system features and educate them about it.&lt;P&gt;
There is a division of labor in all enterprises.  I run the servers, 
sometimes I write the code, I investigate problems.  I don't normally 
think about user desktop Linux experience much, except in an abstract 
way, such as that PDF backend library support could be better so that 
people could render their PDFs better.  It's good there are people out 
there who do.</description>
  </item>
  <item>
    <title>Building GNOME with jhbuild, a.k.a. pain</title>
    <pubDate>Tue, 02 Nov 2010 21:27:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/11/02#20101102</link>
    <description>&lt;P&gt;
Trivia: Who said, less than a month ago, &quot;Getting a jhbuild to finish is next to impossible&quot;?&lt;P&gt;
Answer: &lt;a href=http://blogs.gnome.org/otte/2010/10/06/doing-it-wrong/&gt;Benjamin Otte&lt;/a&gt;.  The #1 committer to gtk+ in the last 1500 commits or so.  The #1 committer to cairo in the last 500 commits or so.&lt;P&gt;
The problem isn't jhbuild so much, although moduleset options could probably be cleaned up a little bit more.  It is with broken stuff in GNOME, or which GNOME depends on.  Luckily for me, &lt;a href=http://www.vartmp.com/blog/images/evince_dependencies.png&gt;my build tree&lt;/a&gt; is not that large.&lt;P&gt;
Poppler is not alone in my .jhbuildrc in ignoring gobject introspection stuff during a build any more - welcome pango!&lt;P&gt;
Then gtk+ won't build.  It was failing on a dependency to fontconfig, which was broken by a commit on October 6th.  Or at least fontconfig's pkg-config metadata hint file was broken, for a number of people who use standard build options (like me), causing gtk+ not to build.  I've emailed the person who made the commit.&lt;P&gt;
I won't go into stuff higher up on the chain that depends on gtk+.  Needless to say there's brokenness.</description>
  </item>
  <item>
    <title>&lt;strong&gt;Epdfview&lt;/strong&gt;&lt;P&gt;</title>
    <pubDate>Sat, 30 Oct 2010 13:13:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/10/30#20101030</link>
    <description>
&lt;p&gt; So, I'm happy to have my jhbuild building poppler and its
dependencies off their latest commit, and then epdfview and
evince on top of that.  Of course, if anything is broken
anywhere down the chain, things fall apart.  I've turned off
a lot of the gobject introspection for now....
&lt;P&gt;
Epdfview is more lightweight than evince, with fewer
dependencies, so I often use it when testing.  Epdfview
currently compiles against the latest of its dependencies
(thankfully no big breakage in gtk+ or glib, as sometimes
happens) and can load some PDFs.  But a number of PDFs it
crashes on.  Gdb says:&lt;P&gt;
&lt;pre&gt;
Program received signal SIGSEGV, Segmentation fault.
[...]
__strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
31      ../sysdeps/x86_64/multiarch/../strlen.S: No such
file or directory.
        in ../sysdeps/x86_64/multiarch/../strlen.S
(gdb) bt
#0  __strlen_sse2 () at
../sysdeps/x86_64/multiarch/../strlen.S:31
#1  0x00007ffff772f502 in g_strdup (str=0x1 &amp;lt;Address 0x1 out
of bounds&amp;gt;)
    at gstrfuncs.c:101
#2  0x000000000040ad46 in
ePDFView::IDocument::setLinearized(char*) ()
#3  0x0000000000411680 in
ePDFView::PDFDocument::loadMetadata() ()
&lt;/pre&gt;

&lt;p&gt; Hmmm.  It took me a little time to figure out why this was
breaking every now and then.  I am compiling against the
latest glib commit - is someone messing with g_strdup or
something?  Eventually, I tracked it down to a &lt;a 
href=http://cgit.freedesktop.org/poppler/poppler/commit/?id=d4a6c17255821925906c17b79b88eebed9edfee1&gt;commit&lt;/a&gt;
in poppler from September 17th.  From the message I guess
they knew it would break the API -
&quot;PopplerDocument:linearized is now a boolean value rather
than string, so this commit breaks the API again.&quot;&lt;P&gt;
So that's simple enough.  I changed the gchar's to
gboolean's, and made some other little changes, and sent a
patch in to jordi at emmasoft, so maybe it will get applied.
 My version is working anyhow...</description>
  </item>
  <item>
    <title>jhbuild, evince/poppler etc.</title>
    <pubDate>Tue, 26 Oct 2010 22:58:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/10/26#20101026</link>
    <description>

So, in the GNOME-and-fd.o(freedesktop.org)-verse, there are a few things I 
want to run from the latest updates - evince, poppler and cairo.  Which 
means I want to run the latest versions of their dependencies as well.  
So I decided to use &lt;a 
href=http://live.gnome.org/Jhbuild&gt;jhbuild&lt;/a&gt; to build it all.  Last 
month, GNOME 
developer Andr&amp;eacute;  Klapper &lt;a 
href=http://blogs.gnome.org/aklapper/2010/09/19/jhbuilding-gnome-3-0-no-fun&gt;wrote 
in his blog&lt;/a&gt; about how little fun it is to build GNOME from the 
latest commits via jhbuild.  Perhaps, but I finally did it - a 
subset of GNOME anyhow.&lt;P&gt;

The default jhbuild moduleset is gnome-3.0, but that builds some of the 
stuff I'm focused on from tarballs, which is supposed to be deprecated 
in jhbuild now anyhow.  So I remove gnome-3.0 from my .jhbuildrc and put 
all of the devel modulesets into my 
.jhbuildrc.  But some of the dependencies were missing - they were in 
the non-devel modules.  So I put all of those into my own moduleset.  As 
my moduleset is local, I set use_local_modulesets to True - even if 
that's not necessary, I git pull from jhbuild before I run a jhbuild, so 
why not do that?  I also put

&lt;pre&gt;
module_autogenargs['evince'] = autogenargs \
                             + ' --disable-nautilus '
&lt;/pre&gt;
into my .jhbuildrc to avoid those headaches with evince.  I skip a 
number of modules people recommend to put in skip, like mozilla, 
although I don't believe they're dependencies in my chain.  I also add a 
few pkg-config paths to .jhbuildrc, on advice from the net.  Of 
course, I also install the packages on Ubuntu that the jhbuild web site 
recommends for Ubuntu 10.10.&lt;P&gt;
Incidentally, here are the jhbuild dependencies for evince (and 
epdfview):&lt;P&gt;

&lt;img src=/blog/images/evince_dependencies_small.png&gt;

&lt;P&gt;
Color code for nodes: green are packages in jhbuild &quot;devel&quot; modulesets,
red are packages in jhbuild &quot;non-devel&quot; modulesets,
brown (libgcrypt and libgpg-error) are also in jhbuild &quot;non-devel&quot; modulesets
and they are tarballs there,
purple are packages in jhbuild &quot;devel&quot; modulesets which other packages might
have a hidden dependency to which is not shown in the current jhbuild
modulesets.  Finally blue are non-GNOME packages that no GNOME module is
dependent on, but which are themselves dependent on some GNOME modules.&lt;P&gt;
I made the above dependency tree with graphviz, a tool which makes doing
such dependency charts really easy.&lt;P&gt;
Everything went pretty swimmingly until I started to reach the top of the
chain.  Poppler busted on some GObject introspection stuff - I installed
gobject-introspection as a jhbuild module and updated the poppler gir
include from Gdk-2.0 to Gdk-3.0 and it went sailing along.&lt;P&gt;
Next up - gtk+ 3.0 broke.  This happened to me a few days before, when I was
taking my first stab at jhbuild.  At that time, I looked at the recent gtk+
code, saw the stuff breaking had changed recently, and did a hard git reset
of gtk+ to a commit from 48 hours before - it installed fine.  This time the
breaking commit was the most recent one.  I went on GNOME's IRC network and tracked down
the developer who made the bad commit, he fixed it and I was sailing along
again.&lt;P&gt;
So now I get to evince.  A few days ago I had some problems with deprecated
combo box calls that had been removed from the dependency libraries, but
there were patches for that in bugzilla.  After patching that, this time I
get an error that a set_scroll_adjustments call is failing.  I look in gtk+ and
see that they have been mucking with scrolling there recently, and figure it
is due to that.  I disable the call and compile.  Evince comes up and I can
look around, but it hangs on loading anything.&lt;P&gt;
I check poppler's test programs and they are working.  So I encapsulate a
lightweight PDF viewer that depends on poppler and gtk+, epdfview, into
my personal jhbuild moduleset and build it against these libraries.
Epdfview comes up, and displays PDFs etc. fine.  Ultimately, epdfview and
evince are dependent upon almost entirely the same libraries, except evince
depends on three more icon-related ones.  And epdfview is working.  So either
evince is broken, or some library it depends on has changed, meaning...
evince is broken, for the moment.  But Epdfview works.  </description>
  </item>
  <item>
    <title>poppler rendering</title>
    <pubDate>Fri, 22 Oct 2010 18:48:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2010/10/22#20101022</link>
    <description>
On Ubuntu, the default method of reading PDFs is with evince, which uses 
the poppler library as its backend for PDFs.&lt;P&gt; 

The &lt;a href=http://www.mta.info/nyct/maps/busqns.pdf&gt;bus map&lt;/a&gt; PDF for 
my area takes 16 seconds to load on my computer.  It is on Ubuntu, on a 
64-bit desktop system with an AMD Athlon(tm) II X2 240 2.8GHz processor 
with two cores, and four gigs of RAM.  The bus map PDF is 751K.  This 
seems far too long.  Epdfview, which also uses poppler, takes 8 seconds 
to render the PDF.  Adobe Reader 9.3.4, on the other hand, takes less 
than 4 seconds, and I'll use that as a benchmark here of what the render 
time should be. &lt;P&gt;

So I looked into it.  Instead of grabbing all of the necessary source 
and compiling with gcc's -pg flag for gprof, I compiled an uncompressed 
kernel and ran oprofile while rendering the PDF.  It didn't show 
everything fully in the oprofile report, so I downloaded the necessary 
debug symbol packages for poppler, cairo etc.&lt;P&gt;

I rendered the PDF with evince, with epdfview, and with poppler's 
poppler-glib-demo in a local poppler library cloned from the latest git 
commit which I compiled manually.  For all of them, oprofile pointed to 
poppler being the library dominating the processor.&lt;P&gt;

So with the Ubuntu package of debug symbols for poppler installed, I had 
the oprofile report look at poppler.  It showed three methods dominating 
processor time - in order they were - TextBlock::isBeforeByRule1, 
TextPage::coalesce and TextBlock::visitDepthFirst.  This was the case 
for all three programs using the poppler library backend.&lt;P&gt;

So I start hacking around with the coalesce method in poppler, when I 
come across this line:&lt;P&gt;
sortPos = blk1-&gt;visitDepthFirst(blkList, i, blocks, sortPos, visited); 
&lt;P&gt; 

I look at the visitDepthFirst method and see it is doing topological 
sorts on the data.  So I comment out the above line within coalesce:&lt;P&gt;

//     sortPos = blk1-&gt;visitDepthFirst(blkList, i, blocks, sortPos, visited);

&lt;P&gt; 

recompile and run it.&lt;P&gt;

So, when rendering the aforementioned PDF with poppler's test program 
poppler-glib-demo WITH the visitDepthFirst method in coalesce, the 
fastest rendering I get is 8 seconds.  When I render it without that 
line, the fastest render I get is less than 3 seconds.  Removing this call 
more than halves my render time. &lt;P&gt;

But perhaps this behavior was due to the PDF itself being unusual.  I 
did a quick search through Google for other PDFs.  As I had this problem 
with a (partial) city map, I looked for other city maps.  I randomly 
found a PDF with a &lt;a 
href=http://www.ratp.info/picts/touristes/photos/plan%20paris-touriste.pdf&gt;map 
of Paris&lt;/a&gt;.  I tested its rendering in poppler-glib-demo, with and 
without the call to the visitDepthFirst method in coalesce.  Fastest 
render time with the method was 3.76 seconds, fastest without it was 2.07 
seconds.  So this random map PDF did not show as significant an improvement 
as my map PDF, but this call, which was not doing anything visibly to 
improve the map display, was still adding more than 50% to the render time.
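
&lt;P&gt;

Working the Paris map numbers through, using the two fastest times 
quoted above (the helper name is made up for the example):

&lt;pre&gt;
```python
def overhead_percent(with_call, without_call):
    """Extra render time caused by the call, as a percentage of the faster time."""
    return (with_call - without_call) / without_call * 100


paris = overhead_percent(3.76, 2.07)
print(round(paris, 1))  # 81.6
```
&lt;/pre&gt;

So on that file the call adds roughly 80% on top of the time taken 
without it.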


&lt;P&gt; As I said, from my cursory look, there was no difference between the 
displayed map which called visitDepthFirst, and the one which did not 
call it.  I saw what the code did and read the comment; for more information 
on its purpose and so forth I began digging through the logs.  I saw 
that the code came from commit f83b677a8eb44d65698b77edb13a5c7de3a72c0f 
on November 12, 2009.  In November 2009, Brian Ewins made a series of 
commits whose purpose was to improve text selection in tables.  This 
particular commit changed the method of block sorting to &quot;reading order&quot; 
via a topological method.  Aside from the comments in the git commits, 
these November 2009 column selection commits were &lt;a 
href=http://lists.freedesktop.org/archives/poppler/2009-November/005266.html&gt;discussed&lt;/a&gt; 
on the poppler mailing list, &lt;a 
href=http://bugs.freedesktop.org/show_bug.cgi?id=3188&gt;as well as&lt;/a&gt; in 
Bugzilla. &lt;P&gt; 

I reverted poppler back to commit 345ed51af9b9e7ea53af42727b91ed68dcc52370 
and compiled epdfview against it.  Then I reverted to poppler commit 
f83b677a8eb44d65698b77edb13a5c7de3a72c0f and compiled epdfview against 
that as well.  Commit 345e... is two commits before f83b...  When I run 
epdfview, using either poppler library as a backend, the version which 
is two commits earlier, 345e..., runs in epdfview in less than half the 
time that commit f83b... runs in.  Commit f83b... more than doubles the 
time to run, with no noticeable improvement in anything for that 
application of using poppler (displaying a map via epdfview). &lt;P&gt;

I mentioned much of this on Freenode IRC channel #poppler, and how 
removing the visitDepthFirst method from coalesce improved rendering 
time enormously.  Someone, I believe it was Andrea Canciani, looked at 
the method and said the two nested loops looked wrong.&lt;P&gt;

There are two for loops in visitDepthFirst.  I put a counter on the 
inside one to see how many times it ran on my 751K PDF.  It ran over 196 
million times!  For every bit in my 751K PDF, the inner for loop ran 32 
times.  Not only that, if the TextBlock data structure blk3 is not equal 
to either blk1 or blk2, the inner for loop will make not one but two 
calls to the isBeforeByRule1 method.  No wonder my map is rendering 
slowly.&lt;P&gt;
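
The arithmetic behind those figures, as a quick check.  It assumes 
&quot;751K&quot; means 751 * 1024 bytes, which is the reading that lands on the 
count above:

&lt;pre&gt;
```python
pdf_bytes = 751 * 1024            # the bus map, taking K as 1024 bytes
pdf_bits = pdf_bytes * 8
inner_runs = pdf_bits * 32        # 32 inner-loop runs per bit
print(inner_runs)                 # 196870144

# Up to two isBeforeByRule1 calls per inner-loop run.
max_rule1_calls = inner_runs * 2
print(max_rule1_calls)            # 393740288
```
&lt;/pre&gt;
&lt;P&gt;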

So this is where it stands now.  16 seconds seems too long for Ubuntu's 
default PDF viewer to load my local bus map - especially when in my 
hacking around I have gotten it to display in a second and a half or so.  
Whatever that topological commit from November 2009 fixed in terms of 
selecting text from tables, it has more than doubled the rendering time 
of some PDFs, especially PDFs with maps.  The solution would be a change 
that would keep the fixes to text selection in tables, but still have 
something near the speed of rendering prior to that commit.  I will look 
into this more when I have the time.</description>
  </item>
  <item>
    <title>Android install</title>
    <pubDate>Wed, 30 Dec 2009 21:51:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/12/30#20091230</link>
    <description>
I am installing the Android SDK.
For Linux, it suggests the Eclipse IDE, at least version 3.4.  I use 
Eclipse 3.5, &quot;Galileo&quot;.  So I am installing the plug-in and get (in an 
almost impossible to read output box)
&lt;P&gt;&lt;pre&gt;
Cannot complete the install because one or more required items could not 
be found.
  Software being installed: Android Development Tools 
0.9.5.v200911191123-20404 (com.android.ide.eclipse.adt.feature.group 
0.9.5.v200911191123-20404)
  Missing requirement: Android Development Tools 
0.9.5.v200911191123-20404 (com.android.ide.eclipse.adt.feature.group 
0.9.5.v200911191123-20404) requires 'org.eclipse.wst.xml.ui 0.0.0' but 
it could not be found
&lt;/pre&gt;&lt;P&gt;

What this translates to is that the Android plug-in depends on another 
plug-in.  So I go to install the webtools/wst xml plug-in, but it needs 
an EMF plug-in.  Then it needs a GEF plug-in.  Finally it will accept the 
webtools/wst plug-in.  Then the Android plug-in can be installed.  This 
sounds easy, but between Eclipse's clunky, non-intuitive GUI and 
Android's documentation not mentioning that their plug-in had 
dependencies, it was not.</description>
  </item>
  <item>
    <title>I have become interested in the poppler library lately (and one of the </title>
    <pubDate>Tue, 22 Dec 2009 02:44:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/12/22#20091222</link>
    <description>programs that depends on it, evince).  Poppler is a PDF rendering 
library.  I have been looking through Ubuntu's bug tracking system on 
launchpad, and people have been complaining how they run Evince on a PDF 
file and it crashes, or at least it doesn't display the file right away.
&lt;P&gt;

The particular bug that was reported through Ubuntu that I am looking at 
is #497175.  The user tried to use evince to look at his PDF, but it 
did not work.  Text that should have been displayed was not displayed.  
He said Xpdf did work on the PDF file, displaying all text.  I 
downloaded the sample PDF file, and saw indeed it displayed the text 
with Xpdf and not evince 2.28.1 (using poppler 0.12.0) on Ubuntu 9.10.  
I tried displaying it with evince 2.22.1.1 based on the poppler 0.6.4 
library.  That worked.  So I figured some time between those early, 
working evince/poppler versions, and the more recent evince/poppler 
versions which broke for the bug reporter (and myself), something must 
have changed that broke this.&lt;P&gt;
I wasted time in two respects looking at this.  One is I looked at both 
evince and poppler.  From my kibitzing of evince and poppler over the 
past months, I have seen over and over that most reported bugs on Ubuntu 
dealing with PDFs and evince are due to bugs in poppler, not evince.  So 
trying different evince versions was a waste of time.
&lt;P&gt;

The second was how I dealt with poppler versions.  I knew poppler 0.6.4 
worked and poppler 0.12.0 didn't, so I downloaded 
poppler 0.10.0 (as the bug was reported recently, I leaned towards a 
more recent poppler version), then compiled evince against it, ran it, 
saw it worked, then began manually downloading other poppler versions, 
compiling them, compiling evince against it, testing it and so on.  
Eventually I saw that poppler 0.11.1 worked and poppler 0.11.2 did not.
&lt;P&gt;
However, I was doing a lot of unnecessary work.  Poppler uses a git 
repository.  I heard about git when it was announced in 2005, and I have 
checked out code from git repositories, and have browsed some git source 
trees over the web, but I have never looked much into it.  Git has a 
cool feature called &quot;bisect&quot;.  Poppler has each release version tagged 
with the release name.  So what I could have done was a git bisect - 
marking 0.6.4 as a good version, and 0.12.0 as a bad version.  Git would 
then have checked out a commit about halfway between these two tags, and 
I would test it to see if it was good or bad.  If it was bad, git would 
next pick a commit around the 25% mark between 0.6.4 and 0.12.0; if it 
was good, it would pick one around the 75% mark.  You keep bisecting, 
halving the range each time, until you get to the bad commit.
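&lt;P&gt;
The idea behind bisect can be sketched in a few lines of Java.  This is 
only an illustration of the binary search, not how git itself is 
implemented: the commits are stood in for by the integers 0 to 100, and 
both the isBad test and the first bad commit (63) are made up for the 
example.  In real life the test is me compiling poppler and evince and 
looking at the PDF.
&lt;pre&gt;
```java
import java.util.function.IntPredicate;

public class BisectSketch {
    // Returns the index of the first bad commit, given that commit
    // "good" passes the test and commit "bad" fails it.
    static int firstBad(int good, int bad, IntPredicate isBad) {
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;   // halfway point
            if (isBad.test(mid)) {
                bad = mid;    // the breakage is at mid or earlier
            } else {
                good = mid;   // the breakage is after mid
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Hypothetical history: commits 0..100, broken from commit 63 on.
        int culprit = firstBad(0, 100, commit -> commit >= 63);
        System.out.println("first bad commit: " + culprit);
    }
}
```
&lt;/pre&gt;
Each test halves the range, so a hundred commits need at most seven 
tests rather than a hundred.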
&lt;P&gt;
I am doing this now, and am down to my last test.  I will mark it good 
or bad, after which we will know which commit caused this problem.
&lt;P&gt;
[...]
&lt;P&gt;
There, I'm done.  Commit ad26e34bede53cb6300bc463cbdcc2b5adf101c2 broke 
it.  The commit changes the CairoOutputDev.cc file.  Before that commit 
the text displays; after the commit it does not.  I updated the Ubuntu 
bug report and reported it to poppler upstream.
&lt;P&gt;
</description>
  </item>
  <item>
    <title>poppler bug</title>
    <pubDate>Sat, 19 Dec 2009 13:50:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/12/19#20091219</link>
    <description>
I am looking at bug &lt;a 
href=http://bugs.launchpad.net/ubuntu/+source/poppler/+bug/436197&gt;436197&lt;/a&gt; 
on the Ubuntu section of Launchpad.  The bug is in the poppler library, 
and usually gets triggered by the evince application.  I am able to 
reproduce it.  The bug is a segmentation fault when evince tries to open 
certain PDF files, or tries to open certain pages in those PDF files.  
There are several bug duplicates since this problem has been hitting a 
number of people.  The bug has also been &lt;a 
href=http://bugs.freedesktop.org/show_bug.cgi?id=24332&gt;reported&lt;/a&gt; to 
poppler.  Launchpad has several PDF files which will reproduce the 
problem.
&lt;P&gt;
The segmentation fault happens when the TextWord constructor is called, 
because the curFont object has not been created.  So without doing much 
investigation, I simply created the curFont object if it did not exist, 
and then appended it to the fonts list.  This seemed to solve the 
problem: the program stopped crashing and the problem pages were 
displayed seemingly normally (a cursory look shows the problem pages 
displaying normally, but it is possible some portion of the page is 
displayed improperly).
&lt;P&gt;
&lt;pre&gt;
git diff TextOutputDev.cc
diff --git a/poppler/TextOutputDev.cc b/poppler/TextOutputDev.cc
index 442ace2..9686cc1 100644
--- a/poppler/TextOutputDev.cc
+++ b/poppler/TextOutputDev.cc
@@ -1988,6 +1988,11 @@ void TextPage::beginWord(GfxState *state, double x0, double y0) {
     rot = (m[2] &gt; 0) ? 1 : 3;
   }
 
+  if (!curFont) {
+    curFont = new TextFontInfo(state);
+    fonts-&gt;append(curFont);
+  }
+
   curWord = new TextWord(state, rot, x0, y0, charPos, curFont, curFontSize);
 }
&lt;/pre&gt;
&lt;P&gt;

However, this is really just a hack.  I don't have much of an 
understanding of how the poppler library works or how evince works.  The 
Poppler people point out that this segmentation fault is not tripped by 
pdftotext, which also uses the poppler library.  This is correct, it 
does not seem to be.  Then again, evince goes through the 
poppler_page_render() call in the poppler library, and pdftotext does 
not seem to do that, so it is questionable how much the pdftotext 
result tells us.&lt;P&gt;
Right now I am exploring the Gfx class, as backtrace (and following the 
program logic) shows that the Gfx class is utilized between the call to 
poppler_page_render() and the failed construction of the curWord object 
of the TextWord class.  Setting the printCommands boolean to true shows 
debugging information so I am looking at that.
&lt;P&gt;
What usually happens with the above patch is that the beginWord method 
is called many times, with one instance where no curFont object exists 
(and thus a segmentation fault would happen).  I do not know much about 
the evince code or these libraries, so I am looking into all of this, 
seeing if I can come up with anything better than the above hack.  It is 
pretty clear this is a poppler problem though - even if these PDFs are 
messed up, they don't crash PDF viewers that don't use the poppler 
library.  The same goes for if evince is not doing something right with 
Cairo before handing it off to poppler.  If the crash is happening 12 
calls deep inside poppler, it points to poppler being the problem.
</description>
  </item>
  <item>
    <title>I have a Portable Document Format (PDF) file, which has a series of pages </title>
    <pubDate>Wed, 21 Oct 2009 23:04:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/10/21#20091022</link>
    <description>that evince, the default Ubuntu (and Gnewsense) Gnome PDF reader, crashes 
on when it is opened to them.  Xpdf can read the pages just fine, 
however.&lt;P&gt;
This led me to look over Ubuntu's launchpad web site, which I began 
browsing.&lt;P&gt;
(I saw that file-roller, the default Ubuntu/Gnewsense Gnome 
archive file application was crashing with segmentation violations a 
lot.  There was not much non-automatic information about this however, 
aside from some people saying the problem was not always reproducible, 
but only happened sometimes.  Due to this, and due to it using a lot of 
heavy GTK/GDK stuff that I don't know, I moved on.)&lt;P&gt;

I wanted to look at my evince crash a little more carefully, but I was 
still running Intrepid Ibex (Ubuntu 8.10) whereas most people were 
reporting the problem on Jaunty Jackalope (Ubuntu 9.04) or even beta 
versions of Karmic Koala (Ubuntu 9.10 - beta), the release version of 
which is supposed to be coming out in eleven days.  Well, this indicates 
the problem has been around for a while, and is still around.  So I 
upgraded to Jackalope.  I was a little uneasy about whether to go to the 
Koala beta, but then I plunged in.&lt;P&gt;

One thing I noticed, which was not around so much on Ubuntu's Hardy 
Heron (8.04), is apport, a window which pops up when an application 
crashes and says it will automatically report it to Ubuntu if people 
want.  This popped up for me when evince crashed and I sent in the bug.  
Later, I marked it as a duplicate of a similar one.  Launchpad makes a 
slight effort to try to let you see if it's a duplicate while reporting, 
but that question can be a little complex, and the process doesn't deal 
with that.  So I reported it, and then marked it as a duplicate 
later.&lt;P&gt;

The Poppler PDF rendering library was partially implicated in the crash, 
so I downloaded the dpkg for ePDFView, which also uses that library.  
ePDFView also crashes on these pages.  So I reported &lt;a 
href=https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/455026&gt;that&lt;/a&gt; 
as a bug to Ubuntu via apport.  The stack traces show pretty much the 
same thing happening: they're both crashing in the JPEG 6.2 library, and 
the call into it can be traced back, via the same route, to a 
Page::displaySlice call in the poppler library.  So it looks like 
the poppler library (or possibly even the jpeg library) is at fault.
&lt;P&gt;
</description>
  </item>
  <item>
    <title>jedit</title>
    <pubDate>Thu, 27 Aug 2009 11:30:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/08/27#20090827</link>
    <description>
My patch to jedit was &lt;a 
href=http://jedit.svn.sourceforge.net/viewvc/jedit?view=rev&amp;revision=16080&gt;committed&lt;/a&gt;, cool.</description>
  </item>
  <item>
    <title>As I said yesterday, I've taken 36 hours of a Java programming 101 class </title>
    <pubDate>Sat, 25 Jul 2009 21:58:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/07/25#20090725</link>
    <description>and decided to see if I could put any of it to use.  I believe I have.
&lt;P&gt;
At first I just wanted to see what a real Java program looked like.  So 
I downloaded the latest jEdit source from Sourceforge.  jEdit is the 
sixth most active project of all time on Sourceforge, has had millions of 
downloads, and is written in Java.  Using ant to compile it was easy 
enough, and I took a cursory look through the code.
&lt;P&gt;
As they say, the best way to learn code is to try to change something.  
I looked through the bug list for open bugs that were not assigned.  Bug 
&lt;a 
href=http://sourceforge.net/tracker/?func=detail&amp;aid=2808363&amp;group_id=588&amp;atid=100588&gt;2808363&lt;/a&gt; 
looked interesting so I took a look at that.  As Sergey Zhumatiy states, 
the file he uploaded to Sourceforge does hang jEdit when one scrolls 
down to the line jEdit has trouble with (the line doing 
transliteration).&lt;P&gt;
I read through the rest of the thread and repeated some of what the 
other posters did - I did a thread dump and got the same result as Dale 
Anson.  Denis Dzenskevich simplified the problem by pulling all 
of the relevant classes, methods, regular expressions etc. into a short 
Java file that duplicated the problem; I ran the 
program and it hung for me as well.  Matthieu Casanova noted the line 
jEdit was choking on from the uploaded file and mentioned that the 
regular expression used was in the perl.xml file.  Denis Dzenskevich 
chimed in again, noting that the processing time grows geometrically, 
doubling with every new character processed.  He notes he does not know 
Perl but posits that perhaps the regular expression could be simplified.
&lt;P&gt;
The first thing I did was try to simplify the &quot;in the wild&quot; code that 
jEdit was stumbling on.  I cut out extraneous lines, then I changed the 
file type from Perl module (pm) to Perl executable (pl), then I 
simplified the expression even more to where I was translating the a's 
in the word banana to b's (banana -&gt; bbnbnb).  A comment of a few words 
at the end of the transliteration line still had jEdit stumbling.
&lt;p&gt;
With this simple line failing, I began to suspect that Denis Dzenskevich 
was right with regards to the regular expression.  I read Sun's 
information about the Pattern class, and then about the Matcher class.  
I read Perl documentation about transliteration and the like.  I also 
found a very helpful &lt;a 
href=http://www.javaworld.com/javaworld/jw-09-2007/jw-09-optimizingregex.html?page=1&gt;Javaworld&lt;/a&gt; 
article about out-of-control regular expressions using the 
java.util.regex package.
&lt;P&gt;
I realized that the regular expression was using a greedy quantifier 
within the transliteration statement's second set of curly brackets.  If 
the regular expression was going to match, this was completely 
pointless, so I added a question mark to the end of the quantifier, 
changing it to a reluctant quantifier.  My Java test program (based on 
Denis Dzenskevich's test program) began working for my test perl files.  
I did an ant compile of jEdit with the new perl.xml file and suddenly 
jEdit was able to easily load all those test perl files it had been 
hanging on - it could even easily scroll through the original 
in-the-field file that had exposed the bug, the one Sergey Zhumatiy had 
uploaded to Sourceforge.
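&lt;P&gt;
The difference can be seen with a small test along the lines of Denis 
Dzenskevich's program.  The sketch below is a reconstruction, not a copy 
of the patch: it assumes the question mark goes on the star that follows 
the group in the second set of curly brackets, and the class and method 
names are only there for illustration.
&lt;pre&gt;
```java
import java.util.regex.Pattern;

public class ReluctantRegex {
    // Original pattern from perl.xml, written as a Java string.
    static final String GREEDY =
        "tr\\s*\\{.*?[^\\\\]\\}\\s*\\{(?:.*?[^\\\\])*\\}[cds]*";
    // Assumed fix: the star after the group becomes reluctant (*?).
    static final String RELUCTANT =
        "tr\\s*\\{.*?[^\\\\]\\}\\s*\\{(?:.*?[^\\\\])*?\\}[cds]*";

    // True if the start of the line matches the given pattern.
    static boolean matches(String regex, String line) {
        return Pattern.compile(regex).matcher(line).lookingAt();
    }

    public static void main(String[] args) {
        String line =
            "tr{a}{b};    # I like the letter a but not b translate all";
        long start = System.currentTimeMillis();
        boolean found = matches(RELUCTANT, line);
        long stop = System.currentTimeMillis();
        System.out.println(found + " in " + (stop - start) + " ms");
        // Calling matches(GREEDY, line) on this same line is the call
        // that hangs.
    }
}
```
&lt;/pre&gt;
The reluctant version tries the closing bracket before it tries to 
swallow more of the line, so a line that is going to match does so 
right away.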
&lt;P&gt;
I also tested Perl files which did use backslashes improperly in the 
second set of brackets on transliteration lines.  Still the same 
problem.  So the bug is still there, but it has been minimized somewhat: 
instead of stumbling over all kinds of Perl transliteration lines, even 
proper ones that work, jEdit now only stumbles over lines of Perl where 
transliteration is done and backslashes are used improperly in the 
second set of curly braces (if braces are used at all - you can do 
transliteration with forward slashes in Perl).  So my &lt;a 
href=http://sourceforge.net/tracker/?func=detail&amp;aid=2827234&amp;group_id=588&amp;atid=300588&gt;patch&lt;/a&gt; 
partially fixes the problem anyhow.
</description>
  </item>
  <item>
    <title>I have taken 36 hours of Java programming 101 in the past few weeks, and </title>
    <pubDate>Fri, 24 Jul 2009 00:00:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/07/24#20090724</link>
    <description>decided to see if I could put any of it to use.&lt;P&gt;

I decided to look at how jEdit, Sourceforge's 6th most active project of 
all time was put together.  To get the latest version from their 
subversion version control server I did a:&lt;p&gt;
svn co https://jedit.svn.sourceforge.net/svnroot/jedit/jEdit/trunk jEdit
&lt;p&gt;
I then ran &quot;ant&quot; in the jEdit directory to build a jedit.jar file.  
&quot;java -jar jedit.jar&quot; started jEdit up.  All of this is fairly simple to 
heavy Java users I'm sure, and I've dealt with this stuff on some level 
for years, but I am new to Java programming so I am learning by even 
doing this simple stuff.
&lt;br&gt;
So jEdit comes up.  I browse the source tree to see how a Java project 
like this is put together.  Of course, one of the best ways to do more 
hands-on learning is to dive into the code, and one of the best ways to 
do that is to look at recent open and unassigned bugs, so I did that.
&lt;br&gt;
Bug &lt;a 
href=http://sourceforge.net/tracker/?func=detail&amp;aid=2808363&amp;group_id=588&amp;atid=100588&gt;2808363&lt;/a&gt; 
has jEdit freezing when it tries to open a file named test.pl that 
consists of the following lines:&lt;p&gt;
#!/usr/bin/perl&lt;br&gt;
$a = &quot;banana&quot;;&lt;br&gt;
print &quot;$a\n&quot;;&lt;br&gt;
$a =~ tr{a}{b};    # I like the letter a but not b translate all&lt;br&gt;
print &quot;$a\n&quot;;&lt;br&gt;
&lt;P&gt;
The line it chokes on is the one with the comment; in fact, if the file 
consists of just that one line, jEdit still freezes.  It appears jEdit 
will eventually parse this line, but only after many minutes (possibly 
even hours, days etc.).  So far, this is my small contribution to solving 
the bug: even a file with just that one line will freeze 
jEdit.  They were using the real-world example before, which is too 
unwieldy.
&lt;P&gt;
As someone in the bug thread points out, a thread dump (kill -3 to the 
process number running &quot;java -jar jedit.jar&quot;) has AWT-EventQueue-0 
showing that things are stuck in the system class Pattern.  The hand-off 
from user-class land to system is the user-defined class 
TokenMarker, the handleRule method.  I did this thread dump myself.  As 
I said before, I am using the latest (as of July 23rd, 2009) code in 
their Subversion server.&lt;P&gt;
Also as is pointed out in the bug thread, the code involved is not just 
in the TokenMarker.java file, but in the perl.xml file.  jEdit has 
&quot;modes&quot; to edit different types of files, so that when you edit Perl 
in the default jEdit Perl mode, the words if, for and my are displayed 
as dark blue, scalars and arrays are green, and so on.  The perl.xml 
file contains the following XML relevant to this -&lt;br&gt;&lt;xmp&gt;
                &lt; !-- tr/// transliteration --&gt;
                &lt;SEQ_REGEXP TYPE=&quot;MARKUP&quot; HASH_CHAR=&quot;tr&quot;
AT_WORD_START=&quot;TRUE&quot;&gt;tr\s*\{.*?[^\\]\}\s*\{(?:.*?[^\\])*\}[cds]*&lt;/SEQ_REGEXP&gt;
&lt;/xmp&gt;&lt;br&gt;
So we have the above as the regular expression being matched against.  
The string being matched against is the line&lt;p&gt;
$a =~ tr{a}{b}; # I like the letter a but not b translate all
&lt;P&gt;
We can actually remove everything before the t in tr, but I am not sure 
that would still be a valid Perl statement (although it would run in 
Perl without a problem).  So we leave it.  Someone put a sample program 
invoking all of this on the jEdit page.  I will put it here, except 
changing the string to my simpler one.

&lt;P&gt;
import java.util.regex.Pattern;&lt;br&gt;
public class FaultyRegex {&lt;br&gt;
 public static void main(String[] args) {&lt;br&gt;
  final String str = &quot;tr{a}{b};    # I like the letter a but not b 
translate all&quot;;&lt;br&gt;
final String regex = 
&quot;tr\\s*\\{.*?[^\\\\]\\}\\s*\\{(?:.*?[^\\\\])*\\}[cds]*&quot;;&lt;br&gt;

System.out.println(Pattern.compile(regex).matcher(str).lookingAt());&lt;br&gt;
}&lt;br&gt;
}&lt;P&gt;

This program hangs for far too long.  To go back to another point 
someone on the thread made, every character added to the end of the 
comment doubles the execution time of the lookingAt method.  If I run 
the test file:&lt;P&gt;
&lt;pre&gt;
import java.util.regex.Pattern;

public class Testy {
    public static void main(String[] args) {
        String str = &quot;tr{a}{b}; # I like &quot;;
        final String regex =
            &quot;tr\\s*\\{.*?[^\\\\]\\}\\s*\\{(?:.*?[^\\\\])*\\}[cds]*&quot;;
        // Time the match again each time the string grows by one character.
        for (int i = 0; i &lt; 1000; i++) {
            long start = System.currentTimeMillis(); // start timing
            System.out.println(Pattern.compile(regex).matcher(str).lookingAt());
            long stop = System.currentTimeMillis(); // stop timing
            System.out.println(&quot;TimeMillis: &quot; + (stop - start)); // print execution time
            System.out.println(&quot;String length is &quot; + str.length());
            System.out.println(&quot;String is &quot; + str);
            str = str + &quot;z&quot;;
        }
    }
}
&lt;/pre&gt;
I get the output:
&lt;pre&gt;
true
TimeMillis: 17
String length is 19
String is tr{a}{b}; # I like 
true
TimeMillis: 3
String length is 20
String is tr{a}{b}; # I like z
true
TimeMillis: 9
String length is 21
String is tr{a}{b}; # I like zz
true
TimeMillis: 9
String length is 22
String is tr{a}{b}; # I like zzz
true
TimeMillis: 23
String length is 23
String is tr{a}{b}; # I like zzzz
true
TimeMillis: 40
String length is 24
String is tr{a}{b}; # I like zzzzz
true
TimeMillis: 71
String length is 25
String is tr{a}{b}; # I like zzzzzz
true
TimeMillis: 134
String length is 26
String is tr{a}{b}; # I like zzzzzzz
true
TimeMillis: 266
String length is 27
String is tr{a}{b}; # I like zzzzzzzz
true
TimeMillis: 843
String length is 28
String is tr{a}{b}; # I like zzzzzzzzz
true
TimeMillis: 1055
String length is 29
String is tr{a}{b}; # I like zzzzzzzzzz
true
TimeMillis: 2171
String length is 30
String is tr{a}{b}; # I like zzzzzzzzzzz
true
TimeMillis: 4541
String length is 31
String is tr{a}{b}; # I like zzzzzzzzzzzz
true
TimeMillis: 8796
String length is 32
String is tr{a}{b}; # I like zzzzzzzzzzzzz
true
TimeMillis: 17571
String length is 33
String is tr{a}{b}; # I like zzzzzzzzzzzzzz
true
TimeMillis: 35689
String length is 34
String is tr{a}{b}; # I like zzzzzzzzzzzzzzz
true
TimeMillis: 81209
String length is 35
String is tr{a}{b}; # I like zzzzzzzzzzzzzzzz
true
TimeMillis: 150282
String length is 36
String is tr{a}{b}; # I like zzzzzzzzzzzzzzzzz
&lt;/pre&gt;
As a line that is 36 characters long takes over 150 cpu seconds to 
parse, a line approaching 80 characters would take over 150,000 cpu 
years to parse.  Which is far too long.
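&lt;P&gt;
The arithmetic behind that estimate can be checked quickly.  This is a 
rough extrapolation only: it takes the last measurement above (about 150 
seconds at a string length of 36) and assumes the time keeps doubling 
with every extra character all the way up to 80, which the measurements 
suggest but do not prove.  The class and method names are made up for 
the example.
&lt;pre&gt;
```java
public class Extrapolate {
    // Estimated years to parse a line of the given length, assuming
    // the time doubles per character from a measured starting point.
    static double years(double seconds, int measuredLen, int targetLen) {
        double total = seconds * Math.pow(2, targetLen - measuredLen);
        return total / (365.25 * 24 * 60 * 60);
    }

    public static void main(String[] args) {
        // 150.282 seconds at 36 characters is the last data point.
        System.out.printf("%.0f years%n", years(150.282, 36, 80));
    }
}
```
&lt;/pre&gt;
Under that assumption the figure comes out in the tens of millions of 
years, so 150,000 years is if anything an underestimate.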
&lt;P&gt;
So where is the problem?  Actually, I wanted to nail it down even more, 
so I did the test:
&lt;pre&gt;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class Testy {
    public static void main(String[] args) {
        String str = &quot;tr{a}{b}; # I like &quot;;
        final String regex =
            &quot;tr\\s*\\{.*?[^\\\\]\\}\\s*\\{(?:.*?[^\\\\])*\\}[cds]*&quot;;
        Pattern p = Pattern.compile(regex); // compile once, outside the loop
        for (int i = 0; i &lt; 1000; i++) {
            long start = System.currentTimeMillis(); // start timing
            Matcher m = p.matcher(str);
            long mid = System.currentTimeMillis(); // matcher created
            System.out.println(&quot;TimeMillis1: &quot; + (mid - start)); // time to create the matcher
            System.out.println(m.lookingAt());
            long stop = System.currentTimeMillis(); // lookingAt finished
            System.out.println(&quot;TimeMillis2: &quot; + (stop - start)); // total time including lookingAt
            System.out.println(&quot;String length is &quot; + str.length());
            System.out.println(&quot;String is &quot; + str);
            str = str + &quot;z&quot;;
        }
    }
}
&lt;/pre&gt;
The tail of the output looked like this:
&lt;pre&gt;
TimeMillis1: 0
true
TimeMillis2: 2330
String length is 30
String is tr{a}{b}; # I like zzzzzzzzzzz
TimeMillis1: 0
true
TimeMillis2: 4454
String length is 31
String is tr{a}{b}; # I like zzzzzzzzzzzz
TimeMillis1: 0
true
TimeMillis2: 8807
String length is 32
String is tr{a}{b}; # I like zzzzzzzzzzzzz
TimeMillis1: 0
&lt;/pre&gt;
So clearly, the program is spending all of its time in Matcher's 
lookingAt method.</description>
  </item>
  <item>
    <title>gocr </title>
    <pubDate>Thu, 02 Apr 2009 15:53:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/04/02#20090402</link>
    <description>
Cool, my &lt;a 
href=http://sourceforge.net/tracker/?func=detail&amp;aid=1556112&amp;group_id=7147&amp;atid=307147&gt;patch&lt;/a&gt; 
to GOCR which deals with distinguishing a's from d's &lt;a 
href=http://sourceforge.net/project/shownotes.php?release_id=671937&amp;group_id=7147&gt;made 
it in&lt;/a&gt; the 0.47 release.
</description>
  </item>
  <item>
    <title>gocr, ocr</title>
    <pubDate>Sat, 24 Jan 2009 21:46:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2009/01/24#20090124</link>
    <description>
With regards to all things OCR, I did a &lt;a 
href=http://sourceforge.net/tracker/index.php?func=detail&amp;aid=1556112&amp;group_id=7147&amp;atid=307147&gt;patch 
for GOCR&lt;/a&gt; in 2006.  GOCR would see the letter 'a' when the letter was 
actually 'd'.  There were two reasons for this:
&lt;p&gt;

1) Sometimes there would be a serif at the top of the 'd'. GOCR would 
examine a 'd' and be looking for a straight up-and-down line segment on 
the right side and two horizontal arcs on the left side - the top and 
bottom of the circle in 'd'.  GOCR would see the serif at the top of the 
up-and-down line segment and get confused, since it was not expecting 
anything beyond the two arcs and the straight segment.  So I put a patch 
in to make GOCR less strict, allowing for the serifs one finds in text 
at the top of a 'd'.  This was done in the ocr0_dD() function.

&lt;P&gt;

2) The second change improved recognition between a's and d's as well.  
This was done in the ocr0_aA() function.  The letters 'a' and 'd' are 
printed in different fonts in different texts, and sometimes the only 
difference is that the up-and-down line segment on the right side extends 
significantly above the circle on the left for 'd', while with 'a' it 
stays level, or extends only slightly above the circle on the left side of 
the character.
&lt;P&gt;
The ocr0_aA() function currently looks into the box struct for x0, x1, y0 
and y1. My patch looks for m1 as well.  With the 2006 patch I put in, I 
make it so that if m1-y0 is greater than or equal to 0, I break.
&lt;P&gt;
While every test I ran showed an improvement in GOCR recognition after 
this, a week after I sent my patch in, as I became more familiar with 
gocr, two things occurred to me.  The first was that instead of breaking, 
and declaring that it was not an 'a', I probably should have used the 
setac() call instead - diminishing the likelihood that the character was 
'a' but not totally eliminating it.  Secondly, the formula &quot;m1-y0 &gt;= 0&quot; 
as the condition to break is somewhat arbitrary.  How far exactly does 
the line segment have to rise before an 'a' becomes a 'd'?  I 
picked the number 0 arbitrarily.  I did a number of tests, but more tests 
can probably be done, on very small and especially on very large 
characters.  The patch I submitted seemed to break nothing, and 
only fix things, but these concerns made me think a better patch would 
use a setac() instead of a break in ocr0_aA(), backed by more testing 
on bigger characters.&lt;P&gt;
I would have to spend some time doing that, so I contacted Joerg 
Schulenburg recently and he gave me some encouragement, so I am going 
forward with the new, better patch as the first one was never applied (and 
since I felt I could do better, I never pushed that it be applied).  Joerg 
is busy with things, and I am a little busy as well, but I am less busy 
than I used to be, and have the time in the next weeks to do this new and 
better patch.&lt;P&gt;

Anyhow, Joerg asked for some sample files.  First I should say, I just 
downloaded gocr via cvs (January 24, 2009), patched it, and compared the 
files in the examples directory between the current cvs and my patched 
version (4x6.png  5x7.png  5x8.png  ocr-a.png  ocr-b.png handwrt1.jpg  
matrix.jpg).  There was no change for any of the examples.&lt;P&gt;

What I use to test is OCR scans I got off of the Distributed Proofreaders 
website.  With my 2006 patch, in every test I did on every OCR image, I 
did not see any negative effect - my patch did not remove recognition of 
any correctly labeled 'a' or 'd'.  The only changes I saw were incorrectly 
labeled a's and d's no longer being labeled as such - often with a letter 
incorrectly identified as an 'a' now being seen correctly as a 'd'.&lt;P&gt; An 
example of this is &lt;a href=086.png&gt;page 83&lt;/a&gt; of the book &quot;Daring and 
Suffering: A History of the Great Railroad Adventure&quot; by William 
Pittenger.  I got the scan of this from Distributed Proofreaders as it was 
on the way to Project Gutenberg.

Line &quot;8&quot; of that page (to GOCR it is the eighth line of text or whitespace; 
when reading the text it is the first line) with the current CVS snapshot 
of GOCR is &lt;P&gt;
&lt;B&gt;_ll of the eigbt _e_ were capt4red, a_a are&lt;/B&gt;
&lt;P&gt;
With my 2006 patch, the text correctly comes out as:&lt;P&gt;
&lt;b&gt;_ll of the eigbt _e_ were capt4red, a_d are&lt;/b&gt;&lt;P&gt;
With my patch, GOCR now recognizes that the last letter of the word &quot;and&quot; is 
not 'a', but 'd'.  A false recognition of the letter is replaced by the 
correct one.
&lt;P&gt;
You can &lt;a href=086.png&gt;download&lt;/a&gt; page 83 of the book yourself, and run 
it with the current cvs snapshot, and against my patch.
&lt;P&gt;

Another example is &lt;a href=171.png&gt;page 160&lt;/a&gt; of the book &quot;Left End 
Edwards&quot; by Ralph Henry Barbour.  This is another book whose pages I 
grabbed from Distributed Proofreaders on their way to Project Gutenberg.  
My patch has a very good effect on this page, fixing four lines, all 
correctly.
&lt;P&gt;
The first line fixed (according to GOCR) is line 8:
&lt;P&gt;
&lt;b&gt;a_d tahe_ 9ou o_.  Peters say6 _obey _il_ be ais-&lt;/b&gt;&lt;P&gt;
becomes:&lt;P&gt;
&lt;b&gt;a_d tahe_ 9ou o_.  Peters say6 _obey _il_ be dis-&lt;/b&gt;&lt;P&gt;
The patch correctly changes &quot;ais-&quot; to &quot;dis-&quot;.  You can look at the 
image and see that this is correct.&lt;P&gt;
On the line which GOCR says is line 12, the line changes from:&lt;P&gt;
&lt;b&gt;'' I ao_'t believe t_ey _ll,'' replied Steve _o-&lt;/b&gt;&lt;P&gt;to&lt;P&gt;
&lt;b&gt;'' I do_'t believe t_ey _ll,'' replied Steve _o-&lt;/b&gt;&lt;P&gt;
The d in don't is seen as d, not as a.&lt;P&gt;
There are two more corrected lines as well - an a correctly becomes d on 
lines 26 and 27.  You can see this for yourself.  You can &lt;a 
href=171.png&gt;download&lt;/a&gt; the page and compare the current gocr cvs with my 
2006 patch.&lt;P&gt;

As I said, I pulled these pages randomly from &lt;a 
href=http://www.pgdp.net&gt;Distributed Proofreaders&lt;/a&gt;.  It was the only online 
source of scans I knew of.  I also tested this on scans from my own book 
collection, although my books are not in the public domain, so due 
to copyright issues I am less inclined to post them.  The Distributed 
Proofreaders books are public domain books.&lt;P&gt;

As I said, I tested many pages on this, and if you want me to post more of 
my tests I will.  Many of my tests simply had no change - the pre-patch 
gocr was the same as post-patch.  On all my tests I saw no negative 
effect.  Only positive ones, usually a 'd' mis-classified as 'a' being 
properly classified as 'd'.&lt;P&gt;

But as I said, I have been thinking about the patch, and think a setac() 
would be better than a break for my ocr0_aA() test.  I should probably 
test this on larger characters as well - the 0 in &quot;m1-y0 &gt;= 0&quot; is somewhat 
arbitrary, and I want to run more tests, especially on large sized 
characters.  So my existing patch seems to only fix things, but I feel I 
can make the patch even better.&lt;P&gt;
Here is a &lt;a href=ocr0.c.patch&gt;copy&lt;/a&gt; of the 2006 patch which works against the current (January 25, 
2009) CVS snapshot.  In the next weeks, I will work to see if I can improve the patch, mostly in terms 
of setting setac() instead of breaking, as well as seeing if the 0 value in &quot;m1-y0 &gt;= 0&quot; is the best 
number to use, especially when I do more testing against larger characters.  So I'll be sending you an 
improved patch of my 2006 patch in the next few weeks.  School is starting up for me again Monday and I 
will be somewhat busy, but I'm fairly sure I will have enough time to improve the 2006 patch in the next 
few weeks.</description>
  </item>
  <item>
    <title>tesseract</title>
    <pubDate>Sat, 27 Dec 2008 09:03:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2008/12/27#20081227</link>
    <description>
Well, tesseract looks like the best &quot;free/libre/open source&quot; OCR software out 
there.  I am shifting my work more from GOCR to tesseract.&lt;P&gt;
Anyhow, first I examined how well tesseract recognizes text.  
I did this by taking scanned pages from &lt;a 
href=http://www.pgdp.net&gt;Distributed Proofreaders&lt;/a&gt;, running tesseract 
on them, then manually checking the result.  DP 
(Distributed Proofreaders) scans many different types of books, so we get 
a range of fonts and printing styles.  I convert the PNG from 
DP to a TIF and then let tesseract run.&lt;P&gt;
One thing I quickly noticed is that tesseract handles &quot; fi&quot; poorly, that 
is, words that begin with the letters fi.&lt;p&gt;
One example is on &lt;a href=/blog/images/0305.png&gt;page 305&lt;/a&gt; of 

&lt;a href=http://www.pgdp.net/c/project.php?id=projectID459544f32f248&gt;
part 1 of 4&lt;/a&gt; of 
Chambers's Twentieth Century Dictionary.  Line 33 is translated as:&lt;P&gt;

&quot;cats proverbially tight till each destroys the other.   1 11111;; ``````&quot;
&lt;P&gt;
The characters after the word &quot;other&quot; and the period are just noise from the OCR.  
Anyhow, this should not OCR as &quot;proverbially tight till&quot; but as 
&quot;proverbially fight till&quot;.  You can see what it looks like in the book 
here:&lt;P&gt;  &lt;img src=/blog/images/305fi.png&gt;&lt;P&gt;
&lt;P&gt;
We can see the same situation on the same page.  Further down, line 72 is 
translated as:&lt;br&gt;
&quot;Catadromous, kat-ad&amp;#39;rom-ns, mt/. of hshes, descend. 1   
11,11;;;;;~,1::::::::::::§&quot;`&quot;
&lt;P&gt;
This should actually be:&lt;br&gt;
&quot;Catadromous, kat-ad'rom-us, adj. of fishes, descend-&quot;&lt;br&gt;
Note once again the &quot; fi&quot; is mistranslated.  We can see this here:
&lt;P&gt;

&lt;img src=/blog/images/fishes.png&gt;
&lt;P&gt;
We can see the same problem in a scan of a different book. &lt;a href=120.png&gt;Page 
120&lt;/a&gt; of 

&lt;a 
href=http://www.pgdp.net/c/project.php?id=projectID46eaf11a62533&gt;Secresy, 
or, The Ruin on the Rock&lt;/a&gt; also has a bad translation of &quot; fi&quot;.  This 
is despite the different fonts, typesetting and so forth.  Line 4 (line 3 if 
disregarding the title) is translated as:&lt;p&gt;
&quot;must acknowledge, his Hmmess has not undergone the trial you have&quot;
&lt;p&gt;
where the correct text is &lt;p&gt;
&quot;must acknowledge, his firmness has not undergone the trial you have&quot;&lt;p&gt;
Hmmess is actually firmness; once again &quot; fi&quot; is mistranslated.
&lt;P&gt;
&lt;img src=/blog/images/firmness.png&gt;
&lt;P&gt;
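Checks like the ones above can also be scripted once the correct line has been typed in by hand.  What follows is only a minimal sketch in Python, not anything tesseract itself provides: the function name word_mismatches is made up for illustration, and the trailing OCR noise has been trimmed from the first line.&lt;P&gt;

```python
import difflib


def word_mismatches(ocr_line, correct_line):
    """Return (ocr_words, correct_words) pairs where the two lines differ."""
    ocr_words = ocr_line.split()
    correct_words = correct_line.split()
    matcher = difflib.SequenceMatcher(None, ocr_words, correct_words, autojunk=False)
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            mismatches.append(
                (" ".join(ocr_words[i1:i2]), " ".join(correct_words[j1:j2]))
            )
    return mismatches


# Line pairs taken from the examples in this post (OCR output first).
pairs = [
    ("cats proverbially tight till each destroys the other.",
     "cats proverbially fight till each destroys the other."),
    ("must acknowledge, his Hmmess has not undergone the trial you have",
     "must acknowledge, his firmness has not undergone the trial you have"),
]

for ocr_line, correct_line in pairs:
    for got, wanted in word_mismatches(ocr_line, correct_line):
        print(got, "should be", wanted)
```

For these two pairs it reports that &quot;tight&quot; should be &quot;fight&quot; and that &quot;Hmmess&quot; should be &quot;firmness&quot;.  The correct text still has to come from a person reading the scan.&lt;P&gt;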
I have been looking through the tesseract output of these letters and 
words with the debugger on, and am still doing so.</description>
  </item>
  <item>
    <title>/var/tmp</title>
    <pubDate>Mon, 16 Jun 2008 21:15:00 GMT</pubDate>
    <link>http://www.vartmp.com/blog/2008/06/16#20080616</link>
    <description>
Well, I snapped this domain up again.  I had it several years ago and lost it.
Now I have it again.&lt;P&gt;
One of my interests is OCR, particularly a free software OCR.  I spent some
time on gocr, even though none of my patches were used and the project has
not been updated for over a year.  GOCR seemed the best thing to contribute
to when I was looking at this a year or two ago, but Google has put
Tesseract and Ocropus out there so I am going to take a look at those now.
They are in C++, a language I knew nothing of two years ago but have since
taken a class in, so I am now a little more familiar with it.  Apparently
tesseract only does OCR, not layout.  Ocropus is a layout plugin.
&lt;P&gt;
I'm trying it now...it's pretty good.  Better than GOCR probably.
&lt;P&gt;
I will attempt to improve it the same way...get a number of samples of
different books from Distributed Proofreaders, match tesseract OCR to
original...see if there are any patterns of failure, then fix that in
the tesseract code if possible.
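&lt;P&gt;
The pattern-hunting step might be scripted roughly as follows.  This is only a sketch in Python, assuming each piece of OCR text has already been paired by hand with its correct text.  The function name substitution_counts and the sample word pairs are hypothetical, invented purely for illustration, and are not real test results.&lt;P&gt;

```python
from collections import Counter
import difflib


def substitution_counts(pairs):
    """Tally character-level substitutions over (ocr_text, correct_text) pairs."""
    counts = Counter()
    for ocr_text, correct_text in pairs:
        matcher = difflib.SequenceMatcher(None, ocr_text, correct_text, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace":
                counts[(ocr_text[i1:i2], correct_text[j1:j2])] += 1
    return counts


# Hypothetical word pairs, invented for illustration only.
samples = [("c1ock", "clock"), ("b1ue", "blue"), ("rnain", "main")]

for (got, wanted), n in substitution_counts(samples).most_common():
    print(n, repr(got), "should be", repr(wanted))
```

A substitution that keeps turning up across books with different fonts is the kind of failure pattern worth chasing in the code.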
</description>
  </item>
  </channel>
</rss>