/var/tmp
   



Sun, 26 Jul 2009

As I said yesterday, I've taken 36 hours of a Java programming 101 class
and decided to see if I could put any of it to use. I believe I have.

At first I just wanted to see what a real Java program looked like, so I downloaded the latest jEdit source from Sourceforge. jEdit is the sixth most active project of all time on Sourceforge, has had millions of downloads, and is written in Java. Compiling it with ant was easy enough, and I took a cursory look through the code.

As they say, the best way to learn code is to try to change something. I looked through the bug list for open bugs that were not assigned. Bug 2808363 looked interesting so I took a look at that. As Sergey Zhumatiy states, the file he uploaded to Sourceforge does hang jEdit when one scrolls down to the line jEdit has trouble with (the line doing transliteration).

I read through the rest of the thread and repeated some of what the other posters did. I did a thread dump and got the same result as Dale Anson. Denis Dzenskevich had simplified the problem by pulling all of the relevant classes, methods, regular expressions etc. into a short Java file that duplicated the problem; I ran that program and it hung for me as well. Matthieu Casanova noted the line from the uploaded file that jEdit was choking on and mentioned that the regular expression used was in the perl.xml file. Denis Dzenskevich chimed in again, noting that processing time grew geometrically, doubling with every additional character processed. He said he does not know Perl, but suggested that perhaps the regular expression could be simplified.
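To get a feel for why the work might double with every extra character, here is a small sketch. It assumes the slowdown comes from a nested quantifier, something shaped like (a+)+, which is only a guess and is not the actual pattern from perl.xml. The class and method names are made up for illustration. It counts how many ways a backtracking matcher could carve a run of n identical characters into groups before giving up.

```java
class BacktrackGrowth {
    // Counts the ways to split a run of n identical characters into
    // consecutive non-empty groups. A naive backtracking matcher facing
    // a nested quantifier may try every one of these before failing.
    static long countSplits(int n) {
        if (n == 0) {
            return 1; // the whole run is consumed: one complete split
        }
        long total = 0;
        // Let the first group take 1..n characters, then split the rest.
        for (int first = 1; first <= n; first++) {
            total += countSplits(n - first);
        }
        return total;
    }

    public static void main(String[] args) {
        // Each extra character doubles the count: 1, 2, 4, 8, ...
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + " chars -> " + countSplits(n) + " splits");
        }
    }
}
```

The count for n characters works out to 2^(n-1), which is the same kind of doubling described in the bug thread. Whether a given regex engine actually explores all of these depends on the engine and its version.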

The first thing I did was try to simplify the "in the wild" code that jEdit was stumbling on. I cut out extraneous lines, then I changed the file type from Perl module (pm) to Perl executable (pl), then I simplified the expression even more, to the point where I was just transliterating the a's in the word banana to b's (banana -> bbnbnb). With a comment of a few words at the end of the transliteration line, jEdit was still stumbling.
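For anyone who has not used Perl's transliteration operator, here is a rough Java sketch of what the banana example does. The tr method and the TrDemo class are my own illustration, not anything from jEdit or from Perl itself, and it only handles the simple case where both character lists have the same length (real Perl tr has ranges and modifiers that are ignored here).

```java
class TrDemo {
    // Simplified stand-in for Perl's tr: each character found in `from`
    // is replaced by the character at the same position in `to`.
    // Characters without a counterpart in `to` are left unchanged.
    static String tr(String input, String from, String to) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int idx = from.indexOf(c);
            out.append(idx >= 0 && idx < to.length() ? to.charAt(idx) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Roughly what a Perl line like  $word =~ tr{a}{b};  would do.
        System.out.println(tr("banana", "a", "b")); // prints bbnbnb
    }
}
```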

With this simple line failing, I began to suspect that Denis Dzenskevich was right with regard to the regular expression. I read Sun's documentation for the Pattern class, and then for the Matcher class. I read Perl documentation about transliteration and the like. I also found a very helpful JavaWorld article about out-of-control regular expressions in the java.util.regex package.

I realized that the regular expression was using a greedy quantifier within the transliteration statement's second set of curly brackets. If the regular expression was going to match, this was completely pointless, so I added a question mark to the end of the quantifier, changing it to a reluctant quantifier. My Java test program (based on Denis Dzenskevich's test program) began working for my test Perl files. I did an ant compile of jEdit with the new perl.xml file, and suddenly jEdit was able to easily load all those test Perl files it had been hanging on. It could even scroll easily through the original in-the-wild file that had run into the bug, the one Sergey Zhumatiy had uploaded to Sourceforge.
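The difference between a greedy and a reluctant quantifier can be seen with a much simpler pattern than the one in perl.xml. The two patterns and the sample line below are made up for illustration and are not the ones from jEdit; firstMatch is just a small helper. The greedy version grabs as much as it can and then backs off, while the reluctant version (with the added question mark) stops at the first closing brace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the first substring of input matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Hypothetical line: a transliteration followed by a comment.
        String line = "tr{a}{b}; # {note}";

        // Greedy: .* runs to the end of the line, then backtracks
        // to the last closing brace.
        System.out.println(firstMatch("\\{.*\\}", line));  // prints {a}{b}; # {note}

        // Reluctant: .*? expands one character at a time and stops
        // at the first closing brace.
        System.out.println(firstMatch("\\{.*?\\}", line)); // prints {a}
    }
}
```

In a case like this the reluctant form also does less backtracking, since it never has to give back characters it swallowed past the closing brace.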

I also tested Perl files that used backslashes improperly in the second set of brackets on transliteration lines, and jEdit still hung on those. So the bug is still there, but it has been narrowed somewhat. Instead of stumbling over all kinds of Perl transliteration lines, even proper ones that work, jEdit now only stumbles over lines where transliteration is done and backslashes are used improperly in the second set of curly braces (if braces are used at all; you can also do transliteration with forward slashes in Perl). So my patch at least partially fixes the problem.
