/var/tmp
Sat, 19 Dec 2009
poppler bug
The segmentation fault happens when the TextWord constructor is called, because the curFont object has not been created. Without doing much investigation, I simply created the curFont object when it did not exist and appended it to the fonts list. This seemed to solve the problem: the program stopped crashing and the problem pages displayed seemingly normally (that is from a cursory look only; it is possible some portion of a page is rendered improperly).
    git diff TextOutputDev.cc
    diff --git a/poppler/TextOutputDev.cc b/poppler/TextOutputDev.cc
    index 442ace2..9686cc1 100644
    --- a/poppler/TextOutputDev.cc
    +++ b/poppler/TextOutputDev.cc
    @@ -1988,6 +1988,11 @@ void TextPage::beginWord(GfxState *state, double x0, double y0) {
         rot = (m[2] > 0) ? 1 : 3;
       }
     
    +  if (!curFont) {
    +    curFont = new TextFontInfo(state);
    +    fonts->append(curFont);
    +  }
    +
       curWord = new TextWord(state, rot, x0, y0, charPos, curFont, curFontSize);
     }
     

However, this is really just a hack. I don't have much of an understanding of how the poppler library works or how evince works.

The Poppler people point out that this segmentation fault is not tripped by pdftotext, which also uses the poppler library. This is correct; it does not seem to be. Then again, evince calls the poppler_page_render() function in the poppler library, and pdftotext does not seem to do that, so it is unclear what that comparison really shows.

Right now I am exploring the Gfx class, as the backtrace (and following the program logic) shows that the Gfx class is used between the call to poppler_page_render() and the failed construction of curWord, the TextWord object. Setting the printCommands boolean to true prints debugging information, so I am looking at that. With the above patch applied, what usually happens is that the beginWord method is called many times, with one instance where no curFont object exists (and where, without the patch, a segmentation fault would happen).

I do not know much about the evince code or these libraries, so I am looking into all of this to see if I can come up with anything better than the above hack. It is pretty clear this is a poppler problem though - even if these PDFs are messed up, they don't crash PDF viewers that don't use the poppler library. The same goes if evince is not doing something right with Cairo before handing off to poppler. If the crash happens 12 calls deep within poppler, it points to poppler being the problem.
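To make the shape of the hack easier to see, here is a minimal sketch of the same guard outside of poppler. FontInfo, Word and Page are hypothetical stand-ins invented for illustration, not poppler's real TextFontInfo, TextWord and TextPage, and the assumption that the word constructor reads from the font pointer unconditionally is only inferred from the crash described above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a font record; not poppler's TextFontInfo.
struct FontInfo {
    double ascent = 0.9;
};

// Hypothetical stand-in for a word; not poppler's TextWord.
// Assumption: the constructor dereferences the font without checking it,
// so passing a null pointer here would be undefined behavior (a crash).
struct Word {
    double ascent;
    explicit Word(const FontInfo *font) : ascent(font->ascent) {}
};

// Hypothetical stand-in for the page object that tracks the current font.
class Page {
public:
    void beginWord() {
        // Same idea as the patch: if no font is current, create a default
        // one and register it before constructing the word.
        if (!curFont) {
            fonts.push_back(std::make_unique<FontInfo>());
            curFont = fonts.back().get();
        }
        curWord = std::make_unique<Word>(curFont);
    }

    bool hasFont() const { return curFont != nullptr; }
    std::size_t fontCount() const { return fonts.size(); }

private:
    std::vector<std::unique_ptr<FontInfo>> fonts;  // owns every font created
    FontInfo *curFont = nullptr;                   // non-owning current font
    std::unique_ptr<Word> curWord;
};
```

Calling beginWord() on a fresh Page creates exactly one fallback font, and later calls reuse it. Like the real patch, this only hides the symptom: it does not explain why a word is begun before any font has been set.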