 October 2012 Archives

2012, October 01 (Mon)

Linux Nvidia binary driver and nouveau switching

The Nvidia binary driver for Linux has the annoying habit of overwriting the system's OpenGL libraries, and the Nvidia versions are incompatible with the open source ones.

Unfortunately, while the nouveau driver is OK for most things, I occasionally run into applications with rendering errors, for example dolphin. In those cases the proprietary driver has to be used.

However, since the two drivers cannot be installed in parallel by default, a simple reboot is not enough to switch between them. To solve this problem, we devise a script that runs at boot and moves the proper libraries into or out of place.


2012, October 28 (Sun)

Web Dump hocr fix for tesseract

If you use tesseract to make “searchable” PDFs and find that parts of the page content are missing from the PDF, here is the necessary fix for ExactCODE’s hocr2pdf:

Index: lib/hocr.cc
===============================================================
--- lib/hocr.cc
+++ lib/hocr.cc
@@ -327,6 +327,12 @@
   //std::cerr << "elementStart: '" << name << "', attr: '" << attr << "'" << std::endl;
   
   BBox b = parseBBox(attr);
+
+  // explicitly flush line of text on manual break or end of paragraph
+  if (attr.find("class='ocr_line'") != std::string::npos ||
+      attr.find("class='ocr_par'") != std::string::npos)
+    textline.flush();
+
   if (b.x2 > 0 && b.y2 > 0)
     lastBBox = b;
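
To see which elements trigger the flush, the condition from the patch can be pulled out into a standalone function and tried on sample attribute strings. The function name `starts_line_or_paragraph` is made up for this illustration and does not exist in hocr2pdf. Note that the check, as in the patch, only matches the single-quoted form `class='...'`.

```cpp
#include <string>

// Same test as in the patch, isolated for experimentation: true when the
// attribute string of an hOCR element marks a text line or a paragraph.
// A double-quoted class="ocr_line" would not be recognised.
bool starts_line_or_paragraph(const std::string& attr) {
    return attr.find("class='ocr_line'") != std::string::npos ||
           attr.find("class='ocr_par'") != std::string::npos;
}
```

With this, an attribute string such as `class='ocr_line' id='line_1'` would trigger a flush, while a word-level element such as `class='ocrx_word' id='word_1'` would not.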