.. PDF-toolbox documentation master file, created by
   sphinx-quickstart on Tue Oct 18 19:49:21 2016.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Installation
============

Prerequisites
-------------

PDF-toolbox is a Java program developed with Java version 8. Only a Java runtime is needed.

When a searchable PDF file is created, the source is processed with the Tesseract OCR engine. Tesseract is optional and only needed for this OCR feature.

Download & Install
------------------

First download the code from the VirtOrg website, http://www.virtorg.org. The main page links to the latest version of the program. Click the link to download a ZIP file, then unpack it. In the commands below, ``?`` stands for the patch number of the release; replace it with the actual digit.

download and install::

    wget http://www.virtorg.org/files/PDF-toolbox/vtgPDF-toolbox-0.1.?-bin.zip
    mkdir PDF-toolbox
    cd PDF-toolbox
    unzip ../vtgPDF-toolbox-0.1.?-bin.zip
    cd pdf-toolbox-0.1.?
    java -jar target/vtgPDF-toolbox-0.1.?.jar --version

If the version is printed, the program is working. You may also see some logging messages::

    log4j:WARN No appenders could be found for logger (com.virtorg.pdf.ocr.ServiceOCR).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

These messages can be ignored for now.

Edit the **pdf-toolbox** script file and change the parameters that are needed for the use of Tesseract.
The relevant line is **options="-Djna.library.path=/opt/local/lib -Dvtg.tessdata.path=/opt/local/share"**.

pdf-toolbox::

    #!/usr/bin/env bash

    jarfile=/target/vtgPDF-toolbox-0.1.4-SNAPSHOT.jar
    options="-Djna.library.path=/opt/local/lib -Dvtg.tessdata.path=/opt/local/share"

    if [[ $0 =~ ^/ ]] ; then
        # absolute path used
        program="$(dirname "$0")$jarfile"
    else
        # relative path used
        program="$(pwd)/$(dirname "$0")$jarfile"
    fi

    # $options is left unquoted on purpose so it splits into separate JVM flags;
    # "$@" keeps file names that contain spaces intact
    echo java $options -jar "$program" "$@"
    java $options -jar "$program" "$@"

Usage
-----

usage::

    usage: PDF-toolbox
    list of all options and commands
     -c,--createLogFile   create a new log4j.properties
     -D,--destfile        The destination PDF
     -h,--help            print this message
     -L,--overlayfile     The overlay PDF
     -o,--overlay         Overlay the original PDF with a writingpaper
     -O,--originalfile    The original PDF
     -r,--replace         replace the original file with the resultfile
     -s,--ocr             OCR the origanal picture or PDF to searchable PDF
     -v,--version         print program version
     -V,--verbose         be extra verbose
    Have a lot of fun with this VirtOrg program.

    usage: PDF-toolbox [[options]] command [[parameters]]
    usage: PDF-toolbox --overlay --originalfile --overlayfile --destfile
    usage: PDF-toolbox --overlay -O -L -D

    usage: commands
    list of all commands
     -o,--overlay   Overlay the original PDF with a writingpaper
     -s,--ocr       OCR the origanal picture or PDF to searchable PDF

    usage: parameters
    list of all parameters
     -D,--destfile       The destination PDF
     -L,--overlayfile    The overlay PDF
     -O,--originalfile   The original PDF

Make executable Windows
-----------------------

TODO

Make executable Mac OS X
------------------------

Use the shell script to start the program::

    cd pdf-toolbox-0.1.3
    chmod +x pdf-toolbox
    ./pdf-toolbox --version

If everything is working, add the command to the system PATH.

Install Tesseract
-----------------

Install on Mac (using MacPorts)::

    port search tesseract
    port install tesseract

Install on CentOS::

    yum install tesseract
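
Before using the ``--ocr`` command it can help to confirm that Tesseract is reachable. The following is a minimal sketch and not part of PDF-toolbox: the helper name ``have_command`` is hypothetical, and the check assumes that the package installed above provides a ``tesseract`` executable on the PATH. It does not verify the library and data paths configured in the **pdf-toolbox** script::

    #!/usr/bin/env bash

    # Hypothetical helper, not shipped with PDF-toolbox.
    # Returns 0 if the named command is found on the PATH, non-zero otherwise.
    have_command() {
        command -v "$1" >/dev/null 2>&1
    }

    if have_command tesseract; then
        echo "tesseract found: $(command -v tesseract)"
    else
        # Only the OCR feature depends on Tesseract; the overlay command does not.
        echo "tesseract not found; the --ocr command may not work" >&2
    fi

If the executable is found but OCR still fails, the paths in the ``options`` line of the script are the next thing to check.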