DocSearcher is a search tool.


Changelog

version 3.95.3 (February 3rd 2024)
  • fix problem when starting an external application if the file path starts with more than one / (backport)
  • activate anti-aliasing on Linux (backport)
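
The leading-slash problem mentioned above can be illustrated with a small sketch. This is not DocSearcher's actual fix; the class name `PathNormalizer` and the method `collapseLeadingSlashes` are hypothetical and only show one way to reduce a run of leading slashes to a single one before a path is handed to an external application.

```java
// Hypothetical helper, not taken from the DocSearcher sources.
final class PathNormalizer {

    private PathNormalizer() {
    }

    /** Collapses a run of leading '/' characters into a single '/'. */
    static String collapseLeadingSlashes(String path) {
        if (path == null) {
            return null;
        }
        // Count how many '/' characters the path starts with.
        int i = 0;
        while (i < path.length() && path.charAt(i) == '/') {
            i++;
        }
        // Only rewrite the path when there is more than one leading slash.
        return i > 1 ? "/" + path.substring(i) : path;
    }

    public static void main(String[] args) {
        System.out.println(collapseLeadingSlashes("///home/user/report.pdf"));
    }
}
```

Paths without a leading slash, or with exactly one, are returned unchanged.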

version 3.95.2 (November 15th 2020)
  • fix internal text and html view problems (backport)
  • security - don't execute exe and sh files (backport)
  • fix saving the index list after deleting the last index (backport)
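
As a rough illustration of the security item above, a check like the following could refuse to launch files by extension. It is only a sketch under assumptions: the class `LaunchGuard`, the method `isLaunchAllowed` and the exact blocked set are hypothetical, and the real DocSearcher implementation may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical guard, not taken from the DocSearcher sources.
final class LaunchGuard {

    // Extensions that must never be started (assumed from the changelog entry).
    private static final Set<String> BLOCKED =
            new HashSet<>(Arrays.asList("exe", "sh"));

    private LaunchGuard() {
    }

    /** Returns false if the file name ends with a blocked extension. */
    static boolean isLaunchAllowed(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return true; // no extension to check
        }
        // Compare case-insensitively so "SETUP.EXE" is blocked as well.
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return !BLOCKED.contains(ext);
    }

    public static void main(String[] args) {
        System.out.println(isLaunchAllowed("manual.pdf"));
        System.out.println(isLaunchAllowed("setup.EXE"));
    }
}
```

The comparison is case-insensitive because a search hit may point at a file whose extension is written in upper case.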

version 3.95.1 (July 17th 2020)
  • fix missing creation of the working directory on first start

version 3.95.0 (June 30th 2020)
  • update POI to 3.17
  • update PdfBox & FontBox to 1.8.15 (CVE-2018-8036)

version 3.94.0 (November 8th 2018)
  • add possibility for localized start page (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • move DocSearcher to Java 8
  • drop Java 7 Support
  • update commons-lang to 3.6
  • fix OpenDocument token problem (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • change Lucene IndexReader to work read-only, because Lucene 4 drops write support on IndexReader
  • update Lucene to 3.6.2 (api changes)
  • update Lucene to 3.5.0 (api changes)
  • update Lucene to 3.2.0/3.3.0/3.4.0 (no api changes)
  • update Lucene to 3.1.0 (api changes)
  • update Lucene to 3.0.0 (no api changes)
  • update Lucene to 2.9.4 (api changes)
  • update Lucene to 2.4.1 (api changes)
  • update Lucene to 2.3.2 (no api changes, but Lucene 2.1 changed the index format after index saving)
  • add support for OpenDocument spreadsheet, presentation and drawing files (.ods, .ots, .odp, .otp, .odg, .otg) (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • add possible file extensions for OpenDocument document files (.odm, .ott), text files (.csv, .java, .py, .rst, .md), HTML files (.mhtml, .mht, .xhtml, .xhtm), Word files (.dot, .dotx, .docm, .dotm) and Excel files (.xlsm) (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • adapt to checkstyle 6.19, findbugs 3.0.1 and cobertura 2.1.1
  • fix resource leak in Word, Excel and Powerpoint converter
  • add support for Powerpoint 7 files (.ppt, .pps) (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • add support for Powerpoint OOXML files (.pptx, .ppsx) (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • update commons-io to 2.5
  • add commons-collections4 4.1 library (required by POI)
  • update POI to 3.15 (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • add commons-logging library (required by PdfBox & FontBox)
  • update PdfBox & FontBox to 1.8.12 (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • update log4j to 1.2.17 (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • use single app version numbering (instead of two) (contributed by Olivier Descout <odescout at users dot sf dot net>)
  • move DocSearcher to Java 7
  • drop Java 6 Support
  • remove unused help page
  • fix problem with set page in mainview
  • add support for Excel OOXML files (.xlsx)
  • add support for Word OOXML files (.docx)
  • add support for Word 6/95 files
  • update POI to 3.8

version 3.93.0 (November 7th 2012)
  • move DocSearcher to Java 6
  • fix problem with ignored files when searching with a file type selection
  • fix upper case file type problem since version 3.92.0
  • fix some problems in website spider
  • prepare for Java Web Start

version 3.92.0
  • refactored PDF converter
  • removed old multivalent PDF extractor
  • updated PDF Box to 0.7.3
  • changed the Lucene date to new format (DateTools)
  • refactored internal filetype handling

version 3.91.0
  • refactor DocType handling
  • refactor index creating and don't store the body content
  • search in body and title together is possible again
  • replace old setting file "docSearch_prefs.txt" with "docsearcher.properties"
  • remove tinylaf layout
  • first attempt to solve some problems with the servlet extension
  • remove check if last searchtext was the same, because the options can be changed
  • remove option for search in title and body, because it does not work
  • fix escaping problem in meta data report
  • fix problem with filenames containing whitespace
  • update POI to 3.2 final
  • fix some problems in Word and Excel converter
  • add commons-io

version 3.90.0
  • fix some problems with search field
  • fix problem with home button
  • add first junit test
  • add findbugs
  • add checkstyle
  • ant - move some directories to build
  • remove back and forward buttons, because they are not useful in this context
  • update Lucene to 2.0.0
  • add some Java 5 features
  • add new resources
  • fix problem with command line

version 3.89
  • little fix in meta data report for OpenDocument
  • fix reindex problem with too many filetype variables
  • fix some problems during OpenDocument indexing
  • update Jakarta POI library to current alpha version 2006-05-15
  • rewrite Word converter
  • rewrite Excel converter
  • remove some unused temporary files
  • add important debug output for the CD creation mechanism
  • update log4j to 1.2.13
  • add filetype constants
  • add OpenDocument text format
  • fix some different title formats during indexing (filename with path, first content of document)
  • document title is normally taken from the document or the filename
  • fix double k (k K bytes) in summary line
  • improve filetype handling
  • change filetype icons

version 3.88
  • no changes since 3.88 pre1
  • fix date convert problem
  • many code cleanups
  • add license (GPL)
  • source code formatted (Java code conventions)
  • add Log4J
  • little fix in website spider
  • add help page
  • add Environment and FileEnvironment class