0 votes

I already read your answer to the question that was asked here:

But still it's not completely clear to me what's going on.

I have the Agent Ransack Software installed and recently noticed that I have some issues with the "new" Office template file formats (.dotx and .dotm).

As you suggested, I installed the MS Office 2010 Filter Pack (http://www.microsoft.com/de-ch/download/details.aspx?id=17062). However, it does not seem to include IFilters for .dotx and .dotm files. As a result, when I search for content in a .dotx file, Agent Ransack does not find it (with or without "Enhanced Document Searching" enabled to use the Filter Pack's IFilters).

However, I don't have this problem with the older template files in the .dot format! Since this format is neither in the Filter Pack's list of supported formats nor in the list you provided in the other thread linked above, I wondered whether

  1. it is simply an oddity that everything works fine with .dot files, or
  2. I could somehow get Agent Ransack to search the content of .dotx and .dotm files as well.

I hope the question is clear enough.

Thanks for any help or explanations.

asked by (30 points)

1 Answer

0 votes

Like many new file formats today, the .docx and .dotx Word formats are actually ZIP files with compressed XML (and other) files inside. This means their contents are completely 'opaque' to a plain search unless the data is extracted first. The older .doc and .dot files, however, are stored uncompressed, which means that a simple binary search over the data often produces adequate results.
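To illustrate the difference, here is a minimal Python sketch. It does not use a real Word template: `build_sample_package` is a hypothetical helper that builds a small in-memory ZIP laid out roughly like a .dotx file, and `raw_search` / `zip_search` are illustrative names, not anything Agent Ransack exposes. It only shows why a plain byte search misses text that has been compressed.

```python
import io
import zipfile


def build_sample_package(text: str) -> bytes:
    """Build an in-memory ZIP that mimics a .dotx layout (not a valid Word file)."""
    xml = f"<w:document><w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body></w:document>"
    buffer = io.BytesIO()
    # ZIP_DEFLATED compresses the XML, as Office does for its package parts.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as package:
        package.writestr("word/document.xml", xml)
    return buffer.getvalue()


def raw_search(data: bytes, needle: str) -> bool:
    """Plain binary search over the file bytes, with no extraction."""
    return needle.encode("utf-8") in data


def zip_search(data: bytes, needle: str) -> bool:
    """Decompress every part of the ZIP and search the extracted bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return any(needle.encode("utf-8") in package.read(name) for name in package.namelist())


if __name__ == "__main__":
    data = build_sample_package("Boilerplate text with marker-phrase-12345 inside. " * 20)
    print("raw search finds it:", raw_search(data, "marker-phrase-12345"))
    print("zip search finds it:", zip_search(data, "marker-phrase-12345"))
```

The raw search only sees the deflated byte stream, so the phrase is not there to be matched; the ZIP-aware search sees the XML after decompression.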

Agent Ransack only applies special processing (i.e. IFilters or ZIP extraction) to .doc, .docx, .xls, .xlsx, .ppt, .pptx, .odt, .ods, .sxw and .sxc files. Other formats, such as .dotx, are only supported by the premium version of the product, FileLocator Pro.

Agent Ransack will still try to search all file types, but only the listed ones have the 'special' processing applied. Therefore Agent Ransack may well produce adequate results on uncompressed formats, like .dot, but won't be able to do much with compressed formats like .dotx.
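If you just need an occasional search through .dotx and .dotm files, a small script outside Agent Ransack can do the extraction itself. The following is only a rough sketch under some assumptions: the body text is taken from `word/document.xml` (the usual location in a Word package, though headers, footers and other parts live elsewhere), XML tags are stripped with a simple regular expression, and `extract_text` and `search_templates` are made-up names for this example.

```python
import html
import re
import zipfile
from pathlib import Path

TEMPLATE_SUFFIXES = {".dotx", ".dotm"}
MAIN_PART = "word/document.xml"  # assumption: usual location of the body text


def extract_text(path: Path) -> str:
    """Return the body text of a Word template, or '' if it cannot be read."""
    try:
        with zipfile.ZipFile(path) as package:
            xml = package.read(MAIN_PART).decode("utf-8", errors="replace")
    except (zipfile.BadZipFile, KeyError, OSError):
        return ""
    # Strip tags so that text split across several runs is joined again.
    # Paragraph boundaries are lost: adjacent paragraphs run together.
    return html.unescape(re.sub(r"<[^>]+>", "", xml))


def search_templates(folder: Path, needle: str) -> list[Path]:
    """Case-insensitive search for needle in all .dotx/.dotm files below folder."""
    needle = needle.lower()
    return sorted(
        path
        for path in folder.rglob("*")
        if path.is_file()
        and path.suffix.lower() in TEMPLATE_SUFFIXES
        and needle in extract_text(path).lower()
    )


if __name__ == "__main__":
    for hit in search_templates(Path("."), "some phrase"):
        print(hit)
```

Unreadable or corrupt files are silently skipped here, which keeps the sketch short but would hide problems in real use.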

answered by (66.9k points)

Ah, I see it now. Thanks for the detailed explanations!