Welcome to the Mythicsoft Q&A site for:

- Agent Ransack
- FileLocator Lite
- FileLocator Pro

Please feel free to ask any questions about these products, or even answer other community members' questions.


When searching for a term in Office files, Agent Ransack (AR) returns hits from documents that are embedded as objects in the searched files. The hits are returned twice: once for the parent document and once for the embedded object. Until you understand what is happening, this is very confusing. The search term may not appear at all in the parent document, but because the parent is shown first in the list of results, the inclination is to open it first. When Word (or another Office app) then says "no matches found", it causes confusion and makes one doubt the reliability of AR. Here's an example:

Folder containing 2 files:

[screenshot: folder]

Contents of parent file:

[screenshot: parent]

Contents of child file:

[screenshot: child]

AR search dialog - search for "dog":

[screenshot: results]

You can see that if you then open the parent file, there is apparently no instance of the word "dog" in it, although it is shown as having 26 hits. All the hits are from the embedded child document. (By the way, what are the numbers against the hits in the right-hand pane?)

Notes:

I found that the ordering of the results is not consistent: sometimes the parent is shown first, sometimes the child. In my real example (which unfortunately I cannot share with you for confidentiality reasons), the parent reliably appeared first in the list.

This only occurs when the child object is actually embedded in the parent. If it is inserted as a link, this doesn't happen. However, that doesn't stop it being confusing when it does. In my case, the documents were provided by a customer, so I had no idea what method had been used to insert the child objects; in fact, I only discovered them after I had had this puzzling experience.
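
In case it helps anyone else who receives files from a third party: here is a rough way to check whether a file contains embedded objects before searching it. This is only a sketch and is not part of Agent Ransack. It assumes the file is in the newer Office Open XML format (.docx, .xlsx, .pptx), which is a ZIP package that normally stores embedded objects under an `embeddings/` folder (for example `word/embeddings/`). It will not work on legacy .doc/.xls files, and the function name `list_embedded_parts` is my own invention.

```python
import zipfile


def list_embedded_parts(path):
    """Return the names of parts stored under an 'embeddings/' folder.

    Assumption: 'path' points to an Office Open XML package (.docx, .xlsx,
    .pptx), i.e. a ZIP archive. Objects inserted as links are not stored in
    the package, so they do not show up here.
    """
    with zipfile.ZipFile(path) as package:
        return [
            name
            for name in package.namelist()
            # Keep real entries only, not directory placeholders.
            if "/embeddings/" in name and not name.endswith("/")
        ]
```

Calling `list_embedded_parts("parent.docx")` (a hypothetical file name) returns an empty list if nothing is embedded; otherwise it returns the embedded parts, and those are presumably where the unexpected hits come from.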

AR has helpfully expanded the references with their original source (Wikipedia), although there is no direct link there; the text was just copied and pasted into the Word file. You could argue this makes things even more confusing, but we won't worry about that just now :)

Can I suggest that either:

  1. a configuration option is provided to prevent hits being shown as if they existed in the parent document, or

  2. a graphical representation of some kind is used to indicate that it is the embedded document one must look at, not the parent - perhaps a tree view of some kind.

Thanks

Nick

asked by (110 points)

1 Answer

Best answer

Thanks, we'll take that into account for the next update to Agent Ransack.

As a rough workaround, if you switch OFF 'Office/PDF document' searching in the Options tab, the links should no longer be followed (but you'll see some of the metadata). Alternatively, FileLocator Pro does not expand out child documents when processing files.

answered by (67k points)