When searching for a term in Office files, Agent Ransack (AR) returns hits from documents that are embedded as objects in the searched files. Each hit is returned twice: once for the parent document and once for the embedded object. Until you understand what is happening, this is very confusing. The search term may not appear in the parent document at all, but because the parent is shown first in the list of results, the inclination is to open that one first. When Word (or another Office app) then says "no matches found", it makes one doubt the reliability of AR. Here's an example:
Folder containing 2 files:
Contents of parent file:
Contents of child file:
AR search dialog - search for "dog":
You can see that if you then open the parent file, there is apparently no instance of the word "dog" in it, although it is shown as having 26 hits (by the way, what are the numbers against the hits in the right-hand pane?). All the hits are from the embedded child document.
I found that the ordering of the results is not consistent: sometimes the parent is shown first, sometimes the child. In my real example (which unfortunately I cannot share with you for confidentiality reasons), the parent reliably appeared first in the list.
This only occurs when the child object is actually embedded in the parent. If the child is inserted as a link, it doesn't happen. However, that doesn't make it any less confusing when it does. In my case, the documents were provided by a customer, so I had no idea what method had been used to insert the child objects. In fact, I only discovered them after I had had this puzzling experience.
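For anyone else who receives documents and wants to check whether they contain embedded objects before searching, here is a minimal Python sketch. It relies on the fact that .docx, .xlsx and .pptx files are ZIP archives, and it assumes that embedded objects sit in the conventional "embeddings" folders inside the archive. The function name list_embedded_objects and the file name parent.docx are made up for illustration, and this does not cover the older binary .doc/.xls/.ppt formats. It has nothing to do with how AR itself works internally.

```python
import zipfile

# Folders where Office Open XML packages conventionally keep embedded objects
# (assumption: the file was saved by Office in the usual layout).
EMBEDDING_DIRS = ("word/embeddings/", "xl/embeddings/", "ppt/embeddings/")


def list_embedded_objects(path):
    """Return the archive names of embedded-object parts in an OOXML file.

    `path` may be a file name or a file-like object, as accepted by
    zipfile.ZipFile.
    """
    with zipfile.ZipFile(path) as package:
        return [
            name
            for name in package.namelist()
            # Skip directory entries; keep only actual embedded parts.
            if name.startswith(EMBEDDING_DIRS) and not name.endswith("/")
        ]


if __name__ == "__main__":
    import sys

    # Usage: python list_embedded.py parent.docx
    for file_name in sys.argv[1:]:
        print(file_name, list_embedded_objects(file_name))
```

An empty list suggests that any objects in the file are linked rather than embedded (or that there are none), in which case the duplicate hits described above should not appear.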
AR has helpfully expanded the references with their original source (Wikipedia), although there is no direct link there; the text was just copied and pasted into the Word file. You could argue this makes things even more confusing, but we won't worry about that just now :)
Can I suggest that either:
a configuration option is provided to prevent hits being shown as if they existed in the parent document, or
a graphical representation of some kind is used to indicate that it is the embedded document one must look at, not the parent - perhaps a tree view of some kind.