Welcome to the Mythicsoft Q&A site for:

- Agent Ransack
- FileLocator Lite
- FileLocator Pro

Please feel free to ask any questions on these products or even answer other community member questions.


Can you please clarify an issue about using regular expressions to specify folders?

I'm searching in files under "z:\data", but want to exclude from the search any files under "z:\data\Trash"

I don't understand why these regular expressions do not exclude files under "z:\data\Trash":

Look in: z:\\data\\.*
Subfolders and Regular Expression is selected

Persistent Search Filter: -^z:.*\\data\\Trash\\.*
Regular Expression is selected


On the other hand, files under "z:\data\Trash" are excluded if the above search is modified so that the "Look in" regular expression omits the ".*", i.e.,

Look in: z:\\data\\


However, files under "z:\data\Trash" are not excluded if the original search is modified so that the "Look in" regular expression omits the ".*", and the Persistent Search Filter omits the first ".*", i.e.,

Look in: z:\\data\\
Persistent Search Filter: -^z:\\data\\Trash\\.*

======================================

Any explanation of these results is much appreciated.

by (70 points)

1 Answer


The issue is down to how location filters are applied to a search. Filters are applied while the subfolders of a parent folder are being iterated. Folders that are specified explicitly are therefore not subject to the filter.

For example, if you were to search using RegEx on the Look In field:

Look In: C:\usr;-c:\\usr\\tmp

Your results would include all files under the C:\usr folder except those in the C:\usr\tmp path. However, if you searched for:

Look In: C:\usr\tmp;-c:\\usr\\tmp

The C:\usr\tmp would be included because it was specified explicitly.

This is why including .* in the Look In, e.g.

Look In: C:\usr\.*;-c:\\usr\\tmp

won't exclude the C:\usr\tmp folder, because it's been, in effect, explicitly specified in the list of folders and is not subject to the filter.

Having said that, I can see that this looks wrong and is not what you'd naturally expect, so I'll file it as a bug report.
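
The behaviour described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not FileLocator Pro's actual implementation: the `walk` function, the `TREE` dictionary, and the case-insensitive matching are all hypothetical, and Python's `re` module stands in for the product's regex engine. Expanding `C:\usr\.*` into the immediate subfolders of `C:\usr` is modelled by passing those subfolders as explicit roots.

```python
import re

def walk(tree, roots, exclude_pattern):
    """Toy model: return the folders that would be searched.

    tree maps a folder path to the list of its immediate subfolders.
    """
    exclude = re.compile(exclude_pattern, re.IGNORECASE)  # assumed case-insensitive
    searched = []
    stack = list(roots)  # explicit roots are never tested against the filter
    while stack:
        folder = stack.pop()
        searched.append(folder)
        for sub in tree.get(folder, []):
            # The filter is applied only here, while iterating subfolders.
            if exclude.search(sub):
                continue
            stack.append(sub)
    return sorted(searched)

# Hypothetical folder layout used only for this example.
TREE = {
    r"C:\usr": [r"C:\usr\bin", r"C:\usr\tmp"],
    r"C:\usr\tmp": [r"C:\usr\tmp\cache"],
}
EXCLUDE = r"c:\\usr\\tmp"

# Look In: C:\usr;-c:\\usr\\tmp
print("; ".join(walk(TREE, [r"C:\usr"], EXCLUDE)))
# -> C:\usr; C:\usr\bin

# Look In: C:\usr\.*;-c:\\usr\\tmp  (modelled as explicit roots)
print("; ".join(walk(TREE, [r"C:\usr\bin", r"C:\usr\tmp"], EXCLUDE)))
# -> C:\usr\bin; C:\usr\tmp
```

In the second call `C:\usr\tmp` is searched because it is a root, while its subfolder `C:\usr\tmp\cache` is still filtered out during iteration.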


More information

Q. What's the difference between - and ! when excluding folders?

A. Any Look In entry prefixed with ! must relate to a specific file or folder (or group of files/folders if using expressions). So if you put

Look In: C:\usr;!InvalidPath

You'll get an error reported in the Summary tab. However, a minus prefix is just an expression that is tested against every subfolder found during folder iteration.

Q. What exactly is the expression compared against? e.g., is it always compared against a path that starts with a drive letter?

A. Expressions on Look In folder entries are matched only at the specific folder level. So, with

Look In: C:\.*\temp

the wildcard .* part matches only against the immediate folders within the root of C:\. Excluded folders (i.e. those with a '!' prefix) have the same restriction: their expressions cannot span multiple levels. However, minus and plus filters can be matched on any part of the folder path and are not tied to the start of the path, so

Look In: C:\usr;-temp

will exclude any folder with temp in the path wherever it is.
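
As a rough illustration of an unanchored minus filter, the sketch below uses Python's `re.search`, which matches anywhere in the string. The helper name `is_excluded`, the sample paths, and the case-insensitive flag are assumptions made for the example; the product's own regex engine may differ in detail.

```python
import re

def is_excluded(folder, minus_filter):
    # re.search looks for the expression anywhere in the path,
    # which models a filter that is not tied to the start of the path.
    return re.search(minus_filter, folder, re.IGNORECASE) is not None

# Hypothetical folder paths, with the result expected under this model.
samples = [
    r"C:\usr\temp",             # excluded
    r"C:\usr\apps\Temp\cache",  # excluded
    r"C:\usr\templates",        # excluded too: "temp" is a substring
    r"C:\usr\bin",              # kept
]

for folder in samples:
    print(folder, is_excluded(folder, "temp"))
```

Under this model a short expression such as `temp` would also catch a folder like `templates`, so a more specific expression may be needed if that is not wanted.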

Q. How is escaping done, i.e. why do I sometimes need to escape the directory separator?

A. As you can hopefully see by now, escaping is not necessary for folder paths because the expression cannot span multiple levels. However, if you use a +/- filter that spans levels, the directory separator will need to be escaped. This all assumes that you're using RegEx; if you're using Plain Text, no escaping is required.
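
To show why a backslash that spans levels needs escaping in a regular expression, here is a small sketch using Python's `re` module as a stand-in. The path fragment is hypothetical, and other regex engines may treat an unescaped `\t` differently from Python.

```python
import re

path = r"C:\usr\tmp\cache"   # hypothetical folder path

plain = r"usr\tmp"           # fragment as it would be typed in Plain Text mode
escaped = re.escape(plain)   # regex form with the backslash doubled
print(escaped)               # -> usr\\tmp

# Escaped: the regex looks for a literal backslash between "usr" and "tmp".
print(re.search(escaped, path) is not None)  # -> True

# Unescaped: Python's regex engine reads \t as a tab character, so no match.
print(re.search(plain, path) is not None)    # -> False
```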

Q. Do "Look in" and "Persistent Search Filters" only apply to folders and not files?

A. Look In folders (including '!' exclusion folders) can specify files but +/- filters can't. The 8.x release cycle will add the ability to specify Persistent File Name filters.

Q. Is there a way to exclude a specific file from a search? For example, there's one huge log file I want to always exclude from my search.

A. Yes, because you know the location of the specific file you can exclude it with a '!' path, e.g.

Look In: C:\usr;!C:\usr\logs\large.log

Note that you can't use an expression for the file name part of the path, e.g. this is not valid

Look In: C:\usr;!C:\usr\logs\.*.log
by (29.5k points)
Dave,
Thanks for explaining this.  
I'm setting up a complex search that will be used periodically against several data collections.  Getting the search to work as I intended has been more challenging than I expected it to be.
If I may make some suggestions for improvements, based on my experience:
The feature you described above is contrary to what I "naturally expected".  
I'd prefer the "naturally expected" operation, but I'm ok with it working differently, if it's documented.
In general, some additional documentation for search-criteria specification would really help, e.g.,
* To specify exclusion, how are "!" and "-" different, e.g., Persistent Search Filters comes with some pre-made filters and both of those symbols are used
* How is escaping done, e.g., it seems that sometimes directory separators are escaped ("\\") and sometimes they are not ("\")
* For regular expressions, what exactly is the expression compared against? e.g., is it always compared against a path that starts with a drive letter?
* It appears "Look in" and "Persistent Search Filters" only apply to folders and not files.  It took a while to figure that out.
* Is there a way to exclude a specific file from a search?  For example, there's one huge log file I want to always exclude from my search.
* A "Persistent Search Filters" for file names would be really nice.  BTW: I found your post about how to edit the config file file_preset.xml, and I'm using that.

The above is a wish list.  I realize you're not Santa Claus, i.e., not all wishes can be granted, and none are granted without cost.  :=)

Thanks for providing a great tool.
Jim
Dave,
I understand what you're saying about "Look in" being specified explicitly vs. non-explicitly.
In my original post, the third example shows an effect from the Persistent Search Filter being changed, so that the first ".*" is removed.
Any idea why that change stops the files under "z:\data\Trash" from being excluded?
Thanks,
Jim
I've updated the answer to add your extra questions. With regard to the third situation, ^z:\\data\\Trash\\.* would not match against Z:\data\Trash because of the extra '\\', so files immediately in Trash would not be excluded but its subfolders would.
Dave,
Thanks for the great info, and for providing it so quickly--very helpful.

Regarding your last post, when you say " due to the extra '\\' ...", which "\\" are extra?

In a Persistent Search Filter, to use regex to exclude files and folders under z:\data\Trash\, what would the expression be?

BTW: My original regex was:
Persistent Search Filter: -^z:.*\\data\\Trash\\.*
I think that one error in it is that it would match paths starting with:  
z:<anything>\data\Trash\
but I just wanted to exclude files and folders under z:\data\Trash\
Unless you NEED regex, simply put: !Z:\data\trash
Thanks again.
I will need regex, as my example here is a simplified version of what I need to do.
For example, I won't know the exact drive letter, so in the Persistent Search Filter I'd need a way to generalize the drive letter, e.g., via regex along these lines:  
-^[a-z]:\\data\\Trash\\

BTW:  I've just run several tests using the third scenario from my original post.  I'm finding that sometimes files in z:\data\Trash are excluded, and sometimes they are not excluded.  I'm wondering if the inconsistent results are due to the z drive being network-attached via WebDav?  In the past, I've had other unrelated goofy stuff happen when using WebDav.

Thanks,
Jim
If you want the 'Trash' folder excluded, remove the trailing '\\', i.e. -^[a-z]:\\data\\Trash

The results should always be consistent. So if subsequent searches are producing different results WebDAV could well be to blame.
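
The effect of the trailing separator can be checked with a short sketch. It uses Python's `re` module in place of the product's regex engine and assumes case-insensitive matching; the `matches` helper, the sample paths, and the third pattern are hypothetical additions for illustration.

```python
import re

WITH_TRAILING = r"^[a-z]:\\data\\Trash\\"     # filter with the trailing separator
WITHOUT_TRAILING = r"^[a-z]:\\data\\Trash"    # filter as suggested above
STRICT = r"^[a-z]:\\data\\Trash(\\|$)"        # assumed variant, not from the thread

def matches(pattern, folder):
    # Case-insensitive search, assumed to mirror how the filter is tested.
    return re.search(pattern, folder, re.IGNORECASE) is not None

# Hypothetical folder paths as they might be seen during folder iteration.
for folder in [r"Z:\data\Trash", r"Z:\data\Trash\old", r"Z:\data\Trashcan"]:
    print(
        folder,
        matches(WITH_TRAILING, folder),
        matches(WITHOUT_TRAILING, folder),
        matches(STRICT, folder),
    )
```

With the trailing separator, the Trash folder itself has no final backslash to match, so only its subfolders are caught. Without it, Trash is caught, but under this model so is a sibling such as Trashcan; the `(\\|$)` ending is one possible way to rule that out, assuming the engine supports alternation and end-of-string anchors.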
...