0 votes

Did I overlook such an option?

Such an option would not be really useful in most Boolean searches, nor in most line-by-line regex searches, BUT it would be of utmost importance in whole-file regex searches with multiple hits, where it would be tremendously useful.

For example, you could search (regex, file-wide, line span e.g. 100) for
(string1[^}]+string2)|(string2[^}]+string1),
in order to find, in some big file, all "pages" containing both strings, in any order, provided that these "pages" are separated by the character "}" (and likewise for similar use cases). To view the finds/"hits" better, you would check the "WordWrap" option.
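
As a minimal sketch outside FLP, the pattern above can be tried in Python's re module (an assumption: FLP's regex flavor may differ in details). Here foo and bar stand in for string1 and string2, and the sample text is invented for illustration.

```python
import re

# Hypothetical sample: three "pages", each terminated by "}".
text = "alpha foo ... bar omega}only foo here}bar comes first, then foo}"

# Either order; [^}]+ keeps a match from crossing a "}" page separator.
pattern = re.compile(r"(foo[^}]+bar)|(bar[^}]+foo)")

hits = [m.group(0) for m in pattern.finditer(text)]
for hit in hits:
    print(hit)
```

The second page contains only one of the two strings, so it yields no hit; each hit includes everything between the two strings, which is exactly the clutter described next.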

The visual problem now being that in every hit, you will also get all the text between the two strings, and that may be a 3- or 4-digit number of "unwanted" characters which then clutter all your screen, making it very uncomfortable to visually check every hit for its relevance to your current use case.

With an option "separate finds by blankline" though, you could do that check very fast, one of the strings being within the very first line after a blank line, the other one being in the very last line of that - now clearly distinguishable - text block.
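
To illustrate what such a display could look like (this is not an existing FLP feature, just a sketch in Python; the function name format_hits and the sample text are hypothetical):

```python
import re
import textwrap

def format_hits(text, pattern, width=40):
    """Return all matches, word-wrapped, with one blank line between hits."""
    blocks = []
    for m in re.finditer(pattern, text):
        # Collapse internal whitespace so each hit wraps as one paragraph.
        hit = " ".join(m.group(0).split())
        blocks.append(textwrap.fill(hit, width=width))
    # A blank line between blocks makes each hit visually distinct.
    return "\n\n".join(blocks)

sample = "foo a b c bar}x}bar d e f foo}"
print(format_hits(sample, r"(foo[^}]+bar)|(bar[^}]+foo)"))
```

With this layout, one search string sits at the start of each block and the other at its end, so relevance can be checked at a glance.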

Many other regex searches would equally profit from such an option, especially since lookbehind is not allowed in FLP regex, and exclusions do not work in (positive) lookaheads (EDIT: I WAS MISTAKEN HERE, SEE BELOW), which means that in many such regex search results you will get hundreds or thousands of "unwanted" characters "in-between", making it currently very difficult to distinguish one hit block from the next.

by (110 points)

2 Answers

0 votes

I've updated the Issue Tracker item 'Line Separator in Hits Tab' with your comments.

Just to correct an error in your comments, lookbehind and lookahead are supported in FLP regex, both negative and positive.

by (74.8k points)
0 votes

This would be extremely helpful indeed, thank you!

As for the lookbehind, I hastily presumed it was not allowed, since I got an error message instead of FLP starting the search for my basic (and obviously wrong) attempt (?<=}.+?)searchstring(?=.+?}) (see below). Sorry!

After reading your answer, I finally remembered that lookbehinds come, depending on the regex flavor, with all sorts of general or flavor-specific limitations, so I looked that up again.

Basics: for
(lookbehind)searchtext(lookahead)
- positive (i.e. "look..." text must be there):
(?<=lookstring)searchtext(?=lookstring)
- negative (i.e. "look..." text must NOT be there):
the "=" is to be replaced by a "!" (for logical "NOT")
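
A small sketch of these four forms in Python's re module (an assumption: FLP's flavor may behave differently in details; the sample text is invented):

```python
import re

text = "{alpha} beta {gamma}"

# Positive: the word must be preceded by "{" AND followed by "}".
positive = re.findall(r"(?<=\{)\w+(?=\})", text)

# Negative: the word must NOT be preceded by "{" and NOT followed by "}".
# \b keeps the match from starting or ending in the middle of a word.
negative = re.findall(r"(?<!\{)\b\w+\b(?!\})", text)

print(positive)
print(negative)
```

Neither the lookbehind nor the lookahead text becomes part of the hit; only the searchtext between them is returned.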

In the lookahead lookstring:
Any (!) regex is allowed here, even capturing groups (for the latter, some exceptions in Tcl regex); the restrictions apply to the lookbehind only.

But in the lookbehind lookstring:
LOTS of problems/exclusions, depending on the regex flavor: in Tcl, lookbehind is not allowed at all, it seems; in Perl/Python/etc.:
- only fixed-length strings, and other limitations,
- alternation (i.e. the "OR" "|") only if the alternatives are of the same length (e.g. th(is|at) for this OR that),
- NO quantifiers (i.e. my (?<=}.+?) was obviously wrong, as would have been (?<=}.starsymbol) etc. (I write starsymbol, since the regex star symbol would close the italics here)),
- etc., etc., so I had been overly optimistic in thinking about an alternative for my (working) code (above in bold = 2 strings with NO "}" in-between).
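
These limits can be checked quickly in Python, whose re module only accepts fixed-width lookbehinds (FLP's flavor may draw the line elsewhere; the helper name compiles is hypothetical):

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

results = {
    # Fixed-length lookbehind: accepted.
    "fixed": compiles(r"(?<=\})searchstring"),
    # Alternatives of the same length (this / that): accepted.
    "same_length_alt": compiles(r"(?<=th(is|at))x"),
    # Quantifier inside the lookbehind: rejected.
    "quantifier": compiles(r"(?<=\}.+?)searchstring"),
    # Alternatives of different lengths: rejected.
    "unequal_alt": compiles(r"(?<=this|these)x"),
}
print(results)
```

The rejected patterns fail at compile time, before any search starts, which matches getting an error message instead of a search.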

It would not have made sense anyway, since the alternative I had in mind (= 2 strings in-between 2 "}") would have produced hit blocks as extensive as those from my code above: ("}" and then anything except a "}" as a lookbehind, but not possible this way)(searchstring1)(anything except a "}")(searchstring2)(anything except a "}" and then a "}").

On the other hand, there are use cases where the "full" page / chapter / subchapter / section / etc., containing 1/2/3... search terms in combination, would be wanted, and there a simple
(}.+searchstring1.+?searchstring2.+?})|(}.+searchstring2.+?searchstring1.+?}) (more complicated but doable for 3, 4...) would work fine (with starsymbol instead of the "+" if you want to allow for "possibly nothing"). So how do you get the "}" or a similar separator character?
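
Outside FLP, the same "all terms on one page" check can be sketched without a regex at all, by splitting on the separator; this is a swapped-in technique, not the pattern above, and the function name pages_with_all and the sample text are hypothetical. It scales to 3, 4... terms without extra alternations.

```python
def pages_with_all(text, terms, sep="}"):
    """Return the pages (chunks between separators) containing every term."""
    return [page for page in text.split(sep)
            if all(term in page for term in terms)]

sample = "intro foo}foo then bar}bar then foo, plus baz}nothing}"

both = pages_with_all(sample, ["foo", "bar"])
three = pages_with_all(sample, ["foo", "bar", "baz"])

print(both)
print(three)
```

The order of the terms within a page does not matter here, since each term is tested independently.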

Since the FLP regex (as far as I have understood the help file / other answers here) works not on the original file but on a "text-only" version of it, less the command characters, the original page / chapter / section separator characters will sometimes no longer be available for the FLP regex search. In such cases it should often be possible to do a simple search-and-replace in the original application, replacing the specific separator codes with those codes plus another, otherwise unused special character, which can then be searched for in FLP regex (e.g. my "}", a "¬" or any other character which doesn't occur elsewhere in the text).
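
For files that can be processed as plain text, that search-and-replace step could be sketched in Python as follows. The form feed as separator code and the function name add_marker are assumptions for illustration; the real separator depends on the original application.

```python
def add_marker(text, separator="\f", marker="¬"):
    """Append a searchable marker character after every separator code."""
    # The marker is only useful if it does not occur anywhere else.
    if marker in text:
        raise ValueError("marker already occurs in the text; pick another")
    return text.replace(separator, separator + marker)

marked = add_marker("page one\fpage two\f")
print(repr(marked))
```

The original separator codes stay in place, so the file still works in the original application, while the added marker survives as ordinary text.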

(And if that is not successful: I got the "}" by opening a copy (!) of one of my original files in a hex editor (there are free ones available), where I was able to identify the "}" (which is not visible in the original application) and which is then recognized by the FLP regex (i.e. not stripped out). This may sound quite exotic, but it's successful in the end, and I know of no text or other application which provides a search for a combination of several search terms within the same subdivision, such as page, chapter...)

by (110 points)
...