I have a client with an unindexed, uncatalogued collection of over 250,000 text documents (doc, docx, pdf) that may or may not contain Social Security numbers. The collection is stored on a file server, and we need to know which of those documents contain a Social Security number.

What I want to do is build a solution around Agent Ransack or FileLocator Pro that searches for possible Social Security numbers, returns each string found, and stores the filename, file path and found string in a database. The collection is static, so the tool would be used only once, but it would save tons of manual labor.

Will I find tools in the API that make this possible? It's my intention to use MS Visual Studio and C#.

by (30 points)

1 Answer

Yes, you could build a solution around the core search engine using the C# .NET API.

The CSSampleNET4 app in the SDK shows the basics of creating a search application.

However, the quickest method is to just use the FileLocator Pro UI, which can perform the search you want and has several reports that provide the results in multiple formats. You could also use the command-line app flpsearch.exe to perform the search and create the report.
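
Whichever route you take, the matching and storage steps are small. Below is a minimal sketch in Python, not FileLocator Pro code: it assumes the document text has already been extracted by the search tool, and the names `SSN_RE`, `find_ssn_candidates`, `init_db`, `store_hits` and the `ssn_hits` table are all hypothetical. The pattern only finds candidates shaped like an SSN (it rejects area 000, 666 and 900-999, group 00 and serial 0000), so expect some false positives to review by hand.

```python
import re
import sqlite3
from pathlib import PureWindowsPath

# Candidate SSN: 3-2-4 digits, separated consistently by "-", " " or nothing.
# Rejects area 000/666/9xx, group 00, serial 0000, and longer digit runs.
SSN_RE = re.compile(
    r"(?<!\d)(?!000|666|9\d\d)\d{3}([- ]?)(?!00)\d{2}\1(?!0000)\d{4}(?!\d)"
)


def find_ssn_candidates(text):
    """Return every substring of `text` that looks like an SSN."""
    return [m.group(0) for m in SSN_RE.finditer(text)]


def init_db(conn):
    """Create the (hypothetical) results table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ssn_hits ("
        "file_name TEXT, file_path TEXT, found_text TEXT)"
    )


def store_hits(conn, file_path, hits):
    """Insert one row per hit; assumes Windows-style paths on the file server."""
    file_name = PureWindowsPath(file_path).name
    conn.executemany(
        "INSERT INTO ssn_hits (file_name, file_path, found_text) VALUES (?, ?, ?)",
        [(file_name, file_path, hit) for hit in hits],
    )
    conn.commit()
```

The regular expression uses only lookarounds and a backreference, which .NET's Regex class also supports, so it should carry over to C# if you build on the SDK. Storing the matched strings means the results database itself holds sensitive data and needs the same protection as the source documents.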

by (82.1k points)