
Text not being extracted from ZIP files

Expected Behavior

*.zip files should be classified based on their combined content. The documents inside the archive are read as though they were a single file, and the .zip is classified as a whole.
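As a rough illustration of "read as a single file", the sketch below joins the text of every archive member into one string that could then be classified as a unit. It is not Netwrix Data Classification's implementation: the function names `combined_zip_text` and `build_sample_zip` are hypothetical, and it assumes the members are plain text, whereas the product relies on IFilters to extract text from real document formats.

```python
import io
import zipfile


def combined_zip_text(zip_bytes: bytes, encoding: str = "utf-8") -> str:
    """Return the text of all archive members joined into one string.

    Hypothetical helper: assumes every member is plain text.
    """
    parts = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            raw = archive.read(info)
            # Undecodable bytes are replaced rather than raising an error.
            parts.append(raw.decode(encoding, errors="replace"))
    return "\n".join(parts)


def build_sample_zip() -> bytes:
    """Build a small in-memory archive for demonstration (made-up content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("a.txt", "invoice 1001")
        archive.writestr("b.txt", "customer record")
    return buffer.getvalue()


if __name__ == "__main__":
    print(combined_zip_text(build_sample_zip()))
```

If no text can be extracted from the members, the combined string is empty and there is nothing to classify, which matches the symptom described under Issue.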

Issue

When you review .zip files in the Netwrix Data Classification interface, you may find that no text has been extracted from them, so they are not being classified based on their content.

Resolution

To extract text from a .zip file, Netwrix Data Classification uses IFilters. For .zip files, the corresponding IFilter pack must be installed on the NDC server (or on each server in a DQS cluster).