File processing - What happens when I upload a file?

Below is a list of all the processing actions applied to your documents upon upload.

  • Virus scan - Each document is scanned and quarantined if a virus is found.

  • Archive analysis - ZIP files are analysed to detect password-protected content. This step applies to ZIP files only.

  • Timestamp extraction - Any timestamps relevant to the document are extracted and stored.

  • Page count - The page count of the document is calculated.

  • Checksum calculation - MD5 checksums are calculated for each document.

  • Conversion to PDF - Word documents, spreadsheets, slideshows, text documents and emails are converted to PDF.

  • Rasterisation - If required, images of each page of the document are generated.

  • Native text extraction - Text is extracted from the document's native format for indexing, where possible.

  • OCR - If required, Optical Character Recognition is applied to extract text from scanned or image-based documents.

  • Indexing - The extracted text and any metadata are added to a full-text index to enable searching.
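
To make two of the steps above more concrete, here is a minimal Python sketch of what a checksum calculation and a password-protection check on a ZIP file could look like. It is an illustration only, not the actual implementation used on upload. The function names `md5_checksum` and `has_encrypted_entries` are hypothetical, and the sketch assumes the uploaded file is available in memory as bytes.

```python
import hashlib
import io
import zipfile


def md5_checksum(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of the file contents."""
    return hashlib.md5(data).hexdigest()


def has_encrypted_entries(data: bytes) -> bool:
    """Return True if any entry in the ZIP archive is flagged as encrypted."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # Bit 0 of each entry's general-purpose flag marks it as encrypted.
        return any(info.flag_bits & 0x1 for info in archive.infolist())
```

A checksum like this is useful for spotting duplicate uploads or confirming that a file has not changed, since identical contents always produce the same digest. The encryption check reads only the archive's entry headers, so it does not need the password to detect that one is required.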