
File processing - What happens when I upload a file?

Below is a list of all the processing actions applied to your documents upon upload.

  1. Virus scan - Each document is scanned, and quarantined if a virus is found.
  2. Archive analysis - ZIP files are analysed to detect password-protected content.
  3. Timestamp extraction - Any timestamps relevant to the document are extracted and stored.
  4. Page count - The page count of the document is calculated.
  5. Checksum calculation - MD5 checksums are calculated for each document.
  6. Conversion to PDF - Word documents, spreadsheets, slideshows, text documents and emails are converted to PDF.
  7. Rasterisation - If required, images of each page of the document are generated.
  8. Native text extraction - Text is extracted from the document's native format for indexing, where possible.
  9. OCR - Optical Character Recognition is applied to extract text from scanned or imaged documents, if required.
  10. Indexing - The extracted text and any metadata are added to a full-text index to enable searching.
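
To give a feel for what two of these steps involve, here is a minimal Python sketch of archive analysis (step 2) and checksum calculation (step 5). It is an illustration only, not the code the platform actually runs: the function names `md5_checksum` and `has_encrypted_entries` are hypothetical, and it assumes the uploaded file is already available in memory as bytes.

```python
import hashlib
import io
import zipfile


def md5_checksum(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a document's bytes."""
    return hashlib.md5(data).hexdigest()


def has_encrypted_entries(zip_bytes: bytes) -> bool:
    """Return True if any entry in the ZIP archive is flagged as encrypted."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Bit 0 of each entry's general-purpose flag marks it as encrypted,
        # so no password is needed to detect protected content.
        return any(info.flag_bits & 0x1 for info in archive.infolist())
```

Two uploads with identical content produce the same checksum, which is one common way such a value is used to spot duplicates. Reading the encryption flag only inspects the archive's directory, so it works without attempting to decrypt anything.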