1. Information Protection Hierarchy (The Classification Engine)

Microsoft Purview Information Protection relies on a structured hierarchy to identify, classify, and protect data from the ground up.

  • Level 1: Sensitive Information Types (SITs) & Classifiers: The “What.” These are the technical definitions used to identify sensitive data.
    • SITs: Patterns like credit card numbers or tax file numbers identified via Regex and keywords.
    • Trainable Classifiers: Machine learning models that recognize categories of content (e.g., contracts or source code) from example documents rather than from fixed patterns.
  • Level 2: Sensitivity Labels: The “Definition.” This layer defines what happens to the data (e.g., encryption, watermarking). Labels are the metadata tags that users see and apply.
  • Level 3: Label Policies: The “Who and How.” Policies are used to publish labels to specific users or groups.
    • Label Publishing: Makes labels visible for manual application.
    • Auto-labeling: Automatically applies labels based on the SITs or Classifiers found in the content.
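
To make the SIT idea from Level 1 concrete, the sketch below finds candidate credit card numbers with a regex, validates them with the Luhn checksum, and raises the confidence when a supporting keyword appears nearby. It is an illustration only, not Purview's implementation: the pattern, the keyword list, the 50-character window, and the names `luhn_valid` and `find_card_numbers` are assumptions made for this example.

```python
import re

# Candidate pattern (assumed): 16 digits, optionally grouped in fours
# by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")

# Supporting keywords (illustrative list) that raise confidence
# when they appear close to a match.
KEYWORDS = ("credit card", "card number", "visa", "mastercard")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str, window: int = 50) -> list[tuple[str, str]]:
    """Return (digits, confidence) pairs for checksum-valid matches."""
    results = []
    lowered = text.lower()
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if not luhn_valid(digits):
            continue  # pattern matched, but the checksum rules it out
        start = max(0, match.start() - window)
        nearby = lowered[start:match.end() + window]
        confidence = "high" if any(k in nearby for k in KEYWORDS) else "low"
        results.append((digits, confidence))
    return results
```

For example, `find_card_numbers("Card number: 4111 1111 1111 1111")` returns `[("4111111111111111", "high")]`, while a 16-digit string that fails the checksum is ignored.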

2. Taxonomy & Deployment Strategy

  • The Taxonomy: An enterprise should have a simple, universally understood taxonomy, typically four or five tiers such as Public, General/Internal, Confidential, and Highly Confidential.
  • Default Labels: Applying a default label (e.g., General/Internal) to all new emails and documents is a simple, effective way to establish a baseline classification across the tenant.
  • Mandatory Labeling: Requires users to choose a label before saving a document or sending an email, so that they actively categorize their work.
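
The interaction between a user's choice, a default label, and mandatory labeling can be sketched as a small decision function. This is a simplified model under stated assumptions, not how the Office clients are implemented: the `Label` enum and `resolve_label` are hypothetical names, and the rule that a default label is applied before the mandatory prompt is an assumption of this sketch.

```python
from enum import IntEnum
from typing import Optional


class Label(IntEnum):
    """Taxonomy tiers, ordered from least to most sensitive."""
    PUBLIC = 0
    GENERAL = 1
    CONFIDENTIAL = 2
    HIGHLY_CONFIDENTIAL = 3


def resolve_label(chosen: Optional[Label],
                  default: Optional[Label],
                  mandatory: bool) -> Optional[Label]:
    """Decide which label an item carries when it is saved or sent."""
    if chosen is not None:
        return chosen    # an explicit user choice wins
    if default is not None:
        return default   # otherwise the policy's default label is applied
    if mandatory:
        # No choice and no default: the save or send is blocked.
        raise ValueError("A label must be selected before saving or sending.")
    return None          # unlabeled content is permitted by this policy
```

With `default=Label.GENERAL`, an item the user never labels resolves to `Label.GENERAL`; with no default and `mandatory=True`, the function raises instead of letting unlabeled content through.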

3. Item-Level vs. Container-Level Labels

  • Item-Level Labels: Applied directly to a file (Word, Excel, PDF) or an email. The protection—such as encryption or watermarks—travels with the file regardless of its location.
  • Container-Level Labels: Applied to M365 Groups, Teams, or SharePoint Sites.
    • They do not automatically encrypt the files inside.
    • They control container settings: Privacy (Public vs. Private), External Guest Access, and Unmanaged Device Access (e.g., web-only access).
  • Default Library Labels: A SharePoint Document Library setting that acts as a “bridge,” automatically applying an Item-Level label to new or edited files in that specific library when they are unlabeled or carry a lower-priority label.

4. Encryption & Access Control (Azure RMS)

  • Azure Rights Management (RMS): When a label applies encryption, the document is encrypted by the Azure Rights Management service, and users must authenticate against Entra ID before they can open it.
  • Granular Permissions: Admins can define specific rights, such as View Only, Edit, Print, or Copy.
  • Co-Authoring: To allow multiple users to edit encrypted documents simultaneously in SharePoint or OneDrive, “Co-authoring for files with sensitivity labels” must be enabled at the tenant level.
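
The granular permissions described above can be pictured as a set of rights granted per user. The sketch below is a local model for illustration only; real enforcement is performed by the Rights Management service and the client applications. The `Right` enum, the `GRANTS` table, the example addresses, and `is_allowed` are all hypothetical.

```python
from enum import Flag, auto


class Right(Flag):
    """The four rights named above, modelled as combinable flags."""
    VIEW = auto()
    EDIT = auto()
    PRINT = auto()
    COPY = auto()


# Hypothetical label configuration: which rights each user receives.
GRANTS = {
    "alice@contoso.com": Right.VIEW | Right.EDIT | Right.PRINT | Right.COPY,
    "bob@contoso.com": Right.VIEW,  # view-only
}


def is_allowed(user: str, action: Right, grants: dict = GRANTS) -> bool:
    """Return True if the user has been granted the requested right."""
    granted = grants.get(user.lower(), Right(0))  # unknown users get nothing
    return action in granted
```

Here `is_allowed("bob@contoso.com", Right.PRINT)` is `False` because Bob holds only `Right.VIEW`, and a user missing from the table is denied every action.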

5. Auto-Labeling (Requires E5 / Purview Premium)

  • Client-Side Auto-Labeling: Real-time recommendations or automatic application as a user types sensitive information into Word or Outlook.
  • Service-Side Auto-Labeling: A background process that scans data at rest in SharePoint and OneDrive, and email in transit through Exchange.
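    • Constraint: Microsoft caps the number of files this process can label per tenant per day (25,000 in earlier documentation, 100,000 in more recent documentation; verify the current limit), making it unsuitable for instant remediation of massive legacy migrations.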
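
A quick calculation shows why the daily cap matters for large backlogs. The helper below is a minimal sketch; `days_to_label` is a hypothetical name, and the cap is a parameter so it can be set to whatever limit currently applies to the tenant.

```python
def days_to_label(file_count: int, daily_cap: int) -> int:
    """Minimum number of days needed to auto-label a backlog under a daily cap."""
    if file_count < 0:
        raise ValueError("file_count must not be negative")
    if daily_cap <= 0:
        raise ValueError("daily_cap must be positive")
    # Integer ceiling division: a partly used day still counts as a full day.
    return -(-file_count // daily_cap)
```

At 25,000 files per day, a backlog of 2,000,000 files needs `days_to_label(2_000_000, 25_000)`, which is 80 days at best, assuming the cap is reached every day and no new content arrives.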

6. Troubleshooting & Client Behavior

  • Built-in Labeling: Organizations should use the sensitivity labeling built into M365 Apps. The legacy AIP unified labeling add-in for Office has been retired.
  • Sync Delays: New or updated labels can take up to 24 hours to appear in desktop applications. Users can force a refresh by closing the Office apps and clearing the local cache in %localappdata%\Microsoft\Office\CLP.
  • PDF Support: M365 apps can carry a document's label and protection into a PDF when exporting or saving as PDF, and Adobe Acrobat can read and apply labels to PDF files.
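
Clearing the label cache mentioned under Sync Delays can be scripted. The sketch below is one possible approach, not an official procedure: `clear_label_cache` is a hypothetical helper, it assumes the Office apps are closed so no files are locked, and it assumes Office recreates the folder the next time it downloads policy.

```python
import shutil
from pathlib import Path


def clear_label_cache(local_appdata: Path) -> bool:
    """Delete the Microsoft/Office/CLP folder under the given LOCALAPPDATA path.

    Returns True if the folder existed and was removed, False otherwise.
    """
    cache_dir = Path(local_appdata) / "Microsoft" / "Office" / "CLP"
    if not cache_dir.is_dir():
        return False  # nothing cached, so nothing to clear
    shutil.rmtree(cache_dir)
    return True
```

On a Windows machine the argument would typically be `Path(os.environ["LOCALAPPDATA"])` (after `import os`); passing the base directory explicitly keeps the function easy to try against a scratch folder first.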

7. Essential PowerShell Cmdlets (Security & Compliance)

  • Connection: Connect-IPPSSession (part of the ExchangeOnlineManagement module)
  • Label & Policy Management:
    • Get-Label: Lists all sensitivity labels.
    • Get-LabelPolicy: Lists all published policies.
  • File Diagnostics:
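    • Unlock-SPOSensitivityLabelEncryptedFile: A SharePoint Online Management Shell cmdlet (run after Connect-SPOService, not Connect-IPPSSession) that allows a SharePoint administrator to strip encryption from a labeled file stored in SharePoint or OneDrive, for example if the original owner has left the company.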