Month: June 2023

Design a Data Auditing Strategy – Keeping Data Safe and Secure

When you audit something, you analyze it and gather data about it from the results. The findings often lead to actions that resolve inefficient or incorrect scenarios. What you analyze, what you look for, and what kind of analysis you perform depend on the requirements of the object being audited. The data management perspective (refer to Figure 5.41) includes disciplines such as quality, governance, and security. Each of those is a good example of a scenario to address when creating a data auditing strategy. From a data quality perspective, you have been exposed to cleansing, deduplicating, and handling missing data using the MAR, MCAR, and MNAR principles, as discussed in Chapter 5, “Transform, Manage, and Prepare Data,” and Chapter 6, “Create and Manage Batch Processing and Pipelines.” This chapter focuses on the governance and security of data and how you can design and implement strategies around those topics.

Governance encompasses a wide range of scenarios. You can optimize the scope of governance by identifying what is important to you, your business, and your customers. The necessary aspects of data governance include maintaining an inventory of data storage, enforcing policies, and knowing who is accessing what data and how often. The Azure platform provides products to achieve these aspects of data governance (refer to Figure 1.10). Microsoft Purview, for example, is used to discover and catalog your cloud‐based and on‐premises data estate. Azure Policy gives administrators the ability to control who can provision cloud resources and how they are provisioned, with Azure Blueprints helping to enforce that compliance. Compliance is a significant area of focus concerning data privacy, especially when it comes to PII, its physical location, and how long it can be persisted before purging. In addition to those products, you can find auditing capabilities built into products like Azure Synapse Analytics and Azure Databricks. When auditing is enabled on those two products specifically, failed and successful login attempts, SQL queries, and stored procedures are logged by default. The audit logs are stored in a Log Analytics workspace for analysis, and alerts can be configured in Azure Monitor when certain behaviors or activities are recognized. When enabled, auditing is applied across the entire workspace and can be extended to log any action performed that affects the workspace.
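To make the idea of analyzing audit logs and raising alerts more concrete, here is a minimal Python sketch. It does not call any Azure service or reproduce a real Log Analytics schema. The AuditRecord type, its fields, the failed_login_alerts function, and the threshold value are all hypothetical names chosen for illustration. The sketch counts failed login attempts per principal and reports those that reach a threshold, which is the kind of behavior an alert rule might look for.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical, simplified audit entry (not a real Azure log schema)."""
    principal: str   # who performed the action
    action: str      # for example "LOGIN" or "SQL_QUERY"
    succeeded: bool  # whether the action succeeded


def failed_login_alerts(records, threshold=3):
    """Return {principal: failure_count} for principals whose number of
    failed logins is greater than or equal to the threshold."""
    failures = Counter(
        r.principal
        for r in records
        if r.action == "LOGIN" and not r.succeeded
    )
    return {p: n for p, n in failures.items() if n >= threshold}


# Made-up sample data, for illustration only.
sample_records = [
    AuditRecord("alice", "LOGIN", True),
    AuditRecord("bob", "LOGIN", False),
    AuditRecord("bob", "LOGIN", False),
    AuditRecord("bob", "LOGIN", False),
    AuditRecord("carol", "LOGIN", False),
    AuditRecord("carol", "SQL_QUERY", True),
]

if __name__ == "__main__":
    print(failed_login_alerts(sample_records, threshold=3))
```

In a real deployment the records would come from the audit logs themselves and the threshold would be tuned to your environment. The point of the sketch is only that an audit strategy pairs what is logged with a rule for what counts as noteworthy.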

Microsoft Azure provides policy guidelines for many compliance standards, including ISO, GDPR, PCI DSS, SOX, HIPAA, and FISMA, to name just a few of the most common standards. From a security perspective, you have seen the layered approach (refer to Figure 8.1) and have learned about some of the information protection layer features, with details about other layers coming later. Data sensitivity levels, RBAC, data encryption, Log Analytics, and Azure Monitor are all tools for protecting, securing, and monitoring your data hosted on the Azure platform.

Microsoft Purview

Microsoft Purview is especially useful for automatically discovering, classifying, and mapping your data estate. You can use it to catalog your data across multiple cloud providers and on‐premises datastores. You can also use it to discover, monitor, and enforce policies, and classify sensitive data types. Purview consists of four components: a data map, a data catalog, data estate insights, and data sharing. A data map graphically displays your datastores along with their relationships across your data estate. A data catalog provides the means for browsing your data assets, which is helpful with data discovery and classification. Data estate insights present an overview of all your data resources and are helpful for discovering where your data is and what kind of data you have. Finally, data sharing provides the necessary features to securely share your data internally and with business customers. To get some hands‐on experience with Microsoft Purview, complete Exercise 8.2, where you will provision a Microsoft Purview account.
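As a rough illustration of what classifying sensitive data types can mean in practice, the following Python sketch labels a column of values using simple patterns. This is not how Microsoft Purview works internally and it uses no Purview API. The PATTERNS table, the classify_column function, and the min_match_ratio parameter are assumptions invented for this example, and the patterns are deliberately simplistic.

```python
import re

# Hypothetical classification rules, deliberately simplistic.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def classify_column(values, min_match_ratio=0.8):
    """Return the labels whose pattern fully matches at least
    min_match_ratio of the non-empty values in the column."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return []
    labels = []
    for label, pattern in PATTERNS.items():
        matches = sum(1 for v in non_empty if pattern.fullmatch(v))
        if matches / len(non_empty) >= min_match_ratio:
            labels.append(label)
    return labels


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(classify_column(["a@example.com", "b@example.org", ""]))
    print(classify_column(["123-45-6789", "987-65-4321"]))
```

Labels produced this way could then feed policy decisions, such as which columns need masking or restricted access. A production classifier would need far more robust rules than these two patterns.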