Azure Databricks and Microsoft Fabric Integration

Samarendra Panda
2 min readJun 26, 2023

Since the Delta table format is open source and both Azure Databricks and Microsoft Fabric support it natively, integrating the two services is straightforward. This blog post explores two options for ingesting data from Azure Databricks and performing near real-time data analysis and visualization using Power BI's Direct Lake storage mode with Microsoft Fabric.

In this blog post, we discuss how to ingest, transform, and visualize data using Microsoft Fabric.

Disclaimer: At the time of writing, Microsoft Fabric is in Public Preview. The information discussed here reflects the current state of the product; future updates may introduce features that simplify these use cases.

The video below showcases a comprehensive demonstration of the use cases discussed in this blog post.

Option 1 — using ADLS Gen2 as storage.

Azure Databricks & Microsoft Fabric Integration using ADLS Gen2 as storage.

In this option, we keep the storage in our own Azure environment, and we create a shortcut from Fabric to access the data as an external table.

References:

  1. How to attach ADLS Gen2 in Microsoft Fabric Lakehouse — https://learn.microsoft.com/en-us/fabric/data-engineering/get-started-shortcuts
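With this option, Databricks writes Delta tables to a container in your own ADLS Gen2 account, and a Fabric Lakehouse shortcut surfaces them. A minimal sketch of the write side is below; the storage account, container, and folder names are placeholders, and the cluster is assumed to already have credentials for the storage account:

```python
# Sketch: write a Delta table from Azure Databricks to ADLS Gen2 so a
# Fabric Lakehouse shortcut can expose it. Names below are placeholders.

def adls_path(container: str, account: str, folder: str) -> str:
    """Build the abfss URI for an ADLS Gen2 location."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{folder}"

target = adls_path("data", "mystorageacct", "delta/sales")

# In a Databricks notebook (cluster access to the account is assumed):
# df.write.format("delta").mode("append").save(target)
#
# In Fabric, create a Lakehouse shortcut pointing at delta/sales; the
# Delta table then appears in the Lakehouse without copying any data.
```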

Option 2 — using OneLake as Storage.

Azure Databricks & Microsoft Fabric Integration using OneLake as Storage

What is OneLake? OneLake is a single, unified data lake that comes with each Fabric tenant (one per organization). Each workspace created within Fabric functions like a container in an Azure storage account. I/O is optimized when working with OneLake and the Fabric engines, since OneLake is the native storage for those compute engines and network latency is avoided.

Why did we consider a staging workspace? At present, the workspace is the primary security boundary for data within OneLake. If we granted the service principal access to the analytical workspace, it could potentially modify or delete unintended data. To prevent this, we created a staging workspace where the service principal has full access. The analytical workspace then exposes the tables through a shortcut to OneLake, preserving data integrity and security.

How to write data from Azure Databricks to Microsoft Fabric?

  1. Create a service principal in the Azure AD tenant associated with Fabric and grant it Contributor access on the staging workspace.
  2. Create a mount point or external location using the OneLake path ‘abfss://myWorkspace@onelake.dfs.fabric.microsoft.com/myStagingLakehouse.lakehouse/Files/’, then write the data into OneLake.
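The two steps above can be sketched as follows. The path-building helper is runnable as-is; the Spark configuration and write call are shown as comments because they assume a Databricks cluster and a service principal with Contributor access on the staging workspace (names are placeholders from the example path):

```python
# Sketch: writing a Delta table from Azure Databricks into a Fabric
# OneLake staging Lakehouse. Workspace/lakehouse names are placeholders.

def onelake_path(workspace: str, lakehouse: str, folder: str = "Files") -> str:
    """Build the abfss URI for a OneLake Lakehouse location."""
    return (f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
            f"{lakehouse}.lakehouse/{folder}")

staging = onelake_path("myWorkspace", "myStagingLakehouse", "Tables/sales")

# On the Databricks cluster, authenticate as the service principal via
# OAuth (client id/secret values are assumptions, set via secret scopes):
#
# spark.conf.set("fs.azure.account.auth.type", "OAuth")
# spark.conf.set("fs.azure.account.oauth.provider.type",
#     "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
#
# df.write.format("delta").mode("overwrite").save(staging)
```

The analytical workspace then reads these tables through a shortcut to the staging Lakehouse, so the service principal never touches the analytical workspace directly.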

We can now use Power BI in Direct Lake mode to query the data in near real time.

Hope this helps!
