How to Manage Traffic Between Databricks and Private Endpoints in Hub and Spoke Architecture?

Asked By CuriousCoder99 On

I'm setting up workloads in a hub-and-spoke architecture in Azure. The hub VNet holds an Azure Firewall and the Private Endpoints for my storage accounts, and my Databricks workspaces sit in the spoke VNets. The hub and spokes are connected with VNet peering. I can reach the storage accounts from Databricks, but I want to grant access to only some of them.

While researching, I found that traffic between Databricks and the storage account Private Endpoints does not pass through the firewall by default. To change that, I enabled network policies on the Private Endpoint subnet, created a route that sends the traffic through the firewall, and added a firewall rule that allows specific Private Endpoint IP addresses and denies all other Databricks traffic. After these changes, I can't reach any storage account from Databricks, not even through the Private Endpoint IPs that my Azure Firewall policy allows. I need help resolving this issue!

5 Answers

Answered By NetworkNinja14 On

Check if the Private Endpoints for the storage accounts are in the same VNet. Default VNet routing will usually send traffic from Databricks straight to the Private Endpoint's subnet, provided the VNets and peering are configured correctly.

SecurityGuru23 -

The Private Endpoints for the storage accounts and the firewall are in the same VNet (hub) but in different subnets, while Databricks is in a spoke VNet.
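To illustrate the default routing NetworkNinja14 mentions, here is a minimal Python sketch of longest-prefix-match route selection, the rule Azure uses to pick a route. It assumes the Private Endpoint appears as a /32 system route in the peered spoke, which is why a broad route to the firewall does not catch that traffic on its own. All addresses, the `Route` type, and the `select_route` helper are hypothetical and only model the selection logic; they are not an Azure API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str    # destination CIDR
    next_hop: str  # label for the next hop type
    source: str    # "user" for a user-defined route, "system" otherwise


def select_route(routes, destination):
    """Return the route with the longest matching prefix.

    When two routes have the same prefix length, the user-defined one wins.
    """
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r.prefix).prefixlen, r.source == "user"),
    )


# Hypothetical addressing: hub VNet 10.0.0.0/16, Private Endpoint at 10.0.2.4.
PE_IP = "10.0.2.4"
default_routes = [
    Route("10.0.0.0/16", "VNetPeering", "system"),
    Route("10.0.2.4/32", "InterfaceEndpoint", "system"),
    Route("0.0.0.0/0", "VirtualAppliance", "user"),  # default route to the firewall
]

# The /32 system route is more specific than the default route to the firewall.
print(select_route(default_routes, PE_IP).next_hop)  # prints: InterfaceEndpoint

# A user-defined /32 route for the same address ties on length and takes precedence.
with_override = default_routes + [Route("10.0.2.4/32", "VirtualAppliance", "user")]
print(select_route(with_override, PE_IP).next_hop)  # prints: VirtualAppliance
```

The sketch only covers route selection on the Databricks side. It does not model how network policies on the Private Endpoint subnet change which routes are honored, so the effective routes on a Databricks node's network interface are still the thing to check.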

Answered By TechWizard82 On

You might want to consider enabling service endpoints on your Databricks subnets if they aren't enabled already. This way, they'll respect your Private Endpoint configuration. For best performance, essential Private Endpoints should sit in the same VNet as Databricks rather than behind the firewall. We usually put critical endpoints, such as the Databricks storage Private Endpoints, directly in the workload VNet. Remember to secure them with NSGs. Also keep in mind that Databricks can use a lot of bandwidth, so monitor it to avoid network slowdowns.

DataDude45 -

The reasoning for placing storage endpoints in the hub is valid since you have multiple Databricks VNets in different spokes linking to the same storage account.

Answered By RoutingExpert10 On

What routes do you have on the Databricks subnets? Make sure the private IP range is added to the firewall and that the firewall allows connections specifically from the Databricks subnets. You might also need routes on the Private Endpoint subnet for return traffic.

CuriousCoder99 -

I have those routes already set. Just trying to figure out if I need return traffic routes on the PE subnet too.
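The kind of allow rule discussed in this answer can be sketched as a simple check: permit traffic only when the source is in a Databricks subnet and the destination is one of the approved Private Endpoint IPs. This is a rough Python model, not Azure Firewall configuration. The subnet ranges, the endpoint IP, and the `is_allowed` helper are hypothetical, and port 443 is assumed because storage is normally reached over HTTPS.

```python
import ipaddress

# Hypothetical values: replace with the real Databricks subnets and endpoint IPs.
DATABRICKS_SUBNETS = [
    ipaddress.ip_network("10.1.0.0/24"),
    ipaddress.ip_network("10.1.1.0/24"),
]
ALLOWED_PE_IPS = {ipaddress.ip_address("10.0.2.4")}
ALLOWED_PORT = 443  # assumption: storage accessed over HTTPS


def is_allowed(src, dst, port):
    """Allow only Databricks-subnet sources talking to approved endpoint IPs."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    from_databricks = any(src_ip in subnet for subnet in DATABRICKS_SUBNETS)
    return from_databricks and dst_ip in ALLOWED_PE_IPS and port == ALLOWED_PORT


print(is_allowed("10.1.0.5", "10.0.2.4", 443))  # prints: True
print(is_allowed("10.1.0.5", "10.0.2.5", 443))  # prints: False (endpoint not approved)
print(is_allowed("10.2.0.5", "10.0.2.4", 443))  # prints: False (source not Databricks)
```

If a rule of this shape is in place and allowed traffic still fails, the rule logic is probably not the cause, and the firewall logs can show whether the packets reach the firewall at all and whether replies come back along the same path.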

Answered By ConsultationKing99 On

I have extensive experience with this kind of architecture; if you're interested in a consultation, let me know!

CuriousCoder99 -

Thanks for the offer, but I’m not looking for any paid consultations right now.

Answered By CloudSavvy77 On

It's usually best not to route Private Endpoints through the firewall. Instead, use DNS linking or Storage Account network rules to manage access without that extra complexity.
