Manage network access to an HDInsight Cluster

Azure HDInsight is an Apache Hadoop distribution powered by the cloud. This means that it handles any amount of data, scaling from terabytes to petabytes on demand. Spin up any number of nodes at any time.

Because HDInsight is a PaaS offering, it is publicly accessible from any internet connection by default. The cluster often contains valuable customer data, and these customers have requirements for how to connect to that data securely, for example IP restrictions so that only their block of IP addresses can connect to the cluster.

In this article we are going to secure the HDInsight cluster so that only IP addresses that we specify can connect to it.

  1. Log in to Azure at http://portal.azure.com
  2. You must have a Virtual Network (vNet) to continue. If you don’t have a vNet yet, create one. This is mandatory.
  3. Click on +New
  4. Search for HDInsight
    image
  5. Select the HDInsight Cluster
    image
  6. Click on Create

    image
  7. Give the HDInsight Cluster a name
    image
  8. Select the correct Cluster Type and Version

    image

  9. Enter the correct credentials
    image
  10. Give the Storage Account and the Container a name
    image
  11. Select the correct sizing of your cluster.
    Be aware that there is a default quota of 60 cores for a Subscription. This can be increased by raising a Support Request.
    See https://azure.microsoft.com/en-us/blog/azure-limits-quotas-increase-requests/ for more information about quotas.
    image
  12. Click on Optional Configuration and select Virtual Network
  13. Select the correct vNet:
    image
  14. Select the correct Subscription
  15. Click on Create and wait 30 minutes:

    image

  16. Now that the HDInsight Cluster is created, it is accessible from the public internet. This is something many customers want to prevent, so we need to secure it.
    Since HDInsight is connected to a private network (the vNet), we can assign a Network Security Group (NSG) to it and then create Inbound Security Rules to allow (not deny) traffic.

    Microsoft requires access from some IP addresses for manageability. They provide a PowerShell script that creates the Network Security Group and gives these addresses access to the cluster. This script can be downloaded here.
    The adjusted script for the environment above can be seen here.

  17. Modify the script and run it. It creates the Network Security Group and adds the Microsoft addresses as inbound rules.
    Note: you cannot set Outbound Security Rules on the Network Security Group.
    image
  18. Add your own public IP addresses, such as those of your datacenter, home or office WiFi, as Inbound Security Rules
    image
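
The core quota mentioned in step 11 comes down to simple arithmetic: add up the cores of every node in the cluster and compare the total with the quota. The sketch below is only an illustration. The node counts and cores per node are hypothetical values, not numbers from the portal, and only the figure of 60 cores comes from the step above.

```python
# Default per-subscription core quota quoted in step 11.
CORE_QUOTA = 60


def total_cores(node_groups):
    """Sum cores over all node groups; each value is (node_count, cores_per_node)."""
    return sum(count * cores for count, cores in node_groups.values())


# Hypothetical sizing: 2 head nodes and 8 worker nodes with 4 cores each.
cluster = {"head": (2, 4), "worker": (8, 4)}

requested = total_cores(cluster)
fits = requested <= CORE_QUOTA
print(f"{requested} cores requested, quota {CORE_QUOTA}, fits: {fits}")
```

If the total is above the quota, either pick a smaller sizing or raise a Support Request before creating the cluster.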

 

Now the HDInsight cluster is only available from the addresses and ports that you specified in the Inbound Security Rules.
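
To make the effect of the Inbound Security Rules more concrete, here is a minimal Python sketch of how a list of allow rules decides whether a connection gets through. It is a simplified model, not the Azure implementation: the `InboundRule` class, the rule names and the IP ranges are all hypothetical (the addresses are documentation ranges, not real Microsoft or customer addresses), and the sketch simply treats traffic that matches no rule as denied. What it does take from Network Security Groups is that rules are evaluated in priority order, lowest number first.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class InboundRule:
    name: str
    priority: int   # lower number is evaluated first
    source: str     # CIDR block, e.g. "203.0.113.0/24"
    port: int
    allow: bool


def is_allowed(rules, source_ip, port):
    """Return the decision of the first matching rule, in priority order."""
    addr = ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port == port and addr in ip_network(rule.source):
            return rule.allow
    return False  # no rule matched: treated as denied in this sketch


# Hypothetical rules: one management address and one office block.
RULES = [
    InboundRule("allow-management", 300, "198.51.100.10/32", 443, True),
    InboundRule("allow-office", 310, "203.0.113.0/24", 443, True),
]

print(is_allowed(RULES, "203.0.113.25", 443))  # address inside the office block
print(is_allowed(RULES, "192.0.2.7", 443))     # address not covered by any rule
```

An address inside one of the allowed blocks is let through on the listed port, while any other address or port falls through to the final deny, which is the behaviour we want for the cluster.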
