Setting up a firewall to secure a Hadoop cluster's network with Shorewall

Shorewall is a tool for configuring Linux's built-in iptables in an easy and understandable way. Assume a basic setup: LAN | Firewall with proxy server | Internet. A secure setup is to:

  • ACCEPT HTTP(80) and HTTPS(443) from LAN to NET
  • ACCEPT special service ports from specific LAN hosts to NET (like e-banking)
  • ACCEPT only the needed FW services from LAN to FW: SSH(22) and MAIL(25,443,993,…) (if the mail server is on the FW)
  • ACCEPT only the needed FW services from NET to FW: SSH(22) with IP restriction (and SSH key) and MAIL(25,443,993,…) (if the mail server is on the FW)
    • SSH port can be changed
  • LOC-to-LOC connections do not pass through the FW and cannot be governed by it, so they are all allowed
  • REJECT any incoming connection (other than above) to LAN or FW

Shorewall configuration - with some additional examples on the syntax and rules:

# File: zones

# http://www.shorewall.net/manpages/shorewall-zones.html

#

###########################################################

#ZONE	TYPE		OPTIONS		IN			OUT

#					OPTIONS			OPTIONS

net	ipv4

loc	ipv4

fw	firewall



#-------------------------------------------------------

# File: masq

# http://www.shorewall.net/manpages/shorewall-masq.html

#

############################################################

#INTERFACE		SOURCE		ADDRESS		PROTO	PORT(S)	IPSEC	MARK

eth0    eth1



#------------------------------------------------------------------------------

# File: interfaces

# http://www.shorewall.net/manpages/shorewall-interfaces.html

#

############################################################

#ZONE	INTERFACE	BROADCAST	OPTIONS

net	eth0	detect

loc	eth1	detect   routeback



#------------------------------------------------------------------------------

# File: policy

# http://www.shorewall.net/manpages/shorewall-policy.html

#

############################################################

#SOURCE		DEST		POLICY		LOG		LIMIT:BURST

#						LEVEL

loc	net	REJECT

loc	fw	REJECT 

fw	loc	ACCEPT

fw	net	ACCEPT

net	all	DROP	info

all	all	REJECT	info



#------------------------------------------------------------------------------

# File: rules

# http://www.shorewall.net/manpages/shorewall-rules.html

#

#############################################################

#ACTION		SOURCE		DEST		PROTO	DEST	SOURCE	ORIGINAL	RATE	USER/	MARK

#							PORT	PORT(S)	DEST		LIMIT	GROUP

# Proxy server - exception on redirect for a server machine

REDIRECT        loc     3128    tcp     80    - !192.168.0.10

ACCEPT 		loc 	net	tcp	443



# Allow SMTP (25), HTTPS (443) and POP3S (995) from net to fw

ACCEPT     net	fw     tcp   25,443,995	-



AllowFTP   loc  net



# AllowAndroid  loc  net

ACCEPT     loc  net    tcp   5222,5228

ACCEPT     loc  net    udp   5222,5228



#AllowPOP, AllowIMAP loc net

ACCEPT     loc  net    tcp   110,143

ACCEPT     loc  net    udp   110,143



# Windows networking: NetBIOS (137-139), SMB (445) and DNS (53)

ACCEPT     loc  net    udp   137,138,53

ACCEPT     loc  net    tcp   137,138,139,53

ACCEPT     loc  net    tcp   445



# loc:192.168.0.3 to have limitless connection to net

ACCEPT	loc:192.168.0.3	net	all
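The rules file above does not contain the SSH access to the FW that the list at the start describes. A minimal sketch of what those rules could look like follows; the address 203.0.113.5 is a placeholder from the documentation range, not a value from this setup, and port 22 assumes the SSH port has not been changed.

```
# File: rules (additional lines, sketch only)

# SSH to the firewall from the LAN
ACCEPT	loc	fw	tcp	22

# SSH to the firewall from the Internet, restricted to one source address
# (203.0.113.5 is a hypothetical administrator address)
ACCEPT	net:203.0.113.5	fw	tcp	22
```

Any other connection from net to fw still falls through to the "net all DROP" policy.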

Assume we have a Hadoop cluster that needs a secure firewall. A secure setup is to:

  • Use the FW as a jumpbox machine that hides all internal network components
  • Use the LAN as the cluster network
  • Run all internal nodes without firewalls (if internal nodes can be accessed from the outside, the setup is different: all nodes have firewalls enabled with a strict access policy)
  • ACCEPT HTTP(80) and HTTPS(443) from LAN to NET for repository updates
  • ACCEPT only the needed FW services from NET to FW - SSH(22) with IP restriction (and SSH key)
    • SSH port can be changed
  • REJECT any incoming connection to LAN or FW
  • Access all Hadoop services, Ambari, Hue, etc. by using SSH tunneling to Jumpbox (FW)
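The list above could be expressed in Shorewall roughly as follows. This is a sketch under assumptions, not a tested cluster configuration: it reuses the zones, interfaces, masq and policy files from the first setup unchanged (eth0 facing the Internet, eth1 facing the cluster) and replaces only the rules file. The address 203.0.113.5 is a placeholder for the administrator's address.

```
# File: rules (Hadoop jumpbox sketch)

#ACTION		SOURCE		DEST		PROTO	DEST PORT

# Repository updates from the cluster nodes
ACCEPT		loc		net		tcp	80,443

# Assumption: uncomment if the nodes resolve names through an external DNS server
#ACCEPT		loc		net		udp	53

# SSH to the jumpbox only from one administrative address (hypothetical)
ACCEPT		net:203.0.113.5	fw	tcp	22
```

Everything else from net is handled by the "net all DROP" policy, and the "fw loc ACCEPT" policy lets the jumpbox reach the internal nodes, which is what SSH tunneling relies on. For example, Ambari (default port 8080) could be reached with a tunnel such as ssh -L 8080:<ambari-host>:8080 <user>@<jumpbox> and then opened at http://localhost:8080 on the administrator's machine.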

Large Hadoop deployments with co-located racks need thorough planning of the network topology and of the security settings.