I have just been through the headaches of getting this set up and working, so I thought I would share a few notes and tips that I have come across on my way.
I am not saying that this is a complete setup guide, or that it contains every step that is needed to make the solution work. It is probably far from it. However, I do hope that it points someone else in the right direction.
It is worthwhile gaining an understanding of Kerberos and how it actually works. There are a couple of guides on Kerberos on the web. I found this guide helped explain the process for me: http://www.roguelynn.com/words/explain-like-im-5-kerberos/. There are plenty of others though.
There is a recent NetApp TR that covers this setup, and if you read it very carefully, then it does contain all of the information that you should need to get this working. The problem with the TR is that it is very detailed and covers a wide range of setups. My advice is to print the document, and read it at least twice highlighting all of the parts that you believe are relevant to your setup. TR-4073 can be found here: http://www.netapp.com/us/media/tr-4073.pdf
If you are coming at this having previously set up Kerberos on a DoT 8.2 or older system then you will notice that a lot of the Kerberos commands have moved, and I think nearly everything now resides within the nfs Kerberos context from the command line.
My Setup
- Windows 2012 R2 domain controllers, running in Windows 2008 domain functional level
- NetApp Data ONTAP 8.3, Cluster Mode
- Ubuntu 12.04 and Ubuntu 14.04 clients, which are already bound to the AD domain and can be logged on to using domain credentials
- All devices on the same subnet, with no firewalls in place
The guide here, which uses AES128 for the encryption mode, requires DoT 8.3, because support for AES128 and AES256 encryption was added in this version. If you are using an older version then you will need to use either DES or 3DES encryption, which will require modification of your domain controller and is not covered at all below.
I have not managed to get AES256 to work. Although all of the items in the key exchange supported it, the NetApp never managed to see the supplied Kerberos tickets as valid. As I was aiming for any improvement over DES, I was happy to settle for AES 128 and did not continue to spend time investigating the issues with AES256. If anyone happens to get it to work and would like to send me a pointer on what I have missed then it would be much appreciated.
So, on to the details:
- Setting Up the Domain Controller
No changes had to be made to the Windows DC. This is only because we were using AES encryption which Windows DCs have enabled by default. In this case the DC is also the authoritative DNS server for the domain with both forward and reverse lookup zones configured.
- Define a Kerberos Realm on the SVM
In 8.3, this can be completed in the nfs kerberos realm context at the command line. There is quite a bit of repetition of the server IP address in the definition.
cluster::> nfs kerberos realm create -realm TEST.DOMAIN.CO.UK -vserver svm-nas -kdc-vendor Microsoft -kdc-ip 192.168.10.21 -adserver-name ad1.test.domain.co.uk -adserver-ip 192.168.10.21 -adminserver-ip 192.168.10.21 -passwordserver-ip 192.168.10.21
Verify that the realm is created
cluster::> nfs kerberos realm show
         Kerberos                 Active Directory KDC        KDC
Vserver  Realm                    Server           Vendor     IP Address
-------- ------------------------ ---------------- ---------- -----------------
svm_nas_03
         TEST.DOMAIN.CO.UK        ad1.test.domain.co.uk
                                                   Microsoft  192.168.10.21
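Because the same domain controller address appears four times in the realm command, it is easy to mistype one of them. One way around that is to assemble the command from a single value. The following is only a rough Python sketch: the helper name realm_create_command is made up for illustration, it only uses the flags already shown in the command above, and it assumes that one domain controller fills every server role, as it does in my setup.

```python
def realm_create_command(realm, vserver, ad_name, ad_ip, vendor="Microsoft"):
    """Build the 'nfs kerberos realm create' command used in this step.

    Hypothetical helper, not a NetApp tool. Assumes a single domain
    controller acts as KDC, AD server, admin server and password server.
    """
    options = [
        ("-realm", realm),
        ("-vserver", vserver),
        ("-kdc-vendor", vendor),
        ("-kdc-ip", ad_ip),
        ("-adserver-name", ad_name),
        ("-adserver-ip", ad_ip),
        ("-adminserver-ip", ad_ip),
        ("-passwordserver-ip", ad_ip),
    ]
    return "nfs kerberos realm create " + " ".join(
        f"{flag} {value}" for flag, value in options
    )


# Values taken from the example realm in this post.
print(realm_create_command(
    "TEST.DOMAIN.CO.UK", "svm-nas", "ad1.test.domain.co.uk", "192.168.10.21"
))
```

The printed line can then be pasted at the cluster prompt.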
- Bind the SVM interface to the Kerberos realm
Now we need to bind this SVM interface to the Kerberos realm. This will create an object in Active Directory for NFS. This object will contain the Service Principal Names for the SVM.
cluster::*> nfs kerberos interface enable -vserver svm-nas -lif svm-nas-data -spn nfs/svm-nas.test.domain.co.uk@TEST.DOMAIN.CO.UK
Once the command is run, open up Active Directory Users and Computers, look in the Computers container and check that a new computer object has been created. There should be an object with the name NFS-SVM-NAS.
You can also verify that the object has been created with the correct SPNs by querying the domain for the SPNs that are listed against an object. Run the following command from an elevated command prompt:
setspn.exe -L NFS-SVM-NAS
The command should return output similar to this.
C:\>setspn -L NFS-SVM-NAS
Registered ServicePrincipalNames for CN=NFS-SVM-NAS,CN=Computers,DC=test,DC=domain,DC=co,DC=uk:
nfs/nfs-svm-nas.test.domain.co.uk
nfs/NFS-SVM-NAS
HOST/nfs-svm-nas.test.domain.co.uk
HOST/NFS-SVM-NAS
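If you want to script this check rather than read the output by eye, the listed SPNs can be compared against the ones you expect. This is a minimal Python sketch and not part of any NetApp or Microsoft tooling: the function names registered_spns and missing_spns are invented here, and the parsing assumes the output layout shown above (a header line followed by one SPN per line).

```python
def registered_spns(setspn_output):
    """Return the SPNs listed in the text printed by 'setspn -L <object>'."""
    spns = []
    seen_header = False
    for line in setspn_output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Registered ServicePrincipalNames"):
            seen_header = True
            continue
        # Every SPN has the form service/host, so require a slash.
        if seen_header and "/" in line:
            spns.append(line)
    return spns


def missing_spns(setspn_output, expected):
    """Return the expected SPNs that are absent (case-insensitive compare)."""
    found = {spn.lower() for spn in registered_spns(setspn_output)}
    return [spn for spn in expected if spn.lower() not in found]


# Sample text copied from the output above.
sample = """Registered ServicePrincipalNames for CN=NFS-SVM-NAS,CN=Computers,DC=test,DC=domain,DC=co,DC=uk:
nfs/nfs-svm-nas.test.domain.co.uk
nfs/NFS-SVM-NAS
HOST/nfs-svm-nas.test.domain.co.uk
HOST/NFS-SVM-NAS
"""
print(missing_spns(sample, ["nfs/nfs-svm-nas.test.domain.co.uk"]))
```

An empty list means that every SPN you asked for is registered against the object.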
- Restrict the accepted Encryption Types to just use AES on the SVM
If you are not making any changes to the Windows Domain Controller, then DES and 3DES encryption will not be supported by the domain controller. For tidiness I prefer to disable these options on the SVM so that nothing can even try to use them. Any clients that do would get an Access Denied error when trying to mount.
cluster::> nfs server modify -vserver * -permitted-enc-types aes-128,aes-256
This command modifies every SVM on the cluster. To change a single SVM, specify its name in place of the wildcard.
- Setting up Kerberos – Unix Name Mapping
This setup will attempt to authenticate the machine using the machine SPN. This means that there needs to be a name-mapping to accept that connection and turn it into a username that is valid for authentication purposes for a volume. By the time that the name mapping kicks in, the authentication process has been completed. The name-mapping pattern uses regular expressions, which are always fun!
The name mapping rule should be as specific as you can make it. It could match just your realm, or it could match part of the FQDN as well as the realm.
In my case, I have multiple FQDNs for clients, so the match I set up was based on matching the realm only.
cluster::*> vserver name-mapping create -vserver svm-nas -direction krb-unix -position 1 -pattern (.+)@TEST.DOMAIN.CO.UK -replacement nfs
The name mapping is applied per SVM. To see all of the mappings run:
cluster::*> vserver name-mapping show
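Before committing a pattern to the SVM, it can help to try it against a few principals offline. The Python sketch below does that for the rule above. Treat it as an approximation: Python's re module is not the regular expression engine that the filer uses, the helper name map_principal is invented for illustration, and I am assuming that the whole principal has to match the pattern.

```python
import re

# Pattern and replacement copied from the krb-unix rule above.
# The unescaped dots match any character, exactly as in that rule.
PATTERN = r"(.+)@TEST.DOMAIN.CO.UK"
REPLACEMENT = "nfs"


def map_principal(principal):
    """Return the UNIX user for a Kerberos principal, or None if no match.

    Assumption: the entire principal must match the pattern.
    """
    if re.fullmatch(PATTERN, principal):
        return REPLACEMENT
    return None


for principal in (
    "ubuntu.test.domain.co.uk@TEST.DOMAIN.CO.UK",  # client principal from this setup
    "ubuntu.test.domain.co.uk@OTHER.EXAMPLE",      # hypothetical foreign realm
):
    print(principal, "->", map_principal(principal))
```

The first principal maps to nfs and the second is rejected. The authoritative test is still the name-mapping check on the cluster itself.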
- Setting up the NFS User account
A user needs to be created which corresponds with the name mapping rule that you have defined in the previous step. If no user is defined, then the mapping will work but access will still be denied. To create a user:
cluster::> vserver services name-service unix-user create -vserver svm-nas -user nfs -id 500 -primary-gid 0
- Verify that Forward and Reverse DNS Lookups are working
This is important to get right. Kerberos requires that every host involved can be resolved both forwards (name to IP address) and in reverse (IP address to name). Using your DNS server, check that you can perform an nslookup of the registered name of the SVM that you specified in step 3, when binding the SVM interface to the Kerberos realm. Ping is not sufficient, as it can use cached results and may not actually query the DNS server.
All clients will also need to have fully resolvable DNS entries. Verify that everything is being registered correctly and can be resolved. If there are any errors then they will need to be corrected before continuing as mounts will fail.
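With more than a handful of clients, checking each name by hand gets tedious. Here is a small Python sketch of a forward and reverse check. The function name check_forward_reverse is my own invention. Note that it uses the system resolver through the socket module, which may consult /etc/hosts or a local cache, so it complements an nslookup against the DNS server rather than replacing it.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back, and compare.

    Returns (ip, reverse_name, consistent). The resolver functions are
    parameters so the check can be exercised without a live network.
    """
    ip = forward(hostname)
    rev_name, aliases, _addresses = reverse(ip)
    names = {rev_name.lower()} | {alias.lower() for alias in aliases}
    return ip, rev_name, hostname.lower() in names


# Example call, left commented out because it needs working DNS:
# print(check_forward_reverse("svm-nas.test.domain.co.uk"))
```

A result whose last element is False means that the reverse record does not point back at the name you started with, which is the kind of mismatch that needs correcting before you try to mount.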
- Check the configuration of the accepted and default ticket types in the Kerberos configuration on the client.
The clients need to know that they can use the AES128 encryption method, and also that this method takes a higher priority than other suites, such as ArcFour or DES. Check the entries that are listed in the /etc/krb5.conf file. The settings that I found to work for me have been included below. An important note is that with DoT 8.3, there is no longer a requirement to enable the Allow Weak Encryption option. AES is considered a strong encryption method.
[libdefaults]
default_realm = TEST.DOMAIN.CO.UK
ticket_lifetime = 7d
default_tgs_enctypes = aes128-cts-hmac-sha1-96 arcfour-hmac-md5 aes256-cts-hmac-sha1-96 des-cbc-crc des-cbc-md5 des3-hmac-sha1
default_tkt_enctypes = aes128-cts-hmac-sha1-96 arcfour-hmac-md5 aes256-cts-hmac-sha1-96 des-cbc-crc des-cbc-md5 des3-hmac-sha1
permitted_enctypes = aes128-cts-hmac-sha1-96 arcfour-hmac-md5 aes256-cts-hmac-sha1-96 des-cbc-crc des-cbc-md5 des3-hmac-sha1
dns_lookup_realm = true
dns_lookup_kdc = true
dns_fallback = true
allow_weak_crypto = false
You will notice that AES128-CTS-HMAC-SHA1-96 has been brought to the front of the list. I did originally have the order as AES256/AES128/ArcFour, however this did not work. Dropping AES256 down the list enabled everything to work. I did not drop the AES256 entirely as other services are using Kerberos and are successfully using this encryption method.
After making changes to this file, you will need to restart the gssd service using the command
sudo service gssd restart
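Since the order of the encryption types turned out to matter, a quick scripted check of the file across several clients can be useful. The Python sketch below reads a krb5.conf style text and reports the order for one setting. The function name enctype_order is hypothetical, and the parser is deliberately naive: it ignores sections and included files, and simply returns the first matching key it finds.

```python
def enctype_order(conf_text, key="default_tkt_enctypes"):
    """Return the encryption types listed for `key`, in file order."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comment lines and section headers such as [libdefaults].
        if not line or line[0] in "#;" or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            # krb5.conf allows whitespace or commas between the types.
            return value.replace(",", " ").split()
    return []


# Shortened sample based on the settings above.
sample = """[libdefaults]
default_realm = TEST.DOMAIN.CO.UK
default_tkt_enctypes = aes128-cts-hmac-sha1-96 arcfour-hmac-md5 aes256-cts-hmac-sha1-96
"""
order = enctype_order(sample)
print("first choice:", order[0] if order else "not set")
```

On a real client you would pass in the contents of /etc/krb5.conf and repeat the check for default_tgs_enctypes and permitted_enctypes.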
- Done!
At this point, with a heap of luck, you should be able to issue a mount command with the sec=krb5 option specified and have it work successfully.
If it hasn’t worked, then see the troubleshooting information below.
Troubleshooting
One of the biggest things that annoys me with articles such as this, is when you get to the end, they say it should work, and it doesn’t. You are left with a configuration that you have no idea if it is right, and no idea on how to fix. So here are a few places to look for information to solve any problems that you may hit.
This section is not exhaustive. There are probably many other tools that you could use to check out what is happening, but this is what I used to get me to the process above.
If it is not working, then there is plenty of information that you can obtain and filter through in order to determine the problem. When you have the information I often found that the problem could be identified reasonably easily.
When I hit an error, I tended to run all of these logs and then look through all of them.
- NetApp Filer SecD Trace
The secd module on the filer is responsible for the authentication and the name lookup. This information is useful when the filer is rejecting the credentials or if the SPN cannot be mapped to a valid user.
You first have to turn on the logging, then run your command, then turn it off.
cluster::> set diag
Warning: These diagnostic commands are for use by NetApp personnel only.
Do you want to continue? {y|n}: y
cluster::*> secd trace set -trace-all yes -node clusternode1
Run your mount command here
cluster::*> secd trace set -trace-all no -node clusternode1
cluster::*> event log show -source secd
If this logged an error, then the NetApp was involved in the process. These messages tended to be fairly clear and useful.
- Run mount with verbose mode turned on
On your Ubuntu machine, you can run the mount command in verbose mode to see what is happening.
sudo mount svm-nas:/nfs_volume /mnt/nfs_volume -o sec=krb5 -vvvv
- Run the RPC GSSD daemon in the foreground with verbose logging.
This is the client side daemon responsible for handling Kerberos requests. Getting the verbose output from this can show you what is being requested and whether it is valid or not. You will have to stop the gssd service first, and remember to restart the service when you are finished. You will have to run this in another terminal session as it is a blocking foreground process.
sudo service gssd stop
sudo rpc.gssd -vvvvf
Use Ctrl+C to break when finished.
sudo service gssd start
- Capture a tcp dump from the client side.
This allows you to look at the process from a network perspective and see what is actually being transmitted. It was through a network trace that I was able to see that the ordering of my encryption types was wrong.
sudo tcpdump -i eth0 -w /home/username/krb5tcpdump.trc
Again, this is a blocking foreground process so will need to be run in another terminal session. When you are finished the trace can be opened up in Wireshark. Specify a filter in Wireshark of the following to see only requests for your client
kerberos && ip.addr == 192.168.10.31
Substitute the IP address for the address of your client.
When looking at the Kerberos packets, it is important to drill down and check that the sname fields, the etype and any encryption settings are what you expect them to be. Encryption types in requests are listed in the order that they will be tried. If the first one succeeds against the AD, but is not accepted by the NetApp, then you will get access denied.
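To make the ordering problem concrete, here is a toy Python model of the behaviour described above. It is not an implementation of the Kerberos protocol. It simply assumes that the first encryption type in the client's list that the domain controller supports is the one that gets used, and that the mount only succeeds if the filer also accepts that type. The function name first_usable and the contents of the sets are assumptions chosen to mirror what I saw in my setup.

```python
AES128 = "aes128-cts-hmac-sha1-96"
AES256 = "aes256-cts-hmac-sha1-96"
RC4 = "arcfour-hmac-md5"


def first_usable(client_order, kdc_supported, filer_accepted):
    """Simplified model of the selection described in the text.

    Returns (chosen_etype, mount_would_work). The first client etype the
    KDC supports is chosen; there is no fallback to a later entry.
    """
    for etype in client_order:
        if etype in kdc_supported:
            return etype, etype in filer_accepted
    return None, False


kdc = {AES128, AES256, RC4}
filer = {AES128}  # assumption: only AES128 tickets were treated as valid

print(first_usable([AES256, AES128, RC4], kdc, filer))  # AES256 listed first
print(first_usable([AES128, RC4, AES256], kdc, filer))  # AES128 listed first
```

With AES256 first, the model picks a type that the filer then rejects. With AES128 first, the chosen type is accepted. That matches the behaviour I saw in the trace.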
- Testing Name Mapping on the NetApp Cluster
A number of the errors that I was getting were related to problems with name resolution on the NetApp. These were shown clearly by the secd trace described in section a). You can test name mapping directly from the NetApp, without going through the whole process of mounting.
Use the following command substituting in the SPN of the client that you want to test.
cluster::> set diag
Warning: These diagnostic commands are for use by NetApp personnel only.
Do you want to continue? {y|n}: y
cluster::*> secd name-mapping show -node clusternode1 -vserver svm-nas -direction krb-unix -name ubuntu.test.domain.co.uk@TEST.DOMAIN.CO.UK
ubuntu.test.domain.co.uk@TEST.DOMAIN.CO.UK maps to nfs
Summary
I doubt this post is exhaustive in covering this setup, but hopefully it is a pointer in the right direction and includes some useful information on troubleshooting.
If you have any suggestions on items that could be added to the troubleshooting, or information that you think is missing from the guide, please let me know and I can update.
Reference Materials
TR-4073 Secure Unified Authentication for NFS – http://www.netapp.com/us/media/tr-4073.pdf
TR-4067 Clustered Data ONTAP NFS Best Practice and Implementation Guide – http://www.netapp.com/us/media/tr-4067.pdf
Requirements for configuring Kerberos with NFS – https://library.netapp.com/ecmdocs/ECMP1196891/html/GUID-480032FD-1FD6-49F0-82D3-6E24511D4CE0.html
rpc.gssd(8) – Linux man page – http://linux.die.net/man/8/rpc.gssd
krb5.conf – http://web.mit.edu/kerberos/krb5-1.4/krb5-1.4.1/doc/krb5-admin/krb5.conf.html
Encryption Type Selection in Kerberos Exchanges – http://blogs.msdn.com/b/openspecification/archive/2010/11/17/encryption-type-selection-in-kerberos-exchanges.aspx
Kerberos NFSv4 How To – http://joshuawise.com/kerberos-nfs