Multichannel (UFP and vNIC) on IBM Switches

Introduction to multichannel
Hardware density has become increasingly important over the last several years. Being able to do more compute with a smaller physical and energy footprint has become the new focus for IT. In order to do this effectively, more and more businesses are moving traditional workloads to the cloud, whether it be public, private or hybrid.

We often find, however, that all of this consolidation requires previously separate network and storage fabrics to coexist, virtualised, on the same physical hardware, especially in private cloud scenarios. Traditionally that has meant introducing more and more IO adapters into systems to cope with the number of isolated fabrics, but we are trying to make everything smaller and cheaper. This is where converged network adapters and multi-channel networking become relevant.

CNAs (Converged Network Adapters)
Network adapters such as the Emulex and Brocade CNAs are physical network adaptors which have the ability to provide multiple virtual network or storage adaptors in a single card. Multiple communications channels are established between the CNA (installed in a server) and a compliant network switch. When working with IBM switches, these virtual NICs or HBAs (channels) are configured depending on the multi-channel mode set in the server’s UEFI menu.

Multi-channel modes: vNIC, Switch-Independent vNIC and UFP – what’s the difference?
There are several ways in which multi-channel can be configured between IBM switches and supported CNAs.
The most common configuration modes are Switch-Independent vNIC, vNIC and UFP.
These modes are mostly the same in functionality, with the exception of UFP, which adds some additional functionality.
The main way in which these modes differ is how and where the channels are configured. Let’s step through the individual modes and how they are configured.

Configuring Multi-Channel Mode
Multi-channel operation is configured by first entering the adaptor configuration menu, under the server’s UEFI menu. On IBM servers, you can enter UEFI by pressing F1 when prompted during server boot. If you have an OS installed on the server, it may help to take note of the MAC address and PCI address of the adaptor you would like to configure before entering the setup menu.
Once in the UEFI menu, navigate to System Settings -> Network, and select the device you would like to configure multi-channel mode on. For Emulex adaptors, you can enter the Emulex NIC Setting screen to change the multi-channel mode. Once in this screen, you can select a supported multi-channel mode by altering the Multichannel setting.

UFP Configuration
For UFP, you select UFP as the multi-channel mode. With UFP, the remaining configuration is done in the configuration of a supported switch. Negotiation between the NIC in the server and the switch to establish UFP channels and settings normally only happens at server boot, so remember that once the mode is changed in the UEFI menu, and when any changes are made to the UFP configuration on the switch, the server will need a reboot to pick up the changes.

An example switch configuration section is below, which we will walk through.

 #configuration for the first vNIC on physical port 10
ufp port INTA1 vport 1
        network mode trunk #Configure this port as a trunk port
        network default-vlan 1 #Not required when using VLAN 1, just shown for example purposes. Similar to setting native-vlan on physical ports.
        qos bandwidth min 40 #Set the QoS to allocate 40% of bandwidth minimum. Useful for identifying which vNIC corresponds to a network interface in the OS, as this will change the advertised link speed shown in the OS.
        qos bandwidth max 40
        enable
        exit
!
 #configuration for the second vNIC on physical port 10, configured for FCoE
ufp port INTA1 vport 2
        network mode fcoe #Set the port mode to FCoE
        network default-vlan 1002 #On IBM switches, the FCoE VLAN can be changed. 
        qos bandwidth min 60 #Set the QoS to allocate 60% of bandwidth minimum
        qos bandwidth max 60 #Set the QoS to allocate no more than 60% of bandwidth
        enable
        exit
!
#Enable UFP on the port
ufp port INTA1 enable
!
#Enable UFP functionality globally
ufp enable
!
#Example VLAN to demonstrate different configuration for adding vNICs/UFP ports to a VLAN
vlan 100
        #vmember is used to add virtual ports to a VLAN.
        #Syntax is <physical port>.<virtual port>
        vmember INTA1.1
        enable
!
#FCoE vlan example
vlan 1002
        vmember INTA1.2
        enable
!

As you can see from the configuration, virtual NICs or “UFP Channels” and their usage are configured on the switch side. All that needs to be done from the server side is to enable UFP as the multi-channel mode in the UEFI menu for the CNA.
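
If you have many ports to configure, the UFP stanzas above can be generated rather than typed. The following is a minimal Python sketch that only reproduces the command syntax shown in the example. The function names and the dictionary layout are my own invention, and the check that the minimum bandwidths do not add up to more than 100% is an assumed sanity check, not something taken from the switch documentation.

```python
def render_ufp_vport(port, vport, mode, min_bw, max_bw, default_vlan=None):
    """Return the switch config lines for a single UFP virtual port."""
    if not 0 < min_bw <= max_bw <= 100:
        raise ValueError(f"vport {vport}: need 0 < min <= max <= 100")
    lines = [
        f"ufp port {port} vport {vport}",
        f"        network mode {mode}",
    ]
    if default_vlan is not None:
        lines.append(f"        network default-vlan {default_vlan}")
    lines += [
        f"        qos bandwidth min {min_bw}",
        f"        qos bandwidth max {max_bw}",
        "        enable",
        "        exit",
        "!",
    ]
    return lines


def render_ufp_port(port, vports):
    """Render every vport on one physical port, then enable UFP."""
    # Assumed sanity check: guaranteed minimums cannot exceed the whole link.
    total_min = sum(v["min_bw"] for v in vports)
    if total_min > 100:
        raise ValueError(f"minimum bandwidth adds up to {total_min}%")
    lines = []
    for v in vports:
        lines += render_ufp_vport(port, **v)
    lines += [f"ufp port {port} enable", "!", "ufp enable", "!"]
    return lines


if __name__ == "__main__":
    example = [
        {"vport": 1, "mode": "trunk", "min_bw": 40, "max_bw": 40},
        {"vport": 2, "mode": "fcoe", "min_bw": 60, "max_bw": 60,
         "default_vlan": 1002},
    ]
    print("\n".join(render_ufp_port("INTA1", example)))
```

The output is plain text, so you can review it before pasting it into the switch CLI.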

Switch-Independent vNIC Configuration
Switch-independent vNIC mode differs from UFP mode in that the CNA allocates a vNIC for each VLAN permitted on the switch port. As many switches are capable of using VLANs, even older ones which are not UFP-capable, this mode is considered switch-independent: the CNA does most of the heavy lifting when it comes to detecting and configuring vNICs. This mode can be selected in the UEFI menu. The example switch config below, for an EN4093, could be used to allocate vNIC ports based on VLAN.

#This is our port connected to a CNA adapter configured for switch-independent vNIC mode
interface port INTA1
        #Tagging is required on the port, as the CNA inspects tagged packets to map VLANs to vNICs
        tagging
        exit
!
#Our data VLAN, 1000
vlan 1000
        enable
        name "Data"
        #Add the port as a member of this VLAN, allowing VLAN 1000 traffic to flow on INTA1. Note below that this port is also a member of VLAN 1004, our FCoE VLAN
        member INTA1
!
#Our FCoE VLAN, 1004
vlan 1004
        enable
        name "FCoE"
        #Add the port to this VLAN too, which in turn will cause the CNA to allocate another vNIC. 
        member INTA1
        #NPV not required, just here for demonstration purposes
        npv enable
        npv traffic-map external-interface EXT11,EXT12
!

As you can see, the above switch configuration does not carry any vNIC-specific configuration. The CNA does all the work in finding out how many vNICs need to be mapped to VLANs. Keep in mind that your second vNIC will typically be used for any FCoE traffic.

Standard vNIC Configuration
Standard vNIC predates UFP, and if your CNAs and switches support UFP you should typically go with that, as it is more flexible. Configuration-wise, standard vNIC is very similar to UFP, except that some of the configuration is performed in the UEFI menu in addition to the switch configuration. It essentially sits between UFP and switch-independent vNIC when it comes to how the CNA and switch share responsibility for configuring the fabric. If you are on older switching or NIC/HBA hardware, or older firmware levels, you might need to use this mode for compatibility.

To configure this mode, enter the UEFI menu on your server and enter the configuration for your virtual fabric adapter. Depending on your adaptor, you will be able to see the virtual fabric channels listed and you will be able to change the modes of each channel. This information will propagate to the switching hardware connected to the VFA on next reboot.

Switch configuration can be then used to configure VLAN membership on the virtual ports, which are referred to by their physical port number and channel. Example: INT4.1
Also, when adding vNIC ports to a VLAN, you normally need to use the vmember command instead of the regular member command.

An example for enabling vNIC and the first channel on physical port 4, allocating 50% of the port’s physical bandwidth:

#Enable vNic
vnic enable
#Switch config context to vNic port 4.1
vnic port 4 index 1
#Allocate 50% bandwidth
bandwidth 50
enable
#Add another channel on the same port
vnic port 4 index 2
bandwidth 50
enable
#Add first channel on port 4 to VLAN 20
vlan 20
vmember int4.1

Keep in mind that your bandwidth percentages will need to add up to 100. Unlike UFP, you set the modes of the various channels in UEFI if you need to use things like FCoE.
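
The "add up to 100" rule is easy to get wrong once you have more than two channels, so it can be worth checking before you touch the switch. A minimal Python sketch follows; the function name and the dictionary of index-to-percentage are hypothetical, and the only rule it encodes is the one stated above.

```python
def check_vnic_bandwidth(allocations):
    """Check that vNIC bandwidth percentages on one port add up to 100.

    allocations maps a vNIC index (1, 2, ...) to its bandwidth percentage.
    Returns the total, or raises ValueError if it is not exactly 100.
    """
    total = sum(allocations.values())
    if total != 100:
        raise ValueError(f"vNIC bandwidth adds up to {total}%, expected 100%")
    return total


if __name__ == "__main__":
    # The two channels from the example: vnic port 4 index 1 and index 2.
    print(check_vnic_bandwidth({1: 50, 2: 50}))
```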

FCoE and Multichannel
For consistency and stability, when FCoE is in use with multi-channel, ensure you use the second channel for FCoE traffic unless you have reason to change it. The reason for this is that some CNAs, especially when using less sophisticated multi-channel modes (Switch-Independent vNIC, for example), will use the second channel for FCoE by default, and it is non-trivial or not possible to change it. This can lead to some confusion if you are trying to configure a different channel on your switching equipment for FCoE use.

XBMC Frodo Buildroot for Raspberry Pi

I’ve recently upgraded my dodgy Samsung DLNA + minidlna setup to XBMC. XBMC is a media center, and IMHO it’s the best one.

It’s also pretty obvious from the number of Raspberry Pi XBMC kits on the internet that the Raspberry Pi is a popular option, given the price and low power consumption.

I’ve recently started using XBMC daily on a Raspberry Pi, and I’m pretty happy with the performance. As most people working with the Pi know, capacity constraints are everywhere. I can happily say that despite the considerable constraints on the Pi, I’ve got it running quite nicely. Here’s a few things I did that made a significant performance improvement.

Cut down OS
Rather than running one of the more popular distros, like Raspbmc or OpenELEC, I’ve set up a simple Buildroot based on this repo and have updated/cleaned up the XBMC packages to install XBMC Frodo. Not having any superfluous daemons or other software running in the background allows XBMC to take advantage of all of the hardware available on the Pi.

You can take a look at my repo, including required packages and configs, here.

Overclock slightly
I’m running my Pi using the following overclocking settings in config.txt

arm_freq=1000
core_freq=500
sdram_freq=500
over_voltage=6

Compile with optimisations
The buildroot I’ve linked earlier in this post is already configured to do this, but optimising compiled code specifically for the Pi’s CPU seems to give a little performance boost. It’s free, so why not.
The flags (CFLAGS & CXXFLAGS) I use are -pipe -mhard-float -march=armv6 -mtune=arm1176jzf-s -mfpu=vfp -mfloat-abi=hard -ffast-math -O3

MySQL Library
I’ve found that running with the inbuilt SQLite library is very hard on the Pi, so migrating to a shared MySQL library on another machine really helped to speed up the menu responsiveness. It’s also let me do library updates from my file server using a headless install of XBMC, as the XBMC on the file server and the Pi share the same media database.

Instructions are here.
Basically you need to set up MySQL on another system, and configure your advancedsettings.xml as such:

<advancedsettings>
  <videodatabase>
    <type>mysql</type>
    <host>ip of server</host>
    <port>3306</port>
    <user>username</user>
    <pass>password</pass>
  </videodatabase> 
  <musicdatabase>
    <type>mysql</type>
    <host>ip of server</host>
    <port>3306</port>
    <user>username</user>
    <pass>password</pass>
  </musicdatabase>
  <videolibrary>
    <cleanonupdate>true</cleanonupdate>
  </videolibrary>
</advancedsettings>
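
If you keep several XBMC boxes pointed at the same database, generating this file avoids copy-and-paste slips. The following Python sketch builds the same structure as above with the standard library’s xml.etree.ElementTree. The function name and the example host, user and password are placeholders of mine, not anything XBMC requires.

```python
import xml.etree.ElementTree as ET


def build_advancedsettings(host, user, password, port=3306):
    """Return advancedsettings.xml content for a shared MySQL library."""
    root = ET.Element("advancedsettings")
    # The video and music databases take the same connection settings.
    for section in ("videodatabase", "musicdatabase"):
        db = ET.SubElement(root, section)
        for tag, value in (
            ("type", "mysql"),
            ("host", host),
            ("port", str(port)),
            ("user", user),
            ("pass", password),
        ):
            ET.SubElement(db, tag).text = value
    library = ET.SubElement(root, "videolibrary")
    ET.SubElement(library, "cleanonupdate").text = "true"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; substitute your own server and credentials.
    print(build_advancedsettings("192.0.2.10", "username", "password"))
```

The output is a single unindented line, which is still valid XML.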

Restrict your GUI quality
You can limit the quality of your GUI to 720p without affecting playback quality of 1080p videos. I’ve found this also really helps the responsiveness of the GUI. In your guisettings.xml, put:

<settings>
    <videoscreen>
        <limitgui>720</limitgui> 
    </videoscreen>
</settings>

Disable actor thumbnails
XBMC can download photos of all of the actors in a specific TV show or movie. I never, ever use this, and it slows down the menus and scanning process, so I disable it. In your guisettings.xml:

<settings>
    <videolibrary>
        <actorthumbs>false</actorthumbs> 
    </videolibrary>
</settings>
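
A mismatched closing tag is easy to miss when hand-editing these files, so it can be worth checking that a fragment parses before copying it to the Pi. This is a minimal Python sketch using the standard library XML parser. It only checks that the XML is well formed; it knows nothing about which settings XBMC actually accepts.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    fragment = """
    <settings>
        <videolibrary>
            <actorthumbs>false</actorthumbs>
        </videolibrary>
    </settings>
    """
    print(is_well_formed(fragment))
```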

Don’t scan for media on the Pi
This is the thing that makes the most difference, in my opinion, especially if you have a large library. Scanning your media across NFS/CIFS or even a USB hard drive is really slow on the Pi, and bogs down the GUI really heavily (remember, single core CPU). If you’ve got the MySQL library option set up, and your media is accessible via the same path on your file server and Pi (hint: use NFS and mount your videos under the same folder structure that exists on your file server), you can run XBMC on your file server and do the scanning there. If you don’t have a GUI on your file server, you can run it headless using something like this (reasonably complicated to set up).

Suspend to RAM Issues on Lenovo E540

Since moving on from IBM, I’ve been using a Lenovo E540 that I picked up fairly cheap, and naturally I run Linux on it.
Being a Debian nut, that Linux is of course Debian Sid.

I compile my own kernel tailored for the hardware about once a week from the linux-stable git tree and so far haven’t had any real issues. I’ve noticed the card reader doesn’t seem to work but haven’t put a lot of time into fixing it. There does seem to be a driver in my current kernel (3.16.1) but I haven’t added it to my kernel config, yet. Linux compatibility and performance has been great. The only upgrade I’ve made is adding a Samsung 840 series SSD.

I did however want to write this post to warn anyone having suspend to RAM issues on this laptop to stick to UEFI version 2.06 or lower. I upgraded last week and suddenly my laptop would no longer fully suspend. The power LED would slowly pulse, but I could hear the fans still going if I’d been taxing the CPU prior to suspend. Resume would definitely not happen.

The solution of course is to downgrade back to a working UEFI version if you’ve upgraded and are facing this issue. I should also mention that I tried loads of different ACPI configuration flags on the kernel command line and none of them seemed to help. It all comes down to a change in the UEFI code in version 2.07.

Link to the working (for me) BIOS (version 1.61) for the E540 is here, as Lenovo has taken it off their main download site: http://download.lenovo.com/ibmdl/pub/pc/pccbbs/mobiles/j9uj09wd.iso

MPPT Solar Charger for Arduino

I’ve been studying, and in some cases closely following, the work of Tim Nolan (timnolan.com), Michael Pedersen (http://techmind.dk) and Julian Ilett (http://256.co.uk and https://www.youtube.com/user/julius256), as I’ve been on a mission to build an MPPT solar battery charger for off-grid usage based around Arduino. The focus has been on cheap, available components and open-source design.

Whilst I’m not really anywhere near having something working, I’ve uploaded to GitHub the WIP code based on Tim Nolan’s (modified by Michael Pedersen) code and will have more info coming soon around circuit schematics and the modifications I have made once I start to get things working a bit better.

I’m basing my hardware around the ACS712 30A current sensor module, an Arduino Pro Mini, a Freetronics 128×128 OLED module, a pair of voltage dividers for voltage measurement, and a simple buck converter controlled by the PWM output of the Arduino.

Feel free to get in contact if you’ve got any questions about the code or what I’ve been working on. I’d love to hear from people working on similar projects.

Configuring FCoE on IBM Switches

A bit of background

I’ve been lucky enough recently to work with FCoE a fair bit, using IBM Systems Networking switches both rack-mount and integrated into the IBM Flex/PureFlex chassis. What I’ve found is that getting FCoE running with IBM equipment is actually really easy. I thought I’d try and condense the wealth of information in the Application Guides down to a blog post.

Through talking with people about FCoE, and especially the IBM equipment, I get the impression that FCoE is a fairly common method these days for implementing storage fabric, however not everyone I speak with has an understanding of how it all hangs together, so I’ll attempt to also cover the general theory behind FCoE where necessary.

FCoE?

I think everybody has at least heard of FCoE by now. If you are fairly unfamiliar with it, FCoE is basically a way to carry Fibre-Channel storage traffic using Ethernet switching equipment. The equipment needs to support FCoE, namely CEE/DCB for transmitting frames, and FIP snooping (or FIPS) for inspecting FCoE FIP packet contents to ensure packets are only transmitted between valid FCoE endpoints.

You also require an FCF (Fibre Channel Forwarder) somewhere in your FCoE fabric (storage network), which dissects the FCoE packets for the purposes of WWN zoning or NPIV (N-Port Identification Virtualisation). The FCF is then connected to your storage via plain old FC, or FCoE, completing the fabric.

Let’s consider an example scenario, with an IBM EN4093 Flex chassis switch, connected to an 8264CS converged fabric switch/FCF. We’ll assume the Omni (switchable Ethernet/FC SFP+ ports) on the 8264CS switch are connected to existing Fibre-Channel storage.

All of the below commands require you to log on to the switch CLI via SSH, Telnet or a serial cable. Setting up these connections is covered in the installation documentation for the switches.

tagging, switchport, ISCLI?

You may notice that subtle differences (and some overt ones) exist between the EN4093 and 8264CS configuration syntax on the switches you are trying this on. The CLI used on IBM switches is in the process of transitioning between IBMNOS (the old BNT style configs) and ISCLI – the new “industry standard” (read: Cisco) style configuration. The syntax changes slightly each firmware release to make it more intuitive, and more familiar to people without IBM/BNT switch experience. Check the firmware versions of your switches if any of the below does not work; I recommend updating to at least version 7.7. The updates can be downloaded from IBM FixCentral.

On the EN4093

Start by enabling CEE on the Chassis switch. This will allow DCB/CEE capable servers connected to the switch to negotiate via DCBX to use DCB/CEE instead of standard Ethernet.
cee enable
You will then need to create a VLAN for FCoE traffic. I suggest using VLAN 1002 unless you have a reason to change it, as most FCoE HBAs will default to VLAN 1002, and you might save yourself some configuration on the server’s HBA and OS.

#enter the vlan configuration context for VLAN 1002
vlan 1002 

    #Just a descriptive name to make the config easier to read, not required
    name "FCoE"
 
    #An example of adding a single physical port to the FCoE network. 
    #Not required if you are only using UFP or vNICs to connect your servers.
    member INTA1 

    #Replace with the ports your 8264CS switch are connected to (uplinks)
    member EXT13,EXT14 

    #An example of adding a virtual NIC (UFP or vNIC) to the FCoE network. 
    #Not required if you are not using vNICs/UFP
    vmember INTA2.2 

#exit VLAN configuration context
exit

Enable tagging on your uplink ports, as you will likely want to carry other VLANs. It is also possible to have dedicated uplinks for FCoE traffic. If you are doing that – you can skip this step.

#Enter the configuration context for the two
#external uplink ports. Substitute EXT13,EXT14
#for your uplink ports.
interface port EXT13,EXT14

    #A descriptive name for the ports, again, not required.
    name "Uplink"

    #Enable tagging (trunking in Cisco-speak)
    tagging

    #Exit the configuration context
    exit

Then enable FIP Snooping, so that FCoE sessions can be detected and established on the switch.
fcoe fips enable

This is all that is required on the access (chassis) switch to pass FCoE traffic through to the uplink ports.

On the 8264CS

The 8264CS is a fairly new switch in the IBM lineup. It is basically an 8264 top-of-rack switch, with an additional full FC switch, and can act as a Fibre Channel Forwarder. It can therefore be directly attached to FC devices, the zoning being done on the switch, but also supports NPV/NPIV. I’ll cover both NPIV and FCF modes of operation.

The Familiar Stuff

We will need to configure CEE, FIPS and VLANs. We covered these when we configured the EN4093, so I’ll write these without much explanation.

#Enable CEE
cee enable

#Configure FCoE VLAN
vlan 1002 
    name "FCoE"

    #the ports connected to your access switch, in
    #this example, the EN4093. 
    member 20,21
    exit

#Configure links to access switch (EN4093)
interface port 20,21
    name "EN4093 Chassis Switch"
    tagging
    exit

#Enable FIPS
fcoe fips enable

Omni/FC Port Configuration
The first thing we have to do before we can use the 8264CS to complete the FCoE fabric is to enable some FC ports. Even if you are connecting to FCoE-enabled storage using RJ45 copper links, you need to have some Omni ports configured as FC ports. This seems like an odd limitation, until you consider that there are separate Ethernet and Fibre Channel boards in the 8264CS. Configuring some FC ports allows the FC switch board to participate in the FCoE VLAN, and provide FC switching/zoning or NPIV to WWNs in the FCoE fabric.

If you do not have any native FC devices/switches that you would like to connect your FCoE fabric to, just add some dummy FC ports. Ports need to be added in top/bottom pairs, so instead of just adding port 53, you will need to add 53 and 54.

#Configure Omni ports 53 and 54 as FC ports
system port 53,54 type fc

#Add the FC ports into your FCoE VLAN, so that
#the built-in FC switch can participate in the FCoE VLAN
vlan 1002
    member 53,54
    exit

NPV/NPIV
NPIV is a great way to avoid the compatibility issues that often spring up when trying to join FC fabrics with switching equipment from different vendors, or to avoid fabric domain limits in larger fabrics. In NPIV, each endpoint has its own virtual WWN, which appears as a virtual N-Port in the upstream FC fabric. To enable this, you have to configure your FCF to provide NPIV on your FCoE VLAN, and specify the ports that connect to the FC fabric in the NPIV traffic map.

NPIV, like FCF mode, is enabled on a per-VLAN basis.

#Enter the FCoE VLAN configuration context
vlan 1002
    #Enable NPV, which provides NPIV - confusing much?
    npv enable
    
    #Configure FC ports connected to fabric as
    #external ports in the traffic map. These will
    #be your dummy FC ports if you are using only FCoE
    #and no FC connected devices!
    npv traffic-map external-interface 53,54

    #exit VLAN context
    exit

You will then need to configure zoning. This can be done on the storage itself. In the case of the v3700/v7000, you will want to create hosts for each virtual N-Port. The WWNs you can use for zoning are shown on the 8264CS by running the below command.

This will show all FCoE connections. Note: WWNs will only be visible on your FCF, which is the 8264CS in this case.
show fcoe fips fcoe

FCF/FC Zoning
The 8264CS is a capable FC switch, and can provide an FC fabric of its own. It can also be joined to an existing fabric. It is good standard advice when connecting FC switches from different vendors to look out for compatibility issues. Don’t assume that switches from different vendors are correctly passing traffic through the fabric; test thoroughly. I have personally tested IBM switches with Brocade switches and had no issue, but if you do have issues, consider using NPIV between two separate fabrics, one fabric for each vendor.

To use FCF mode on the switch, you have to enable FCF on your FCoE VLAN.

#Our FCoE VLAN, which we are very familiar with by now.
vlan 1002

    #Enable FCF for traffic in this VLAN
    fcf enable

    #Exit configuration context
    exit

The next step is to create your zone. You will need the WWNs of your HBAs, and the WWNs of your storage controllers. Create separate zones if your servers have multiple paths to the storage controllers, for example if they have multiple HBAs you would like to connect to the same LUNs, in order to provide multi-pathing.

#First, create aliases for your WWNs. Not required, 
#but much easier in the long-run if you need to change 
#WWNs on a server or storage controller!

#Alias for your server's HBA. Replace the x's with 
#the actual WWN, being careful to include the colons between segments.
fcalias ServerHBA wwn xx:xx:xx:xx:xx:xx:xx:xx

#Alias for your storage controller, you will likely have multiple
#controllers for the same storage, so create more aliases, and 
#include them in the zone if required.
fcalias StorageController wwn xx:xx:xx:xx:xx:xx:xx:xx

#Create a zone named FirstZone
zone name FirstZone
    #Add aliases of HBA, and Controller.
    member fcalias ServerHBA
    member fcalias StorageController
    
    #exit this context
    exit

#Now create a zone set. We only have one zone, but
#if you have multiple related zones, you can group 
#them logically.
zoneset name ActiveConfig
    member FirstZone
    exit

#Now make the zone set, and zone, active.
zoneset activate name ActiveConfig

You should then be able to rescan your HBAs and see the storage controllers on the server.
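
Typing WWNs by hand is where most zoning mistakes come from, so here is a minimal Python sketch that checks the WWN format and emits the alias, zone and zone set commands in the same order as above. It follows the `fcalias <name> wwn <WWN>` form used for the ServerHBA alias. The function name is hypothetical, the WWNs in the usage example are made up, and the format check (eight colon-separated hex bytes) is my assumption of what the switch expects.

```python
import re

# Eight colon-separated hex bytes, e.g. 10:00:00:00:c9:12:34:56
WWN_RE = re.compile(r"^([0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}$")


def render_zoning(zone, zoneset, aliases):
    """Return the CLI lines for one zone built from named WWN aliases.

    aliases maps an alias name to its WWN string.
    """
    lines = []
    for alias, wwn in aliases.items():
        if not WWN_RE.match(wwn):
            raise ValueError(f"{alias}: {wwn!r} is not an 8-byte WWN")
        lines.append(f"fcalias {alias} wwn {wwn}")
    lines.append(f"zone name {zone}")
    lines += [f"    member fcalias {alias}" for alias in aliases]
    lines.append("    exit")
    lines += [
        f"zoneset name {zoneset}",
        f"    member {zone}",
        "    exit",
        f"zoneset activate name {zoneset}",
    ]
    return lines


if __name__ == "__main__":
    # Made-up WWNs for illustration only.
    print("\n".join(render_zoning("FirstZone", "ActiveConfig", {
        "ServerHBA": "10:00:00:00:c9:12:34:56",
        "StorageController": "50:05:07:68:01:10:ab:cd",
    })))
```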

Easy, right?
Feel free to send me any questions, or let me know if you hit any snags. I hope this is useful, and that it saves you some time. Thanks for reading.

LACP on RHEL 6.4

LACP, or the link aggregation control protocol, is a network protocol that enables active/active redundancy between network-connected devices. A common use for LACP is providing more than one physical network link between server and switch for redundancy purposes. Unlike other methods for providing redundancy, LACP also makes the total bandwidth of the links usable, meaning nothing is wasted.

Under Linux, LACP is provided as part of the bonding driver, specifically by passing the mode=4 option when loading the module.

Once the module is loaded, you can use ifenslave to “enslave” the physical network interfaces to a bonding interface. The physical network interfaces must be connected to network ports that provide LACP, and have been configured with the same LACP admin key.

Of course, you don’t need to screw around with module options and ifenslave under most Linux distributions in order to configure an LACP-capable interface. Under RHEL 6.4, it’s a matter of configuring the physical interfaces, and the bonding interface, using interface configuration files, which are found in /etc/sysconfig/network-scripts/.

For a typical two interface LACP setup, you would need to create three interface definitions.
The first two (or more, if you would like to configure additional interfaces) look like this:

/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
USERCTL=no
NM_CONTROLLED=no
ONBOOT=yes
BOOTPROTO=none
SLAVE=yes
MASTER=bond0
HWADDR=00:00:00:00:be:ef

Change the interface name in the filename (ifcfg-eth0) and the DEVICE= line to match the interface name of the physical interface.

Also change the HWADDR= line to match the MAC address of the interface as well. This is really important, or the network scripts will freak out and apply this configuration to any interface it wants to. You can find the MAC address by looking at the output of ifconfig eth0 for the interface, replacing eth0 with the interface name.
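
If you are scripting this for several servers, Linux also exposes each interface’s MAC address under /sys/class/net/<interface>/address, which saves parsing ifconfig output. A minimal Python sketch follows; the function name and the sysfs_root parameter (there only so the function can be pointed at a test directory) are my own. Read the address before the interface has been enslaved, since a bonded slave will usually report the bond’s MAC address instead of its own.

```python
from pathlib import Path


def read_mac(interface, sysfs_root="/sys/class/net"):
    """Return the MAC address the kernel reports for a network interface."""
    return (Path(sysfs_root) / interface / "address").read_text().strip()
```

On a Linux machine, read_mac("eth0") returns the value to put in the HWADDR= line, replacing eth0 with your interface name.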

You will need to create one interface definition file per physical interface involved in the LACP link.

You then need to create the logical interface, bond0. If you need to create multiple logical interfaces out of multiple groups of physical interfaces, in the MASTER= line of the physical interface definition files above, and in the name and DEVICE= line of the logical interface definition file below, you will need to substitute bond0 with bond1 etc.

/etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
USERCTL=no
BOOTPROTO=none
BONDING_OPTS="mode=4"
NM_CONTROLLED=no
IPADDR=x.x.x.x
NETMASK=x.x.x.x

Note the BONDING_OPTS="mode=4" line. This configures the bonding interface to use LACP. You can pass any options you like to the bonding module here as well. I have listed the IPADDR and NETMASK lines here to demonstrate that the IP configuration should be recorded in the logical bonding interface configuration file, not the physical interface configuration files. If you are using a dynamic IP, you can set BOOTPROTO=dhcp to use DHCP instead of static IP configuration, removing the IPADDR and NETMASK lines.
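
With more than a couple of servers, it is less error-prone to generate these files than to edit them by hand. The Python sketch below produces the text of the physical and bonding interface definitions using the same keys as the examples above. The function names are hypothetical, and it returns strings rather than writing into /etc/sysconfig/network-scripts/ so you can inspect the result first.

```python
def slave_ifcfg(device, hwaddr, master="bond0"):
    """Return the ifcfg content for one physical interface in the bond."""
    return "\n".join([
        f"DEVICE={device}",
        "USERCTL=no",
        "NM_CONTROLLED=no",
        "ONBOOT=yes",
        "BOOTPROTO=none",
        "SLAVE=yes",
        f"MASTER={master}",
        f"HWADDR={hwaddr}",
    ]) + "\n"


def bond_ifcfg(ipaddr, netmask, device="bond0", bonding_opts="mode=4"):
    """Return the ifcfg content for the logical bonding interface."""
    return "\n".join([
        f"DEVICE={device}",
        "ONBOOT=yes",
        "USERCTL=no",
        "BOOTPROTO=none",
        f'BONDING_OPTS="{bonding_opts}"',
        "NM_CONTROLLED=no",
        f"IPADDR={ipaddr}",
        f"NETMASK={netmask}",
    ]) + "\n"


if __name__ == "__main__":
    # Placeholder addresses for illustration only.
    print(slave_ifcfg("eth0", "00:00:00:00:be:ef"))
    print(bond_ifcfg("192.0.2.20", "255.255.255.0"))
```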

For more information about what you can and can’t put into these configuration files, take a look here.

Loading Windows PE with iPXE

One of the things that got me looking at iPXE was the number of operating systems you can load. I often need to deploy series of systems with a mixture of Windows, VMware and Linux of various flavours, so having one tool to network boot them all is really handy.

To get Windows loaded, I use Microsoft’s MDT to script an unattended build. In order to install an MDT-based build over the network, you need to get Windows PE loaded first, and the PE media will then load the rest of the scripts (VBScript) and GUI (HTA-based) and run through the task-sequence you configure in the MDT wizard.

Loading PE turned out to be not so tricky:

#!ipxe
  
kernel wimboot
initrd http://${next-server}/win/bootmgr          bootmgr
initrd http://${next-server}/win/BCD              BCD
initrd http://${next-server}/win/segmono_boot.ttf segmono_boot.ttf
initrd http://${next-server}/win/segoe_slboot.ttf segoe_slboot.ttf
initrd http://${next-server}/win/wgl4_boot.ttf    wgl4_boot.ttf
initrd http://${next-server}/win/boot.sdi         boot.sdi
initrd http://${next-server}/win/lt64.wim         boot.wim
imgstat
boot

The first thing you need is the wimboot kernel, available here. Once you’ve compiled it and placed it in your PXE/HTTP path, you can load it like you would any Linux OS kernel. The various PE components are then loaded into memory using the initrd command, and boot continues as normal as if you had booted the DVD, with PE resident in memory.

The files – bootmgr, BCD, boot.sdi and boot.wim can all be copied from your PE media when generated in MDT.
The fonts – segmono_boot.ttf, segoe_slboot.ttf and wgl4_boot.ttf are optional, however highly recommended. If anything goes wrong with your boot, you won’t get any messages on-screen without these fonts.

The first argument to initrd should point to the TFTP or HTTP location of the file. The second should remain the same as the above example, as wimboot/PE looks for files loaded into memory with these names.
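
Since every initrd line follows the same "location, then in-memory name" pattern, the script is easy to generate if you keep several PE images around. A minimal Python sketch follows, assuming the files sit under one base URL; the function name is my own. The in-memory names are passed through unchanged because, as noted above, wimboot/PE looks for those names.

```python
def wimboot_script(base_url, files):
    """Return an iPXE script that boots Windows PE via wimboot.

    files is a list of (name_on_server, name_in_memory) pairs.
    """
    lines = ["#!ipxe", "", "kernel wimboot"]
    for remote, in_memory in files:
        lines.append(f"initrd {base_url}/{remote} {in_memory}")
    lines += ["imgstat", "boot"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(wimboot_script("http://${next-server}/win", [
        ("bootmgr", "bootmgr"),
        ("BCD", "BCD"),
        ("boot.sdi", "boot.sdi"),
        ("lt64.wim", "boot.wim"),
    ]))
```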

Using iPXE to install RedHat Enterprise Linux (RHEL) 6.4

It took me a bit of experimenting to get RHEL 6.4 install booting using iPXE, especially when passing the correct network device to the installer, so that it knows which interface to use when accessing the repositories.

The below iPXE script got me installing in the end, so I thought I would share.

#!ipxe

echo Booting RHEL 6.4 x86_64
imgfree
set base-url http://${next-server}/rhel64/
kernel ${base-url}images/pxeboot/vmlinuz ksdevice=${netX/mac} ks=${base-url}ks.cfg
initrd ${base-url}images/pxeboot/initrd.img
imgstat
boot

You might notice the ksdevice line in the above kernel command. This is possibly the most important part of doing a successful unattended install, as it tells the installer which interface is used to access the kickstart file and any installation media. Setting this to ${netX/mac} passes the MAC (hardware) address of this interface through to the installer, which works out the correct interface based on this. I saw a lot of different methods for achieving this, but this method is by far the most reliable and simplest I was able to come up with when booting RHEL.

The variable base-url uses the variable next-server, which is the IP address of the server the machine was PXE-booted from. You will likely need to modify base-url to match the directory on your HTTP server where the RHEL 6.4 installation DVD is extracted.

You can simply copy the files from the mounted DVD (or loop mounted) to a directory in your HTTP server’s path, no modification necessary. You should also place your kickstart file, if you are using one, in this same directory. If it is not called ks.cfg, you should change the end of the kernel line in the above script to reflect the correct file name.

If you are not using a kickstart file, you can remove the ksdevice and ks arguments from the kernel command.
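
If you end up generating this script rather than hand-editing it, the kickstart-or-not choice is easy to express in code. Below is a minimal Python sketch, not part of my actual setup; the function name rhel_ipxe_script and the example IP address are assumptions for illustration.

```python
def rhel_ipxe_script(base_url, kickstart="ks.cfg"):
    """Return the RHEL 6.4 iPXE script as text (hypothetical helper).

    Pass kickstart=None to leave out the ksdevice and ks arguments.
    """
    if not base_url.endswith("/"):
        base_url += "/"
    kernel = "kernel " + base_url + "images/pxeboot/vmlinuz"
    if kickstart:
        # ${netX/mac} is left untouched here; iPXE expands it at boot time.
        kernel += " ksdevice=${netX/mac} ks=" + base_url + kickstart
    lines = [
        "#!ipxe",
        "",
        "echo Booting RHEL 6.4 x86_64",
        "imgfree",
        kernel,
        "initrd " + base_url + "images/pxeboot/initrd.img",
        "imgstat",
        "boot",
    ]
    return "\n".join(lines) + "\n"


print(rhel_ipxe_script("http://192.168.1.1/rhel64/"))
```

Calling it with kickstart=None gives you the interactive-install version of the same script.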

iPXE PHP Scripts – Basic Setup

I’ve been lucky enough to have a chance to play around with iPXE. iPXE is a replacement PXE firmware intended for burning on to network cards in order to bring their functionality up to speed with more modern PXE firmwares. It can also be chain loaded from existing PXE firmwares.

Part of my job sees me deploying large numbers of servers at once, often with different OSes and build requirements. I’ve always been extremely lazy (I mean innovative), so I’ve been working on building iPXE scripts for all of the OSes I have been deploying, and booting the correct one for each server using a PHP script that checks the serial number of the server.

Of course this is all possible with iPXE, because it’s amazing. The basic flow of how this works is:

  1. PXE firmware on card sends DHCP request
  2. DHCP server is configured to provide an IP address, and then pass a boot file, which in our case is the iPXE undionly.kpxe file used to chain load iPXE
  3. iPXE can be compiled to use a default script, which in my case chain loads a script from the boot server, which is actually a PHP file running on NGINX which returns iPXE commands.
  4. Profit

A few tips that will hopefully get you chain loading iPXE and writing your own dynamic scripts, to make your life easier:

iPXE chain loading itself

When you configure DHCP to send out the undionly.kpxe version of iPXE for chain loading, you have to remember that by default, iPXE will do its own DHCP broadcast, and then receive the boot file name of itself (undionly.kpxe), which it will then load.

You can break this cycle with a conditional in your DHCP (ISC DHCPd in this example) configuration:

subnet 192.168.1.0 netmask 255.255.255.0 {
    option domain-name-servers 192.168.1.1;
    option routers 192.168.1.1;
    next-server 192.168.1.1;
    if exists user-class and option user-class = "iPXE" {
        filename "default.ipxe";
    } else {
        filename "undionly.kpxe";
    }
}

Obviously you can replace the IPs and filenames with whatever you like, so long as undionly.kpxe refers to the UNDI version of iPXE, and default.ipxe points to your default iPXE script you would like to load once iPXE has been chain loaded.
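
To make the two-pass behaviour explicit, here is the same decision written as a few lines of Python. This is only a model of what the DHCP server does with the conditional above; the function name boot_filename is made up for the example.

```python
def boot_filename(user_class=None):
    """Mirror the dhcpd conditional: iPXE clients get the script,
    everything else gets iPXE itself."""
    if user_class == "iPXE":
        return "default.ipxe"
    return "undionly.kpxe"


# The first request comes from the card's PXE ROM (no iPXE user class),
# the second from the freshly chain-loaded iPXE.
for stage, user_class in [("PXE ROM", None), ("iPXE", "iPXE")]:
    print(stage, "->", boot_filename(user_class))
```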

Using PHP to generate dynamic iPXE scripts

When chain loading an iPXE script using the chain command, the target can be an HTTP path, so long as the returned data is of type text/plain, and the content begins with the line #!ipxe followed by two newlines.

An example of this:

<?php
header("Content-type: text/plain");
echo "#!ipxe\n\n";
echo "chain -ar myscript.ipxe\n";
?>
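
The serial number check I mentioned earlier follows the same idea. Here is a minimal sketch of it in Python rather than PHP. It assumes iPXE requests a URL such as /boot?serial=${serial}, and the serial numbers, script names and function name are all hypothetical.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping of server serial numbers to iPXE scripts.
SCRIPTS_BY_SERIAL = {
    "ABC1234": "rhel64.ipxe",
    "XYZ9876": "winpe.ipxe",
}
DEFAULT_SCRIPT = "menu.ipxe"  # hypothetical fallback for unknown servers


def ipxe_response(request_path):
    """Build the script body for a path like '/boot?serial=ABC1234'."""
    query = parse_qs(urlparse(request_path).query)
    serial = query.get("serial", [""])[0]
    script = SCRIPTS_BY_SERIAL.get(serial, DEFAULT_SCRIPT)
    # Same shape as the PHP example: the #!ipxe line, a blank line, then commands.
    return "#!ipxe\n\nchain -ar " + script + "\n"


print(ipxe_response("/boot?serial=ABC1234"))
```

Whatever serves this still needs to send it as text/plain, exactly as the PHP version does with its header() call.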

Use HTTP to load everything!

One thing you will notice when you start using iPXE, and start reading some of the scripts people put together (I will be posting some examples in another post) is that HTTP is used quite often to load things like ISO files, initrd/initramfs files, and in the case of Windows, .WIM files – instead of using the more traditional TFTP protocol, which is often used when PXE booting machines.

This is because HTTP is much, much faster. TFTP uses UDP, which gives no guarantee that packets will arrive in the right order, or at all, so TFTP has to be very conservative: it sends one block and waits for it to be acknowledged before sending the next. TFTP is also a lot simpler to implement in tiny PXE boot ROMs, which is why it is used. HTTP, however, uses TCP, which guarantees both delivery and ordering at a protocol level, so the data can be sent as quickly as the underlying switching and network cards will allow.

One example, when booting Windows, is the loading of the install.wim file. Over TFTP on 10 gigabit Ethernet, I have had this take 15-20 minutes to transfer across the wire, for a 3GB Windows PE image. The same file over HTTP on the same link takes 2-3 minutes.
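
To put those times in perspective, here is the arithmetic as a quick Python sketch. It only converts the figures above into average transfer rates, and assumes 1 GB is 1000 MB.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Average transfer rate in MB/s, taking 1 GB as 1000 MB."""
    return size_gb * 1000 / (minutes * 60)


for label, minutes in [("TFTP, 15 min", 15), ("TFTP, 20 min", 20),
                       ("HTTP, 2 min", 2), ("HTTP, 3 min", 3)]:
    print(f"{label}: {throughput_mb_per_s(3, minutes):.1f} MB/s")
```

That works out to roughly 2.5-3.3 MB/s for TFTP against roughly 17-25 MB/s for HTTP. Both are a long way short of the 1250 MB/s a 10 gigabit link can theoretically carry, so the link itself is not the limit in either case.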

To use HTTP, all you have to do is specify an HTTP URL for a file as an argument to any of the iPXE commands which accept files, for example, the chain, imgload, kernel or initrd commands, and iPXE will do the rest. Obviously this means you will need to run an HTTP server such as NGINX (recommended), lighttpd or Apache to serve the files in addition to your TFTP server.

Fridge temperature control testing

I’ve got my fridge up and running using an ATMEGA328P based Leostick from Freetronics – the code is on GitHub here.

One thing I noticed during the initial testing was that the gap between the temperature of the fridge enclosure and the temperature inside the fridge (and of the fermenting beer) was much bigger than I expected! The fridge enclosure would be consistently 10-15 degrees Celsius below the inside/beer temperature. This was a problem, because turning the compressor off based on this temperature meant that over the following 5-10 minutes, the fermenting beer would cool some 10 degrees below the cutoff temperature.

I’ve combatted this by adding a predictive cutoff. I’ve tuned it for my fridge, and now the temperature gradually lowers to the target temperature after the compressor is turned off. I have also added a timer to prevent the compressor from turning on within 15 minutes of it being turned off, both to protect the compressor and to allow the temperature to drop once the predictive cutoff has happened.
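
The logic is roughly the following. This is a simplified Python sketch of the idea, not the Arduino code from the repository: the class name, the overshoot_c tuning value and the example temperatures are all assumptions for illustration.

```python
MIN_OFF_SECONDS = 15 * 60  # compressor must stay off for at least 15 minutes


class FridgeController:
    def __init__(self, target_c, overshoot_c):
        self.target_c = target_c
        # Hypothetical tuning value: how far the temperature keeps
        # falling after the compressor is switched off.
        self.overshoot_c = overshoot_c
        self.compressor_on = False
        self.last_off_time = None

    def update(self, temp_c, now_s):
        """Return True if the compressor should run for this reading."""
        cutoff_c = self.target_c + self.overshoot_c
        if self.compressor_on:
            # Predictive cutoff: stop early and let residual cooling
            # carry the temperature down to the target.
            if temp_c <= cutoff_c:
                self.compressor_on = False
                self.last_off_time = now_s
        else:
            rested = (self.last_off_time is None
                      or now_s - self.last_off_time >= MIN_OFF_SECONDS)
            if temp_c > cutoff_c and rested:
                self.compressor_on = True
        return self.compressor_on


fridge = FridgeController(target_c=18.0, overshoot_c=2.0)
print(fridge.update(25.0, now_s=0))    # warm: compressor starts
print(fridge.update(20.0, now_s=120))  # reached the early cutoff: stops
print(fridge.update(22.0, now_s=300))  # warm again, but still resting
```

Tuning then comes down to picking an overshoot value that matches how far your particular fridge coasts after the compressor stops.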

I’ll be adding an ethernet module to the setup shortly, so I can get some control and logging happening. Once that’s done, I should be able to chart the temperature over time to better tune the temperature control.