Category: Linux Networking

Getting My Real VM Server Back Online Part III: Storage, iSCSI, and Live Migrations

After some dubious network configurations (which I never should have gotten wrong in the first place), I finally got multipath working to the main storage server. All of the multipath.conf examples I saw resulted in non-functional iSCSI MPIO, while having no multipath.conf left me with failover MPIO instead of interleaved/round-robin.

A big obstacle to getting MPIO configured was that all the examples I found were either old (and scsi_id works slightly differently in Ubuntu 14.04) or just poor. Yes, I wound up using Ubuntu. Usually I use Slackware for EVERYTHING, but lately I’ve been trying to branch out. Most of the VMs run Fedora; “Pegasus” (VMSrv1) runs Fedora, and “Titan” runs Ubuntu.

Before I did anything with multipath.conf (It’s empty on Ubuntu 14.04), I got this:

root@titan:/home/frankd# multipath -ll
size=256G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 13:0:0:0 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 12:0:0:0 sdd 8:48 active ready running

Note that both disks are listed as round-robin, but each sits in its own path group with only one member! This works for fail-over, but does nothing for performance. The only thing that wound up working for multipath.conf was this:

defaults {
 user_friendly_names yes
 polling_interval 3
 path_grouping_policy multibus
 path_checker readsector0
 path_selector "round-robin 0"
 features "0"
 no_path_retry 1
 rr_min_io 100
}

multipaths {
 multipath {
  wwid 1FREEBSD_HTPC1-D1
  alias testLun
 }
}

The wwid/alias doesn’t work, however. All of the MPIO behavior is just coming from the defaults stanza. I attempted many things with no luck, unfortunately. I’m going to have to delve into this more, especially if I want live migrations to work properly with MPIO. As it stands, the disk devices are pointing at a single IP (e.g. /dev/disk/by-path/ip-), so I’ll need to point at aliases to get the VMs working with multipath.

The multipath tests themselves were promising, though: dd was able to give me a whopping 230MB/s to the mapper device over a pair of GigE connections.
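For a sense of scale, here is the back-of-the-envelope arithmetic behind that number. It assumes each GigE link tops out at 125 MB/s of raw line rate (1000 Mbit/s divided by 8) and ignores Ethernet, TCP, and iSCSI overhead, so the real ceiling is somewhat lower. The variable names are just for illustration.

```shell
#!/bin/sh
# Rough ceiling for round-robin I/O across N GigE links.
# Raw line rate only; protocol overhead is deliberately ignored.
links=2
per_link_mb=$((1000 / 8))               # 1000 Mbit/s -> 125 MB/s
ceiling_mb=$((links * per_link_mb))     # 250 MB/s for two links
measured_mb=230                         # the dd result quoted above
pct=$((measured_mb * 100 / ceiling_mb)) # integer percentage of the ceiling
echo "ceiling=${ceiling_mb}MB/s measured=${measured_mb}MB/s (${pct}% of raw line rate)"
```

In other words, 230MB/s is about 92% of what two links could carry even in theory, which is more than a single GigE path could ever deliver.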

The output from ‘multipath -ll’ now looked more reasonable:

root@titan:/home/frankd# multipath -ll
size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 39:0:0:0 sde 8:64 active ready running
  `- 40:0:0:0 sdg 8:96 active ready running

You can see the drives are both under the same round-robin policy instead of two separate ones.
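A quick way to tell the two layouts apart without eyeballing the tree is to count the path-group lines. This is a minimal sketch that assumes the `multipath -ll` output format shown in this post, where each path group starts with a line containing `policy=`; `count_path_groups` is a name made up for the example.

```shell
#!/bin/sh
# Count path groups in `multipath -ll` output read from stdin.
# Assumes every group header line contains "policy=", as in the output above.
count_path_groups() {
    grep -c "policy="
}

# Sample copied from the working configuration: one group holding two paths.
count_path_groups <<'EOF'
size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 39:0:0:0 sde 8:64 active ready running
  `- 40:0:0:0 sdg 8:96 active ready running
EOF
```

On a live host this would be `multipath -ll | count_path_groups` for a single LUN. One group with several paths means I/O is being spread across them; one group per path is the fail-over layout from earlier.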

The storage server also saw some slight changes, including upgrading from one Intel X25-V 40GB for L2ARC to 2x X25-Vs for a total of 80GB. I also added a 60GB Vertex 2 as a separate ZIL (SLOG) device. I really need to build a machine with more RAM and partition out the ZIL device. I’ll likely wind up using my 840 Pro 256GB for L2ARC and leave the old X25-Vs out of the main array once I get a pair of 10GbE cards for maximum speed to my workstation (hopefully near-native for the 840 Pro, perhaps better with a large amount of ARC).

So we’re at a point where everything appears to be working, although in need of some upgrades! Great! I’m looking at a KCMA-D8 dual Opteron C32 motherboard, as I have a pair of Opteron 4184s (6-core Lisbon, very similar to a Phenom II X6 1055T) lying around, so I could put together a 32GB 12-core machine for under $400. As always, though, the budgetary constraints of a hobby squash that idea quickly.

Rearranging The Intranet of Things Part II

I’m sure there will be a lot more posts like this to come. I had formerly moved the edge router to the ‘closet’ (aka the garage, right next to the cable modem and 3560-24PS sitting there) and added another router there to have a routed gig port into my ‘office’ (aka my bedroom with a couple desks).

Today I replaced both routers with a single 7206VXR with an NPE-G1. I had it all configured and everything should’ve worked off the bat, but it didn’t — not exactly, anyway. The routing was perfect, the NAT was great. But I only have a VAM card which doesn’t work with 15.x (only VAM2 cards work with new code), and I didn’t want it doing VPN in software.

So I decided to keep the old WAN router on VPN-only duty. I briefly considered using a 1760 with a VPN module (I have a few), but it would choke once I finally get decent internet speeds. The 3825 has an EPII+ card on top of the onboard hardware engine, so it should at least have no issue keeping up with my internet connection using weak Triple-DES. The only issue is that when I went to forward UDP 4500 from the edge router to the VPN router I got:

% Port 4500 is being used by system

I was able to successfully forward UDP 500 and ESP, but here I got stumped. I verified there was no crypto config, I tried clearing crypto stuff, I tried disabling software crypto, all with no luck. Googling didn’t give me much to go on, but I finally ran into something showing this error as an IOS-XE bug in 15.2(4)S2, and I was running 15.2(4)S3 (pure IOS, but basically the same). Being out of options and ideas, I decided to just install 15.2(4)M7 and voilà! Problem solved!

Two routers replaced with — two routers, maybe that doesn’t sound very good, but it will allow me to do more at the edge with more ports available directly on the router instead of playing with switches and VLANs/VRFs.

And in case you want to see how my network is physically wired — and this is somewhat simplified, here you are!

Simplified Network Diagram – 01/01/15

Temporary VM Host update

The AMD E350 (2×1.6GHz Bobcat cores) was a little light on CPU power as a VM host, so I swapped it for a uATX board with an AMD FX-8320 (8×3.5GHz). The PSU in the uATX case is a little light, so Turbo Core was disabled to try to keep it alive. Two flexible PCI-E risers (x1, using a USB 3.0 cable for data transfer) were added to the single 5.25″ bay, one on each side. In those slots are Intel 82571EB-based dual GigE NICs facing each other, held together by a couple of #6-32 standoffs where a PCI bracket would normally go. Unfortunately the motherboard is only 760G-based, so there is no IOMMU support for passing through GigE ports (I’d need a 970 or 990FX motherboard for that). Being on x16->x1 powered risers (commonly used in coin mining setups) leaves just enough bandwidth for 2 GigE ports, ignoring buffer bursts.
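The "just enough" claim can be sanity-checked with rough numbers. This sketch assumes the NIC links at PCIe 1.x speed (2.5 GT/s per lane, which 8b/10b encoding turns into roughly 250 MB/s per direction) and that each GigE port needs at most 125 MB/s per direction. Both figures are nominal, and PCIe packet overhead would eat into the first.

```shell
#!/bin/sh
# Nominal per-direction bandwidth: one PCIe 1.x lane vs. two GigE ports.
# Assumption: the card negotiates a PCIe 1.x x1 link on the riser.
lane_mb=$((2500 * 8 / 10 / 8))   # 2.5 GT/s with 8b/10b encoding -> 250 MB/s
ports=2
port_mb=$((1000 / 8))            # 1 Gbit/s -> 125 MB/s
need_mb=$((ports * port_mb))     # both ports saturated in one direction
echo "x1 lane: ${lane_mb}MB/s, ${ports} GigE ports: ${need_mb}MB/s, headroom: $((lane_mb - need_mb))MB/s"
```

With zero nominal headroom, two saturated ports would sit right at the limit of the lane, which matches the "ignoring buffer bursts" caveat.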

Tiny VM Server

I had an nVidia GT430 (49W TDP, GF108) lying around, so that’s more than adequate for video output for the server without wasting any of the precious 8GB of RAM on onboard video. A GeForce 210/205 would be better as far as power usage, but unfortunately the only 210 I had bit the dust. A GTX 750 (GM107) would also be great, but it’s not worth the cost and unfortunately not available in half-height.

The same 8GB of RAM and 500GB SATA HDD remain. If the GT430 stays, that leaves no room for more NICs, as the board only has x1/x16/x1 slots. It may be worth giving up the x16 slot (and losing a little system RAM to onboard video) for a 4-port GigE card if I run into one.

To keep TDP down even further I may lock down the power states via TurionPowerControl, as there are no adjustable TDP settings like there are on my 2P C32 server.

VMs, Linux Software Bridges and 802.1q — What I Learned This Time

When initially setting up the box, I had the idea in my head that I might create several bridges, one for each VLAN. That’s probably one of the best ways to tackle the issue unless you really want the trunk to exist on the VM, which is also fine and valid. But by default a trunk gives every device the option of accessing any VLAN in the trunk, which, since we’re in a lab environment, is not particularly an issue.

But I like to work reality into lab mockups as much as possible. I have plenty of NIC ports, so even creating a lot of trunks is not an issue. But our VMs will accept a large number of virtual NICs, so this option seemed semi-elegant.
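The one-bridge-per-VLAN layout can be sketched with iproute2 roughly as follows. It is a dry run by default (it only prints the commands). The `run` and `vlan_bridge` helpers are made up for this example, and the interface names (`p32p1`, `br10`, `tap0`) are the ones used on this box, so substitute your own.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# vlan_bridge TRUNK_IF VLAN_ID TAP_IF
# Creates the 802.1q sub-interface TRUNK_IF.VLAN_ID and a bridge brVLAN_ID,
# then attaches the sub-interface and the VM's tap device to that bridge.
# The tap device is assumed to exist already (e.g. created by QEMU).
vlan_bridge() {
    trunk=$1 vid=$2 tap=$3
    run ip link add link "$trunk" name "$trunk.$vid" type vlan id "$vid"
    run ip link add name "br$vid" type bridge
    run ip link set "$trunk.$vid" master "br$vid"
    run ip link set "$tap" master "br$vid"
    run ip link set "$trunk.$vid" up
    run ip link set "br$vid" up
}

vlan_bridge p32p1 10 tap0
vlan_bridge p32p1 1 tap1
```

Each VM NIC then sees untagged frames for exactly one VLAN, which is closer to how an access port behaves on real hardware.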

The first issue I ran into was crosstalk between the VLANs. I had created a bunch of 802.1q sub-interfaces (which strip/tag incoming and outgoing frames) via ‘vconfig’ or ‘ip link’. I attached p32p1.10 to br10 and p32p1.1 to br1, then attached tap0 to br10 and tap1 to br1. Everything appeared to be working on the very initial configuration until I saw the output of ‘sh cdp nei’ on the physical Cisco 3550. It saw itself. That meant it was receiving bounceback. So I loaded up tcpdump, watched bridge traffic, and examined the MACs in the Linux software bridge. There was definitely crosstalk, and after a hunch and a little investigation it turned out that QEMU doesn’t do much to separate NIC traffic when the NICs are defined with the ‘old’ syntax, as I had done. After updating my QEMU launch options the problem disappeared and I was happy… until…
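For reference, the difference between the two styles looks roughly like this. With the legacy `-net` options and no explicit `vlan=` number, every NIC and tap backend lands on the same emulated hub, which is where the crosstalk comes from; `-netdev` with a matching `-device` pairs each guest NIC with exactly one backend. The exact options used for these VMs are not shown in this post, so the NIC model (`virtio-net-pci`), the ids, and the `nic_args` helper below are illustrative assumptions.

```shell
#!/bin/sh
# Build the QEMU arguments for one guest NIC bound to exactly one tap device.
# nic_args INDEX TAP_IF   (helper name, ids and NIC model are assumptions)
nic_args() {
    printf '%s\n' "-netdev tap,id=net$1,ifname=$2,script=no,downscript=no -device virtio-net-pci,netdev=net$1"
}

# Legacy style, for comparison: all four pieces share one hub,
# so frames leak between tap0 and tap1.
#   -net nic -net tap,ifname=tap0 -net nic -net tap,ifname=tap1

nic_args 0 tap0
nic_args 1 tap1
```

The printed fragments would be appended to the rest of the QEMU command line, one per guest interface.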

Neither OSPF nor EIGRP was forming neighborships. Load up tcpdump again, examine traffic. I see the packets hitting br1 and br10 from XRv, and from both the 3550 and the 2821 that I’m currently configuring. That looks good, but ‘debug ospf packet’ on XRv was not giving me anything aside from what it was sending out. So I aimed tcpdump at the tap interfaces instead, and I saw that the tap interface was not receiving the HELLO packets on either VLAN. (Hint: here is where I went wrong in my diagnostic chase. I had filtered tcpdump down to only EIGRP/OSPF; had I not, the problem would’ve been almost immediately evident.)

Thinking, for some reason with no basis in fact, that it might be a multicast issue with Linux software bridges, I decided to configure neighbors manually. That also resulted in no neighborship forming between XRv and any other box. Other traffic (ICMP/TCP/UDP) was unaffected, so I thought that was interesting. I started watching the interface again, this time with no filter, and I saw the VM host replying to XRv with an ICMP Unreachable. Pretty clearly a firewall rule problem. Ebtables (iptables for layer 2 stuff, if you haven’t seen it) was clear, and I didn’t see anything immediately in iptables, but it’s always faster to test fixes than to examine things (and in a lab, perfectly acceptable). A simple iptables -I FORWARD -j ACCEPT, along with removing the manual neighborships so EIGRP and OSPF would both go back to using multicast, resulted in everything working. Great! Classic implicit deny caught me.
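A blanket ACCEPT at the top of FORWARD is fine for a lab, but a narrower fix is possible. Bridged frames only pass through iptables at all when the `net.bridge.bridge-nf-call-iptables` sysctl is 1, so one option is to accept just bridged traffic with the `physdev` match, and another is to turn that sysctl off. This is a dry-run sketch, untested against this box's ruleset; the `run`, `allow_bridged` and `bypass_bridge_netfilter` helpers are made up for the example.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Option 1: accept only traffic that is being bridged, not routed.
allow_bridged() {
    run iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT
}

# Option 2: stop handing bridged frames to iptables altogether.
bypass_bridge_netfilter() {
    run sysctl -w net.bridge.bridge-nf-call-iptables=0
}

# Pick one of the two when running for real; the dry run shows both.
allow_bridged
bypass_bridge_netfilter
```

Either way, routed traffic through the host still hits whatever FORWARD rules are already in place.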

This is where I usually get annoyed by pre-configured rules. Usually I load up Slackware, but I’ve been using Fedora lately for ease of getting some things up and running with real dependency management. Had I been using Slackware, with its default of no rules, everything would’ve been hunky-dory, and I would’ve configured some myself when I felt the time was right.

To continue — I tried to make a new bridge, trunk0, with p32p2 in it. I load up tcpdump and notice that there’s no traffic aside from STP on some VLANs that aren’t in active use. Apparently configuring those subinterfaces whisks the frames away from the main interface, so I just deleted all of them off of p32p1, configured another trunk port and added that to trunk0 on the Linux box and voila! Tagged packets from everywhere! I have yet to try a tap interface into trunk0 and a VM, but I have a feeling everything should be all right. Then again, every time I have that feeling is usually when things are about to go terribly wrong.

VM Host, IOS XRv, CSR1000V

I’m trying to get some IOS-XR and IOS-XE VMs up, mostly to play with some of the IOS-XR configurations. After playing around with some Linux networking stuff that I haven’t done in a while, I was finally able to get the 802.1q trunks through both the Linux bridge and individual VLANs elsewhere. The preconfigured ebtables and iptables rules in Fedora 20 are really annoying; I’ve always preferred to start with Slackware and a blank slate.

So far I have one IOS-XR instance up running successfully and traffic is now normal after I had some weird inter-bridge traffic caused by qemu-kvm.

Ah — the other thing, instead of everyone’s standard VMWare setups, I of course am sticking to my familiar virtualization technologies and running qemu-kvm with all my standard Linux tools. Unfortunately with just a low power dual core AMD E350 and 8GB of RAM at the moment I won’t be running too many instances as XR/XE are really RAM heavy.

Thankfully I’ve kept a separate VLAN and VRF setup on every device for management only so I can (usually) get back into boxes if I break their config without rummaging around for a console cable and USB to RS232 adapter. I really need a 16 port serial card.

So I’m looking at probably 8 XRv instances and 4 XE instances to play with. Unfortunately they have to suffer through a 2Mbit/s rate limit, so I can’t really use them in my network, and they are sadly ridiculously resource heavy. But they’ll be fine to learn some of the IOS-XR stuff on, I suppose.

© 2017 Musings
