Getting My Real VM Server Back Online Part III: Storage, iSCSI, and Live Migrations

After some dubious network configurations (that I should have never configured incorrectly) I finally got multipath working to the main storage server. All of the multipath.conf examples I saw resulted in non-functional iSCSI MPIO, while having no multipath.conf left me with failover MPIO instead of interleaved/round-robin.

A large issue with trying to get MPIO configured was the fact that all the examples I found were either old (and scsi_id works slightly differently in Ubuntu 14.04) or just poor. Yes, I wound up using Ubuntu. Usually I use Slackware for EVERYTHING, but lately I’ve been trying to branch out. Most of the VMs run Fedora, “Pegasus” or VMSrv1 uses Fedora, “Titan” uses Ubuntu.

Before I did anything with multipath.conf (It’s empty on Ubuntu 14.04), I got this:

root@titan:/home/frankd# multipath -ll
size=256G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 13:0:0:0 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 12:0:0:0 sdd 8:48 active ready running

Note that each path sits in its own round-robin group, with only one member each! This works for failover, but does nothing for performance. The only thing that wound up working for multipath.conf was this:

defaults {
 user_friendly_names yes
 polling_interval 3
 path_grouping_policy multibus
 path_checker readsector0
 path_selector "round-robin 0"
 features "0"
 no_path_retry 1
 rr_min_io 100
}

multipaths {
 multipath {
  wwid 1FREEBSD_HTPC1-D1
  alias testLun
 }
}
The wwid/alias doesn’t work, however; all of the MPIO behaviour is coming from the defaults stanza. I attempted many things with no luck, unfortunately. I’m going to have to delve into this more, especially if I want live migrations to work properly with MPIO. As it stands, the disk devices point at a single IP (e.g. /dev/disk/by-path/ip-), so I’ll need to point at aliases to get the VMs working with multipath.

The multipath tests themselves were promising, though: dd was able to give me a whopping 230MB/s to the mapper device over a pair of GigE connections.
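As a rough sanity check on that figure, the sketch below compares 230MB/s against the raw line rate of two GigE links. It assumes dd's "MB" means 10^6 bytes and ignores Ethernet, IP, TCP and iSCSI overhead, so the real ceiling is somewhat lower than the number it prints. The function names are illustrative only.

```python
# Back-of-the-envelope comparison of a measured throughput with raw line rate.
# Assumption: 1 MB = 1,000,000 bytes and no protocol overhead is subtracted.

GIGE_BITS_PER_SECOND = 1_000_000_000  # nominal line rate of one GigE link


def raw_line_rate_mb_s(links: int) -> float:
    """Raw aggregate line rate in MB/s for the given number of GigE links."""
    return links * GIGE_BITS_PER_SECOND / 8 / 1_000_000


def utilisation(measured_mb_s: float, links: int) -> float:
    """Fraction of the raw aggregate line rate that was actually achieved."""
    return measured_mb_s / raw_line_rate_mb_s(links)


if __name__ == "__main__":
    print(f"raw ceiling over two links: {raw_line_rate_mb_s(2):.0f} MB/s")
    print(f"dd result as a share of that: {utilisation(230, 2):.0%}")
```

A single link tops out at 125MB/s raw, so anything well above that suggests both paths are carrying traffic.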

The output from ‘multipath -ll’ now looked more reasonable:

root@titan:/home/frankd# multipath -ll
size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 39:0:0:0 sde 8:64 active ready running
  `- 40:0:0:0 sdg 8:96 active ready running

You can see the drives are both under the same round-robin policy instead of two separate ones.
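The difference between the two outputs can also be checked mechanically. The sketch below is a small, hypothetical parser (not part of multipath-tools) that groups the sdX devices by the "policy=" line above them. It only assumes the layout seen in the two listings in this post.

```python
import re

# Matches a path line such as "| `- 13:0:0:0 sde 8:64 active ready running"
# and captures the block device name (sde).
PATH_RE = re.compile(r"\d+:\d+:\d+:\d+\s+(\w+)\s+\d+:\d+")


def path_groups(multipath_ll_output: str) -> list[list[str]]:
    """Return one list of device names per path group in `multipath -ll` output."""
    groups: list[list[str]] = []
    for line in multipath_ll_output.splitlines():
        if "policy=" in line:
            groups.append([])  # a policy line starts a new path group
            continue
        match = PATH_RE.search(line)
        if match and groups:
            groups[-1].append(match.group(1))
    return groups


BEFORE = """size=256G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 13:0:0:0 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 12:0:0:0 sdd 8:48 active ready running"""

AFTER = """size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 39:0:0:0 sde 8:64 active ready running
  `- 40:0:0:0 sdg 8:96 active ready running"""

if __name__ == "__main__":
    print(path_groups(BEFORE))  # two groups, one path each (failover only)
    print(path_groups(AFTER))   # one group holding both paths (multibus)
```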

The storage server also saw some slight changes, including upgrading from one Intel X25-V 40GB for L2ARC to two X25-Vs for a total of 80GB. I also added a 60GB Vertex 2 as a separate ZIL (SZIL) device. I really need to build a machine with more RAM and partition out the SZIL. I’ll likely wind up using my 840Pro 256GB for L2ARC and leave the old X25-Vs out of the main array once I get a pair of 10GbE cards for maximum speed to my workstation (hopefully near-native for the 840Pro, perhaps better with a large amount of ARC).

So we’re at a point where everything appears to be working, although in need of some upgrades! Great! I’m looking at a KCMA-D8 Dual Opteron C32 motherboard as I have a pair of Opteron 4184s (6 core Lisbon, very similar to a Phenom II X6 1055T) laying around, so I could put together a 32GB 12 core machine for under $400 — but as always, budgetary constraints for a hobby squash that idea quickly.

Getting My Real VM Server Back Online Part II: Storage Server!

Anticipating the arrival of RAM for my VM server tomorrow, I decided I needed some kind of real storage server, so I started working on one. I haven’t touched BSD since I was a kid, so I’m not used to it in general. I wasn’t sure how OpenSolaris would work on my hardware (I hear it’s better on Intel than AMD), so I opted for FreeBSD. Unfortunately I just found out FreeBSD doesn’t have direct iSCSI integration with ZFS, but that’s okay! We can always change OSes later, especially since the storage array leaves a lot to be desired (RAID-Z1 with 4x1TB 2.5″ 5200RPM drives + 40GB Intel X25-V for L2ARC, no separate ZIL).

I’m getting used to the new OS and about to configure iSCSI, which will be handed out via multipath over an Intel 82571EB NIC into two separate VLANs into a dedicated 3550-12T switch. We’ll see how it works, and if it’s fine I’m going to get my HTPC booting over it.

I’m going to look around for a motherboard with more RAM slots. For now I’m stuck with a mATX motherboard, a SAS card that won’t let the system boot, and 2 RAM slots (8GB) with an FX-8320.

Performance tests to come.. after I encounter a dozen issues and hopefully deal with them!

Getting My Real VM Server Back Online

My server has been off hiding somewhere far away from me for a while, so I’ve been running virtual machines on an AMD FX-8320 990FX based box. Unfortunately it only had 16GB of RAM and I gutted the server RAM for use in my workstations.

I’ve decided to order some used ECC Registered 4GB sticks off of eBay; 32GB ought to do for now. I won’t have to worry about whether I can launch a new VM due to RAM constraints (I was using a lot of swap before!), so titan.frankd.lab will soon be back online with the FX-8320 machine for failover. I’m going to need shared storage, so I’ll have to set up a real iSCSI storage box soon.

End short random thought.

Another VM Host Upgrade

And yet another not-exciting blog entry. My VM host with an FX-8320 was on an AMD 760G board, so it lacked IOMMU, which I’d love to have for SR-IOV among other things. I have a spare machine laying around that was formerly a gaming machine. Needing more RAM (the 760G board only had two slots) and IOMMU, I decided to repurpose the gaming machine as the VM host. The 990FX based board already had an FX-8120 in it, so I took a single step back in CPU generation, but it’s fairly close. I only had 8GB of RAM in the old setup, so I combined that with 2x2GB sticks of ECC DDR3 RAM I had hiding in a box. I have a bit of headroom now and can launch a few more VMs with 12GB of total RAM. While that’s not impressive as far as virtualization host hardware goes, it does let me run a bunch of local services for testing/learning/re-learning. Not having onboard graphics with the new board necessitated the use of another video card. Luckily I had some GTX 750 Tis laying around (I seem to experience ‘laying around’ about hardware pretty often), so one went in the bottom PCI-E x4 slot so as to not block any other slot for future upgrades. The Intel I350-T2 card went in the next x4 slot for iSCSI.

VM storage is going to be split off from the hardware, so it will all be through iSCSI with MPIO. That pretty much just leaves me with a ton of PCI-E slots for NICs.

I was able to reduce reported CPU TDP by offlining the “odd” cores (1/3/5/7) while load is low (better to offline these cores, as the pairs 0/1, 2/3, 4/5 and 6/7 each share a module in AMD’s CMT architecture), locking the CPU at idle, and reducing the power state 6 (idle) voltage from 0.9375V to 0.825V, which has been stable so far (sensors reports 0.85V). Power tends to stay close to 30W and never breaks 50W. If it were more heavily utilized I’d let it clock up, but nothing is CPU limited at the moment. I’ll have to try monitoring power usage at forced idle vs the ‘ondemand’ governor with various load transition points. I wouldn’t call anything sluggish, but I don’t have hundreds of devices on my network.
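For illustration, core offlining on Linux goes through the standard CPU hotplug files under /sys/devices/system/cpu. The sketch below only shows one way it could be scripted and is not necessarily how it was done on this box. It assumes the 0/1, 2/3, 4/5, 6/7 module pairing described above, and the helper names are made up for this example. It defaults to a dry run, since actually writing the files needs root.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")  # Linux CPU hotplug directory


def second_cores_of_each_module(total_cores: int) -> list[int]:
    """Odd-numbered cores, assuming modules pair cores as (0,1), (2,3), ..."""
    return [cpu for cpu in range(total_cores) if cpu % 2 == 1]


def online_control_file(cpu: int) -> Path:
    """Path of the sysfs file that takes a core online (1) or offline (0)."""
    return SYSFS_CPU / f"cpu{cpu}" / "online"


def set_core_online(cpu: int, online: bool, dry_run: bool = True) -> str:
    """Return the equivalent shell command; write the file only if dry_run is False."""
    target = online_control_file(cpu)
    value = "1" if online else "0"
    if not dry_run:
        target.write_text(value)  # requires root privileges
    return f"echo {value} > {target.as_posix()}"


if __name__ == "__main__":
    for core in second_cores_of_each_module(8):
        print(set_core_online(core, online=False))
```

Bringing the cores back when load rises is the same call with online=True.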

As for a power supply, the case already had a SeaSonic 660XP2 80+Platinum power supply, so even if I do have to run the CPU at full tilt there should be little waste in the PSU department. It’s completely overkill both for being Platinum at this power level (likely sub 100w at all times), and for its 660w rating. If I was going to buy something I probably would’ve got a SeaSonic Gold which would still allow me plenty of headroom even if it was full of NICs and RAM. It does feel a little safer than running off a 180w power supply with an FX-8320 and a drive array, though.

There are plenty of local services running here. Eventually I’m going to make some (counter)intuitive web GUIs for configuring stuff (i.e. IP address management which then configures DHCP/DNS), so it was good to brush up on configuring these things from scratch.

More Virtualization Hardware

The dual C32 Opteron 4284 (aka titan.frankd.lab) server I have had its 16GB of RAM gutted for my workstation. I’m looking at 32GB-64GB kits on eBay to try to get a decent amount of memory for virtualization. I’ve also picked up an Intel I350 card to go in the “light-weight” FX8320 server in order to have a card with SR-IOV, so it will still have 4 Intel ports + the onboard Broadcom. For now the I350-T2 is replacing the GT430 and video duty is being served by the onboard graphics. It’s not as if a server needs to have video output, but it does make things somewhat easier as I’m right next to the hardware. The other alternative would be running an X server on Windows so I could have access to virt-manager easy-mode.

The C32 already has 2 82571 ports and 2 82576 (SR-IOV) ports, so the extra I350 card will eventually go in there for a total of 6 ports while the FX8320 machine keeps its dual 82571EB cards. This leaves me with lots of options for super-segregating networks and great offload capabilities for the 82576/I350 NICs. Currently I’m running many VLANs to keep things separate (SevOne/Icinga/Graylog all have their own respective VLANs), which means I might be able to get a little more interesting with my routing for actual use.

More ports will be interesting if I actually build an iSCSI server (which I’m certainly planning on once I have money I can actually spend).. though it may necessitate a higher quality switch (4948), or some 10G interconnects.. or IPoIB (IP over Infiniband).

Temporary VM Host update

The AMD E350 (2×1.6GHz Bobcat cores) was a little light on CPU power as a VM host, so I changed it out for a uATX board with an AMD FX-8320 (8×3.5GHz). The PSU in the uATX case is a little light, so turbocore was disabled to try to keep it alive. Two flexible PCI-E risers (1x using a USB3.0 cable for data transfer) were added to the single 5.25″ bay, with one on each side of the bay. In those slots are Intel 82571EB-based dual GigE NICs facing each other, tied together with a couple of #6-32 standoffs where a PCI bracket would normally go. Unfortunately the motherboard is only 760G based, so no IOMMU support for passing through GigE ports (I’d need a 970 or 990FX motherboard for that). Being on x16->x1 powered risers (commonly used in coin mining setups) leaves just enough bandwidth for 2 GigE ports, ignoring buffer bursts.
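The "just enough bandwidth" remark can be put into numbers. The sketch below assumes the x1 link runs at PCIe Gen1 speed (2.5 GT/s per lane with 8b/10b encoding); the post does not state the link speed, so treat that as an assumption. It also ignores PCIe packet overhead, which makes the real margin slightly negative rather than zero.

```python
# Per-direction bandwidth of a PCIe Gen1 link versus a pair of GigE ports.
# Assumptions: Gen1 signalling (2.5 GT/s, 8b/10b), 1 MB = 1,000,000 bytes,
# and no allowance for PCIe or Ethernet framing overhead.

PCIE_GEN1_TRANSFERS_PER_SECOND = 2_500_000_000
GIGE_BITS_PER_SECOND = 1_000_000_000


def pcie_gen1_mb_s(lanes: int) -> float:
    """Usable MB/s per direction: 8 data bits for every 10 bits on the wire."""
    data_bits = lanes * PCIE_GEN1_TRANSFERS_PER_SECOND * 8 // 10
    return data_bits // 8 / 1_000_000


def gige_mb_s(ports: int) -> float:
    """Raw MB/s per direction for the given number of GigE ports."""
    return ports * GIGE_BITS_PER_SECOND // 8 / 1_000_000


if __name__ == "__main__":
    link = pcie_gen1_mb_s(lanes=1)
    nics = gige_mb_s(ports=2)
    print(f"x1 link: {link:.0f} MB/s, two GigE ports: {nics:.0f} MB/s")
    print(f"headroom before overhead: {link - nics:.0f} MB/s")
```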

Tiny VM Server

I had an nVidia GT430 (49w TDP, GF108) laying around, so that’s more than adequate for video output for the server without wasting any of the precious 8GB of RAM for onboard video. A GeForce 210/205 would be better as far as power usage, but unfortunately the only 210 I had bit the bullet. A GTX 750 GM107 would also be great, but not worth the cost — and not available in half-height unfortunately.

The same 8GB of RAM and 500GB SATA HDD remain. If the GT430 stays, that leaves no more room for more NICs, as the board only has x1/x16/x1 slots. If I ran into a 4 port GigE card, it may be worth giving up the x16 slot for it and losing a little system RAM to onboard video in order to pick up 4 more ports.

To keep TDP down even further I may lock down the power states via TurionPowerControl as there’s no adjustable TDP settings like there are on my 2P C32 server.

VMs, Linux Software Bridges and 802.1q — What I Learned This Time

When initially setting up the box, I had the idea in my head that I might create several bridges, one for each VLAN. That’s probably one of the best ways to tackle the issue unless you really want the trunk to exist on the VM, which is also fine and valid. By default, though, a trunk on the VM gives every device the option of accessing any VLAN in the trunk, which is not particularly an issue since we’re in a lab environment.

But I like to work reality into lab mockups as much as possible. I have plenty of NIC ports, so even creating a lot of trunks is not an issue. But our VMs will accept a large number of virtual NICs, so this option seemed semi-elegant.

The first issue I ran into was crosstalk between the VLANs. I had created a bunch of 802.1q sub-interfaces (which strip/tag incoming and outgoing frames) via ‘vconfig’ or ‘ip link’. I attached p32p1.10 to br10 and p32p1.1 to br1. I attached tap0 to br10 and tap1 to br1. Everything appeared to be working on the very initial configuration until I saw the output of ‘sh cdp nei’ on the physical Cisco 3550. It saw itself. That meant it was receiving bounceback. So I loaded up tcpdump, watched bridge traffic and examined the MACs in the Linux software bridge. There was definitely crosstalk, and after a hunch and a little investigation it turned out that QEMU doesn’t do much to separate NIC traffic when the NICs are defined with the ‘old’ syntax, as I had done. After updating my QEMU launch options the problem disappeared and I was happy… until…
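For reference, the per-VLAN layout described above (a tagged sub-interface and a tap device joined in one bridge per VLAN) can be sketched with iproute2 commands. The helper below only builds the command strings and is an illustration, not the exact commands used here. It assumes the tap device already exists (for example, created by QEMU) and that the bridge naming follows the br10/br1 pattern from the text.

```python
def vlan_bridge_commands(parent: str, vlan_id: int, tap: str) -> list[str]:
    """Build iproute2 commands for one VLAN: sub-interface + tap in one bridge.

    Nothing is executed; run the returned commands as root if they look right.
    """
    sub = f"{parent}.{vlan_id}"   # e.g. p32p1.10, tags/untags VLAN 10
    bridge = f"br{vlan_id}"       # e.g. br10
    return [
        f"ip link add link {parent} name {sub} type vlan id {vlan_id}",
        f"ip link add name {bridge} type bridge",
        f"ip link set dev {sub} master {bridge}",
        f"ip link set dev {tap} master {bridge}",  # tap must already exist
        f"ip link set dev {sub} up",
        f"ip link set dev {bridge} up",
    ]


if __name__ == "__main__":
    # The two pairings mentioned in the post.
    for vlan_id, tap in ((10, "tap0"), (1, "tap1")):
        print("\n".join(vlan_bridge_commands("p32p1", vlan_id, tap)))
```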

Neither OSPF nor EIGRP was forming neighborships. Load up tcpdump again, examine traffic. I see the packets hitting br1 and br10 from XRv, and from both the 3550 and the 2821 that I’m currently configuring. That looks good.. but ‘debug ospf packet’ on XRv was not giving me anything aside from what it was sending out. So I aimed tcpdump at the tap interfaces instead, and I saw that the tap interface was not receiving the HELLO packets on either VLAN. (Hint: here is where I went wrong in my diagnostic chase. I had filtered tcpdump down to only EIGRP/OSPF; had I not, the problem would’ve been almost immediately evident.)

Thinking, for some reason with no basis in fact, that it might be a multicast issue with Linux software bridges, I decided to configure neighbors manually. That also resulted in no neighborship forming between XRv and any other box. Other traffic (ICMP/TCP/UDP) was unaffected, which I thought was interesting. I started watching the interface again, this time with no filter, and I saw the VM host replying to XRv with an ICMP Unreachable. Pretty clearly a firewall rule problem. Ebtables (iptables for layer 2 stuff, if you haven’t seen it) was clear, and I didn’t see anything immediately in iptables, but it’s always faster to test fixes than to examine things (and in a lab, perfectly acceptable). A simple iptables -I FORWARD -j ACCEPT, along with removing the manual neighborships so EIGRP and OSPF would both go back to using multicast, resulted in everything working. Great! Classic implicit deny caught me.
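One likely reason an iptables rule can affect frames that are only being bridged: Linux passes bridged IPv4 traffic through the iptables FORWARD chain when the bridge-nf-call-iptables sysctl is 1. Whether that was the setting on this host is an assumption; the post doesn’t say. The sketch below just reads the sysctl file, and the function name is made up for this example.

```python
from pathlib import Path
from typing import Optional

# Present only when the kernel's bridge netfilter support is loaded.
BRIDGE_NF_SYSCTL = Path("/proc/sys/net/bridge/bridge-nf-call-iptables")


def bridged_traffic_hits_iptables(sysctl: Path = BRIDGE_NF_SYSCTL) -> Optional[bool]:
    """True/False according to the sysctl value, or None if it cannot be read."""
    try:
        return sysctl.read_text().strip() == "1"
    except OSError:
        return None  # file missing (module not loaded) or not readable


if __name__ == "__main__":
    state = bridged_traffic_hits_iptables()
    if state is None:
        print("bridge netfilter sysctl not available on this machine")
    else:
        print(f"bridged frames traverse iptables FORWARD: {state}")
```

If it reports True, a narrower FORWARD rule or setting the sysctl to 0 would be alternatives to a blanket ACCEPT.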

This is where I usually get annoyed by pre-configured rules. Usually I load up Slackware, but I’ve been using Fedora lately for ease of getting some things up and running with real dependency management. Had I been using Slackware with its default of no rules, everything would’ve been hunky-dory, and I would’ve configured some myself when I felt the time was right.

To continue — I tried to make a new bridge, trunk0, with p32p2 in it. I load up tcpdump and notice that there’s no traffic aside from STP on some VLANs that aren’t in active use. Apparently configuring those subinterfaces whisks the frames away from the main interface, so I just deleted all of them off of p32p1, configured another trunk port and added that to trunk0 on the Linux box and voila! Tagged packets from everywhere! I have yet to try a tap interface into trunk0 and a VM, but I have a feeling everything should be all right. Then again, every time I have that feeling is usually when things are about to go terribly wrong.

VM Host, IOS XRv, CSR1000V

I’m trying to get some IOS-XR and IOS-XE VMs up, mostly to play with some of the IOS-XR configurations. After playing around with some Linux networking stuff that I haven’t done in a while, I was finally able to get the 802.1q trunks through both the Linux bridge and individual VLANs elsewhere. The preconfigured ebtables and iptables rules in Fedora 20 are really annoying; I’ve always preferred to start with Slackware and a blank slate.

So far I have one IOS-XR instance up running successfully and traffic is now normal after I had some weird inter-bridge traffic caused by qemu-kvm.

Ah, the other thing: instead of everyone’s standard VMware setups, I of course am sticking to my familiar virtualization technologies and running qemu-kvm with all my standard Linux tools. Unfortunately, with just a low power dual core AMD E350 and 8GB of RAM at the moment, I won’t be running too many instances, as XR/XE are really RAM heavy.

Thankfully I’ve kept a separate VLAN and VRF setup on every device for management only so I can (usually) get back into boxes if I break their config without rummaging around for a console cable and USB to RS232 adapter. I really need a 16 port serial card.

So I’m looking at probably 8 XRv instances and 4 XE instances to play with. Unfortunately they have to suffer through a 2Mbit/s rate limit, so I can’t really use them in my network.. and they are sadly ridiculously resource heavy. But they’ll be fine to learn some of the IOS-XR stuff on, I suppose.

© 2017 Musings