Category: Networking

Cellular backup (again) via Google’s Project Fi, a Cisco 3825 and an HWIC-3G-GSM

I get really poor signal with T-Mobile and Sprint in my area, but I am using Project Fi for my cellular service. I figured I’d grab one of their free SIMs and put my HWIC-3G-GSM back in service. Unfortunately, I get really poor signal on it too: an RSSI of about -102 dBm! I’m going to have to see if I can get a cheap TNC antenna with a cord long enough to put it outside; for some reason my Nexus 5X gets good signal, even LTE, just by stepping outside.

For now I figured I’d get the basic configuration done. Even with little to go by it was fairly easy to set up, especially as most cell carriers don’t require PAP/CHAP authentication.


router(config)#
chat-script gsm "" "ATDT*99#" TIMEOUT 30 CONNECT
!
interface Cellular0/0/0
 description Project Fi - TMobile
 ip address negotiated
 ip nat outside
 ip virtual-reassembly in
 encapsulation ppp
 dialer in-band
 dialer idle-timeout 360 either
 dialer string GSM
 dialer-group 1
 async mode interactive
 ppp chap hostname h2g2
 ppp chap password 0 ""
 ppp ipcp dns request
!
ip nat inside source list 1 interface Cellular0/0/0 overload
ip route 0.0.0.0 0.0.0.0 Cellular0/0/0
!
access-list 1 permit any
dialer-list 1 protocol ip list 1
!
line 0/0/0
 script dialer GSM
 modem InOut
 no exec
 rxspeed 3600000
 txspeed 384000
!
^Z


router#cellular 0/0/0 gsm profile create 1 h2g2 

Note that you can also pass authentication information in the GSM profile creation; however that isn’t needed for Project Fi.

This was a basic test just to see if I could get connectivity. I’m going to use it as an alternate route to the big cloud that is the internet in the event that I really need access and Time Warner Cable has let me down by abusing maintenance windows daily, as seems to be the case lately!

So far I’ve run into issues: I’d disconnect from the network and not be able to reconnect. Sometimes a modem power-cycle fixes this. I decided to upgrade the crusty old firmware to something slightly less crusty (note: I hate Sierra Wireless modems). Cisco says “not for use in the US,” however it IS listed for the MC8775 modem on my HWIC, and I know people have used it for the same modem in ThinkPads (trusty T61/T61p!) in the US. I couldn’t get it through Cisco as you need a support contract for Sierra Wireless firmware, oddly enough. I was able to find it online; you’d be looking for version 2.0.8.19, generally named 8775_h2_0_8_19.tar. This of course will depend on the modem in your HWIC, as they didn’t all come with MC8775s.

Check your hardware! It’s easy enough to see what modem is on your HWIC even if you don’t want to physically pull it to check it:

cell.wan#sh controllers cellular 0/0/0
Interface Cellular0/0/0
HSDPA/UMTS/EDGE/GPRS-850/900/1800/1900/2100MHz unit 0, 
HWIC cellular modem configuration:
---------------------------
Modem is recognized as valid for this HWIC
manufacture id: 0x00001199 product id: 0x00006812
Sierra Wireless MC8775 UMTS modem.
GPS State: GPS disabled

Pre-upgrade:

cell.wan#sh cell 0/0/0 hardware
Modem Firmware Version = H1_1_8_3MCAP C:/WS/F
Modem Firmware built = 03/08/07
Hardware Version = 1.0
International Mobile Subscriber Identity (IMSI) = NUMBERSHERE
International Mobile Equipment Identity (IMEI) = NUMBERSHERE
Integrated Circuit Card ID (ICCID) = NUMBERSHERE
Mobile Subscriber International Subscriber
IDentity Number (MSISDN) = NUMBERSHERE
Factory Serial Number (FSN) = NUMBERSHERE
Modem Status = Low Power Mode
Current Modem Temperature = 23 deg C, State = Normal
PRI SKU ID = 0, SKU Rev. = 20.0

Upgrade process:

cell.wan#microcode reload cellular 0 0 gsm modem-provision
Reload microcode? [confirm]
Log status of firmware download in router flash?[confirm]
Firmware download status will be logged in flash:fwlogfile
Microcode Reload Process launched for hwic slot=0; hw type=0x51E
cell.wan#
*****************************************************
 The interface will be Shut Down for Firmware Upgrade 
 This will terminate any active data connections.
 Do not make any config changes related to the interface.
*****************************************************
Modem radio has been turned off
*****************************************************
 Modem will be upgraded!
 Upgrade process will take up to 15 minutes. During 
 this time the modem will be unusable.
 Please do not remove power or reload the router during 
 the upgrade process.
*****************************************************
backing up NV data..Could take up to 3 minutes
*Aug 23 10:18:06.423: %LINK-5-CHANGED: Interface Cellular0/0/0, changed state to administratively down++++++++++++++++++++++++++++++
Prepare modem for downloading boot image.
Begin boot image download
Firmware [size:234279 bytes] will be downloaded in 228 segments
Sync indication Successful
Sync indication Successful
***** Boot File Upgrade OK *****
******************************************
The firmware file will be copied in blocks 
from Compact Flash. Please DO NOT remove 
Compact Flash during Upgrade Process. Doing 
so will cause download failure and leave 
modem in unusable state
*******************************************
Begin application image download
Firmware [size:13393280 bytes] will be downloaded in 13129 segments
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Sync indication Successful
Sync indication Successful
***** Application Firmware Upgrade OK *****
Modem Upgrade OK
*Aug 23 10:25:30.179: %CELLWAN-2-MODEM_DOWN: Modem in HWIC slot 0/0 is DOWN
*Aug 23 10:25:46.963: %CELLWAN-2-MODEM_UP: Modem in HWIC slot 0/0 is now UP
*Aug 23 10:25:46.963: %CELLWAN-2-MODEM_DOWN: Modem in HWIC slot 0/0 is DOWN
*Aug 23 10:26:00.191: %CELLWAN-2-MODEM_UP: Modem in HWIC slot 0/0 is now UP
Modem radio has been turned on
*Aug 23 10:26:44.587: %LINK-3-UPDOWN: Interface Cellular0/0/0, changed state to down

Post-upgrade:

cell.wan#sh cellular 0/0/0 hardware
Modem Firmware Version = H2_0_8_19MCAP G:/WS/
Modem Firmware built = 08/29/08
Hardware Version = 1.0
International Mobile Subscriber Identity (IMSI) = NUMBERSHERE
International Mobile Equipment Identity (IMEI) = NUMBERSHERE
Integrated Circuit Card ID (ICCID) = NUMBERSHERE
Mobile Subscriber International Subscriber
IDentity Number (MSISDN) = NUMBERSHERE
Factory Serial Number (FSN) = NUMBERSHERE
Modem Status = Online
Current Modem Temperature = 27 deg C, State = Normal
PRI SKU ID = 9991803, SKU Rev. = 1.3

The 2.0.8 line increases the HSDPA downlink speed from 3.6 Mbit/s to 7.2 Mbit/s (assuming your signal is acceptable, of course). I’m just hoping it stabilizes my connection and I don’t have to do any actual debugging! After installing the firmware update I had no problems bringing a connection up. So far it seems “more stable,” but only time will tell if it randomly drops. It also gives more information in the output of “show cellular 0/0/0”.

Once I get an antenna and (hopefully) better signal so I can maintain “usable” speeds for basic browsing, I’ll throw it in the routing table. For now, it can sit to the side as a novelty.

How My Network Broke Today (Part I of at least a billion)

So today I went to spin up a new VM for development use. It wouldn’t get an IP address. I saw the DHCP request arrive on the DHCP server and saw an offer go out, but the VM never received it. I dug through, and it seemed like this was just happening on one VLAN since everything else was OK.

Did I mention everything else was already running?

Did I mention that if I had a trap collector with an alarm board I would have known what had happened almost immediately, and been able to pinpoint the issue before I even saw the effects?

No? Well, now I have.

Let’s just say that I spent over an hour digging, running tcpdump on various interfaces, then finally hit the switches. I noticed there was only one port in the port channel on the Dell 5224 access switch when there should have been two down to the distribution switch. Odd but I thought inconsequential (at the time).

I got into the Cisco switch and saw MAC flaps (TRAPPABLE) all over the place on Po2. Odd again. The Dell switch must be to blame, so I went back to it and shut the port that wasn’t in the LACP port channel but should have been. Things improved. Have I mentioned that I’d unplugged that fiber a week ago and only recently got a new one to plug back in?

I spent some time trying to get both ports into the port channel, to no avail. I finally looked at the config and noticed the allowed-VLAN config was slightly off (one VLAN was missing from eth 1/23). I shut both ports on the Cisco side, as Dell won’t let you change an interface’s config while it’s part of a port channel and this was just faster. I reset the eth 1/23 config to match eth 1/24, and voila, both ports came up.

But things were even worse now: barely any MACs from Po2 were seen in ‘show mac address-table’ on my 3550-12, and they were all on VLAN 1. Ugh. I shut the interfaces again and reset some more of the configuration on the Dell switch. I pray. (I don’t really pray.) I bring the interfaces back up and all is good. The VM gets its IP address and everything is right in the world.

I really hate the Dell configurations. To say I hated this switch before would be an understatement, and today it’s only given me more of a reason to want to smash it with a hammer. It’s mainly due to me not being familiar with them, but their configs aren’t as intuitive as I’d like.

iSCSI Booting Win2012 Server WITHOUT an HBA (Intel I350-T2 / 82571 / 82574 etc)

Thankfully Intel cards have iSCSI initiators in their firmware, so I set up a ZFS volume to make my HTPC diskless, both to stress the file server a bit more and generally just to play with things as I tend to do.

So I added some settings to my ISC DHCP daemon under my shared network stanza to pass IQN/server settings to the Intel I350 card (82574 etc would work equally well here):

shared-network "VLAN-451" {
 default-lease-time 720000;
 option domain-name "p2.iscsi.frankd.lab";
 option domain-name-servers ns.frankd.lab;
 subnet 172.17.2.128 netmask 255.255.255.128 {
  range 172.17.2.144 172.17.2.239;
 }
 host intel-htpc1 {
  hardware ethernet a0:36:9f:03:99:7c;
  filename "";
  option root-path "iscsi:172.17.2.130::::iqn.2014-12.lab.frankd:htpc1";
 }
}
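The root-path option above follows the RFC 4173 layout, "iscsi:" server ":" protocol ":" port ":" LUN ":" target name, where empty fields fall back to defaults (TCP, port 3260, LUN 0). As a rough sketch of how that string breaks down, here is a small Python helper. The function and class names are made up for illustration, and it only handles IPv4 addresses or hostnames as the server (no bracketed IPv6).

```python
from typing import NamedTuple


class IscsiRootPath(NamedTuple):
    server: str
    protocol: str
    port: str
    lun: str
    target: str


def build_root_path(server, target, protocol="", port="", lun=""):
    """Assemble an RFC 4173 style root-path; empty fields mean 'use the default'."""
    return f"iscsi:{server}:{protocol}:{port}:{lun}:{target}"


def parse_root_path(value):
    """Split a root-path into its fields, filling in the RFC 4173 defaults."""
    prefix = "iscsi:"
    if not value.startswith(prefix):
        raise ValueError("not an iSCSI root-path")
    # maxsplit=4 because the target name (an IQN) usually contains colons itself.
    parts = value[len(prefix):].split(":", 4)
    if len(parts) != 5:
        raise ValueError("expected server:protocol:port:lun:target")
    server, protocol, port, lun, target = parts
    return IscsiRootPath(server, protocol or "6", port or "3260", lun or "0", target)


if __name__ == "__main__":
    print(parse_root_path("iscsi:172.17.2.130::::iqn.2014-12.lab.frankd:htpc1"))
```

The four consecutive colons in the DHCP option are therefore just three empty fields: protocol, port and LUN are all left at their defaults.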

Voila, the card came up, grabbed DHCP settings and immediately initiated a connection! Awesome, the first thing to go right so far! I admit I briefly spent some time trying to get iPXE to work with the Realtek card, but I ran into issues and just decided to use something I had lying around to get up and running quicker. The onboard Realtek is now for regular network data only. I might get a single-port Intel card since I don’t need MPIO to this machine.

I imaged Win2012 Server to a USB stick using Rufus and plugged it in; the installer saw the drive and installed to it. I can’t believe things are going so well for once! Then the system reboots. And it mounts the volume. And the Windows logo comes up. Then an error message comes up saying it couldn’t boot. Right away I knew it wasn’t getting past the BIOS calls to the disk (which were taken care of by the Intel NIC). Some Googling came up with horrible answers until I found an IBM document saying, in a very indirect way, that a newer Intel driver fixes the issue. They don’t specify what changed, but it apparently has something to do with the iBFT tables that are created for the handoff. So I downloaded the newest drivers, put them on the USB stick and installed Windows 2012 Server AGAIN. This time, though, I loaded the newest version of the network drivers off the USB stick before even partitioning the disk.

The machine rebooted...

And...

IT WORKED! I was up and running. I installed the User Experience stuff so I could get Netflix/Hulu up easily, downloaded nVidia drivers and am now getting my Steam games downloaded to the machine, although I could stream off my workstation/gaming PC. It can’t hurt to have more than one machine with them installed in case one of them dies and I need to go blow some pixels up to relieve some stress though, right?

Getting My Real VM Server Back Online Part III: Storage, iSCSI, and Live Migrations

After some dubious network configurations (that I should have never configured incorrectly) I finally got multipath working to the main storage server. All of the multipath.conf examples I saw resulted in non-functional iSCSI MPIO, while having no multipath.conf left me with failover MPIO instead of interleaved/round-robin.

A large issue with trying to get MPIO configured was the fact that all the examples I found were either old (and scsi_id works slightly differently in Ubuntu 14.04) or just poor. Yes, I wound up using Ubuntu. Usually I use Slackware for EVERYTHING, but lately I’ve been trying to branch out. Most of the VMs run Fedora, “Pegasus” or VMSrv1 uses Fedora, “Titan” uses Ubuntu.

Before I did anything with multipath.conf (It’s empty on Ubuntu 14.04), I got this:

root@titan:/home/frankd# multipath -ll
1FREEBSD HTPC1-D1 dm-2 FREEBSD,CTLDISK
size=256G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 13:0:0:0 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 12:0:0:0 sdd 8:48 active ready running

Note that both path groups are round-robin, with only one member each! This works for failover, but does nothing for performance. The only thing that wound up working for multipath.conf was this:

defaults {
 user_friendly_names yes
 polling_interval 3
 path_grouping_policy multibus
 path_checker readsector0
 path_selector "round-robin 0"
 features "0"
 no_path_retry 1
 rr_min_io 100
}

multipaths {
 multipath {
  wwid 1FREEBSD_HTPC1-D1
  alias testLun
 }
}

The wwid/alias doesn’t work, however; all of the MPIO behavior is just coming from the defaults stanza. I attempted many things with no luck, unfortunately. (Possibly relevant: ‘multipath -ll’ reports the WWID with a space, “1FREEBSD HTPC1-D1”, while the stanza above uses an underscore, so the two may simply never match.) I’m going to have to delve into this more, especially if I want live migrations to work properly with MPIO. As it stands the disk devices point at a single IP (e.g. /dev/disk/by-path/ip-172.17.2.2:3260-iscsi-iqn.2014-12.lab.frankd:htpc1-lun-0); I’ll need to point at aliases to get the VMs working with multipath.

The multipath tests themselves were promising though: dd was able to give me a whopping 230MB/s to the mapper device over a pair of GigE connections.
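To put that number in context, here is a quick back-of-the-envelope check in Python. It assumes dd is reporting decimal megabytes and ignores all Ethernet, TCP and iSCSI overhead, so the real ceiling is somewhat lower than what it prints.

```python
def gige_ceiling_mb_per_s(links, link_gbit=1.0):
    """Raw line-rate ceiling in decimal MB/s for a number of links, no overhead."""
    return links * link_gbit * 1000 / 8


ceiling = gige_ceiling_mb_per_s(2)   # two GigE paths
efficiency = 230 / ceiling           # measured dd rate vs. raw ceiling

if __name__ == "__main__":
    print(f"ceiling: {ceiling:.0f} MB/s, measured: 230 MB/s ({efficiency:.0%})")
```

In other words, 230MB/s is about 92% of the raw line rate of two GigE links, which suggests both paths really are carrying traffic.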

The output from ‘multipath -ll’ now looked more reasonable:

root@titan:/home/frankd# multipath -ll
mpath1 (1FREEBSD HTPC1-D1) dm-2 FREEBSD,CTLDISK
size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 39:0:0:0 sde 8:64 active ready running
  `- 40:0:0:0 sdg 8:96 active ready running

You can see both paths are now under the same round-robin path group instead of two separate ones.
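The difference between the two outputs is easy to miss by eye, so here is a minimal Python sketch that counts path groups and their member disks from ‘multipath -ll’ text. The parsing is an assumption based only on the two samples in this post (a "policy=" line starts a group, an H:C:T:L line followed by an sdX name is a path); it is not a general parser for every multipath-tools version.

```python
import re

# A path line looks like "13:0:0:0 sde 8:64 active ready running".
PATH_RE = re.compile(r"\d+:\d+:\d+:\d+\s+(sd[a-z]+)\s")

BEFORE = """1FREEBSD HTPC1-D1 dm-2 FREEBSD,CTLDISK
size=256G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 13:0:0:0 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 12:0:0:0 sdd 8:48 active ready running
"""

AFTER = """mpath1 (1FREEBSD HTPC1-D1) dm-2 FREEBSD,CTLDISK
size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 39:0:0:0 sde 8:64 active ready running
  `- 40:0:0:0 sdg 8:96 active ready running
"""


def path_groups(output):
    """Return one list of disk names per path group, in the order shown."""
    groups = []
    for line in output.splitlines():
        if "policy=" in line:
            groups.append([])          # a new path group starts here
            continue
        match = PATH_RE.search(line)
        if match and groups:
            groups[-1].append(match.group(1))
    return groups


if __name__ == "__main__":
    print("before:", path_groups(BEFORE))
    print("after: ", path_groups(AFTER))
```

Two groups of one path each means failover only; one group holding both paths is what lets round-robin actually spread I/O across the links.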

The storage server also saw some slight changes, including upgrading from one Intel X25-V 40GB for L2ARC to two X25-Vs for a total of 80GB. I also added a 60GB Vertex 2 as a separate ZIL (SLOG) device. I really need to build a machine with more RAM and partition out the SLOG. I’ll likely wind up using my 840 Pro 256GB for L2ARC and leave the old X25-Vs out of the main array once I get a pair of 10GbE cards for maximum speed to my workstation (hopefully near-native for the 840 Pro, perhaps better with a large amount of ARC).

So we’re at a point where everything appears to be working, although in need of some upgrades! Great! I’m looking at a KCMA-D8 dual Opteron C32 motherboard as I have a pair of Opteron 4184s (6 core Lisbon, very similar to a Phenom II X6 1055T) lying around, so I could put together a 32GB 12 core machine for under $400. But as always, budgetary constraints for a hobby squash that idea quickly.

Getting My Real VM Server Back Online Part II: Storage Server!

Anticipating the arrival of RAM for my VM server tomorrow I decided I needed some kind of real storage server, so I started working on one. I haven’t touched BSD since I was a kid, so I’m not used to it in general. I wasn’t sure how OpenSolaris would work on my hardware (I hear it’s better on Intel than AMD) so I opted for FreeBSD. Unfortunately I just found out FreeBSD doesn’t have direct iSCSI integration with ZFS, but that’s okay! We can always change OSes later, especially since the storage array leaves a lot to be desired (RAID-Z1 with 4x1TB 2.5″ 5200RPM drives + 40GB Intel X25-V for L2ARC, no separate ZIL).

I’m getting used to the new OS and about to configure iSCSI, which will be handed out via multipath over an Intel 82571EB NIC into two separate VLANs into a dedicated 3550-12T switch. We’ll see how it works, and if it’s fine I’m going to get my HTPC booting over it.

I’m going to look around for a motherboard with more RAM slots. For now I’m stuck with a mATX motherboard, a SAS card that won’t let the system boot, and 2 RAM slots (8GB) with an FX-8320.

Performance tests to come.. after I encounter a dozen issues and hopefully deal with them!

Rearranging The Intranet of Things Part II

I’m sure there will be a lot more posts like this to come. I had formerly moved the edge router to the ‘closet’ (aka the garage, right next to the cable modem and 3560-24PS sitting there) and added another router there to have a routed gig port into my ‘office’ (aka my bedroom with a couple desks).

Today I replaced both routers with a single 7206VXR with an NPE-G1. I had it all configured and everything should’ve worked off the bat, but it didn’t — not exactly, anyway. The routing was perfect, the NAT was great. But I only have a VAM card which doesn’t work with 15.x (only VAM2 cards work with new code), and I didn’t want it doing VPN in software.

So I decided to keep the old WAN router on VPN-only duty. I briefly considered using a 1760 with a VPN module (I have a few), but it would choke once I finally get decent internet speeds. The 3825 has an EPII+ card on top of the onboard hardware engine, so it should at least have no issue keeping up with my internet connection with weak Triple-DES. The only issue is that when I went to forward UDP 4500 from the edge router to the VPN router I got:

% Port 4500 is being used by system

I was able to successfully forward UDP 500 and ESP, but here I got stumped. I verified there was no crypto config, I tried clearing crypto stuff, I tried disabling software crypto, all with no luck. Googling didn’t give me much to go on, but I finally ran into something showing this error as an IOS-XE bug in 15.2(4)S2, and I was running 15.2(4)S3 (pure IOS, but basically the same). Being out of options and ideas, I decided to just install 15.2(4)M7 and voila! Problem solved!

Two routers replaced with — two routers, maybe that doesn’t sound very good, but it will allow me to do more at the edge with more ports available directly on the router instead of playing with switches and VLANs/VRFs.

And in case you want to see how my network is physically wired — and this is somewhat simplified, here you are!

[Figure: Simplified Network Diagram, 01/01/15]

Rearranging The Intranet of Things

So after dealing with a bunch of random dd-wrt based access points I decided to grab some LAP1142Ns off of eBay. I set up a vWLC on the VM machine, and was able to get it going fairly quickly even with no knowledge of Cisco Wireless technology.

So far my throughput is only slightly increased even after moving to 5GHz and having a 3×3 MIMO radio in my laptop.

I added a real router for the upstairs network (3825), and a gig link from the ‘closet’ to my office/workstations. Some of the interconnects in the lab are temporarily dual 100 Mbit links load-balanced via EIGRP to alleviate some of the bottlenecks. The LAP1142Ns are limited to 100 Mbit due to a 3560-24PS being the only PoE switch I have, but I never see more than about 60 Mbit of throughput over wireless, and the port never exceeds 70 Mbit, so until I get that sorted out it’s not a limitation.

To get more gig links in my ‘office’ (aka my bedroom) I trunked a cheap Dell 5224 to a 3550-12G, replacing the 3550-12T that was formerly there. I wish I could afford newer Cisco gig switches, but my budget is basically non-existent.

I still need a total network redesign; my routing table is almost laughable:

dswr1.core#sh ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
 D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
 N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
 E1 - OSPF external type 1, E2 - OSPF external type 2
 i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
 ia - IS-IS inter area, * - candidate default, U - per-user static route
 o - ODR, P - periodic downloaded static route

Gateway of last resort is 172.16.5.6 to network 0.0.0.0

D 192.168.30.0/24 [90/28928] via 10.255.1.6, 22:58:10, FastEthernet0/16
 [90/28928] via 10.255.1.2, 22:58:10, FastEthernet0/14
 172.17.0.0/16 is variably subnetted, 6 subnets, 2 masks
D 172.17.0.48/28 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
D 172.17.0.32/28 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
D 172.17.0.16/28 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
D 172.17.0.0/28 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
D 172.17.0.72/29 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
D 172.17.0.64/29 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
 172.16.0.0/16 is variably subnetted, 7 subnets, 4 masks
C 172.16.255.0/28 is directly connected, Vlan601
D 172.16.2.8/30 [90/28416] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28416] via 10.255.1.2, 1d00h, FastEthernet0/14
D 172.16.2.4/30 [90/28672] via 10.255.1.6, 22:58:18, FastEthernet0/16
 [90/28672] via 10.255.1.2, 22:58:18, FastEthernet0/14
C 172.16.5.4/30 is directly connected, FastEthernet0/24
D 172.16.3.2/32 [90/156672] via 10.255.1.6, 22:58:14, FastEthernet0/16
 [90/156672] via 10.255.1.2, 22:58:14, FastEthernet0/14
D 172.16.1.0/24 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
D 172.16.3.1/32 [90/156160] via 172.16.5.6, 10:49:53, FastEthernet0/24
 172.18.0.0/28 is subnetted, 1 subnets
D 172.18.0.0 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
D 192.168.99.0/24 [90/28928] via 10.255.1.6, 03:03:00, FastEthernet0/16
 [90/28928] via 10.255.1.2, 03:03:00, FastEthernet0/14
 10.0.0.0/30 is subnetted, 2 subnets
C 10.255.1.4 is directly connected, FastEthernet0/16
C 10.255.1.0 is directly connected, FastEthernet0/14
D 192.168.0.0/24 [90/30720] via 172.16.5.6, 10:49:54, FastEthernet0/24
D 192.168.100.0/24 [90/28672] via 10.255.1.6, 1d00h, FastEthernet0/16
 [90/28672] via 10.255.1.2, 1d00h, FastEthernet0/14
C 192.168.101.0/24 is directly connected, Vlan400
D*EX 0.0.0.0/0 [170/30720] via 172.16.5.6, 10:49:54, FastEthernet0/24
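The second number in each [90/…] pair above is the EIGRP composite metric. With the default K values it reduces to 256 × (10^7 / slowest-link bandwidth in kbit/s + total delay in tens of microseconds). The Python sketch below reproduces a few of the values in the table. Note that the delay totals passed in are back-calculated guesses that happen to fit (IOS defaults of 100 µs per FastEthernet hop and 5000 µs for a loopback are assumed), not values read from the actual routers.

```python
def eigrp_metric(min_bandwidth_kbps, total_delay_usec):
    """Classic EIGRP composite metric with default K values (K1 = K3 = 1)."""
    bandwidth_term = 10_000_000 // min_bandwidth_kbps   # inverse of slowest link
    delay_term = total_delay_usec // 10                 # delay in tens of usec
    return 256 * (bandwidth_term + delay_term)


FAST_ETHERNET_KBPS = 100_000

if __name__ == "__main__":
    # Assumed delay totals (usec) chosen to match entries in the table above.
    for label, delay in [("172.17.0.0/28", 120),
                         ("192.168.0.0/24", 200),
                         ("172.16.3.1/32 (loopback)", 5100)]:
        print(label, eigrp_metric(FAST_ETHERNET_KBPS, delay))
```

Because both uplinks (FastEthernet0/14 and FastEthernet0/16) produce identical metrics for most prefixes, EIGRP installs both next hops, which is where the dual 100 Mbit load balancing comes from.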

A lot of bit of nothing

As sometimes happens, personal stuff has taken hold of my life and stopped me from doing anything major with anything technology related. I decided that I should pick a little project to pick up some new skills, so I’ll be setting up Cisco’s AIR-CTVM wireless controller along with a couple of LAP-1142N 802.11n (draft) access points that I picked up off of eBay, to get rid of the DD-WRT APs, which haven’t been entirely cooperative. For example, the Netgear WNR834B v2 will only use the assigned base channel with the second channel two channels above it (currently channels 6 and 8), which is clearly not optimal for throughput.
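To show why 6 and 8 is a poor pairing: 2.4GHz channel numbers are 5 MHz apart, and a 40 MHz 802.11n channel normally bonds a primary with a secondary four channel numbers (20 MHz) away. A small Python sketch of the arithmetic, with function names made up for this example:

```python
def center_mhz(channel):
    """Center frequency of a 2.4GHz channel (valid for channels 1 through 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("only channels 1-13 follow the 5 MHz spacing")
    return 2407 + 5 * channel


def ht40_secondary(primary, above=True):
    """Secondary channel for 40 MHz operation: four channel numbers away."""
    return primary + 4 if above else primary - 4


if __name__ == "__main__":
    print("ch 6 :", center_mhz(6), "MHz")
    print("ch 8 :", center_mhz(8), "MHz (only 10 MHz above channel 6)")
    print("ch", ht40_secondary(6), ":", center_mhz(ht40_secondary(6)), "MHz (20 MHz above)")
```

With the second channel only 10 MHz up, the two 20 MHz halves overlap each other instead of sitting side by side.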

I’m going to be rearranging my home network to segment it a bit more and do some more with routing. I want to keep the LAPs running off the 3560-24PS with PoE power instead of powering them with external bricks, so unfortunately each AP will be limited to 100 Mbit of throughput. That’s actually still better than what I get now over the 2.4GHz N AP, so it’ll still be a usable throughput improvement.

I’ll also be able to actually do some L3 segmenting instead of needing to share a VLAN across physical boundaries for the ‘dumb’ AP bridges currently in place.

I’ve been doing some work on IP management software, and while a lot of the back-end functionality is currently there for calculation, I’d like to rewrite some of it for speed. There are parts that are written strictly for readability using strings instead of bit compares, and they’re much slower than I’d like them to be for large address spaces. I should have something interesting to show if I can manage to put a little more time into it.
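To illustrate the string-versus-bit-compare difference, here is a sketch of the same subnet membership test written both ways. It is in Python purely for illustration and is not taken from the IP management software itself; the function names are hypothetical.

```python
import ipaddress


def to_bit_string(ip):
    """Dotted quad -> 32-character string of 0s and 1s (readable, but slow)."""
    return "".join(f"{int(octet):08b}" for octet in ip.split("."))


def in_subnet_strings(addr, network, prefix_len):
    """String version: compare the first prefix_len characters."""
    return to_bit_string(addr)[:prefix_len] == to_bit_string(network)[:prefix_len]


def in_subnet_bits(addr_int, network_int, prefix_len):
    """Integer version: one mask and one compare per lookup."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return (addr_int & mask) == (network_int & mask)


if __name__ == "__main__":
    net = int(ipaddress.IPv4Address("172.17.2.128"))
    host = int(ipaddress.IPv4Address("172.17.2.144"))
    print(in_subnet_strings("172.17.2.144", "172.17.2.128", 25))
    print(in_subnet_bits(host, net, 25))
```

The integer version does no string building at all once the addresses are stored as integers, which is what matters when walking a large address space.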

PHP, Screen Scraping & SevOne Deferred Data — or a Network Operator/Engineer That Can Do More Than Networks! (Possible Rant)

This post is semi work related, but I do have SevOne running at home as my NMS for graphing, trending and alerting. There are some statistics that I insert via their ‘deferred data’ feature, which is done through a (fairly horrible) SOAP API. Before Cablevision I hadn’t written a line of PHP, or used SOAP in any language, but all of the SevOne examples were written in PHP. I picked it up to write some scripts that insert data from CMTSes (Cable Modem Termination Systems) that was not available through SNMP, which was really helpful in our monitoring and alerting. As I do have some kind of programming background it wasn’t difficult to pick up the ‘beginner’s’ scripting language. Eventually I wrote a class to encapsulate a bunch of stuff that I do to interface with SevOne, as we could really spike the CPU usage if we did things by way of their example scripts.

The entire CMTS script involved me writing a telnet wrapper that uses socket calls and is entirely written in PHP. This was probably a naive approach and a stream to telnet would’ve been much better (as we will be moving to SSH anyway), although it did allow me to flex some critical thought muscles I forgot I had and hadn’t used in some time. After some tuning it’s actually pretty fast and doesn’t present a bottleneck or resource drain. Eventually even the telnet functions got expanded considerably into a couple of classes for accessing data on IOS and the Arris C4. The IOS functions got expanded to parse data for some critical CMTS-specific things, then for other pieces of equipment (e.g. 7600/6500). The base telnet class was later extended to IOS-XR for ASR9K devices. Then NX-OS. A flexible SNMP wrapper was created primarily for polling STBs (Set Top Boxes).

I got pretty good at writing PHP in a time-efficient manner to parse all kinds of stuff. As a “network operator” this was great! I could get all kinds of statistics that would normally be “impossible” or really “time-intensive.” The SevOne stuff expanded and got refined as I dug more into my programming background. Even though I was using a fairly clumsy language without formal training, I had a good idea of how to write EFFECTIVE PHP (in my mind anyway). Parts of it might be ugly, but it gets the job done. I became adept at writing regular expressions; I even generated them dynamically based on the output, so they would be extremely easy to update in the future and could even cope with a good deal of change without being updated. I wrote SevOne wrappers to generate (or update) hundreds of threshold alarms at a time that would otherwise take a LOT of man hours to input manually. I wrote scripts to poll SevOne for information so we could figure out how to most effectively utilize our servers, or just to look at devices for certain criteria: things that we wouldn’t be able to do in an effective manner otherwise.

That’s a lot of background, but the end result: I created a great SevOne Deferred Data wrapper that caches a lot of information to cut back on CPU hits to the SevOne servers. It connects to peers dynamically as required to cut back on CPU. It takes a lot of work away from SevOne that might otherwise be (naively) processed on SevOne’s side. I know because I did it, and we pegged CPU cores on our SevOne boxes. We ran into so many issues. Without some of these (admittedly somewhat minor) innovations we were unable to do what we needed to do. For Cablevision it’s amazing. For me, it allows a great amount of work in small amounts of time. All of the SevOne nitty-gritty is abstracted away and done automagically. Things are created on the fly, and SOAP calls that generate an exception (and this happens often) are automatically retried depending on the error. There’s little need for error checking in a script that screen scrapes to collect data. In my mind that’s an example of a great piece of code: it doesn’t make you think about what you’re doing with something. It does what you expect of it (for the most part). When upper management asks for something I look good because I can get it to them extremely quickly. When SevOne says it will take months for their Professional Services division to create something or cost some obscene amount of money, I’m there offering a quick solution which is cheap for them. Maybe that’s not great, but I like doing my job well.
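The two ideas in that wrapper, caching lookups and retrying calls depending on the error, can be sketched in a few lines. This is a generic Python illustration, not the actual PHP wrapper and not SevOne’s API: the exception classes, `call_with_retry` and `CachedLookup` are all hypothetical names, and the fetch function stands in for whatever SOAP call would really be made.

```python
import time


class TransientError(Exception):
    """An error worth retrying (e.g. the server was busy)."""


class PermanentError(Exception):
    """An error that retrying will not fix (e.g. bad credentials)."""


def call_with_retry(func, *args, attempts=3, delay=1.0):
    """Call func, retrying only on TransientError; other errors propagate at once."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except TransientError:
            if attempt == attempts:
                raise
            time.sleep(delay)


class CachedLookup:
    """Remember results per key so the remote side is asked only once."""

    def __init__(self, fetch, delay=1.0):
        self._fetch = fetch
        self._delay = delay
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = call_with_retry(self._fetch, key, delay=self._delay)
        return self._cache[key]
```

A calling script then just asks `lookup.get("some-device")` and never sees the retries or the cache, which is the "doesn’t make you think" property described above.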

I’m not the best programmer ever. I’m not a scripting pro. I don’t know that many scripting languages, but I can pick them up fairly quickly. I’ve never touched Python. I can whip things up in Ruby (ON RAILS EVEN!) but I’m not a Ruby guru. I can write shell scripts but I’m not extremely proficient in BASH, never mind TCSH, KSH or any of the others. I’m not really a network engineer. But combining my general-purpose background has allowed me to do things that might not otherwise be possible. Certainly your regular CCNP- or CCIE-carrying network guy is not going to have quite the scripting background to get the required data into an NMS, and your regular programmer is not going to know enough about network devices or the network in general to understand what might need to be collected into the NMS.

At home I want to monitor some things on my VM server that aren’t available through SNMP. I could write scripts and custom OIDs to access them through SNMP, but that requires configuration on the SevOne side. It’s much faster for me to use my wrapper to insert the data I need (a good example: CPU wattage).

I realize this post is more of a rant than usual, but feel free to check out the wrappers at SevOne PHP Wrappers.

Temporary VM Host update

The AMD E350 (2×1.6GHz Bobcat cores) was a little light on CPU power as a VM host, so I changed it out for a uATX board with an AMD FX-8320 (8×3.5GHz). The PSU in the uATX case is a little light, so Turbo Core was disabled to try to keep it alive. Two flexible PCI-E risers (1x, using a USB 3.0 cable for data transfer) were added to the single 5.25″ bay, one on each side of the bay. In those slots are Intel 82571EB-based dual GigE NICs facing each other, tied together with a couple of #6-32 standoffs where a PCI bracket would normally go. Unfortunately the motherboard is only 760G based, so there is no IOMMU support for passing through GigE ports (I’d need a 970 or 990FX motherboard for that). Being on x16->x1 powered risers (commonly used in coin mining setups) leaves just enough bandwidth for 2 GigE ports, ignoring buffer bursts.
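The "just enough bandwidth" claim can be checked with some quick arithmetic. The Python sketch below assumes the 82571EB links at first-generation PCIe speed (2.5 GT/s per lane with 8b/10b encoding), which is an assumption about this card rather than something measured here, and it ignores PCIe packet overhead.

```python
def pcie_lane_gbit(gt_per_s=2.5, encoding=8 / 10):
    """Usable bits per second on one lane, per direction, before packet overhead."""
    return gt_per_s * encoding


def headroom_gbit(lanes, ports, port_gbit=1.0):
    """Lane capacity minus NIC line rate, per direction. Zero means no slack."""
    return lanes * pcie_lane_gbit() - ports * port_gbit


if __name__ == "__main__":
    print("x1 lane :", pcie_lane_gbit(), "Gbit/s per direction")
    print("headroom:", headroom_gbit(lanes=1, ports=2), "Gbit/s with 2 GigE ports")
```

Under those assumptions a single lane carries 2 Gbit/s each way, exactly the line rate of two GigE ports, so there is no slack left once real protocol overhead is counted.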

[Figure: Tiny VM Server]

I had an nVidia GT430 (49W TDP, GF108) lying around, so that’s more than adequate for video output for the server without wasting any of the precious 8GB of RAM on onboard video. A GeForce 210/205 would be better as far as power usage, but unfortunately the only 210 I had bit the dust. A GTX 750 (GM107) would also be great, but not worth the cost, and unfortunately not available in half-height.

The same 8GB of RAM and 500GB SATA HDD remain. If the GT430 stays, that leaves no room for more NICs, as the board only has x1/x16/x1 slots. If I run into a 4-port GigE card, it may be worth giving up the x16 slot (and losing a little system RAM to onboard video) in order to pick up 4 more ports.

To keep TDP down even further I may lock down the power states via TurionPowerControl, as there are no adjustable TDP settings like there are on my 2P C32 server.

VMs, Linux Software Bridges and 802.1q — What I Learned This Time

When initially setting up the box, I had the idea in my head that I might create several bridges, one for each VLAN. That’s probably one of the best ways to tackle the issue unless you really want the trunk to exist on the VM, which is also fine and valid. But by default a trunk gives every device the option of accessing any VLAN in the trunk, which in a lab environment is not particularly an issue.

But I like to work reality into lab mockups as much as possible. I have plenty of NIC ports, so even creating a lot of trunks is not an issue. But our VMs will accept a large number of virtual NICs, so this option seemed semi-elegant.
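For reference, the one-bridge-per-VLAN layout can be sketched roughly like this with iproute2. The interface and bridge names (p32p1, br10, br1, tap0, tap1) are the ones from this box; the `run` and `vlan_bridge` helpers are made up for the example, and it only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are only printed. Set RUN=1 (as root) to apply.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# vlan_bridge PARENT VLAN_ID TAP   (hypothetical helper)
# Creates the 802.1q sub-interface PARENT.VLAN_ID, a bridge brVLAN_ID and a
# tap device, then attaches both the sub-interface and the tap to the bridge.
vlan_bridge() {
    parent=$1 vid=$2 tap=$3
    run ip link add link "$parent" name "$parent.$vid" type vlan id "$vid"
    run ip link add name "br$vid" type bridge
    run ip tuntap add dev "$tap" mode tap
    run ip link set "$parent.$vid" master "br$vid"
    run ip link set "$tap" master "br$vid"
    for dev in "$parent.$vid" "$tap" "br$vid"; do
        run ip link set "$dev" up
    done
}

vlan_bridge p32p1 10 tap0
vlan_bridge p32p1 1 tap1
```

The sub-interface strips the tag on the way in and adds it on the way out, so anything plugged into br10 only ever sees untagged VLAN 10 frames.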

The first issue I ran into was crosstalk between the VLANs. I had created a bunch of 802.1q sub-interfaces (which strip/tag incoming and outgoing frames) via ‘vconfig’ or ‘ip link’. I attached p32p1.10 to br10 and p32p1.1 to br1, then tap0 to br10 and tap1 to br1. Everything appeared to be working on the very initial configuration until I saw the output of ‘sh cdp nei’ on the physical Cisco 3550. It saw itself, which meant its own frames were being bounced back to it. So I loaded up tcpdump, watched bridge traffic and examined the MACs in the Linux software bridge. There was definitely crosstalk, and after a hunch and a little investigation it turned out that QEMU doesn’t do much to separate NIC traffic when the NICs are defined with the ‘old’ syntax, as mine were. After updating my QEMU launch options the problem disappeared and I was happy… until…
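For anyone hitting the same thing, here is roughly what the difference looks like, assuming the ‘old’ syntax means the legacy -net options. The e1000 NIC model, the netN ids and the `qemu_net_args` helper are placeholders chosen for illustration, not the exact launch line used here.

```shell
#!/bin/sh
# Legacy style (assumed to be the 'old' syntax): every "-net" option without an
# explicit vlan= joins the same emulated hub, so all NICs and taps share frames:
#   qemu-kvm ... -net nic -net tap,ifname=tap0 -net nic -net tap,ifname=tap1
#
# Newer style: each guest NIC (-device) is tied to exactly one backend (-netdev).

# qemu_net_args TAP...   (hypothetical helper)
# Prints one -netdev/-device pair per tap name given.
qemu_net_args() {
    i=0
    for tap in "$@"; do
        printf '%s ' "-netdev tap,id=net$i,ifname=$tap,script=no,downscript=no" \
                     "-device e1000,netdev=net$i"
        i=$((i + 1))
    done
    echo
}

qemu_net_args tap0 tap1
```

The printed arguments get appended to the rest of the qemu-kvm command line; script=no and downscript=no just tell QEMU to leave the already-bridged taps alone.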

Neither OSPF nor EIGRP was forming neighborships. Load up tcpdump again, examine traffic. I saw the packets hitting br1 and br10 from XRv, and from both the 3550 and the 2821 that I’m currently configuring. That looked good.. but ‘debug ospf packet’ on XRv was not giving me anything aside from what it was sending out. So I aimed tcpdump at the tap interfaces instead, and I saw that the tap interface was not receiving the HELLO packets on either VLAN. (Hint: here is where I went wrong in my diagnostic chase. I had filtered tcpdump down to only EIGRP/OSPF; had I not, the problem would’ve been almost immediately evident.)
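To make that hint concrete, here is a sketch of the kind of filter involved. The exact filter used isn't recorded, so treat these expressions as assumptions; `capture_cmd` is a made-up helper that only prints the tcpdump command line rather than running it.

```shell
#!/bin/sh
# IP protocol numbers: EIGRP = 88, OSPF = 89.
narrow_filter='ip proto 88 or ip proto 89'

# Adding ICMP keeps error replies to the routing packets in view as well.
wider_filter="$narrow_filter or icmp"

# capture_cmd IFACE FILTER   (hypothetical helper)
# Prints the tcpdump command line; run the result as root.
capture_cmd() {
    echo "tcpdump -n -e -i $1 '$2'"
}

capture_cmd tap0 "$narrow_filter"
capture_cmd tap0 "$wider_filter"
```

The narrow filter shows only the routing protocol packets themselves, so anything another box sends back in response to them never appears in the capture.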

Thinking, for some reason with no basis in fact, that it might be a multicast issue with Linux software bridges, I decided to configure neighbors manually. That also resulted in no neighborship forming between XRv and any other box. Other traffic (ICMP/TCP/UDP) was unaffected, so I thought that was interesting. I started watching the interface again, this time with no filter, and I saw the VM host replying to XRv with an ICMP Unreachable. Pretty clearly a firewall rule problem. Ebtables (iptables for layer 2 stuff if you haven’t seen it) was clear, and I didn’t see anything immediately in iptables.. but it’s always faster to test fixes than to examine things (and in a lab, perfectly acceptable), so I ran a simple iptables -I FORWARD -j ACCEPT and removed the manual neighborships so EIGRP and OSPF would both go back to using multicast. That resulted in everything working. Great! Classic implicit deny caught me.
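The blanket ACCEPT is fine for a lab. Outside of one, two narrower options come to mind; neither was tried on this box, so consider them untested suggestions. The `run` helper and the function names are made up for the example, and nothing is applied unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are only printed. Set RUN=1 (as root) to apply.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# The fix from the text: accept everything in the FORWARD chain.
blanket_accept() {
    run iptables -I FORWARD -j ACCEPT
}

# Untested alternative: accept only frames that are being bridged,
# leaving routed traffic subject to the existing rules.
bridged_accept() {
    run iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT
}

# Untested alternative: stop handing bridged frames to iptables at all.
bridge_bypass() {
    run sysctl -w net.bridge.bridge-nf-call-iptables=0
}

blanket_accept
bridged_accept
bridge_bypass
```

Any one of the three should be enough on its own; the example calls all of them only to show what each would run.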

This is where I usually get annoyed by pre-configured rules. Usually I load up Slackware, but I’ve been using Fedora lately for ease of getting some things up and running with real dependency management. Had I been using Slackware, with its default of no rules, everything would’ve been hunky-dory, and I would’ve configured some myself when I felt the time was right.

To continue: I tried to make a new bridge, trunk0, with p32p2 in it. I loaded up tcpdump and noticed that there was no traffic aside from STP on some VLANs that aren’t in active use. Apparently configuring those sub-interfaces whisks the tagged frames away from the main interface, so I just deleted all of them off of p32p1, configured another trunk port, added that to trunk0 on the Linux box and voila! Tagged packets from everywhere! I have yet to try a tap interface into trunk0 and a VM, but I have a feeling everything should be all right. Then again, every time I have that feeling is usually when things are about to go terribly wrong.
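A sketch of that trunk bridge, under the assumption that the sub-interfaces being removed are the VLAN 1 and VLAN 10 ones on p32p1 and that p32p2 is the port going into trunk0. `remove_subifs` and `trunk_bridge` are made-up helpers, and as before nothing is applied unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: commands are only printed. Set RUN=1 (as root) to apply.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# remove_subifs PORT VLAN_ID...   (hypothetical helper)
# Deletes the 802.1q sub-interfaces PORT.VLAN_ID.
remove_subifs() {
    port=$1
    shift
    for vid in "$@"; do
        run ip link delete "$port.$vid"
    done
}

# trunk_bridge BRIDGE PORT   (hypothetical helper)
# Puts the physical port itself into the bridge, so frames keep their tags.
trunk_bridge() {
    run ip link add name "$1" type bridge
    run ip link set "$2" master "$1"
    run ip link set "$2" up
    run ip link set "$1" up
}

remove_subifs p32p1 1 10
trunk_bridge trunk0 p32p2

# To confirm tagged frames are arriving (as root):
#   tcpdump -n -e -i trunk0 vlan
```

With no sub-interface on the port, the bridge sees the frames exactly as they arrive on the wire, 802.1q tags included.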

VM Host, IOS XRv, CSR1000V

I’m trying to get some IOS-XR and IOS-XE VMs up, mostly to play with some of the IOS-XR configurations. After playing around with some Linux networking stuff that I haven’t done in a while I was finally able to get the 802.1q trunks through both the Linux bridge and individual VLANs elsewhere. The preconfigured ebtables and iptables rules in Fedora 20 are really annoying; I’ve always preferred to start with Slackware and a blank slate.

So far I have one IOS-XR instance up and running successfully, and traffic is now normal after I had some weird inter-bridge traffic caused by qemu-kvm.

Ah, the other thing: instead of everyone’s standard VMware setups, I am of course sticking to my familiar virtualization technologies and running qemu-kvm with all my standard Linux tools. Unfortunately, with just a low-power dual-core AMD E350 and 8GB of RAM at the moment, I won’t be running too many instances as XR/XE are really RAM heavy.

Thankfully I’ve kept a separate VLAN and VRF setup on every device for management only so I can (usually) get back into boxes if I break their config without rummaging around for a console cable and USB to RS232 adapter. I really need a 16 port serial card.

So I’m looking at probably 8 XRv instances and 4 XE instances to play with. Unfortunately they have to suffer through a 2 Mbit/s rate limit so I can’t really use them in my network.. and they are sadly ridiculously resource heavy. But they’ll be fine to learn some of the IOS-XR stuff on, I suppose.

© 2017 Musings
