Archive for the 'IT' Category

OpenVPN, Slow Download, VMware ESX

Tuesday, March 4th, 2008

I just finally solved a horrible OpenVPN problem that I’ve been having for a while. The OpenVPN server is a VM running Fedora 8 under ESX Server 3.5.

When I would go to download a file over the VPN, it was very, very slow - about 128kbit/s instead of the 16mbit/s it should be. Interestingly enough, if I downloaded a file directly from the OpenVPN server, and not from a server on the LAN that OpenVPN serves, it would run at full speed, suggesting that the problem was somewhere in the kernel’s NAT system. Further weirding things out, uploads worked at full speed.

I tried every OpenVPN config option I could think of. I tried changing MTUs on every interface. I looked at the iptables logs, tcpdump, even strace. Despite all this, iperf showed perfect numbers, suggesting an IP fragmentation problem. Finally, I tried one more thing, which should not have made a difference, but it did.
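For anyone chasing a similar symptom, one quick check for an MTU or fragmentation problem is to ping across the tunnel with the don't-fragment bit set. This is a minimal sketch, not what I ran at the time: it assumes Linux iputils ping (whose `-M do` option forbids fragmentation) and uses a made-up address, 10.8.0.1, standing in for a host behind the VPN. It only prints the probe commands.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# the MTU minus 20 bytes of IP header and 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Hypothetical host on the far side of the VPN; replace with a real one.
TARGET="${TARGET:-10.8.0.1}"

# Print (not run) one probe per candidate MTU, largest first.
for mtu in 1500 1400 1300; do
    echo "ping -c 3 -M do -s $(icmp_payload_for_mtu "$mtu") $TARGET"
done
```

If the large probes fail while the smaller ones get through, something on the path has a smaller MTU than the interface claims.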

I changed the virtual hardware, for the ethernet adapters in the VM, from “enhanced vmxnet” to “flexible vmxnet”.

This should not have made a difference for a variety of reasons, but the most notable reason is the fact that the VMware tools were running on the VM. If you’ve got those tools running, it uses the virtual ethernet adapter in ‘enhanced vmxnet’ mode automatically. If you’re not running the tools, then the virtual ethernet adapter appears as some regular card that every OS has drivers for, in the event that you can’t run the tools. So in effect, the virtual hardware is exactly the same. The original option forced the card to run in ‘enhanced vmxnet’ mode only; this new option gives it a choice, but the tools pick ‘enhanced vmxnet’ anyway.

VMware’s networking support can be very strange at times. There is a huge bug in ESX 3.5 right now with regard to a certain ethernet chipset (“Intel Gigabit VT Quad Port Server Adapter”). You can’t run VLANs over an 802.3ad aggregated link (that’s when you use more than one ethernet port out of your switch in a sort of tunnel, adding redundancy and speed). If you do, traffic is randomly dropped. I get about 85% packet loss when I do that. So, my install currently does not use 802.3ad. VMware told me that it’s a known bug in their driver for this card, and that a fix will be included in the next patch. I reported this bug back in December and they still have not released a patch. And just a few weeks ago, they posted the bug to their knowledge base.

It seems like VMware’s support, as far as patches and updates go, has gradually become slower and slower since EMC bought them. If you want to make your fast, resilient company become super slow, sell to a gigantic faceless corporation. It’ll work every time.

Supermicro, you suck

Tuesday, November 6th, 2007

We’ve got about 80 Supermicro X7DBR motherboards in use at my company. They all have slots for this nifty IPMI card. Half of them don’t work, because the BIOS is too old on the motherboard. Supermicro provides a BIOS upgrade. You must either be running Windows, or have the ability to boot from a floppy disk. SUPERMICRO DOES NOT SELL SERVERS WITH FLOPPY DRIVES!

So I have to go through hell to make a DOS bootable CDROM that also has the bios upgrade files on it. This is a huge pain in the ass. SUPERMICRO, PUT BOOTABLE CD IMAGES ON YOUR SITE!

Also, while I’m at it, Supermicro’s IPMI implementation blows, too. The remote video console thing has a HUGE keyboard button repeat problem. I’ve worked with their support department a lot, but after weeks of trying the problem never got fixed. It has taken me 30 to 40 minutes to type “linux ks=blah blah blah” because I have to keep erasing all the repeated buttons, and then a lot of the time it repeats me pressing backspace, erasing the entire line.

Leopard & ATI Radeon 9600

Wednesday, October 31st, 2007

In the PowerMac G5 I have at work I have an ATI Radeon 9600 Mac / PC Edition video card. The reason I have this particular card is that it’s the cheapest AGP G5 compatible video card that can drive a 30″ monitor.

However, after upgrading to Leopard, I started seeing lots of graphical artifacts. I’d click on something, and sometimes whatever was behind the window would bleed through. Sometimes parts of the screen would not get redrawn. I was okay with this, but then suddenly it decided it couldn’t drive my 30″ monitor anymore. I almost gave up on it and bought a newer but waaaaaaaaaay more expensive card, when I finally fixed it.

The fix is to install these drivers from ATI’s site. This is a really funny fix, because the drivers were written in 2005 and are only 10.4 compatible. Obviously this is a really stupid and dangerous thing to do - installing ancient drivers on Leopard - but I was so desperate that I gave it a try.

It installed and tried to load the drivers and the kernel simply would not do it, which is good. So I’m pretty sure my computer isn’t being affected negatively by the ancient drivers. However, some of the other stuff it installed completely fixed all my problems. No graphical artifacts (which I even had a little bit of in Tiger) and the 30″ mode worked perfectly. I don’t know which of the installed files fixed it, but I think it might have been something that tells the OS what this video card is capable of. You may have to run the “control panel” app it puts in System Preferences to get the full effect. I think the problem was that Leopard decided the card had lesser capabilities, and these archaic drivers informed it otherwise.

VersionTracker is dead - long live iUseThis

Friday, October 26th, 2007

I’ve complained before about VersionTracker and how freaking slow they are. (For those who don’t know, VersionTracker is a software catalog site.) Other deficiencies include:

  • Virtually no screenshots
  • Site layout is too busy
  • Linkjacking
  • Despite gigantic corporate backing from CNET, it is still slower than my 14.4kbps dialup web server from 1998
  • Pay-for desktop companion software

Why did I ever use them? They were the best at what they did. Not anymore. iUseThis has arrived. Thank the gods. What makes iUseThis better?

  • Social: tagging, “who uses this program”, comments, digg-like rating system, done in a very friendly, “web 2.0″(cringe)-ish way.
  • Just about every app has a screenshot. Welcome to 1987. And on this side of the bus you’ll see ‘Ol Man Versiontracker, shaking his cane at all these new hip tagging kids, get off my interlawn!
  • Free desktop companion software: AppFresh. It does two things: (a) searches your computer for outdated programs and will automatically install new versions if you ask it to; (b) (optionally) uploads your list of software to the site, integrating it with all the social aspects.
  • The site is very fast. Not slower than my 14.4kbps dialup web server from 1998, even though they seem to have started in 2006, unlike VersionTracker, which has been operational for 11 years. Normally this is nothing to write home about: any properly designed site can be fast within a short amount of time. The real shocker here is that after 11 years VersionTracker is still so slow.

Just to really drive the point home, I’m going to list a few things that humanity has accomplished within an 11 year timespan, in no particular order:

That is all.

Terminal automatic string escaping & KeyCue

Thursday, October 18th, 2007

OK two things.

  1. KeyCue is an amazing utility. You hold down CMD for a second or so, and it pops up a screen containing all the keyboard shortcuts in the currently running app. Screenshot below. [Screenshot of KeyCue]
  2. Thanks to KeyCue, I discovered an unbelievably handy keyboard shortcut in Terminal: CTRL+CMD+V. It automatically escapes a string. This is super handy for programming. I had this string in my clipboard: & ‘”\& &2> `$%^ (%s). When I pasted it using this shortcut, it turned into this: \&\ \’\”\\\&\ \&2\>\ \`\$\%^\ \(\%s\).

Awesome.
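To get a similar effect in a script, here is a rough sketch of a backslash-escaper. I don't know what rule Terminal actually uses; the safe set below (letters, digits, and `_ . / ^ -`) is my own assumption, picked partly because `^` was left alone in the example above. The function name `escape_for_shell` is made up for illustration.

```shell
#!/bin/sh
# Put a backslash in front of every character outside a small safe set.
# Input is handled line by line by sed, so embedded newlines are left as-is.
escape_for_shell() {
    printf '%s' "$1" | sed 's|[^A-Za-z0-9_./^-]|\\&|g'
}

# Example: escape a string containing spaces, an ampersand and parentheses.
escape_for_shell 'hello world & (%s)'
echo
```

In bash specifically, `printf '%q'` does a comparable job without sed, though its exact output differs from Terminal's.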

horde + imp sucks

Tuesday, August 14th, 2007

THIS JUST IN: horde sucks. imp sucks. I hate them both. If I was a reddiot (reddit idiot, I just coined this term), I would have written BREAKING NEWS instead of THIS JUST IN.

Back to Horde. The UI is wonderful but the setup is horrible. horde defaults to preventing you from logging in, except if you’re connecting from 127.0.0.1. Yeah, I run a web browser on my server all the time. Once you remove that restriction, you find that it automatically logs you in as Administrator, without even prompting for a password. The setting to change this is buried very deep in the configuration page, which has 22 tabs (not exaggerating).

And that’s just horde. imp is a bitch too. Configuration of horde is done all through its web interface, which leads you to believe that all the imp settings are configurable through imp’s config page, but no, you’d be wrong. This is not obvious.

And wtf: why does imp have to be a module of a larger web application? Why can’t they make a standalone version, for people that want only a decent webmail program?

And finally, I want to say that I really don’t like bashing open projects but sometimes I need to say things like this. Despite all this, I do really appreciate the efforts of the Horde and imp folk, and hope that they won’t take the above too badly.

Is there anything needlessly slower than versiontracker.com?

Saturday, August 4th, 2007

No. Well, yes: America’s response to the New Orleans disaster.

Seriously, versiontracker.com, get with 2007. There are all kinds of ways to optimize your site. Every time I go there it takes at least 3 to 5 seconds before anything useful appears on the screen. Whereas youtube.com is practically instantaneous, and I’m guessing they get a lot more traffic than you do. And god forbid I want to search for anything, another 5 second wait. Don’t you employ any kind of load balancing, or database caching (your hits are 99% read-only!)? Or, come on, even something as simple as squid! From memory, I’d say that, roughly, your site has been this slow for at least 3 years. There was a time when it was fast. Please come back to that.

Let me add that I’m on a 25mbit connection. My end is not part of this problem. I get a 25ms ping from www.versiontracker.com. By the way, wtf? You guys let external pings to your web servers?

Go to websiteoptimzation.com and see what it thinks about your site. 2 congratulations, 9 WARNINGS. Did you know your site takes almost a full minute for a 56k modem user to load? For what purpose? There’s no video on your site or any other reason to justify cutting out the 56k community.

Surely you have a techops department. Surely you have talent to fix things. Add a load balancer, a database cluster, whatever you need to. Analyze the situation, find bottlenecks, fix fix fix. My employer managed to improve page load time by an order of magnitude with about 30 minutes of work, by adding some caching software and applying a few tweaks to Apache. Our staff and budget are that of a startup. You guys are much bigger. You have no excuse but incompetency or simply not caring how slow your site is, by which you tell your visitors that you don’t care about their experience, so long as the ads load. And they do. First. Before the rest of your slow-ass site does. Hmmmmmm.

Avoid The Planet, theplanet.com, at all costs

Wednesday, May 2nd, 2007

It’s a rare thing indeed to be completely pleased with a hosting company for the duration of your stay. softlayer.com deserves the distinction of best customer support ever (they answer the phone on the first ring, among other things). But no, this post isn’t about softlayer.com.

This post is about The Planet and EV1 Servers and Server Matrix. They’re all the same company.

I’ve had an unusually large number of problems with them, too numerous to describe here, but I’ll describe two of them:

(1) Emergency response time is abysmal. My server traffic had suddenly spiked and I needed more RAM ASAP. I placed an emergency order and told them to shut down my server whenever they were ready to install it. I heard nothing for an hour, and then I called in. They said to just keep waiting, so I did. The RAM never came. This is when I went to softlayer.com and ordered a server from them. The entire server was provisioned and running within 20 minutes. I moved all my sites there.

33 days later, no kidding, no exaggerating - they responded to my ticket. “Sir, are you still interested in the RAM?” My response: “No, I want to cancel my service with you.” This leads me to the next problem.

(2) I’ve been trying to cancel my service with theplanet for 3 months. I followed all of their instructions. I gave 30 days notice, I opened a cancellation ticket. I called in to confirm the cancellation. Then, a few days before the cancellation was to be effective, I called again and made sure everything was set. I was told twice by two different support reps that everything was canceled and I would no longer be billed. But then I was billed again. And again.

I called my credit card company to refuse the charges. No problem, they say. They refund my money. But there’s a catch. Even if I change my credit card number, theplanet.com will still be able to bill me. My credit card company said that monthly service billing works that way. After the first charge to a credit card number, the system uses a different code that always refers to your account, regardless of the credit card number. It sounded like BS but after some googling, it seems to be true.

I asked my credit card company how to deal with this. They said that I should call in every time I’m charged again. That means, conceivably, a monthly phone call to my credit card company to refuse theplanet.com’s fraudulent charges. Who has time for this? It takes 45 minutes per phone call!

I had one option. I could completely close my account. Fortunately the account had no balance so that was an option. I did this.

The Planet forced me to completely cancel my credit card because they kept stealing money from me.

Googlebot is flooding librivox.org

Tuesday, April 24th, 2007

Googlebot makes, on average, 270,524 hits to librivox.org every day. I’ve been working with google for 2 weeks now to fix this and so far it hasn’t helped. They said that they fixed it, by causing googlebot to index librivox.org less, but it hasn’t actually taken effect.

I keep telling them that googlebot must be broken in some way; librivox.org is just a wordpress / phpBB install, there’s nothing there that should make googlebot consume 41 GB of traffic per month. It’s actually driving the load average of my server high. It even caused a 31mbit/s traffic spike at 6AM this morning.
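A figure like the daily hit count can be pulled straight out of the web server's access log. This is a minimal sketch, assuming Apache's "combined" log format (timestamp in the fourth whitespace-separated field) and that Googlebot identifies itself in the User-Agent string. The function name and any log path you pass it are placeholders, not my actual setup.

```shell
#!/bin/sh
# Count requests per day whose log line mentions Googlebot.
# $1 = path to an access log in Apache "combined" format, where field 4
# looks like: [24/Apr/2007:06:00:01
googlebot_hits_per_day() {
    grep 'Googlebot' "$1" \
        | awk '{ split($4, t, ":"); day = substr(t[1], 2); hits[day]++ }
               END { for (d in hits) print d, hits[d] }' \
        | sort
}

# Usage (the path is hypothetical):
#   googlebot_hits_per_day /var/log/httpd/access_log
```

The output is one "date count" pair per line. Note that sort orders the dates as plain text, not chronologically.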

Here’s hoping Google continues doing no evil.

Silencing “Treason uncloaked!”

Saturday, April 7th, 2007

I am really, really sick of looking at dmesg and seeing messages like this:


TCP: Treason uncloaked! Peer 62.68.89.50:64651/80 shrinks window 2345094196:2345094197. Repaired.

I googled, and I googled, and I could not find a way to disable the messages. I found plenty of posts saying that the messages are harmless. There is no file in /proc that you can use to turn off the messages. However, I want to be able to look at dmesg without important messages being flooded away.

So, I commented out the printk() in the kernel source and recompiled it. The kernel I modified/recompiled is from Fedora Core 6, specifically kernel-2.6.20-1.2933.fc6. Below are links to the RPM for i686 and also the SRPM.

Beware, this is my first time recompiling a kernel the Fedora way.

Kernel for i686: kernel-2.6.20prep-1.i386.rpm

Source RPM: kernel-2.6.20prep-1.src.rpm
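If recompiling a kernel is more than you want to take on, a much less invasive workaround is to filter the messages out when reading the log. This sketch only hides them from view; it does not stop them from filling the kernel's ring buffer, so it is not equivalent to the patched kernel above. The helper name `without_treason` is made up.

```shell
#!/bin/sh
# Drop the "Treason uncloaked!" lines from whatever is piped in.
# grep -v prints only the lines that do NOT match the pattern.
without_treason() {
    grep -v 'Treason uncloaked!'
}

# Typical use (needs permission to read the kernel log):
#   dmesg | without_treason
```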

Polycom SoundStation IP-4000 & Asterisk retardation

Thursday, August 10th, 2006

Yesterday I had the joy of configuring two Polycom SoundStation IP-4000 speakerphones. We use an Asterisk PBX. It took 4 hours. Why? Because Polycom expects you to use Google to find out that, when they use the word “Address” in the Line configuration screen, they actually mean USERNAME.

That’s right. If you put in the IP address of your SIP server where it asks for “address” under the “line” section, it will not authenticate. Asterisk tells you that it’s using the wrong username and password. Instead you have to put your SIP username in the “address” field. This is made more complicated by the fact that there are 3 other locations where you have to type in the username.

Another spectacular feature of this product is that just 1 page of its web-based configuration works in Firefox. All the other pages fail when you go to submit new values in their forms.

Aside from this idiocy, the sound quality is what I expected from Polycom: awesome. But for $800 I want “Address” to be renamed to “Username”.

Rant: Companies who claim 24/7 support but really just have an online FAQ

Thursday, August 10th, 2006

I’m currently having a problem with my Line6 PodXT Live. This company advertises that all of their products have 24/7 technical support. Yet if you call their tech support phone line, you learn that they are currently closed. What they really mean is that their web site is operational 24/7. Holy Moses! A company with a 24/7 accessible web site! The pains Line6 will go to for their loyal customers!

Have you ever had a problem that was actually solved by viewing a company’s online help site? I can count how many times that’s happened to me on 2 fingers. This problem isn’t one of them.

MacBook Pro & 30″ Cinema Display problems

Wednesday, March 15th, 2006

So as I posted yesterday, I’m having some problems with my 30″ Apple Cinema Display and my MacBook Pro. Random flashing green speckles, red lines, white dots, etc. They’re very noticeable on a black background. Anyway, I did a little troubleshooting today and the problem seems very much to be a design defect in the MacBook Pro itself.

  • Connecting to another MacBook Pro (a 1.83GHz model; mine is 2GHz) produces the exact same problem
  • Connecting to a PowerMac G4 does not produce the problem
  • Connecting to a PowerMac G5 does not produce the problem

So, if I connect to any Macbook Pro, I get the problem. If I connect to any other computer, I don’t.

Other people having the problem are here and here.

My gut tells me that this is something they can fix in software. I hope they get a fix out soon.

Spotlight support for Entourage!

Tuesday, March 14th, 2006

Today Microsoft released an update for Office 2004 for MacOS that includes Spotlight support for Entourage. This is wonderful because there isn’t another full Exchange client for MacOS, so I’m forced to use Entourage, and until now that meant giving up Spotlight. I believe Microsoft originally stated that Spotlight support wouldn’t arrive for a long while; I guess they got it out early (!! this is Microsoft we’re talking about)

In other news, I received a 2GHz MacBook Pro with 2GB RAM. It’s a very nice machine. It is faster than my Dual 2GHz G5 in almost every way - yet it’s a laptop. So far the only real problem I’ve had with it has been using it with my Apple 30″ Cinema Display. I’m getting a bunch of green and blue speckles all over the screen that sort of come and go. They’re very noticeable on a black background (like Terminal). The interesting thing here is that I don’t have the same problem when I connect it to my PowerMac. Once I tried to use a dual-link DVI extension cable with this monitor and it produced the same digital artifacts. This time I’m plugging directly into the MacBook. It’s a bit annoying. There are a few reports in the Apple discussion forums of other people having this - it seems like it may possibly be a software problem.

Blackberry Enterprise Server of DOOM

Monday, February 6th, 2006

I had a tough time deciding if this should be in the “IT Rants” or “IT Joys” category. I really love my Blackberry and the functionality the Blackberry Enterprise Server provides is awesome, however this is the second time it’s blown up. And anything that ever blows up should be classified under Rants.

So.

I reboot the server BES is installed on. A normal reboot, as in, start, shutdown, restart. I did this because I needed to move where the server was in my “virtual infrastructure.”

Server boots back up. BES is dead.

The long story:

All the work I had scheduled for last Saturday night, starting at 8PM, was done by 11PM. Only 3 hours seemed pretty good. But then I went to test BES and saw that it was dead. I’ve been through this before, so I tried everything I knew until I absolutely couldn’t think of anything else, after all the googling and browsing of RIM’s support site. By the time I was done with all of that, it was 5:30AM. Yup, I’d been at it for 6.5 hours. Then I say to myself, “okay.. let’s call RIM for support. I know our support contract is only for 8am to 5pm, M-F, and this is early Saturday morning, but they probably have some option where I can pay them $245 (like Microsoft) and get support right then and there, outside of my contract.” Well it turns out they don’t. Unless you have a 24×7 support contract (“T2 or better” in BES-speak) you ain’t gettin’ no help no matter what. No matter how much money you’re willing to give them. It ain’t happenin’.

One of the (very few) things I like about Microsoft is that no matter what time of day, and no matter what Microsoft product you’re using, and no matter which unsupported way you’re using it, there is ALWAYS a skilled living, breathing human being available to help you. The person may be on the other side of the world but they’re there to help you. I’ve used this service about 4 times now. I only use it when I’m absolutely out of options, and by then the $245 they charge is pretty acceptable. One time I was on the phone with the same rep for 9 hours (not exaggerating) fixing an Active Directory problem. 9 hours of Microsoft-inside tech support for $245? Yes, that’s good. Well, the cost was worth it. Having to work on a Microsoft problem for 9 hours is definitely not worth anything. It’s a pity my servers are all virtual now because there’s nothing physical I can punch, out of frustration. (I could kick the ESX server they’re running on, but that wouldn’t be fair!)

So then I went on RIM’s support site and saw that you can buy 24×7 support right now, online, with your web browser. I saw this and felt very happy - I could just upgrade my support account right now and be able to call them. I go through all the checkout steps. At the end, I’m told that they have received my order, and it will be processed in 3 to 5 business days! Nowhere else in this process did they say I wouldn’t have instant access.

At that point, it was 6AM, I’d been at it since 8PM, I was totally wiped out and out of options, so I went to bed.

I woke up at 5PM the next day. The only thing good about an overnight IT battle is how you feel after your 11 hours of sleep.

But the feeling was short-lived; as I awoke, I remembered that BES was still broken. As is common for me, I had dreams about the problem, and I actually came up with a few more things to try.

I would build a totally new server, dedicated solely to BES. Thanks to my VMware Virtual Infrastructure, doing this, from home even, was extremely easy. I had a new Windows 2003 Server up and running at about 7:30pm, having started the work around 6:30. (I didn’t have a template. But I do now.)

Everything looked good. I copied the BES database over, attached it in MSDE, installed BES with the same SP and hotfix level I had before. Went to start up the Blackberry Manager. And I got the same error I had on the original machine:

“Unable to retrieve the server’s distinguished name or allocate a MAPI buffer for this server.”

All that work for nothing. Well, not entirely nothing. It was good I had it installed on a separate server now, because while fixing the problem with RIM support, I had to reboot it about 20 times. The machine it was originally on was a domain controller and the DHCP server, so having to reboot it that often would really stink.

I was on the phone with RIM support for 3 hours, 45 minutes. We went over every single possible thing they knew of and I knew of to try and fix it. Absolutely none of them worked. Then the RIM support guy suggested we make a new username for BES to run under. We were using BESAdmin before; this new one I named bbadmin at his suggestion. As soon as we switched everything over to that new user (which involves a lot of tweaks all over Active Directory, Exchange, and the OS), the server ran perfectly.

So, in the end, the problem was some sort of corruption in the BESAdmin account. How a reboot could cause this, I don’t know. It was a normal reboot. No errors anywhere. Scandisk found no problems. The SAN move went fine.

This has been added to my list of why Windows sucks. There are things it’s good at, yes. But when Windows is bad, it’s way, way, way badder than anything else in existence. I ask you. Has there ever been a single problem in your life that WASN’T IT related and took 16 hours or more of constant labor to fix? I’ve lost so many weekends, so much of my life, to Windows.

Acronis TrueImage

Tuesday, January 10th, 2006

Yesterday I had to image a new desktop machine and like always, I turned to Ghost. It’s what we’ve always had and I keep using it. This time, however, no matter what I tried, I couldn’t get it to read past the first 45MB of the hard drive. Support was useless. Then I remembered hearing at my VMware course that people were very happy with Acronis TrueImage. So I went to their web site and tried the demo. I was hooked.

First of all, no more fucking Ghost boot disks. I can’t tell you how sick and tired I am of hunting down NIC drivers and making boot disks. It’s horrible. Acronis lets you burn a single CD (a CD! not a floppy!) that has support for just about all modern hardware. Ghost comes with a lot of drivers, but they’re all DOS based, and hardware produced in the last few years just isn’t supported out of the box.

Acronis’ stand alone imaging CD is Linux-based, too. It lets me browse my CIFS network and easily find servers. It lets me write images to my NetApp, something Ghost was never able to do, for some reason. And it’s faster. And they let me buy it online and download it right away.

Thus ends my use of yet another Symantec product.

The Unfathomable Greed of Brocade

Tuesday, January 10th, 2006

I finally managed to get my hands on some fibre channel switches. Two Brocade Silkworm 200Es. They seem to work pretty well so far. Since trying to install one, however, I have been horrified by the greed Brocade seems to get away with.

Element of Greed #1:

It’s a $4k fibre channel switch. Pretty expensive. IT DOESN’T COME WITH ANY RACKMOUNT HARDWARE. Not even a pair of L brackets. Nothing. Companies use this equipment to construct extremely stable and reliable storage systems; this device is ALWAYS going to be mounted in a rack, yet they don’t include the rackmount hardware to do so, unlike every other vendor on the planet. To their (dis)credit, they threw in a little pack of 4 stick-on rubber feet.

After fuming for a while, I realized that it’d be easier to buy rackmount hardware for it instead of returning it. That’s when I discovered that their rackmount hardware kit costs $250 per switch. The fuming began again. The kits came in today, and they’re pretty good and solid. They’re easily strong enough to support a full 3U server. But all I need them for is to mount an 8 port 1U fibre channel switch! It’s no bigger or heavier than an entry-level Ethernet switch….

Element of Greed #2:

Above, I said that the switch I bought had 8 ports. That was a lie. It really has 16 ports, but 8 of them are disabled. I only paid for an 8 port switch, so I’m not shocked or anything. If you want to use the other 8 ports you need to buy a license for them from Brocade. I don’t get it - the ports and their associated hardware are already there, I’ve already paid for them. Why do I need to license them?

NetApp does similar things too - you buy something from them, and if you want to add more functionality (like NFS or CIFS support), you have to buy a license for it. I can understand this though - they spent money developing software to support those protocols and they need to recoup their costs. But is there any difference in the firmware necessary to support 8 fibre channel ports vs. 16?

In other words, I’ve paid Brocade for 16 ports. I’ve paid them for the software the switch comes with. There is no extra cost for them to support 16 ports over 8, if the box already has them. Why on earth are they doing this? Well, I think the title of this post explains it.

An 8 port fibre channel switch from Brocade: $4000
Lowest-end support for said switch: roughly $500
Finding out that you paid for functionality you can’t use and that you have to pay more just to mount the damn thing: priceless

Bakbone NetVault

Tuesday, December 20th, 2005

I’m using a demo copy of a backup program named “NetVault” produced by a company known as “Bakbone”. It was suggested to me by my CDW representative as an alternative to Veritas. As you probably already know, I’m not that happy with Veritas.

To my surprise, they offer a Linux version of the NetVault server software that has all the features of its Windows counterpart, including things like Exchange mail store backups, MS SQL backups, etc. That right there is enough to get me to try the product: to not rely on Windows for backups anymore.

The GUI is pretty straightforward to use. In my case, the backup source consists entirely of network shares and the destination is an Apple Xserve RAID. In other words, I’m backing up to hard drives. Setting this up in Bakbone is slightly different than other products I’ve used… you have to create what they call a “virtual tape library”. You specify how many tapes you want (each tape = one file) and how big each tape is (the size of the resulting file). The drawback here is if you, say, create enough tapes to fill up a 2.6TB partition, it will take a long time for this operation to complete. It creates each file right away, filling it with zeros. 2.6 terabytes of zeros. It took a long, long time.
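To see why that preallocation is slow, here is the same idea done by hand with dd. This is only an illustration of writing a zero-filled file of a fixed size; I have no idea how NetVault does it internally, and the function name and file name are made up.

```shell
#!/bin/sh
# Preallocate one "tape" file by physically writing zeros to it.
# $1 = file name, $2 = size in KiB. Every byte is written out, which is
# why doing this for terabytes takes so long.
make_zero_tape() {
    dd if=/dev/zero of="$1" bs=1024 count="$2" 2>/dev/null
}

# Small demo: a 64 KiB "tape" in a temporary directory, then its size in bytes.
demo_dir=$(mktemp -d)
make_zero_tape "$demo_dir/tape001" 64
wc -c < "$demo_dir/tape001"
rm -r "$demo_dir"
```

Scale the count up to a few gigabytes and time it, and you get a feel for how long 2.6TB would take on the same disks.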

However, the pro to this is that as far as the rest of NetVault is concerned, you’ve got a real tape library. The virtualness of it is hidden from the rest of the system. As a test, I’m having it back up all 2.5TB of data on my NetApp. If it goes well I’ll probably try to buy it. The Veritas desktop agent is really slow and consumes a lot of resources, and I’d like to get rid of it.

NIC teaming / 802.3ad in VMware ESX Server

Tuesday, December 20th, 2005

Note to self: When you remove a NIC in ESX from a bond, and you don’t inform your switch of this change, Strange Things Occur.

This prompted me to write a little something describing how to set up NIC teaming in ESX Server. For the uninitiated, NIC teaming, also known as link aggregation or 802.3ad, is a way of grouping network interface cards to improve reliability (redundancy) and performance (transfer speed).

To make this work well, you need to do some configuring in ESX and also in your switch. Like always, my switch example will be for a Cisco IOS-based Catalyst, though the principles involved in 802.3ad are pretty simple and standardized so you should be able to apply it to any other switch with the 802.3ad capability.

Enable 802.3ad NIC teaming in VMware ESX Server

NOTE: This example involves using the ESX MUI; as of the current version of ESX and VirtualCenter, you are forced to use the MUI for these changes.

1) Login to the MUI and click Options. Click Network Connections.

2) Assuming you already have a virtual switch, add an unused outbound adapter to your virtual switch. This is pretty much all you need to do on the ESX end of things - you now have a bond. You can bond more than 2 NICs as well.

3) In your switch, you must configure a port channel (Cisco-speak for an 802.3ad team), and then assign specific hardware ethernet ports to it. You also need to set up trunking on the port channel, if you want to use VLANs in your VMs. Like so:


ZORAC# conf t
ZORAC(config)# int Port-channel1
ZORAC(config-if)# switchport trunk encapsulation dot1q
ZORAC(config-if)# switchport trunk allowed vlan 1,2
ZORAC(config-if)# switchport mode trunk

The above creates a port channel. Now we’ll assign ports GigabitEthernet0/1 and GigabitEthernet0/2 to the channel.


ZORAC# conf t
ZORAC(config)# int GigabitEthernet0/1
ZORAC(config-if)# switchport trunk encapsulation dot1q
ZORAC(config-if)# switchport trunk allowed vlan 1,2
ZORAC(config-if)# switchport mode trunk
ZORAC(config-if)# channel-group 1 mode on
ZORAC(config-if)# exit
ZORAC(config)# int GigabitEthernet0/2
ZORAC(config-if)# switchport trunk encapsulation dot1q
ZORAC(config-if)# switchport trunk allowed vlan 1,2
ZORAC(config-if)# switchport mode trunk
ZORAC(config-if)# channel-group 1 mode on

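Now GigabitEthernet0/1 and 0/2 are in an 802.3ad team. Note that “mode on” forces the channel up without any LACP negotiation, so the switch can’t detect changes on the ESX side by itself; both ends have to be kept in sync by hand. You may also want to use the snippet below to configure how load balancing will work with your team: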


ZORAC# conf t
ZORAC(config)# port-channel load-balance dst-ip

This will balance the traffic going into the server based on its destination IP address. Load balancing settings for traffic going out of the server are decided by ESX and are also configurable. The default is “out-mac”, where ESX load-balances based on the destination MAC address. Using out-ip instead can improve network performance for VMs that produce a lot of network traffic. Traffic gets distributed more evenly across all the links in a team. However, your network switch has to support this. If you’ve got a Catalyst configured as above, then you’ve got the support.
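To see why hashing on IP addresses can spread a busy VM's traffic across the team, here is a rough sketch in shell. It is not the real hash that ESX or the Catalyst uses: the pick_link function, the XOR-of-last-octets rule and the addresses are all made up for illustration. The only point is that each source/destination pair always maps to the same link, while different pairs can land on different links.

```shell
#!/bin/bash
# Illustration only: a toy IP hash, NOT the algorithm ESX or IOS really uses.
# pick_link SRC_IP DST_IP NUM_LINKS -> prints the index of the chosen link.
pick_link() {
    local src_ip=$1 dst_ip=$2 links=$3
    # Keep only the last octet of each address.
    local src_last=${src_ip##*.}
    local dst_last=${dst_ip##*.}
    # XOR the two octets and wrap around the number of links in the team.
    echo $(( (src_last ^ dst_last) % links ))
}

# One VM (10.0.0.5) talking to three clients over a 2-NIC team.
for client in 10.0.0.20 10.0.0.21 10.0.0.22; do
    echo "10.0.0.5 -> $client uses link $(pick_link 10.0.0.5 "$client" 2)"
done
```

With these made-up addresses the three conversations come out on links 1, 0 and 1, so a single VM ends up using both NICs instead of being pinned to one.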

To change ESX’s load balancing to out-ip, do the following:

1) Determine the name of your team, or bond. The easiest way I’ve found to do this is to run this in the service console:


[root@esx root]# grep bond /etc/vmware/hwconfig

You’ll see a few lines appear, mentioning either bond0, bond1 or something similar. Remember which bond it is.

2) Add the following line to /etc/vmware/hwconfig. Check to see if you already have a similar line. I didn’t, but you might if you’ve attempted something like this before:


nicteam.bond0.load_balance_mode = "out-ip"

Be sure to replace “bond0” with the name of your own bond.
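Steps 1 and 2 can be rolled into a small script. This is only a sketch under some assumptions: list_bonds and add_out_ip_line are hypothetical helper names, and the sample lines written to the scratch file are a guess at what team definitions in hwconfig look like, not copied from a real host. Try it on a copy before pointing it at the real /etc/vmware/hwconfig.

```shell
#!/bin/bash
# Sketch only: helper names and the sample file contents are hypothetical.

# Print each distinct bond name (bond0, bond1, ...) mentioned in a file.
list_bonds() {
    sed -n 's/.*\(bond[0-9][0-9]*\).*/\1/p' "$1" | sort -u
}

# Append the out-ip line for one bond, unless a load_balance_mode line
# for that bond already exists; in that case show it and return 1.
add_out_ip_line() {
    local file=$1 bond=$2
    local key="nicteam.$bond.load_balance_mode"
    if grep "^$key" "$file" >&2; then
        echo "$file already has a load_balance_mode line for $bond" >&2
        return 1
    fi
    echo "$key = \"out-ip\"" >> "$file"
}

# Demo against a scratch file instead of the real /etc/vmware/hwconfig.
scratch=$(mktemp)
printf '%s\n' 'nicteam.vmnic0.team = "bond0"' \
              'nicteam.vmnic1.team = "bond0"' > "$scratch"
for bond in $(list_bonds "$scratch"); do
    add_out_ip_line "$scratch" "$bond"
done
cat "$scratch"
rm -f "$scratch"
```

Refusing to touch the file when a line already exists is deliberate: if you’ve experimented with load balancing before, you probably want to look at the old line yourself rather than have a script stack a second one on top of it.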

Now here’s the catch: changes to /etc/vmware/hwconfig are not read until you reboot, and no one wants to reboot an ESX Server. You can activate the change immediately by typing the following command into the service console. Again, replace bond0 with the name of your bond:


echo "nicteaming load-balance out-ip" > /proc/vmware/net/bond0/config

It took a lot of googling for me to figure out exactly what you had to echo into config to make the change immediately. Hopefully this post will make the answer easier to find :)
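If you end up scripting that echo, it helps to fail loudly when the bond name is wrong. A hedged sketch: apply_out_ip is a hypothetical wrapper, and its optional second argument (a directory to use in place of /proc/vmware/net) exists only so the function can be tried on a machine that isn’t an ESX host.

```shell
#!/bin/bash
# Sketch only: apply_out_ip is a hypothetical wrapper around the echo above.
# Usage: apply_out_ip BOND [NET_DIR]   (NET_DIR defaults to /proc/vmware/net)
apply_out_ip() {
    local bond=$1
    local net_dir=${2:-/proc/vmware/net}
    local cfg="$net_dir/$bond/config"
    # Stop early if there is no config node for this bond name.
    if [ ! -e "$cfg" ]; then
        echo "no such bond config: $cfg" >&2
        return 1
    fi
    echo "nicteaming load-balance out-ip" > "$cfg"
}

# Dry run against a scratch directory that mimics the /proc layout.
fake=$(mktemp -d)
mkdir -p "$fake/bond0"
: > "$fake/bond0/config"
apply_out_ip bond0 "$fake" && cat "$fake/bond0/config"
rm -rf "$fake"
```

On a real host you would call it as root with just the bond name, for example apply_out_ip bond0.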

That’s all! You’ve now got an 802.3ad NIC team running with IP-based load balancing on incoming and outgoing traffic. VMware has published a white paper about this subject, which you can view for more information.

Trend Micro Anti-virus

Wednesday, December 14th, 2005

I just finished deploying “Trend Micro Client Server Messaging Suite for SMB”, aka Trend Micro’s networked antivirus product. Despite the convoluted name, it’s AWESOME. I can’t imagine how they could make deployment any easier… I log in to the admin web site, select all the computers in my AD domain, and click install. It does the rest. It even uninstalls whatever AV product might be on a computer before it installs itself. (Goodbye McAfee!)

I’ve already had one report from someone saying they’ve noticed that their computer is running faster. That’s Trend’s other claim to fame: low resource utilization. It’s also already found viruses that the other two products (McAfee and Symantec) had ignored. And the cost for licensing the server and 51 clients was very small, especially compared to what McAfee and Symantec charge for their products.

Trend’s admin tools are wonderful. It’s all done through your web browser (and you can use any browser you like, so long as it’s Internet Explorer….). You can set policies like “no one can uninstall the AV client without a password”, “no one can change the update settings”, etc. It will automatically download updates from the Trend server on your network, to save your outbound bandwidth. If it can’t find a Trend server (for example, a laptop that someone brought home), it will then download over the Internet from Trend directly.

So, in summary, Trend Micro’s AV product is easier to deploy, easier to manage, faster, more efficient, better at catching viruses, AND CHEAPER, than McAfee and Symantec.

I’m really, really happy that I don’t have to deal with Shitcaffe or Pissant-ec ViruScan anymore. The next P-o-S IT product I hope to eradicate from my life is VERITAS BACKUPEXEC, which I’ve ranted on before. I read an article where someone was complaining about the same stuff in BackupExec that I was, and then he went on to mention that he switched to Retrospect’s corporate product and was quite happy. So, today, I hope to try a Retrospect demo. The absolutely disgusting thing is that I can get a competitive upgrade to Retrospect, if I choose to go with the product, for $400, and that includes unlimited licensing for servers and clients. We paid way, way more than that for our Veritas licensing, and we’ll need to pay even more when we add more employees. So, if all goes well, Veritas can go the way of McAfee.