
Monthly Archives: March 2016

Now that I got your attention with the previous post, I will encourage anyone interacting with a vendor (my company included) to hold them accountable to deliver over the long haul. As an industry, vendors and consumers alike, we have to get better, and here are a few thoughts that I encourage you to consider:

1 – Don’t let your vendor present to you. Make them whiteboard the solution.

In my previous life I was a manager for an SE team, and I invited one of my security “experts” into an important customer meeting. He had a set of slides that he had accumulated from corporate. He walked through the slides talking about the solution, and then the customer asked a question, to which he said, “I don’t know.” The customer was polite enough to point out that his slide had the very answer, but they were looking for clarification. You see, slides are the crutch of any organization. Slides are the marketing department run wild to make some of the ugliest solutions look like a work of art. They make the lipstick on the pig actually look attractive. They have animations and custom graphics that lure you into believing that everything works. After all, if a company spent that much time and money on building these works of art, then the solution has to work equally well, right? Nothing could be further from the truth. PowerPoint engineering is not a substitute for a workable solution, so don’t fall for that trap. If the vendor really knows what they are doing, make them whiteboard the entire solution and talk to you along the way. Take away the crutch and see if they can run.

2 – Test the product at scale

Quite frankly, anyone can make a four-switch solution appear to work well, but when you fully load it, things can go terribly wrong. Case in point: I was told about some testing the other day on another solution where, when the customer loaded up about one hundred MAC addresses on a switch, the switch would periodically flood all traffic on all links. It appeared to be a lookup miss in the MAC table lookup process. While you may not notice this with 10 connected hosts, I guarantee you will with 10,000 hosts. Decide just how big your network will become in terms of hosts, VLANs, VNIs, L3VPNs, VRFs, routes, racks, BGP peers, MLAG’ed hosts, etc. Tell the vendor that you want to see this at simulated scale in their lab. Do a proof of concept. Most vendors even offer this in a remote fashion, where they build it and you can test and break it remotely. Make changes to the environment and see how it reacts. I encourage you to pull linecards, reboot switches, do upgrades, etc.
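To make the scale point concrete, here is a minimal Python sketch of a toy learning switch. It is not a model of any real product or of the bug described above. The fixed table capacity, the `ToySwitch` class, and the function names are all assumptions made up for illustration. It only shows why a lookup miss that is invisible with 10 hosts becomes obvious with 10,000.

```python
def generate_macs(count):
    """Return `count` unique, locally administered unicast MAC addresses."""
    macs = []
    for i in range(count):
        # 0x02 in the first octet marks the address as locally administered.
        value = (0x02 << 40) | i
        octets = value.to_bytes(6, "big")
        macs.append(":".join(f"{b:02x}" for b in octets))
    return macs


class ToySwitch:
    """Hypothetical learning switch with a fixed-size MAC table."""

    def __init__(self, table_size):
        self.table_size = table_size
        self.table = {}      # MAC address -> port
        self.flooded = 0     # frames sent out all ports (lookup miss)
        self.forwarded = 0   # frames sent out a single known port

    def learn(self, mac, port):
        # Learn only while there is room; a full table silently drops new entries.
        if mac in self.table or len(self.table) < self.table_size:
            self.table[mac] = port

    def forward(self, dst_mac):
        if dst_mac in self.table:
            self.forwarded += 1
            return self.table[dst_mac]
        self.flooded += 1
        return None  # None stands for "flood on all links"


def run_scale_test(host_count, table_size, ports=48):
    """Learn `host_count` hosts, send one frame to each, return (flooded, forwarded)."""
    switch = ToySwitch(table_size)
    macs = generate_macs(host_count)
    for i, mac in enumerate(macs):
        switch.learn(mac, i % ports)
    for mac in macs:
        switch.forward(mac)
    return switch.flooded, switch.forwarded
```

With an assumed table size of 100, `run_scale_test(10, 100)` returns `(0, 10)`, while `run_scale_test(10000, 100)` returns `(9900, 100)`. The small test looks perfect. The large one floods almost everything.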

3 – Call their tech support with a problem

Have you ever been working late in the evening and hit a problem, so you call tech support because your change window ends in 2 hours and you have to get this working? You pick up the phone and call, and you are put on hold waiting for the next available representative. Once you get to a live person, their answer is the equivalent of “reboot the box and let me know if the problem goes away.” 90 minutes into the call and 30 minutes prior to your change window expiring, you finally get to someone who can help. By this time, you have to revert your change, mark it bad, and start to collect documentation to go before your change review board as to why the change was bad. To add insult to injury, three days later you get an email back from tech support with the exact configuration change that is necessary to work around a bug you are hitting. HOW FRUSTRATING! I encourage you to call any vendor’s tech support and give them a question. Is the person answering the phone the one who will solve your problem, or are they a glorified metric keeper? You know the ones: they can say they answered your call within x number of minutes. I don’t need you to answer my call, I need you to fix my problem.

4 – See it, feel it, experience it

I once had a very smart executive tell me that “if I can’t see it, feel it, experience it, then it didn’t happen.” In other words, get the equipment in-house. Do things to it without the vendor around. Upgrade it and see how long it takes. Pull cables and see what happens. Change the link speeds: do all ports flap? Patch the operating system, extend the operating system, kill things, etc. In addition to the product, I encourage you to talk to the leaders in the company. Do they take time to speak with you, or are you not important enough to them? Find the people that build the product and software and ask them about it. I am always thrilled when I get to watch a vendor talk about their product like they are talking about their own child. Do their eyes light up when they discuss the greatness? Experience the solution in all aspects. It is only at this point that you can truly assess the product you are about to buy.

Yes, I know this is long, and yes, I know you may be tired of hearing from me by now, but we can be better. Yes, I work for a vendor, something that I am proud of. No, we aren’t the right product for everyone, and I am glad to tell you that if it applies. However, if my organization is not living up to this standard, we want to know, because I can assure you that it is not what we are about. Just go watch Ken Duda’s video on quality to see how we operate.

In the end, if you are unwilling to invoke any of the aforementioned suggestions, you can ride off into the sunset and be perfectly content with mediocrity. After all, my dog was perfectly fine with eating her own vomit and lived a long life that way. I did find that when we watched her closely, we could dramatically shorten her illness if we simply removed her from the situation when it happened. Are you willing to remove yourself from your current vendor when bad things repeatedly happen, or will you continue to choke down the unpleasant result?

I had this dog. She was a good dog, but like all dogs she did some things that disgust me. Every time she would get sick, we would have to run to find her and quickly pull her away from the vomit so she wouldn’t immediately lap it up again, or the cycle would continue. Now that you are sufficiently disgusted with this behavior: this is the way that I feel when customers tell me that they continue to believe the promises of vendors that don’t deliver. They believe the PowerPoint engineering slides that are beautiful works of art, but nothing more. They believe the executive promises that they will get it right this time. When are we as an industry going to quit blindly believing what we are told and demand to see it, experience it, feel it, touch it, and break it?

I was presenting at a customer the other day and a short way into the presentation we got to talking about the problems that the customer had in their network. They were most interested in how our architecture and design specifically was going to solve their problems.  This particular customer was interesting as we talked through some of the challenges. You see it appears that they had been sold the proverbial “we will get it right eventually” tactic. I can’t tell you how frustrating it is to hear customers go through years of pain and agony. Years of changing architectures each with the promise of fixing all the things that the previous design couldn’t. Years of forklifting one platform for another because that one has a bug in the hardware or there is a new “Special” ASIC that they need to take advantage of. I even had a customer suggest that they were going to replace a vendor’s custom ASIC solution that had a HW bug with a brand new version of a different custom ASIC versus pushing towards an off the shelf merchant silicon solution. AHHHH – holding my tongue at this point was tough!

The few, the brave innovators of our industry, are willing to take the risk. Unfortunately, they are only willing to do that after they have been burnt so many times that they have a learned response to their current vendor. When I walk into a conference room, I look for the most bleary-eyed person in the room. That is the poor guy or gal that has to spend weekend after weekend troubleshooting problems, testing code versions only to find YAB (Yet Another Bug), and doing painful upgrades in the hope that the next version will be better. They are the soldiers that get burnt out, but they are the ones that often hide the fact that the organization is eating its own vomit. Sure, just like with my dog, I sometimes make it in time to pull it away, while other times I am too late. These guys and gals are the equivalent: they hear that fateful sound of the network having problems, and they devote their personal time and energy to fix the problem and limp along another day. They are the heroes of the business and they deserve better!

In my next post, I will suggest ways that we can all avoid these pitfalls and make our networks better.  Stay tuned for Thursday’s post 🙂

You may be reading this post thinking it is about some management philosophy. Sorry to disappoint you! This post has a two-fold purpose: 1 – as a response to an article I read last week (https://www.linkedin.com/pulse/why-network-industry-has-been-stuck-1980s-ciscos-embrace-joe-howard) and 2 – to talk about my new role that I picked up in the last month.

For those that don’t know, I devoted a lot of my pre-children days to certification (which I highly encourage anyone getting into IT to do).  I posted previously that I got all the Microsoft and Novell Certifications that I could and I wanted a new challenge, so I undertook the R&S CCIE.  This was back in the days of the 2 day lab and scenarios that at that time were highly applicable to what I was doing day in and day out.  By the time I decided to take the Security IE, it had switched to a one day exam that was more of a “stump the chump” mentality in that what was tested was very rarely implemented in networks but was a lot of required learning nonetheless.

As I read the above post from Joe Howard (which is a very good post) – I was struck with what I get to experience many days in my current job.  I get to go visit a lot of customers and no less than once a week I hear about some crazy design that makes no sense and in reality reminds me of a CCIE lab – overly complex to solve a relatively simple problem.  What I believe happened is that customer networks have slowly become the CCIE’s practice lab – not from solving problems but from creating unnecessary complexity to ensure job security and/or prepare the individual to pass the lab.  At a minimum the art of simplicity was thrown out as the first step in solving a problem.  This in turn creates a patchwork of design modifications and point in time fixes that over the course of time becomes a big ball of mud.

I was at a large customer the other day that I have worked with over the years. They had a simple edict: no proprietary protocols EVER. This discipline in itself helped keep the network clean and sleek. As people have moved on, proprietary things have moved in and complexity ensued, locking in yet another customer. I talked with another customer who had a full MPLS network inside of their relatively small datacenter. I was shocked. When I asked why, there was no valid reason that VLANs couldn’t have solved his problem. Combine this with yet another example of a customer that implemented hundreds of static routes throughout the network and complained that he could never go on vacation for fear that something would go wrong and he wouldn’t be there to fix it. We have got to stop building complex networks in the name of job security and training for our IE lab. I believe that the IE has turned networks into works of art rather than solutions to business demands. The industry is moving on, and some of the most beautiful designs I see are those of the cloud providers, which are incredibly large but incredibly simple. This equals more stability and greater ability to orchestrate everything they do. Several that I spoke to don’t ever plug a laptop into the devices; everything is automated. If it can’t be automated, it doesn’t go in. Complexity is very, very hard to automate. To drive my point home about vendor-introduced unnecessary complexity, I went to a few different vendors’ sites the other day, and one had over 100 data center design guides in some flavor. How in the heck do you decide how to put that all together? No wonder it takes an IE-level person to make this stuff work. The industry can do better here!

Now to my second quest. I recently took over the leadership for training and certification at Arista, working with some great people like Gary Donahue, who is a legend in the networking industry with his Network Warrior and Arista Warrior series of books. Gary and I were talking about launching the next line of Arista professional training and exams, one of which will be an “Expert” level lab exam. What I don’t want the exam to be is a reason for someone to introduce unnecessary complexity into a production network to show how smart they are. We are thinking that half of the exam will be about deep knowledge of protocols and building data center networks, but the other half will be about automating and orchestrating capabilities, thus forcing simplicity in design. We plan to include items like leveraging the unmodified Linux capabilities to solve real problems, extensibility in fixing and introducing new capabilities, then of course troubleshooting, and last but not least, using DevOps to detect and fix issues before they ever cause a problem. Yes, we want to unleash the proverbial “Chaos Monkey” in the lab and see how well the network stands up.
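For readers who have not seen the “Chaos Monkey” idea applied to a network, here is a rough Python sketch of the concept. It is not the lab design, which has not been decided. The leaf-spine topology, the node names, and every function here are hypothetical and exist only to illustrate randomly failing links and checking whether every leaf can still reach every other leaf.

```python
import random
from collections import deque


def build_leaf_spine(leaves, spines):
    """Return the set of links in a full-mesh leaf-spine fabric (assumed topology)."""
    return {(f"leaf{l}", f"spine{s}") for l in range(leaves) for s in range(spines)}


def reachable(links, src, dst):
    """Breadth-first search over the surviving links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def chaos_round(links, failures, rng):
    """Fail `failures` randomly chosen links and return the links that survive."""
    failed = set(rng.sample(sorted(links), failures))
    return links - failed


def survives(leaves, spines, failures, rounds, seed=0):
    """Return True if leaf0 can still reach every other leaf in every chaos round."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    links = build_leaf_spine(leaves, spines)
    leaf_names = [f"leaf{i}" for i in range(leaves)]
    for _ in range(rounds):
        remaining = chaos_round(links, failures, rng)
        for dst in leaf_names[1:]:
            if not reachable(remaining, leaf_names[0], dst):
                return False
    return True
```

In this toy fabric, `survives(4, 2, 1, 50)` is always `True`, because any single link failure still leaves every leaf an uplink to a spine that reaches all the others. A simple design makes that kind of claim easy to reason about and easy to test automatically.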


We are early in deciding what the lab will become.  We are getting together some of the foremost experts in the company to help us build scenarios and ultimately the lab.  I won’t apologize for it being brutal.  However my hope is that it is a useful brutal to drive simplicity versus a reason to implement unnecessary complexity in real networks with little return.

Yes, the journey for both of my CCIEs was worth it. I learned a ton and implemented much of my routing and switching IE in real life. It is time for the next generation of learning and development where we encourage creativity, but creativity that starts with simplicity to solve the problem. In the end, if you can’t orchestrate it and maintain it, you might want to think about a different approach to solving the problem.