
Category Archives: SDN

Given all the information out there this week about Intent-Based Networking Systems (IBNS), I think it is good to separate the “marketecture” from the reality. We are embarking on a new era that could rival the days when SDN was “the future of networking.” We will see more “PowerPoint engineering” in the coming weeks, with the associated tweets and blog posts that will make even the most skeptical network engineers question their protocol knowledge and existence.

Let’s start with the question: what is “Intent-Based Networking?” I actually love famed Gartner analyst Andrew Lerner’s definition of an Intent-Based Networking System:

  1. Translation and Validation – The system takes a higher-level business policy (what) as input from end users and converts it to the necessary network configuration (how). The system then generates and validates the resulting design and configuration for correctness.
  2. Automated Implementation – The system can configure the appropriate network changes (how) across existing network infrastructure. This is typically done via network automation and/or network orchestration.
  3. Awareness of Network State – The system ingests real-time network status for systems under its administrative control, and is protocol- and transport-agnostic.
  4. Assurance and Dynamic Optimization/Remediation – The system continuously validates (in real time) that the original business intent of the system is being met, and can take corrective actions (such as blocking traffic, modifying network capacity or notifying) when desired intent is not met.
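To make the first two functions above a little more concrete, here is a minimal Python sketch of “translate, validate, then implement.” Everything in it – the intent fields, the policy name, the ACL shape – is invented for illustration; real IBNS products are vastly more involved:

```python
# Hypothetical sketch: translate a business-level intent (the "what")
# into device configuration (the "how"), then validate before deploying.

def translate(intent):
    """Turn a business policy into candidate config lines."""
    if intent["policy"] == "isolate":
        return [
            f"ip access-list {intent['name']}",
            f"  deny ip {intent['src']} {intent['dst']}",
            "  permit ip any any",
        ]
    raise ValueError(f"unknown policy: {intent['policy']}")

def validate(config):
    """Cheap correctness check: an ACL that denies everything is
    almost certainly not what the business intended."""
    return any(line.strip().startswith("permit") for line in config)

intent = {"policy": "isolate", "name": "PCI",
          "src": "10.1.0.0/16", "dst": "10.9.0.0/16"}
config = translate(intent)
assert validate(config), "refusing to deploy a config that fails validation"
print("\n".join(config))
```

The point of the sketch is the separation of concerns: translation and validation are distinct steps, and automated implementation only happens after validation passes.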

Even in a recent article, Lerner describes a vendor’s recent IBNS launch:

“It’s a platform that should enable intent driven network management in the future,” he says. “Except for some discrete, tight use cases around configuration, it’s not quite completely glued all together yet.”

So there you have it: vendors already claiming they have an IBNS solution – which is really a bunch of disjointed parts. Sound familiar? Aren’t we jumping the gun? The same thing happened with Software Defined Networking – everyone claimed to have an SDN solution. In fact, a common question I would get was “what is your SDN solution?” Ugh. Let’s play buzzword bingo and see who is left standing.

As a kid, one of my favorite shows was Looney Tunes, where Wile E. Coyote would endlessly chase the Road Runner, using every method he could to catch that bird.

In the picture above, this was the standard approach – he would light the fuse on the rocket to go fast, only to blow up when that approach reached its conclusion. You see, Wile E. had a great intent – to catch the Road Runner. However, he was operating from a flawed platform with flawed approaches. You know how the story goes – again and again he tries and fails, at times blowing himself up, at other times falling off a cliff. The moral of the story is that his intent was great, and maybe the method even seemed like a good idea, but he lacked a basic understanding of how things worked in his endeavor.

Another popular perspective is Maslow’s Hierarchy of Needs, first proposed in 1943 and expanded in his 1954 book. Maslow’s theory suggests that the most basic level of needs must be met before the individual will strongly desire the secondary or higher-level needs.

Being a bit of a psychology buff, I find this fascinating because it very simply describes human behavior. I began to wonder whether we could apply the “Hierarchy of Needs” to other parts of life.

So after much thought in the Intent-Based Networking space, combined with my history in Service Management, Application Development, and 20 years in Networking, I decided to propose “DT’s Hierarchy of Networking Needs.”

Let me explain each of these layers below:

Quality System

You have to start with Quality! Recently a customer commented to me that it was beautiful that when things broke, they broke and recovered the way we expected. You see, things are going to break, bugs are going to happen, and even cosmic rays are going to cause SEUs (single event upsets). If a vendor tells you that things like this won’t happen, they are lying to you. It all starts with having a good quality system, one that you can ideally update and patch without taking down the entire system. It begins with confidence that when you do an upgrade, you aren’t going to step into a quagmire of new bugs lying in wait for you to trip on them. As Ken Duda once said, “If the network isn’t working, then nobody is working!” So the basic principle that aligns with Maslow is that you have to have the physiological needs met – in this case, high confidence that things work as expected, with high quality.

Basic Features/Capabilities

Anyone can build a switch that can pass packets between two interfaces – if you can’t, you don’t deserve to be in networking. After you can pass packets, you have to be able to do it under stress and at scale. Then you start to organize things into a full system where devices talk to other devices. In order for this to work, you need some level of standards-based protocol support so that everyone plays nice in the sandbox. Let’s face it: if you need STP support and one vendor doesn’t have it, you are up the proverbial “creek without a paddle.” Likewise, if you are routing and you need BGP to handle the scale of routing, then that becomes basic functionality. You must have a basic set of features and capabilities to make a system interoperate correctly.

Telemetry/Visibility/Management

Next, after building a network where everything works together, you need to be able to manage the system for when things do happen. Those things could be misbehaving applications, cables going bad, NICs acting up, and even the occasional bug. Being able to see what is going on in the network is essential. We are moving in a new direction with things like streaming telemetry, which will be the death of SNMP (finally). This telemetry will also be a critical data feed on which to build machine learning in the future. You also need to be able to extract not just control plane information – you must be able to mirror ports and aggregate those mirrored ports for visibility, for security, inspection, and troubleshooting. So the third need becomes the Telemetry/Visibility/Management of the network.
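As a rough illustration of why a continuous stream of counters beats periodic SNMP polling for this layer, here is a small Python sketch that consumes a stream of per-interface counter updates and flags rising error counts. The record format, interface names, and threshold are all invented for the example:

```python
# Hypothetical sketch: watch a stream of per-interface counter updates
# (as a telemetry collector might emit them) and flag rising error counts.

last_seen = {}  # interface -> last observed error counter value

def process(update, threshold=100):
    """Return interfaces whose error counters grew past the threshold
    since the previous update for that interface."""
    alerts = []
    for intf, errors in update.items():
        delta = errors - last_seen.get(intf, errors)
        if delta > threshold:
            alerts.append((intf, delta))
        last_seen[intf] = errors
    return alerts

# Simulated stream: two updates, a few seconds apart.
assert process({"Ethernet1": 10, "Ethernet2": 0}) == []
assert process({"Ethernet1": 500, "Ethernet2": 5}) == [("Ethernet1", 490)]
```

Because each update arrives as structured data rather than a scraped screen, the same feed can drive alerting today and feed machine learning models later.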

Programmability

The era of SDN ushered in new forms of programmability and automation previously unheard of in the networking arena. You should be able to run your network with systems similar to those managing your servers – in reality it is Linux under the covers, so why wouldn’t you? We need to be able to inject our programmatic will into the control plane to steer packets in ways that are not in the natural order, and we must be able to query the systems and leverage that information to provide proactive notification of problems before they ever happen. Programmability goes way beyond the CLI – in fact, I would say that the CLI will slowly age out in favor of programmatic/API integration into the network. The network is yours, so shouldn’t you be able to do with it what you want, rather than a vendor telling you “only do it this way”?

Intent-Based Networking

We arrive at the top of the pyramid, where we have reached the main goal of our networking lives: signaling intent to the network and having it “auto-magically” instantiate an optimized network that has the security policy you intend and is automated in every way thinkable. Yes, I agree that it sounds like a utopia we all would like to live in…

BUT – can you imagine trying to instantiate a network configuration when you have no confidence that the next bug you trip over won’t loop your overlay and crush your users? Can you imagine trying to instantiate this network without well-formed APIs that cross vendor boundaries to make sure one vendor interoperates the same way as another? How about instantiating a network that is opaque in its monitoring and management capabilities? The reality is that we have a lot of work to do as an industry to get the first four layers of needs satisfied. Yes, I have an opinion about what my current employer is doing, and I do believe it is spectacular to get those pieces in order in a way that doesn’t lock a customer in. Customers should stay with a vendor not because of some Vendor Defined Networking model, but because that vendor has the highest quality in the industry, with the necessary features, combined with great telemetry and visibility – AND, for the rainy day when you want to do something spectacular, you can program your way to the finish. This, in my mind, is where we as an industry must focus. It seems easy to create PowerPoint slides and videos of how this IBNS world will solve everything, but I ask you first to train and run the race – don’t just step over the finish line and declare victory!

 

I have spent nearly 20 years in networking. In fact, as I was cleaning out my storeroom last night, I found a reminder of one of my early certifications…my Novell CNE 5 (I have a 4.11 cert somewhere).

(Image: Novell CNE certificate)

This is where I cut my teeth – building networks large and small, and building servers that at the time were the best in the business (Compaq ProLiant). At that time 10BASE-T networking was new, and I even did some 100VG-AnyLAN. Early in my career I was known for cleaning up messes. I used to joke that I was going to buy a fireman’s suit because I was always putting out fires. The reason I was always called in to fix the mess was that I was unwilling to compromise. I was unwilling to do half the job that needed to be done. I was unwilling to walk away with something broken. My parents called it stubbornness; I called it character 🙂

Fast forward almost 20 years and a few management jobs later, including a stint at a 90-year-old company, and I found myself asking why. Why was I willing to compromise for something less than what I knew was the right answer? Somewhere along the way we let mediocre seep into what we do, and then mediocre becomes the norm.

One of the early requirements of my networking career was “no proprietary protocols.” As much as I would like to say this was my idea, it was actually the idea of one of my customers, who at the time had the largest private WAN in the world – a WAN that I was fortunate enough to design and build twice in my tenure (over 9 years)…all with no proprietary protocols. We even went so far as to implement RIP at large scale because we needed something lightweight for dial-backup, even though EIGRP would have been easier. Over the years I have had other customers for whom this requirement was softened; they then built something that only one vendor could provide…and they were stuck. I was at a different customer just last week, and they told me a vendor had convinced them that EIGRP was the best thing…so they listened. Two years later they were basically held hostage on pricing because they had implemented something only one vendor could do.

So back to my question: why compromise? The pressures being placed on the network today are bigger than ever before. The people requirements are getting even worse. Forbes estimates that “75% of IT time and activity are spent keeping the lights on.” An Avaya survey reveals that 80 percent of companies lose revenue when the network goes down (pretty sure it is closer to 100% today). They also found that 1 in 5 companies fired an IT employee as a result of network downtime. So we are asking people to do more with less. Network downtime in the financial sector can cost an average of $540,358 per incident…and yet we ask employees to compromise. Would you be willing to choose a mediocre solution if your job depended on it? Would you want options to hold your vendors accountable to one another? Would you throw a networking company out for poor code or product quality? If you choose a technology that is not 100% interchangeable – you will.

(Image: Dilbert comic)

If you look at network silicon pre-2006, speeds weren’t increasing with Moore’s Law. This is because custom silicon (that is, silicon built by the networking vendors themselves) was focused on features. Features keep you locked in…speed is fleeting. Fast forward to today, and with the inclusion of merchant silicon we can now see the speed increases (charts below):

(Charts: 10G port density over time, and Moore’s Law)

10G density is on the same trajectory as Moore’s Law, and the gaps between the features actually used are getting smaller and smaller. So if the silicon is the same and the features are similar, how do you differentiate? The answer, in my mind, is quality. Quality of the software, quality of the support, quality of the hardware, quality of the company. Don’t get me wrong – some vendors have the ability to get more out of the same chip than others, and you should evaluate that…but if you are the 1 in 5 employees about to lose your job, shouldn’t you care about the quality of what you are building?

If I am building a Data Center today, here are the criteria:

  • Highly Redundant and Available with 6x9s uptime (most apps want 5x9s)
  • Highly Automated and orchestrated from day 1 with minimal human CLI interaction
  • Vendor agnostic – where I can swap any spine group or leaf group out for another vendor without losing functionality
  • FAST – I want to upgrade in place when new speeds are available without a new architecture
  • Extensible to solve my business problems. As a vendor you build the network, as a consumer I know what I need it to do, so don’t limit me from doing it
  • Cost effective – I am willing to pay more for these things, but there should be competition to keep the price competitive
  • Simple – I shouldn’t have to retrain my people to make a change
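For context on the uptime targets in that first criterion, the downtime budget for “N nines” of availability is simple arithmetic. A quick Python sketch (the formula is just the standard availability calculation):

```python
# Downtime budget per year for a given number of "nines" of availability.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes(nines):
    """Allowed downtime per year, in minutes, at 'nines' nines."""
    availability = 1 - 10 ** -nines
    return MINUTES_PER_YEAR * (1 - availability)

# Five nines (99.999%): about 5.3 minutes of downtime per year.
# Six nines (99.9999%): about 32 seconds per year.
print(f"5x9s: {downtime_minutes(5):.2f} min/yr")
print(f"6x9s: {downtime_minutes(6) * 60:.1f} sec/yr")
```

Going from five nines to six nines shrinks the yearly downtime budget from roughly five minutes to roughly half a minute – which is why in-place upgrades and redundancy come first on the list.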

I am sure there are other criteria you would include. This is a compilation of what I see in the cloud providers and the most critical enterprises. But if every deployment is like putting a bullet in the chamber and spinning it roulette-style, then as networking professionals we should be unwilling to compromise on building the best.

I tweeted out a few months ago that “when politics and PowerPoint win – network engineers lose.” Nothing could be more true! I encourage you to listen to networking vendors, but I also encourage you to test things for yourself. Only then can you tell the difference between PowerPoint engineering and true quality engineering, and thus decide whether you are compromising or not.

Have a happy day!

 

I was recently asked by a customer to give my perspective on a number of topics including Software Defined Everything.  I will be making a set of blog posts on this and other industry perspectives in the coming weeks.  For now – let’s talk Software Defined Everything!

For those running in IT circles, there are many different categories of “Software Defined.” Today there are software-defined networking (SDN), software-defined storage (SDS), and the software-defined data center (SDDC), to name a few.

One of the best lines that I have ever heard was from Carl Eschenbach – VMware’s President and Chief Operating Officer – “Software doesn’t run on software the last time that I checked!”  This is very true.  However, software does give us a degree of nimbleness that can ultimately help solve business problems.  In fact, if you look at the cloud computing models out there today, they exist largely due to the agility that comes with software.  At a recent event I saw a slide from IDC that accentuated the value of software and openness (recreated below):

(Chart: IDC data on the value of software and openness, recreated)

The reality is that software coupled with cloud computing has enabled new business models like nothing else.  As I visit with many different cloud providers, even their own business models focus on the ability to capitalize on agility that they have and their competitors don’t.  Specifically, customers tell me that Software Defined Networking is about “better orchestration and operational efficiency.”  This also goes hand in hand with the openness to choose the best solution in the network.  Let’s face it – no single vendor is capable of being #1 in all areas.

The other idea around SDE is that you can completely abstract the software from the hardware.  While this has been tried with OpenFlow, we haven’t seen it take off in the general enterprise.  We also see it in the idea of “white box” switches that run a version of a network OS.  The challenge I see here is how you handle problems.  For instance:

Quanta LY6 switch has memory parity error _soc_mem_array_sbusdma_read: L2_ENTRY.ipipe0 failed(ERR)

Is this a hardware error or a software error?  Obviously this is on a Quanta white box switch.  How should the software react to this parity error?  Is it a single event upset or a more troubling failure?  In the end, parity errors suck!  Yes, I know that cosmic radiation taking your network down sounds like something out of a science fiction novel, but it does happen.  In fact, in networks with 15-20K devices these events are likely to happen to the tune of 1-2 a week, and that assumes you have built the device to be as protected as possible.  The reality is that software needs to account for these errors and minimize their impact.  Otherwise your switch is rebooting or, worse yet, hung!  So as much as we would like to completely abstract the hardware from the software in network switches, the blast radius is often too large – lose a TOR switch and at least 48 physical hosts are affected.  Worst case, the switch having a parity error causes instability elsewhere.  Think for a second how much impact a chassis that experiences an SEU in the control plane would have.
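To put that 1-2-a-week figure in per-device terms, here is a quick back-of-the-envelope calculation in Python. It uses only the fleet-size and event-rate numbers above; the choice of midpoints is mine, purely for illustration:

```python
# Back-of-the-envelope: per-device SEU rate implied by "1-2 events a week
# across 15-20K devices" (midpoints chosen for illustration).

devices = 17_500            # midpoint of 15-20K
events_per_week = 1.5       # midpoint of 1-2
events_per_year = events_per_week * 52

per_device_per_year = events_per_year / devices
mean_years_between_events = 1 / per_device_per_year

print(f"~{per_device_per_year:.5f} events per device per year")
print(f"i.e., one event per device roughly every {mean_years_between_events:.0f} years")
```

Any single switch can expect an SEU roughly once every couple of centuries, which is why each one feels like science fiction – yet at fleet scale they arrive weekly, so the software must be engineered for them.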

Lastly, a recent article in The Register declared that “software-defined networking is making users grumpy rather than delivering promised benefits.”  I have my opinions as to why this is the case, and I will share some of that insight, along with some facts from a recent Gartner poll, in the future.

I came back to networking after a hiatus spent in Service Management and Software Architectures at a customer.  SDN to me was the perfect storm: a highly programmable network could be achieved without the archaic ways of old – “screen scraping” and antiquated SNMP.  I once had a problem in my Cisco days where we wanted two links out of a remote office: one a DSL link, the other a T1.  We wanted the delay-sensitive traffic to traverse the T1 and backup/bulk traffic to ride the DSL link, provided everything was up, and we wanted the mission-critical traffic to switch to the DSL link if the T1 was down.  In order to achieve this with the CLI, I needed GRE tunnels with IPsec tunnels over the T1 and the DSL circuit.  Sure, we made it work, but troubleshooting it was somewhere next to impossible.  The real solution would have been a simple “if-then-else” statement in the logic of the device – but I couldn’t do that with the CLI.  Now as we approach 2016, we are seeing SD-WAN solutions solve this problem and more.
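The “if-then-else” the CLI couldn’t express is trivial in code.  Here is a minimal Python sketch of that dual-link policy – the traffic class names and the function are invented for the example, not any vendor’s API:

```python
# Hypothetical sketch of the dual-link policy described above:
# delay-sensitive traffic prefers the T1, bulk traffic rides the DSL,
# and each class fails over to the other link when its preferred one is down.

def choose_link(traffic_class, t1_up, dsl_up):
    """Return the link a packet of this class should take, or None if both are down."""
    if traffic_class == "delay-sensitive":
        if t1_up:
            return "T1"
        return "DSL" if dsl_up else None   # mission-critical failover to DSL
    if traffic_class == "bulk":
        if dsl_up:
            return "DSL"
        return "T1" if t1_up else None     # bulk rides DSL unless it is down
    raise ValueError(f"unknown class: {traffic_class}")

assert choose_link("delay-sensitive", t1_up=True, dsl_up=True) == "T1"
assert choose_link("delay-sensitive", t1_up=False, dsl_up=True) == "DSL"
assert choose_link("bulk", t1_up=True, dsl_up=True) == "DSL"
```

Ten readable lines of logic, versus GRE-over-IPsec gymnastics that were next to impossible to troubleshoot – that contrast is the whole appeal of a programmable network.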

To me, Software Defined is all about programmatic access – being able to extend and grow a solution based upon your needs.  Yes, this means APIs and even SDKs for every solution.  At the end of the day, the customer knows how they want to use a solution.  And you never know – sometimes customers do some pretty cool things that you never expected.

Have a Happy Day!

I have had the opportunity over the last couple of weeks to deliver several SDN presentations to different audiences.  As I prepared, I kept coming back to one truth: the network has always been software defined.  What changed is that certain vendors decided to take that software and put a configuration interface on top of it that gave you limited access to the functions and capabilities of the devices.  In other words, you could say that networking has predominantly been CLI-defined or vendor-defined, based upon the features and functions that are exposed to the customer.

As I thought back on my career, the majority of it as a network engineer, I was struck by this construct.  When I was doing development, every new problem could generally be solved with code…software…back then in an editor and compiler.  So why shouldn’t I be able to solve my networking problems the same way?  I understand that there are standards and protocols that help protect us from bad code by an end user (because application developers always write perfect code…right?).  However, if I know how I want to manipulate the flow or the control plane, shouldn’t I be able to do it, given the risks?  The answer previously has been “NO” – but that is changing.  With the advent of EOS and its extensibility, the ability to manipulate SysDB and the control plane in general has opened up the possibilities.  I understand that other companies may be running to the “bolt-on” SDN approach, building APIs that interface with the hardware, but these will ultimately be flawed unless you truly have an architected operating system that interfaces in an OPEN way with the underlying hardware and software.
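As one concrete example of programmatic access in the EOS world: Arista’s eAPI accepts commands as JSON-RPC over HTTPS.  The sketch below only builds the request payload – the command list and request id are examples, and actually sending it would require a real switch and credentials, which are omitted here:

```python
import json

# Build an eAPI-style JSON-RPC request. This mirrors the documented
# "runCmds" request shape; the commands and id below are just examples.

def eapi_request(cmds, req_id="1"):
    """Return the JSON-RPC body for running a list of CLI commands via eAPI."""
    return {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {"version": 1, "cmds": cmds, "format": "json"},
        "id": req_id,
    }

payload = eapi_request(["show version", "show interfaces status"])
print(json.dumps(payload, indent=2))
```

On a real device this payload would be POSTed to the switch’s `/command-api` endpoint over HTTPS, and the response comes back as structured JSON rather than screen-scraped text; libraries such as pyeapi wrap this exchange for you.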

So I am not an SDN purist who thinks that OpenFlow = SDN.  In fact, I am much more in the camp that if you have a highly programmable platform, the agent you use to push control plane information to the hardware doesn’t matter.  You could use Openflow, Closed Flow, Darrin Fl0w – it doesn’t matter.  What does matter is that you can translate the flow or forwarding entries into information the hardware can use to populate its various elements and switch the packet efficiently through the device.
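That agent-agnostic translation step can be sketched abstractly.  Everything in the Python below – the match fields, the table names, the write format – is invented for illustration; real ASIC drivers are far more intricate:

```python
# Hypothetical sketch: translate an agent-neutral flow entry into the
# kind of table write a forwarding-chip driver might consume.

def translate_flow(flow):
    """Map a generic {match, action} flow into a (table, key, result) write."""
    match, action = flow["match"], flow["action"]
    if "dst_mac" in match:
        return ("l2_table", match["dst_mac"], {"egress_port": action["out_port"]})
    if "dst_prefix" in match:
        return ("l3_table", match["dst_prefix"], {"next_hop": action["next_hop"]})
    raise ValueError("unsupported match fields")

# The same intent could arrive from OpenFlow, a proprietary agent, or an API;
# by the time it reaches this function, the source no longer matters.
entry = translate_flow({"match": {"dst_prefix": "10.0.0.0/24"},
                        "action": {"next_hop": "192.168.1.1"}})
assert entry == ("l3_table", "10.0.0.0/24", {"next_hop": "192.168.1.1"})
```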

In short – I came back to the networking industry to combine all the great things I had learned in the application world over the last 5 years.  The intersection of SDN, or programmability, with networking provides endless opportunities for application control of the environment.  You see, somewhere along the way networking lost its way and became CLI- or vendor-defined networking…not any more.  The new revolution is to open this programmability to those who want it through the use of OPEN APIs and interfaces.  Now I can have a truly business-defined network, or customer-defined network…all due to enabling programmatic inclusion of the underlying hardware and software.

That’s all for now – off to Cleveland for two more fun days of customer meetings!