As a security consultant, I’ve been involved in a good number of network firewall migrations over the last few years.
As technology evolves and performance requirements increase, it is normal to go not only for a hardware upgrade but for a complete migration to a different vendor, and that is a real challenge.
Furthermore, the top players (Juniper Networks, Checkpoint, Cisco, Fortinet, and Palo Alto) are pushing hard to gain more customers, leaving us engineers with the hard work to do.
Why is replacing a firewall so critical?
Well, because for a successful transition, all seven OSI layers need to work well, from physical connectivity to the application level!
And believe me, there are always some problems that you’ll have to take care of after a migration.
In this article, I’m going to share my experience and a few tips on how to migrate your firewall… without getting a migraine.
Over the years, I’ve seen people doing it right and people doing it wrong, and I identified a pattern in the successful migrations.
The eight steps of a successful firewall migration are:
- Learn the new technology
- Review current firewall configuration
- Configuration translation simulation
- Acceptance tests
- Declare frozen zone
- Configuration translation
- Migration
- Monitoring phase
Let’s analyze each of them.
1. Learn the new technology
You don’t want to replace your old, loved firewall with a black box you’ve no idea how to use, right?
Also, you do not want to be unable to troubleshoot a simple problem because nobody knows how.
Or even worse, you do not want to conduct experiments on the production environment, trying to nail down an issue while disrupting the traffic…
To avoid all this, everyone involved in the firewall administration has to go through a training plan: become familiar with the new technology, get to know the features, and learn how to configure and troubleshoot them.
The best way is to take a vendor training course, or ask your system integrator/consultant for custom-made training that fits the requirements of your network and team.
When this is not possible, some good ol’ self-study on the manuals will be extremely useful too.
2. Review current firewall configuration
I have almost never seen a firewall configuration that didn’t bloat over time.
In the daily activities, more and more rules are added to the rulebase, old services are never removed, and over-permissive policies often allow more traffic than they should…
So, what better occasion than a firewall replacement to start with a clean configuration?
But you don’t want to change the firewall and the configuration in one go: that’s a recipe for disaster.
Remember to always change one element at a time so that you know how to get back.
That’s why I recommend reviewing the old firewall configuration long before the real migration.
Audit the current configuration; remove all the unused address objects, services, and networks. Most of the firewall management tools, such as Juniper Networks NSM (brrrr!) and Checkpoint SmartCenter, allow this operation in a few clicks.
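If your management tool cannot do this for you, the idea is simple enough to script against an exported configuration. Here is a minimal sketch, assuming the objects and rules have already been parsed into plain Python structures; the field names and sample data are made up for illustration, and objects referenced only through groups are not handled.

```python
def find_unused_objects(objects, rules):
    """Return the object names that no rule references, sorted."""
    referenced = set()
    for rule in rules:
        referenced.update(rule["source"])
        referenced.update(rule["destination"])
        referenced.update(rule["service"])
    return sorted(set(objects) - referenced)


# Hypothetical sample data, not taken from any real firewall export.
objects = ["web-srv", "old-ftp", "mail-srv", "HTTP", "FTP", "SMTP"]
rules = [
    {"source": ["any"], "destination": ["web-srv"], "service": ["HTTP"]},
    {"source": ["any"], "destination": ["mail-srv"], "service": ["SMTP"]},
]

unused = find_unused_objects(objects, rules)
print(unused)
```

Anything the script reports is only a candidate for removal; double-check it before deleting.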
Perform an analysis of the current rulebase; which policies are actually in use and which ones are old and can be deleted?
From my experience, this differs from company to company.
Some enterprises have firewall administrators who have been working there for decades. They have the memory of an elephant and can tell the history of every IP in the network (often a class B!).
Some other companies have no idea why rules are there or if they are in use.
A simple way to perform a rulebase analysis is to check the hit counters of each rule, that is, how many times a policy has been hit by traffic.
This is often an option that can be enabled on a per-rule basis (like in Juniper Netscreen and SRX), so keep two things in mind:
- Hit counters have a CPU impact on most firewalls, so pay attention to the extra load. If the rulebase is quite long (e.g. more than 100 rules), it is better to enable the hit counters in blocks of a few rules at a time and to keep the performance monitored.
- Some rules are hit every second, some once a day (backup?), some once a week (network scan?), and some once a month (payroll systems?). So, you get the idea: a complete assessment of the used rules takes time!
Nevertheless, it’s worth considering this as part of the cleanup before the firewall migration.
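To make the hit-counter approach concrete, here is a small sketch. It assumes you can export the counters as a rule-to-count mapping at the start and at the end of the observation period; the rule names and numbers are invented, and counter resets (for example after a reboot) are not handled.

```python
def batches(rule_ids, size):
    """Split the rulebase into blocks so counters can be enabled a few rules at a time."""
    return [rule_ids[i:i + size] for i in range(0, len(rule_ids), size)]


def unused_rules(first_sample, last_sample):
    """Return rules whose hit counter did not increase between the two samples."""
    return sorted(
        rule for rule, hits in first_sample.items()
        if last_sample.get(rule, hits) <= hits
    )


# Hypothetical counter exports taken a month apart.
first = {"rule-1": 0, "rule-2": 10, "rule-3": 5}
last = {"rule-1": 0, "rule-2": 5000, "rule-3": 5}

blocks = batches(list(range(1, 8)), 3)
candidates = unused_rules(first, last)
print(blocks)
print(candidates)
```

Keep the two samples at least a month apart, otherwise the monthly jobs will show up as unused rules.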
3. Configuration translation simulation
The configuration of your current firewall needs to be rewritten using the syntax of the new one, right?
How much time will that take? Do you need automated tools? How reliable are they?
It’s better to find all this out at an early stage of your migration project.
So, I recommend planning some time to test the migration of the configuration.
The basic setup can be already prepared in this phase:
- Interface Settings (physical, logical and IPs)
- Routing (dynamic routing protocols or static routes)
- High Availability Setup (clustering)
- Management Settings (users, remote access, AAA, SNMP, and Syslog)
Those elements are not likely to change frequently.
Then, there are policies, objects, services, NAT, and VPNs as well as whatever your current firewall is doing.
It can be a big config to migrate, so the question will be: manual work or automated tool?
The decision depends mostly on how many rules you have. In a firewall with 1,500 rules, manual translation is hardly an option, and if attempted, it will surely introduce some human mistakes.
Very often, using an automated script will be the preferred option.
A note of caution based on my experience: those tools will translate about 90 percent of the configuration perfectly, but inevitably the remaining 10 percent will contain errors that you’ll have to find and fix!
For example, some CLI syntaxes accept spaces in object names and some don’t, and an automated tool may not be smart enough to know.
But the problems are not necessarily due to a bad tool.
Once, a customer was using very creative ways to declare NAT rules, and the tools were failing big time in translating the FW admin’s creativity.
Another time, a customer attempted a migration using a script found online, but it was written for a very old software version and the result was a disaster.
In this simulation phase, check how long the process takes and which critical configuration elements will need revision.
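The object-name problem mentioned above is a good example of a check you can script yourself before trusting a tool. The sketch below assumes the target firewall rejects whitespace in names; the replacement character and the sample names are my own choices, not any vendor’s rule.

```python
import re


def sanitize_names(names, replacement="_"):
    """Map each original object name to a unique name without whitespace."""
    mapping = {}
    used = set()
    for name in names:
        candidate = re.sub(r"\s+", replacement, name.strip())
        base, counter = candidate, 2
        # Avoid collisions with names that already exist or were just generated.
        while candidate in used:
            candidate = f"{base}{replacement}{counter}"
            counter += 1
        used.add(candidate)
        mapping[name] = candidate
    return mapping


# Hypothetical object names.
mapping = sanitize_names(["Web Server 1", "Web_Server_1", "DMZ  net"])
print(mapping)
```

Whatever renaming you apply, keep the mapping: the rules that reference the old names have to be rewritten with the new ones.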
Something to pay attention to:
- NAT: Make sure you understand the packet flow of both the old and the new firewall technology. Some firewalls do NAT before the policy check (Juniper), some others don’t. Get this wrong and you’ll realize it very soon!
- Service timeouts: Many firewalls use custom service timeouts for specific applications. These have to be carried over to the new configuration to avoid weird connectivity problems.
- Application extensions: Call them Fixup, Resources or ALGs. What they do is look into the traffic stream to open the dynamic ports needed for protocols such as FTP, H.323 or SQL… Those are, more often than not, the source of problems. Check carefully whether they are enabled and how they are configured in your current firewall; then try to match the same settings in the new one.
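For service timeouts and application extensions, a simple side-by-side comparison catches most of the omissions. This is a sketch under the assumption that you have extracted the settings of both firewalls into dictionaries; the service names and values are purely illustrative.

```python
def diff_settings(old, new):
    """Return {name: (old_value, new_value)} for settings missing or different in the new config."""
    return {
        name: (value, new.get(name))
        for name, value in old.items()
        if new.get(name) != value
    }


# Hypothetical timeouts in seconds.
old_timeouts = {"ssh": 3600, "oracle-sql": 28800, "http": 300}
new_timeouts = {"ssh": 3600, "http": 300}

# Hypothetical ALG state (True = enabled).
old_algs = {"ftp": True, "h323": False}
new_algs = {"ftp": True, "h323": True}

timeout_gaps = diff_settings(old_timeouts, new_timeouts)
alg_gaps = diff_settings(old_algs, new_algs)
print(timeout_gaps)
print(alg_gaps)
```

A `None` on the right side means the setting was not translated at all, which usually means the new firewall will silently apply its default.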
4. Acceptance Tests
Do you remember what you learned in step one?
The idea here is to test that the basic setup and the configuration just created are working fine.
I normally write an Acceptance Test Plan (ATP), which is just a simple list of tests with the expected results.
The main focus is on High Availability (HA) test cases: What happens if a link fails? What if the whole box dies? What if…? Get creative.
The same applies to other aspects that can be tested, but realistically not everything can be tested now; otherwise it would be too simple!
Test as much as you can, and report back in the ATP document what’s OK and what’s KO.
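A spreadsheet is perfectly fine for the test plan, but if you prefer to keep it next to your scripts, it can be as small as the sketch below. The class name, fields and test cases are my own invention for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AcceptanceTest:
    description: str
    expected: str
    observed: Optional[str] = None  # filled in when the test is executed

    @property
    def status(self):
        if self.observed is None:
            return "NOT TESTED"
        return "OK" if self.observed == self.expected else "KO"


# Hypothetical HA test cases.
plan = [
    AcceptanceTest("Unplug uplink on active node", "failover to standby", "failover to standby"),
    AcceptanceTest("Power off active node", "failover to standby", "traffic dropped"),
    AcceptanceTest("Restart routing daemon", "routes re-learned"),
]

for test in plan:
    print(f"{test.status:<10} {test.description}")
```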
I found this phase particularly interesting for the customers because it’s the first opportunity to really play with the box and get familiar with operating the new firewall.
5. Declare frozen zone
In step three we figured out how long the config migration will take.
Now it’s time to declare a frozen zone: a period of time during which any change to the current firewall configuration, such as new policies or changes to existing objects, is avoided or at least tracked very carefully.
This prevents changes made on the current firewall while the new configuration is being prepared from getting lost in… translation. 🙂
At the beginning of the frozen zone, a copy of the current firewall config is taken for the next phase.
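Tracking changes during the frozen zone can be as simple as comparing the running configuration against the copy taken at the start. Here is a sketch using Python’s standard `difflib`; the configuration lines are invented and do not follow any real CLI syntax.

```python
import difflib


def config_drift(snapshot, current):
    """Return the lines added (+) or removed (-) since the snapshot was taken."""
    diff = difflib.ndiff(snapshot.splitlines(), current.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Hypothetical configuration text.
snapshot = "rule 1 permit web\nrule 2 deny any any"
current = "rule 1 permit web\nrule 2 deny any any\nrule 3 permit any any"

drift = config_drift(snapshot, current)
print(drift)
```

Run it daily during the freeze; every reported line is a change that must be applied to the new configuration as well.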
6. Configuration translation
In this phase, you repeat what worked in the previous simulation, but pay three times more attention to every step.
This is the most critical part of the whole project because, if the new configuration is done properly, the migration itself will be smooth!
This is also a good time to plan and write down a roll-back procedure.
Imagine this: things turn awful during the migration, the maintenance window runs out of time, you’re dead tired and you have a headache.
But you still have to roll back to the previous firewall and make sure everything is working!
So, make sure you have a written procedure you’re comfortable with.
7. Migration
I don’t need to tell you the migration itself has to be done in a maintenance window, right?
Just keep in mind that not all networks have their lowest utilization at the same time; sometimes it is during weekends, sometimes at night, sometimes just after office hours.
Anyway, a good suggestion, especially when working in small enterprises, is not to announce the firewall migration to users.
If you have to inform them, just mention some network maintenance but avoid the word “firewall”.
This is because the typical user associates the firewall with that annoying software on their PC that pops up asking them to click something to continue.
Whatever problem they run into the morning after, they’ll blame the firewall, so ignorance is bliss… at least for them!
The ones who really need to know about the migration are the application teams.
Folks responsible for the services (e-mail, web, database, etc.) must test that everything is okay.
My best advice is to ask them to test the applications before and after the firewall migration.
This is because I once ran into a funny situation: after a migration I was told that an FTP server, decommissioned two years before, was not reachable…
Let them check before the migration too; are the services okay?
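A basic before/after comparison can even be automated for the TCP services. The sketch below only checks that a port accepts connections, which is far less than a real application test, and the service names and results in the example are hypothetical.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_all(services):
    """services maps a name to a (host, port) tuple; returns {name: reachable}."""
    return {name: is_reachable(host, port) for name, (host, port) in services.items()}


def regressions(before, after):
    """Services that were reachable before the migration but are not afterwards."""
    return sorted(name for name, ok in before.items() if ok and not after.get(name, False))


# Hypothetical results of check_all() run before and after the maintenance window.
before = {"mail": True, "web": True, "old-ftp": False}
after = {"mail": True, "web": False, "old-ftp": False}

broken = regressions(before, after)
print(broken)
```

Note how the FTP server that was already down before the migration does not show up as a regression.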
8. Monitoring phase
If you still remember the beginning of this article, I said there are always some problems that you’ll have to take care of after a migration.
So, it’s crucial to plan a monitoring phase with the technical staff alerted and ready to fix any issue.
For mid-sized enterprises, this may require structured post-migration support, so that the poor FW administrator doesn’t get stuck on the phone instead of fixing issues.
I recommend having someone receive the support requests, filter them, and order them by priority.
This is particularly important, as the NTP server is normally not as critical as the e-mail server, and you want to focus on the important things.
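The ordering itself is trivial once the criticality of each service has been agreed on in advance. A minimal sketch, with made-up services, priorities and tickets:

```python
def triage(requests, criticality, default=99):
    """Order support requests by criticality of the affected service (1 = most critical)."""
    return sorted(requests, key=lambda request: criticality.get(request["service"], default))


# Hypothetical criticality table and incoming requests.
criticality = {"email": 1, "web": 2, "ntp": 5}
requests = [
    {"id": 1, "service": "ntp"},
    {"id": 2, "service": "email"},
    {"id": 3, "service": "web"},
]

queue = triage(requests, criticality)
print([request["id"] for request in queue])
```

Requests for services missing from the table end up at the bottom of the queue, so review those by hand.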
When does the monitoring phase start?
Well, I would say it starts from the moment the new firewall is receiving traffic.
The crucial moment is the end of the maintenance window; are the critical services up and running?
If there’s a problem there, it might mean that a roll-back is required, and that smells like a failed migration… I have seen a few roll-backs, and nobody likes them!
When does the monitoring phase stop?
This is harder to say, as it really depends on network and customer.
In my experience, one to three days is common, but problems tend to manifest either immediately or after an even longer time, when some application wakes up.
That’s it folks, I hope you found these steps helpful, and good luck with your next firewall migration!