If you work in the IT industry, chances are you have worked (and possibly still work) in an environment with limited visibility into what is happening on the network. For example, a lot of engineers would struggle to tell you which applications on their network are consuming the most bandwidth. They might also know that they need to implement QoS but cannot determine the amount of bandwidth each application requires. They will be able to tell you which way traffic is being routed, but they won’t easily be able to tell you what that traffic actually is. They also find capacity planning difficult as they have no historical data. This is not to say that the engineers lack the knowledge required to do their jobs; even the most capable engineers face these same issues. The problem is that they don’t have the right tools for the job.
Today I’ll be reviewing a Software as a Service (SaaS) solution called Kentik Detect (free trial available) which aims to resolve all of the issues mentioned above. Before I take it for a spin though let’s first talk about what it is. To quote Kentik’s Knowledge Base:
Kentik Detect is an open, scalable platform for collecting, analyzing, and visualizing network traffic and performance. Providing instant access to both real-time and historical data, Kentik Detect alerts operators to performance issues and attacks while providing fast, simple tools that help isolate, identify, and explain unusual activity or behavior. Our purpose-built data platform sets up in minutes and integrates with each operator’s own tools and systems using industry standard SQL and the Kentik REST API. Kentik Detect’s portal is the user interface that allows you to configure, query, and control alerts, mitigation, and tuning.
In other words, Kentik Detect is a NetFlow collector with an extremely easy-to-use front end that gives you a deep understanding of exactly what is happening on your network. When you combine this with its alerting system and BGP analytics, you will be asking yourself how you ever got by without it.
As Kentik Detect is a SaaS, you don’t need to worry about time-consuming installations, updates or backups, nor do you need to be inside your company network to access it. Everything is done for you.
What is a NetFlow Collector?
NetFlow is a protocol which was developed by Cisco but is used by most, if not all, network equipment vendors. It is enabled on devices such as routers and switches to collect data on unidirectional streams of traffic. The data they collect consists of the following fields:
- Source IP address
- Destination IP address
- Source port number
- Destination port number
- Layer 3 protocol type
- ToS byte
- Input logical interface (also known as ifIndex)
- Other information such as egress interface and BGP next hops can also be recorded.
The above-mentioned devices then forward their records on to a NetFlow Collector, which stores and presents them in meaningful ways.
Note that the primary protocols Kentik Detect supports are sFlow, IPFIX, and NetFlow v5 and v9. SNMP is also used to enable the product to determine interface names and descriptions as well as to validate flow levels.
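To give a feel for what a collector actually receives, here is a minimal Python sketch that decodes a single NetFlow v5 flow record. The 48-byte field layout follows Cisco's published v5 export format; the function and key names are my own, the 24-byte packet header is ignored, and this says nothing about how Kentik Detect itself ingests flows.

```python
import ipaddress
import struct

# NetFlow v5 flow record: 48 bytes, network byte order.
# srcaddr, dstaddr, nexthop, input, output, dPkts, dOctets, first, last,
# srcport, dstport, pad1, tcp_flags, prot, tos, src_as, dst_as,
# src_mask, dst_mask, pad2
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_v5_record(data: bytes) -> dict:
    """Decode one 48-byte NetFlow v5 flow record into a dict (names are illustrative)."""
    (src, dst, nexthop, in_if, out_if, packets, octets, _first, _last,
     sport, dport, _pad1, tcp_flags, proto, tos,
     src_as, dst_as, _src_mask, _dst_mask, _pad2) = V5_RECORD.unpack(data)
    return {
        "src_ip": str(ipaddress.IPv4Address(src)),
        "dst_ip": str(ipaddress.IPv4Address(dst)),
        "next_hop": str(ipaddress.IPv4Address(nexthop)),
        "input_if": in_if,      # SNMP ifIndex of the ingress interface
        "output_if": out_if,    # SNMP ifIndex of the egress interface
        "packets": packets,
        "bytes": octets,
        "src_port": sport,
        "dst_port": dport,
        "tcp_flags": tcp_flags,
        "protocol": proto,      # IP protocol number, e.g. 6 for TCP
        "tos": tos,
        "src_as": src_as,
        "dst_as": dst_as,
    }


# Build a made-up record so the sketch can be run without a live exporter.
sample = V5_RECORD.pack(
    ipaddress.IPv4Address("10.0.0.1").packed,
    ipaddress.IPv4Address("192.0.2.8").packed,
    ipaddress.IPv4Address("0.0.0.0").packed,
    3, 7, 10, 1500, 0, 0, 51000, 443, 0, 0x18, 6, 0, 64496, 64497, 24, 24, 0,
)
print(parse_v5_record(sample))
```

NetFlow v9 and IPFIX are template-based, so a fixed layout like this one only applies to v5.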
Kentik Detect Review: First Impression
The first thing I look for when I’m using something new is the documentation. Sure, I could just jump straight in, have a play around and see if I can figure it out myself, but if I do that I’m bound to miss something that I would otherwise have found had I read the documentation. It is for this reason that Kentik Detect’s documentation was the first thing I looked at, and I wasn’t disappointed. It’s easy to navigate, it’s straight to the point, and it’s thorough.
After reading the documentation, I took the product for a test drive. It was at this point I realised why the documentation was so easy to follow: the product itself is so easy to use. As I will demonstrate in the subsequent sections of this post, the user interface is completely intuitive. Everything is where you’d expect it to be and does what you expect it to do. Don’t get me wrong, though: easy to use does not mean dumbed down. While most users will get what they’re looking for with a couple of clicks of their mouse, advanced users can interrogate the Kentik Data Engine (KDE) directly and generate graphs based on their own custom SQL queries using the Query Editor.
Kentik Detect’s Data Explorer is where a lot of the action happens. The main options you will find here are:
- Dimensions: The pieces of information you’d like to report on. For example, traffic grouped by source country and destination country. You can specify up to eight dimensions at a time. See the image below for a full list.
- Metric: The unit in which the traffic matched by the dimensions is measured, e.g., packets per second, bits per second, unique IP addresses, etc.
- Time: The time frame you’d like to run the query on.
- Filters: Traffic you would like included or excluded from the query.
- Device: A list of devices you would like to run the query against.
- Display: How you would like the information displayed.
I will demonstrate these options and effects in the following section of this post.
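If it helps to think of these options in code, the sketch below shows roughly what "dimensions" and "metric" mean: group flow records by the chosen dimension values, then express each group's total in the chosen unit. This is only my own illustration in Python; the field names (`src_country`, `dst_country`, `bytes`) and the function are hypothetical and have nothing to do with Kentik's actual schema or API.

```python
from collections import defaultdict


def top_by_dimensions(flows, dimensions, window_seconds, limit=5):
    """Group flows by the given dimensions and rank groups by average bits per second."""
    totals = defaultdict(int)
    for flow in flows:
        # One bucket per unique combination of dimension values.
        key = tuple(flow[d] for d in dimensions)
        totals[key] += flow["bytes"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    # Metric: convert each group's byte total into average bits per second.
    return [(key, total * 8 / window_seconds) for key, total in ranked[:limit]]


# Made-up flow records covering a 60-second window.
flows = [
    {"src_country": "US", "dst_country": "US", "bytes": 6000},
    {"src_country": "US", "dst_country": "DE", "bytes": 1500},
    {"src_country": "US", "dst_country": "US", "bytes": 3000},
]
for key, bps in top_by_dimensions(flows, ["src_country", "dst_country"], 60):
    print(key, bps)
```

Adding a third dimension simply makes the grouping key longer, which is why each extra dimension tends to split the results into more, smaller rows.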
The below image displays all of the “Dimensions” which are available at the time of writing:
The image below shows the default query being run against a device named cat2_cloudhelix_com. As the heading suggests, the graph is of the Total data throughput for this device. This is a great high-level view of what the device is doing, but let’s say we’re after more detailed information. For example, which countries is this traffic being sourced from and destined to?
Let’s remove the “Total” dimension and add the “Source: Country” and “Destination: Country” dimensions.
That’s better. After applying the new dimensions, we can see that the majority of traffic is sourced from and destined for hosts inside of the US. However, let’s say we now want to get an idea of where in the US these hosts reside. To achieve this, we can apply a filter which shows only US to US traffic. Once we have done that we can use the “Source: City” and “Destination: City” dimensions.
To apply the filter, we can either write it ourselves or use the handy “Show Options” field to the right of the results. Doing so brings up a menu which allows us to, among other things, apply a filter to the selected result. To ensure only US to US traffic is displayed, we’ll need to select the “Include” option. If we were to select the “Exclude” option, all traffic except US to US traffic would be displayed.
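The difference between “Include” and “Exclude” is easy to express in a few lines of Python. Again, this is just a sketch of the logic as I understand it from using the portal, with hypothetical field names, and not a description of how Kentik implements its filters.

```python
def apply_filter(flows, conditions, mode="include"):
    """Keep flows matching every condition ("include") or drop them ("exclude")."""
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")

    def matches(flow):
        return all(flow.get(field) == value for field, value in conditions.items())

    keep_matching = mode == "include"
    return [flow for flow in flows if matches(flow) == keep_matching]


# Made-up records; the field names are illustrative only.
flows = [
    {"src_country": "US", "dst_country": "US", "bytes": 6000},
    {"src_country": "US", "dst_country": "DE", "bytes": 1500},
    {"src_country": "FR", "dst_country": "US", "bytes": 800},
]
us_to_us = {"src_country": "US", "dst_country": "US"}
print(apply_filter(flows, us_to_us, "include"))  # only US to US traffic
print(apply_filter(flows, us_to_us, "exclude"))  # everything except US to US traffic
```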
The image below shows what happens when we make the above changes. Under the “Filters” heading we can see the filter that was created for us by selecting the “Include” option on the previous page. (While manually writing a single filter such as this wouldn’t be all that time consuming, you will find it a pleasure to use the “Show Options” method when applying multiple filters.) The image also shows us the result of the newly applied filter.
As a final example, let’s say we want to do the following:
- Show the same data as above, excluding the top three results.
  - To achieve this, we’ll use the same technique as demonstrated above, selecting “Exclude” instead of “Include”.
- Change the graph to a bar chart.
  - Change the “Display Type” to “Time Series Bar Graph”.
- Compare today’s results with results taken during the same time window (21:32 to 22:32) one week ago.
  - Switch the “Historical Overlay” on.
The image below shows the results of applying the changes listed above. As the software did all the “filtering” heavy lifting for me, I was able to apply these changes in seconds as opposed to minutes which would be the case if I were forced to copy and paste the city names into the filters. This is exactly the sort of efficiency you need when working on production networks.
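For anyone who would rather see the week-on-week comparison as code, the idea behind the historical overlay boils down to shifting the query window back by a fixed interval and comparing the two sets of totals. The Python below is my own rough sketch; the calendar date and the traffic totals are made up, and only the 21:32 to 22:32 window comes from the example above.

```python
from datetime import datetime, timedelta


def overlay_window(start, end, lookback=timedelta(weeks=1)):
    """Return the (start, end) of the same-length window `lookback` earlier."""
    return start - lookback, end - lookback


def percent_change(current_total, previous_total):
    """Percentage change from the earlier window to the current one."""
    if previous_total == 0:
        return None  # no baseline to compare against
    return (current_total - previous_total) / previous_total * 100


# Hypothetical date; the time of day matches the window used in this post.
now_start = datetime(2016, 6, 14, 21, 32)
now_end = datetime(2016, 6, 14, 22, 32)
prev_start, prev_end = overlay_window(now_start, now_end)
print(prev_start, prev_end)

# Made-up totals (e.g. megabits) for the current and earlier windows.
print(percent_change(1200, 1000))
```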
As for the graph, it now looks like this:
Once we’ve got the data we’re looking for, we can choose to export it (PDF, PNG, or JPG) and/or add it to a Dashboard. Speaking of which, let’s take a look at Dashboards now.
Dashboards give you an overview of what’s happening on your network. As they are completely customizable, they can be as detailed or as high level as you want. The other great news is that the “Time”, “Devices” and “Filters” options that we saw in the “Data Explorer” section above are also available here, resulting in a consistent experience across the board.
Alerting is an extremely important feature to have when working with production environments as it brings your attention to problems you might not have known about otherwise. Kentik Detect comes with a number of preconfigured alerts as well as the option to write your own.
Each time an alert is triggered, the event as well as related details (such as alert name, criticality and start time) are stored in the Kentik Data Engine (KDE) and can be viewed in the “Alert Dashboard” as per the image below. Each subsequent trigger of this same alert is added to the history of the original alert thereby enabling you to identify ongoing issues.
With regard to alert notification, Kentik Detect provides the following options:
- Alert Dashboard
- URL (JSON)
As well as showing you a record of all past and present alerts, the Alert Dashboard also allows you to view the details of each alert by clicking the “Event Query” button. Doing so will open “Data Explorer” on the date and time range of the offense and will also apply a filter so that the offending traffic is displayed. The image below shows what happens when the “Event Query” button next to the entry highlighted in the previous image is clicked:
When you are using BGP and you want to view the way in which traffic sourced from, destined to or transiting through your network is routed, typically you would jump onto one of your BGP routers and/or use a BGP Looking Glass server. The problem with these methods, though, is that they do not give you a clear view of the volume of traffic that is traversing each path. This is where Kentik Detect’s BGP Analytics feature comes into its own. By peering with Kentik, you enable it to correlate your NetFlow traffic with your BGP routing, and the results are very impressive.
Through the use of Sankey diagrams, line graphs, and tables, the BGP Analytics feature clearly conveys information related to BGP Paths, Transit ASNs, Last-Hop ASNs, Next-Hop ASNs, Source and Destination Countries. Viewing the specific details you’re after is made easy as you have the ability to limit the output to specific interfaces and/or ASNs.
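To illustrate what correlating flow data with BGP routing involves, here is a small Python sketch: each flow's destination address is matched against a routing table using longest-prefix match, and the flow's bytes are added to the AS path of the winning route. The routing table, AS numbers (taken from the documentation range) and flow records are all made up, and a real implementation would use a trie rather than a linear scan. I have no knowledge of how Kentik does this internally.

```python
import ipaddress
from collections import defaultdict


def longest_match(routes, ip):
    """Return (network, as_path) for the most specific prefix containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, as_path in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, as_path)
    return best


def bytes_per_as_path(routes, flows):
    """Sum flow bytes per AS path, keyed by the path of the best-matching route."""
    totals = defaultdict(int)
    for flow in flows:
        match = longest_match(routes, flow["dst_ip"])
        path = match[1] if match else ()  # empty path when no route matches
        totals[path] += flow["bytes"]
    return dict(totals)


# Hypothetical routing table: prefix -> AS path as a tuple of ASNs.
routes = {
    "203.0.113.0/24": (64500, 64510),
    "203.0.113.128/25": (64501, 64511),
    "0.0.0.0/0": (64500,),
}
flows = [
    {"dst_ip": "203.0.113.200", "bytes": 1000},
    {"dst_ip": "203.0.113.5", "bytes": 500},
    {"dst_ip": "198.51.100.1", "bytes": 200},
]
print(bytes_per_as_path(routes, flows))
```

This per-path byte count is exactly the piece that a router's BGP table or a Looking Glass cannot give you on its own.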
Kentik’s standard service offers 90 days of non-aggregated historical data, which is fantastic considering the average customer has 20 billion records saved. If those figures aren’t impressive enough, queries to their SSD-backed, custom-built column store database are completed in less than two seconds. If you would like to read more about their database, have a look at The Kentik Data Engine whitepaper.
It is quite common these days for companies of all sizes to fall victim to Distributed Denial of Service (DDoS) attacks, whereby malicious internet users flood their victim’s network with such a high volume of traffic that it is knocked offline. Kentik’s DDoS: Separating Friend from Foe and DDoS: Source Geo Analysis posts do a fantastic job of explaining how their service can be used to identify the source of DDoS traffic. Once the traffic has been identified, engineers can work with their ISPs to have the traffic blackholed before it reaches their network, thereby preventing the network from being knocked offline.
What I Liked
Kentik Detect has got something for everyone. It’s very easy to use while still providing the extremely detailed information network engineers need in order to maintain their networks. The Dashboards feature is great for NOCs where a high-level view of the network is required. Then there’s the ability to export the information to images and/or PDFs which will come in handy when preparing reports for management and clients alike.
What Needs Improving
I found that the “Alerts” page could be improved in a few respects. Some alerts which have been raised on the account I’m using have no option to view the event history, event details or event query, while others have missing entries (e.g., the “Key” field). There’s also no way to filter or sort alerts, which makes working through them difficult once the number of them gets large.
The depth of information Kentik Detect provides is invaluable when it comes to understanding what is currently happening on your network and what has happened on it previously. When this information is used in conjunction with the alerting system, anomalies are quickly detected, which enables engineers to take a proactive (as opposed to reactive) approach to the running and maintenance of their network.
If you’re not using Kentik Detect already, I strongly suggest you take them up on their free trial offer. You won’t regret it.