Using Automated Network Detection & Response to Visualize Malicious IT Events Within Power Systems - Video Text Version
Below is the text version for the video Using Automated Network Detection & Response to Visualize Malicious IT Events Within Power Systems.
Erfan Ibrahim: Good morning. This is Erfan Ibrahim of the National Renewable Energy Lab. Welcome to December's Smart Grid Educational Series webinar. Today, we have the pleasure of having folks from ProtectWise speak about network detection, anomaly detection, and then also prevention, and also very important in that area is the whole area of visualization of what's going on in your infrastructure.
So I'm going to share with you some observations from our past three years of doing research here in this space at the National Renewable Energy Lab that I think will be a very good preface for this presentation. As you are aware, with the increased digitalization, networks are growing very rapidly in enterprises, and it's very difficult for any organization to know fully what assets they have available, and how are they networked with each other, and also, what are the cyber vulnerabilities that are there that they are not aware of because of this slowly evolving infrastructure that is not documented in the best way?
So one of the challenges that industry is facing is that when we come up with ways of mitigating risk in a lab environment, and then we try to extrapolate it to the practical world, a lot of those best practices fall to the side, as individuals will bring their own creativity, their own ways of doing things. And so it's very important to have constant sharing of information and education and training to create a group of like-minded people who will design and build and run networks in a consistent manner. There are too many degrees of freedom today in the enterprise network.
So I'm going to share with you a couple of ideas, and then I'll pass it on to Gene.
First and foremost is the need for network hygiene. Network hygiene means that data should only move in networks on a need basis, and access should only be given on a need basis, and need is defined by use cases. Use cases are transactions that occur between two humans, between two systems, or between a system and a human.
So as an enterprise – the information technology people, the operational technology people, the business leaders, the users – all should agree on what those established use cases are. Figure out which nodes need to communicate to support those use cases and only allow those protocols to communicate between those pairs of IP addresses. If you allow access to more than that, you are then creating attack surfaces. So this old paradigm of putting a big fence around with air gapping and firewalls is not going to be sufficient. You have to lock down your network and make it very quiet, quiet meaning that data is only moving because it's supporting a use case. Minimize broadcast storms, multicast packets, and things like that.
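The allow-list idea described here can be illustrated with a minimal Python sketch. It is not any product's implementation; the addresses, protocol names, and the helpers `is_permitted` and `unexpected_flows` are hypothetical and exist only to show how "use case" can be reduced to a set of permitted source, destination, and protocol triples.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src: str       # source IP address
    dst: str       # destination IP address
    protocol: str  # application protocol observed on the wire


# Hypothetical use cases agreed on by IT, OT, business leaders, and users.
ALLOWED_FLOWS = {
    Flow("10.0.1.10", "10.0.2.20", "dnp3"),
    Flow("10.0.1.11", "10.0.2.20", "modbus"),
}


def is_permitted(flow: Flow) -> bool:
    """Return True only if the flow supports an agreed use case."""
    return flow in ALLOWED_FLOWS


def unexpected_flows(observed):
    """Return the observed flows that fall outside the allow-list, in order seen."""
    return [flow for flow in observed if not is_permitted(flow)]
```

In a network that is this quiet, anything returned by `unexpected_flows` is by definition traffic that supports no agreed use case, which is what makes it stand out.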
When you create an environment like that, with strict policies on role-based access control at firewalls and at – and with network access control list policies on the switches, then you create a hygienic network in which the next level of sophistication of cyber security starts working, which is intrusion detection, because anomalies are behaviors that are not normal, and in a very quiet network, those anomalies stick out like a sore thumb, and then it is possible, with both malware-based, signature-based for malware detection, IDS systems, as well as context-based IDS systems, start working very effectively with minimal false positives.
In addition, you also want to put some hardware layer filters on very specific transactions where the degrees of freedom are very few, and if the command that is being requested is not – should not be allowed, or doesn't fit in that very narrow definition of what's allowed, then that should – that packet should be dropped. And this is particularly important as we connect distributed energy resources to the grid, and we have millions and millions of IP nodes belonging to third parties that are trying to link to asset owners, like utilities.
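A narrow command filter of this kind might look like the following sketch. The speaker does not name a protocol; Modbus is assumed here purely for illustration (function codes 3 and 4 are its register read operations), and the register window and the function `should_forward` are hypothetical. A real filter would sit in hardware or firmware rather than Python.

```python
# Assumption: only Modbus reads are part of the agreed transaction.
# 3 = Read Holding Registers, 4 = Read Input Registers.
ALLOWED_FUNCTION_CODES = {3, 4}

# Hypothetical window of registers the remote party is allowed to read.
ALLOWED_REGISTERS = range(0, 100)


def should_forward(function_code: int, start: int, count: int) -> bool:
    """Return True if the request fits the narrow definition of what is allowed.

    Anything else (writes, unknown function codes, out-of-window
    registers, empty requests) is dropped.
    """
    if function_code not in ALLOWED_FUNCTION_CODES:
        return False
    if count < 1:
        return False
    last = start + count - 1
    return start in ALLOWED_REGISTERS and last in ALLOWED_REGISTERS
```

The design choice is default-deny: the filter lists what is permitted and drops everything else, rather than trying to enumerate bad commands.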
So with that background, now that we've talked about network hygiene and context-based intrusion detection and also the signature-based malware IDS systems, then we have some semblance of control over our cyber security posture. So to talk about network detection and anomalies, we have here Gene from ProtectWise, who's going to talk first conceptually about the subject, and then show how ProtectWise technology helps support that. So with that, Gene, the floor is yours.
Gene Stevens: Thank you, Erfan. Thank you, everybody, for joining us this morning. As Erfan mentioned, my name is Gene Stevens. I am the cofounder as well as the CTO of ProtectWise. We are an enterprise and industrial security company based out of Denver, Colorado.
So what I'm going to do this morning is walk you through some aspects about power-oriented systems and threat detection and response with an emphasis on visualization. I do want to let everybody on this webinar know that I am severely under the weather this morning, and I have a bad cough, so if I mute myself for only a second, that is to spare us all from unnecessary noise on the line. So thank you so much for your patience.
So let's go ahead and get into this, a little bit of our agenda for this morning. We are going to do the nearly obligatory review of the Ukraine power grid cyber attacks from 2015/2016. We will then move on to analyzing the current landscape in IT and OT security. And we will talk about innovative solutions to secure power systems. This will be – I will talk to you a bit about some of the technology the team at ProtectWise and I have built.
Then we'll move on to the visualization aspects of it, and some case studies, and then finally, we will finish with some question and answer. So I'm going to target talking for about 40 minutes, and leave some time left over for question and answer. Joining me on the line today who will be participating in the Q&A session in particular is David Hatchell, who is running the industrial security business for ProtectWise. So without saying anymore, let's go ahead and jump into this. And by the way, I'm very excited, because I don't tend to be able to do webinars very much these days. The team normally does this, and so it's been a while, and I'm really excited to be able to do this again.
Let's talk a little bit about the Ukraine power grid cyber attacks. Now I'm sure that most of the people on this line are pretty familiar with this story. We're just going to walk through it somewhat briefly, talk through some major elements, and use it as a staging ground to illustrate and establish a common understanding as we go into other characteristics of the IT and OT landscape.
And if you remember, in late December 2015, the Ukrainian Oblenergos – excuse me – reported a service outage to their customers. A number of substations were disconnected for a few hours, and apparently, this attack also impacted additional portions of the distribution grid, and forced operators to switch to manual mode.
The media at the time reported on it, and determined that a foreign attacker remotely controlled the SCADA distribution management system. It was later found that three different energy companies were attacked within about 30 minutes of each other, which resulted in outages affecting roughly 225,000 people with loss of power. So the Ukraine officials, as you may remember, claimed that the outage was actually a cyber attack, and that the Russian security services were responsible for the incident.
We are not going to be participating in attribution at all in this talk. It is going to be sufficient for us to note that these attackers demonstrated themselves to be sophisticated, adaptive, and very well-resourced. Of course, the US government and private companies also participated in this analysis, and I want to give special credit for some of these illustrations that we're going to be using this morning, they're sourced from the E-ISAC. This analysis is available online. I would encourage you, if you have not read it, to please take time to read this and to also recognize that the writing on this has been used to help scope some of this conversation as well. So thanks to the team on that end who put this together.
So some things worth noting in this, some really high level key takeaways, are that this attack represented a disruption in physical process. This is very special. This attack – the attackers specifically targeted the distribution substation systems, and this is interesting because these would normally be out of scope, right, for NERC CIP regulation. Additionally, the attackers used traditional attack methods, such as use of KillDisk for manipulating master boot records, etcetera, along with ICS specific tradecraft – for example, developing firmware specifically for the serial to Ethernet converters.
Additionally worth noting is that they used a mix of nontraditional methods, such as denial of service on the call center in one of these cases, and a combined attack in order to disrupt this power company, and that – also worth noting at a very high level is that the attackers gained a foothold in the IT networks of these electric companies, and they used this foothold and this persistence in order to harvest the credentials and information to access then the ICS network, and we'll get into more of that here in a moment.
Probably worth stating that all things considered, these attacks are relatively low impact, given how they were executed, a few hours and a couple of hundred thousand people were affected. Conversely, the companies who were involved in this will probably rate these as very high in terms of the reliability of their systems and impact on business operations. So please, as I mentioned, read the report.
Let's break this down a little bit further. So with regard to these Ukraine power grid cyber attacks, there's a set of certain interesting capabilities. The attackers demonstrated a variety of different abilities and approaches, including things such as phishing emails, the manipulation of Microsoft Office documents that contained malware to gain a foothold. They also demonstrated – they produced some variants of the BlackEnergy 3 malware, which has many advanced mechanisms. Obviously, BlackEnergy goes back quite a few years, but version 3, which was relevant at that time, had features such as defense against virtual environments, so your traditional anti-sandboxing, anti-detonation detection. They had defense against debugging methods. They had numerous checks inside of the software that would kill the program if it were to detect any security functions or countermeasures in the environment.
The attackers also were able to develop two SCADA hijack approaches, one a remote SCADA client software, and another being a remote admin tool at the OS level. And the attackers demonstrated willingness to, one, target field devices at substations; two, write custom malicious software – for example, firmware for the serial to Ethernet converters that we just mentioned; and three, in one case, and this is pretty interesting, they were able to use the telephone systems in order to overwhelm the call center.
So major technical components here, some of these outlined here on this diagram, which comes from the same report: spearphishing was used to gain access to the IT networks. We were able to – the researchers were able to identify the existence of BlackEnergy 3 at each of these companies. There was theft of credentials from the IT networks. The use of VPNs to enter the ICS networks, the use of remote access tools for issuing commands similar to how an operator HMI would. There was also the existence of serial to Ethernet communications devices impacted at the firmware level. There was also use of this modified KillDisk that we mentioned a moment ago to erase the master boot record of these systems, as well as targeted deletion of logs. We also saw the utilization of UPS systems to impact connected load with the outage, and then, of course, the call center DoS, denial of service.
So regarding the opportunities that were present here for the attackers, there was no real network security monitoring in these environments. There was no intrusion detection. There was no – really any technology existent to prove or identify that some of these things were going on.
Additionally, the attacker had a huge amount of time in which to persist, and then plan the attack, on the order of months, six months or more. That is a tremendous amount of time, meaning that the attackers don't have to have everything figured out up front in order to be very effective.
So let's move on. With regard to this attack, you – people on this webinar are probably somewhat familiar, if not largely familiar, with the concept of the kill chain and kill chain progression. It was popularized by Lockheed Martin a number of years ago. They called it the cyber kill chain. But it's kind of evolved and taken on a life of its own. Many different organizations break things out a little bit differently. This is the E-ISAC report's breakdown in the kill chain stage. You see a seven stage approach in here during stage one of the intrusion, which is largely on the IT side, and then you see several stages being enacted in stage two, which is the actual target ICS attack.
Now with regard to this, it's worth noting that it's very common to see a lot of these OT attacks actually start out as an IT attack. So this convergence between IT and OT is something that kind of has to happen, because one is being used to compromise the other. So some key takeaways from this, and there's a lot you can read in this report, but the attackers worked systematically. They penetrated the IT and the OT barrier of the organization, once they understood how to harvest credentials in order to gain VPN access to the system.
Second, the long periods of time between the C2 and the execution of the ICS attack allowed them to develop, then test and deliver the custom modules – for example, the ICS firmware – needed for the ICS attack. This is a really powerful opportunity for the attacker.
And third, that there was a need for packet capture replay and signatures in this environment, because a lot of this type of long-baked attacks, it would not be enough to look a single time with the evidence you're able to gather at a single point in time. In order to be able to research and affirm the findings here, you had to be able to go back in time over and over again and reanalyze the past, reanalyze the data, and look at it again and again with new findings in order to come to appropriate interpretation.
So like I said, it's worth reading the report. There's a lot more information in here. We will, however, during – later in this talk, we will pivot to our visualizer, where we will show you a kill chain oriented approach as well. We think that movement along the kill chain axis is a really good way of identifying different stages within an attack, and seeing that progression gives you a sense of motion and movement that suggests where the attacks are going to go, which is a really interesting take on how you would do attack prediction without needing to necessarily predict the exact outcomes.
So let's – that's kind of history. Most of us are somewhat familiar with that. But let's have that as sort of the baseline. Let's talk a little bit about IT personas, roles, responsibilities. I'm sure that the people here have a mix of OT as well as IT background, and maybe even some of us have a mix of being concerned about both of these at the same time. In traditional IT, the responsibility, they tend to have a certain amount of governance responsibility, oversight, with regard to the operational technology aspects of the business. However, they generally do not have any authority or control over the OT. I see this shifting a little bit in some really seriously lean forward organizations, but in general, no authority or control.
The IT team is really focused on preventative measures, which are largely rejected by OT. And this makes a lot of sense, that the OT teams would reject this. Something in an IT world, for instance, an email not getting delivered because it was blocked from being sent, is one thing. That's pretty easy for a business to survive such an event. Having function codes or communications dropped between controllers, that can be pretty disastrous for a plant or a remote facility.
The IT team in general actually has no real visibility in what's really happening in these OT systems. And in general, and to the credit of the OT teams, there is a lot of distrust, because side effect oriented IT measures are very negative in the OT environment, and in general, historically speaking, it has been a bad idea to take the same approach in both scenarios.
Let's talk about the operational tech side of things. In this environment, availability reigns supreme. Cyber security is wholly secondary. The number one goal of that business is to maintain the continuation of services, the availability of those services. And in general, when IT approaches these environments, stuff tends to break. We've seen plants go offline because of a minor IT patch going bonkers. And so that generates a lot of resistance, which is really understandable.
With regard to compliance and risk, I mean, this is kind of what runs the company. I need to plan on what to secure tomorrow, and can't everyone just get along enough to finish this audit? The compliance/risk auditor has a different hat on. The officers in these different areas, they all wear different hats, have different perspectives. And they are now, at this point in time that we are inhabiting, learning how to get along, learning how to understand each other's concerns, and learning how, more importantly, to develop technology that is strategically valuable in tying these environments together, so that we can see the pivot from IT to OT as the really important pivot in these attacks, and trace it together in one coherent system.
So let's talk a little bit about these classic priorities. In some ways, this is a little bit redundant, but – from the previous slide, but you get the sense that in a sense, the priorities are nearly reversed. We go, on the IT priorities, from confidentiality to integrity down to availability as a last concern. It would be exactly the opposite in the OT environment, where availability is a top concern, and confidentiality is probably the least concern.
However, and this is something that we've seen in many of our accounts, is that the common ground between these teams has to do with safety, and in fact, for ProtectWise, our largest growing industry right now is the energy sector. That is the fastest growing segment of our business. And a lot of that budget that is being created for cyber security in these industrial facilities is coming out of the safety budget, where the availability and health function of the systems and safety are intertwined.
Okay. Let's move on. Regarding some – there are some really serious detection challenges in industrial environments. In many ways, this is like stepping back in time, when the IT world had very little in place with regard to cyber security detection and controls. There's a lot of things that are hard to address. I mean, these are known challenges. There's a lot of creative thinking being done inside of the industry, and it's really kind of a fun time to be alive as a researcher. It's a really fun time to be alive as a technologist. But it's also a pretty serious time in which to be acting as a professional, because these systems have challenges that are not easily overcome.
So for instance, with regard to ICS, industrial specific protocols, there are major open and proprietary protocols which were designed for industrial use with no auth. There is kind of a sense of security by obscurity, as well as this grandfathered notion that, well, this system should be air gapped. But like Erfan said, that doesn't actually work anymore. And so that assumption is no longer safe, and the lack of auth in these environments is a pretty big deal. It's a big challenge for these organizations.
With regard to static network environments, these networks have long continuous flows with no ability to really reset communications. Now we noticed this when we're in these environments, that we see network connections that happen between industrial systems that last for over a year. You know, they just go on indefinitely. And unlike in the IT world, where they're used to being able to reset network communications, you can't reset this network communications. You can't drop it and recreate it. You cannot assume that missing a few milliseconds even will be safe for the system on the other end that's missing commands or missing heartbeats or missing other things. That's not an okay assumption. So we cannot simply drop communications and let healthy communications reestablish themselves.
So regarding asset and vulnerability information, there is not a find and fix mentality like there is in IT. And in fact, because of this, that means that there's very limited information available with regard to assets, databases. A lot of the people we have worked with have resorted to simply manually typing assets into spreadsheets, and trying to make sense of it that way. It's really kind of limited, and the team is underserved by a lot of the technology that they have.
Related to this, there is very minimal logging and detection capabilities in these environments. For example, devices like the serial to Ethernet converters, they don't have the ability to log or use SNMP. These are basic things that are pretty common on the IT side that are nonexistent on the OT side. So this is a challenge, to be able to passively harvest information about how these systems are performing.
We also see a lot of purpose-built devices, for example, devices like PLCs or RTUs, operating off of real time OSs or microcontrollers. There is a lot – there's a huge amount of variation and a lot of difficult challenges with regard to how you manage these things.
Oftentimes in these environments, it's a real challenge that security is often provided by the automation vendor. They provide security controls realistically with very minimal ability to monitor. Config files are not open, etcetera. They're not really – they're designed oftentimes with a basic assumption that this is the information you want to know, given normal conditions against which this device was tested, which are not necessarily the same conditions in which attackers are trying to exploit it. And so monitoring and logging detection on that front is really going to be challenged by a lack of information.
There's also going to be really difficult access to network infrastructure. In a lot of these environments, it is hard to get switch access or a SPAN port to monitor off of. These are things that people are used to on the IT side, a little less common on the OT side.
So let's take this a little bit further, though, however. ProtectWise, as I mentioned, we have been a company – we're about – closing on five years old, based in Denver, Colorado, myself, my cofounder, who serves as the CEO of this company. We came – we were most recently at Intel Security, which is McAfee, also with David Hatchell, who will be participating later in this call, who ran industrial security there as well.
But we actually took a lot of that time talking to the teams, meeting with people, and we came away with a basic conviction that if we were to [inaudible] this kind of a problem surface anew, we would approach things fundamentally differently.
So what we did, after moving on from Intel/McAfee, is that we built what we called the ProtectWise grid. And now this is an important notion. This is really a nod to the – like the power grid, for instance, so the utility grid. We have this notion that we're able to present an automated threat detection and response platform for any network across all time. So this is a really ambitious undertaking. When I say any network, I mean the traditional enterprise as well as the cloud infrastructures that these companies have, and in the industrial environment.
So for all the teams everywhere who are having the mature conversation, saying, great, our IT program is pretty mature, we're running it well, things are going well from a cyber security perspective, but we're not an IT company. We're actually a manufacturing company, or we're an energy company. So what about the actual business that our customers pay for? What about that? Being able to pull all that together is really powerful.
So what we've built is this platform that we deliver entirely from the cloud to provide threat detection and response and visualization as a service in a grid-like manner, representing a ubiquitously available resource that is highly reliable, that serves as a basis for which these teams and these programs can build far more interesting organizational structures and security programs on top of.
So let me tell you a little bit about how this works and how we think this is actually pretty relevant for the problem domains that we were just talking about, and how relevant it is for the energy space in particular. So what we have done is that we give people these lightweight software sensors. These organizations, they get these sensors, they are tiny. They're about 15 megabytes in size. And the idea is you don't buy them. They don't cost anything. Sometimes they can be packaged in hardware. We provide that as well. But sometimes they can run inside the virtual infrastructure, such as virtual machines inside of these networks. They are typically attached to a SPAN or a tap, but don't necessarily have to be used in that way.
And what they do is they do packet capture, and they do low level deep packet inspection of every single piece of communication that's going out to the wire that they have been given visibility into. And then they take that data and they compress, they optimize, and they stream that to the cloud in a near real time manner, meaning network latency bounded only, in order to do rich analytics in a centralized way from the cloud perspective, but what's actually more interesting, in my opinion, is that we actually take a copy of that data and we store it for an unlimited amount of time.
And the reason why we do this is that whenever there's a shift in the threat intelligence landscape, such as a new vulnerability is discovered, a new [inaudible] is discovered, we take the indicators of compromise, the tactics and techniques, and we codify them, push them into the system to do the traditional real time analytics, but also to be able to go back in time to discover the previously unknown. We'll talk a little bit more about that here in a moment, the ramifications for these types of environments.
But you can imagine this being inside the egress of the organization, inside of the core of the network, inside of the cloud, inside of these industrial locations and remote facilities, etcetera, and all this contributing to a single haystack in the cloud, which then can be made coherent and useful for being able to reason about this entire stack as a coherent whole, from a security perspective. So being able to trace this movement and rationalize all the signal across all these assets in one coherent system to make sense of the entire picture.
So let me tell you a little bit about this time machine for threat detection. So I mentioned that we like to store the data for an unlimited amount of time, and the reason being is so that we can go back and look for something that nobody knew how to look for before, and see, wow, six months ago, based on what we know today, this is pretty meaningful.
So the types of indicators that go into a system like this that should be part of any mature security program are forms of detection, such as signatures and heuristics, anomaly detection, behavioral analysis, etcetera, and machine learning classifiers. All these things, a battery of different expert capabilities pulled together into a single system that is firing around this notion that many different forms of detection are required to get a really comprehensive view of what's moving across these assets, and to be able to make sense of how they relate to each other.
And there is a lot of technology in the world that makes the claim that if we do this one single approach, and machine learning is the magic buzzword of the moment, to say, hey, if we have this one single approach to machine learning or behavioral analysis, we'll be able to find anything. That is largely false. What we believe makes a much more mature program is a service that actually pulls together many different forms of expert systems, all with their own expertise and their own capabilities to execute a detection strategy. So that careful coordination is a big deal.
Obviously, this happens in a real time sense. Traditional real time analytics, which you're probably used to in many environments. But as these indicators are loaded in this system, they also act like a trigger, which tells us what to look for in the past. And so we're able to execute this model that we call retrospective analytics, this analysis of being able to take what we know now to reinterpret what happened six months ago, a year ago, or yesterday, and be able to look for things that nobody knew to look for before, and to take what those findings are, and then to use them to influence what we look for in the present. So the state that comes out of that should change how we think about what's happening here in real time.
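The retrospective model described here can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how the ProtectWise platform works internally: the stored "memory" is an in-memory list, the only indicator type is a known-bad destination address, and the names `FlowRecord`, `RetrospectiveStore`, `ingest`, and `add_indicator` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FlowRecord:
    seen_at: datetime  # when the communication was observed
    src: str           # internal host
    dst: str           # remote address it talked to


class RetrospectiveStore:
    """Keeps every record so new indicators can be replayed over the past."""

    def __init__(self):
        self.records = []        # full history, never discarded
        self.indicators = set()  # currently known-bad destinations

    def ingest(self, record: FlowRecord) -> bool:
        """Real-time path: store the record and check it against known indicators."""
        self.records.append(record)
        return record.dst in self.indicators

    def add_indicator(self, bad_dst: str):
        """Retrospective path: load a new indicator and return past matches."""
        self.indicators.add(bad_dst)
        return [r for r in self.records if r.dst == bad_dst]
```

A record that looked harmless when it was ingested is surfaced later by `add_indicator`, and the same indicator then applies to everything ingested from that point on, which is the back-and-forth between past and present that the talk describes.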
And so this present and the past having a conversation back and forth with each other is a form of continuous analysis, which represents a system that can continuously improve its own answer, improve its own understanding of its circumstances. So this kind of approach to intelligence that's about a system that is able to change its mind based on becoming smarter in the course of time.
And of course, with regard to being able to move through time, we have this element of predictive analysis, to be able to use, for example, a kill chain oriented analysis, to look for movement and create things such as an early warning system to say that based on these patterns, we believe that this is on its way towards becoming successful in its actions on its objectives. And of course, being in the cloud means that there's a collective and correlated opportunity to tie all these different systems together to benefit any one system from the lessons learned amongst the many.
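One way to picture a kill chain oriented early warning is the following minimal Python sketch. It assumes the seven stages of the Lockheed Martin cyber kill chain mentioned earlier in the talk; the host names, the default threshold, and the functions `furthest_stage` and `early_warnings` are hypothetical and are not taken from the ProtectWise product.

```python
# The seven stages of the Lockheed Martin cyber kill chain, in order.
KILL_CHAIN = [
    "reconnaissance",
    "weaponization",
    "delivery",
    "exploitation",
    "installation",
    "command_and_control",
    "actions_on_objectives",
]
STAGE_INDEX = {stage: i for i, stage in enumerate(KILL_CHAIN)}


def furthest_stage(events):
    """Map each host to the furthest kill chain stage observed for it.

    `events` is an iterable of (host, stage) pairs.
    """
    furthest = {}
    for host, stage in events:
        current = furthest.get(host)
        if current is None or STAGE_INDEX[stage] > STAGE_INDEX[current]:
            furthest[host] = stage
    return furthest


def early_warnings(events, threshold="installation"):
    """Return hosts at or past `threshold` that have not reached the final stage."""
    limit = STAGE_INDEX[threshold]
    final = len(KILL_CHAIN) - 1
    return sorted(
        host
        for host, stage in furthest_stage(events).items()
        if limit <= STAGE_INDEX[stage] < final
    )
```

The warning is based on movement along the chain rather than on predicting the exact outcome: a host that has progressed to installation is flagged before any action on objectives is seen.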
So let's go through some case studies, and we'll talk about some of these organizations. We will keep them anonymous, but we will talk about the results that they have authorized, and talk to you about their experiences. And then what I'd like to do is actually spend some time showing you some of this stuff in action, so we can see the power and presence of visualization in its role of being able to help us understand our circumstances. And then we will move on to our question and answer session.
So this first company that we're talking about here, it is a small energy cooperative, so a very small coop, and this is on the small end of the spectrum. There's a person here who is effectively a one man SOC, as he describes himself. And they were really looking for a forensic tool that could lighten their workload. They don't have a lot of time to spend on this. All the time has to be very well spent, has to be incredibly precise, and they have to have all the information necessary to walk away with conviction about the value of the findings.
And so they also rely on this system in order to be able to stop attacks before they reach their actions and objectives. In many cases, they were worried about the exfiltration of data coming out of those environments.
So some of the challenges they faced is that this lean staff, they need solutions that can be deployed really quickly, they can quickly identify, validate, investigate, etcetera. And they needed a solution that had good compliance with electric power industry regulations.
Results from using this technology, from using ProtectWise, they were able to, in this example, detect Shellshock exploit attempts ahead of – in advance of an E-ISAC alert, and they were able to shift from reactive to proactive, which is a really big deal. So instead of just getting a feed of things that are going wrong, what if this small team could go into a proactive stance and automate their network detection and intercepting problems before they become meaningful issues? Incredibly important in these types of environments. And of course, they needed – they had the result of significantly reduced man hours that were previously dedicated to these kinds of investigations.
So looking to the future, they're going to be extending their network security into their OT environments, focused on this IT pivot into the OT, and this is the natural roadmap for them.
This second company I'd like to talk to you about is quite a bit larger. We can't name them, but we can say that they were challenged with a network topology that made appliance placement problematic. Many of the solutions in the market today are very much hardware appliance oriented. ProtectWise said, hey, what if we can take the analytic functionality that's normally siloed inside of these racks of hardware appliances and move them to the cloud for a really lightweight Edge approach to this?
They were struggling with environments in which racking hardware was not like physically possible, very problematic. They also were challenged with the data retention costs that were incredibly prohibitive with their on premise solutions. And so a cloud-oriented approach made a lot of sense to them. They were also really challenged with event-driven packet capture, which many systems out there execute, where they make a specific detection, and then they log some forensically useful information about it, including like packet capture, and then they do that for 30 or 60 seconds, and then they kind of wrap that up. But you tend to not have packets or information for data that's not necessarily running around triggering alarms. And so for them in their investigations, it was very challenging to investigate flows that weren't already being convicted.
So the results of being able to deploy ProtectWise in this environment is that they had really fast sensor deployment, so they were able to get these sensors out into their organization really quickly. They were able to get immediate threat detection. These things turn on immediately. It takes a couple of minutes to set them up, and the product starts functioning.
They were able to improve the credibility of their network threats detected, and they were able to reduce the time in eliminating false positives, because when you have enough data, you are able to come to conviction and validation of your findings really, really quickly, instead of that causing a series of work where you have to investigate across many different systems.
And so looking to the future, they were able to consider – they will be extending the solution to the grid modernization project to enhance their OT security. So they have this huge undertaking, and this is going to be a central piece of that.
So they gave us this quote, which I thought was pretty nice. "ProtectWise gave us visibility into areas of our network that traditional appliances couldn't reach. Full packet capture enhanced incident response by enabling us to quickly identify and prioritize the most critical threats."
So there are some other resources worth checking out. We have this white paper of "Using the Cloud to Address Compliance and Security Controls for Utility Companies." It's a great paper I would encourage reading. And let's actually go into a demo.
So I'm going to switch – I'm going to switch browser tabs here real quick and get into our visualizer. So I'm going to go to full screen, and I trust you can all see this, and please, Erfan, if this is not visible, unmute yourself and let me know.
Erfan Ibrahim: Yeah. It looks fine. I see the circles spinning.
Gene Stevens: Oh, great. Thank you very much. Okay. So what we're looking at is our visualizer. And so I'd like to show you a little bit about this product, how it works, and also show you some of these IT as well as these OT industrial types of detection, see how this kind of works together.
So as you notice, our visualizer is running inside of our browser. It's a web app, though it doesn't look like a web app. And the screen you're looking at right now is called our heads up display. And our heads up display is really designed for like a large screen in a SOC or a NOC or whatever the operations center that you have. Many people put it on a large screen. However, it works really fine just on my laptop, which I'm using to do this demo now. It works equally fine on my iPad as a roaming and remote version of this.
So I'm going to talk to you a little bit about some of what we're seeing, and then we're going to dig into some of these relevant detections to show you how some of that works and how you can gain visibility into these environments. So our heads up display provides broad situational awareness, but also pivots into deep forensic exploration.
So some of what you're seeing here, on the left hand side, we have a list of sensors. Now in this demo environment, this is actually a full end to end working version of our product. There's actual packet capture, compression optimization streaming to the cloud. All the analytics and storage is being made available the same way this works in our customer environments.
So we have a list of software sensors that have been added here. There is a family of cloud-oriented ones, a family of enterprise sensors, and four industrial/process control network sensors that have been deployed. And so we see manufacturing, power plant, refinery, turbines, etcetera. This high level overview basically gives you a sense for what's deployed and what is their experience from a threat perspective. We have these little threat meters here. But also, like answering basic questions, like are they online? Is everything working? Is the system flowing the way it's supposed to? Do I need to troubleshoot anything?
However, your eyes are probably drawn to these hero graphs here in the middle. This one here is a connection graph, and what it's doing in this view is giving a real time breakdown of all the network activity happening inside of these facilities, inside of the IT, inside of the OT, as well as its relationship to the outside world. So what we have here is all the connections from the inside of the organization to the outside world as defined by the organization, broken up by geography, and it uses really simple metaphors, like blue is clean, red is bad, yellow and orange are somewhere in between. And I have seen this scale into the many millions of connections per second in some of our large customer networks.
To the right hand side here we have what we call our attack spiral, and what this is is our illustration of the kill chain. And if we watch this long enough, it should start to populate, but what this does is show stages and progression within an attack. It's everything from this relatively innocuous outer ring here called – where we have reconnaissance, to delivery, to exploit, and we focus on beaconing quite a bit, to command and control, communications [inaudible] activity, lateral movement, fortification, other types of related post-hardening events, as well as actions on objectives, which in most cases is data theft or exfiltration, but is quite often in the OT environment simply meant to impact the systems that are under attack.
To the bottom here, we have what we call our network timeline, and this functions very similar to like a financial timeline, where you can pinch and zoom and pan around and go as far back in time as you want to be able to go to see detections found at that time, but also see the state and health and performance of the network devices as they flowed across this timeline here.
To the right hand side we have a list of what we call priority events, which are very similar to what like Gmail would have in its priority inbox, the idea that there are a number of different types of things, messages, inside of this system, but there's a subset of them that are the really good use of your time, either because they're really severe and pretty easy to identify, such as this critical lateral movement is probably a hash dump or something really easy to identify, or it could be because we're seeing movement along the kill chain axis, which implies something's in the middle of becoming successful and working towards its objectives.
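The idea of prioritizing a host because its detections are moving along the kill chain can be sketched roughly as follows. This is a minimal illustration, not ProtectWise's implementation: the `Detection` record, the `hosts_progressing` function, and the numeric ordering of the stages are all assumptions made for the example, with stage names taken from the order described in the talk.

```python
from dataclasses import dataclass

# Stage names follow the order described in the talk; treating them as a
# strictly ordered list is an assumption made for this sketch.
KILL_CHAIN = [
    "reconnaissance",
    "delivery",
    "exploit",
    "beaconing",
    "command_and_control",
    "lateral_movement",
    "fortification",
    "actions_on_objectives",
]
RANK = {stage: i for i, stage in enumerate(KILL_CHAIN)}


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record: which host, which stage, and when."""
    host: str
    stage: str
    timestamp: float


def hosts_progressing(detections, min_stages=2):
    """Return hosts whose detections reach at least `min_stages` successively
    deeper kill chain stages when replayed in time order."""
    by_host = {}
    for d in sorted(detections, key=lambda d: d.timestamp):
        by_host.setdefault(d.host, []).append(RANK[d.stage])

    flagged = []
    for host, ranks in by_host.items():
        deepest = -1
        advances = 0
        for r in ranks:
            if r > deepest:  # moved to a later stage than any seen so far
                deepest = r
                advances += 1
        if advances >= min_stages:
            flagged.append(host)
    return sorted(flagged)
```

A host that only ever shows reconnaissance stays out of the priority list, while one that moves from reconnaissance to exploit to lateral movement is flagged even if no single detection is severe on its own.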
So all this together, really broad situational awareness, will give you a sense for the overall health and performance of the network, see how things are acting and progressing. However, for the interest of time, we're going to go ahead and dig into this next area. We call this our kill box, which is basically our inbox for detections.
And here, what we have is a list of different detections. We have this concept of events, as well as observations, and events are often comprised of many different forms of activity. For instance, let's actually pick one from our heads up display. I'm going to pivot into there and find something that's probably pretty interesting. Let's look at this kill chain progression here. We will go ahead, and we could actually download the data that comprised these actual transactions, so if you wanted a copy, but we're going to pivot back into the kill box.
And to show you this kind of who, what, when, where, and how. This one internal host, these two external hosts, have many different forms of detection firing together. There's probably a mix of intrusion detection as well as static file classification, response codes from industrial systems are being analyzed. The typical benefits of genuine deep packet inspection being pulled together to show you the families and clusters of activity and to show you movement and communication patterns between these types of hosts, whereby a person can actually dig through the different components of this. For instance, I can look at this – these exploit opportunities here. I can actually look at the raw underlying communications, because we do packet capture at great scale in the cloud. I can actually search through this, as well as download all this data, this PCAP download, if I wanted to.
And then I can also actually pivot into seeing a little bit more about how we integrate with the existing environment. So this is definitely on the IT side, as well as look at what intelligence rules are producing this, get a real good, solid visibility into how this determination is being made.
What I want to do, however, is point out that these – this one event has six different detections that are being pulled together. If you remember earlier, we talked about having a single form of detection is utterly insufficient to having a mature program inside of an environment. What you want is many different expert systems pulled together. What we have done is created what we call a hierarchy of experts model. The phrase is not new to us, but I think we're the only ones using it in cyber security at the moment. It has the idea that we can automate many different forms of detection, many expert systems, the same way human experts tend to coordinate and collaborate in the sense that if you were to sit around a table with other people with different backgrounds and different experiences, and to have a mature conversation with them, you would have a very basic experience.
And when the experts around the table begin to agree, you know that you're really onto something. If the experts disagree, maybe you need more time or more data. And each situation will vary, but you get a sense that what we're able to do, then, is put these together in a very basic neural net in order to form many different forms of detection together, [inaudible] observations, and I'll show you some of these here in a moment. [Audio glitch] organized can be constructed to automatically produce far more interesting structures that we call events, which we're looking at right now.
So without really nerding out on that too much, I'll roll that back a little bit. What we're going to do now is look a little bit into some of the industrial specific detections here. So I'm actually going to go and pick some of these industrial sensors that we saw on the previous screen. We have manufacturing, we have a power plant, we have a refinery, a turbine. Boy, this company does it all. So this is a great demo.
What we can actually do is look at the different kinds of detections that are in this environment, so we can see there's a lot of like heuristic detections, payload detections, etcetera. We can actually filter this out and look for different other types of expert systems that are contributing in here, such as – let's look at heuristics, purely. We can see things such as these possible failed Modbus connections. This could be actually a pretty big deal, depending on the environment that you're in, where master PLCs, for example, do not scan known Modbus [inaudible]. So we can see that port scanning here in this list, but we can also see an industrial environment, master PLCs are generally configured to make Modbus connections to [inaudible] devices, and to know if the device supports Modbus or not. If a PLC device is trying to connect to port 502, for example, it is generally a sign of a potential attack. So this would be doubly suspect if the device rejecting the TCP request is a general purpose computer, such as a laptop or an HMI, and not necessarily an embedded industrial control device. So you can kind of see the value of something like that. That's very basic heuristic type of detection.
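A heuristic of this kind, flagging rejected connection attempts to the Modbus/TCP port, might look something like the sketch below. The `Flow` record and the `failed_modbus_connections` function are hypothetical names for this illustration, and the assumption that a sensor can already tell whether a connection attempt was rejected is ours, not something stated in the talk.

```python
from collections import Counter
from dataclasses import dataclass

MODBUS_TCP_PORT = 502  # registered port for Modbus/TCP


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow summary produced by a passive sensor."""
    src: str
    dst: str
    dst_port: int
    rejected: bool  # assumed: True if the destination refused the connection


def failed_modbus_connections(flows, threshold=1):
    """Count rejected connection attempts to port 502 per (src, dst) pair and
    return the pairs seen at least `threshold` times."""
    counts = Counter(
        (f.src, f.dst)
        for f in flows
        if f.dst_port == MODBUS_TCP_PORT and f.rejected
    )
    return {pair: n for pair, n in counts.items() if n >= threshold}
```

Raising `threshold` trades sensitivity for noise: a single rejected attempt may be a misconfiguration, while repeated attempts against several destinations looks more like scanning.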
There's a few others in here we can look at, but let's go ahead and look at, for instance, anomaly detectors. So this is where some of our machine learning and heuristic approaches also converged, both being sources of anomalies, where we can look at things such as this inconsistent length in this Modbus request. This is pretty strange.
So what we can do is actually expand this and see more details about this. So we can see a function code 43 in this Modbus example, which is this encapsulated interface transport. What's curious about this is that we expected to see a length of four, but we actually had a length of five. And it's known in Modbus types of attacks that a potential weakness in these systems is to pack other function codes behind function codes that are going to be accepted by certain controllers or certain devices. And so this is really suspicious. You should never see this. That stuff should be computing accurately. It suggests somebody's packing other information inside of there.
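A length consistency check of the sort described here could be sketched as below. The talk does not say exactly which length was compared, so this example makes two assumptions: that the MBAP header's length field should match the bytes that actually follow it, and that a function code 43 request with MEI type 0x0E (Read Device Identification) carries a 4-byte PDU. The function name `check_modbus_request` is hypothetical.

```python
import struct

MBAP_HEADER_LEN = 7                # transaction id, protocol id, length, unit id
FC_ENCAPSULATED_INTERFACE = 0x2B   # function code 43
MEI_READ_DEVICE_ID = 0x0E

# Assumed expected request PDU sizes, keyed by (function code, MEI type).
EXPECTED_PDU_LEN = {(FC_ENCAPSULATED_INTERFACE, MEI_READ_DEVICE_ID): 4}


def check_modbus_request(frame: bytes):
    """Return a list of anomaly descriptions for one Modbus/TCP request frame."""
    if len(frame) < MBAP_HEADER_LEN + 1:
        return ["frame too short to hold an MBAP header and function code"]

    anomalies = []
    _tid, proto, length, _unit = struct.unpack(">HHHB", frame[:MBAP_HEADER_LEN])
    pdu = frame[MBAP_HEADER_LEN:]

    if proto != 0:
        anomalies.append("protocol id is not 0")

    # The MBAP length field counts the unit id byte plus the PDU.
    if length != len(pdu) + 1:
        anomalies.append(
            f"length field says {length}, but {len(pdu) + 1} bytes follow"
        )

    function_code = pdu[0]
    if function_code == FC_ENCAPSULATED_INTERFACE and len(pdu) >= 2:
        expected = EXPECTED_PDU_LEN.get((function_code, pdu[1]))
        if expected is not None and len(pdu) != expected:
            anomalies.append(
                f"function 43 PDU is {len(pdu)} bytes, expected {expected}"
            )
    return anomalies
```

An extra trailing byte after an otherwise well-formed request is exactly the kind of thing this would surface, which is consistent with the concern about other function codes being packed behind an accepted one.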
So as you can see, many different types of detection can be pulled together, delivered entirely from the cloud. There's so much more I would like to show you here, but I don't have a lot of time for this, and probably should skip over to our open ended search of the product, and then to Q&A.
So if the heads up display and the kill box kind of guide people towards the threat that matter most, based on the intel and indicators noted in the system, very much detection oriented, the Explorer, however, is the open-ended inquiry of the entire network environment, these entire plant and facility environments, and IT environments, and OT environments, the reason being that there is a huge difference in the behavior of a team between simply responding to alarms and trying to get really good at that, which is really what most security companies do and encourage. That's very reactive.
What we're encouraging is a very proactive stance in this that is oriented around hunting and patrolling and being able to ask open-ended inquiries of the entire haystack to do things like look for specific function codes. Have they ever moved across the entire network? Look for certain things such as file hashes, file sizes, domains, URLs, stuff you can find in IT networks as well as OT networks. Being able to say – maybe I get a list every day from people in my working group, other peers in the industry, saying, here's a couple of things that we found in our environment. Just letting the group here know that these have been problematic for us, that kind of basic sharing that's becoming more common in some of these – like the EnergySec group, the other groups that are working to foster collaboration in this space, some of what NREL is doing here as well.
All these things kind of working together, you're able to say, well, okay, I'm not getting an alarm on this, but I'm kind of curious. Has this file ever moved across this environment, for all time? If so – yes, apparently a few times. If so, where did it go, who are these devices, where are they located, and, you know, can I have a copy of this data?
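The kind of retrospective question asked here, whether a file shared by an industry peer has ever crossed the network, can be sketched as a simple search over stored records. The `FileTransfer` record and `find_file_movements` function are hypothetical; a real system would query an indexed store rather than scan a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileTransfer:
    """Hypothetical stored record of a file seen moving between two hosts."""
    timestamp: float
    src: str
    dst: str
    sha256: str


def find_file_movements(records, indicator_hashes):
    """Return every stored transfer whose hash matches a shared indicator,
    oldest first, regardless of whether it ever raised an alarm."""
    wanted = {h.lower() for h in indicator_hashes}
    hits = [r for r in records if r.sha256.lower() in wanted]
    return sorted(hits, key=lambda r: r.timestamp)
```

Each hit answers the follow-up questions from the talk directly: where the file went, which devices were involved, and when.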
So the basics are there. The shift from reactive to proactive is a big deal. Many different expert systems is a big deal. Being able to shift that core analytic functionality to the cloud with a utility model like ProtectWise provides, we think this is a really big deal. You can have this passive detect and respond type of approach that's very mature, that's not detect and block, that has not been there trying to truncate or deny communications across these devices, [inaudible] side effect friendly, uptime and availability friendly. This type of approach is pretty meaningful.
And so with that said, let's go ahead and use the time we have now to engage in some question and answer. And if some of it requires us going back to looking at any of this product, I'd be happy to do that. But also joining me for this segment will be David Hatchell, who will – it looks like I got signed out of my Google account – who will also participate in helping to answer some of these questions. So Erfan, do you want to say anything before we go to Q&A?
Erfan Ibrahim: No, I think it was a very timely presentation, especially for a lot of resource-constrained organizations that are facing emerging threats and cannot afford to have a large cyber security team in house to monitor these very sophisticated attacks. So having a cloud model, of course, assuming that the connections are secure, and that the cloud company is not purchased by some nefarious state, as long as that's the case, it's a good way, a nice bridge for resource constrained organizations to have a mature cyber security posture. So I think that the presentation was very timely, and important especially for small and medium-sized utilities, many of whom may be on the call today, who are very concerned about the emerging threats.
And the challenge that we're facing in the industry today is that we're spending so much time and energy on compliance that security and reliability becomes the baby that gets thrown out with the bathwater. So I'm glad that you are thinking beyond compliance as you're offering these solutions.
So Gene, are you still there?
Gene Stevens: Yes, I am.
Erfan Ibrahim: Okay. Did you want to bring Dave on? Yeah?
Erfan Ibrahim: I have – we have some – actually, we have some questions that people have posted already, so I can read those as – and then Dave and you can both respond. How does that work for you?
Gene Stevens: That works great.
Erfan Ibrahim: Okay. So we have first a comment from William Miller, who says it's a very good presentation, and then he says, question, now do you – how do you feel about what if all the sensors and actuators were individually encrypted, this being separate from transport protection? As you stated, IIoT is not protected like the IT network, using certificate authority. There is a concern that a device can be connected and supply incorrect or bad data. The IP address is insufficient, and there's no built-in security for the devices. Encryption could offer a solution when a device is connected. This approach will work on the cloud as well. Your comments?
Gene Stevens: Yes. Well, we believe there's actually a very strong future for security solutions with encrypted traffic. So as these devices are communicating with each other, there's a lot of proprietary, homegrown communication. As you mentioned, there's a lot of CA oriented, certificate authority oriented, communications. And so it's hard to get in the middle of these things. And in general, you don't want to ever do so in a way that would interrupt communications.
We see organizations making very slow progress towards this. However, we would say that where the rest of the industry talks about the network going dark from encryption, we would actually say it's better to say that it is going opaque, in the sense that yes, you can't see inside a lot of this communication. However, context, such as assets, knowledge, risk, vulnerability assessments, these types of things that they are growing inside of these organizations and inside of these environments, is pretty meaningful, and you can do a lot of stuff on the machine learning front as well that has to do with communications patterns.
So for instance, like at ProtectWise, we automatically learn the natural communication patterns of these assets, and so we can look for both new assets talking to devices – that's pretty easy to identify. That's – you can alert to say somebody new is now talking. They may be talking safely, but it is brand new, and it's never happened in the last year before, so definitely worth investigating.
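Alerting on a pair of assets that has not talked within the last year can be sketched with a small baseline of last-seen times. This is a simplified illustration under our own assumptions: the `PairBaseline` class is hypothetical, and it tracks only who talks to whom, not the richer communication patterns a learned model would capture.

```python
ONE_YEAR_SECONDS = 365 * 24 * 3600


class PairBaseline:
    """Remembers when each (source, destination) pair was last seen talking."""

    def __init__(self, window_seconds=ONE_YEAR_SECONDS):
        self.window = window_seconds
        self._last_seen = {}

    def observe(self, src, dst, now):
        """Record one conversation at time `now` (seconds) and return True if
        this pair has not been seen within the lookback window."""
        pair = (src, dst)
        last = self._last_seen.get(pair)
        self._last_seen[pair] = now
        return last is None or now - last > self.window
```

A `True` result does not mean the conversation is malicious, only that it is new and, as noted above, worth investigating.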
But you can also do things like analyze – we have a lot of success in convicting encrypted traffic on the IT side that is based around being able to do such things as analyze how the encryption was produced, like how it was generated, what kinds of ciphers it uses, the strength, etcetera. We can also analyze – and where packet capture is really useful for – is keeping – truncating these flows, keeping the first set of kilobytes that have such things as the handshakes and certificates being exchanged, to be able to have a high fidelity record of those things, so that we can both search and query them, but also convict them in real time and retrospectively, and use all this information together to say, even though I'm not really sure what function codes are going over this fully encrypted, quote/unquote, dark communication, I know that it's deeply suspicious, and I know that it's never happened before. And looking at this encryption pattern, that's not how you guys normally talk. We know that this is an outside actor of one form or another, and it's definitely a good use of your time to investigate.
And I think that actionable intelligence is really what a lot of these teams are looking for. And so we are actually really optimistic that the ability to provide value in an encrypted future is actually still very reasonable, but it's going to take some different approaches.
Erfan Ibrahim: Yeah. I think one of the challenges of encryption is anomaly detection is very hard to do, especially if you're going back and forth between clear text and encrypted to inspect. So I think that there are better options available through software defined networking and the wide area network with randomization of path that gives you actually more effective protection than encryption does, and still allows you to assemble your data and do intrusion detection at the points of assembly. Okay.
Gene Stevens: Yeah. And [inaudible] would be a big part of that, too. Yep.
Erfan Ibrahim: Yes. Tony Nobless, he has left, but he made a point. He said 15 million – I guess. He's saying 15 with a capital M – is not a tiny agent in the ICS world.
Gene Stevens: He means 15 megabytes.
Erfan Ibrahim: Yeah.
Gene Stevens: And as a security – as an alternative to an appliance, that's actually pretty small. Now our approach is not to embed these agents on the devices themselves, because we do not believe that that's how you should respect the operational integrity of these environments. So we're very much anti-embedding stuff on individual hosts. We're not an endpoint product. We like being on the network. We like being passive. We like being out of band. Compared to like a hardware appliance, this is really small.
And the reason why it can be so small is that we do all the heavy lifting in the cloud, and all the analytic functionalities produce fully out of band, fully asynchronously. But if this were a host-based technology, yeah, 15 megabytes would be big, but that is not what it is, and nor do we think that's the correct approach.
Erfan Ibrahim: Okay. So these are standalone sensors, and they have 15 million bytes of memory in them?
Gene Stevens: That's about how big they are, fully expanded. Yeah.
Erfan Ibrahim: Okay. Michael Shay asks, the lightweight software – the lightweight – let me see it again. Yeah. The lightweight software sensor does not cause any latency to the traffic flowing through the network of interest, correct?
Gene Stevens: Yes, that is correct. That is an important core principle. In certain environments in particular, even latency – even if everything were communicating safely otherwise, introducing even a millisecond of latency into standard communications can have disastrous side effects on the infrastructure.
So what we do via segmentation as well as very basic policy controls is make sure that the paths that this device takes do not intersect with paths that need to be shared with other devices that are communicating in a time critical manner.
Erfan Ibrahim: And also, that you're doing it through a process of duplication of packets, and not inline, where you're inspecting the packet as it flows, correct?
Gene Stevens: Absolutely correct. The duplication is a key win for this type of approach.
Erfan Ibrahim: Yeah, because actually, that can be more taxing than the network latency, is the processing time.
Gene Stevens: Absolutely. Yep. So you want to have zero of that. You get a copy of the traffic and you operate on that and don't slow anything down.
Erfan Ibrahim: So are you using taps for duplications of packets, or is the inline sensor creating the duplication?
Gene Stevens: So we use mostly the taps in order to do this duplication. So like an industrial environment, we have a vendor that we work with that produces a hardware-based one way tap that is tiny. It's this little tiny little device. And what they are able to do is provide strong guarantees around failover, hardware-based, flow control, that makes sure that it's not even physically possible for – under even egregious and erroneous scenarios, for any communication to get accidentally injected back into the communication network.
Erfan Ibrahim: Yeah, because in my experience here at our lab doing testing, I've found that the mirror ports on routers and switches were not a very reliable source of getting all the data, and it becomes also a point of failure, because an attacker, especially a sophisticated attacker, would be seeking them out and disabling them before they do their [inaudible].
Gene Stevens: Yeah. Yep.
Erfan Ibrahim: All right. Next is Jean Wai Ken, who says, how much false positives does ProtectWise create?
Gene Stevens: Good question. So in general, very few. I would say because – if you understand our hierarchy of expert systems approach, this is a unique – this is a unique approach to how we actually de-noise the haystack, but also get expert systems to remove false positives.
Now I actually am philosophically of the position that I'd rather have false positives than false negatives. So we err on the side of being greedy in the kind of data that we put together with regard to what we call observations that we looked at earlier inside of the product that had to do with – those are normally what you would see in like the log files from existing point products.
ProtectWise, even though you have access to that data, and I was able to show you a few, ProtectWise does not actually treat those as attacks in a direct manner. We just say they're interesting, and that's all that we know about them. When we actually create events, those are highly correlated and consolidated, and use many different forms of detection in order to contribute towards being sure that this is something that's genuinely malicious or not.
And so from that perspective, our events, it's very, very rare for people to get false positives out of the events. We are very tolerant of having false positives in the observations, and it's normal for an organization to go from 10,000 plus observations a day, which is actually pretty normal for even modestly sized IT environments, to having actually – I think our average rate now is somewhere between 5 and 12 events a day for even fairly large networks. Those are highly consolidated, highly de-noised, uses the expert systems approach in order to correct for false positives, where one expert system might have, for instance, an IDS rule that's a little too greedy, and is able to classify traffic that maybe is benign, and classify it as malicious. If you have other expert systems, such as machine learning elements [inaudible] saying, but no, that actually is within the tolerance range, that expert system can then be used to control for false positives, if that makes sense.
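The general idea of consolidating many noisy observations into a few events, and letting one expert system overrule another, can be illustrated with simple voting. To be clear, this swaps in a plain majority rule for whatever model ProtectWise actually uses (the talk mentions a basic neural net); the `Observation` record and `correlate` function are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical verdict from one expert system about one flow."""
    flow_id: str
    expert: str      # e.g. "ids", "heuristic", "ml_anomaly"
    malicious: bool


def correlate(observations, min_agreeing=2):
    """Promote a flow to an event only when at least `min_agreeing` distinct
    experts call it malicious and they outnumber the experts that disagree."""
    by_flow = defaultdict(dict)
    for o in observations:
        by_flow[o.flow_id][o.expert] = o.malicious  # keep latest verdict per expert

    events = []
    for flow_id, verdicts in by_flow.items():
        agree = sum(verdicts.values())
        dissent = len(verdicts) - agree
        if agree >= min_agreeing and agree > dissent:
            events.append(flow_id)
    return sorted(events)
```

Under this rule a single greedy IDS signature never produces an event by itself, and a dissenting anomaly model that says the traffic is within tolerance can hold back a promotion.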
Erfan Ibrahim: Yeah.
Gene Stevens: So that is a – that's a different approach.
Erfan Ibrahim: One of the ways of reducing false positives is having tools that have the domain expertise of the protocols, such that they can correlate multiple pieces of information to say that this is what the root cause of the problem is, and that helps really reduce it. If you have a knee jerk approach to just setting traps and then getting violations, you'll end up with a lot of false positives.
Gene Stevens: A good point.
Erfan Ibrahim: All right. So William Miller asks again – he says, it is important to know the characteristics of devices and their use case. This can be protected by encryption. This capability must be added to systems in the future, and it will protect the assets of the systems. It has not been provided in closed systems, but will benefit open systems of the future, where information may be provided externally to any others within the organization or for a smart city.
Gene Stevens: Those are really good observations. David Hatchell, if your microphone is working and we have you correctly unmuted, you might want to comment on that a little bit more, because we have a really interesting strategy with regard to access to that information.
David Hatchell: Yeah. So to the point of the encryption in general, I mean, we're certainly seeing a larger trend amongst the automation vendors to kind of build out your encryption in the protocol streams. So obviously, we saw things that are part of the utility industry, like DNP3, right? It didn't have authentication and encryption kind of built in it. We're seeing that change. We're seeing things like OPC, that – the primary communication device between the Windows and the control devices come out with OPC UA, right, a specific encrypted protocol kind of managing that.
So kind of what we're really trying to do is kind of focus on kind of the open protocols and the things we see in kind of legacy devices, where we kind of analyze that specifically for the protocol based anomalies that we kind of see in that, so be it Modbus, be it DNP3, be it OPC, of how we look at that. And so that was some of the things that Gene shared with you in the demo.
So being able to kind of boil that down in an incident response process to the top ten things that you kind of really need to look at from an attacker's perspective in terms of are they trying to kind of manipulate fields in the protocol to cause a crash of a slave device, right? The tradecraft of the adversaries has kind of increased from things like we've seen in CrashOverride and Industroyer, where they're building modular malware frameworks to specifically exploit [inaudible] protocols and get [inaudible] those devices. Having that understanding is becoming increasingly more important.
Erfan Ibrahim: All right. We have a few more questions, and the time is short, so we'll continue. Michael Shay asks, would the cloud connection to the ProtectWise need any special protection to secure the security as a service to the client? What – yeah.
Gene Stevens: I think I understand the question. Feel free to type in chat if there needs to be some clarification, if I misunderstand it.
So ProtectWise is communicate – securing ProtectWise's communication to the cloud. So ProtectWise communication is obviously encrypted, right? This is the basics of our TLS, [inaudible] secrecy, etcetera. So the sensor, when it communicates to the cloud, is itself going over a very tightly managed channel. It uses its own certificate keys, etcetera. It can actually, if you had the infrastructure, which some people do, most people do not, to do key management, you can actually manage those things separately and independently yourself, and such that revoking sensors actually de-provisions them at the certificate level as well, and causes them to be incapable of being reused or replayed across multiple environments, etcetera.
So all the basic controls are there. And then you can actually manipulate that policy as your organization sees fit. That is considered pretty state of the art with regard to encryption and availability, and it also works in a way that's friendly with existing infrastructure, including not just, obviously, being able to go through a firewall, but being able to like use proxy servers, etcetera, if necessary. So it tries to also – so it's actually designed to cause that safety and control and privacy, security, availability, work across even adverse network circumstances.
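The sensor-to-cloud channel described above (TLS, with the sensor presenting its own certificate and key) can be illustrated with a minimal sketch using Python's standard `ssl` module. This is not ProtectWise's implementation; the function name `build_sensor_tls_context` and the certificate paths are hypothetical, and the minimum TLS version is an assumption chosen for the example.

```python
import ssl


def build_sensor_tls_context(cert_path=None, key_path=None):
    """Build a client-side TLS context for a hypothetical sensor-to-cloud link."""
    # The default client context verifies the server certificate and hostname.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # If the sensor has its own certificate and key, present them to the server
    # (mutual TLS). Revoking that certificate would then cut the sensor off.
    if cert_path and key_path:
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return context
```

A context built this way would be passed to `context.wrap_socket(...)` when the sensor opens its outbound connection. Because the connection is outbound only, it can usually traverse a firewall or proxy without inbound rules.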
Erfan Ibrahim: Okay. Nigel Nawaki asks, if we are in principle able to encrypt data at high speed, and therefore stop people looking at our data plane, then how would ProtectWise sense any form of infiltration from its various sensors and DPI apparatus? Also, question two, by proliferating sensors, could we be adding to some of the vulnerability space? Thanks.
Gene Stevens: Yeah. Good questions. I think we kind of answered the first question with some of the previous questions that were in the queue, with how we handle like the network going dark, that concept in general. So there's many different opportunities [inaudible] very positive with regard to being able to continue to convict that traffic. We also think that deep packet inspection still continues to have some role in that type of environment, including particularly with being able to tear apart the micro features encryptions. And I also think that there's lots of really interesting research happening in this space of really interesting ways of which to analyze even the – for example, measuring the energy used to encrypt all these types of things.
So lots of really cool stuff happening there. I think there's actually a really good future for us.
And sort of the second part to the question, if I remember, was does this increase the attack surface by proliferating sensors. So if the sensors are complete – an ideal scenario, which is how we tend to deploy, is that it's on the other end of a physical one-way tap, and the sensor itself does not actually accept communications, but it simply binds to that one-way tap, and uses only that data, and it is able to relay that to the cloud, which then does additional checks on it before accepting the data as valid for detection as well as storage and analysis, etcetera. That, we believe, is a very hardened approach to being able to say that this does not increase the attack surface, because it does not give another way into the organization. And isolated as it is, should someone have physical access to the machines, which is pretty severe, they're still not able to get to the other side of that tap, so not able to enter into the industrial environment via a ProtectWise sensor.
Erfan Ibrahim: Yeah. A key thing to understand here is that there is no IP address that is visible to the hacker. This is almost like, on an Ethernet, if you had an extender, and it's a one way, so it's a data diode type architecture. So it's very difficult, as Gene was saying, to get in. Even if you have physical access, there's no logic that will allow you to enter data, because it'll be seen as malicious tampering with the device. So that's good.
Next, Mike Tosher, who has left, but asked the question, so the threat can be detected after, during the intrusion event. What happens next?
Gene Stevens: So I'm going to interpret this as a response mechanism, like how do we respond to this.
Erfan Ibrahim: Yeah.
Gene Stevens: So the answer is going to be a little bit different for the IT side than the OT side, I think, and David, I'll invite you to jump in on this here in a moment. But when it comes to the IT side, it's pretty easy. In general, it's actionable intelligence. You can actually go to the users. You can go to the devices. You can offline remediate them. The platform is designed in a way that literally 100 percent [inaudible] also an API, which means that it integrates with – we have certain partners that do well on the IT side, such as Phantom or Demisto, which have to do with security orchestration and automation. You can actually automate large parts of this to say, hey, when I see this type of an alert, automatically go ahead and add a block to the firewall. I don't want to be talking to that host. And put it in my queue so I can review it later and decide if I want to keep that or not. I can have stuff put automatically into my ticketing systems, etcetera.
On the OT side, you're going to do a lot less of that kind of automation from an enforcement perspective and a remediation perspective, because you don't want to automatically begin blocking communications of devices inside of the organization. I mean, people are going to handle the circumstances differently, but in general, you probably want a side-effect-free approach. And so in those environments, you can imagine it going – the information being delivered from ProtectWise potentially back into your SIEM, if you have one, or into your ticketing and workflow management system, and being able to use that as a pivot into ProtectWise as part of the investigation, whatever the workflow may be. And in those environments, that kind of investigation is done by the OT sec team, sometimes with support from the IT security people as well.
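The split between automated enforcement on the IT side and a side-effect-free approach on the OT side can be sketched as a small decision function. This is a minimal illustration only; the `Alert` type, the `plan_response` function, and the action strings are hypothetical and do not correspond to any ProtectWise, Phantom, or Demisto API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    environment: str  # "IT" or "OT" (hypothetical labels)
    host: str         # the remote host the alert is about
    summary: str


def plan_response(alert):
    """Return the list of actions to take for an alert, in order."""
    if alert.environment == "IT":
        # IT side: enforcement can be automated, with a human review afterwards.
        return [
            f"firewall_block:{alert.host}",
            "queue_for_review",
            "open_ticket",
        ]
    # OT side (and anything unrecognized): never block automatically.
    # Only notify people and feed the existing workflow.
    return ["forward_to_siem", "open_ticket"]
```

Defaulting unknown environments to the non-blocking branch keeps a mislabeled industrial asset from being cut off by accident.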
What we have designed this to do is to kind of fit into that natural workflow, and allow people to choose the level of automation that they want, that makes sense based on the assets in the fields in which you're operating. David, would you add to that?
David Hatchell: Yeah. Certainly, Gene, and a good point. I think on the OT side, it's kind of drilling into the events [inaudible] observations, right? You can see a lot of observations which might be kind of indicating that hey, I'm seeing a – for example, a firmware programming change kind of occur kind of at the same time that I'm seeing a remote connection attempt, right? So there's some context that you kind of need to build, and kind of the ICS threat hunting, which is really important, you know, but also is kind of in combination with some – Erfan, to your point, of some basic network hygiene and policies, right? If you're seeing a controller or something on your control [inaudible] generation network talking outbound, port 80, you know, through the internet connection, right? Misconfigured device, you've got a problem right?
So it takes a little bit more kind of coordination between your IT and OT team and the SOC team that's potentially in the middle to kind of identify those devices, should they have been talking that specific protocol, should that configuration have occurred in that manner, to really kind of get to that point.
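The two examples above (a firmware change seen around the same time as a remote connection attempt, and a controller talking outbound to the internet) can be sketched as simple checks. This is an illustrative sketch under assumptions: the `Observation` type, the observation kind names, and the ten-minute window are all hypothetical, not product behavior.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Observation:
    device: str
    kind: str       # e.g. "firmware_change", "remote_connection_attempt"
    time: datetime


def correlate(observations, window=timedelta(minutes=10)):
    """Return devices with a firmware change and a remote connection
    attempt that occurred within `window` of each other."""
    by_device = {}
    for obs in observations:
        by_device.setdefault(obs.device, []).append(obs)

    flagged = []
    for device, items in by_device.items():
        firmware = [o.time for o in items if o.kind == "firmware_change"]
        remote = [o.time for o in items if o.kind == "remote_connection_attempt"]
        if any(abs(f - r) <= window for f in firmware for r in remote):
            flagged.append(device)
    return sorted(flagged)


def is_external_destination(dst_ip):
    """True if the destination is not a private address, which a control
    network device normally has no reason to contact."""
    return not ipaddress.ip_address(dst_ip).is_private
```

Either check alone produces an observation; it is the combination, reviewed by the IT, OT, and SOC teams together, that turns it into something worth acting on.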
Erfan Ibrahim: Yeah. And I would highly recommend that in professional engagements that you do with companies, before you even deploy your technology, help them understand their use cases, and help them gradually redesign their network, so that that segmentation, the hygiene is there, so that your alerts are meaningful, because you've tightened the flexibility in the network to the point where anomalies are really anomalies, and there's not this noise on the network. So I would almost recommend a professional services engagement before the deployment of your product, or maybe deploy your product, gather the noise, and make that the compelling argument to help them redesign the network, to quiet it down, and then put it into maintenance mode.
Okay, next, we have a question about – yeah, it says, isn't detect and respond an IT tactic? Is that even acceptable in an OT environment?
David Hatchell: Yeah, I would say – I'll kind of handle that. I mean, certainly, visibility is the key piece, right? So you'll be able to visualize what's on the network. Obviously, in a lot of these environments, we have very little asset inventory of what's even out there. What are your controllers? What are the RTUs that are out there? What is the firmware on those, right? So a lot of this right now is kind of a visibility game. But certainly, as you have the components between a combination of network segmentation, a combination of basic endpoint controls, and a combination of understanding the protocol traffic kind of going across, you do have the ability to effectively detect and respond in these areas.
So in the OT environments we have them kind of deployed into, it's a lot of the – kind of a go after a top ten approach, right? Of what you can kind of effectively do and kind of look at the policies in this area. But certainly from the response portion of that, it does kind of take coordination between your IT teams and your OT teams to effectively do that.
Erfan Ibrahim: And the next question is from Olivetta Garcia, who says, do the lightweight soft sensors that you talked about come with your products? That's a question.
Gene Stevens: Yes. Absolutely. So we offer them for free. You do not pay for them. They are part of the basic service that we provide in order to help data get to the cloud, so this can be kind of run in a fully passive and asynchronous manner.
Erfan Ibrahim: So you are providing this as a software as a service kind of model?
Gene Stevens: This is exactly a software as a service model. However, I want to qualify that with the idea that in a lot of these OT environments and industrial environments, people need hardware, and we have the partnerships set up to deliver really OT and industrial friendly security hardware to make delivery and provisioning super simple.
One of the beautiful things about these sensors is that because they're so simple, and all the really – the magic happens in the cloud, is that not only are they tiny, but that means they configure and deploy in minutes. I mean, literally minutes. You just – you boot up the sensor, get it provisioned, and it connects to the cloud, and seconds later, that visualization I just showed you is completely lit up. And so what you can do in these environments when we do provide hardware is we send those hardware devices. They're already configured and they're ready to go. You can have somebody just plug them in, power them up, walk away from it, and they will work.
Erfan Ibrahim: And have you considered the NERC CIP compliance of these vendors?
David Hatchell: Yes. So I'll handle that in general. The white paper that Gene kind of referenced in the presentation we've specifically written to kind of address that. So we kind of recognize right now that the – depending on the regional coordinating councils that you kind of fall under, there's some kind of distinguishing between what can be considered electronic access control and monitoring information, between – or what is considered bulk electric cyber system information, right?
So in general, we see cloud deployments, security services such as ourselves, sort of like ticketing systems, like ServiceNow, etcetera, right? So we wouldn't be kind of considered an electronic access control and monitoring system. Typically, you'd have kind of specific products kind of doing that. The intent of our product would be to augment those security controls, and that we would be potentially consuming bulk electric system information. And generally, we've seen auditors be able to kind of approve that. But there's some more specifics in this paper that we kind of talk about, and one of the things we are kind of working on is with our partners, Archer Energy and the founders of EnergySec to come up with a specific interpretation of this through FERC to make this a little more consumable by the individual utilities in this area.
Erfan Ibrahim: Yeah, because I would recommend even a cursory review and approval by the NERC CIP community, as well as the NIST community, with the 800-53, because it has implications in the Department of Defense with the risk management framework. And if you've gone through that exercise, it lowers the blood pressure, let's say, of the asset owners.
David Hatchell: Yeah. It kind of comes to the physical protection plan, right? Who has access to the data, right? And so I think a lot of it is people who are kind of defining the security plan from a NERC CIP perspective are able to kind of control who has access to that data, and that's something we have kind of gone through an audit and can certainly provide help with.
Erfan Ibrahim: Okay. Very good. Next question, because our time is very short, and the questions keep coming, considering the small utility with limited resources and a one guy NOC, what's the relative cost for deployment for an example such as a small rural electric coop utility?
Gene Stevens: Dave, go ahead and handle that.
David Hatchell: Sure. Yeah. I would say in general, we charge – our pricing model is kind of by bandwidth and by retention period. So it's not – since we're not charging for the sensors, as we kind of talked about, unless you were kind of purchasing hardware, so if you're just talking a couple of megabytes of traffic, then obviously, if you're kind of using this to kind of monitor your industrial environments, right, you should probably consider longer protection – retention periods, because as we've seen through the attack surface in this, right, that the tradecraft of the adversaries, they're just kind of taking long times to kind of exploit and delivering tests to these systems. You might want to consider retaining your data for a year. That's typically what we see in these industrial environments.
But the base cost of this for – depending on what your bandwidth and retention time is, is very low. I mean, we – so we can certainly address that with you specifically offline.
Erfan Ibrahim: Yeah. One of the things – I work a lot with munis and coops in our research work here at NREL, and one of the challenges they have is exactly this one man NOC situation. So in situations like that, having a tool like this with a GUI that even a non-technical person with some training can understand can increase the strategic depth in the organization of who can benefit from this product, so that you're not just relying on that one person running the NOC.
All right, so Anthony – I can't say his last name. It's Cicchosi, I guess, or Ciccosi. He says, following up on the latency, can this be used on layer 2 time sensitive protocols, like GOOSE?
David Hatchell: So one of the key things we're doing is – and so the answer is yes. From the detection of essentially what we consider the traffic itself, right? So can we recognize the protocol? And then using our kind of lightweight pre-processing on the sensor, being able to store that and stream that back to the cloud for retention, yes.
So beyond it kind of inspecting anything further, as far as the metadata, and specifically in GOOSE, that's a little more of an issue. That's something we're looking at.
Erfan Ibrahim: Right. So you are focused really on the payload that the GOOSE messaging is carrying, not the GOOSE protocol itself?
David Hatchell: Yeah. I think the base thing that we've got to look at in these industrial environments is what percentage of the traffic we're kind of seeing is unclassified, right? If we're able to kind of then make the appropriate decisions as we're setting up a capture policy to say which data do I want to retain, hey, if it's GOOSE messaging traffic, I kind of want to retain it, so I can then look back at packets retrospectively against threats that might have been there, that's kind of the focus that we're kind of trying to provide at this point.
And then some of the things we want to go kind of deeper in, right, as we showed you kind of the Modbus specific examples of where we're going in deep and looking at the function codes, and DNP3, we see similar things, right? That's where we're going to drive deeper in the metadata, the packet, to kind of look at the specific anomaly conditions that we see.
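Looking at function codes and header fields can be illustrated with a minimal sketch that parses a Modbus/TCP request and reports anything unusual. The function name `inspect_modbus_tcp` and the read-only allowlist are assumptions made for the example; a real deployment would tune the allowlist per device, and this is not the detection logic shown in the demo.

```python
import struct

# Standard Modbus read function codes: read coils, read discrete inputs,
# read holding registers, read input registers.
READ_ONLY_FUNCTION_CODES = frozenset({1, 2, 3, 4})


def inspect_modbus_tcp(frame, allowed_codes=READ_ONLY_FUNCTION_CODES):
    """Return a list of findings for one Modbus/TCP request frame."""
    # MBAP header is 7 bytes, followed by at least a 1-byte function code.
    if len(frame) < 8:
        return ["frame too short for MBAP header and function code"]

    # Transaction ID, protocol ID, length (all 16-bit big-endian), unit ID.
    _transaction_id, protocol_id, length, _unit_id = struct.unpack(">HHHB", frame[:7])
    function_code = frame[7]

    findings = []
    if protocol_id != 0:
        findings.append(f"protocol id {protocol_id} is not 0 (Modbus)")
    # The length field counts every byte after itself (unit ID onward).
    if length != len(frame) - 6:
        findings.append(f"length field {length} does not match {len(frame) - 6} bytes present")
    if function_code not in allowed_codes:
        findings.append(f"function code {function_code} not in allowlist")
    return findings
```

A length field that disagrees with the bytes actually present is one example of the kind of manipulated field that can crash a poorly implemented device.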
Erfan Ibrahim: Okay. Next question from Rob Hubbard. He says, what is the majority of your traffic you monitor today, and how do you connect to the cloud TCP/UDP from ICS SCADA, scan for VPN tunnels or IPsec to the cloud?
Gene Stevens: So that is in the scope of an industrial environment specifically, or just in the aggregate?
Erfan Ibrahim: Just generally. Share some flavors that you use.
Gene Stevens: Well, we have – so our DPI technology is capable of embracing and decoding and understanding and enacting policy around like 2,500 different applications and protocols. We have pretty good visibility into a lot of different things. We see – on the IT side, we see just a tremendous amount of the basics, email, HTTP are big ones, streaming media are big ones. We see a lot of DNS activity. We actually see quite a bit of SMB. So like – which is really powerful when you consider like lateral movement, etcetera. Those are all really big contributors.
By volume, when I'm just kind of going back anecdotally through some of the analysis we've done on different aspects of this, we do see a lot of traffic that's not a lot of value from a security perspective, and that for us is something that we actually try to encourage customers to filter out by policy, in the sense that, for instance, we've seen like data synchronization traffic, backups, automated backups, different kind of infrastructure and operational support technologies, some of which is interesting, most of which is not, from a security perspective.
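Filtering low-value traffic out by policy can be sketched as a small lookup. This is a minimal illustration; the policy layout, the protocol labels, and the `should_retain` function are hypothetical and are not the product's capture policy format.

```python
# Hypothetical capture policy: protocol labels are assumptions for this sketch.
EXAMPLE_POLICY = {
    "drop": {"backup", "data_sync"},           # high volume, low security value
    "retain": {"modbus", "dnp3", "goose", "smb", "dns"},
    "retain_unclassified": True,               # keep what we cannot identify
}


def should_retain(protocol, policy=EXAMPLE_POLICY):
    """Decide whether traffic classified as `protocol` is kept for retention."""
    if protocol in policy["drop"]:
        return False
    if protocol in policy["retain"]:
        return True
    # Anything not named in the policy falls through to the default.
    return policy["retain_unclassified"]


def filter_flows(flow_protocols, policy=EXAMPLE_POLICY):
    """Return only the flows the policy says to keep, preserving order."""
    return [p for p in flow_protocols if should_retain(p, policy)]
```

Keeping unclassified traffic by default is a design choice for this sketch: in an industrial network, traffic nobody can identify is often the traffic most worth looking back at later.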
But in the industrial environments, we see a lot of open source as well as proprietary protocols and communication across devices, different hardware vendors, ranging from Honeywell, Siemens, Yokogawa, Emerson, etcetera. There's quite a few people out there. But the interesting thing in that environment is that a lot of the traffic we see contains what I would call maybe abuses of the protocols, so like deviations from the standards. Like even a lot of the open source implemented ones, they're actually implemented badly, and some of the proprietary ones are also implemented badly, in the sense that they just focus on a subset, and then kind of have a very pragmatic approach to cutting some corners just to make it work reliably, right? Because that's their number one interest.
So I don't – if that answers the question. It's a pretty wide variety of stuff. IT side is really heavy on the basics. OT side, really heavy on just inter-device communication across a wide range of vendors. We do a lot of –
Gene Stevens: – [inaudible] we see a lot of those.
Erfan Ibrahim: Yeah. And I think he had a two-part question. The first part was the types of traffic, but the second one was more about the plumbing, and asking if you use TCP/IP networks with IPsec and VPN tunnels, that kind of question.
Gene Stevens: Oh, if we use that ourselves? Yeah.
Erfan Ibrahim: Yeah.
Gene Stevens: So we actually – so we use just – when we transmit from the remote facilities to the cloud, that's just generic TCP/IP. We want it to be very internet-friendly. We want it to be very network and firewall friendly. We like it to not require special policies, unless absolutely necessary, right? Just try to be really friendly to the team that's trying to manage this and make sure it's maintainable.
Regarding ourselves internally, as we move data within ourselves, inside of our platform, which is housed in AWS today, that data is also encrypted, and the data [inaudible] is also encrypted, the idea being Amazon should not be able to inspect how data moves. So for ourselves, any system that runs at scale, we stick to the basics, really strong signing, really good encryption, really recycle certificates frequently, so that perfect forward secrecy type of approach, etcetera. But keep it down to the basics, which are less likely to be exploited, when you have fewer vulnerabilities to attempt to manage.
Erfan Ibrahim: So along those lines, I would encourage you to look at Dispersive Technologies as a way of reducing the internet cost for the customers. They provide a software defined wide area networking technology, and Cal ISO has been using that technology to connect with their independent power producers in California who don't want to pay the high fees of T1 lines or other types of –
Gene Stevens: Yeah, that's great.
Erfan Ibrahim: – [inaudible] networks. So look into that. They're based in Atlanta.
Next question says, I'm curious as to your methodology of transferring the data packet capture to the cloud. OT networks are typically in very controlled enforcement zones with many firewalls and DMZs between the IT and OT network. Are you opening additional ports and firewalls to allow this traffic?
Gene Stevens: Sometimes the answer is yes. It depends on that organization and their policies, or that facility, whatever regulations they have around it. So that will vary from organization to organization, I guess is what I'm saying.
In general, when we work in really massive OT environments, there tends to be pretty specific controls around that, and we submit to them which TCP rules will allow ProtectWise to run successfully, and they are able to run that through their change management and get that implemented.
But if that's not in place, generally, you don't have to – if you don't have that kind of a burden or requirement in that location, then in general, you do not have to do such a thing. It works over the internet, and like, Erfan, you kind of somewhat alluded to, it doesn't have to be over the internet. There's other ways to get it to the cloud.
Erfan Ibrahim: Yeah. Next, so then there was one more question from Matthew Langlaw, but you answered it by saying that it was – you're using TLS for security. And then Brian Kepper says, is this session being recorded? Yes. And can you send a link out? I will to all those who registered. And Rob Hubbard said yes, the second part was already answered when he said TLS.
So we are at the end of all our questions. I'm sorry we went beyond the stated time, but it was well worth it, because we got meaningful discussion. Any final thoughts, comments, from you or Dave?
Gene Stevens: Yeah. Just one real quick thank you to you, Erfan, for setting this up, and thank you for sending the communication out. I really enjoyed this conversation. So under the weather today, but certainly very happy to be part of it. And I didn't have to mute myself and cough offline too often. [inaudible] special thank you to Armature Systems, who helped broker this. They're a partner of ours, and they're great to work with, and so thank you to that team, who's probably on the line as well. David?
Erfan Ibrahim: And Dave?
David Hatchell: No. Thanks, everybody.
Erfan Ibrahim: Yeah. I have to say that over 300 people were there by the time this seminar or this webinar started, and the group is pretty diverse, from a variety of stakeholder groups, universities and utilities and vendors. And so it was quite impressive. And this is about 1/15th of my distribution, was registered for this webinar. So I thank you all for participating, and Gene and Dave, thank you for sharing with us this very innovative approach to looking at anomalies and very, very sophisticated protocols across a wide set of verticals. That's not easy to do if you're just on a single appliance, because you're programmed to one protocol or another.
So having this ability to take data back into a cloud and do deep analysis and correlation, and then providing what I would call actionable intelligence back to asset owners, is a very important service that we need in this time, with so much digitalization. So I thank you for continuing to work in this area. I thank the audience for listening. And hopefully, that you will use this email address that's being displayed to communicate with Dave and Gene and have an ongoing dialogue as Santa comes to town.
So everybody, happy holidays, a very happy and safe and productive new year, and our next webinar is going to be sometime in January. Once I have the speaker finalized, I will share the link with everybody, and the topic, the abstract, title, and the bio. So Gene and Dave, thank you, once again. At this time, I'm going to stop the recording and end the webinar. Have a good day.