The Industrial Immune System: Using Machine Learning for Next Generation ICS Security— Video Text Version
Below is the text version for the video The Industrial Immune System: Using Machine Learning for Next Generation ICS Security.
>>Moderator: Good morning. This is Erfan Ibrahim of the National Renewable Energy Lab. Today is April 7, 2017. I'm coming to you live from Golden, Colorado. The Energy Systems Integration Directorate is part of NREL, and my cyber-physical systems security and resilience center resides within that ESI directorate at NREL. And I have been doing these webinars at NREL since February of 2016, and we have a pretty healthy audience now of close to 5,000 people in our distribution, and almost 300 of you have registered for this presentation.
So, let me say a few words before I pass it on to Jeff. Cybersecurity has become a very popular topic and lots of people are talking about it, but the domain expertise is quite limited, so a lot of the conversation occurs at a very high level and lacks the technical detail that asset owners need in order to protect critical infrastructure. What we are finding is that Moore's law, the reduction in networking costs, and the fast development and proliferation of sensors, especially in the energy sector, have created a situation where keeping a human in the loop for detailed information analysis is not humanly possible anymore. And it is very important to start incorporating machine learning to sort out from the raw data what the human needs to look at, turn it into actionable intelligence, and take action with that data.
If an organization does not move in that direction, it ends up being very reactive in its posture. And today's advanced persistent threats, coming out of nation states as well as from organizations with nefarious intentions, have that deep domain expertise of energy systems or water systems or thermal systems, and can overwhelm the asset owner by fuzzing data or by subtly changing some commands. So, if you don't have machine learning and you don't have an adaptive learning atmosphere in which you can set rules as you go because you understand what normal operation is, you're going to be playing catch-up. And the way that will work is the APT will enter your system and compromise critical components, and you'll end up locking everything down and trying to fix it. Meanwhile, the applications that run energy are going to be affected.
So, today we have an accomplished speaker, someone who has been in different disciplines throughout his career and has entered the cybersecurity realm with a very out-of-the-box methodology. And I am very pleased to welcome Jeff Cornelius to our webinar, and he will talk a little bit about how machine learning can help with the next generation of ICS security, but also share the experience of Darktrace in the industry. So, with that, Jeff, the floor is yours.
>>Male: Erfan, thank you very much for that kind introduction, and it couldn't have been teed up any better. Many of the points you bring up are the very points that I discuss on a daily basis with thought leaders in industries all around the globe. One of the most important things about your introduction was the realization that reactive perspectives in cyber are so far behind the power curve that it's just impossible to catch up from a significant cyber event today, especially in critical infrastructure. The perspective that's most valuable is a very forward-thinking, very forward-learning perspective on risk mitigation and management. And you hit it right on the head: The only way to do that today is by using the human for what the human is best at, and that is the subtlety and decision process associated with efficiency. Allow machine learning and advanced mathematics to do what they do best, and that's crunch a gross amount of information very, very quickly so that you can stay ahead of the curve.
So, I couldn't have teed it up any better, Erfan. Thank you very much for that. And folks, thanks for being on the webinar. I really appreciate it. My name is Jeff Cornelius, as Erfan said, and I'm the EVP of the Industrial Immune System here at Darktrace, responsible for critical infrastructure, ICS, and OT environments. The Industrial Immune System is founded on the premise that we model 100 percent of every communication inside of an OT environment and understand what that communication looks like. We can tell when it's normal and when it's abnormal because we modelled that behavior over time, and we use a very advanced form of mathematics called Bayesian recursive estimation that allows us to understand the subtle changes of information over time, developing a pattern of life, if you will, or a sense of self, such that we can recognize when that pattern of life or sense of self has something anomalous going on.
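The "pattern of life" idea described above can be illustrated in a few lines. This is a minimal sketch of recursive Bayesian estimation on a single device metric, assuming a running Gaussian belief and made-up traffic values; it is illustrative only, not Darktrace's actual model.

```python
# Hypothetical sketch: maintain a recursive Bayesian (Kalman-style) belief
# over one device metric, and score new observations against that belief.

class PatternOfLife:
    def __init__(self, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
        self.mean = prior_mean   # current belief: typical value of the metric
        self.var = prior_var     # current belief: uncertainty about that value
        self.obs_var = obs_var   # assumed measurement noise

    def update(self, x):
        """Fold one new observation into the belief, recursively."""
        k = self.var / (self.var + self.obs_var)  # gain: trust in new data
        self.mean += k * (x - self.mean)
        self.var *= (1 - k)
        self.var += 0.01  # small process noise so the model keeps adapting

    def anomaly_score(self, x):
        """Standardized distance of x from the learned pattern of life."""
        return abs(x - self.mean) / (self.var + self.obs_var) ** 0.5

# Learn "normal" from a short stream of (made-up) traffic readings, MB/min.
model = PatternOfLife()
for x in [10.1, 9.8, 10.3, 10.0, 9.9]:
    model.update(x)

print(model.anomaly_score(10.2))  # near the learned pattern: low score
print(model.anomaly_score(55.0))  # sudden spike: much higher score
```

The key property is the recursion: each update refines the same belief rather than re-fitting from scratch, which is what lets the model track subtle drift over time.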
So, what I'd like to talk with you about today is not so much specifically what Darktrace does, but, from a thought leadership perspective, what's going on in the OT marketplace today. What's going on in cybersecurity as it relates to OT? And what's going on with cybersecurity as it relates to OT and the meshing of IT in an IoT perspective, and the implications of that for visibility down to the plant floor from the board room? Those are all topics that I deal with on a daily basis, whether that's at a conference I'm speaking at or with customers I'm sitting with. So, that's the theme of the talk today and the discussion, and we'll leave some time for questions at the end.
So, a little bit of background into Darktrace and what we do probably is valuable because it gives you an idea of where we come from. Darktrace is founded on a 100 percent mathematical background. And Erfan, I'm glad you mentioned the machine learning and math are critical to the forward-thinking position that organizations have to take today to stay ahead of advanced cyber threats. With the advent of machine learning years ago, we were teed up, as you said, for Moore's law. We were teed up for the ability to tackle the problems of today, tomorrow, and many years into the future, because with machine learning we now have the ability to focus on the subtlety of behavior mathematically that we couldn't see even five years ago. We could sometimes see them in Cray computers or in advanced math labs at MIT or in Cambridge in the UK, but we really couldn't see those on corporate floors. We just – we couldn't see that.
And of course, then we saw the advent of big data, which is essentially crunching gross volumes of content and extracting meaning or value from that content. But now, we've ventured to a world where cyber actors, as you well said, Erfan, are behaving in a way that even the largest, most lucrative companies in the world can't keep up with. Many of these organizations, many of these APTs, nation states, et cetera, are so well-funded that they have full-on advanced machine learning labs that are capable of modelling behaviors at a rate and pace that no corporation can keep up with. So, the only way to fight that, as Erfan said, is with machine learning.
And if you've got – if you have machine learning as your core focus – and I will talk a little bit about machine learning, because that's a really popular term in the marketplace today and I want to make sure it's perfectly clear, the subtleties of machine learning, because it's very, very important – if you have machine learning as a focus, you're already a step ahead of the game because you're no longer trying to build that wall higher such that that ten-foot ladder is mitigated. Right? So, the analogy of I build a 10-foot wall; the threat actor brings an 11-foot ladder. So, you build a 12-foot wall; they bring a 14-foot ladder. You get the idea. Well, you're always playing catch-up in that regard. Well, with machine learning, that wall is not even necessary anymore. Now you have the ability to see behaviors that are subtle in their very nature in such a way that you can mitigate that risk without having to build that entire perimeter.
Now, I'm not suggesting by any means that you take down your endpoint protection tool or your antivirus or your IPS or your firewall or any of that. I'm not suggesting any of that. Quite the contrary: That's good corporate security hygiene that we strongly advocate. Keep those tools. You've invested in them heavily.
But those tools are only valuable at the perimeter. Some internal monitoring can occur with some of those tools, but you – the real fear, the real concern is what happens – and you mentioned this, Erfan – what happens when that threat actor gets inside? Because they will, by the way. They'll find a way around that advanced endpoint protection tool or that next generation antivirus, or that next gen firewall. They're going to find a way around those things. I deal with this on a daily basis, where people have the next gen X, Y, or Z, and we deploy and we show them that they're compromised internally.
So, what do you do with that? Well, you simply keep those up, because they're going to keep 80 percent of the threat out. But the reality is that what happens on the inside is even more nefarious, more challenging, even if it's accidental. Even if the insider is behaving accidentally – the CEO that clicks on a link that he or she shouldn't have – that insider behavior presents an immense challenge to organizations today, even if it's not nefarious. Because what it does is set up the subtle behaviors that are oftentimes missed: the exfiltration of content, or the beaconing out through that firewall. So, one of the things that machine learning allows organizations to do today is see the behaviors that go on inside of an organization in real time, and understand whether those behaviors are normal, abnormal, or any number of flavors in between.
Now, obviously, the devil's in the details, and the subtlety is most important here, and we'll hopefully dig into that just a bit in the time that we have. But the reality is that whether it's an IT environment or an OT environment, we see on a daily basis threat actors breaching these environments and doing something – whether it's getting in to just do reconnaissance, figure out what the company has, understand what the company's assets are, understand what the company's intellectual property is. It doesn't really matter what the purpose is, or necessarily, attributionally, who's doing it. What really matters is being able to identify it and empower the organization to do something about that risk, whether that's to mitigate it or remove it – whatever it is they choose to do. That's the real benefit of machine learning.
Now, on the OT side of the house there are even more subtleties. Right? So, we can drop an appliance into any IT environment, of course, with 100 percent passivity, and model behaviors inside of an organization to understand what normal looks like. We deploy with no rules, no signatures. We don't use log or flow data. We actually capture 100 percent of the packet information and run advanced mathematical analytics against that full packet. About 350 different dimensions are analyzed in real time. And what that gives us is a sense of self. Well, in an IT environment that's fairly straightforward. It's very easy for us to identify normal and abnormal patterns of behavior and, using recursive estimation, compare those to their peers – their family, their associated types of devices, their associated types of users, their associated network segments.
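As a toy illustration of the passive approach described above – deriving behavioral "dimensions" per device from captured packet metadata alone, sending nothing onto the network – here is a hypothetical sketch. The record format and field names are assumptions for illustration, and a real system would model hundreds of dimensions, not three.

```python
# Hypothetical sketch: aggregate per-device behavioral features from a
# passive packet capture. Each record: (timestamp, src, dst, port, bytes).
from collections import defaultdict

packets = [
    (0.0, "plc-01", "scada-01", 502, 120),
    (1.0, "plc-01", "scada-01", 502, 118),
    (2.0, "hmi-01", "scada-01", 502, 300),
    (3.0, "plc-01", "8.8.8.8", 53, 64),   # unusual external DNS lookup
]

def dimensions(records):
    """Build per-source behavioral features from passively observed traffic."""
    feats = defaultdict(lambda: {"peers": set(), "ports": set(), "bytes": 0})
    for _, src, dst, port, size in records:
        f = feats[src]
        f["peers"].add(dst)    # who this device talks to
        f["ports"].add(port)   # which services it uses
        f["bytes"] += size     # how much it sends
    return feats

feats = dimensions(packets)
# plc-01 suddenly using port 53 to an external host stands out against its
# established Modbus-only (port 502) profile.
print(sorted(feats["plc-01"]["ports"]))
```

The point of the sketch is that everything here is derived from observation only – no scanning, no probing – which matters in fragile OT environments.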
In an OT environment it's even more sensitive, right? Although the OT environment is typically more staid and static – very deterministic calls and actions, so simpler to model – the challenge in OT, obviously, is that there are multiple protocols you have to deal with. Multiple different supply chain vendors. Take an energy utility for example. It might have Honeywell, Rockwell, Siemens, Yokogawa, and Schneider products all inside its organization, among many other third-party or supply chain vendors providing services to that power plant. The challenge there is understanding the subtleties of the behaviors of each of those devices and understanding how those devices communicate – either with their SCADA servers or their HMIs, or across the network themselves – such that you can model that behavior as normal or abnormal. That's the real subtlety of an OT environment, and that's specifically what we want to talk about today: the challenge of OT or ICS environments and ICS security.
So, one of the most common threat vectors we're seeing in our customer base today is what we're calling trust attacks – and I think, Erfan, you alluded to that a little bit – these subtle changes of data. One of the things we're seeing most rapidly evolving inside of our customer base is what we call trust attacks. They're behaviors by threat actors not necessarily to extract data or even destroy data – not necessarily a large retailer breach extracting millions of credit card records – but rather to change data for some future deliverable. We don't know what that is, obviously. Only the threat actor has a plan. But that subtle change of data is significantly challenging to OT operators.
Now, let's put this in context, right? If I'm a threat actor and I get into a health management system or a hospital system and I begin to change values on, I don't know, nuclear tools that are used in x-ray machines, those types of subtle behaviors are very, very, very challenging. Those types of subtle behaviors are nearly impossible for a human to identify. Nearly impossible. It would take a human literally watching every communication to, from, and through a device, with an intimate understanding of what normal behavior looks like for that device, to be able to pick that up. You see that that doesn't scale. Everyone on this call sees that that doesn't scale. Well, the real challenge is finding a solution that can model and monitor that level of granular communication, understand what normal looks like, determine what abnormal is, and report on that. So, that's really what Darktrace is all about and what machine learning really brings to the table.
Now, a little bit here about machine learning. Machine learning is a challenging term right now. It's like the term big data was five or seven years ago, when it was just coming out. Everybody was jumping on the big data bandwagon. You hear big data used once in a while now, but the market has kind of flushed out, and you know what big data is and you know who's really using big data analytics. The same kind of thing is happening right now with the terms machine learning and artificial intelligence. It's probably going to be the next bandwagon that people jump on.
Machine learning is a really interesting discussion topic, and we could probably spend an hour or two talking about just machine learning. But machine learning is tough to do right. It's very, very difficult to do right, because if you download a machine learning pack from, say, Amazon – which you can do today – or from Azure, and you set up a machine learning algorithmic process on an application or on a data store, it has the ability to do something – whatever it might be that you're writing your algorithms for.
But the challenge with machine learning is that most people in the machine learning space today are applying machine learning with a rule set or a signature set or some kind of a threat intelligence base. Well, the problem with that is that you're taking a purely mathematical approach to something and you're hamstringing that approach. I'm grossly oversimplifying there for any Ph.D.'s in math or pure mathematicians on the call. But the reality is that that's what you're doing when you apply rules, signatures, and threat intelligence to a machine learning algorithmic pack.
That's simply called supervised machine learning. The opposite approach is completely unsupervised machine learning: allow the algorithms to learn what they need to learn and model what they need to model based on the behaviors they see. Now, that's really difficult to do unless you have a learning mechanism on top of that. That learning mechanism for Darktrace is called Bayesian recursive estimation. It's our ability to look at behaviors and say, "These devices all behave similarly. Those are now a mathematical family for us. And if any one of those devices exhibits a different or unique behavior, we evaluate that –" again, I'm grossly oversimplifying, folks, but – "we then evaluate that compared to any other devices or to its family, and that's how we determine its anomalous behavior."
More importantly, we can break those families out into groups and groups and groups, and we do that iteratively, continuously. Now, that's where machine learning gets really interesting, because now you have the ability to dig very, very deeply, very, very quickly into a specific behavior and model that behavior against the behaviors of other devices, to determine truly how anomalous it is.
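The family-based comparison described above can be sketched as a peer-group score: rate a device's behavior against the spread of behavior within its family. The device names, features, and hand-labeled groups below are illustrative assumptions – in a system like the one described, the families themselves would be learned rather than hand-labeled.

```python
# Hypothetical sketch: score a device against its behavioral "family"
# using a per-feature z-score against the family's mean and spread.
from statistics import mean, pstdev

# Behavior vectors: (connections/hour, distinct peers, MB sent out)
devices = {
    "plc-01": (12, 3, 0.5),
    "plc-02": (11, 3, 0.6),
    "plc-03": (13, 2, 0.4),
    "hmi-01": (40, 9, 2.1),
    "hmi-02": (38, 8, 2.3),
}

# Hand-labeled families for illustration; in practice these are learned.
groups = {"plc": ["plc-01", "plc-02", "plc-03"], "hmi": ["hmi-01", "hmi-02"]}

def peer_score(name, observation):
    """Max per-feature z-score of an observation vs. the device's family."""
    members = next(m for m in groups.values() if name in m)
    scores = []
    for i, x in enumerate(observation):
        vals = [devices[m][i] for m in members]
        spread = pstdev(vals) or 1.0  # guard against zero spread
        scores.append(abs(x - mean(vals)) / spread)
    return max(scores)

print(peer_score("plc-01", (12, 3, 0.5)))   # typical PLC behavior: low
print(peer_score("plc-01", (12, 3, 45.0)))  # huge outbound transfer: high
```

The point is that "anomalous" is defined relative to peers, not to a fixed rule: the same 45 MB transfer might be unremarkable for a server family but glaring for a PLC family.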
So, a real world example might be valuable here. If I see a behavior in a SCADA server that I've never seen before – if I'm a machine learning algorithm and I see behavior in a SCADA server that I've never seen before, it could be any number of things. Right? It could be a new command string that comes through for a new device that we've installed. It could be just a report coming back from a series of sensors or PLCs. Well, the reality is it could also be a nefarious act.
An example: At a customer of ours, we saw some activity in a SCADA server we had never seen before. It was a newly installed server. It had only been online for about five weeks. And unreported to the operator, there was a wireless card installed in this new server that came in. That card woke up after five weeks and began to beacon out, looking for a wireless connection outside of the OT environment. Darktrace identified that, and the operator was able to question the supply chain vendor: "Why is there a wireless card in this server that's in our OT environment that is 'air gapped'?" It's important to know. An important perspective is: hey, there's an asset that you did not know was on your shop floor. That asset was beaconing out. You need to know that. And no human would have seen that content or seen that behavior. Seeing that subtlety is critical there. You might say that that's fairly obvious. It's absolutely not obvious. This operator had almost 7,000 devices on a plant floor; this was just one that lit up and started beaconing out.
So, you might say, "Why the Industrial Immune System?" Right? I've talked a little bit now about kind of machine learning and operational technology, but why an immune system approach? Well, because the immune system is a great analogy for the way that networks behave. The immune system, the human immune system is very, very complex. Right? We have skin. That's our perimeter defenses, if you will. Go with me on the analogy. And then you have DNA inside the body. You have T-cells. You have a whole immune system deliverable that's designed to protect you should something get into that system. It's designed to alert your body that something's awry, something is wrong, something is different, something is not normal.
That immune system's goal, of course, is healthy functioning of the human. Well, Darktrace's Industrial and Enterprise Immune Systems are designed that way as well. The ideal of the Enterprise and Industrial Immune System is to identify anomalous behavior when it occurs and let you know so that you can do something about it, so that you can mitigate the risk or figure out a way to ameliorate the problem.
It's a pretty straightforward deliverable. Appliance-based. It installs in about an hour regardless of the environment, blah blah blah, et cetera. The reality is that within a few hours' time we're showing you real behaviors inside of your environment that 99.9 percent of our customer base could not possibly see before Darktrace. And it's really the internal operation of a network, right? So, again, going back to the concerns that some people have about "Well, what if it gets in? Or what if it's already in, and what if it gets out?" Right? "My IP, or my formula for a popular carbonated soda." All of these kinds of concerns are valid, and it gets back to my point: A good corporate security hygiene approach, a good industrial hygiene approach, is to maintain your perimeter defenses and your defense in depth. So, maintain the good corporate hygiene that you currently have in place – maybe update it, or evaluate whether it's valuable for you. It's going to cover you for 80 to 90 percent of your risk.
The reality is there will be situations where threat is injected into your environment, whether that be accidental insider behavior, nefarious insider behavior, or just a very skilled attacker getting around, through, or over those perimeter defenses. Now, once they're in, obviously they have to do something, or they have to get back out. So, you've got, again, perimeter defenses to identify when they're leaving the organization or when the activity itself is taking place. Well, what happens inside? Right? All of those [throat clear] – excuse me, I'm sorry. All of those, as Erfan alluded to, are reactionary. "They already got in? Okay. Well, they're in. Now what do we do?" Or "They've already gotten out and it's been posted to the dark web or it's made the front page of The Wall Street Journal. Now what do we do?" Those are all reactionary.
Well, our perspective is not reactionary. Our perspective is let's mitigate the risk before the risk becomes something that you have to deal with in The Wall Street Journal. Right? So, the perspective we take is let's take a look at the gooey center, if you will. Let's look at the place that nobody else really can see and help you understand where your risk resides, whether it's configuration of your systems so that you have true defense in depth, or whether it's you've already been compromised. "Here's a piece of malware that's sitting in your system or moving around in your system." Or "Here's a piece of content that's moving around in your system –" again, attributionally, we don't know if it's malware or not. "Here's a piece of content moving around in your system. You may want to identify what that is because normally these behaviors don't occur in these systems" – as an example.
So, the reality is visibility is 100 percent key to making risk-based mitigation decisions. If you truly want to mitigate risk in your organization, you can't do that without information. You can't do – you can't expect to have the information to make those decisions without visibility. And that's really the hallmark of what we do. We provide 100 percent visibility across every device, user, and network within your organization so that you understand mathematically where your risks reside.
I've talked a little bit about machine learning and how it's really difficult to get right. We see on a daily basis the types of machine learning errors that occur in the marketplace. We've gone into organizations where people have deployed some type of machine learning approach in, for example, a corporate network environment, and tried to deploy the same approach into a corporate ICS environment, with massive failure as a result. We've seen many people try to deploy machine learning approaches that are not passive, and we've seen massive failure in those environments. Many of our customers have been burned by the perspective that someone walks in and says, "We do machine learning. And oh, by the way, let us scan your OT environment." Any OT operators on the call: please, if someone says they want to scan your environment, you know as well as I do, tell them to pack their bag and go home. OT environments are just too fragile for even the lightest-touch scanning behaviors.
So, one of the challenges that we see in machine learning is this message that machine learning is the answer to everything, because it gives people a false belief that anyone using the term is in fact a messiah or a deliverer of the Holy Grail, and that's just simply not the case. So, I beg all the people on the call: please check the validity of those who are talking about machine learning. Ask what they do in machine learning. Ask if the approach they're using is passive. Ask if those machine learning algorithms are whitelist-based, signature-based, rule-based, or threat intelligence-based. What's the core? If it's unsupervised machine learning, that's worthy of a discussion. If it's supervised, you're probably just as well off doing it on your own, to be perfectly honest with you.
So, in conclusion, I just want to wrap this up and take you through a few examples that we've had. We fully believe that every organization is in fact compromised. We believe that there's some level of compromise in every organization – not necessarily nefarious exfiltration, nation states, APTs. Not necessarily that. In fact, we deliver proofs of value to our customers every week where we don't find nation states or APTs. But we always find network challenges that we can help them mitigate, whether those be configuration errors, configuration missteps, or holes that they didn't know about. Assets they had no idea they had on their network. Connections between IT and OT that they didn't realize they had, if they're truly trying to air gap. Or bidirectional communications between IT and OT that they did not know they had.
So, lots of discovery with advanced machine learning. The ability to identify assets is amazingly powerful for most OT operators. There are regulators in the US that require OT operators to identify all of their assets, and Darktrace is critical in that deliverable. You want to know how many assets you have in your environment so that you can't be _____.
We truly believe that the legacy approaches organizations take are limited. We know that they just don't work in our world today. They do protect you from the known bads, but they have absolutely no skill in protecting you from the unknown unknowns. And that's really where advanced machine learning with recursive estimation comes in, and its value is identifying those unknown unknowns – whether those be malicious or nefarious or simply configuration errors – that you didn't know about. There's great value in that. So, the benefit obviously is identifying every asset, having 100 percent visibility across your entire enterprise, whether that be your IT or your OT or both, and then being able to understand where your assets reside relative to your overall security posture. What level of risk am I willing to take for my environment? Or what level of risk do I need to mitigate in order to attain a certain level of confidence in compliance and governance from my board?
That's kind of the conclusion of kind of the talk track, if you will. I would like to now take you through just a few examples of anomalous behavior that we found – some anomalous – I mean, some nefarious and some not.
The first is a subcontractor that we found in a power station. The subcontractor was using a laptop to download updates from the third-party provider to a PCN – a process control network – that the person was responsible for. There was a lot of information delivered to these devices, but the contractor also accessed information on a home router, and pulled content and pushed content back to the plant. Very anomalous in nature. Very unsafe in nature. And obviously, for this particular plant, noncompliant. Darktrace identified that, alerted the operator, and the operator was able to take action to mitigate the risk. Critical, because no one at the operator's site would have known that that happened. What that would have done is provide an open avenue for threat actors. Even though it wasn't necessarily nefarious in this case, it would have opened the door for threat actors in the future. Should they stumble across that open door, they would have access to essentially the process control network inside of this operations plant.
A compromised internal server was receiving remote control connections in another power environment that we run. The external computer was in control of internal servers, and the person operating the remote control connection had access to process control networks inside of the power plant. This again was a non-nefarious activity, but it could have resulted in catastrophic failure of the power plant. This was US-based. And fortunately for the operator, we identified it before any activity was engaged that couldn't be undone. They closed the port and made sure that the contractor knew they were not to do that again. Very straightforward, but undetectable by the operator in any way, shape, or form except with machine learning and recursive estimation.
Next, there was an internal device at a customer of ours. There was behavior exhibited by the device that none of the internal operators recognized. We recognized that this behavior was probably a piece of malware beginning a setup for some other behavior that we did not know was coming down the pike. Darktrace identified it, and we found out that the device had received a USB wireless connection – someone had placed it there. So, there was clearly nefarious intent, but no malicious action was taken. It looked like it was a reconnaissance effort – the purpose of it, again, we don't know – but the operator was able to identify the communications through mathematical- and machine learning-based analysis and was able to remediate it.
Finally, the last site I'll talk about is internal reconnaissance at a distribution company. We saw devices from outside the ICS network operating within the ICS network, and we were able to help the operator plug that hole.
Now, all of these seem very light-touch, but the purpose of exemplifying these for you is this: No human would have seen these behaviors. No human would have seen these avenues by which communications were engaged with OT environments, simply because every engineer, every operator at a plant has a job to do, and that job is not to sit and watch net flow or log data all day long and cognitively super-correlate when two log files look odd. That's just not a human's job. And it's impossible for even next-gen SIEMs to aggregate that content and engage meaningful behavioral analytics against the number of logs that would come out of an OT environment. Rather, look at the full packets. Understand the sensitive and subtle behaviors of those full packets, and you glean that nugget of intelligence that nothing else can get.
So, it's – essentially, that's the end. Erfan, I can take some questions now if anyone has any questions that have come across. I haven't – I'm not particularly paying attention to the question section, but if you have any questions, group, I'm more than happy to entertain those and we can move from there.
>>Moderator: Could you move to a slide that has your contact information?
>>Male: I think I can. It may need to –
>>Moderator: Or you can just enter it where you say "Thank you" on the screen.
>>Moderator: That would be helpful.
>>Male: I think I can. I may have to edit the slide –
>>Moderator: Yeah, just your e-mail and your phone number there.
>>Male: Yes. Will do.
>>Moderator: That would be great.
>>Male: I may have to edit the slide, so bear with me just a moment.
>>Moderator: While you do that I'll share some observations. For those of you who were listening to Jeff, he made a very important point about hygiene, that network hygiene is critical. So, in just the same way that he took the analogy of the immune system and spoke about the DNA, there are certain things you have to do to build your immune system, and then you can rely on the immune system to protect you against a lot of frequent diseases. So, those best practices in life are like eating properly, sleeping, exercising, keeping a low-stress life, keeping yourself hydrated, and so on. In networking, that is doing proper network segmentation, moving data only on a need basis – in other words, creating –
>>Male: Great point.
>>Moderator: – hyper-quiet networks. Turning off unused ports. Only allowing a network admin to enable a port when it's needed. And on the ports that are being used, put some kind of sticky IP/MAC filtering so that people cannot just log in and out at will. When you do all of those things you are building up the immune system so that machine learning can be effective. It's not a panacea: it needs to be used strategically, and it requires a hygienic network to be really effective. All right.
>>Male: Great summary. Thanks, Erfan. That's an exceptional summary.
>>Moderator: All right. So, we have several questions here.
>>Male: Oh, fantastic.
>>Moderator: For those of you who registered, everyone is going to get the slides in PDF format as well as a link to the webinar recording. The first question is from David Rolla, who says: "Please clarify the difference between normal and good traffic" – and then he says "regarding ML training." Oh, machine learning training.
>>Male: Ah. Yeah. Yeah. Very good. Thank you. You said David, correct? Thank you very much. So, that's actually a great discussion point and one I think we could go off on a tangent about for another hour. But it gets back to the comment that I made earlier about how machine learning is difficult to do correctly. And I'm not sure – correct me if I'm wrong, please – but I believe the heart of your question is, "What happens if we go into an environment that's already compromised versus an environment that's completely pure and clean?" Right?
Well, if we go into an environment that's compromised, the machine learning says, "Well, this server has a piece of malware sitting on it – doing nothing, by the way. We're going to look at that server. We're going to watch that server's behavior with devices, servers, users, and networks. And we're going to model that server as behaving a certain way with that infection." You're absolutely correct in that assessment, if that's where you went with that. We would see that server and we would say, "Okay, that server is behaving this way, and from a machine learning perspective we're going to understand that this is how this server behaves."
Now, for all intents and purposes, we've modelled that malware in as normal, correct? I mean, from what I've told you, that's the way machine learning works. And that's absolutely correct: that malware is doing nothing. It's sitting there. It looks completely normal, just as if it were a file or a bit of data on that server. Now, for that piece of malware to act, it has to do something. For it to be a problem or a challenge, it has to do something. Right? It has to move laterally. It has to beacon. It has to command and control. It has to explode. It has to do whatever malware does for it to be active. The moment, the very moment that that behavior is seen by the machine learning algorithm, it all of a sudden is completely anomalous. Now, that's what's really interesting about recursive estimation, right? Now you lay recursive estimation on that and we say, "Wait, now something's happening in server X." Well, server X is part of a family of servers that all behave the same – A, B, C, D, E, F, G and – you get the idea. All of those servers are not doing what X is doing right now. So, recursive estimation allows us to shine a really, really bright machine learning light on server X and understand what it's doing.
Now, mathematically, we can model what machine X is doing relative to that move of that piece of malware, for example, or that beacon out or command and control or whatever it's going to do. Typically, that malware is just going to move to another channel where it gets more visibility in a reconnaissance mission. As that move occurs, we're going to say, "Wait a minute. That's completely anomalous." Now we've got not just the server waking up, doing something that none of its peers are doing, but also now a move of data, or a beacon out or a command or an unusual activity from server to server that we haven't seen before. Now then we've got that. And then we've got that next server – remember now, we've got machine learning already on that server. Right? We've already done that environment. So now, that next server wakes up and says, "Oh, I'm receiving a packet from a server I typically don't." The machine learning says, "That's three things now."
So, it's not just one anomaly that gets kicked, but it's the series and it's the severity and the uniqueness that allows machine learning to give us that subtlety.
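The compounding effect Jeff and Erfan describe – several independent anomaly signals together producing far fewer false positives than any one alone – can be sketched in a few lines of Python. This is an illustrative toy only; the function name and the 5% rate are invented, and this is not Darktrace's actual algorithm:

```python
# Illustrative sketch: why requiring several independent anomaly signals
# slashes the false-positive rate. If each signal fires spuriously with
# independent probability p_i, the chance that ALL of them fire
# spuriously at once is the product of those probabilities.

def combined_false_positive_rate(per_signal_fp_rates):
    """Multiply the independent false-positive probabilities together."""
    rate = 1.0
    for p in per_signal_fp_rates:
        rate *= p
    return rate

# One noisy detector alone: 5% false positives.
single = combined_false_positive_rate([0.05])

# Three independent signals (server wakes up, unusual server-to-server
# transfer, receiver sees an unfamiliar source): 0.05^3, roughly 1 in 8000.
triple = combined_false_positive_rate([0.05, 0.05, 0.05])

print(single, triple)
```

Under the (idealized) independence assumption, three mediocre detectors that each cry wolf 5% of the time jointly cry wolf about 0.01% of the time, which is the intuition behind "it's the series and the severity" rather than any single alert.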
>>Moderator: Yes – yeah, I think that's a good response. What you find is that as you get multiple reasons for suspecting something, the incidence of false positives drops. When you rely on just one indicator, the false positive rate is a lot higher.
>>Moderator: Let's also extend this biology analogy to think of three aspects – and Jeff, you mentioned the first was the skin, which is kind of like the perimeter fences we put up with firewalls and access control lists and things like that. Then you've got intrusion detection systems on the plumbing. That's more like looking at the bloodstream and making sure it's pure and not adulterated. But finally, you have to look at the state of the organs, and that is where the Darktrace equivalent is: it's inside the systems looking at the processes. And the nice thing about this approach is it's proactive. Nefarious attacks usually develop gradually, and if you can catch things early on, your ability to do something about it is greater because most of the system is still healthy. As the attack becomes more developed, your ability to control it becomes less, as the people in Ukraine found out on December 23, 2015. It started out as an innocuous Microsoft Office vulnerability that turned into something where substations were being turned off, because of the pivot it created.
>>Moderator: So, let's go to the next question: "From Darktrace's product point of view, should a client provide the system under test as a white box to maximize the effectiveness of Darktrace's product solution?" This is from Michael Shea in California.
>>Male: No. Actually – great question, Michael. Thank you for asking that. Darktrace deploys our proof of value only in live environments. And the reason we do that is because machine learning operates best against dynamism, and white-box, sandbox, or other lab- or testbed-based environments are not dynamic enough for machine learning to bring value. In fact, we won't deploy in test or lab environments for that reason. Typically, in a test or lab environment where you're going to set up red-teaming exercises, we're going to show you the very things that your whitelist-, signature-, and rule-based deliverables would have caught anyway. There's not a lot of value to you – unless, of course, you're a black- or white-hat hacker and you can write some custom code and throw it against it. Then we'll show you that, and your perimeter defenses, your intrusion detection, your firewalls, your endpoint protection tools, and your antivirus wouldn't show those things unless they had had a rule or a signature written against them.
But we prefer to deploy only in live environments, because as – Erfan, you seem to be perfectly on the same page with me today – the organs and the communication among those organs are really where the value of Darktrace is highlighted. We show you when the things that you can't possibly see or model are compromised. So, I hope that helps that answer.
>>Moderator: Next question from Dave Zielinski. He asks: "Have you used Darktrace in any US government-owned-and-operated or ICS environment? If so, was the amount of findings less, different?"
>>Male: Less different? I didn't get that –
>>Moderator: Compared to the power sector, the –
>>Male: Oh, so US government-owned power versus private sector power. Is that the question?
>>Moderator: Correct. Yes.
>>Male: Oh, okay.
>>Moderator: Because your use cases all involved private sector stuff, so that's why they were asking.
>>Male: Got it. So, I can't talk about customers specifically under NDA, except for one that we have a specific use case on that I put up on the screen. But there is no difference between whether we're running in a government environment or we're running in a private sector environment. Those networks almost always are as close to identical as you can possibly imagine. Different third-party providers, of course, and power plants built by different companies, of course, but they all do the same thing. At the end of the day, they're generating power and they're communicating information about the generation of that power through systems, and those systems, while monitored, yield different bits of information. Applying Darktrace to whether it's a government-powered system or whether it's a private sector-powered system, Darktrace would understand that network regardless of its ownership or orientation.
I don't know if that really got the answer or the question that was asked, but –
>>Moderator: Yeah, it did. Yes.
>>Male: – if it didn't, you're welcome to shoot me an e-mail.
>>Moderator: So, I'll add a few humorous notes to that.
>>Moderator: First of all, there's a very small cabal of consultants who rotate between the public and private sectors, so they bring their mental models to both sides, and that's why the networks look so similar. Second, compliance does not lead to security and reliability, so even though US government infrastructure is infused with all kinds of compliance requirements, the dysfunctions exist there just like they do in the private sector, because what is missing is a holistic approach to security. It's very technology-driven, and if there's one thing you should have learned from Jeff's presentation, it's that it's not about the technology. It's about properly architecting your infrastructure and using things strategically, rather than just soaking everything in technology and hoping that it's going to solve all your problems.
>>Male: And I have – Erfan, I have unlimited examples of companies that have just soaked themselves in technology only to find out that they've literally drowned themselves in information they cannot use. Then they take a step back and say, "Okay, give me actionable intelligence, Darktrace." And we provide them with actionable intelligence and efficiency soars through the roof, because now they have a means by which they can actually triage and mitigate the real threats, not the thousands of alerts coming through their SIEM that they don't know how to deal with. One of our customers on the West Coast uses a SIEM, and they receive on average 5000 alerts an hour from their SIEM. Five thousand.
>>Male: I don't know of a human capital investment that could handle 5000 alerts an hour. Regardless, that's what they received. They deployed Darktrace and went from 5000 alerts an hour through their SIEM to 26 actionable alerts a day on average. Twenty-six actionable alerts.
>>Moderator: When you get 5000 alerts in an hour, then you start playing the Pink Floyd song "Comfortably Numb."
>>Male: You got it. You got it. I think I did hear that song in their SOC when I was there. [Laughs]
>>Moderator: Yeah. Where an intrusion detection is "just a little pinprick."
>>Male: That's right. Okay. You see the –
>>Moderator: Now –
>>Male: I'm sorry, one point. You see the numbness – that's a perfect example of the numbness experienced by the large retailer in 2014 when they suffered the significant breach they had. So many alerts were going across that SIEM that even the third-party team sitting in their SOC, manning their SOC for them, watched the exfiltration occur.
>>Moderator: Yeah. One other thing that you're learning from Jeff's presentation is that he has taken the concept of big data and turned it into lots of data. And lots of data with machine learning can be turned into actionable intelligence. With big data you can't do too much.
>>Male: Correct. And in full disclosure, Erfan, it's not – I'm not the guy. We have a whole team. Right? We have a whole team of mathematicians in Cambridge, really, really smart people from the intelligence communities that have done all this work. In this case I'm just the mouthpiece for you guys. [Laughs]
>>Moderator: Yeah. Just a design recommendation to those of you in the audience: if you want to set rules for machine learning, try to push your sensors as far out to the edge as possible, where the degrees of freedom are very limited and it's much easier to test rules. The higher up you sit when you set rules, the greater the variability and the harder it is to figure out anomalies.
Now, by that I don't mean that you just leave the data out there at the edge. Bring it in to a central place in summary form so you can correlate events; if there is a coordinated attack, you can figure that out. But the world is much more deterministic farther out at the edge. And that's just the opposite of quantum mechanics, where at the elemental level everything is very discrete, and as you go up toward the human scale it becomes a continuum. Networking is just the other way around from quantum mechanics. So, please push your sensors as far out as possible to the edge.
Next – okay, so I'm going to do a little roleplaying with you, Jeff.
>>Moderator: So, imagine I'm a C-level person and we have entered an elevator in a tall building and you introduce yourself and you say you're from Darktrace, and he says, "Well, I have heard of Vectra Networks in your business. What's the difference?" And you have about 90 seconds before the elevator reaches the top floor. Go.
>>Male: Yeah. So, Mr. CEO, do you operate an ICS or an OT environment?
>>Male: Yeah, fantastic. Darktrace monitors that OT environment where nothing else can. We can provide you the visibility that no other technology in the marketplace can in your OT environment. And oh, by the way, we do that both for your IT and your OT environment, and there's no technology in the marketplace that can provide both. Thank you very much. Have a wonderful day, sir. Here's my card.
>>Moderator: [Laughs] One thing that I have heard in the industry about Darktrace – and it would be good to hear it from you – is about how you create a model. Like, today your model involves a team of people who analyze data. I mean, you have the sensors and everything at the customer interfaces, but then you extract it and analyze it with teams that you have, and then you give back reports. Is there a model you're thinking of going forward where it all remains intrinsic within the organization, and their teams can look at the data and not feel the need to take it out and analyze it the way you do today?
>>Male: Yeah. Let me be perfectly clear, Erfan. We actually do not remove any data from the customer. Period. Everything is done inside the organization. Our appliance sits physically inside the network. In an IT network, that's at the core switch – we mirror that content out completely passively, via tap or SPAN, in the IT organization. In the OT organization we sit inside the air-gapped environment and model the behaviors inside that air-gapped or segmented environment, and we have a very parent-child or hub-and-spoke approach, so we can move appliances out further to the segments that are more remote and bring only the math models – very important; no data is ever moved – only the math models back into the appliance for analysis. Then the data is analyzed in the machine.
Now, where I think you probably felt like we moved data out at some point: during the proof of value, the 30-day free trial that we give, we do ask for an outbound port 22 SSH connection from the box to two known IP addresses. That's so that our analysts can log into the box and run a threat intelligence report. But your data never leaves the box. The analyst only logs into the box once it makes a call out, and then begins to triage the alerts that we see and manages those alerts for you. So, in the first 30 days we do that for you.
After the first 30 days we can train you how to do it yourself. We can show you how to mitigate and triage the alerts that we find in the box. And if you choose, you can certainly turn off the port 22 SSH outbound and run the box completely independently inside of your organization. There's no requirement by Darktrace that you have connectivity back to Darktrace. And we are not a cloud-based deliverable. Although we can model and monitor any cloud-based environment, we are not a cloud-based deliverable.
So, I – just to clear that up –
>>Moderator: Okay. Great. Thank you for that clarification.
>>Moderator: Yes, in my statement I said it incorrectly – that you're exfiltrating data. Actually, you're making a remote connection for the analysis, and now you've made it clear that you do that only in the eval period, so that's good to know.
>>Male: That's correct. That's correct.
>>Moderator: All right. So –
>>Male: Now, the customer certainly has the ability to choose to continue the threat intelligence report, at which point we would still need the port 22 SSH outbound from the corporate environment. So, the topology or the architecture would be a Darktrace secure appliance inside the OT environment that talks to the parent appliance in the IT infrastructure, which then has a port 22 SSH outbound. The OT box does not need an SSH outbound.
>>Moderator: Okay. Wonderful. Next question is from Eric Shweigert. He has left but he has left the question, so we will go ahead and address it.
>>Moderator: "It sounds like the strength of Darktrace is the ability to detect anomalous activity. This seems to me as reactive: The attack or activity has already occurred. What thoughts do you have on that?"
>>Male: Great question. So, "the attack has already occurred." Actually, the only time that an attack and its activity occur at the same time is in ransomware. Typically, that's a registry being rewritten and data being encrypted. Darktrace has the ability to determine that in real time. Our analytic runs in near real time: we're a few milliseconds off of real time because we're taking that mirror directly off of the core switch. So, if it comes across the core switch, we see it, obviously, and it becomes anomalous immediately upon mathematical analysis. So, the only time attacks are really real time is typically with ransomware. Everything else is as you said it at the top of the hour, Erfan. It's about modelling – it's about subtlety, right? So, deploy a piece of malware, or part of a piece of malware so that it can be reconstructed inside the gooey center, inside the organ, to use your analogy, and then do something with it later – whether that's, a la the large retailer, deploy inside of the HVAC system and 217 or 243 days later exfiltrate out of the IT system after it's collected all the info from the POS – right? – the point of sale systems.
That process is all up front. It's nonreactive. The reactive component that I think the gentleman is referring to – it sounds like, from that gentleman's perspective, the perimeter has failed there. Yet I wouldn't say that. The perimeter is probably doing its job, but the attack was sophisticated enough to get past it, meaning the perimeter had no way of knowing what type of deliverable that was. Or it was, as you know, a phishing attack and the CEO clicked on a link that he or she shouldn't have. In that case it's past the perimeter defense or past the antivirus, and they couldn't have caught it anyway. Well, what Darktrace does in that case is identify it, and we can help remediate.
So, what we haven't talked about is on the IT side of our business, the Enterprise Immune System, we have a product called Antigena. And Antigena has the ability to take action against behaviors that it sees acting in real time, a la ransomware. We can throttle or block behaviors against networks and servers. So, to the point about "it seems reactive," it's reactive only because it – the first level of defense – the skin, the antivirus, the firewall, whatever it is – failed. Darktrace couldn't have seen it at the firewall because we're sitting inside the gooey, healthy center of the body, right? The organ level. And we want to see it at the core. The moment it hits the core, we identify it. We do something about it. We alert to it.
In that regard, alerting might come across to someone as reactive possibly. I would argue that we're letting you know that the alert, that the threat exists before something bad has happened. That would be my remark.
>>Moderator: Right. So, a couple of things. The whole idea of being proactive is to harden systems and do the network hygiene and all of that. And then, as the NIST framework says, after you have identified and you have protected, the next step is to monitor and respond. And all of these products fall in that monitor-and-respond space. And understand that an attack is not one thing: it's a series of activities. And only God knows what is in the attacker's mind, so as to proactively stop them from doing something. So, everything by nature is reactive, but the question is: when do you react? And what you're talking about, Jeff, means quick reaction, because you are sitting literally where the bloodstream is coming through and you can see in real time. So, before an attack fully develops you can provide immunity against it.
>>Moderator: So, there is a reactive aspect to it and a proactive aspect to it.
>>Moderator: Next – so, we have several questions that are mounted, so I'll be real quick to address them because I want to stay within the time limit. The next question is from Dave Darva, who says, "Have you guys measured false positive rates?"
>>Male: Yeah. Good question. So, by the definition of machine learning and recursive estimation, the false positive rate is absolutely zero. And Dave, to your question "How could it possibly be zero?" – well, the reality is that if machine learning tells us that something is behaving anomalously, it truly, mathematically, is anomalous. Now, remember, we bring in no prior knowledge. The box comes in tabula rasa, right? We don't have any rules, signatures, whitelists, threat intel – nothing. We learn what normal behavior looks like, and we're telling you that "This behavior is abnormal for this device, user, segment, et cetera." Now, you might say, "No, you know what? That was a contractor that logged into our ICS environment and loaded a firmware update to a SCADA server." Okay. You're telling me that's normal. You would say to Darktrace, "Jeff, that's a false positive." I would say, "Okay. It's really not a false positive, because we've never seen this SCADA server receive a command from this desktop or this notebook before."
You get my point, Erfan? It truly is anomalous according to the behavior that we see from that device, the network segment, the user, whatever the case is. You're telling me, Mr. Operator, that that is normal behavior. Okay, great. We will model that as normal behavior – it's a simple checkbox inside the UI. We will model it as normal when this device connects with that server across these X number of dimensions. Right? Length of time, size of load… You get the idea, right? The multiple dimensions that we would analyze about that communication. That way, when that device connects at about that time, downloading about that size of information over that network protocol, whatever the case is, it would then be normalized, or modelled in as normal.
Now, my argument might be that's not a false positive; that's a true positive. You would say, "But wait, we have different contractors every week. Those different contractors have different notebooks. They download different packets to that SCADA server every week." Grossly overexaggerated, of course, but you get my point. The idea that that could be normal behavior would be picked up by machine learning very, very quickly. What we wouldn't pick up as normal are the IP addresses and MAC addresses of those notebooks that log into that SCADA server. Each of those would be anomalous because they truly are anomalous. We model the behavior as normal over time, but the activity of that particular MAC address and IP, the user credentials, would all be anomalous.
Now, you might say that you don't want to see those alerts. That's fine. That can easily be done. No problem. You won't see alerts from this contractor contacting that SCADA server – I don't recommend this, by the way. I wouldn't recommend it, because any number of times that different contractor, or that different Joe Lunch Bucket with a different notebook, could have a piece of malware on it delivered to your environment, et cetera, et cetera.
>>Moderator: Great. Key thing to remember in any adaptive learning environment: there are going to be false positives in the first instances, and the key thing to understand is that there is a trend toward reducing those as more and more rules are learned. The first time, when the rule is not there, the behavior will be seen as an anomaly; then it's corrected, and the next time it won't be thought of as an anomaly but as normal behavior. So, there's a constant inflow of false positives in adaptive learning – but you don't call them that. You just say, "The first time, I'm just going to learn from this. The next time, I'm not going to repeat the mistake." Okay.
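The feedback loop described here – first sighting alerts, operator confirms it as expected, subsequent identical behavior is modelled in as normal – can be sketched as a tiny Python toy. Everything here (class name, the device/peer/protocol tuple, the method names) is invented for illustration and is not a real product API:

```python
# Minimal sketch of adaptive learning with operator feedback:
# the first time a behavior is seen it alerts; once it is marked
# as normal, the same behavior no longer alerts.

class AdaptiveModel:
    def __init__(self):
        # Learned-normal (device, peer, protocol) tuples.
        self.normal = set()

    def observe(self, device, peer, protocol):
        """Return True if the event is anomalous under the current model."""
        key = (device, peer, protocol)
        if key in self.normal:
            return False   # already modelled as normal: no alert
        return True        # never seen before: alert

    def mark_normal(self, device, peer, protocol):
        """Operator confirms the behavior is expected; fold it into 'normal'."""
        self.normal.add((device, peer, protocol))

model = AdaptiveModel()
first = model.observe("contractor-laptop", "scada-1", "dnp3")   # True: alerts
model.mark_normal("contractor-laptop", "scada-1", "dnp3")
second = model.observe("contractor-laptop", "scada-1", "dnp3")  # False: learned
print(first, second)
```

A real system would match on many behavioral dimensions (timing, size, protocol fields) rather than an exact tuple, but the shape of the loop – alert, confirm, absorb into "normal" – is the same.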
>>Male: And the benefit there, Erfan, is that recursive estimation allows us to do that from a mathematical perspective. We don't have to use a human to do that. Right?
>>Male: Bayesian recursive estimation allows us to say, "I see this behavior over and over and over and it is consistent. All of its dimensions match up to consistency for me. Yes."
>>Moderator: Right. Then Alia Katani had some nice words to say about this forum and sends greetings from Saudi Arabia, and then leaves.
>>Male: [Laughs] Well, thank you for the kind words.
>>Moderator: Yeah. Bruce Rosenthal likened your technology to virtualized prophylactics.
>>Male: That's an interesting perspective. Very interesting.
>>Moderator: Okay. Yeah. Just remember: Abstinence is the most effective thing.
>>Male: Yeah. Yeah, don't run a business, right? Don't run a business.
>>Moderator: Yes. Okay. Next, Eric Shweigert asks: "How do you deal with variable data? If you take DNP3 as an event-based protocol, what if your baseline data set hasn't taken this into account?" So, we've already addressed this.
>>Male: Yes. Yeah, we did. With recursive estimation we would see normal behavior. We would see anything new as abnormal. But we potentially could model that in as normal behavior as it comes up, yes.
>>Moderator: Next. Abraham Jose asks: "How do you manage false alarms in a very large environment where it changes very often?"
>>Male: Yeah, we've addressed this one as well. So, false alarms in a large environment where dynamism is key – that's exactly the difference between machine learning done right and machine learning done incorrectly. Machine learning done correctly allows for Bayesian recursive estimation to continually model what normal looks like. It's not a one-and-done. It's never a single baseline and then comparison against that single baseline, because that baseline in any organization – to your point, question asker – is continually changing, so we never baseline. Darktrace uses recursive estimation to constantly learn a new sense of self, a new pattern of life, every millisecond.
Now, that's a heady idea to take in, right? I mean, thinking about that computationally – learning over millions of devices, for example, or even thousands or even hundreds of devices, constantly learning a new sense of self or a new pattern of life for those behaviors – is a very heady and mathematically heavy proposition. But it's done. It's done handily in our box because of the way that we leverage machine learning and recursive estimation.
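The "never baseline" idea – an estimate of normal that is itself re-learned with every observation – can be illustrated with a recursive, exponentially weighted running mean and variance of some per-device metric. This is a toy, not Darktrace's mathematics; the class name, the smoothing parameter, and the z-score threshold are all invented:

```python
# Toy recursive estimation: "normal" is a running, exponentially
# weighted mean/variance that keeps adapting, so there is no fixed
# baseline. Readings far outside the current estimate score high.

import math

class RunningNormal:
    def __init__(self, alpha=0.05):
        self.alpha = alpha   # how quickly "normal" adapts to new data
        self.mean = None
        self.var = 1.0

    def update(self, x):
        """Return an anomaly z-score for x, then fold x into the model."""
        if self.mean is None:
            self.mean = x    # first observation seeds the estimate
            return 0.0
        z = abs(x - self.mean) / math.sqrt(self.var + 1e-9)
        d = x - self.mean
        self.mean += self.alpha * d                       # recursive mean update
        self.var = (1 - self.alpha) * (self.var + self.alpha * d * d)
        return z

model = RunningNormal()
for reading in [100, 102, 98, 101, 99, 100]:
    model.update(reading)            # steady traffic: scores stay small

print(model.update(100) < 3)         # ordinary reading: not anomalous
print(model.update(5000) > 3)        # sudden ~50x spike: strongly anomalous
```

Because the mean and variance are updated on every observation, a gradual drift in behavior is absorbed as the new normal, while a sudden jump stands out – which is the distinction the answer draws between a one-time baseline and continuous recursive estimation.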
>>Moderator: Let's move to the next question. Eric Schweigert asks: "How can you do estimation if the family or set only consists of one in the malware example that validates against a family of servers?"
>>Male: Yeah. Good question. That instance of one becomes its own family. So, if I truly have a singular example of anything in my environment, if there's only one of it, what we'll do is we'll look at that singular example and we'll model that as having its own pattern of life, its own ideal for communication. What does it normally communicate with? How does it communicate with it? What does it communicate across? Right? So, what's the protocol? What's within that protocol? Times? You know, the 350 different dimensions that we pull out for every communication.
And we want to understand what each of those does at that level. So, that singular device becomes its own family, and we can compare it to every other event, every other device, every other segment, every other user, and understand how it's unique. So, that's how we deal with a singular event.
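The "family of one" answer can be sketched as a simple fallback: judge a device against its peer family when peers exist, and against its own history when it has none. Names, the metric, and the threshold below are illustrative assumptions, not the actual product logic:

```python
# Sketch of peer-group anomaly detection with a "family of one" fallback:
# compare a reading to its peer family if one exists, otherwise to the
# device's own past behavior (its own "pattern of life").

from statistics import mean, stdev

def is_anomalous(value, peer_values, own_history, z_threshold=3.0):
    """Flag value if it sits far outside the chosen reference population."""
    reference = peer_values if len(peer_values) >= 2 else own_history
    if len(reference) < 2:
        return False                      # not enough evidence either way
    mu, sigma = mean(reference), stdev(reference)
    if sigma == 0:
        return value != mu                # perfectly uniform reference
    return abs(value - mu) / sigma > z_threshold

# A lone historian server: no peers, so its own history is the family.
history = [120, 118, 122, 119, 121]
print(is_anomalous(120, peer_values=[], own_history=history))   # within pattern
print(is_anomalous(900, peer_values=[], own_history=history))   # far outside it
```

The same function covers both cases in the discussion: server X judged against servers A through G, and a one-of-a-kind device judged against nothing but itself.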
>>Moderator: It's very important –
>>Male: Even in very, very large pools.
>>Moderator: Yeah, it's very important in operational technology environments for there to be a constant flow of knowledge from the domain experts into these tools. These tools are only as good as the information that's put into them. So, there needs to be a structured way, especially before these people retire, to get their domain expertise into these structured models. And then it's much more effective. Then you won't get the false positives. You will understand even the one-off situations if the domain expert tells you about it and you can model it in your technology.
Next question: "What kinds of tools are you using to create the machine learning model?" That's from Mauricio Lopez.
>>Male: Oh, Mauricio, thank you. So, all of our machine learning models are authored and constructed at our corporate office in Cambridge in the UK. Our office there is staffed by mathematicians from Cambridge University. I don't know how large the team is now but it's very large. And they're very, very bright. I consider myself the dummy of the company. [Laughs] These guys are just amazingly bright.
So, the tools that we use are really proprietary. But it's mathematicians from Cambridge University authoring and utilizing the tools that have brought us to the place where we are.
Erfan, did I lose you?
>>Moderator: No, no. I'm here. Oh, I thought you were going to say something.
>>Male: No. That was –
>>Moderator: You were moving from one slide to another. That's why. Okay.
>>Male: Yeah, it's – actually, I just lost my PowerPoint. Sorry.
>>Moderator: Yeah, no problem. We – next question we have: "Are the measurements of system state pulled from the targets requiring protection or pushed from those systems? Are there impacts associated with the data collection mechanism; i.e., could the immune approach itself DoS the system – making the immune system an attack vector?"
>>Male: Yeah, so everything is done in Darktrace 100 percent passively, to answer that question directly.
>>Male: Darktrace sits at the core switch inside of IT networks, and at switches that have visibility into subnets in OT networks and then talk back to the parent. So, we never actually touch the endpoint. We never touch the device. We only see the comms from the device back to the switch. And we're 100 percent passive off of those switches, so no activity on the network whatsoever.
>>Moderator: Yeah. Key thing to understand there is that Darktrace is looking into the payloads – not just looking at them from a byte perspective but actually breaking down the fields within the payload and then understanding which port, which application, and even the transaction within the application. So, it's a full packet capture analysis capability – that's the key thing to understand there.
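As a concrete illustration of breaking a payload down into named fields, here is a minimal Modbus/TCP header parser in Python. Modbus is chosen as a common ICS protocol example; this is a generic deep packet inspection sketch, not Darktrace's proprietary parser:

```python
import struct

def parse_modbus_tcp(payload: bytes) -> dict:
    """Decode the MBAP header and function code of a Modbus/TCP frame,
    turning raw payload bytes into named fields an analyst can reason about."""
    if len(payload) < 8:
        raise ValueError("payload too short for Modbus/TCP")
    # MBAP header: transaction id (2B), protocol id (2B), length (2B),
    # unit id (1B); then the PDU starts with a 1-byte function code.
    tx_id, proto_id, length, unit_id, func = struct.unpack(">HHHBB", payload[:8])
    if proto_id != 0:
        raise ValueError("not Modbus/TCP (protocol id must be 0)")
    return {
        "transaction_id": tx_id,
        "length": length,        # bytes that follow, including unit id
        "unit_id": unit_id,      # target slave/device address
        "function_code": func,   # e.g. 3 = Read Holding Registers, 6 = Write Single Register
        "is_write": func in (5, 6, 15, 16),  # write-class function codes
    }

# A "Write Single Register" request: tx 1, proto 0, len 6, unit 17,
# function 6, register address 0x0001, value 0x0003.
frame = bytes([0x00, 0x01, 0x00, 0x00, 0x00, 0x06, 0x11, 0x06, 0x00, 0x01, 0x00, 0x03])
fields = parse_modbus_tcp(frame)
print(fields["function_code"], fields["is_write"])
```

Distinguishing a read from a write at this level is what lets the monitoring reason about the transaction, not just the bytes.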
>>Moderator: All right. Next, Dave Zielinski asks: "Understanding that the US government tends to levy additional security measures, that said, has there been a difference?" And I think we've already talked about it: those measures are really compliance-driven, and that does not necessarily lend itself to a more secure environment.
>>Male: That's correct.
>>Moderator: But it's great job security –
>>Male: But I'm glad you said that. Thank you.
>>Moderator: It's tremendous job security for many people.
>>Male: [Laughs] Yes it is.
>>Moderator: "The latest report, March 2017, revealed that about 14 percent of laptops and desktops in businesses worldwide are still running Windows XP. Has Darktrace been deployed to deal with industrial systems built with Windows XP in them?"
>>Male: Oh, yes. Yes, we have. In fact, about 40 percent of our customer base running the Industrial Immune System are running XP-based systems. And those are the most active systems for us, as you can imagine. So, our appliance UI, for which we've won numerous awards, lights up like a Christmas tree in most XP environments.
>>Moderator: Yeah. So, one thing to understand: In order to reduce the threat vectors to vulnerable _____ systems like Windows XP, you can't rely on just defenses from outside with firewalls. You have to really, as I said, quiet down the network so people have very limited access to those devices, only on a need basis. And then, second, you can put inline blocking tools in addition to monitoring and response from Darktrace-type tools. If you don't do that, it's going to be a very noisy environment, and then Darktrace is going to get a lot more events than is needed.
>>Male: That's correct.
>>Moderator: So, those are things you can do in hygiene and strengthening the network to not rely on the security of endpoints that have Windows XP on them.
>>Male: That's correct.
>>Moderator: And then – yes. The second thing I would say in that area is that you need to make the appropriate business cases in order to upgrade your systems if they are mission critical. If I see Windows XP running something critical, then I know that the IT department and security people have not made the appropriate business case to the CFO.
>>Male: That's correct. And oftentimes, Erfan, what we see in our customers, especially in the manufacturing sector, is that the engineering platform manager or director says, "My system is working fine. My system is up and running. Even though, Darktrace, you say we have malware in our system, I'm not taking it down because it's running correctly." That's a business decision that we can't change. It's important for us to be able to tell them, "Hey, you have a compromise. Here's the compromise. Here's how we would recommend remediating it." But if a plant shop operator says, "I'm not taking my system down to do X, Y, or Z," that's fine. We can't make you do it. But at least you know about it now.
>>Moderator: So, just remember that when a person smokes and drinks heavily and eats saturated fats in the morning and says, "I'm as healthy as a horse" when he is two weeks away from a heart attack, that doesn't make that person healthy.
>>Male: That's correct.
>>Moderator: All right. Next question, from Rodney Martin: "What methods from machine learning or related fields help to inform the tradeoff between false alarms and missed detections in this space?"
>>Male: Yeah, I think we kind of covered that as well, to be honest. I think with recursive estimation we mitigate false alarms – we mitigate false positives down to zero – because anything that Darktrace sees is truly anomalous. It is, in fact, different from any previous state that we've seen, whether that's a server receiving a command from a particular device – maybe it's perfectly legitimate in your organization, but it's never happened before – or maybe you're bringing a new segment or a new device online. What Darktrace will see is new behavior. Now, we'll model that down, and we use a grading scheme inside the machine learning componentry that allows us to grade an alert according to the physics color continuum: red, orange, yellow, green, blue, indigo, violet – the ROYGBIV spectrum that most of us – well, some of us – are old enough to remember from our physics classes.
The reality is that anything above a yellow gets a report in the UI. Anything below a yellow we typically allow machine learning to continue to model. So, many of what most people would consider false positives are really blues, indigos, and violets that Darktrace is just beginning to model. And we look at those and we're constantly relearning and remodeling and re-understanding what the sense of self or perspectives are. And so, we model that up or down the continuum constantly. And this might happen as quickly as milliseconds. It might happen as slowly as days, weeks, or months, depending on the behaviors and the model reactivity that we see in any particular device, user, or segment.
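The grading scheme Jeff describes can be pictured as a simple mapping from an anomaly score to a color band, with a reporting cutoff at yellow. The thresholds and the 0-to-1 score below are invented for illustration; Darktrace's actual scoring is proprietary:

```python
# Bands from hottest to coolest: (minimum anomaly score in [0, 1], color).
# These cutoffs are illustrative assumptions, not Darktrace's real values.
BANDS = [
    (0.95, "red"), (0.85, "orange"), (0.70, "yellow"),
    (0.50, "green"), (0.30, "blue"), (0.15, "indigo"), (0.0, "violet"),
]
REPORT_AT_OR_ABOVE = "yellow"  # yellow and hotter surface in the UI

def grade(score: float) -> str:
    """Map an anomaly score in [0, 1] to a ROYGBIV band."""
    for threshold, color in BANDS:
        if score >= threshold:
            return color
    return "violet"

def should_report(score: float) -> bool:
    """Bands at or above yellow get reported; cooler bands keep learning."""
    order = [color for _, color in BANDS]  # hottest first
    return order.index(grade(score)) <= order.index(REPORT_AT_OR_ABOVE)

print(grade(0.97), should_report(0.97))  # red True
print(grade(0.20), should_report(0.20))  # indigo False
```

The key design point is that sub-threshold events are not discarded: they stay in the model and can be re-graded up or down the continuum as the baseline evolves.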
>>Moderator: Great. Next question, from Dave Zielinski. He asks: "Lastly, do most of the environments you work in have commercial ISP connections, or are these closed, restricted environments?"
>>Male: Most of the OT environments that we work in do not have connectivity outside of the organization at all. What they do allow for is connectivity over 443 into the corporate networks. Most of our organizations do. It's an ironic discussion point, actually. With most of my organizations, my first meeting with the CISO or CFO is "Oh, no, we are 100 percent air gapped." And then, we run a POV in the OT environment and we say, "Can we speak to a corporate network server for our outbound comms," and they say, "Oh, yeah, here's port 443 right into your box that sits in the corporate network." And so, the moment we get that, obviously, we highlight all the connections into the corporate network from the OT environment.
So, while most operators are very, very particular about the verbiage they use – "We do not allow internet connections to our OT environment" – the reality is that there's typically some way to get into that OT environment. And the value of Darktrace, obviously, is showing that to them in our graphic, saying, "Look, here are 15 connections that are going right into your main switch in your OT environment."
>>Moderator: Air gapping is a nice marketing concept, as are data diodes. When you really do real life, you find out that if you have a smart phone and you're a field service technician, the smart phone is managed by the IT side of your organization, and the mobile app that allows you to repair the substation is run by the OT side. So, you have IT/OT convergence right in your smart phone as a field service technician.
>>Male: You got it.
>>Moderator: And then, the other thing is that any OT environment that goes over a wide area network, unless you're running your own fiber, is going over service providers like Verizon and AT&T and others. And while they'll say that they're providing you secure lines, what they're doing is they're leveraging their infrastructure for multiple highways.
>>Moderator: So, your hope is that no advanced persistent threat can compromise their soft switch that has multiple VPNs running through it. So, the world is very complex, and I just find it comedic when people talk of – in simplistic ways and say, "Oh, we have complete air gapping and all our networks are private," only to find out they're handing it off to an ISP somewhere in a VPN or a T1 line.
>>Male: And you can imagine –
>>Moderator: All right.
>>Male: You can imagine, Erfan, putting yourself in my shoes, when I sit in front of a CISO who demands that his – you know, he's demanding in front of me – I mean, very, very vocal – "No, we are air gapped," I just have to say, "Yes, ma'am… no, ma'am… yes, sir… no, sir…"
>>Moderator: Well, you know, the air gap is actually in the brain.
>>Moderator: Next is from Michael Shea. He says: "Would the solution from Darktrace introduce visible latencies to the data streams being monitored?" The answer is no because it's passive, so let's move on.
>>Male: One hundred – that's correct.
>>Moderator: Yeah. Thomas Williams says: "Can a network visibility fabric such as Gigamon provide packets to Darktrace?"
>>Male: Yes. Yes. We love environments that are running Gigamon because everything's aggregated in a single point for us. We don't have to go out fishing for remote sources of the data. Right? So, the segments are no longer segmented: They're a single view for us. Yes. We love Gigamon environments.
>>Moderator: David Rucker says something very funny. He says: "Since only God knows, I'll be praying." With a smiley face.
>>Male: Fair enough.
>>Moderator: Yes. And then, finally, David Rolla asks: "Does Darktrace do anything automated beyond alerts? Does it take any action?"
>>Male: Yes. So, on our enterprise side of the house, the Enterprise Immune System, we've launched – I mentioned it earlier – a product called Antigena. Antigena is in general availability now, and has been for a little bit. Antigena is active in the network on the enterprise side, and it can throttle behavior. It's a very, very powerful deliverable from Darktrace, and we can certainly fill you in more about it if you want. Just go to Darktrace's website and download the white paper on Antigena.
>>Moderator: Wonderful. So, we have just a few minutes left. Do you have any final thoughts, comments, Jeff?
>>Male: No. I just want to thank everyone for your participation. It was – I really enjoyed this webinar and I certainly appreciate the level and breadth of questions and the number of participants. Thank you all very much.
>>Moderator: One thing I would ask you to do when you send me the PDF version of your slides is put in a typical OT environment diagram and then show us where Darktrace sits so that our audience gets a visual understanding of the layout. That would be very helpful.
>>Male: Yeah, absolutely. That's not a problem. We can do that.
>>Moderator: Great. So, I look forward to receiving that PDF – hopefully this afternoon?
>>Male: You will.
>>Moderator: Also, to the audience, thank you so much for participating. I'm going to send out an announcement very soon for the next webinar. We are going to shift gears. We've had a series of cybersecurity webinars for the last month and a half, two months, because there's so much interest in the area and I wanted to tackle the subject from different perspectives. But now, for the next couple of months, we are going to shift gears and get organizational updates from the electric sector – organizations like SEPA or the CREST SEA project, as well as SEEDS and many other initiatives that are going on in the industry. They usually have their own conferences, but those don't get a very diverse group of people coming and listening to them. But in my Smart Grid Educational Series I am able to bring in people from different areas, so I want them to get exposed to these initiatives and see what they can do to contribute to these efforts. So, the next couple of months are going to be more focused on industry initiatives.
So, at this time I am going to end the webinar. I thank you all very much, and please look out for our future announcements for Educational Series webinars. Jeff, once again, thank you very much for this very profound presentation. I really liked the way you explained the subject and only incidentally showed Darktrace. This is exactly how we like the format of our webinars. So, thank you so much.
>>Male: Well, thank you for your time. And again, everyone, thank you for your participation. I really do appreciate it.
>>Moderator: All right. Have a good one, everyone. Thank you.