
AI in Cyber Security: The Double-Edged Sword and Challenges

Author: Cedric
Date: 05/10/2020
Subject: Computer Science

AI is a term given to devices that can mimic what is generally considered natural intelligence, that is, intelligence as exhibited by human beings, such as problem-solving, learning, and other cognitive functions. The term is often used loosely in place of its constituent technologies/methods, namely machine learning and deep learning. Thanks to the development of robust and fast computing hardware such as graphics processing units (GPUs), and to the advent of deep learning, AI has made great progress. It has had huge success in speech recognition, image processing, robotics, and health care, amongst others. However, its applications go far beyond these.

AI can be used in cyber security for malware protection and intrusion detection, amongst others. Here we will narrow our discussion to the application of AI in malware prevention/detection. AI has been used not only as a means of enhancing cyber defence but equally as a means of enhancing cyber attacks. Indeed, there is no doubt that the implementation of AI in cyber security is a positive step for cyber defence; however, it creates two further issues.

First, AI models themselves will require another specific form of cyber security to protect them from adversarial attacks. Implementing AI in cyber security therefore means creating yet another form of cyber security to protect the artificial intelligence itself, and this new form of cyber security will come with its own shortcomings.

Second, the application of AI, whether as machine learning (ML) or as deep learning (DL), in cyber security is but a double-edged sword. On one side, AI can be used for defence purposes such as malware analysis and threat and anomaly detection (also known as system resilience), amongst others. On the other side, which is the part that interests us here, AI can identify vulnerabilities that are often overlooked because of natural human weaknesses, such as mistakes made during programming/algorithm development. These zero-day vulnerabilities can then be exploited to mount better-targeted, highly precise and more impactful cyber attacks. This was particularly the case during the 2016 Cyber Grand Challenge of the Defense Advanced Research Projects Agency (DARPA), in which seven AI systems were able to identify their opponents' vulnerabilities while patching their own systems. Moreover, if the saying holds that the attacker has the advantage in cyber security, and AI reinforces that advantage, then we should be quite concerned.

Machine learning vs. deep learning and why it matters

The application of AI in cyber security starts with ML, which has had great success in the field. In the case of malware, for example, ML works by extracting features and patterns from data. However, ML's heavy reliance on feature extraction is itself a weakness for cyber security. During malware detection, for instance, we must manually pre-define features specific to malware so that the ML system can detect it later. Should a new feature or a new type of malware appear, it will slip through the net of the detection algorithm, because it was not included among the pre-defined malware features. In other words, ML algorithms operate on pre-defined features, so features that were not pre-defined will escape detection and go undiscovered. We can therefore say that the accuracy of an ML algorithm can be judged by its capability to extract the right features. Since ML cannot account for features that were not pre-defined during feature extraction, it cannot detect genuinely new malware, which makes the technology less than efficient in this context.
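To make this limitation concrete, below is a minimal, hypothetical sketch of a feature-based ML detector. The feature names, training data and library choice (scikit-learn) are illustrative assumptions, not a real detection pipeline; the point is simply that malware whose distinguishing behaviour was never encoded as a feature looks benign to the model.

```python
# Minimal sketch of feature-based (classical ML) malware detection.
# Feature names and data are hypothetical, for illustration only.
from sklearn.ensemble import RandomForestClassifier

# Pre-defined features chosen by an analyst at training time:
# [calls_CreateRemoteThread, writes_to_autorun_key, entropy_above_7]
X_train = [
    [1, 1, 1],  # known malware samples
    [1, 0, 1],
    [0, 0, 0],  # benign samples
    [0, 1, 0],
]
y_train = [1, 1, 0, 0]  # 1 = malware, 0 = benign

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# A new malware family whose malicious behaviour (say, abusing a novel
# syscall) was never encoded as a feature: all of its pre-defined
# features look benign, so the detector has no signal to work with.
novel_malware = [[0, 0, 0]]
print(clf.predict(novel_malware))  # -> [0]: misclassified as benign
```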

To address this flaw, a new, more robust form of technology is necessary. Deep learning (DL) can be trained without pre-defining features of the original data, can detect non-linear correlations, and can support new file types. Most importantly for cyber security, DL can adapt to advanced persistent threats (APTs), even when they employ very advanced evasion techniques. DL techniques are very similar to those of ML, with the exception that feature extraction is automatic rather than manual. Recurrent Neural Networks (RNNs) see considerable use in DL, for example in the detection of permission-based Android malware: Android malware has been classified with a long short-term memory (LSTM) RNN that learned temporal behaviours/patterns over sparse sequences of Android permissions.
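The following is a minimal sketch, in PyTorch, of the kind of LSTM classifier described above, operating over sequences of Android permission ids. The vocabulary size, dimensions and data are invented for illustration; a real system would also need a labelled dataset and a training loop.

```python
# Minimal PyTorch sketch of an LSTM that classifies an app from its
# sequence of requested Android permissions (ids are hypothetical).
import torch
import torch.nn as nn

class PermissionLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 2)  # benign vs malware

    def forward(self, seq):            # seq: (batch, seq_len) permission ids
        emb = self.embed(seq)          # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(emb)   # h_n: (1, batch, hidden_dim)
        return self.fc(h_n[-1])        # logits over the two classes

# Hypothetical usage: permissions mapped to integer ids beforehand.
model = PermissionLSTM(vocab_size=200)
batch = torch.randint(0, 200, (4, 10))  # 4 apps, 10 permissions each
logits = model(batch)                   # shape (4, 2)
```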

Another useful application of AI is in system response. AI can enhance a system's countermeasures through autonomous and semi-autonomous cyber security systems that contain a repertory of pre-established attack responses. Honeypots and honeynets have been used to learn adversarial behaviour, preparing autonomous systems that recognise the attacker through patterns gathered from these decoy systems and then deliver the corresponding response that was established in advance. Such techniques are already being incorporated into states' cyber defence strategies for deterrence purposes.
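As a toy illustration of such a repertory of pre-established responses, the sketch below maps attack patterns (assumed to have been learned from honeypot/honeynet telemetry) to canned responses; all pattern names and responses here are invented.

```python
# Toy sketch of a semi-autonomous response system: attacker patterns
# learned from decoy systems are mapped to responses established in
# advance. Everything outside the playbook is escalated to a human.
HONEYPOT_LEARNED_PLAYBOOK = {
    "ssh_bruteforce":   "block source IP and rate-limit the subnet",
    "sql_injection":    "drop the session and snapshot the web logs",
    "lateral_movement": "isolate the host and alert the SOC",
}

def respond(observed_pattern: str) -> str:
    """Return the pre-established response, or escalate to a human."""
    return HONEYPOT_LEARNED_PLAYBOOK.get(
        observed_pattern, "unknown pattern: escalate to a human analyst")

print(respond("ssh_bruteforce"))
print(respond("zero_day_probe"))  # falls outside the playbook
```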

The downside and dilemmas

Unfortunately, just as DL enables new capabilities, it also enables new vulnerabilities. Its algorithms are vulnerable to cyber attacks and deceptions such as evasion attacks, poisoning/toxic attacks and so on. These attacks mainly work by injecting a corrupted sample, subtle enough not to be noticed by the human eye, into the DL algorithm, thereby forcing it to make false classifications. Such attacks undermine the integrity and usability of DL applications. Even though solutions to these adversarial attacks have been proposed, such as modifying the network by adding more layers or modifying the training process, some security issues remain. One of them concerns training data being stored on terminal devices in distributed learning modes of AI. On the other hand, in the case of insider threats, ML provides an advantage: it creates a record of all known users and allows network security to learn their network activities and establish a pattern, so that any activity beyond these is treated as an anomaly. The dilemma is to choose between ML, which can provide a list of known users but is rule-based and so cannot detect threats that were not part of the initial rules, and DL, which needs no rules to detect threats but cannot build such a record of known users.
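To make the evasion attack described above concrete, here is a hedged sketch of the classic fast gradient sign method (FGSM) against a placeholder differentiable classifier. The model and input are stand-ins; a real malware evasion attack must additionally keep the perturbed sample functional as malware, which this toy ignores.

```python
# FGSM-style evasion sketch: nudge each input feature in the direction
# that increases the loss, keeping the perturbation small.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 2))   # stand-in for a trained detector
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 20, requires_grad=True)  # feature vector of one sample
y = torch.tensor([1])                      # its true label: 1 = malware

loss = loss_fn(model(x), y)
loss.backward()

epsilon = 0.05                             # hard-to-notice perturbation budget
x_adv = (x + epsilon * x.grad.sign()).detach()

# The perturbed sample may now be classified as benign (label 0).
print(model(x).argmax(dim=1), model(x_adv).argmax(dim=1))
```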

AI systems thrive in closed worlds, worlds with a definite and known number of variables, and quickly become confused when placed in the real world, which has an effectively infinite number of variables and external stimuli. This is the essence of the open category problem: machine learning typically discriminates under a closed-world assumption, treating the world as made up of a finite number of entities/objects, say 1,000. If an AI system then encounters a new object/entity, it assumes it must belong to one of the 1,000 classes it was programmed to understand. One of the main challenges therefore lies in ensuring that AI systems work well when confronted with unfamiliar events/variables, in our case new malware. Thomas Dietterich suggests that, to solve this, we must ensure the AI system does not confuse unfamiliar situations with familiar ones, that is, ensure it is not over-confident. He further explains that this can be done through an anomaly detection algorithm that gives a score to an abnormal situation and provides a course of action for that kind of situation. Although his proposal was framed around AI confronting real-world stimuli, it applies equally to the virtual world, since the essence of the idea is AI confronting unknown data, regardless of the environment in which it is found. In view of this, AI safety boils down to how we would like the system to behave under different scenarios; basically, how do we account for infinitely many scenarios?
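A minimal sketch of Dietterich's suggestion might look like the following: an anomaly detector (here scikit-learn's IsolationForest, an assumed choice) scores how unfamiliar each input is, and anything too anomalous is routed to a separate course of action instead of being confidently classified. The data and routing policy are illustrative.

```python
# Score how unfamiliar an input is before trusting the classifier.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
familiar = rng.normal(0, 1, size=(500, 8))  # the "closed world" seen in training

detector = IsolationForest(random_state=0).fit(familiar)

def handle(sample):
    # decision_function: positive for inliers, negative for outliers.
    score = detector.decision_function(sample.reshape(1, -1))[0]
    if score < 0:
        return "unfamiliar: quarantine and escalate, do not auto-classify"
    return "familiar: pass to the trained classifier"

print(handle(rng.normal(0, 1, 8)))  # in-distribution sample
print(handle(np.full(8, 10.0)))     # far outside the training data
```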

Finally, the ethical aspects of applying AI in cyber security are more often than not overlooked. They matter above all for preventing the de-skilling of humans, since ML/DL malware prevention systems could easily replace highly skilled engineers. Such de-skilling could in turn provoke society's rejection of AI in cyber security, despite its potential to increase security; at the same time, only partial use of AI would leave the system porous. Regulations are therefore necessary to strike a balance between applying AI in cyber security and preventing human de-skilling, and to encourage responsible behaviour, amongst others.
