# Report on Anomaly Intrusion Detection by using Bayesian Statistics

**Anomaly-Based Intrusion Detection System Using Bayesian Network**

**By Afazuddin Ahamed, M.S. student at QUB.**

**1. Introduction:**

Every day, millions of people use the Internet [1], and businesses have become increasingly dependent on online trading. Networks connected to the Internet are seriously affected by cyber-attacks, which have become common. Authorization and access control alone are not enough to solve security problems, so an intrusion detection system (IDS) [2] serves as a system’s second line of defence. IDSs are broadly categorized into two types: host-based [2] and network-based [2, 3]. Early research aimed to detect attacks on the host, e.g., by analysing system calls, while later work focused on network packet analysis. Both host-based and network-based IDSs can be further divided into two categories, namely misuse detection [4] and anomaly detection [5, 6].

Misuse detection compares signatures of known attacks with what is actually observed and raises an alarm as soon as a known signature is encountered. This technique is effective only when the attack is already known; it does not work against new and unknown techniques. Our research interest is anomaly-based intrusion detection using a Bayesian network. A network-based intrusion detection system detects intrusions by collecting packet data on the network. It generally uses header information from network packets, such as IP address, port, and TCP fields. An anomaly-based system creates a model of “normal” traffic behaviour. New data is then compared against the model, and if it does not fit, an anomaly is reported. The main advantage of this approach is its potential to detect new and unknown techniques.

A Bayesian technique to detect intruders who pretend to be normal users was presented by Mehdi Nassehi [7] in a technical report. Steven L. Scott [8] applied this technique to recognize patterns of network intrusion, expressed the detection results as probability values, and described the relations between the data and the intrusion model.

**1.1 Motivation:**

Unauthorized intrusion into a computer system or network is one of the most serious threats to computer security. Intrusion detection systems have been developed to provide early warning of an intrusion so that defensive action can be taken to prevent or minimize damage. Intrusion detection involves detecting unusual patterns of activity, or patterns of activity that are known to correlate with intrusions. It also enables the collection of information about intrusion techniques, which can be used to strengthen the intrusion prevention facility (IPS). Effective intrusion detection can serve as a deterrent, thereby acting to prevent intrusions. This study applies a Bayesian framework to produce profiles that represent behaviour in the network. It addresses the uncertainty of intrusion detection by applying a Bayesian network and indirect relations. It also provides a graphical representation of new and modified intrusion patterns for detection, classification, and analysis.

**2. Background**

**2.1 Intrusion Detection System**

Intrusion detection is based on the assumption that the behaviour of the intruder differs from that of a legitimate user in ways that can be quantified. Of course, we cannot expect that there will be a crisp, exact distinction between an attack by an intruder and the normal use of resources by an authorized user. Rather, we must expect that there will be some overlap.

Thus, a loose interpretation of intruder behaviour, which will catch more intruders, will also lead to a number of “false positives,” that is, authorized users identified as intruders. On the other hand, an attempt to limit false positives by a tight interpretation of intruder behaviour will lead to an increase in false negatives, that is, intruders not identified as intruders. Thus there is an element of compromise and art in the practice of intrusion detection. In the case of network-based intrusion detection systems [9, 2], analysing the operation of a network is important for optimum performance as well as for detecting attacks against particular networks. Detecting attacks on a specific network evolved through the following stages:

a. System log analysis

b. Promiscuous monitoring

c. Inline prevention
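
The trade-off between false positives and false negatives described in this section can be illustrated with a small sketch. It assumes that each observed activity has already been reduced to a single anomaly score and that scores at or above a threshold are flagged. The score lists and the function name `error_counts` are hypothetical and are not taken from any of the cited systems.

```python
# Hypothetical anomaly scores (higher = more suspicious); values are illustrative only.
legit_scores = [0.10, 0.20, 0.25, 0.40, 0.55, 0.60]
intruder_scores = [0.35, 0.50, 0.70, 0.80, 0.90]


def error_counts(threshold):
    """Return (false_positives, false_negatives) when scores >= threshold are flagged."""
    false_positives = sum(1 for s in legit_scores if s >= threshold)
    false_negatives = sum(1 for s in intruder_scores if s < threshold)
    return false_positives, false_negatives


# A loose interpretation (low threshold) versus a tight one (high threshold).
for threshold in (0.3, 0.5, 0.7):
    fp, fn = error_counts(threshold)
    print(f"threshold={threshold:.1f}  false positives={fp}  false negatives={fn}")
```

With these made-up scores, lowering the threshold removes false negatives at the cost of more false positives, and raising it does the opposite. Because the two score ranges overlap, no threshold removes both kinds of error.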

**2.2. Anomaly detection:**

Anomaly detection [5, 6], also referred to as outlier detection [10], means detecting patterns in a given data set that do not conform to an established normal behaviour. An outlier is an observation that deviates markedly from the other observations. The patterns thus detected are called anomalies and often translate to critical and actionable information in several application domains.

Anomalies are patterns in data that do not conform to a well-defined notion of normal behaviour. “Normal behaviour” can be defined on a global level or a per-user level, and can be specified by a human or learned automatically over time. Figure 2 illustrates anomalies in a simple two-dimensional data set. The data has two normal regions, N1 and N2, since most observations lie in these two regions. Points that are sufficiently far away from these regions, for example points O1 and O2 and the points in region O3, are anomalies. Defining a normal region that encompasses every possible normal behaviour is often imprecise. Thus an anomalous observation that lies close to the boundary can actually be normal, or vice versa. Moreover, when anomalies are the result of malicious actions, the adversaries often adapt to make the anomalous observations appear normal, which makes the task of defining normal behaviour more difficult.
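
The idea of normal regions and outlying points can be sketched as follows. The sketch assumes that each normal region is summarised by the centroid of its training observations and that a point is anomalous when it lies farther than a fixed radius from every centroid. The sample points, the radius, and the names `centroid` and `is_anomaly` are hypothetical choices made for illustration.

```python
import math

# Hypothetical training observations forming two normal regions (N1 and N2).
normal_n1 = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2)]
normal_n2 = [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2)]


def centroid(points):
    """Mean position of a list of 2-D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


centres = [centroid(normal_n1), centroid(normal_n2)]


def is_anomaly(point, radius=1.0):
    """Flag a point that lies farther than `radius` from every normal-region centre."""
    nearest = min(math.dist(point, c) for c in centres)
    return nearest > radius


print(is_anomaly((1.0, 1.1)))  # inside N1
print(is_anomaly((3.0, 3.0)))  # far from both regions
print(is_anomaly((1.9, 1.6)), is_anomaly((1.9, 1.6), radius=1.1))  # near the boundary
```

The last call shows the boundary problem: the same point is flagged with one radius and accepted with a slightly larger one, so the decision depends on how the normal region is drawn.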

**2.3. Bayesian Theorem/Statistics:**

**Conditional Probability and Independence**

In order to understand Bayes’ theorem [11, 9], we need to know what conditional probability is. When a probability is conditional on some event, the effect of the condition is to remove some of the outcomes from the sample space. Formally, the conditional probability of an event $A$, given that the event $B$ has occurred, is denoted by $P[A \mid B]$ and defined as $P[A \mid B] = P[AB] / P[B]$, where we assume that $P[B]$ is not zero.

Two events $A$ and $B$ are called statistically independent if $P[AB] = P[A]\,P[B]$. It is then easily seen that if $A$ and $B$ are independent, $P[A \mid B] = P[A]$ and $P[B \mid A] = P[B]$.
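
Bayes’ theorem follows directly from the definition of conditional probability above, because $P[AB]$ can be written both as $P[A \mid B]\,P[B]$ and as $P[B \mid A]\,P[A]$. As a worked illustration, let $A$ be the event “an intrusion is in progress” and $B$ the event “the detector raises an alarm”. The numbers below are invented for the example and are not measurements from any system.

$$P[A \mid B] = \frac{P[B \mid A]\,P[A]}{P[B]}, \qquad P[B] = P[B \mid A]\,P[A] + P[B \mid \bar{A}]\,P[\bar{A}]$$

Assuming $P[A] = 0.01$, $P[B \mid A] = 0.9$ and $P[B \mid \bar{A}] = 0.05$:

$$P[B] = 0.9 \times 0.01 + 0.05 \times 0.99 = 0.0585, \qquad P[A \mid B] = \frac{0.009}{0.0585} \approx 0.154$$

Under these assumed values, only about 15% of alarms correspond to real intrusions even though the detector catches 90% of them, because intrusions are rare compared with normal activity.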

**2.4. Bayesian Network:**

A Bayesian network [12] consists of a graphical model (a directed acyclic graph) together with the corresponding probability potentials. Formally, a Bayesian network is a pair $(G, P)$, where $G = (V, E)$ is a directed acyclic graph with node set $V = \{1, \dots, d\}$ for some $d \in \mathbb{N}$ and edge set $E$, and $P$ is either a probability distribution or a family of probability distributions, indexed by a parameter set $\Theta$, over $d$ discrete random variables $\{X_1, X_2, \dots, X_d\}$. The pair $(G, P)$ satisfies the following criteria.

For each $\theta \in \Theta$, $P(\cdot \mid \theta)$ is a probability function with the same state space $\mathcal{X}$, where $\mathcal{X}$ has a finite number of elements. That is, for each $\theta \in \Theta$, $P(\cdot \mid \theta): \mathcal{X} \to [0, 1]$ and $\sum_{x \in \mathcal{X}} P(x \mid \theta) = 1$.

To each node $v \in V$ with no parent variables, there is assigned a potential denoted by $P_{X_v}$, giving the probability distribution of the random variable $X_v$. To each variable $X_v$ with a non-empty parent set $\pi_v = (X_{b_1(v)}, \dots, X_{b_m(v)})$, there is assigned a potential $P_{X_v \mid \pi_v}$ containing the conditional probability function of $X_v$ given the variables $(X_{b_1(v)}, \dots, X_{b_m(v)})$. If $X_v$ has no parent, then set $\pi_v = \emptyset$, where $\emptyset$ is the empty set, so that $P_{X_v} = P_{X_v \mid \pi_v}$. The joint probability function may then be factorized using the potentials $P_{X_v \mid \pi_v}$ thus defined:

$$P_{X_1, \dots, X_d} = \prod_{v=1}^{d} P_{X_v \mid \pi_v}$$
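
The factorization of the joint probability into one potential per node can be sketched in a few lines. The example assumes a hypothetical three-node network in which a binary variable `Attack` is the only parent of the binary variables `Scan` and `HighTraffic`. The structure, the variable names, and all probabilities are invented for illustration.

```python
from itertools import product

# Hypothetical network: Attack -> Scan, Attack -> HighTraffic (all variables binary).
parents = {"Attack": (), "Scan": ("Attack",), "HighTraffic": ("Attack",)}

# Potentials: map a tuple of parent values to P(variable = True | parents).
# All numbers are invented for illustration.
p_true = {
    "Attack": {(): 0.01},
    "Scan": {(True,): 0.80, (False,): 0.05},
    "HighTraffic": {(True,): 0.70, (False,): 0.10},
}


def joint(assignment):
    """Joint probability as the product of each node's potential given its parents."""
    prob = 1.0
    for var, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        p = p_true[var][pa_values]
        prob *= p if assignment[var] else 1.0 - p
    return prob


names = list(parents)
# Summing the joint over every assignment should give 1 (up to rounding).
total = sum(
    joint(dict(zip(names, values)))
    for values in product([True, False], repeat=len(names))
)
print(joint({"Attack": True, "Scan": True, "HighTraffic": True}))
print(total)
```

A node without parents uses the empty tuple as its key, which mirrors setting the parent set to the empty set in the definition.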


**2.5. How Anomaly-Based Intrusion Detection Works Using Bayesian Networks**

Such a system performs more complex monitoring and analysis, such as watching and responding to traffic patterns as well as individual packets. “Detection mechanisms can include address matching, HTTP string and substring matching, generic pattern matching, TCP connection analysis, packet anomaly detection, traffic anomaly detection and TCP/UDP port matching.” Malicious activity may, however, adhere to the rules of the protocols and, in that case, be missed by this type of anomaly detection [13]. A pattern-matching detection system is concerned only with the packets matching the pattern; all the rest is not important. In anomaly detection, on the other hand, it is the benign patterns that are identified. Individual packets cannot automatically be treated as harmful, as they may belong to the same benign activity that triggered the pattern detector. The system therefore has to move from per-packet analysis to a higher level that deals with activities, understood as groups of packets transmitted for some common reason. Fortunately, such a concept is already present in TCP/IP [14] networks: the session [5]. Therefore the anomaly detection portion of the Basset system (Bayesian System for Intrusion Detection) works with the concept of sessions.

A Bayesian classifier, together with the model data [13], is then used to classify the current activity, predict attacks, or produce results.
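
To make the idea of a Bayesian classifier working on sessions concrete, the following is a minimal sketch of a naive Bayes classifier over discrete per-session features. It is not the model used by the Basset system. The feature set (service, duration bucket, payload bucket), the training sessions, and the function name `posterior` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math

# Hypothetical labelled sessions: (service, duration bucket, payload bucket) -> label.
training = [
    (("http", "short", "small"), "normal"),
    (("http", "short", "medium"), "normal"),
    (("http", "long", "medium"), "normal"),
    (("ssh", "long", "small"), "normal"),
    (("ssh", "short", "large"), "attack"),
    (("http", "long", "large"), "attack"),
]

class_counts = Counter(label for _, label in training)
# feature_counts[label][i][value] = sessions of `label` whose feature i equals `value`
feature_counts = defaultdict(lambda: defaultdict(Counter))
values_seen = defaultdict(set)
for features, label in training:
    for i, value in enumerate(features):
        feature_counts[label][i][value] += 1
        values_seen[i].add(value)


def posterior(features):
    """Return P(label | features) under the naive Bayes assumption, with add-one smoothing."""
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / len(training))  # class prior
        for i, value in enumerate(features):
            count = feature_counts[label][i][value]
            score += math.log((count + 1) / (n + len(values_seen[i])))
        log_scores[label] = score
    norm = sum(math.exp(s) for s in log_scores.values())
    return {label: math.exp(s) / norm for label, s in log_scores.items()}


print(posterior(("ssh", "short", "large")))
print(posterior(("http", "short", "small")))
```

The classifier works on a whole session rather than on single packets, and it returns a probability for each class rather than a hard decision, so the threshold for raising an alarm remains a separate choice.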

**2.6. Related work:**

Bayesian theory is one of the theories that deal with uncertainty by means of probability. Here, uncertainty means the lack of information required to make a decision. Its causes include partial information due to loss of information, conflicts between pieces of information, lack of confidence in information, and the limits of the languages used to represent knowledge. Therefore, this study applies Bayesian theory to the detection of anomalous (new and modified) intrusions. Bayesian networks express the conditional independence relations underlying Bayes’ theorem as a network-type graph. That is, they express actual knowledge as directed acyclic graphs. A Bayesian network is also called a belief network or causal network. ADAM [15] used a naïve Bayes classifier to classify packets in preliminary work on network anomaly detection. To detect intruders who pretend to be normal users, a detection method using a Bayesian technique was presented in a technical report by Mehdi Nassehi [7]. Scott [8] applied the Bayesian technique to intrusion pattern recognition and anomaly detection, and expressed the relations between audit data and intrusion models as probability values.


**References:**

1. “Predictors for Internet dependence in the collegiate population”.

http://209.85.229.132/search?q=cache:9qxjpGr15EAJ:personal.graceland.edu/- norman/example%2520proposal.doc+increased+internet+independence&cd=4&hl=en&ct= clnk&gl=uk

2. Earl Carter and Jonathan Hogue. Intrusion Prevention Fundamentals. Cisco Press.

3. ByungRae Cha and DongSeob Lee. Network-Based Anomaly Intrusion Detection Improvement by Bayesian Network and Indirect Relation.

4. W. Tylman. Misuse-Based Intrusion Detection Using Bayesian Networks. In Proceedings of the 3rd International Conference on Dependability of Computer Systems.

5. W. Tylman. Anomaly-Based Intrusion Detection Using Bayesian Networks. In Proceedings of the 3rd International Conference on Dependability of Computer Systems.

6. V. Chandola, A. Banerjee, V. Kumar. Anomaly Detection: A Survey. ACM Computing Surveys, July 2009.

7. M. Nassehi. Characterizing Masquerades for Intrusion Detection. Computer Science/Mathematics (1998).

8. S. L. Scott. A Bayesian Paradigm for Designing Intrusion Detection Systems. Computational Statistics and Data Analysis (June 30, 2002).

9. W. Stallings. Cryptography and Network Security, fourth edition.

10. Hans-Peter Kriegel, Peer Kröger, Arthur Zimek (2009). “Outlier Detection Techniques (Tutorial)”. *13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2009)* (Bangkok, Thailand). Retrieved 2010-06-05.

11. Papoulis. Probability, Random Variables and Stochastic Processes. McGraw-Hill.

12. T. Koski and John M. Noble. Bayesian Networks. Wiley Series in Probability and Statistics.

13. Y. Zhao, Z. Zheng, H. Wen. Bayesian Statistical Inference in Machine Learning Anomaly Detection. 2010.

14. Forouzan. TCP/IP Protocol Suite, third edition. McGraw-Hill.

15. D. Barbara, J. Couto, S. Jajodia, L. Popyack, N. Wu. ADAM: Detecting Intrusions by Data Mining. In Proceedings of the 2001 IEEE Workshop on Information Assurance and Security.
