1. Introduction
Developed by the University of Hawaii in 1971, the Additive Links On-line Hawaii Area network (ALOHAnet) ushered in the era of wireless networks. Since then, wireless network technology has evolved and become virtually ubiquitous owing to advantages such as better mobility and scalability compared to wired networks. With the development of wireless network technology and the proliferation of smartphones and Internet of Things (IoT) devices, the demand for Wi-Fi to provide wireless networks to wireless terminals has increased. Currently, Wi-Fi has become essential in homes and public places such as libraries, schools, restaurants, airports, and hotels [1,2]. Wi-Fi is based on the IEEE 802.11 standard, and it provides a network to wireless terminals through communication with a wireless access point (AP) connected to a wired network [3,4].
Wi-Fi also provides advantages such as low cost and simple implementation. However, it is vulnerable to attacks such as traffic analysis, sniffing, authentication attacks, denial of service, replay attacks, session hijacking, and rogue APs [5-7]. A rogue AP is a wireless AP installed without the network administrator’s permission. Rogue APs are classified into evil twin APs, improperly configured APs, unauthorized APs, and compromised APs [8]. An evil twin AP is an AP that is disguised as an authorized AP by duplicating the service set identifier (SSID) or media access control (MAC) address of an authorized AP. Attackers can use evil twin APs to carry out man-in-the-middle attacks, redirecting connected users to phishing sites or sending forged packets. Additionally, open source tools have been developed that make it easy to create evil twin APs for attacking wireless networks.
Because an evil twin AP copies the SSID and MAC address of the authorized AP, it is difficult to detect without checking its physical location. Therefore, various methods have been proposed that identify an AP by physical characteristics, such as clock skew, Received Signal Strength (RSS), Round Trip Time (RTT), and radio frequency, rather than by conventional identifiers. However, wireless networks are less stable than wired networks, so small environmental changes can introduce errors that make it difficult to measure these characteristic values accurately; accurate measurement therefore requires additional, expensive wireless signal collection equipment.
This paper proposes a method for extracting various features of wireless APs and detecting evil twin APs using machine learning. As the proposed method collects wireless signals using wireless network interface cards (NICs), it does not require expensive wireless signal acquisition equipment, and as it uses machine learning algorithms, it is not significantly affected by environmental changes. The main features used for detection are as follows: the clock skew generated owing to the minute differences in the manufacturing process of wireless communication devices; RSS, which is the signal strength of the target AP; the channel used by the AP; and duration, which is the transmission time of a frame. To improve the evil twin AP detection accuracy, these four features were applied to several machine learning classification algorithms to compare their performance; specifically, logistic regression, naïve Bayes, k-nearest neighbors (k-NN), support vector machine (SVM), and random forest.
The remainder of this paper is organized as follows: Section 2 provides the background for the study. Section 3 discusses related work. Section 4 describes the evil twin AP detection technique based on multiple features. Section 5 analyzes the experiments and the results of detecting evil twin APs using the proposed technique. Finally, Section 6 concludes the paper and outlines the direction of future work.
2. Background
This section gives an overview of passive AP scanning, evil twin APs, and the classification algorithms used in machine learning.
2.1 Passive AP Scan
A station must connect with a surrounding AP to connect to a wireless network. The AP is connected to a wired network, and it communicates with the station via a wireless signal. Hence, a user cannot physically check whether the AP exists in the vicinity. Therefore, the user must scan and identify the surrounding AP through the station. The station scans for the surrounding AP via either passive scanning or active scanning. Fig. 1 shows the passive AP scanning process.
In the passive scanning method, the station recognizes each neighboring AP by listening for the signal that the AP transmits to announce itself to stations, without taking any other action [9]. The signal broadcast by the AP for this purpose is referred to as a beacon frame. The owner of the AP can determine the period at which the AP broadcasts the beacon frame through the beacon interval setting.
2.2 Evil Twin AP
An evil twin AP is a rogue AP that impersonates an authorized AP by assuming its SSID or MAC address. The SSID or MAC address can be easily identified through the probe response and beacon frame. An attacker installs an evil twin AP and sets its signal to be stronger than that of the authorized AP or executes a distributed denial of service (DDoS) attack against the authorized AP to connect a station and the evil twin AP [10,12].
Authorized AP and evil twin AP connection scenarios. (a) The connection between an authorized AP and a station. (b) An evil twin AP attack scenario. (c) Another evil twin AP attack scenario.
Fig. 2(a) shows the connection between an authorized AP and a station. The authorized AP is connected to a wired network, and it provides wireless Internet access to the connected station. Fig. 2(b) shows an evil twin AP attack scenario. The evil twin AP is wired to the Internet, similar to the authorized APs, to provide wireless Internet access to connected stations. The evil twin AP can intercept packets from and to the station. Fig. 2(c) shows another evil twin AP attack scenario, in which an evil twin AP is connected to an attacker’s private network. The station connected to the evil twin AP appears to be accessing the wireless Internet. However, private information and important data can be leaked because it is connected to the attacker's private network [13].
3. Related Work
This section reviews studies that have used single and multiple features to detect evil twin APs.
3.1 Evil Twin AP Detection Using a Single Feature
Jana and Kasera [14] proposed a method of using clock skew to detect an evil twin AP. Clock skew is calculated from the IEEE 802.11 Time Synchronization Function timestamp of the beacon frame broadcast from the target AP. When the transmission and reception times of the ith frame are $$T_{i}$$ and $$t_{i}$$, respectively, they defined the clock offset $$o_{i}$$ as the difference between the transmission time and the reception time, as follows:
$$o_{i} = T_{i} - t_{i}$$
They defined the rate of change of the calculated $$o_{i}$$ as clock skew and calculated it using linear programming and least squares fitting. They stated that the calculated clock skew could be used as an identifier to distinguish between an evil twin AP and a normal AP. Furthermore, they proposed a threshold-based algorithm that could separate each beacon frame set from the entire set containing the beacon frames of an evil twin AP and a normal AP. They demonstrated that clock skew could identify APs within an error range of 0.2 ppm.
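The skew estimate described above can be sketched as follows. This is a minimal illustration rather than the implementation of [14]: it takes the offset as transmission time minus reception time, fits only the least-squares line (the linear programming variant is omitted), and runs on synthetic timestamps. The function names and the 20 ppm example value are assumptions made for illustration.

```python
def clock_offsets(tx_times_us, rx_times_us):
    """Per-frame clock offset: transmission timestamp minus reception time."""
    return [tx - rx for tx, rx in zip(tx_times_us, rx_times_us)]


def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic example (assumed values): the AP clock runs 20 ppm faster
# than the receiver clock, and one beacon arrives every 102.4 ms.
rx_times_us = [i * 102_400.0 for i in range(100)]
tx_times_us = [t * (1 + 20e-6) for t in rx_times_us]

offsets = clock_offsets(tx_times_us, rx_times_us)
# The slope is dimensionless (microseconds per microsecond); scale to ppm.
skew_ppm = least_squares_slope(rx_times_us, offsets) * 1e6
print(f"estimated clock skew: {skew_ppm:.2f} ppm")
```

Because the slope is a ratio of two times, it does not depend on the unit of the timestamps, only on both series using the same unit.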
Arackaparambil et al. [15] pointed out problems with the reliability of the beacon frame reception time required to calculate clock skew. The reception time of a beacon frame is determined by the internal clock of the wireless NIC that receives it. They indicated that the internal clock of the wireless NIC changes in synchronization with the clock of the most recently connected AP. They also conducted experiments to measure the AP’s clock skew using a wireless NIC to prove their claim. Their experimental results show that the AP's clock skew is measured differently depending on how recently the AP is connected to the wireless NIC.
Kim et al. [16] proposed an evil twin AP detection scheme using RSS. Their scheme assumes that an attacker broadcasts signals from multiple authorized APs on one NIC. Their method collects radio signals from neighboring APs, sorts them in the order of collection to generate an RSS sequence, and normalizes the sequence to fill noise or empty values to improve detection accuracy. Then, the RSS sequences are compared for similarities. If two RSS sequences are found to be similar, they are classified as fake signals from one device. They performed several experiments by changing the value of the threshold and found that the accuracy of the method is 97.1% when the threshold is one, with true positive and false positive rates of 30.1% and 0.9%, respectively. Moreover, when the threshold is two, it has a 96.5% accuracy, with 100% true positive rate and 3.6% false positive rate.
Lee et al. [17] argued that in the past, it was necessary to build a separate network to install an evil twin AP, but recently, the hotspot function of mobile devices such as smartphones and tablets makes it easy to build evil twin APs connected to cellular networks such as 3G and LTE. They pointed out that this makes man-in-the-middle attacks possible and stated that an evil twin AP connected to a cellular network increases communication latency more than an AP connected to a wired network owing to the presence of the eNodeB base station in the communication path. They subsequently proposed an evil twin AP detection technique. Through experiments, they measured the RTT of the AP connected to the wired network and the evil twin AP connected to the cellular network and generated a learning model using the k-SVM algorithm. They also conducted an evil twin AP detection experiment using the trained model and achieved a maximum evil twin AP detection accuracy of 93.4%.
3.2 Evil Twin AP Detection Using Multiple Features
Vanjale and Mane [18] proposed an evil twin AP detection method using multiple features. They pointed out the limitation of single-feature-based evil twin AP detection and designed a system to detect evil twin APs using MAC address, SSID, RSS, channel, frequency, authentication type, timestamp, sequence count, and clock skew. Their proposed method is divided into a learning mode and a detection mode. In the learning mode, a detection policy is created using multiple features. In the detection mode, the created policy is used to classify authorized APs, unauthorized APs, and evil twin APs. They confirmed the capabilities of the proposed system for the detection of rogue APs, evil twin APs, and MAC spoofing attacks.
Kang et al. [19] proposed an evil twin AP detection method using an SVM and two features. They pointed out that the detection of evil twin APs using RTT is less accurate in crowded channels and added packet inter-arrival time (PIAT) to compensate for this. Their RTT measurement method is based on the existing RTT measurement method. PIAT is a measure of the interval of response packets returned when a packet is transmitted at regular intervals for RTT measurement. It is used for checking whether there is a change in the packet reception interval depending on channel congestion or the workload of an AP. They measured RTT and PIAT and conducted an evil twin AP detection experiment using an SVM. Evil twin APs were detected with a maximum accuracy of 96.5% and a minimum accuracy of 89.75% in congested channels.
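As a small illustration of the PIAT feature, the sketch below derives inter-arrival intervals from a list of response arrival times. It is not the measurement code of [19]; the function name and the sample arrival times are hypothetical, and integer milliseconds are used only to keep the arithmetic exact.

```python
from statistics import mean, pstdev


def inter_arrival_times(arrival_times_ms):
    """Intervals between consecutive response packets."""
    return [later - earlier
            for earlier, later in zip(arrival_times_ms, arrival_times_ms[1:])]


# Hypothetical arrival times of responses to probes sent every 100 ms.
arrivals_ms = [0, 100, 205, 300, 410]
piat = inter_arrival_times(arrivals_ms)

print("intervals:", piat)
print("mean interval:", mean(piat))
print("spread:", pstdev(piat))
```

A larger spread of the intervals would suggest a congested channel or a heavily loaded AP, which is the variation PIAT is intended to capture.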
4. Proposed Method
The studies described in Section 3 used clock skew, a unique feature that an attacker cannot forge, to identify an AP. However, clock skew is measured through packets that are broadcast in the air, and thus depends on several factors, including the measurement environment, measurement time, and the state of packet collectors. Therefore, we propose a method that uses multiple features, including clock skew, to detect evil twin APs. Fig. 3 shows the flowchart of the proposed method.
The proposed method is composed of the following components: a feature extractor for extracting features from authorized APs in advance, a learning phase for generating an evil twin AP detection model through machine learning with the extracted features, and a detection phase for detecting evil twin APs using the trained model.
4.1 Feature Extractor
Feature extractors are used to generate data for machine learning in the learning and detection phases. A feature extractor extracts the features of each AP from the collected beacon frames. Assuming that there are N APs, the input of the feature extractor is the beacon frame set collected from the N APs. The output of the feature extractor is a $$1 \times 5$$ vector that includes four features extracted using the elements of the input set and a label, numbered from zero to $$N - 1$$, that corresponds to the SSID. The features extracted by the feature extractor are clock skew, channel, RSS, and duration.
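The shape of the extractor's output can be sketched as follows. This is an illustrative outline under several assumptions, not the implementation used in this study: the `Beacon` record and its field names are hypothetical, clock skew is estimated with a closed-form least-squares slope as a stand-in, RSS is averaged arithmetically in dBm, and the most frequent value is taken for channel and duration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Beacon:
    """Fields assumed to be available for each captured beacon frame."""
    tx_time_us: float   # timestamp carried in the beacon
    rx_time_us: float   # reception time recorded by the capturing NIC
    channel: int
    rss_dbm: float
    duration_us: float


def most_common(values):
    """Most frequent value in a sequence."""
    return Counter(values).most_common(1)[0][0]


def extract_features(beacons, label):
    """Return [clock skew (ppm), channel, mean RSS, duration, label]."""
    xs = [b.rx_time_us for b in beacons]
    ys = [b.tx_time_us - b.rx_time_us for b in beacons]  # clock offsets
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [
        slope * 1e6,
        most_common(b.channel for b in beacons),
        sum(b.rss_dbm for b in beacons) / len(beacons),
        most_common(b.duration_us for b in beacons),
        label,
    ]


# Synthetic beacons for one AP (all values are made up for illustration).
beacons = [
    Beacon(t * (1 + 10e-6), t, 6, -50.0 - (i % 2), 1792.0)
    for i, t in enumerate(j * 102_400.0 for j in range(10))
]
vector = extract_features(beacons, label=0)
print(vector)
```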
Flowchart of the proposed method.
4.1.1 Clock skew
Clock skew is the distortion of the relative clock between two APs caused by a slight difference in the clock oscillator of each AP in the manufacturing process. This distortion can be used as a unique fingerprint of an AP because it is different even for APs manufactured by the same vendor. In this study, the beacon frame broadcasted by the AP is used to estimate clock skew. As the beacon frame contains a field that indicates the timestamp of a transmitting AP, the transmission time of the transmitting AP can be determined. The time at which the beacon frame is received is measured by the driver of a receiving AP, and the offset of the clock between the target AP and receiving AP can be calculated using the transmission and reception times. As explained in Section 3, clock skew is the rate of change of clock offset. Hence, estimating clock skew is the same as estimating the slope of the time–clock offset graph. In this study, machine learning linear regression is applied for accurate slope estimation.
When the clock offset for the ith frame of the collected AP is $$\hat{y}_{i}$$, the receiving time of the ith frame is $$x_{i}$$, clock skew is W, and bias is b, a straight line is calculated by linear regression as follows:
$$\hat{y}_{i} = W x_{i} + b$$
At this time, W and b are initialized to a random value between zero and one and adjusted based on the gradient descent algorithm to minimize the cost, c, of least squares fitting, as follows:
The gradient descent algorithm is an optimization algorithm that is mainly used to minimize cost in machine learning. It changes W and b to reduce the cost during the learning period. When learning is over and the cost reaches its minimum, W is the slope of the straight line representing the given data, and this slope is taken as the estimated clock skew.
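The fitting procedure can be sketched as below. This is a minimal example, assuming the cost is the mean squared error between the fitted line and the measured offsets; the learning rate, epoch count, and the rescaling of both axes to the unit interval are choices made here so that plain gradient descent converges, and are not taken from this study.

```python
import random


def fit_line_gradient_descent(xs, ys, learning_rate=0.5, epochs=2000, seed=0):
    """Fit y = W * x + b by gradient descent on the mean squared error."""
    n = len(xs)
    # Rescale both axes to [0, 1]; raw timestamps are far too large for
    # a fixed learning rate to be stable.
    x_min, x_span = min(xs), (max(xs) - min(xs)) or 1.0
    y_min, y_span = min(ys), (max(ys) - min(ys)) or 1.0
    us = [(x - x_min) / x_span for x in xs]
    vs = [(y - y_min) / y_span for y in ys]

    rng = random.Random(seed)
    w, b = rng.random(), rng.random()  # random start between zero and one
    for _ in range(epochs):
        errors = [w * u + b - v for u, v in zip(us, vs)]
        grad_w = 2.0 / n * sum(e * u for e, u in zip(errors, us))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

    # Map the fitted line back to the original units.
    slope = w * y_span / x_span
    intercept = y_min + b * y_span - slope * x_min
    return slope, intercept


# Synthetic offsets (assumed values): 20 ppm skew plus a constant 5 us bias.
rx_times_us = [i * 102_400.0 for i in range(100)]
offsets_us = [20e-6 * t + 5.0 for t in rx_times_us]

slope, intercept = fit_line_gradient_descent(rx_times_us, offsets_us)
print(f"clock skew: {slope * 1e6:.3f} ppm, bias: {intercept:.3f} us")
```

For a straight-line fit, a closed-form least-squares solution gives the same slope; gradient descent is shown here only because it is the procedure described in the text.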
4.1.2 Channel
The Wi-Fi network used in this study employed a 2.4-GHz frequency band and a 5-GHz frequency band. Channels are units into which a frequency band is divided to reduce crosstalk and interference in wireless communications. Fig. 4 shows the center frequency of each channel in the 2.4 GHz frequency band.
Center frequency of 2.4 GHz band channel.
The center frequencies of the channels start at 2,412 MHz and are spaced in 5 MHz units, and the number of channels used varies by country. IEEE 802.11n uses a bandwidth of 20 MHz; hence, one channel can affect four neighboring channels.
The owner of an AP can set the channel to be used between the AP and a station for communication. The channel used by the AP is constant until the owner of the AP changes the AP configuration. Therefore, an AP whose channel changes may be suspected of being an evil twin AP (this does not consider the automatic channel setting function used by APs to find and change to the optimal channel). Accordingly, we used the channel as an identifier for detecting the evil twin AP. The frame body of the beacon frame transmitted by the AP contains a field representing the channel used by the AP, and the feature extractor extracts this field value and uses it as the feature.
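The channel numbering in the 2.4 GHz band can be expressed as a short helper. The function name is illustrative; the mapping itself follows the 5 MHz spacing starting at 2,412 MHz, with channel 14 as the one exception.

```python
def center_frequency_mhz(channel):
    """Center frequency of a 2.4 GHz band channel (1 to 14), in MHz."""
    if 1 <= channel <= 13:
        return 2412 + 5 * (channel - 1)
    if channel == 14:
        return 2484  # channel 14 does not follow the 5 MHz spacing
    raise ValueError(f"not a 2.4 GHz channel: {channel}")


for ch in (1, 6, 11):
    print(ch, center_frequency_mhz(ch), "MHz")
```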
4.1.3 RSS
RSS refers to the strength of the wireless signal received by a sensor. The unit of RSS is dBm, a log-scale unit referenced to 1 mW. In other words, when the power of a received signal is 1 mW, RSS is 0 dBm, and a difference of 10 dB implies a tenfold difference in power. As the power levels of APs are low, RSS is mostly negative. The factors that significantly influence RSS are the transmission power of the transmitting side, the distance between the transmitting side and the receiving side, and the noise caused by the surrounding environment. However, transmission power does not change abruptly in the case of an AP, and the distance between the transmitting AP and the receiver is constant because an AP is not mobile. Therefore, a rapid change in RSS may be due to noise, but it may also indicate that an evil twin AP has been installed. For this reason, RSS is used as an additional identifier for detecting evil twin APs. The feature extractor extracts the average RSS value of the beacon frames collected during the unit time.
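The relation between power in milliwatts and RSS in dBm can be checked with a few lines of code. The helper names are illustrative, and averaging the dBm readings arithmetically is an assumption about how the average RSS is formed.

```python
import math


def mw_to_dbm(power_mw):
    """Convert power in milliwatts to dBm."""
    return 10 * math.log10(power_mw)


def dbm_to_mw(rss_dbm):
    """Convert dBm to power in milliwatts."""
    return 10 ** (rss_dbm / 10)


def mean_rss_dbm(samples_dbm):
    """Arithmetic mean of RSS readings in dBm (assumed averaging rule)."""
    return sum(samples_dbm) / len(samples_dbm)


print(mw_to_dbm(1.0))                 # 1 mW corresponds to 0 dBm
print(dbm_to_mw(-40) / dbm_to_mw(-50))  # a 10 dB gap is a tenfold power ratio
print(mean_rss_dbm([-50, -52, -54]))
```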
4.1.4 Duration
Duration is calculated based on various types of radio and IEEE 802.11 information, and the total frame duration is measured in microseconds. In this study, the duration is calculated using the following formula:
Duration is determined by the length of the beacon frame and the speed at which data are transmitted. The length of the beacon frame varies depending on the optional fields of the frame, as shown in Fig. 5. The optional fields of the beacon frame include information such as SSID, supported rates, encryption method, and vendor. The duration of a given AP does not change because the values of its optional fields do not change. That is, two different APs may accidentally have the same duration value, but one AP cannot have two duration values. Therefore, an AP whose duration changes may be suspected of being an evil twin AP, and the duration is used as an additional identifier for detecting an evil twin AP. The feature extractor extracts the duration values in the radio information field and uses them as features. In this study, we adjust the units to harmonize the duration with other features.
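The dependence of duration on frame length and data rate can be illustrated with a simplified airtime model. This is an assumption-laden sketch and not necessarily the formula used in this study: it adds a fixed preamble time to the time needed to send the frame bits, and the 192 µs default (the long preamble and header of 802.11b DSSS) as well as the frame lengths are example values.

```python
def frame_duration_us(frame_length_bytes, data_rate_mbps, preamble_us=192.0):
    """Approximate airtime: preamble plus payload bits over the data rate.

    1 Mb/s equals one bit per microsecond, so bits / Mb/s gives microseconds.
    """
    return preamble_us + frame_length_bytes * 8 / data_rate_mbps


# Two hypothetical beacons that differ only in the length of their
# optional fields produce different durations at the same data rate.
print(frame_duration_us(200, 1.0))
print(frame_duration_us(236, 1.0))
```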
Beacon frame structure of 802.11.
4.2 Learning Phase
The learning phase refers to the process of generating a training set by collecting and processing a beacon frame and learning an evil twin AP detection model using the training set.
A previously collected beacon frame set must be processed to create a training set. In this study, pre-collection time was set as 24 hours, and it was assumed that an evil twin AP is not installed during this time. The beacon frame set collected for 24 hours from N neighbor APs is processed as follows:
First, beacon frame sets for each AP are generated using the SSIDs of the beacon frames collected over 24 hours. The beacon frames collected for each AP are divided into 10-minute intervals to generate $$\mathrm{N} \times 144$$ subsets. Then, each subset is input to the feature extractor to create a matrix of size $$(\mathrm{N} \times 144) \times 5$$ that is then used as the training set.
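The split into per-AP, 10-minute subsets can be sketched as follows. The tuple layout of a beacon record, the function name, and the synthetic capture are hypothetical; the sketch only shows that 24 hours of data yield 144 subsets per AP.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # 10-minute interval


def split_into_windows(beacons, start_time_s, window_s=WINDOW_SECONDS):
    """Group beacon records by (SSID, interval index).

    Each record is assumed to be a tuple whose first two items are the
    SSID and the reception time in seconds.
    """
    subsets = defaultdict(list)
    for beacon in beacons:
        ssid, rx_time_s = beacon[0], beacon[1]
        index = int((rx_time_s - start_time_s) // window_s)
        subsets[(ssid, index)].append(beacon)
    return subsets


# Synthetic capture: two APs, one beacon per minute for 24 hours.
capture = [(ssid, t) for ssid in ("AP 0", "AP 1") for t in range(0, 86_400, 60)]
subsets = split_into_windows(capture, start_time_s=0)
print(len(subsets))  # number of (AP, interval) subsets
```

Each subset would then be passed to the feature extractor to produce one row of the training matrix.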
The performance of several classification algorithms in generating an evil twin AP detection model using the four features and labels was compared. The random forest algorithm was subsequently selected based on the results of the comparison. The details of the comparison are provided in Section 5.
4.3 Detection Phase
In the detection phase, a long pre-collection time is not necessary because the model has already been trained. In the learning phase, the data collected over 24 hours are divided into 10-minute intervals before the detection model is trained on them. Therefore, the beacon frames of a neighboring AP should be collected for at least 10 minutes to detect an evil twin AP in the detection phase. According to our experimental statistics, approximately 300–500 beacon frames are collected over 10 minutes. After collecting the beacon frames for 10 minutes, the feature extractor is called to extract the features for each AP. The extraction yields a vector of size $$1 \times 5$$. This vector is input into the trained classifier, which outputs the classification result and the probability of the input data. If the classification result and the SSID of the beacon frame broadcast by the corresponding AP are the same, and the probability is higher than the threshold, the AP is considered to be an authorized AP. If either condition is not satisfied, it is considered an evil twin AP.
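The decision rule at the end of the detection phase can be written compactly. This is a sketch under assumptions: the classifier is represented only by its per-class probability list, labels are integer indices into that list, and the threshold of 0.8 is a placeholder rather than a value from this study.

```python
def classify_ap(claimed_label, class_probabilities, threshold=0.8):
    """Apply the two-condition rule to one classifier output.

    claimed_label: label of the SSID announced in the beacon frames.
    class_probabilities: probability per authorized AP, indexed by label.
    """
    predicted_label = max(range(len(class_probabilities)),
                          key=class_probabilities.__getitem__)
    probability = class_probabilities[predicted_label]
    is_authorized = predicted_label == claimed_label and probability > threshold
    return "authorized" if is_authorized else "evil twin"


print(classify_ap(0, [0.95, 0.03, 0.02]))  # matches SSID, confident
print(classify_ap(0, [0.55, 0.30, 0.15]))  # matches SSID, below threshold
print(classify_ap(1, [0.95, 0.03, 0.02]))  # classified as a different AP
```

The second case mirrors an evil twin that resembles the AP it imitates: the classifier picks the imitated AP's label, but with a probability too low to pass the threshold.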
5. Experimental Evaluation
In this section, we describe the experiments in which the detection method proposed in Section 4 was used to detect an actual evil twin AP, and we compare its performance across various machine learning classification algorithms and experimental environments.
5.1 Experimental Environment and Procedure
Fig. 6 shows the experimental environment. The experiment was conducted on 10 APs on a university campus. A wireless NIC using Ralink’s chipset was used to collect the wireless signal. The wireless NIC was connected to a computer running Kali Linux and the wireless signal collection was carried out via Wireshark.
In the experiment, beacon frames were collected over a period of 24 hours for nine authenticated APs, and a training set was created using the feature extractor. Then, the AP most similar to AP 0 was designated the evil twin AP of AP 0, and beacon frames were collected again over a period of 24 hours for accurate measurement. We created a test set using the feature extractor and measured the performance of the classification algorithms using the training set and the test set.
Experimental environment.
5.2 Generation of Training Set
A total of 850,886 beacon frames were collected over 24 hours around the 10 APs to generate a training set. The distribution of the beacon frames collected for different APs is shown in Fig. 7.
In the case of AP 0, more beacon frames were collected compared to the other APs owing to its location advantage. Approximately 50,000 to 100,000 beacon frames were collected for each AP except AP 0. After separating the beacon frames collected over 24 hours according to APs and dividing them into 10-minute intervals, we generated 144 learning datasets for each AP using the feature extractor. Table 1 shows the training set used for feature extraction. The analysis of the training set shows that the error in clock skew was approximately 7 ppm. Additionally, AP 0 and AP 4 had extremely similar clock skew ranges, channels, and durations but different RSS.
Distribution of beacon frames per AP.
Average of the features of the training set
5.3 Generation of Test Set
The proposed method requires data collected over a period of 10 minutes to detect evil twin APs, but this test required a large amount of test data because it aims to measure the evil twin AP detection accuracy of the algorithm. Therefore, we collected beacon frames for 24 hours and divided them into 10-minute intervals to create the test set. Table 2 shows the average of the test set features.
In the collected test set, clock skew and RSS, which are sensitive to the surrounding environment, differed slightly between the training set and the test set. The beacon frames of AP 0 and evil twin AP 0 could be separated using the threshold-based algorithm of [14].
Average of the features of the test set
5.4 Evil Twin AP Detection Results
We input the 1,440 collected test samples into the random forest model trained on the 1,440 training samples and compared the outputs. Table 3 shows the experimental results. The random forest model correctly classified all 144 samples from each of AP 0 to AP 8. Because the model was never trained on evil twin AP data, it assigned these samples to AP 0; however, they were still identified as an evil twin AP because the probability did not exceed the threshold.
Evil twin AP detection results
5.5 Performance Analysis of Classification Algorithms
To obtain more objective results, two beacon frames were collected every 2 hours for 2 days, and then two additional test sets were generated. We compared the performance of classification algorithms for evil twin AP detection using a total of three test sets. The features used were clock skew, RSS, channel, and duration; the algorithms compared were logistic regression, naïve Bayes, k-NN, SVM, and random forest. The threshold of each algorithm was set to the value that yielded the highest accuracy over several experiments. Table 4 shows the detection accuracy of each algorithm according to the test set.
Accuracy (%) of evil twin AP Detection for each algorithm according to test set
Although accuracy differed between algorithms, their ranking by accuracy did not change across the test sets, and each algorithm showed high accuracy on the No. 1 test set. Therefore, we measured the change in accuracy according to the number of features of each algorithm using the first test set. Table 5 shows the accuracy of each algorithm according to the number of features.
Evil twin AP detection accuracy (%) of each algorithm according to number of features
The random forest algorithm showed the highest accuracy when all four features were used. In the last experiment, we fixed the four features and changed the number of classes of the training data to two, four, and nine. Further, the number of classes of the test data was changed to three, five, and 10 by adding a rogue AP 0 to the classes of the training data. Table 6 shows the accuracy of each algorithm according to the number of classes. The experimental results show that the fewer the classes, the higher the accuracy of the random forest. When the number of classes is three or five, the maximum detection accuracy of the evil twin AP is 100%.
Accuracy (%) of evil twin AP detection for each algorithm according to the number of classes
5.6 Comparison with Previous Methods
Table 7 compares the method proposed in this paper to previous methods. Because the proposed method uses multiple features, [18] and [19] were selected for comparison as they are similar. In the case of [18], a total of eight features are extracted from the beacon frame to generate a white list, and evil twin APs are detected by similarity comparison with 100% accuracy, whereas the proposed method uses four features. The proposed method also uses clock skew, RSS, and channel, but adds a new feature called duration and, because it relies on machine learning rather than similarity comparison, is less susceptible to errors. On the other hand, [19] differs from the proposed method in that two features, RTT and PIAT, are obtained using ICMP packets; the number of packets required for detection is similar to that of the proposed method. In [19], evil twin AP detection is performed using SVM, a machine learning classification algorithm. In terms of performance, [19] showed 93.4% accuracy and the proposed method achieved 100% accuracy in an environment with 10 APs by changing various variables.
Comparison of previous methods and the proposed method
6. Conclusion
This paper proposed an evil twin AP detection method using machine learning for accurate evil twin AP detection and compared the performance of classification algorithms for the evil twin AP detection process. The proposed method estimates the environment-sensitive clock skew using linear regression and adds channel, RSS, and duration as features to improve accuracy. Moreover, we analyzed the performance of the classification algorithms using four features and obtained 100% evil twin AP detection accuracy via the random forest algorithm. The evil twin AP 0 used in the experiment had a clock skew difference of only 1–2 ppm from AP 0 because its clock was synchronized with that of AP 0. However, repeated experiments did not show perfect classification accuracy under various environmental changes, and there was a false positive rate of about 1%. While the existing evil twin AP detection methods use only features such as clock skew, RSS, RTT, and channel, in this study we added a feature called duration. Furthermore, the experiments confirmed that it is more effective to use a combination of these features than to use them independently; in particular, adding duration increased accuracy by approximately 10% compared with the model that did not use it. Additionally, the experiment on the relationship between the number of APs authorized by the administrator and the accuracy showed that at least the same level of accuracy is achieved as the number of authorized APs decreases. However, the accuracy was not confirmed in an environment where more than 10 APs are operated. In the future, experiments will be conducted in an environment with more than 10 APs.
Finally, the proposed method can detect evil twin APs by collecting radio signals for a short time. Therefore, when an AP is installed in the enterprise or institution that is not authorized by the network administrator, it can be used to quickly detect such an AP. This study is expected to contribute to providing a secure wireless network environment by integrating with the existing security solutions.
In the future, research will be conducted on an evil twin AP detection system with real-time enhancement through wireless signal collection libraries such as Scapy and Pyshark, with blocking and location tracking technology on the detected evil twin AP.
Acknowledgement
This work was supported by the research fund of Chungnam National University.