A REVIEW OF ATTACKS AND ITS DETECTION ATTRIBUTES ON COLLABORATIVE RECOMMENDER SYSTEMS

Abstract: Today, a vast amount of information is available on the Internet, but filtering the required information out of this overload is very difficult. “Recommender Systems” emerged as a solution to this problem: they predict outcomes according to a user’s interests. Although Recommender Systems are effective and useful, the most widely used type, the Collaborative Filtering Recommender System, suffers from shilling (profile injection) attacks, in which fake profiles are inserted into the database in order to bias its output. This paper discusses the various attacks that can affect Recommender Systems and the attributes that are used to detect these attacks.


INTRODUCTION
We are living in an era of information overload: we receive more information than we actually want, and much of it is not relevant to what we were looking for. One tool developed to tackle this problem is the Recommender System. Recommender Systems [4] filter the information required by a user out of the vast amount available, using certain characteristics, and are therefore very helpful in overcoming information overload. Recommender Systems can broadly be categorized as Content-based [12], Collaborative [3,13,14] and Hybrid [2] Recommender Systems (Table 1 gives an overview of the different recommender system techniques). Collaborative Recommender Systems are helpful in many ways, but their natural openness leaves them prone to shilling, or profile injection, attacks. In these attacks, malicious user profiles are inserted into the existing dataset in order to influence the output of the Recommender System. Such attacks are mostly mounted by product sellers or developers who aim to promote their own product or demote a competitor's product. Based on different assumptions, attack models can be divided into categories such as push [16] or nuke [16] attacks and standard [3] or obfuscated [15] attacks, which are discussed in detail below. In this paper we present a review of the different attacks on Collaborative Recommender Systems and the attributes used for their detection. The paper is organized as follows: Section 2 describes the different attack types, Section 3 describes various detection attributes, and Section 4 concludes the paper along with possible future scope.

ATTACK PROFILES AND ATTACK MODELS
With the advancement of recommender systems, various techniques have been employed to influence their output in order to promote or demote a particular product. These attacks, known as profile injection or shilling attacks [19], are observed particularly in Collaborative Filtering based Recommender Systems: malicious users insert fake profiles into the rating database in order to bias the system's output. The general profiles of a true user and a fake user are characterized below:
Profile of a True User:
From the above description of trusted and fake user profiles, it is clear that to attack a recommender system, an attack profile needs to be designed to be as statistically similar to a genuine profile as possible. Attacks therefore differ in how the attacker selects ratings for the target, selected and filler items. Figure 1 gives an overview of the different types of attacks.

Figure 1 Various Attacks on Recommender Systems
Some of these attacks are described below. For each attack, $i_s$, $i_f$ and $i_t$ denote the ratings assigned to the selected, filler and target items, respectively.
1. Random Attack: In Random Attacks, attack profiles are generated such that their ratings, except for the target item, are chosen randomly based on the overall distribution of user ratings in the database. It is very simple to implement but has limited effectiveness ($i_s = 0$, $i_f$ = random, $i_t$ = maximum).
2. Average Attack: In Average Attacks, attack profiles are generated such that the rating for each filler item is the mean rating for that item across all the users in the database. It is a very effective attack, but it requires prior knowledge about the system ($i_s = 0$, $i_f$ = average, $i_t$ = maximum).
3. Segment Attack: The Segment Attack targets a specific group of users who may already be interested in the target item. In other words, it increases recommendations of a target product to a certain group of users ($i_s$ = maximum, $i_f$ = minimum, $i_t$ = maximum).
4. Bandwagon or Popular Attack: In Bandwagon Attacks, profiles are generated such that, besides giving high ratings to the target items, they contain only high values for the selected items and random values for some filler items ($i_s$ = maximum, $i_t$ = maximum, $i_f$ = random/average).
5. Reverse-Bandwagon Attack: Reverse Bandwagon is a variant of the Bandwagon Attack in which low ratings, rather than high ratings, are given to the target and selected items ($i_s$ = minimum, $i_t$ = minimum, $i_f$ = random/average).
All of the attacks discussed above are Standard Attacks [3,15,16,19,20]. The descriptions above repeatedly use the term filler items [3]: these are the additional items, besides the target and selected items, that an attack profile rates. The filler size is the ratio between the number of items rated by a user and the total number of items in the dataset. Obfuscated Attacks [10,15,19] are discussed next.
6. User Shifting: In this type of attack, all ratings for a subset of items in each attack profile are incremented or decremented by a constant amount so as to reduce the similarity between attack profiles.
7. Mixed Attack: In a Mixed Attack, the same target item is attacked, but the attack profiles are produced by different attack models.
8. Noise Injection: This type of attack is carried out by adding noise to the ratings, drawn from a standard normal distribution and multiplied by a constant, β, which governs the amount of noise added. The added noise can be used to affect the generated output.
9. Average over Popular Attack (AoP): The AoP attack [15] was designed to obfuscate the average attack by choosing filler items with equal probability from the top x% of most popular items rather than from the whole database.
In addition to the categories above, attacks can also be classified as push [16] or nuke [16] attacks. In push attacks, high ratings are given to the target items so as to promote a product, while in nuke attacks, low ratings are given to the target items so as to demote a product. Table 2 gives an overview of different attributes of certain attack models.
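The attack models above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not a reference implementation: the 1 to 5 rating scale, the `{user: {item: rating}}` layout, and the names `attack_profile`, `noise_injection` and `user_shift` are all hypothetical. Random filler ratings are drawn from a normal distribution fitted to all ratings in the database, and noise injection leaves the target rating untouched; both are assumptions.

```python
import random
from statistics import mean, pstdev

R_MIN, R_MAX = 1.0, 5.0  # assumed rating scale


def clip(r):
    """Keep a rating inside the assumed scale."""
    return max(R_MIN, min(R_MAX, r))


def attack_profile(ratings, items, target, kind, selected=(),
                   filler_size=0.3, push=True, rng=random):
    """Build one fake profile {item: rating} for a random, average or
    bandwagon attack. push=False turns it into the nuke variant."""
    if kind not in ("random", "average", "bandwagon"):
        raise ValueError(f"unknown attack kind: {kind}")
    # Filler items: a random subset of everything except target and selected.
    candidates = [i for i in items if i != target and i not in selected]
    n_filler = min(len(candidates), round(filler_size * len(items)))
    filler = rng.sample(candidates, n_filler)

    all_ratings = [r for p in ratings.values() for r in p.values()]
    mu, sigma = mean(all_ratings), pstdev(all_ratings)

    profile = {}
    for i in filler:
        seen = [p[i] for p in ratings.values() if i in p]
        if kind == "average" and seen:
            profile[i] = mean(seen)                  # i_f = item average
        else:
            profile[i] = clip(rng.gauss(mu, sigma))  # i_f = random draw
    extreme = R_MAX if push else R_MIN
    if kind == "bandwagon":
        for s in selected:
            profile[s] = extreme                     # i_s = maximum (or minimum)
    profile[target] = extreme                        # i_t = maximum (or minimum)
    return profile


def noise_injection(profile, target, beta=0.2, rng=random):
    """Add beta * N(0, 1) noise to every rating except the target's."""
    return {i: r if i == target else clip(r + beta * rng.gauss(0.0, 1.0))
            for i, r in profile.items()}


def user_shift(profile, subset, shift):
    """Shift the ratings of the items in `subset` by a constant amount."""
    return {i: clip(r + shift) if i in subset else r
            for i, r in profile.items()}
```

Calling `attack_profile(..., kind="bandwagon", push=False)` gives the reverse-bandwagon pattern, since the selected and target items then receive the minimum rating.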

DETECTION ATTRIBUTES
Detection attributes are descriptive statistics that capture the major characteristics that make an attacker's profile look different from a genuine user's profile. They fall into two categories: generic attributes [18,20] and type-specific attributes [18,20]. Table 3 gives an overview of a few of these attributes.

Generic attributes:
These attributes can be used for almost all attack types and are not specific to any particular one.
1. Rating Deviation from Mean Agreement (RDMA): RDMA [11] can identify attackers by analysing the profile's average deviation per item or user. It is defined as:
$$\mathrm{RDMA}_x = \frac{\sum_{i=1}^{T_x} \frac{|r_{x,i} - \bar{r}_i|}{R_{x,i}}}{T_x}$$
where $T_x$ is the number of items user x rated, $r_{x,i}$ is the rating given by user x to item i, $\bar{r}_i$ is the average rating of item i, and $R_{x,i}$ is the number of ratings provided for item i by all users.
2. Weighted Degree of Agreement (WDA): WDA [5] is calculated as the numerator of RDMA:
$$\mathrm{WDA}_x = \sum_{i=1}^{T_x} \frac{|r_{x,i} - \bar{r}_i|}{R_{x,i}}$$
where $T_x$ is the number of items user x rated, $r_{x,i}$ is the rating given by user x to item i, $\bar{r}_i$ is the average rating of item i, and $R_{x,i}$ is the number of ratings provided for item i by all users.
3. Weighted Deviation from Mean Agreement (WDMA): WDMA [8] can help identify anomalies by placing a higher weight on rating deviations for sparse items:
$$\mathrm{WDMA}_x = \frac{\sum_{i=1}^{T_x} \frac{|r_{x,i} - \bar{r}_i|}{R_{x,i}^2}}{T_x}$$
where $T_x$ is the number of items user x rated, $r_{x,i}$ is the rating given by user x to item i, $\bar{r}_i$ is the average rating of item i, and $R_{x,i}$ is the number of ratings provided for item i by all users.
4. Length Variance (LengthVar): LengthVar [5] captures how much the length of a given profile varies from the average profile length in the dataset. It is particularly effective in detecting attacks with large filler sizes.

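$$\mathrm{LengthVar}_j = \frac{|\#score_j - \overline{\#score}|}{\sum_{k=1}^{N} (\#score_k - \overline{\#score})^2}$$
where $\#score_j$ is the total number of ratings in the system for user j, $\overline{\#score}$ is the average number of ratings per user, and N is the total number of users in the system.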
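The generic attributes described so far can be computed along the following lines. This is a hedged sketch rather than the cited authors' code. It assumes the commonly cited forms: WDA sums |rating - item mean| divided by the item's rating count over the items a user rated, RDMA divides that sum by the number of items the user rated, WDMA squares the rating count before averaging, and LengthVar divides a profile's absolute deviation from the average profile length by the summed squared length deviations of all users. The `{user: {item: rating}}` layout and all function names are hypothetical.

```python
from statistics import mean


def item_stats(ratings):
    """Per-item average rating and rating count from {user: {item: rating}}."""
    by_item = {}
    for profile in ratings.values():
        for item, r in profile.items():
            by_item.setdefault(item, []).append(r)
    avg = {i: mean(rs) for i, rs in by_item.items()}
    count = {i: len(rs) for i, rs in by_item.items()}
    return avg, count


def wda(ratings, user, power=1):
    """Sum over the user's items of |r - item mean| / (rating count ** power)."""
    avg, count = item_stats(ratings)
    return sum(abs(r - avg[i]) / count[i] ** power
               for i, r in ratings[user].items())


def rdma(ratings, user):
    """WDA averaged over the number of items the user rated."""
    return wda(ratings, user) / len(ratings[user])


def wdma(ratings, user):
    """Like RDMA, but deviations on sparsely rated items weigh more."""
    return wda(ratings, user, power=2) / len(ratings[user])


def length_var(ratings, user):
    """Deviation of the profile length from the average profile length."""
    lengths = {u: len(p) for u, p in ratings.items()}
    avg_len = mean(list(lengths.values()))
    spread = sum((n - avg_len) ** 2 for n in lengths.values())
    # Undefined (division by zero) if every profile has the same length.
    return abs(lengths[user] - avg_len) / spread
```

For example, with `ratings = {"a": {"x": 4, "y": 2}, "b": {"x": 2, "y": 2}, "c": {"x": 3}}`, user `a` deviates from the mean only on item `x`, so `wda(ratings, "a")` is 1/3 and `rdma(ratings, "a")` is 1/6.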

5. Degree of Similarity with Top Neighbours (DegSim): DegSim [9] captures the average similarity between a profile and its nearest neighbours.

$$\mathrm{DegSim}_i = \frac{\sum_{j=1}^{x} Z_{i,j}}{x}$$
where $Z_{i,j}$ is the Pearson correlation between users i and j, the sum runs over the x nearest neighbours of user i, and x is the number of neighbours. There are other generic attributes as well, including the $H_v$-score, TWDMA (calculated by incorporating trust into RDMA) [17] and UnRAP (unsupervised retrieval of attack profiles) [6,7].
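DegSim can be sketched in a few lines of Python. The sketch assumes that the Pearson correlation is computed over the items two users have both rated, with means taken over those co-rated items, and that pairs with fewer than two co-rated items get a similarity of 0. These choices, the `{user: {item: rating}}` layout, and the names `pearson` and `deg_sim` are assumptions for illustration only.

```python
from math import sqrt


def pearson(p, q):
    """Pearson correlation of two profiles over their co-rated items."""
    common = [i for i in p if i in q]
    if len(common) < 2:
        return 0.0
    mp = sum(p[i] for i in common) / len(common)
    mq = sum(q[i] for i in common) / len(common)
    num = sum((p[i] - mp) * (q[i] - mq) for i in common)
    den = sqrt(sum((p[i] - mp) ** 2 for i in common) *
               sum((q[i] - mq) ** 2 for i in common))
    return num / den if den else 0.0


def deg_sim(ratings, user, x=2):
    """Average similarity between `user` and its x most similar users."""
    sims = sorted((pearson(ratings[user], p)
                   for u, p in ratings.items() if u != user),
                  reverse=True)
    top = sims[:x]
    return sum(top) / len(top) if top else 0.0
```

Attack profiles generated from the same model tend to correlate strongly with one another, so a group of injected profiles would be expected to show unusually high `deg_sim` values compared with genuine users.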

Type-Specific Attributes:
These attributes are designed for specific attack types: some target the average attack, some the random attack, and so on.
1. Filler Mean Variance (FMV) [7]: It is generally used for the average attack and is defined as follows:
$$\mathrm{FMV}_x = \frac{\sum_{i \in L_f} (r_{x,i} - \bar{r}_i)^2}{|L_f|}$$
where $L_f$ is the filler item set, $r_{x,i}$ is the rating given by user x to item i, and $\bar{r}_i$ is the average of the ratings assigned to item i.
2. Filler Mean Target Difference (FMTD) [7]: It is generally used for the segment and bandwagon attacks and is defined as follows:
$$\mathrm{FMTD}_x = \left| \frac{\sum_{i \in L_s} r_{x,i}}{|L_s|} - \frac{\sum_{i \in L_f} r_{x,i}}{|L_f|} \right|$$
where $L_s$ is the selected item set, $L_f$ is the filler item set, and $r_{x,i}$ is the rating given by user x to item i.
3. Mean Variance (MeanVar): MeanVar [5] is generally used for identifying the average attack and is defined as follows:
$$\mathrm{MeanVar}(r_{target}, j) = \frac{\sum_{i \in (P_j - r_{target})} (r_{i,j} - \bar{r}_i)^2}{|P_j - r_{target}|}$$
where $P_j$ is the profile of user j, $r_{target}$ is the hypothesized target item, $P_j - r_{target}$ is the profile without the hypothesized target item, $r_{i,j}$ is the rating user j has given item i, and $\bar{r}_i$ is the mean rating of item i across all users.
Other type-specific attributes include FMD [5], FAC [5] and profile variance.
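The type-specific attributes can be sketched in the same way. The code assumes the usual forms: FMV is the mean squared deviation of the filler ratings from the item means, FMTD is the absolute difference between the mean selected-item rating and the mean filler rating, and MeanVar is the mean squared deviation over every rated item except the hypothesized target. In practice a detector does not know the true partition of a profile, so the `selected`, `filler` and `target` arguments stand for a hypothesized partition. All names here are hypothetical.

```python
from statistics import mean


def item_means(ratings):
    """Average rating of each item from {user: {item: rating}}."""
    by_item = {}
    for profile in ratings.values():
        for item, r in profile.items():
            by_item.setdefault(item, []).append(r)
    return {i: mean(rs) for i, rs in by_item.items()}


def fmv(profile, avg, filler):
    """Mean squared deviation of the filler ratings from the item means."""
    return mean((profile[i] - avg[i]) ** 2 for i in filler)


def fmtd(profile, selected, filler):
    """|mean rating of selected items - mean rating of filler items|."""
    return abs(mean(profile[i] for i in selected) -
               mean(profile[i] for i in filler))


def mean_var(profile, avg, target):
    """Mean squared deviation over all rated items except the target."""
    rest = [i for i in profile if i != target]
    return mean((profile[i] - avg[i]) ** 2 for i in rest)
```

An average-attack profile copies the item means into its filler items, so its `fmv` and `mean_var` values would be expected to sit close to zero, while a segment or bandwagon profile with extreme selected ratings would tend to produce a large `fmtd`.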

CONCLUSION AND FUTURE WORK
Shilling attacks are a major concern in the field of Recommender Systems. To maintain the trustworthiness of these systems, we need either to design Recommender Systems that are resistant to such attacks or to design algorithms that can detect attacks easily and effectively. Furthermore, future work should also aim at developing detection attributes for obfuscated attacks.