CROSS DOMAIN COLLABORATIVE FILTERING RECOMMENDER USING PROBABILISTIC MATRIX FACTORIZATION

Abstract: Recommender systems (RS) are aimed at users who lack the experience needed to evaluate the overwhelming number of alternatives that a system may offer. Collaborative Filtering (CF) is one approach that provides recommendations without taking into account the contents of the items being recommended; however, it faces several challenges such as cold start, data sparsity and low confidence. Of late there has been considerable interest in Cross Domain RS, which exploit knowledge from auxiliary domains containing additional user preference data to improve recommendations in a target domain. This paper focuses on the Probabilistic Matrix Factorization (PMF) model in a Cross Domain Recommender (CDR), which outperforms other model based Collaborative Filtering recommenders. Experiments conducted on benchmark datasets show a significant improvement in the quality of the CDR over various test sets.


I. INTRODUCTION
Recommender systems (RS) are designed to find items of interest, narrow down the set of choices and help the user explore the space of options. The suggestions made by a recommender system provide an additional and often unique personalized service for the customer. RS filter information and suggest items to users, where "item" is a generic term for whatever the system recommends to users, typically presented as a ranked list. Recommender systems come in various types depending on the methodology and information used in the process: Content Based Filtering (CB) [1], Collaborative Filtering (CF) [2] and Hybrid Recommender (HR) [3]. CB uses information about the user or item to personalize the suggestions. CF methods predict the rating a particular user would give to an item from that user's past preferences and interests, together with the rating information provided by similar users and for similar items. Consider, for example, the matrix X shown in figure 1: the rows represent users and the columns represent items, non-zero entries are known ratings and the remaining entries are missing ratings. There are two approaches to CF: memory based algorithms and model based algorithms. RS can also be classified by the number of domains involved, as Single Domain RS (SDR) and Multi Domain RS (MDR) [4,5]: SDR takes input from only one domain, whereas MDR draws on more than one domain to rank items. The main disadvantage of single domain RS is data sparsity: CF suffers from a lack of available data, which makes prediction within a single domain difficult [6,7]. Sparsity arises when many new users or new items are added to the system, or when users provide very few ratings. The related cold start problem occurs when the system is unable to recommend items to a newly added user.

Figure 1: User/Item rating matrix X
If the input information for an RS comes from one domain and the items are ranked in another domain, the RS is called a Cross Domain RS (CDR) [8]. The data sparsity problem can be avoided with CDR by extracting the rating information available in a dense domain and transferring it to the target domain, where it is used to make recommendations [9]. CDR also avoids the cold user problem: items are recommended to new users by finding similar users in other domains [10].
There are two types of CDR approaches, linking/aggregating knowledge and sharing/transferring knowledge, distinguished by how the knowledge is obtained from the source domain and how it is transferred to the target domain, as shown in figure 2. The former links the source and target domains by merging user preferences from both domains and prepares recommendations accordingly, whereas the latter shares the latent features or rating patterns of the source domain with the target domain, where the recommendations are prepared. In order to generate recommendations with CDR, a high-quality recommender is desirable [11], and CF with model based techniques has proved to give good-quality recommendations. In this paper the main focus is on matrix factorization models in CDR, where the Probabilistic Matrix Factorization (PMF) model outperforms other model based CF methods. Experiments carried out on benchmark data sets show a significant improvement in recommendation quality for PMF when compared with other models.
The rest of the paper is organized as follows. Related work is presented in the next section, followed by the problem formulation of PMF. The results and analysis section then describes the experiments, followed by the conclusion.

II. RELATED WORK
This section reviews matrix factorization (MF) methods used in recommender systems and techniques for cross domain recommender systems [12]. The idea behind MF is to predict the rating for any unobserved user-item pair using latent features identified from the user-item matrix [13]. Given a partially observed rating matrix $Y \in \mathbb{R}^{m \times n}$, the aim is to find two matrices $U \in \mathbb{R}^{m \times p}$ and $V \in \mathbb{R}^{p \times n}$ such that $Y \approx UV$, where p is the number of latent features. When p is much smaller than both m and n, the factorization allows the matrix to be stored inexpensively and to be multiplied by vectors or other matrices rapidly. $X = UV$ is called a low-rank approximation of Y. Several general purpose matrix factorization techniques have been proposed in the literature, such as Non-negative Matrix Factorization (NMF), Regularized Matrix Factorization (RMF), Positive Matrix Factorization (PoMF) and Maximum Margin Matrix Factorization (MMMF) [14,15]. NMF restricts the individual elements of the factors U and V to be non-negative (instead of introducing a regularization constraint) and seeks non-negative low-rank matrices U and V that minimize the reconstruction loss between Y and UV. The objective of RMF is to determine a pair of factor matrices U and V such that the element-wise aggregated squared error over the observed values is minimized; a regularization constraint is added to restrict the domains of U and V and prevent overfitting. In PoMF, the problem of non-optimal scaling is explicitly addressed. In order to scale the data properly, it is necessary to treat the problem explicitly as a least-squares problem. To begin this analysis, the elements $e_{ij}$ of the "residual matrix" E are defined in terms of the latent factors U and V as $e_{ij} = y_{ij} - \sum_{k=1}^{p} u_{ik} v_{kj}$. Initially the problem was solved iteratively using alternating least squares [Paatero and Tapper, 1993].
In alternating least squares, one of the matrices, U or V, is taken as known and the chi-squared objective Q is minimized with respect to the other matrix as a weighted linear least-squares problem. The roles of U and V are then reversed, so that the matrix that has just been calculated is held fixed and the other is calculated by minimizing Q. The process continues until convergence; however, it can be slow. MMMF is a method in which, in addition to the latent factor matrices U and V, $R-1$ thresholds $\theta_{ia}$ ($1 \le a \le R-1$) are learned for each user i, where R here denotes the number of rating levels. The prediction for item j by user i is obtained by comparing the real-valued prediction $U_i V_j$ against these $R-1$ thresholds to generate a discrete rating value in $\{1, \ldots, R\}$. Each user has a different set of thresholds, and the objective of MMMF is to minimize its objective function with respect to U and V [16].
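The thresholding step of MMMF described above can be illustrated with a minimal sketch. The factor values and thresholds below are made up for illustration, the function names are hypothetical, and treating a score equal to a threshold as having passed it is an assumption rather than part of the method as cited.

```python
from bisect import bisect_right


def mmmf_rating(score, thresholds):
    """Map a real-valued score to a discrete rating in {1, ..., R}.

    `thresholds` holds the R-1 ascending thresholds of one user; the rating
    is one plus the number of thresholds that the score reaches or exceeds.
    """
    return 1 + bisect_right(thresholds, score)


def predict(u_i, v_j, thresholds):
    """Discretize the inner product of the user and item factor vectors."""
    score = sum(a * b for a, b in zip(u_i, v_j))
    return mmmf_rating(score, thresholds)


# Hypothetical learned values for one user on a 1-5 scale (R = 5).
user_factors = [0.8, -0.2]
item_factors = [1.5, 0.5]
user_thresholds = [-1.0, 0.0, 1.0, 2.0]  # R - 1 = 4 thresholds

print(predict(user_factors, item_factors, user_thresholds))
```

Because each user has a separate threshold list, the same real-valued score can map to different discrete ratings for different users.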

III. PROBABILISTIC MATRIX FACTORIZATION
For a rating matrix R of size $N \times M$ (where every row represents a user and every column represents a movie), PMF derives an $N \times D$ matrix $U^T$ and a $D \times M$ matrix V such that $R \approx U^T V$ [17]. The matrix V holds the latent factors of the items, and the matrix $U^T$ gives the coefficients of these factors for every user.
Given the preference matrix with entries $R_{ij}$, we seek the factorization that minimizes the root mean squared error (RMSE) on the test set. One approach is to use a linear model in which the data are assumed to contain Gaussian noise. Define $I_{ij}$ to be 1 if $R_{ij}$ is known (i.e. user i has rated item j) and 0 otherwise. Maximizing the posterior of this model is equivalent to minimizing a regularized sum-of-squared-errors function E over the observed entries; if all ratings are observed and the regularization vanishes, E reduces to the SVD objective function.
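For reference, the linear-Gaussian model described above is usually written as follows in the PMF literature. This is a sketch of the standard formulation rather than a reproduction of the original equations; the noise variance $\sigma^2$, the prior variances $\sigma_U^2$ and $\sigma_V^2$, and the regularization weights $\lambda_U$ and $\lambda_V$ are symbols assumed here, and $\mathbf{I}$ denotes the identity matrix (not the indicator $I_{ij}$).

```latex
% Likelihood of the observed ratings under Gaussian noise
p(R \mid U, V, \sigma^2) = \prod_{i=1}^{N} \prod_{j=1}^{M}
  \left[ \mathcal{N}\!\left(R_{ij} \mid U_i^{T} V_j,\ \sigma^2\right) \right]^{I_{ij}}

% Zero-mean spherical Gaussian priors on the user and item feature vectors
p(U \mid \sigma_U^2) = \prod_{i=1}^{N} \mathcal{N}\!\left(U_i \mid 0,\ \sigma_U^2 \mathbf{I}\right),
\qquad
p(V \mid \sigma_V^2) = \prod_{j=1}^{M} \mathcal{N}\!\left(V_j \mid 0,\ \sigma_V^2 \mathbf{I}\right)

% Maximizing the log-posterior is equivalent to minimizing
E = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{M} I_{ij} \left(R_{ij} - U_i^{T} V_j\right)^2
  + \frac{\lambda_U}{2} \sum_{i=1}^{N} \lVert U_i \rVert^2
  + \frac{\lambda_V}{2} \sum_{j=1}^{M} \lVert V_j \rVert^2,
\qquad
\lambda_U = \frac{\sigma^2}{\sigma_U^2},\quad \lambda_V = \frac{\sigma^2}{\sigma_V^2}
```

With every $I_{ij} = 1$ and the prior variances growing without bound, $\lambda_U$ and $\lambda_V$ tend to zero and E becomes the plain squared reconstruction error, which is the SVD objective.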

Bayesian Probabilistic Matrix Factorization (BPMF)
BPMF is an efficient variant of the PMF method in which the prior distributions over the user and item feature vectors are assumed to be Gaussian [18].

Gaussian-Wishart priors are further placed on the user and item hyperparameters, where $\mathcal{W}$ is the Wishart distribution with $\nu_0$ degrees of freedom, $W_0$ is a $D \times D$ scale matrix and C is the normalizing constant [19,20]. Collaborative filtering techniques have proved to be efficient in cross domain recommenders [21].
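For reference, the Gaussian priors and Gaussian-Wishart hyperpriors of BPMF mentioned above are usually written as follows in the BPMF literature. This is a sketch of the standard formulation, not a reproduction of the original equations; the means $\mu_U, \mu_V$, the precision matrices $\Lambda_U, \Lambda_V$, the hyperparameter sets $\Theta_U = \{\mu_U, \Lambda_U\}$, $\Theta_V = \{\mu_V, \Lambda_V\}$, $\Theta_0 = \{\mu_0, \nu_0, W_0\}$ and the scalar $\beta_0$ are symbols assumed here.

```latex
% Gaussian priors over the user and item feature vectors
p(U \mid \mu_U, \Lambda_U) = \prod_{i=1}^{N} \mathcal{N}\!\left(U_i \mid \mu_U,\ \Lambda_U^{-1}\right),
\qquad
p(V \mid \mu_V, \Lambda_V) = \prod_{j=1}^{M} \mathcal{N}\!\left(V_j \mid \mu_V,\ \Lambda_V^{-1}\right)

% Gaussian-Wishart priors on the hyperparameters
p(\Theta_U \mid \Theta_0) = \mathcal{N}\!\left(\mu_U \mid \mu_0,\ (\beta_0 \Lambda_U)^{-1}\right)
  \mathcal{W}\!\left(\Lambda_U \mid W_0, \nu_0\right),
\qquad
p(\Theta_V \mid \Theta_0) = \mathcal{N}\!\left(\mu_V \mid \mu_0,\ (\beta_0 \Lambda_V)^{-1}\right)
  \mathcal{W}\!\left(\Lambda_V \mid W_0, \nu_0\right)

% Wishart density with \nu_0 degrees of freedom and D x D scale matrix W_0
\mathcal{W}\!\left(\Lambda \mid W_0, \nu_0\right) = \frac{1}{C}\,
  \lvert \Lambda \rvert^{(\nu_0 - D - 1)/2}
  \exp\!\left(-\tfrac{1}{2} \operatorname{Tr}\!\left(W_0^{-1} \Lambda\right)\right)
```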

CDR-PMF
In probabilistic matrix factorization, the matrices U and V are initialized from a standard normal distribution, with zero mean and unit standard deviation [22]. The error is the difference between the actual rating and the predicted rating. The main goal of PMF is to minimize this error, for which the batch gradient descent algorithm is used. Batch gradient descent is an optimization algorithm that searches the parameter space, for example the intercept ($\theta_0$) and slope ($\theta_1$) in linear regression, according to the rule $\theta_j := \theta_j - \alpha \, \partial J(\theta) / \partial \theta_j$, where $\alpha$ is the learning rate, a free parameter, and $J(\theta)$ is the least-squares cost function defined as $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$, where m is the total number of training examples, $h_\theta(x^{(i)})$ is the hypothesis function, $y^{(i)}$ is the corresponding target value and the superscript (i) denotes the ith sample.
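As an illustration of how batch gradient descent can be applied to the PMF error, the following is a minimal dependency-free sketch, not the implementation used in the experiments. The regularization weight `lam`, the learning rate, the number of epochs and the toy ratings are all assumptions, and the function names are hypothetical. Here each row of `U` is one user's feature vector, so `U` plays the role of $U^T$ in the text.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_pmf(ratings, n_users, n_items, d=2, alpha=0.005, lam=0.05,
              epochs=2000, seed=0):
    """Batch gradient descent on the regularized squared error.

    ratings: list of (user, item, rating) triples for the observed entries.
    Returns U (n_users x d) and V (n_items x d) with R[i][j] ~ dot(U[i], V[j]).
    """
    rng = random.Random(seed)
    # Standard-normal initialization: zero mean, unit standard deviation.
    U = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_items)]
    for _ in range(epochs):
        # Start each gradient with the regularization term lam * parameter.
        grad_u = [[lam * x for x in row] for row in U]
        grad_v = [[lam * x for x in row] for row in V]
        # Add the data term, summed over the whole batch of observed ratings.
        for i, j, r in ratings:
            err = r - dot(U[i], V[j])
            for k in range(d):
                grad_u[i][k] -= err * V[j][k]
                grad_v[j][k] -= err * U[i][k]
        # Simultaneous update: parameter := parameter - alpha * gradient.
        U = [[x - alpha * g for x, g in zip(row, g_row)]
             for row, g_row in zip(U, grad_u)]
        V = [[x - alpha * g for x, g in zip(row, g_row)]
             for row, g_row in zip(V, grad_v)]
    return U, V


def rmse(ratings, U, V):
    """Root mean squared error over the given (user, item, rating) triples."""
    sq = sum((r - dot(U[i], V[j])) ** 2 for i, j, r in ratings)
    return math.sqrt(sq / len(ratings))


# Toy observed ratings on a 1-5 scale (made up for illustration).
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
U, V = train_pmf(ratings, n_users=3, n_items=3)
print("training RMSE:", round(rmse(ratings, U, V), 3))
```

The gradients are accumulated over all observed ratings before any parameter changes, which is what distinguishes the batch variant from stochastic gradient descent.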
Finally, U and V are updated in each iteration of the batch gradient descent algorithm as the error is minimized. After the algorithm converges, ratings are calculated from the updated U and V using the maximum a posteriori (MAP) estimate, which estimates an unknown quantity by the mode of its posterior distribution. With all ratings now available, items are recommended to users based on their top-k ratings. The steps involved in cross domain matrix factorization are given in algorithm 1.

Algorithm 1: Cross domain Recommender with Matrix Factorization
Step 1: Pass the source data set to the PMF algorithm and retrieve the final updated $U_{source}$ and $V_{source}$.
Step 2: Pass the target data set to the PMF algorithm and retrieve the final updated $U_{target}$ and $V_{target}$.
Step 3: Apply clustering to the users and items of the source domain: Kmeans($U_{source}$, k) and Kmeans($V_{source}$, k).
Step 4: For all users $u_i \in U_{target}$ in the target domain do
  • Calculate the distance between the users in the target domain and the source domain user centroids, and assign the appropriate cluster id to each user.
  • For each $u_i$, find a list of similar users in the source domain and assign cluster ids to the items liked by those users in the source domain.
  For all items $v_j \in V_{target}$ in the target domain do
  • Find the distance between the items in the target domain and the source domain item centroids, and assign the appropriate cluster id to each item.
Step 5: Match the cluster ids of the source items with those of the target items to obtain the list of items liked by the user in the target domain.
  • Calculate the ratings of the corresponding user for this list of items and recommend the Top-k items.
Here CDR is applied with PMF and Bayes PMF to both the source and target data in order to find the user latent feature vectors U and the item latent feature vectors V. The source domain user latent features U and item latent features V are clustered using the k-means algorithm, and a cluster id is assigned to each user and item in the source domain. A user is then selected from the target domain, the distance from that user to every cluster is computed, and the nearest cluster id is assigned to that user. This knowledge is used to recommend items to the user in the target domain.
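The clustering and cluster-matching steps described above can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes that source and target latent vectors have the same dimension and are directly comparable, uses a plain k-means with deterministic seeding, and relies on made-up latent vectors and a hypothetical `likes` list (the source items each source user liked). All function and variable names are illustrative.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to `point` in Euclidean distance."""
    return min(range(len(centroids)),
               key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns (centroids, cluster id of each point)."""
    # Deterministic seeding for this sketch: k evenly spaced input points.
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]


def recommend(user_vec, target_items, src, top_k):
    """Rank target items whose cluster matches what similar source users liked."""
    cluster = nearest(user_vec, src["user_centroids"])
    # Source users that share the target user's cluster.
    peers = [u for u, c in enumerate(src["user_labels"]) if c == cluster]
    # Item clusters those peers liked in the source domain.
    liked = {src["item_labels"][j] for u in peers for j in src["likes"][u]}
    # Target items assigned to one of those item clusters.
    candidates = [j for j, v in enumerate(target_items)
                  if nearest(v, src["item_centroids"]) in liked]

    def score(j):
        # Predicted rating: inner product of user and item latent vectors.
        return sum(a * b for a, b in zip(user_vec, target_items[j]))

    return sorted(candidates, key=score, reverse=True)[:top_k]


# Made-up latent vectors standing in for the PMF output of each domain.
U_source = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
V_source = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
V_target = [[0.95, 0.0], [0.1, 1.1], [0.0, 0.8], [1.0, 0.2]]

user_centroids, user_labels = kmeans(U_source, k=2)
item_centroids, item_labels = kmeans(V_source, k=2)
src = {
    "user_centroids": user_centroids, "user_labels": user_labels,
    "item_centroids": item_centroids, "item_labels": item_labels,
    "likes": [[0], [1], [0, 1], [2], [3], [2, 3]],  # hypothetical
}

print(recommend([4.8, 5.3], V_target, src, top_k=2))
```

Only the source domain is clustered; target users and items are merely assigned to the nearest source centroid, which is how the knowledge from the denser domain is carried over.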

IV. RESULTS AND DISCUSSIONS
To evaluate the proposed method, two benchmark data sets of movie ratings were used (Movielens as the source domain and Netflix as the target domain). These are publicly available datasets that contain user ratings of movies on a scale of 1-5, where 1 means poor and 5 means excellent. Each record contains three fields: UserId, MovieId, Rating. To test the scalability of the model with respect to the number of items, three test sets were defined, as shown in table 1. Precision and recall are the measures used to evaluate the performance of the CF models. Precision measures the correctness of the recommendations and is defined as the fraction of relevant instances among the retrieved instances. Recall measures the completeness of the recommendations and is defined as the fraction of all relevant instances that have been retrieved. Table 2 shows the precision and recall values of the PMF and Bayes-PMF methods on the three test sets considered. Figure 4 shows the corresponding graphs for table 2.
We ran our cross domain experiment with different numbers of clusters k and obtained the best results with k=10 clusters; precision and recall were then measured for the Top-k = 20, 40, 60 and 80 items. Table 3 shows the precision and recall of PMF and Bayes PMF for top-k recommendations, and the corresponding graphs are shown in figure 5 and figure 6. It is observed that the precision and recall values for test case 2 and test case 3 are better for the PMF method in cross domain CF than for the Bayes PMF method. However, the recall values of PMF and Bayes PMF for top-k recommendations converge when k is large and the number of samples is high.
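The two measures defined above can be computed for a top-k recommendation list as in the following minimal sketch. How an item is judged "relevant" (for example, by a rating threshold) is not stated in the text and is left to the caller here; the movie ids are made up and the function name is hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the first k recommended items.

    precision = relevant items retrieved / items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked recommendations and ground-truth relevant items.
recommended = ["m3", "m7", "m1", "m9", "m4"]
relevant = {"m1", "m3", "m5", "m8"}

for k in (2, 4):
    print(k, precision_recall_at_k(recommended, relevant, k))
```

Increasing k can only add hits, so recall never decreases as the list grows, while precision may fall as less relevant items are included.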

V. CONCLUSION
This paper presented a cross domain recommender system that uses model based collaborative filtering with probabilistic matrix factorization to avoid the data sparsity and cold start problems. The method uses data from multiple domains and allows knowledge transfer from the source domain to the target domain. The cross domain recommender with PMF outperformed Bayes PMF with respect to precision and recall.
The key finding of the present work is that a cross domain recommender provides a means of resolving the data sparsity issue. As future work, larger data sets and more diverse models should be evaluated in the cross domain setting.