Predicting Functions Of Proteins In Mouse Based On Weighted ...

Feature sorting.

The Maximum Relevance, Minimum Redundancy (mRMR) method was originally developed by Peng et al. to process microarray data [46]. The idea is to rank each feature according to its relevance to the target and its redundancy with the other features. A "good" feature is defined as one that offers the best trade-off between maximum relevance to the target and minimum redundancy within the features. To quantify both relevance and redundancy, the mutual information (MI), which estimates how strongly one vector is related to another, is defined as follows:

$$I(x, y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \, dx \, dy \tag{7}$$

where $x$ and $y$ are two vectors, $p(x, y)$ is their joint probability density, and $p(x)$ and $p(y)$ are the marginal probability densities.
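As an illustration of the MI definition, the following is a minimal sketch of a plug-in estimate for discrete-valued vectors, where the integral reduces to a sum over observed value pairs and the probabilities are replaced by empirical frequencies. The function name `mutual_information` and the use of natural logarithms are assumptions made for this example; the source does not state how MI was estimated or how continuous features were discretized.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in estimate of I(x, y), in nats, for two discrete sequences.

    Hypothetical helper for illustration: probabilities are replaced by
    empirical frequencies, so continuous features would need to be
    discretized before calling it.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n == 0:
        raise ValueError("x and y must not be empty")

    joint = Counter(zip(x, y))  # counts of each observed (x, y) pair
    count_x = Counter(x)        # marginal counts of x
    count_y = Counter(y)        # marginal counts of y

    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        # p(a, b) / (p(a) * p(b)) written with counts to limit rounding error
        ratio = c * n / (count_x[a] * count_y[b])
        mi += p_ab * math.log(ratio)
    return mi
```

Two identical binary vectors with balanced values give an estimate of log 2, while two vectors whose values co-occur uniformly give 0, which matches the intuition that MI measures shared information.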

Let $\Omega$ denote the whole feature set, and let $\Omega_s$ denote the already-selected feature set, which contains $m$ features. The to-be-selected feature set, with $n$ features, is denoted by $\Omega_t$. The relevance $D$ of a feature $f$ in $\Omega_t$ to the target $c$ can be calculated by:

$$D = I(f, c) \tag{8}$$

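The redundancy $R$ of the feature $f$ in the to-be-selected set $\Omega_t$ with all $m$ features in the already-selected set $\Omega_s$ can be calculated by:

$$R = \frac{1}{m} \sum_{f_i \in \Omega_s} I(f, f_i) \tag{9}$$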

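To obtain the feature $f_j$ in $\Omega_t$ with maximum relevance and minimum redundancy, Eq. (8) and Eq. (9) are combined to obtain the mRMR function:

$$\max_{f_j \in \Omega_t} \left[ I(f_j, c) - \frac{1}{m} \sum_{f_i \in \Omega_s} I(f_j, f_i) \right], \quad j = 1, 2, \ldots, n \tag{10}$$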

For a feature set with $N$ ($= m + n$) features, the feature evaluation is executed for $N$ rounds. In the first round the redundancy is 0 because $\Omega_s$ is empty, so the feature with the maximum relevance to the target is selected. After all $N$ evaluations, the mRMR method yields the following feature set in selection order:

$$S = \{ f'_1, f'_2, \ldots, f'_h, \ldots, f'_N \} \tag{11}$$

where the subscript index $h$ indicates the round in which the feature is selected. The better the feature, the earlier it satisfies Eq. (10), the earlier it is selected, and the smaller its index is.
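The round-by-round selection described above can be sketched as a greedy loop. This is a minimal illustration under stated assumptions, not the authors' implementation: features are assumed to be discrete-valued, MI is estimated from empirical frequencies, ties are broken by input order, and the names `mutual_information` and `mrmr_rank` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in MI estimate (nats) for two equal-length discrete sequences."""
    n = len(x)
    joint, count_x, count_y = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(
        (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in joint.items()
    )


def mrmr_rank(features, target):
    """Return feature names in mRMR selection order.

    features: dict mapping a feature name to its discrete-valued sequence.
    target:   sequence of class labels, same length as every feature.
    """
    # Relevance of every feature to the target (computed once).
    relevance = {
        name: mutual_information(values, target)
        for name, values in features.items()
    }
    # Running sum of MI between each candidate and the selected features.
    redundancy_sum = {name: 0.0 for name in features}

    selected = []
    remaining = list(features)
    while remaining:
        m = len(selected)

        def score(name):
            # Redundancy is 0 in the first round, when nothing is selected.
            redundancy = redundancy_sum[name] / m if m else 0.0
            return relevance[name] - redundancy

        best = max(remaining, key=score)  # first maximum wins on ties
        selected.append(best)
        remaining.remove(best)

        # Update the redundancy sums with the newly selected feature.
        for name in remaining:
            redundancy_sum[name] += mutual_information(
                features[name], features[best]
            )
    return selected
```

With a toy target built from two independent binary features `a` and `b`, an exact copy of `a`, and an unrelated feature `d`, the sketch ranks `b` ahead of the copy: both are equally relevant, but the copy is fully redundant with the already-selected `a`. Caching the running redundancy sums keeps the number of MI evaluations quadratic in the number of features instead of recomputing every pair in every round.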
