Train a random forest model (MAPD) for predicting protein degradability.

MAPD.train(
  class = NULL,
  featureDat = NULL,
  features = NULL,
  ntree = 20000,
  summaryFunction = caret::prSummary,
  metric = "AUC"
)

Arguments

class

A vector or factor giving the class label of each protein used for training. It should be named with official gene names. Items at the first factor level are treated as negative (e.g. lowly degradable), and all others are treated as positive (e.g. highly degradable).
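Because the first level is taken as the negative class, the level order is worth setting explicitly. The base R sketch below shows one way to do that. The gene-to-label assignments are hypothetical and only illustrate the shape of the input; they are not real degradability calls.

```r
# Hypothetical labels for illustration only (not real degradability data).
labels <- c(TP53 = "high", GAPDH = "low", MYC = "high", ACTB = "low")

# List "low" first so that it becomes the first level, i.e. the negative class.
# Without `levels`, factor() sorts alphabetically and "high" would come first.
class_labels <- factor(labels, levels = c("low", "high"))

levels(class_labels)[1]  # the level treated as negative
names(class_labels)      # gene names carried over from the named vector
```

A factor built this way can then be passed as the `class` argument.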

featureDat

A matrix or data frame containing user-defined protein feature data. If NULL (the default), the function uses its internal feature data for training.

features

Protein-intrinsic features used for predicting degradability. By default, five features are used for training MAPD: Ubiquitination_2, Phosphorylation_2, Acetylation_1, Zecha2018_Hela_Halflife, and Length. The full list of features is available at http://mapd.cistrome.org/.
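When supplying custom feature data, it can help to check that every requested feature is present before training. The base R sketch below assumes one row per gene (named by official gene name) and one column per feature. That layout is an assumption rather than something this page specifies, and the numeric values are placeholders.

```r
# The five default feature names listed above.
default_features <- c("Ubiquitination_2", "Phosphorylation_2", "Acetylation_1",
                      "Zecha2018_Hela_Halflife", "Length")

# Placeholder values for illustration; assumed layout: genes in rows, features in columns.
feature_dat <- data.frame(
  Ubiquitination_2        = c(3, 0, 5),
  Phosphorylation_2       = c(10, 2, 7),
  Acetylation_1           = c(1, 0, 2),
  Zecha2018_Hela_Halflife = c(12.5, 80.0, 4.2),
  Length                  = c(393, 335, 439),
  row.names               = c("TP53", "GAPDH", "MYC")
)

# Fail early if any requested feature column is missing.
missing_features <- setdiff(default_features, colnames(feature_dat))
stopifnot(length(missing_features) == 0)
```

The same check works for any custom `features` vector by substituting it for `default_features`.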

ntree

An integer specifying the number of trees in the model (20000 by default).

summaryFunction

A summary function used for parameter tuning; caret::prSummary is used by default.

metric

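A character string specifying the metric used for parameter tuning ("AUC" by default). With the default caret::prSummary, "AUC" is the area under the precision-recall curve.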
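The metric name generally has to match one of the values that the chosen summary function reports. The sketch below lists the metric names returned by two caret summary functions and checks a pairing before training. `is_valid_metric` is a hypothetical helper written for this example, and the assumption that `metric` must match the summary function's output follows caret's conventions rather than anything stated on this page.

```r
# Metric names reported by two caret summary functions.
summary_metrics <- list(
  prSummary       = c("AUC", "Precision", "Recall", "F"),
  twoClassSummary = c("ROC", "Sens", "Spec")
)

# Hypothetical helper: TRUE if the summary function reports the given metric.
is_valid_metric <- function(summary_name, metric) {
  metric %in% summary_metrics[[summary_name]]
}

is_valid_metric("prSummary", "AUC")        # the default pairing
is_valid_metric("twoClassSummary", "AUC")  # twoClassSummary reports "ROC" instead
```

Switching `summaryFunction` to caret::twoClassSummary would therefore also call for changing `metric`, for example to "ROC".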

Value

The trained model, returned as a list.