We present a probabilistic semi-supervised learning (SSL) framework based on sparse graph structure learning. Different from existing SSL methods with either a predefined weighted graph heuristically constructed from the input data or a learned graph based on the locally linear embedding assumption, the proposed SSL model is capable of learning a sparse weighted graph from the unlabeled high-dimensional data and a small amount of labeled data, as well as dealing with the noise of the input data. Our representation of the weighted graph is indirectly derived from a unified model of density estimation and pairwise distance preservation in terms of various distance measurements, where latent embeddings are assumed to be random variables following an unknown density function to be learned and pairwise distances are then calculated as the expectations over the density for the model robustness to the data noise. Moreover, the labeled data based on the same distance representations is leveraged to guide the estimated density for better class separation and sparse graph structure learning. A simple inference approach for the embeddings of unlabeled data based on point estimation and kernel representation is presented. Extensive experiments on various data sets show the promising results in the setting of SSL compared with many existing methods, and significant improvements on small amounts of labeled data.
Tuesday, March 3 at 3:30pm to 4:30pm