Method and apparatus for component association inference, failure diagnosis and misconfiguration detection based on historical failure data

US Patent 7,937,347 (Granted)
Abstract. A method (which can be computer implemented) for inferring component associations among a plurality of components in a distributed computing system includes the steps of obtaining status information for each pertinent component of the plurality of components, forming an N by D matrix, X, based on the status information, and factorizing the matrix X to obtain a first matrix indicative of the component associations to be inferred and a second matrix indicative of failure explanations for corresponding ones of the probe instances. N is a number of probe instances associated with a given time frame. D is a number of the plurality of components for which the associations are to be inferred. Techniques are also presented for forming a database with the status information.
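The factorization step can be illustrated with a minimal sketch. The abstract does not say which factorization is used, so this example substitutes plain non-negative matrix factorization with Lee-Seung multiplicative updates as a stand-in. The function name `factorize_status_matrix`, the rank `k`, the 0/1 encoding of status (1 = component observed failed in that probe instance), and the sample data are all assumptions made for illustration, not details from the patent.

```python
import numpy as np


def factorize_status_matrix(X, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate X (N x D) as S @ A with non-negative factors.

    A (k x D): each row weights components that tend to fail together
               (a stand-in for the "component associations" matrix).
    S (N x k): each row weights the groups that account for one probe
               instance (a stand-in for the "failure explanations" matrix).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    S = rng.random((n, k)) + eps
    A = rng.random((k, d)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the squared Frobenius error ||X - S A||^2.
        A *= (S.T @ X) / (S.T @ S @ A + eps)
        S *= (X @ A.T) / (S @ A @ A.T + eps)
    return S, A


# Hypothetical status data: N = 6 probe instances (rows), D = 4 components
# (columns); 1 means the component was observed as failed in that probe.
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])

S, A = factorize_status_matrix(X, k=2)
reconstruction_error = np.linalg.norm(X - S @ A)
```

In this sketch, large entries in a row of `A` mark components that fail together, and large entries in a row of `S` mark which group best accounts for that probe instance. The rank `k` is a free parameter here; the abstract gives no guidance on how it would be chosen.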