Multimedia Fusion With Mean-Covariance Analysis

The number of multimedia applications has been increasing over the past two decades. Multimedia information fusion has therefore attracted significant attention, and many techniques have been proposed. However, existing fusion methods do not fully account for the uncertainty and correlation among different information sources. In general, the predictions of individual information sources are uncertain. Furthermore, many information sources in multimedia systems are correlated with each other. In this paper, we propose a novel multimedia fusion method based on portfolio theory, a widely used financial investment theory that deals with how to allocate funds across securities. The key idea is to maximize the performance of the allocated portfolio while minimizing the risk in returns. We adapt this approach to multimedia fusion to derive optimal fusion weights. The optimization is formulated as a quadratic programming problem. Experimental results on both simulated and real data confirm the theoretical insights and show promising performance.
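The mean-variance trade-off described above can be sketched as a small quadratic program. This is a hypothetical illustration, not the paper's exact formulation: it treats each source's prediction scores as "returns", estimates their mean vector and covariance matrix, and solves for fusion weights that balance expected performance against risk. The function name, the `risk_aversion` parameter, and the constraint choices are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def portfolio_fusion_weights(scores, risk_aversion=1.0):
    """Sketch of mean-variance weight selection for fusion.

    scores: (n_samples, n_sources) matrix of per-source prediction scores.
    Returns a weight vector over the sources.
    """
    mu = scores.mean(axis=0)              # expected "return" of each source
    sigma = np.cov(scores, rowvar=False)  # covariance among the sources
    n = scores.shape[1]

    # Objective: minimize  risk_aversion * w' Sigma w  -  mu' w
    def objective(w):
        return risk_aversion * (w @ sigma @ w) - mu @ w

    constraints = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]  # weights sum to 1
    bounds = [(0.0, 1.0)] * n            # nonnegative weights (no "short selling")
    w0 = np.full(n, 1.0 / n)             # start from uniform (average fusion)
    res = minimize(objective, w0, bounds=bounds, constraints=constraints)
    return res.x

# Toy usage with simulated scores from three correlated sources.
rng = np.random.default_rng(0)
scores = rng.random((100, 3))
w = portfolio_fusion_weights(scores)
fused = scores @ w  # linearly fused prediction scores
```

Note that the equality constraint and box bounds make this a standard constrained quadratic program, which SciPy solves via SLSQP; a dedicated QP solver would serve equally well.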
A multimedia analysis task involves processing multimodal data in order to obtain valuable insights about the data, a situation, or a higher-level activity [1]. For example, surveillance systems utilize data from multiple types of sensors, such as microphones and video cameras, to detect certain events. For news video retrieval, video data is combined with audio data and text information to enable content-based search. In most applications, no single information source can accomplish the analysis task perfectly. Hence, multimedia fusion is used to integrate multiple modalities, their associated features, or the intermediate decisions in order to perform the task. One advantage of multimedia fusion is that it exploits the correlation among different information sources; how to measure and combine this correlation appropriately is therefore an important problem. While raw data and features are usually heterogeneous and hard to combine, decisions are homogeneous in nature: the data from different modalities can be analyzed with different yet appropriate methods to obtain unimodal decisions, which provides much more flexibility in the multimodal fusion process. Moreover, in decision-level fusion it is easy to control the relative contributions of the information sources to the fusion result, whereas this is more difficult in data- and feature-level fusion.
Many training-based fusion methods have also been proposed; a representative example is super-kernel fusion, which first finds statistically independent modalities from the raw features and then determines the optimal combination of information sources by training on the output scores of the individual sources. Linear fusion, in contrast, is computationally less expensive than such strategies; it is one of the simplest and most widely used approaches and is also easily scalable. Several methods based on linear fusion have been proposed. Max/min/average fusion takes the maximum/minimum/average prediction score of all information sources as the final prediction score. However, these methods do not explicitly model the uncertainty and correlation among the sources.

To address this problem, portfolio theory is employed. A preliminary version of portfolio fusion has been described previously. Simulation and concept detection experiments have shown that it outperforms average fusion, weighted fusion, and the naive Bayesian fusion method. Here, we give a fuller description, with details for varying definitions of return. Furthermore, we conduct experiments to demonstrate the superiority of the proposed portfolio fusion method over other related fusion methods. Since the proposed portfolio fusion is based on linear fusion, it is compared with several related linear fusion methods. To evaluate the effectiveness of the proposed portfolio fusion method (PTF), we simulated multiple information sources and compared it with different fusion methods: the logarithmic opinion pool (LGP), convex aggregation (CA), logistic regression (LR), and the super-kernel fusion method.
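The rule-based linear fusion strategies mentioned above (max, min, average, and weighted fusion) can be sketched in a few lines. This is a minimal illustration under the assumption that each row of `scores` holds one sample's prediction scores from several information sources; the variable names and the example weights are illustrative, not taken from the paper.

```python
import numpy as np

# Each row: one sample's prediction scores from three information sources.
scores = np.array([
    [0.9, 0.6, 0.7],   # sample 1
    [0.2, 0.4, 0.1],   # sample 2
])

max_fused = scores.max(axis=1)    # max fusion: most confident source wins
min_fused = scores.min(axis=1)    # min fusion: most conservative score
avg_fused = scores.mean(axis=1)   # average fusion: uniform weights

# Weighted (convex) linear fusion; the weights sum to 1.
weights = np.array([0.5, 0.3, 0.2])
weighted_fused = scores @ weights
```

Average fusion is the special case of weighted fusion with uniform weights; the portfolio method differs in that its weights are chosen by an optimization over the sources' means and covariances rather than fixed a priori.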