Multidocument Text Classification over Heterogeneous Data Sources