Knowledge Vault and Knowledge-Based Trust
In this talk we describe our knowledge extraction and fusion efforts at Google, including the Knowledge Vault and Knowledge-Based Trust projects. We use 15 extractors to periodically extract knowledge from 1B+ webpages, yielding 3B+ distinct (subject, predicate, object) knowledge triples. Errors can creep in at every stage of this process, both from erroneous data provided by Web sources and from mistakes made by the extractors. As a result, only about 20% of the extracted triples are correct.
We adapt state-of-the-art data fusion techniques to solve the knowledge fusion problem. By leveraging the collective wisdom of different extractors and different Web sources, we are able to compute well-calibrated probabilities for the correctness of each triple and of each extraction. In addition, we are able to compute trustworthiness for 119M webpages and 5.6M websites. We discuss our observations and provide insights on future research directions.
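To make the idea of fusing evidence from several extractors concrete, here is a minimal toy sketch in Python. It is not the Knowledge Vault or Knowledge-Based Trust algorithm. It assumes that each extractor has a known precision, that observations are independent, and that the prior probability of a triple being correct is 0.2 (taken from the roughly 20% figure above). The function name `fuse_triples`, the extractor names, and the precision values are hypothetical.

```python
import math

def fuse_triples(observations, precision, prior=0.2):
    """Toy fusion: combine independent extractor votes in log-odds space.

    observations: iterable of (triple, extractor, source) tuples.
    precision:    dict mapping extractor name -> assumed precision in (0, 1).
    prior:        assumed prior probability that a triple is correct.
    Returns a dict mapping each triple to a fused probability.
    """
    prior_log_odds = math.log(prior / (1.0 - prior))
    log_odds = {}
    seen = set()
    for triple, extractor, source in observations:
        # Count each (triple, extractor, source) combination only once.
        key = (triple, extractor, source)
        if key in seen:
            continue
        seen.add(key)
        p = precision[extractor]
        # Simplifying assumption: each observation contributes a
        # likelihood ratio of p / (1 - p), independently of the others.
        log_odds.setdefault(triple, prior_log_odds)
        log_odds[triple] += math.log(p / (1.0 - p))
    # Convert log-odds back to probabilities.
    return {t: 1.0 / (1.0 + math.exp(-lo)) for t, lo in log_odds.items()}


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    t1 = ("Barack Obama", "born_in", "Honolulu")
    t2 = ("Barack Obama", "born_in", "Chicago")
    observations = [
        (t1, "text_extractor", "site-a.example/page1"),
        (t1, "table_extractor", "site-b.example/page7"),
        (t2, "dom_extractor", "site-c.example/page3"),
    ]
    precision = {"text_extractor": 0.8, "table_extractor": 0.7, "dom_extractor": 0.5}
    for triple, prob in fuse_triples(observations, precision).items():
        print(triple, round(prob, 3))
```

Under these assumptions, a triple reported by two reasonably precise extractors ends up well above the prior, while a triple reported only by an extractor with precision 0.5 stays at the prior. A realistic system would also have to estimate the precisions and the source trustworthiness jointly rather than take them as given.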
Speaker: Xin Luna Dong, Google
Wednesday, 04/29/15
Cost: Free
