We work on extracting entities and the relations between them from encyclopedias, semi-structured data, and free text. Many problems in this area remain open, and newer tools such as CNNs and crowdsourcing can be applied to them.
Extracting structured data from free-text records
Detecting semantic drift and cleaning extraction errors
We are building a large Chinese-English knowledge graph from encyclopedias and from copies of HTML pages in English or Chinese from the past several decades. We also want to detect events together with their spatial and temporal information in order to construct a spatial-temporal event graph.
Constructing Chinese Knowledge Graph
Constructing Spatial-Temporal Event Graph
We work on improving data quality in every aspect, including data integration with multiple data sources, schema mapping, record matching, entity resolution, imputing missing data, detecting and correcting erroneous data, and data provenance.
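As a rough illustration of one task named above, record matching, the sketch below pairs names from two sources by string similarity. It is only a minimal example under stated assumptions, not a description of our actual methods: the function names, the whitespace and case normalization, and the 0.85 threshold are all hypothetical choices, and the similarity measure is simply the ratio from Python's standard difflib module.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(left, right, threshold=0.85):
    """Pair each left record with its most similar right record.

    A pair is kept only if its similarity reaches `threshold`
    (0.85 is an illustrative value, not a tuned one).
    """
    matches = []
    for l in left:
        # default=None covers the case where `right` is empty.
        best = max(right, key=lambda r: similarity(l, r), default=None)
        if best is not None and similarity(l, best) >= threshold:
            matches.append((l, best))
    return matches
```

For example, match_records(["Peking  University"], ["peking university", "Tsinghua University"]) pairs the first name with "peking university", since the two strings are identical after normalization. A real system would typically add blocking to avoid comparing every pair and would use richer evidence than a single string score.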
Auto Answer Machine
We work on building an auto answer machine for specific domains such as libraries. This direction offers plenty of challenges and opportunities. We would like to build a good one based on a domain knowledge graph.
A FAQ auto-answer machine for libraries
We work on building recommendation systems for specific domains that recommend not only items closely related to people's previous activities, but also items that we believe they would be interested in. We also study recommendation across multiple domains.
Efficient recommendation with consolidated information