Archived Projects and Research Areas

Large Scale Machine Learning Platform

The goal of this project is to provide a set of machine learning algorithms that meet the requirements of research work and applications, typically those involving very large-scale data or feature sets, or those applicable across multiple markets or domains. The platform provides, among other things: classification, clustering, time series analysis, SVD, kernel distance functions, and statistical analysis.

Behavioral Targeting

Behavioral Targeting (BT) attempts to deliver the most relevant advertisements to the most interested audiences, and is playing an increasingly important role in the online advertising market. The main research challenges in behavioral targeting are user representation and modeling, user segmentation, and targeted ad delivery. We have multiple sub-projects for behavioral targeting research. We started with the "Self Service Behavioral Targeting" project. The most recently released product to come out of our BT research is "Intent based Behavioral Targeting". Our ongoing project, Ad Selection, is carried out with the display ads team.

Categorized Search

Categorized Search is one of the solutions for organizing search results by bringing categorization concepts into search products. Our focus is to scale up the whole solution, including: identifying popular galleries, mapping queries to galleries, creating intent profiles for galleries, and associating search result pages with intent profiles. We have implemented a tool for organizing queries and user search intents, which is a must-have for implementing the above search experience. We have used various kinds of data sources, including search logs contributed by search engine users, Web pages provided by website editors, knowledge bases such as Wikipedia, and Web directories organized by volunteers. Both processes are very effective and require little human interaction, while the step of mapping result pages to intent profiles is fully automatic. We will also share our thoughts on how to use our large-scale machine learning toolkit to help scale up the solution, as well as our ideas on how to evaluate a Categorized Search system.
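
As a rough illustration of the fully automatic step of mapping result pages to intent profiles, the sketch below assigns a page to the intent profile whose term vector is closest by cosine similarity. This is not the project's actual method, which is not described here. The profile names, their keyword lists, and the functions term_vector, cosine, and best_profile are all hypothetical and exist only for this example.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Return lowercase bag-of-words term counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_profile(page_text, profiles):
    """Return (profile_name, score) for the profile most similar to the page."""
    page_vec = term_vector(page_text)
    scores = {
        name: cosine(page_vec, term_vector(keywords))
        for name, keywords in profiles.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical intent profiles for a single gallery (assumed, not from the project).
PROFILES = {
    "buy": "price buy deal discount store shop",
    "review": "review rating opinion pros cons comparison",
}

name, score = best_profile("Camera review: pros and cons, with rating", PROFILES)
print(name, round(score, 3))
```

In practice the profiles would presumably be built from sources such as search logs and Web directories rather than hand-written keyword lists, and term weighting (for example TF-IDF) would likely replace raw counts.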

Opinion Search

Grassroots users play important roles in today’s Web. They communicate intensively through channels such as online communities, blogs, and instant messengers. These users also contribute content to the Web, for example opinion data, which captures the knowledge of grassroots users, is large in scale, and is updated very frequently. To organize and make good use of this data, we collect, store, and organize user opinion data. By analyzing and mining it, we try to understand the opinions expressed by grassroots users as well as their requirements, which can help other Web users make purchase decisions and guide manufacturers in improving their products and services. Unlike previous research, which focuses on social network analysis, this project focuses on analyzing textual opinion data.