Award Abstract # 1408924
III: Medium: Collaborative Research: Collective Opinion Fraud Detection: Identifying and Integrating Cues from Language, Behavior, and Networks

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: CARNEGIE MELLON UNIVERSITY
Initial Amendment Date: August 1, 2014
Latest Amendment Date: May 14, 2015
Award Number: 1408924
Award Instrument: Standard Grant
Program Manager: Maria Zemankova
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2014
End Date: August 31, 2019 (Estimated)
Total Intended Award Amount: $299,908.00
Total Awarded Amount to Date: $307,908.00
Funds Obligated to Date: FY 2014 = $299,908.00
FY 2015 = $8,000.00
History of Investigator:
  • Christos Faloutsos (Principal Investigator)
    christos@cs.cmu.edu
Recipient Sponsored Research Office: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3890
(412)268-8746
Sponsor Congressional District: 12
Primary Place of Performance: Carnegie-Mellon University
PA  US  15213-3890
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): U3NKNFLNQ613
Parent UEI: U3NKNFLNQ613
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01001415DB NSF RESEARCH & RELATED ACTIVIT
01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7364, 7924, 9251
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Given user reviews on Web sites such as Yelp, Amazon, and TripAdvisor, which ones should one trust? Online reviews have become an important resource for public opinion sharing. They influence our decisions over an extremely wide spectrum of daily and professional activities: e.g., where to eat, where to stay, which products to purchase, which doctors to see, which books to read, which universities to attend, and so on. However, the credibility and trustworthiness of online reviews are at stake. It is well known that a large body of reviews is fabricated, whether by owners, competitors, or entities paid by them, to create a false perception of the actual quality of products and services. What is more, opinion fraud is prevalent: while credit card fraud is as rare as 0.2% or less, it is estimated that 20-30% of the reviews on well-known service sites could be fake. This poses a serious risk to businesses and the public, from investing in a low-quality product to consulting an incompetent doctor for diagnosis and treatment. Like other kinds of fraud, opinion fraud is a serious legal offense, and policymakers are increasingly recognizing it as a law-enforcement issue. Thus solving this problem is of great importance to businesses and the general public alike. Accurately spotting opinion fraud will enable site owners to provide trustworthy content, maintain the integrity of their service, and protect online citizens from unfair (or potentially harmful) products and services. Businesses will also benefit from reviews with reliable feedback. Honest businesses will be indirectly rewarded, as it will no longer be easy for unscrupulous businesses to benefit from fake reviews. The research outcomes will thus contribute significantly to the healthy growth of Internet commerce.
Educational activities include incorporating research findings into graduate-level courses, educating the public on fraudulent behavior and misinformation, and providing publicly available educational materials including lectures and manuscripts.

Given the critical issues of opinion fraud in online communities, how can one identify fake reviews and the culprits responsible for them? By combining the PIs' expertise across several modalities of deception footprints (language, user behavior, and relational information), this project presents a research program that will result in much-needed solutions to this emergent, prevalent, and socially impactful problem. The ultimate goal is to create a unified detection framework through synergistic integration of multiple information sources, from linguistics and user behavior to network effects, to obtain the best of all worlds. The main idea is to formulate the problem as a relational inference task on composite heterogeneous networks, providing a principled, extensible approach that can blend and reinforce all the above cues towards effective and robust detection of fraud. From a scientific point of view, the research brings together three disciplines: natural language analysis, behavioral modeling, and graph mining. The outcome is a suite of novel, principled, and scalable techniques and models that will enhance our understanding of the creation and dissemination of opinion fraud, and of misinformation in general, at a large scale. The PIs will collaborate with industry partners such as Yelp, Google, and Amazon, directly solicit online fake reviews, and conduct well-designed user studies for testing and validation of their techniques. The project web site (http://www.andrew.cmu.edu/user/lakoglu/PROJECTS/OPINION_FRAUD/) provides additional information and will include open-source software and datasets.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Alex Beutel, Amr Ahmed, and Alexander J. Smola "Accams: Additive co-clustering to approximate matrices succinctly" WWW, 2015
Bryan Hooi, Neil Shah, Alex Beutel, Stephan Guenneman, Leman Akoglu, Mohit Kumar, Disha Makhija, and Christos Faloutsos "BIRDNEST: Bayesian Inference for Ratings-Fraud Detection" SDM, 2016
Dhivya Eswaran, Reihaneh Rabbany, Artur Dubrawski, and Christos Faloutsos "Social-Affiliation Networks: Patterns and the SOAR Model" ECML/PKDD, 2018
Meng Jiang, Peng Cui, Alex Beutel, Christos Faloutsos, and Shiqiang Yang "Catching Synchronized Behaviors in Large Networks: A Graph Mining Approach" ACM Transactions on Knowledge Discovery from Data (TKDD), v.10, 2016, p.35:1-35:2
Nikhil Gupta, Dhivya Eswaran, Neil Shah, Leman Akoglu, and Christos Faloutsos "Beyond Anomaly Detection: LookOut for Pictorial Explanation" ECML/PKDD, 2018
Yasuko Matsubara, Yasushi Sakurai, and Christos Faloutsos "Non-Linear Mining of Competing Local Activities" WWW, 2016, DOI: 10.1145/2872427.2883010
Yasuko Matsubara, Yasushi Sakurai, and Christos Faloutsos "The web as a jungle: Non-linear dynamical systems for co-evolving online activities" WWW, 2015
Yasushi Sakurai, Yasuko Matsubara, and Christos Faloutsos "Mining Big Time-series Data on the Web" WWW'16 Companion, 2016, DOI: 10.1145/2872518.2891061

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Given user reviews on Web sites such as Yelp, Amazon, and TripAdvisor, which ones should one trust? Online reviews have become an important resource for public opinion sharing. They influence our decisions over an extremely wide spectrum of daily and professional activities: e.g., where to eat, where to stay, which products to purchase, which doctors to see, which books to read, which universities to attend, and so on. However, the credibility and trustworthiness of online reviews are at stake.
A large body of reviews is fabricated, whether by owners, competitors, or entities paid by them, to create a false perception of the actual quality of products and services.

In this work, we focused on methods to spot such activities, so that online companies can neutralize them. The merit of our work is the development of novel, fast algorithms for fraud detection, such as 'CopyCatch' and 'FRAUDAR', which attracted coverage in the popular press.

The broader impact is that these algorithms are publicly available, and we understand that some of them are in production at Facebook, Flipkart, and possibly other companies.

Finally, this project produced multiple dissertations at CMU and at the co-PI institutions, including those of Mr. Alex Beutel, Mr. Neil Shah, Ms. Dhivya Eswaran, and Ms. Danai Koutra. Koutra and Beutel received the prestigious ACM KDD doctoral dissertation award (first place and runner-up, respectively).


Last Modified: 04/19/2020
Modified by: Christos Faloutsos

Please report errors in award information by writing to: awardsearch@nsf.gov.
