Award Abstract # 1646881
EAGER: An Integrated Predictive Modeling Framework for Crowdfunding Environments

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: VIRGINIA POLYTECHNIC INSTITUTE & STATE UNIVERSITY
Initial Amendment Date: August 15, 2016
Latest Amendment Date: August 15, 2016
Award Number: 1646881
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
 (703)292-7347
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: August 15, 2016
End Date: July 31, 2018 (Estimated)
Total Intended Award Amount: $99,858.00
Total Awarded Amount to Date: $99,858.00
Funds Obligated to Date: FY 2016 = $99,858.00
History of Investigator:
  • Chandan Reddy (Principal Investigator)
    reddy@cs.vt.edu
Recipient Sponsored Research Office: Virginia Polytechnic Institute and State University
300 TURNER ST NW
BLACKSBURG
VA  US  24060-3359
(540)231-5281
Sponsor Congressional District: 09
Primary Place of Performance: Virginia Polytechnic Institute and State University
900 North Glebe Rd
Arlington
VA  US  22203-1822
Primary Place of Performance Congressional District: 08
Unique Entity Identifier (UEI): QDE5UHE5XD16
Parent UEI: X6KEFGLHSJX7
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7364, 7916
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The research aims to study data analytics tools for improving the success rate of crowdfunding projects. Crowdfunding provides seed capital for start-up companies, creating job opportunities and reviving lost business ventures. Despite the widespread popularity and innovativeness of the crowdfunding concept, however, many projects still do not succeed. A deeper understanding of the factors affecting investment decisions will not only improve the success rate of future projects but will also provide appropriate guidelines for project creators seeking funding. The crowdfunding domain poses several new challenges from the data analytics perspective due to the heterogeneous, complex and dynamic nature of the data associated with project campaigns. This project develops a systematic data-driven approach to resolve these challenges by leveraging vast amounts of historical data to accurately predict the success of crowdfunding projects. Though the proposed methods are primarily developed in the context of crowdfunding, they are applicable to various other forms of social data collected in other disciplines such as social science, engineering, and finance.

This project develops an integrated predictive modeling framework to solve some of the complex underlying problems in making crowdfunding-based projects succeed. Existing approaches in data analytics for classification and regression cannot tackle this project success prediction problem, since the goal is to estimate the time for a project to reach success, a quantity that is observed only for projects that actually succeed. The research team develops a unified probabilistic prediction framework which integrates classification and regression. In addition, a novel iterative imputation mechanism, which calibrates the time to project success, is proposed for reducing the bias in the model estimators. This project can demonstrate the power of data analytics in delivering better insights about various categories of real-world projects by not only accurately estimating the chances of being successful but also quantitatively assessing the factors that are responsible for bringing success in crowdfunding environments. The progress of the project and the research findings are disseminated via the project website (http://dmkd.cs.vt.edu/projects/crowdfunding/).
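
To illustrate why time-to-success data does not fit plain classification or regression, the sketch below treats failed campaigns as censored observations and computes a standard Kaplan-Meier estimate. This is a textbook survival-analysis technique, not the framework developed under this award; the function name `kaplan_meier` and the toy data are assumptions made up for illustration.

```python
import numpy as np

def kaplan_meier(durations, succeeded):
    """Kaplan-Meier estimate of the probability that a campaign is
    still unfunded just after each observed success time.

    durations: days each campaign was observed (hypothetical data)
    succeeded: True if the campaign reached its goal on that day,
               False if it ended unfunded (a censored observation)
    """
    durations = np.asarray(durations, dtype=float)
    succeeded = np.asarray(succeeded, dtype=bool)
    event_times = np.unique(durations[succeeded])  # sorted success times
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)               # campaigns still observed at t
        events = np.sum((durations == t) & succeeded)  # campaigns funded at t
        s *= 1.0 - events / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Toy data, invented for this example only.
days = [5, 8, 8, 12, 30, 30]
funded = [True, True, False, True, False, False]
times, surv = kaplan_meier(days, funded)
for t, p in zip(times, surv):
    print(f"day {t:.0f}: estimated P(not yet funded) = {p:.3f}")
```

The failed campaigns never contribute a success time, yet they still count toward the at-risk set while they are running. Dropping them would overstate how quickly projects get funded, which is the kind of bias the abstract refers to.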

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Ahuja, Aman and Wei, Wei and Lu, Wei and Carley, Kathleen M. and Reddy, Chandan K. "A Probabilistic Geographical Aspect-Opinion Model for Geo-Tagged Microblogs" International Conference on Data Mining (ICDM), 2017. DOI: 10.1109/ICDM.2017.82
Bhattacharya, Sakyajit and Huddar, Vijay and Rajan, Vaibhav and Reddy, Chandan K. and Luo, Feng "A dual boundary classifier for predicting acute hypotensive episodes in critical care" PLOS ONE, v.13, 2018. DOI: 10.1371/journal.pone.0193259
Dave, Vachik S. and Hasan, Mohammad Al and Zhang, Baichuan and Reddy, Chandan K. "Predicting interval time for reciprocal link creation using survival analysis" Social Network Analysis and Mining, v.8, 2018. DOI: 10.1007/s13278-018-0494-1
Fard, Mahtab Jahanbani and Wang, Ping and Chawla, Sanjay and Reddy, Chandan K. "A Bayesian Perspective on Early Stage Event Prediction in Longitudinal Data" IEEE Transactions on Knowledge and Data Engineering, v.28, 2016. DOI: 10.1109/TKDE.2016.2608347
Hannah Kim, Jaegul Choo "PIVE: Per-Iteration Visualization Environment for Real-Time Interactions with Dimension Reduction and Clustering" Proceedings of the ... AAAI Conference on Artificial Intelligence, 2017.
Hua, Ting and Reddy, Chandan K and Zhang, Lei and Wang, Lijing and Zhao, Liang and Lu, Chang-Tien and Ramakrishnan, Naren "Social Media based Simulation Models for Understanding Disease Dynamics" Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018. DOI: 10.24963/ijcai.2018/528
Rakesh, Vineeth and Ding, Weicong and Ahuja, Aman and Rao, Nikhil and Sun, Yifan and Reddy, Chandan K. "A Sparse Topic Model for Extracting Aspect-Specific Summaries from Online Reviews" Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18, 2018. DOI: 10.1145/3178876.3186069
Rakesh, Vineeth and Jadhav, Niranjan and Kotov, Alexander and Reddy, Chandan K. "Probabilistic Social Sequential Model for Tour Recommendation" Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017. DOI: 10.1145/3018661.3018711
Shi, Tian and Kang, Kyeongpil and Choo, Jaegul and Reddy, Chandan K. "Short-Text Topic Modeling via Non-negative Matrix Factorization Enriched with Local Word-Context Correlations" Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18, 2018. DOI: 10.1145/3178876.3186009
Suh, Sangho and Choo, Jaegul and Lee, Joonseok and Reddy, Chandan K. "Local Topic Discovery via Boosted Ensemble of Nonnegative Matrix Factorization" International Joint Conference on Artificial Intelligence (IJCAI), 2017. DOI: 10.24963/ijcai.2017/699
Suh, Sangho and Shin, Sungbok and Lee, Joonseok and Reddy, Chandan K. and Choo, Jaegul "Localized user-driven topic discovery via boosted ensemble of nonnegative matrix factorization" Knowledge and Information Systems, v.56, 2018. DOI: 10.1007/s10115-017-1147-9

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Over the past few years, crowdfunding platforms have helped their users raise several billion dollars worldwide, thereby becoming a viable mechanism for people seeking funding to jump-start their business ventures. Despite the widespread popularity and innovativeness of the crowdfunding concept, many projects still do not succeed. Although the domain of crowdfunding appears intuitive and simple at the outset, it poses several new challenges from the analytics perspective due to the heterogeneous, complex and dynamic nature of the data associated with crowdfunding campaigns. In this project, we investigated the problem of predicting project success, which is multi-factorial and depends on a wide range of elements that are hard to characterize. To this end, we developed a systematic data-driven approach that leverages vast amounts of historical data to support crowdfunding campaigns by predicting the success of projects.

One of the important challenges in crowdfunding is to predict the success of a project using various kinds of project-related features. Existing approaches in data analytics for classification and regression cannot tackle this prediction problem, since the goal here is to estimate the time to project success, which is available only for the subset of projects that succeeded in obtaining their target amount. Hence, the primary focus of this project was to incorporate the failed projects (which contain only partial information until the project end date) into the regression model, which required developing new algorithms. The main goal of this project was to build accurate and robust prediction models for estimating project success in crowdfunding environments. We developed a unified probabilistic framework which integrates classification and regression. We also built an imputation model that calibrates the time to project success in an attempt to reduce the bias in the model estimators. This calibration step is performed using the estimated regularized inverse covariance matrix within an iterative convergence framework. In addition, we rigorously analyzed the important factors of crowdfunding to build novel algorithms that can overcome some critical drawbacks of existing approaches available in the literature. The project explored various kinds of complexities that arise in crowdfunding data and incorporated them into prediction models.

A deeper understanding of the factors affecting investment decisions not only improved the success rate of future projects but also provided better guidelines for project creators seeking funding for their projects. We demonstrated that models which take into account both successful and failed projects during the training phase perform significantly better at predicting the success of future projects than models that use only the successful projects. We provided a rigorous evaluation using several sets of relevant features and showed that adding a few temporal features obtained at the project’s early stages can dramatically improve the overall performance.

The main results of this work have been disseminated to the research community through publications, software, tutorials and other presentations. The main ideas and concepts developed in this project were also discussed in graduate-level computer science courses on data analytics. This project demonstrated the power of novel data analytics solutions in delivering better insights about various categories of real projects in crowdfunding environments by not only accurately estimating the chances of being successful but also quantitatively assessing the factors that are most responsible for bringing success. Though the proposed methods were primarily developed in the context of crowdfunding data, they can also be applied to various other forms of social data that are collected in other disciplines such as social science, engineering, and finance.
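
The report describes an imputation step that calibrates the time to success for failed projects inside an iterative loop. A deliberately simplified sketch of that general idea follows. It is an assumption-laden stand-in, not the award's algorithm: it uses ordinary least squares instead of the regularized inverse covariance matrix mentioned in the report, and the function name `iterative_censored_regression`, the single feature, and the synthetic data are all hypothetical.

```python
import numpy as np

def iterative_censored_regression(X, observed_time, succeeded, n_iter=50, tol=1e-8):
    """Alternate between fitting a linear model and re-imputing the
    unknown success times of failed (censored) campaigns.

    A censored campaign's true time to success is at least its observed
    duration, so each imputed value is max(observed, predicted).
    """
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])  # add intercept
    t_obs = np.asarray(observed_time, dtype=float)
    succeeded = np.asarray(succeeded, dtype=bool)
    y = t_obs.copy()  # first pass: treat censoring times as success times
    for _ in range(n_iter):
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        pred = X @ coef
        # Keep observed success times; lift censored times to the prediction if larger.
        y_new = np.where(succeeded, t_obs, np.maximum(t_obs, pred))
        converged = np.max(np.abs(y_new - y)) < tol
        y = y_new
        if converged:
            break
    return coef, y

# Synthetic data (assumption): one feature, campaigns cut off at 25 days.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=40)
true_time = 10.0 + 20.0 * x + rng.normal(0.0, 2.0, size=40)
deadline = 25.0
ok = true_time <= deadline
t_seen = np.minimum(true_time, deadline)

coef, imputed = iterative_censored_regression(x.reshape(-1, 1), t_seen, ok)
print("fitted intercept and slope:", coef)
```

The design choice worth noting is that failed campaigns stay in the training set with a lower bound on their time to success rather than being discarded, in line with the report's finding that models trained on both successful and failed projects perform better.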


Last Modified: 08/01/2018
Modified by: Chandan K Reddy
