Using pattern recognition to analyze Prosper

Matt Aldrich, Coco Krumme, Ernesto Martinez-Villalpando, Charlie DeTar. December, 2008

Slides from our presentation for this project.

Prosper is an online platform for peer-to-peer lending. Borrowers create personal profiles and solicit loans via online listings detailing the amount requested, the maximum interest rate, and the purpose of the loan. In turn, lenders assess listings and can bid on them; if the total dollar amount of bids equals the amount requested, the loan is funded. When the total bid amount exceeds the amount requested, the lenders electing the lowest interest rates are granted a stake in the loan. If a listing fails to garner complete funding, it is canceled by the system and the borrower has the option to repost.
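The funding rule described above can be sketched in a few lines of Python. This is only an illustration of the mechanism as summarized here, not Prosper's actual implementation: the `Bid` class and `resolve_listing` function are hypothetical names. The sketch also assumes that bids above the borrower's maximum rate are ignored, that the last winning bid may be filled partially, and that ties keep their submission order.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    lender: str
    amount: int       # dollars offered
    min_rate: float   # lowest interest rate the lender will accept


def resolve_listing(amount_requested, max_rate, bids):
    """Return {lender: dollars funded}, or None if the listing is not fully funded."""
    # Assumption: bids asking for more than the borrower's maximum rate do not count.
    eligible = [b for b in bids if b.min_rate <= max_rate]
    if sum(b.amount for b in eligible) < amount_requested:
        return None  # listing is canceled; the borrower may repost

    stakes = {}
    remaining = amount_requested
    # Lenders electing the lowest rates are granted a stake first.
    for bid in sorted(eligible, key=lambda b: b.min_rate):
        if remaining <= 0:
            break
        take = min(bid.amount, remaining)  # assumption: last bid may be partially filled
        stakes[bid.lender] = stakes.get(bid.lender, 0) + take
        remaining -= take
    return stakes


# Made-up example: $1,500 of bids compete for a $1,000 listing.
bids = [Bid("ann", 600, 0.12), Bid("bob", 500, 0.10), Bid("cy", 400, 0.15)]
funded = resolve_listing(1000, 0.18, bids)     # bob and ann win, cy is outbid
unfunded = resolve_listing(2000, 0.18, bids)   # only $1,500 bid, so None
```

In the made-up example, the two lowest-rate lenders cover the full $1,000 and the highest-rate bid receives nothing, while the $2,000 listing is not funded at all.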

The borrower’s profile includes independently verified information on credit history, income, and current debts. Prosper also lets members build social networks, join groups (tied by geography, common interest, or common loan purpose), and collect endorsements from friends or group leaders.

Prosper makes the data from its listings publicly available for borrowers, lenders, or the general public to analyze (here is an overview of the data available).


We have created models and conducted analysis with the intent of better understanding three inter-related elements of the Prosper marketplace:

  • Can we predict whether a listing will become a loan?
  • Can we predict whether a loan will be paid back, or default?
  • What features best predict these outcomes?

We believe that an understanding of these questions is important on several fronts. First, borrowers or prospective borrowers would benefit from knowing how to maximize the likelihood of getting a loan at an acceptable interest rate. In addition, Prosper lenders have a stake in maximizing their return on loans and minimizing the chances of a late payment or default, which carries costs measured in both dollars and time. Finally, a better understanding of loan dynamics would allow Prosper to increase revenues by taking measures to raise the loan conversion rate and lower the default rate.

There are a number of interesting research questions inherent in this analysis. We are interested, most basically, in whether peer-to-peer lending is a viable model for borrowers, lenders, and Prosper. Several questions in the social/behavioral realm are of interest: how much do social factors matter in calculating these likelihoods? How do lenders make decisions?

Finally, we seek to address methodological questions including: which models have the greatest predictive power? What are the costs of and tradeoffs between the various models? How can human classification aid machine learning?



Source Code

Source is available for the following parts:

Would you lend to this person?