Award Abstract # 0216213
SBIR Phase II: A Machine Learning Approach to Approximate Record Matching

NSF Org: TI (Translational Impacts)
Recipient:
Initial Amendment Date: July 12, 2002
Latest Amendment Date: October 25, 2004
Award Number: 0216213
Award Instrument: Standard Grant
Program Manager: Juan E. Figueroa
TIP (Directorate for Technology, Innovation, and Partnerships)
Start Date: July 15, 2002
End Date: June 30, 2005 (Estimated)
Total Intended Award Amount: $499,764.00
Total Awarded Amount to Date: $880,105.00
Funds Obligated to Date: FY 2002 = $499,764.00
FY 2005 = $380,341.00
History of Investigator:
  • Andrew Borthwick (Principal Investigator)
    Andrew.Borthwick@choicemaker.com
Recipient Sponsored Research Office: ChoiceMaker Technologies, Inc.
48 Wall Street, 11th Floor
New York
NY  US  10003-4602
(212)918-4412
Sponsor Congressional District: 10
Primary Place of Performance: ChoiceMaker Technologies, Inc.
48 Wall Street, 11th Floor
New York
NY  US  10003-4602
Primary Place of Performance Congressional District: 10
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): SBIR Phase II
Primary Program Source: app-0102 
app-0105 
Program Reference Code(s): 9215, HPCC
Program Element Code(s): 537300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.084

ABSTRACT

This Small Business Innovation Research (SBIR) Phase II project will enhance the company's approximate record-matching software, the Maximum Entropy De-Duper (MEDD(TM)), by: 1) Improving MEDD's performance using advanced standardization tools that convert data, such as names and addresses, into standard formats; 2) Expanding MEDD's market by matching business names as well as person names; 3) Internationalizing MEDD to support Canadian French or Mexican Spanish; 4) Benchmarking MEDD against the competition and developing a methodology to objectively compare matching systems; 5) Reducing MEDD's reliance on training data to ease deployment, producing the best possible "untrained" models that will adapt and improve through client use; 6) Applying the latest advances in machine learning technology to the record-matching problem to increase competitive advantage; and 7) Speeding MEDD word blocking with a fast, innovative memory-resident data store.

MEDD's market includes all business and government entities that store mission-critical information in large databases. The project will yield societal benefits for public health, anti-terrorism efforts, epidemiological research, the U.S. Census, and the data quality of records relating to racial and ethnic minorities.

Please report errors in award information by writing to: awardsearch@nsf.gov.
