Award Abstract # 2033604
A1: Systematic Content Analysis of Litigation Events (SCALES) Open Knowledge Network to Enable Transparency and Access to Court Records

NSF Org: ITE
Innovation and Technology Ecosystems
Recipient: NORTHWESTERN UNIVERSITY
Initial Amendment Date: August 22, 2020
Latest Amendment Date: August 28, 2023
Award Number: 2033604
Award Instrument: Cooperative Agreement
Program Manager: Jemin George
jgeorge@nsf.gov
 (703)292-2251
ITE
 Innovation and Technology Ecosystems
TIP
 Directorate for Technology, Innovation, and Partnerships
Start Date: September 1, 2020
End Date: August 31, 2024 (Estimated)
Total Intended Award Amount: $4,999,771.00
Total Awarded Amount to Date: $5,520,669.00
Funds Obligated to Date: FY 2020 = $2,532,105.00
FY 2021 = $2,467,666.00
FY 2022 = $520,898.00
History of Investigator:
  • Luis Amaral (Principal Investigator)
    amaral@northwestern.edu
  • Adam Pah (Co-Principal Investigator)
  • David Schwartz (Co-Principal Investigator)
  • Rachel Mersey (Co-Principal Investigator)
  • Charlotte Alexander (Co-Principal Investigator)
Recipient Sponsored Research Office: Northwestern University
633 CLARK ST
EVANSTON
IL  US  60208-0001
(312)503-7955
Sponsor Congressional District: 09
Primary Place of Performance: Northwestern University
2145 Sheridan Rd
Evanston
IL  US  60208-3111
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): EXZVPWZBLUE8
Parent UEI:
NSF Program(s): Convergence Accelerator Resrch,
CA-HDR: Convergence Accelerato
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
01002223DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s): 131Y00, 095Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.084

ABSTRACT

The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future.

This project will develop the Systematic Content Analysis of Litigation EventS Open Knowledge Network (SCALES OKN). The SCALES OKN seeks to create the computational and data science tools needed to democratize access to court records. Greater access to court records and analysis tools will enable policy makers, scholars, journalists, entrepreneurs, and the public to directly engage with and evaluate the workings of the U.S. courts.

The U.S. court system collects detailed data about its activities, but most of this data sits behind paywalls and in scattered systems that are difficult to access. Such limited access means that court records are functionally inaccessible to the public, and it has prevented the development of tools to turn court data into information and insights. The SCALES OKN will develop aggregation and analysis tools that will bring together a community of public servants, academic institutions, non-profits, private organizations, and individuals to better understand how litigation proceeds. Access to these new data and analysis tools will enable legal scholars to better analyze litigation processes, entrepreneurs to assess litigation costs and risk, journalists to investigate equity in outcomes, advocacy organizations to assess public policy needs, and the public to better understand how the modern judiciary functions.

This project joins 22 scholars in computer and data science, economics, journalism, law, and sociology from eight universities with a large and diverse range of partners from non-profit and for-profit organizations. The SCALES OKN's existing partnerships will enable users to ask questions such as how lawsuits involving Fortune 500 companies or with representation from large law firms progress, or whether judicial rules are consistently implemented. As the project develops, additional data and tools will enable an even richer view into topics such as how new laws impact the judiciary, corporations, and individuals, or how a changing economic climate impacts people and organizations, whether because of a global economic downturn or changes to the nature of employment as impacts from the COVID-19 epidemic unfold.

This team is building SCALES OKN as an open and freely accessible knowledge network. Their efforts include developing the tools to transform the data that define court records into actionable information. This work will include the development of tools to extract and transform data from court records, resolve and disambiguate entities, and enable the automated identification of litigation events and construction of a lawsuit's lifecycle. Rather than having users depend on their own data skills, the SCALES effort plans to map user information requests onto the analyses needed to address questions of relationships, correlations, trends, and distributions of actions and decisions in the legal system. The team also plans to build tools that facilitate the continued growth of open knowledge networks through public contributions. The project will leverage machine learning to enable users to develop further ontologies and merge additional datasets to answer novel questions. Importantly, these advances will allow for the rapid expansion of natural language processing techniques to legal contexts and catalyze further computational analysis of the law.
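
As a rough illustration of what "resolve and disambiguate entities" can involve, the Python sketch below groups variant spellings of a judge's name under one normalized key. It is a toy under stated assumptions: the function names, the list of titles, and the sample strings are all hypothetical and are not taken from the SCALES codebase. The abstract describes machine learning approaches, for which these hand-written rules are only a stand-in.

```python
import re
from collections import defaultdict

# Hypothetical set of titles to strip; not drawn from the SCALES project.
TITLES = {"judge", "hon", "honorable", "magistrate", "chief", "district", "senior"}


def normalize_name(raw: str) -> str:
    """Lowercase the name and drop punctuation, titles, and one-letter initials."""
    tokens = re.findall(r"[a-z]+", raw.lower())
    kept = [t for t in tokens if t not in TITLES and len(t) > 1]
    return " ".join(kept)


def group_mentions(mentions):
    """Map each normalized name to the list of raw mentions that share it."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[normalize_name(mention)].append(mention)
    return dict(groups)


if __name__ == "__main__":
    # Placeholder names, not real docket data.
    sample = ["Hon. Jane Q. Doe", "Judge Jane Doe", "JANE DOE", "Magistrate Judge John Roe"]
    for key, raw in group_mentions(sample).items():
        print(key, "<-", raw)
```

Rules this simple would merge two different judges who share a first and last name, which is one reason a real system needs more context (court, dates, case assignments) than a string key provides.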

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Adler, Rachel F. and Paley, Andrew and Li Zhao, Andong L. and Pack, Harper and Servantez, Sergio and Pah, Adam R. and Hammond, Kristian and SCALES OKN Consortium "A user-centered approach to developing an AI system analyzing U.S. federal court data" Artificial Intelligence and Law, 2022 https://doi.org/10.1007/s10506-022-09320-z
Alexander, Charlotte and Dahlberg, Nathan and Tucker, Anne M "Settlement as Construct: Defining and Counting Party Resolution in Federal District Court" Northwestern University Law Review, v.119, 2024 https://doi.org/10.2139/ssrn.4963118
Nunes Amaral, Luís A "Artificial intelligence needs a scientific method-driven reset" Nature Physics, v.20, 2024 https://doi.org/10.1038/s41567-024-02403-5
Pah, Adam R. and Rozolis, Christian J. and Schwartz, David L. and Alexander, Charlotte S. and SCALES OKN Consortium "PRESIDE: A Judge Entity Recognition and Disambiguation Model for US District Court Records" 2021 IEEE International Conference on Big Data (Big Data), 2021 https://doi.org/10.1109/BigData52589.2021.9671351
Pah, Adam R and Schwartz, David L and Sanga, Sarath and Alexander, Charlotte S and Hammond, Kristian J and Amaral, Luís A.N. "The Promise of AI in an Open Justice System" AI Magazine, v.43, 2022 https://doi.org/10.1002/aaai.12039
Paley, Andrew and Zhao, Andong L. and Pack, Harper and Servantez, Sergio and Adler, Rachel F. and Sterbentz, Marko and Pah, Adam and Schwartz, David and Barrie, Cameron and Einarsson, Alexander and Hammond, Kristian "From data to information: automating data science to explore the U.S. court system" ICAIL '21: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, 2021 https://doi.org/10.1145/3462757.3466100
Schwartz, David L and Albrecht, Kat and Pah, Adam and Cotropia, Christopher Anthony and Sanders, Amy Kristin and Sanga, Sarath and Alexander, Charlotte and Amaral, Luis A N and Clopton, Zachary D and Tucker, Anne M and Gaylord, Thomas and Daniel, Scott an "The SCALES Project: Making Federal Court Records Free" Northwestern University Law Review, v.119, 2024 https://doi.org/10.2139/ssrn.4948027

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Modern governments gather information across an extraordinary range of activities and use this information to direct policy. The courts archive detailed information about their own activities, but surprisingly keep it out of the public’s reach, even though nearly every citizen and organization interacts with the judiciary, either directly or indirectly. The lack of comprehensive access to public court records isolates the judiciary from oversight and could compromise the legitimacy of the justice system.

The goal of the Systematic Content Analysis of Litigation EventS Open Knowledge Network (SCALES OKN) is to transform the transparency and accessibility of court records. To achieve this goal, we developed an open knowledge network that contains an enriched and linked web of federal court data and a public, online data explorer that enables journalists, scholars, and lawyers to easily explore and analyze the SCALES OKN.

The SCALES OKN comprises more than 750,000 federal district court cases, notably including every federal civil and criminal case filed in all 94 district courts over two years. The data has been extracted from these case records and linked, with entities like major corporations, lawyers, law firms, and judges identified in the records and connected across cases. We also developed a litigation event ontology, which identifies key events that occur in cases and marks milestones like an order from a judge or parties reaching a settlement, and applied the ontology to the natural-language text of the records using AI. Having clear labels for both what events transpired in cases and who is involved across cases makes it easy, for the first time, for the public to answer questions about what happens in federal court.

Importantly, the SCALES OKN is free and publicly available for use, both as raw data and through the SCALES OKN data explorer. The data explorer (available at http://dataexplorer.scales-okn.org) makes it easy for journalists, legal scholars, and the broader public to filter the data based on key litigation events or types of cases and to easily answer questions about how long these cases take to finish or where they occur across the country.
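
To make the kind of question described above concrete, the following Python sketch filters a handful of case records by a labeled litigation event and reports the median time from filing to closing. Everything in it is an assumption made for illustration: the record fields, event labels, court names, and dates are invented and do not reflect the actual SCALES OKN schema or the data explorer's interface.

```python
from datetime import date
from statistics import median

# Hypothetical case records; field names and values are illustrative only.
CASES = [
    {"id": "case-1", "court": "court-a", "filed": date(2016, 1, 1),
     "closed": date(2016, 1, 31), "events": {"complaint", "settlement"}},
    {"id": "case-2", "court": "court-b", "filed": date(2016, 3, 1),
     "closed": date(2016, 3, 11), "events": {"complaint", "dismissal"}},
    {"id": "case-3", "court": "court-a", "filed": date(2017, 5, 1),
     "closed": date(2017, 5, 21), "events": {"complaint", "settlement"}},
    {"id": "case-4", "court": "court-b", "filed": date(2017, 6, 1),
     "closed": None, "events": {"complaint", "settlement"}},  # still open
]


def duration_days(case):
    """Number of days between filing and closing for a closed case."""
    return (case["closed"] - case["filed"]).days


def median_duration_with_event(cases, event):
    """Median duration in days of closed cases that contain the given event label."""
    durations = [
        duration_days(c)
        for c in cases
        if event in c["events"] and c["closed"] is not None
    ]
    return median(durations) if durations else None


if __name__ == "__main__":
    print(median_duration_with_event(CASES, "settlement"))
```

Open cases are excluded here because they have no closing date. A fuller analysis would have to decide how to treat them, since dropping them biases duration estimates toward cases that resolve quickly.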

 


Last Modified: 12/12/2024
Modified by: Adam R Pah

Please report errors in award information by writing to: awardsearch@nsf.gov.
