Novel Use of Regex for Echocardiogram Data Transformation
Abstract
Background: Echocardiogram data provide a rich picture of a patient's cardiac function and, more broadly, of an individual's health status. These data have therefore been the subject of extensive research and analysis; however, analysis first requires manually transcribing values from the medical report into an analytically favorable format such as a spreadsheet. A tool for automated data transformation is needed.
Results: Ready-made, Easy Echocardiogram Data for Research (REEDR) is a software script that uses regular expressions to rapidly transform data from echocardiogram reports into spreadsheet format. REEDR emphasizes ease of use and reliability. Its goal is to iterate over any number of echocardiogram reports and quickly transform their values into an analysis-friendly spreadsheet.
Discussion: Prior to REEDR, manual data entry and curation required significant human labor and were susceptible to data entry errors. The novel use of regular expressions in a Python script provides the flexibility to iterate over many differing types of echocardiogram reports and quickly generate a spreadsheet file suited to analysis. Source code and documentation can be obtained from https://github.com/mbrockman1/REEDR.
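The general technique described in the abstract, regular expressions pulling labeled measurements out of free-text reports and writing them as spreadsheet rows, can be illustrated with a minimal sketch. This is not REEDR's code: the field names, patterns, sample report text, and the helper functions `extract_fields` and `reports_to_csv` are all hypothetical and chosen only for illustration. Real report wording varies by vendor and institution, so actual patterns would need to be adapted.

```python
import csv
import io
import re

# Hypothetical field patterns (assumptions, not taken from REEDR).
# Each pattern captures one numeric value following a measurement label.
FIELD_PATTERNS = {
    "lvef_percent": re.compile(
        r"(?:LVEF|Ejection Fraction)\s*[:=]?\s*(\d+(?:\.\d+)?)\s*%", re.IGNORECASE
    ),
    "lvidd_cm": re.compile(
        r"LVIDd\s*[:=]?\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE
    ),
    "la_diameter_cm": re.compile(
        r"LA (?:diameter|dimension)\s*[:=]?\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE
    ),
}


def extract_fields(report_text):
    """Return one dict per report; fields not found are set to None."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(report_text)
        row[name] = float(match.group(1)) if match else None
    return row


def reports_to_csv(reports):
    """Convert a mapping of report_id -> report text into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["report_id", *FIELD_PATTERNS])
    writer.writeheader()
    for report_id, text in reports.items():
        # csv writes None as an empty cell, which marks a missing measurement.
        writer.writerow({"report_id": report_id, **extract_fields(text)})
    return buffer.getvalue()


# Made-up report snippets used only to exercise the sketch.
SAMPLE_REPORTS = {
    "r1": "LVEF: 55 %. LVIDd 4.8 cm. LA diameter: 3.9 cm.",
    "r2": "Ejection Fraction = 40%. LVIDd: 5.6 cm.",
}

csv_text = reports_to_csv(SAMPLE_REPORTS)
print(csv_text)
```

Keeping the patterns in a single dictionary means that supporting a differently worded report is a matter of editing or adding one pattern rather than changing the extraction loop. Leaving unmatched fields empty instead of raising an error lets a batch run finish while still making gaps visible in the resulting spreadsheet.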