OSIM2 Source Code and Simulated Datasets are Available

OMOP has released OSIM2, the second generation of its Observational Medical Dataset Simulator. In addition to releasing the source code for the simulator, we are making 16 simulated datasets available for public use. Each is a 10-million-person dataset modeled after the Thomson Reuters MarketScan® Lab Database. One dataset has no signals injected; the other 15 contain injected signals of different sizes and types.

Note that these are very large files. The datasets are available for download through OMOP’s anonymous FTP server. We have tested the OSIM2 dataset downloads using FileZilla and WS-FTP. FileZilla is free, open-source client software that can be downloaded from: http://filezilla-project.org/download.php

The initial Observational Medical Dataset Simulator was released in 2009 and was used to generate datasets with millions of hypothetical patients with drug exposures, background conditions, and known adverse events for the purpose of benchmarking the performance of methods.

Please contact OMOP to share your experience with the OSIM2 datasets.