<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>RWD on Degang Wang</title><link>https://degangwang.com/tags/rwd/</link><description>Recent content in RWD on Degang Wang</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 10 May 2026 00:00:00 -0500</lastBuildDate><atom:link href="https://degangwang.com/tags/rwd/index.xml" rel="self" type="application/rss+xml"/><item><title>From Databricks to Your Analytics Tool: A RWD Workflow</title><link>https://degangwang.com/2026/05/10/databricks-rwd-workflow/</link><pubDate>Sun, 10 May 2026 00:00:00 -0500</pubDate><guid>https://degangwang.com/2026/05/10/databricks-rwd-workflow/</guid><description>&lt;h2 id="why-not-just-load-everything-locally"&gt;Why Not Just Load Everything Locally?&lt;/h2&gt;
&lt;p&gt;Real-world data (RWD) — claims, EHR, registries — is massive. A single claims database can have billions of rows across diagnosis, procedure, and pharmacy tables. You cannot &lt;code&gt;pd.read_csv()&lt;/code&gt; your way through it.&lt;/p&gt;
&lt;p&gt;The workflow that works:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Manipulate data in Databricks&lt;/strong&gt; — filter, join, aggregate using SQL on distributed compute&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Create an analytic-ready table&lt;/strong&gt; — a focused cohort with only the columns and rows you need&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bring the result to your analytics tool&lt;/strong&gt; — Python, R, or SAS for statistical analysis and visualization&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This post walks through that workflow using a diabetes cohort as an example. Although the examples use Databricks, the same pattern applies to any big data computing engine, such as Snowflake, Impala, AWS Athena, or Google BigQuery: reduce the data remotely, then bring only the result to your analytics tool.&lt;/p&gt;</description></item><item><title>Mass General Brigham EHR Data: A Real-World Data Resource for Research</title><link>https://degangwang.com/2026/05/08/mgb-ehr-data/</link><pubDate>Sun, 10 May 2026 00:00:00 -0500</pubDate><guid>https://degangwang.com/2026/05/08/mgb-ehr-data/</guid><description>&lt;h2 id="what-is-mass-general-brigham"&gt;What Is Mass General Brigham?&lt;/h2&gt;
&lt;p&gt;Mass General Brigham (MGB), formerly Partners HealthCare, is one of the largest academic health systems in the United States. It includes Massachusetts General Hospital, Brigham and Women&amp;rsquo;s Hospital, and several other hospitals and community health centers across New England. The system serves millions of patients and generates rich longitudinal electronic health record (EHR) data.&lt;/p&gt;</description></item></channel></rss>