<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Degang Wang</title><link>https://degangwang.com/</link><description>Recent content on Degang Wang</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 10 May 2026 00:00:00 -0500</lastBuildDate><atom:link href="https://degangwang.com/index.xml" rel="self" type="application/rss+xml"/><item><title>From Databricks to Your Analytics Tool: A RWD Workflow</title><link>https://degangwang.com/2026/05/10/databricks-rwd-workflow/</link><pubDate>Sun, 10 May 2026 00:00:00 -0500</pubDate><guid>https://degangwang.com/2026/05/10/databricks-rwd-workflow/</guid><description>&lt;h2 id="why-not-just-load-everything-locally"&gt;Why Not Just Load Everything Locally?&lt;/h2&gt;
&lt;p&gt;Real-world data (RWD) — claims, EHR, registries — is massive. A single claims database can have billions of rows across diagnosis, procedure, and pharmacy tables. You cannot &lt;code&gt;pd.read_csv()&lt;/code&gt; your way through it.&lt;/p&gt;
&lt;p&gt;The workflow that works:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Manipulate data in Databricks&lt;/strong&gt; — filter, join, aggregate using SQL on distributed compute&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Create an analytic-ready table&lt;/strong&gt; — a focused cohort with only the columns and rows you need&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bring the result to your analytics tool&lt;/strong&gt; — Python, R, or SAS for statistical analysis and visualization&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This post walks through that workflow using a diabetes cohort as an example. While we use Databricks here, the same pattern applies to any big data computing engine — Snowflake, Impala, AWS Athena, or Google BigQuery. The principle is the same: reduce data remotely, then bring the result to your analytics tool.&lt;/p&gt;</description></item><item><title>Mass General Brigham EHR Data: A Real-World Data Resource for Research</title><link>https://degangwang.com/2026/05/08/mgb-ehr-data/</link><pubDate>Sun, 10 May 2026 00:00:00 -0500</pubDate><guid>https://degangwang.com/2026/05/08/mgb-ehr-data/</guid><description>&lt;h2 id="what-is-mass-general-brigham"&gt;What Is Mass General Brigham?&lt;/h2&gt;
&lt;p&gt;Mass General Brigham (MGB), formerly Partners HealthCare, is one of the largest academic health systems in the United States. It includes Massachusetts General Hospital, Brigham and Women&amp;rsquo;s Hospital, and several other hospitals and community health centers across New England. The system serves millions of patients and generates rich longitudinal electronic health record (EHR) data.&lt;/p&gt;</description></item><item><title>Reproducibility in Real-World Evidence: Lessons from the REPEAT Initiative</title><link>https://degangwang.com/2026/05/08/rwe-reproducibility/</link><pubDate>Sun, 10 May 2026 00:00:00 -0500</pubDate><guid>https://degangwang.com/2026/05/08/rwe-reproducibility/</guid><description>&lt;h2 id="the-paper"&gt;The Paper&lt;/h2&gt;
&lt;p&gt;Wang SV, Kattinakere Sreedhara S, Schneeweiss S, &amp;amp; REPEAT Initiative. &lt;em&gt;Reproducibility of real-world evidence studies using clinical practice data to inform regulatory and coverage decisions.&lt;/em&gt; Nature Communications. 2022;13:5126. &lt;a href="https://doi.org/10.1038/s41467-022-32310-3" target="_blank" rel="noopener noreferrer"&gt;DOI: 10.1038/s41467-022-32310-3&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Databricks Connect: Run Spark Code from Your Local IDE</title><link>https://degangwang.com/2026/05/08/databricks-connect/</link><pubDate>Fri, 08 May 2026 00:00:00 -0500</pubDate><guid>https://degangwang.com/2026/05/08/databricks-connect/</guid><description>&lt;h2 id="why-databricks-connect"&gt;Why Databricks Connect?&lt;/h2&gt;
&lt;p&gt;The Databricks web notebook is great for interactive exploration, but sometimes you want to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use your preferred local IDE (VS Code, PyCharm, etc.)&lt;/li&gt;
&lt;li&gt;Version-control your code with Git&lt;/li&gt;
&lt;li&gt;Run scripts in CI/CD pipelines&lt;/li&gt;
&lt;li&gt;Debug with local tools&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Databricks Connect lets you do all of this while still using a remote Databricks cluster for compute.&lt;/p&gt;</description></item><item><title>Life Stories — Coming Soon</title><link>https://degangwang.com/2026/05/08/life-coming-soon/</link><pubDate>Fri, 08 May 2026 00:00:00 -0500</pubDate><guid>https://degangwang.com/2026/05/08/life-coming-soon/</guid><description>&lt;p&gt;Coming soon&amp;hellip;&lt;/p&gt;</description></item><item><title>Authoring mathematical formulae</title><link>https://degangwang.com/2025/07/06/mathematical-formulae/</link><pubDate>Sun, 06 Jul 2025 00:00:00 -0500</pubDate><guid>https://degangwang.com/2025/07/06/mathematical-formulae/</guid><description>&lt;h2 id="authoring-mathematical-and-chemical-equations"&gt;Authoring mathematical and chemical equations&lt;/h2&gt;
&lt;p&gt;Cleanwhite theme now has built-in &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mtext&gt;KaTeX&lt;/mtext&gt;&lt;/mrow&gt;&lt;annotation encoding="application/x-tex"&gt;\KaTeX&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class="katex-html" aria-hidden="true"&gt;&lt;span class="base"&gt;&lt;span class="strut" style="height:0.8988em;vertical-align:-0.2155em;"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord textrm"&gt;K&lt;/span&gt;&lt;span class="mspace" style="margin-right:-0.17em;"&gt;&lt;/span&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist" style="height:0.6833em;"&gt;&lt;span style="top:-2.905em;"&gt;&lt;span class="pstrut" style="height:2.7em;"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord textrm mtight sizing reset-size6 size3"&gt;A&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace" style="margin-right:-0.15em;"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord textrm"&gt;T&lt;/span&gt;&lt;span class="mspace" style="margin-right:-0.1667em;"&gt;&lt;/span&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist" style="height:0.4678em;"&gt;&lt;span style="top:-2.7845em;"&gt;&lt;span class="pstrut" style="height:3em;"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord textrm"&gt;E&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist" style="height:0.2155em;"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace" style="margin-right:-0.125em;"&gt;&lt;/span&gt;&lt;span class="mord textrm"&gt;X&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; support, so that you can easily include
complex mathematical formulae in your web page, either inline or centred
on its own line. The theme uses Hugo&amp;rsquo;s embedded instance of the KaTeX
display engine to render mathematical markup to HTML at build time.
Because formulae are rendered server-side, the same output is produced
regardless of your browser or your environment.&lt;/p&gt;</description></item><item><title/><link>https://degangwang.com/about/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://degangwang.com/about/</guid><description>&lt;h2 id="about-me"&gt;About Me&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Degang Wang&lt;/em&gt;&lt;/strong&gt; is a real-world data (RWD) and analytics practitioner who combines epidemiology, statistical methodology, and modern data platforms to generate credible evidence for clinical and business decisions. His experience centers on RWD study execution, data source evaluation, and the development of scalable, cloud-native analytics infrastructure.&lt;/p&gt;
&lt;p&gt;Degang works in HEOR Epidemiology and Real-World Evidence at &lt;a href="https://www.regeneron.com/" target="_blank" rel="noopener noreferrer"&gt;Regeneron&lt;/a&gt;, with prior experience at &lt;a href="https://www.houston.hsrd.research.va.gov/" target="_blank" rel="noopener noreferrer"&gt;Baylor College of Medicine&lt;/a&gt;, &lt;a href="https://www.merative.com/truven" target="_blank" rel="noopener noreferrer"&gt;Truven Health Analytics&lt;/a&gt; (now part of Merative), &lt;a href="https://www.amgen.com/" target="_blank" rel="noopener noreferrer"&gt;Amgen&lt;/a&gt;, and &lt;a href="https://www.abbvie.com/" target="_blank" rel="noopener noreferrer"&gt;Allergan&lt;/a&gt; (now part of AbbVie), spanning academic, consulting, and industry settings.&lt;/p&gt;</description></item><item><title/><link>https://degangwang.com/search/placeholder/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://degangwang.com/search/placeholder/</guid><description/></item><item><title>Posts Archive</title><link>https://degangwang.com/archive/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://degangwang.com/archive/</guid><description/></item></channel></rss>