Article: Exploring Scientific Data Files in VS Code with Xarray #821
etienneschalk wants to merge 9 commits into xarray-contrib:main from etienneschalk:blog-post-scientific-data-viewer
Commits:
c7e917d Article: Exploring Scientific Data Files in VS Code with Xarray (etienneschalk)
d234f03 Update dependencies (etienneschalk)
12ad723 bugfix: ERROR: This build is using Turbopack (etienneschalk)
82aebe2 Added link to scientific data viewer article in banner (etienneschalk)
ae17fe0 Updated the writing date of the article (etienneschalk)
727c3ea Added logo, aligned tables and images (etienneschalk)
291c443 Update thanks (etienneschalk)
07f1e92 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
85d0565 Reverted dependencies updates since the preview was failing (etienneschalk)
Binary file added (+92.4 KB): public/posts/scientific-data-viewer/dark-exported-html-report-in-firefox-0.8.0.png
Binary file added (+113 KB): public/posts/scientific-data-viewer/light-nc-xarray-html-and-text-repr-0.3.0.png
Binary file added (+227 KB): ...c/posts/scientific-data-viewer/light-zarr-tree-view-focus-on-variable-0.3.0.png
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,217 @@ | ||
| --- | ||
| title: 'Exploring Scientific Data Files in VS Code with Xarray' | ||
| date: '2026-02-08' | ||
| authors: | ||
| - name: Etienne Schalk | ||
| github: etienneschalk | ||
| summary: 'Scientific Data Viewer is a VS Code extension that lets you explore NetCDF, Zarr, HDF5, GRIB, GeoTIFF, and other scientific data files directly in your editor.' | ||
| --- | ||
|
|
||
| ## TL;DR | ||
|
|
||
| Scientific Data Viewer is a VS Code extension that lets you explore NetCDF, Zarr, HDF5, GRIB, GeoTIFF, and other scientific data files directly in your editor. Built on Xarray, it displays the familiar HTML and text representations you know from Jupyter notebooks, making it easy to inspect file structure, dimensions, coordinates, and attributes without leaving your development environment. | ||
|
|
||
| <div align="center"> | ||
| <img src="/posts/scientific-data-viewer/scientific-data-viewer-logo.png" alt="Scientific Data Viewer Icon" width="128" height="128"/> | ||
|
|
||
| Available on: | ||
| [VS Code Marketplace](https://marketplace.visualstudio.com/items?itemName=eschalk0.scientific-data-viewer) • [Open VSX Registry](https://open-vsx.org/extension/eschalk0/scientific-data-viewer) | ||
|
|
||
| </div> | ||
|
|
||
| --- | ||
|
|
||
| ## The Problem | ||
|
|
||
| If you work with scientific data, you've probably developed a routine: open a terminal, start a Python REPL or Jupyter notebook, import xarray, load your dataset, and finally see what's inside. This workflow is fine for analysis, but it adds friction when you just want to quickly check a file's structure. What dimensions does it have? What variables? What's the time range? | ||
|
|
||
| Traditional tools exist, but each comes with limitations. `ncdump` is a classic command-line utility limited to NetCDF files, and its output becomes unwieldy for files with many groups or variables. Without interactivity, you're left scrolling through walls of text. [Panoply](https://www.giss.nasa.gov/tools/panoply/) offers a graphical interface and supports more formats (NetCDF, HDF, GRIB), but it requires opening a separate application, breaking your development flow. Neither tool supports newer formats like Zarr, or less common ones like GeoTIFF and JPEG-2000. In contrast, Scientific Data Viewer aims to open any format that Xarray can handle. | ||
|
|
||
| For quick inspection tasks, this context switch is costly. You might also find yourself with multiple tabs open, trying to remember which notebook was showing which file. And if you're working on a codebase that processes scientific data, you're constantly jumping between your code and external tools to verify inputs and outputs. | ||
|
|
||
| ## A Simpler Approach | ||
|
|
||
| Scientific Data Viewer brings Xarray's data inspection capabilities directly into VS Code. Click on a `.nc` file in the explorer, and instead of seeing binary gibberish, you get the same rich representation you're used to from Jupyter: | ||
|
|
||
| <div align="center"> | ||
|
|
||
|  | ||
|
|
||
| </div> | ||
|
|
||
| The extension uses Xarray under the hood to open files and extract metadata. It supports the formats that Xarray supports, including: | ||
|
|
||
| <div align="center"> | ||
|
|
||
| <table> | ||
| <thead> | ||
| <tr> | ||
| <th>Format</th> | ||
| <th>Extensions</th> | ||
| </tr> | ||
| </thead> | ||
| <tbody> | ||
| <tr> | ||
| <td>NetCDF</td> | ||
| <td> | ||
| <code>.nc</code>, <code>.netcdf</code>, <code>.nc4</code> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td>CDF (NASA)</td> | ||
| <td> | ||
| <code>.cdf</code> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td>Zarr</td> | ||
| <td> | ||
| <code>.zarr</code> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td>HDF5</td> | ||
| <td> | ||
| <code>.h5</code>, <code>.hdf5</code> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td>GRIB</td> | ||
| <td> | ||
| <code>.grib</code>, <code>.grib2</code>, <code>.grb</code> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td>GeoTIFF</td> | ||
| <td> | ||
| <code>.tif</code>, <code>.tiff</code>, <code>.geotiff</code> | ||
| </td> | ||
| </tr> | ||
| <tr> | ||
| <td>JPEG-2000</td> | ||
| <td> | ||
| <code>.jp2</code>, <code>.jpeg2000</code> | ||
| </td> | ||
| </tr> | ||
| </tbody> | ||
| </table> | ||
|
|
||
| </div> | ||
|
|
||
| ## What You Can Do | ||
|
|
||
| ### Browse Structure Without Code | ||
|
|
||
| The viewer displays comprehensive file information: | ||
|
|
||
| - **File metadata**: path, size, format | ||
| - **Xarray HTML representation**: the interactive, collapsible view you know from notebooks | ||
| - **Xarray text representation**: the traditional `print(ds)` output | ||
| - **Dimensions and coordinates**: with their types, shapes, and sample values | ||
| - **Variables**: with data types, dimensions, and memory usage | ||
| - **Attributes**: both global and per-variable | ||
|
|
||
| For files with hierarchical structure (like nested Zarr groups or HDF5 groups), the extension flattens the tree and displays each group's contents separately. | ||
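
As a rough illustration of the flattening described above, the sketch below walks a nested group hierarchy and emits one entry per group path. It uses a plain nested dictionary as a stand-in for a Zarr or HDF5 hierarchy. The `flatten_groups` helper and the sample layout are hypothetical and are not taken from the extension's code.

```python
from __future__ import annotations

from typing import Iterator


def flatten_groups(group: dict, path: str = "/") -> Iterator[tuple[str, list[str]]]:
    """Yield (group_path, variable_names) for a group and all of its subgroups.

    Hypothetical layout: each group is a dict with a "variables" list and a
    "groups" dict that maps child names to child groups.
    """
    yield path, sorted(group.get("variables", []))
    for name, child in sorted(group.get("groups", {}).items()):
        # Build the child path without doubling the root slash.
        child_path = path.rstrip("/") + "/" + name
        yield from flatten_groups(child, child_path)


# Made-up hierarchy, used only for illustration.
tree = {
    "variables": ["time"],
    "groups": {
        "measurements": {
            "variables": ["temperature", "pressure"],
            "groups": {"quality": {"variables": ["flag"], "groups": {}}},
        },
    },
}

flat = list(flatten_groups(tree))
for group_path, variables in flat:
    print(group_path, variables)
```

Each tuple corresponds to one section that a viewer could render separately, which is the idea behind showing every group's contents on its own.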
|
|
||
| ### Tree View in the Sidebar | ||
|
|
||
| A "Data Structure" panel appears in VS Code's explorer sidebar when viewing a scientific data file. This tree view mirrors the structure shown in the main panel and lets you quickly navigate to specific variables or groups. | ||
|
|
||
| <div align="center"> | ||
|
|
||
|  | ||
|
|
||
| </div> | ||
|
|
||
| ### Basic Plotting (Experimental) | ||
|
|
||
| The extension includes experimental plotting capabilities using Matplotlib. You can generate quick visualizations of variables directly in the editor—useful for sanity checks, though not intended to replace proper analysis tools. The plotting automatically adapts to your VS Code theme (light or dark). | ||
|
|
||
| <div align="center"> | ||
|
|
||
|  | ||
|
|
||
| </div> | ||
|
|
||
|
|
||
| ### Export to HTML | ||
|
|
||
| Need to share your data inspection results? The extension can export the entire viewer contents as a self-contained HTML report, including all metadata and representations. This is handy for documentation or for sharing with colleagues who don't have the data file. | ||
|
|
||
| <div align="center"> | ||
|
|
||
|  | ||
|
|
||
| </div> | ||
|
|
||
| ## Getting Started | ||
|
|
||
| ### Installation | ||
|
|
||
| 1. Install from the [VS Code Marketplace](https://marketplace.visualstudio.com/items?itemName=eschalk0.scientific-data-viewer) or [Open VSX Registry](https://open-vsx.org/extension/eschalk0/scientific-data-viewer) | ||
| 2. Ensure you have Python with Xarray and Matplotlib installed, or let the extension create its own isolated environment using [uv](https://docs.astral.sh/uv/) | ||
|
|
||
| ### Required Python Packages | ||
|
|
||
| The extension needs: | ||
|
|
||
| - `xarray` | ||
| - `matplotlib` | ||
|
|
||
| Plus format-specific packages as needed: | ||
|
|
||
| - `netCDF4` or `h5netcdf` for NetCDF | ||
| - `zarr` for Zarr | ||
| - `h5py` for HDF5 | ||
| - `cfgrib` for GRIB | ||
| - `rioxarray` for GeoTIFF/JPEG-2000 | ||
| - `cdflib` for NASA CDF | ||
|
|
||
| The extension will prompt you to install missing packages when you first open a file that needs them. | ||
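
If you prefer to check your environment yourself before opening a file, something like the following sketch may help. It maps a file extension to the packages listed above and reports which ones are missing. The `BACKENDS` table and the `missing_backends` helper are assumptions made for this example; the extension's own detection logic may work differently.

```python
import importlib.util
from pathlib import Path

# Hypothetical mapping from file extension to importable module names,
# based on the package list above. Any one module per entry is enough.
BACKENDS = {
    ".nc": ["netCDF4", "h5netcdf"],
    ".zarr": ["zarr"],
    ".h5": ["h5py"],
    ".grib": ["cfgrib"],
    ".tif": ["rioxarray"],
    ".jp2": ["rioxarray"],
    ".cdf": ["cdflib"],
}


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def missing_backends(filename: str, installed=is_installed) -> list:
    """Return candidate packages to install, or [] if one is already available."""
    suffix = Path(filename).suffix.lower()
    candidates = BACKENDS.get(suffix, [])
    if any(installed(name) for name in candidates):
        return []
    return list(candidates)


print(missing_backends("ocean_temperature.nc"))
```

The `installed` parameter only exists so the check can be exercised without touching the real environment.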
|
|
||
| ### Usage | ||
|
|
||
| Once installed, simply click on any supported file in VS Code's file explorer. The file opens in the Scientific Data Viewer instead of showing raw binary. You can also: | ||
|
|
||
| - Right-click a file and select "Open Scientific Data Viewer" | ||
| - Use the command palette: `Ctrl+Shift+P` (`Cmd+Shift+P` on macOS) → "Open Scientific Data Viewer" | ||
| - Drag and drop files into the editor | ||
|
|
||
| ## How It Works | ||
|
|
||
| The extension is a bridge between VS Code's webview API and Python. When you open a file: | ||
|
|
||
| 1. The extension spawns a Python subprocess | ||
| 2. Python uses Xarray to open the file and extract metadata | ||
| 3. Xarray's HTML representation is captured and sent to the webview | ||
| 4. The TypeScript frontend renders everything in a VS Code tab | ||
|
|
||
| This approach leverages Xarray's existing format support and representation logic, rather than reimplementing file parsing in TypeScript. | ||
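
The subprocess round trip in the steps above can be sketched as follows. This is a simplified model, not the extension's actual code: the real host is written in TypeScript, while this example uses Python on both sides to stay self-contained. Passing JSON over standard output is an assumption, and the worker returns a fixed, made-up payload instead of calling Xarray.

```python
import json
import subprocess
import sys

# Stand-in for the Python side of the bridge. A real worker would open the
# file with Xarray and capture its HTML representation; this stub only echoes
# a made-up payload so the sketch runs without extra dependencies.
WORKER = """
import json, sys
path = sys.argv[1]
payload = {"path": path, "html": "<pre>dataset repr goes here</pre>"}
json.dump(payload, sys.stdout)
"""


def inspect_file(path: str) -> dict:
    """Spawn a Python subprocess and parse the JSON it prints to stdout."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER, path],
        capture_output=True,
        text=True,
        check=True,   # raise if the worker exits with a non-zero status
        timeout=30,   # avoid hanging forever on a stuck worker
    )
    return json.loads(result.stdout)


info = inspect_file("example.nc")
print(info["html"])
```

Keeping the file handling in a separate process means a crash or a slow read in the worker does not block the editor side, which only has to render whatever payload comes back.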
|
|
||
| ## Limitations and Future Work | ||
|
|
||
| The extension is designed for **inspection, not analysis**. It's intentionally lightweight: you won't find sophisticated slicing, aggregation, or data manipulation features here. For that, a proper notebook or script remains the best option. | ||
|
|
||
| The plotting features are basic and best-effort, and they should improve in future versions. | ||
|
|
||
| Contributions are welcome! The project is [open source on GitHub](https://github.com/etienneschalk/scientific-data-viewer). If you encounter issues or have feature requests, please open an issue. | ||
|
|
||
| ## Conclusion | ||
|
|
||
| Scientific Data Viewer fills a small but useful niche: quick, frictionless inspection of scientific data files without leaving your editor. It's not a replacement for Xarray in notebooks: it's a complement that makes the "what's in this file?" question faster to answer. | ||
|
|
||
| If you spend time working with NetCDF, Zarr, or other scientific formats, give it a try. And if you find it useful, consider contributing back, whether by reporting bugs, suggesting improvements, or helping with development! | ||
|
|
||
| ## Thanks | ||
|
|
||
| I would like to thank my colleagues: Nicolas Bertaud, Charles Le Mero, Yoann Rey-Ricord, Thomas Vidal, and Fabien Vidor for their valuable feedback and reviews on the extension stores. I am also grateful to the community members who contributed by reporting issues and suggesting features on GitHub: hbeukers, paulsally, ChunkyPandas03, fpartous and efvik. | ||
|
|
||
| --- | ||
|
|
||
| **Links:** | ||
|
|
||
| - [VS Code Marketplace](https://marketplace.visualstudio.com/items?itemName=eschalk0.scientific-data-viewer) | ||
| - [Open VSX Registry](https://open-vsx.org/extension/eschalk0/scientific-data-viewer) | ||
| - [GitHub Repository](https://github.com/etienneschalk/scientific-data-viewer) | ||
| - [Getting Started Guide](https://github.com/etienneschalk/scientific-data-viewer/wiki/Getting-Started) | ||