Reading and writing files#

This tutorial gives an overview of the different ways to load the binary arrays from the disc after running a numerical simulation with XCompact3d. It also presents some options for saving the results of our analysis, together with some tips and tricks.

Preparation#

Here we prepare the dataset for this notebook, so it can be reproduced on local machines or on the cloud; you are invited to test and interact with many of the concepts. It also provides nice support for courses and tutorials; let us know if you produce any of them.

The very first step is to import the toolbox and other packages:

import warnings

import numpy as np
import xarray as xr

import xcompact3d_toolbox as x3d

Then we can download an example from the online database, the flow around a cylinder in this case. We set cache=True and a local destination where it can be saved on our computer (cache_dir="./example/"), so there is no need to download it every time the kernel is restarted.

cylinder_ds, prm = x3d.tutorial.open_dataset("cylinder", cache=True, cache_dir="./example/")

Let’s take a look at the dataset:

cylinder_ds.info()
xarray.Dataset {
dimensions:
	i = 2 ;
	x = 257 ;
	y = 128 ;
	t = 201 ;

variables:
	float32 u(i, x, y, t) ;
	float32 pp(x, y, t) ;
	float32 epsi(x, y) ;
	float64 x(x) ;
	float64 y(y) ;
	float64 t(t) ;
	<U1 i(i) ;

// global attributes:
	:xcompact3d_version = v3.0-397-gff531df ;
	:xcompact3d_toolbox_version = 1.0.1 ;
	:url = https://github.com/fschuch/xcompact3d_toolbox_data ;
	:dataset_license = MIT License ;
}

We get an xarray.Dataset with the variables u (velocity vector), pp (pressure), and epsi (describing the geometry), their coordinates (x, y, t, and i), and some attributes, like the xcompact3d_version used to run this simulation, the url where the dataset can be found, and others.

In the next block, we configure the toolbox and some attributes on the dataset, so we can write all the binary fields to the disc. Do not worry about the details right now; this is just the preparation step, and we are going to discuss them later.

x3d.param["mytype"] = np.float32

prm.dataset.set(data_path="./data/", drop_coords="z")

cylinder_ds.u.attrs["file_name"] = "u"
cylinder_ds.pp.attrs["file_name"] = "pp"
cylinder_ds.epsi.attrs["file_name"] = "epsilon"

prm.write("input.i3d")

prm.dataset.write(cylinder_ds)

prm.dataset.write_xdmf("xy-planes.xdmf")

del cylinder_ds, prm

After that, the files are organized as follows:

tutorial
│   computing_and_plotting.ipynb
│   io.ipynb
│   input.i3d
│   parameters.ipynb
│   xy-planes.xdmf
│
└─── data
│       │   epsilon.bin
│       │   pp-000.bin
│       │   pp-001.bin
│       │   ... 
│       │   pp-199.bin
│       │   pp-200.bin
│       │   ux-000.bin
│       │   ux-001.bin
│       │   ... 
│       │   ux-199.bin
│       │   ux-200.bin
│       │   uy-000.bin
│       │   uy-001.bin
│       │   ... 
│       │   uy-199.bin
│       │   uy-200.bin
│       │   uz-000.bin
│       │   uz-001.bin
│       │   ... 
│       │   uz-199.bin
│       │   uz-200.bin
│
└─── example
│       │   cylinder.nc

It is very similar to what we get after successfully running a simulation, so now we can move on to the tutorial.

Why xarray?#

The data structures are provided by xarray, which introduces labels in the form of dimensions, coordinates, and attributes on top of raw NumPy-like arrays, allowing for a more intuitive, more concise, and less error-prone developer experience. It also integrates tightly with dask for parallel computing.

The goal here is to speed up the development of customized post-processing applications with the concise interface provided by xarray. Ultimately, we can compute solutions with fewer lines of code and better readability, so we spend less time testing and debugging and more time exploring our datasets and getting insights.

Additionally, xcompact3d-toolbox includes extra functionalities for DataArray and Dataset.

Before going forward, please take a look at Overview: Why xarray? and Quick overview to understand the motivation for using xarray’s data structures instead of just numpy-like arrays.
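To make the motivation concrete, here is a toy example (not part of the tutorial dataset) of how labeled indexing reads compared to positional indexing:

toy = xr.DataArray(
    np.random.rand(4, 3),
    dims=("x", "t"),
    coords={"x": [0.0, 0.5, 1.0, 1.5], "t": [0.0, 0.75, 1.5]},
)

toy.sel(t=0.75)  # select by coordinate value (label-based)
toy.isel(t=1)    # select by position, same result in this case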

Xarray objects on demand#

To start our post-processing, let’s load the parameters file:

prm = x3d.Parameters(loadfile="input.i3d")

Notice there is an entire tutorial dedicated to it.

To save space on the disc, our dataset was converted from double precision to single precision, so we have to configure the toolbox accordingly:

x3d.param["mytype"] = np.float32

The methods in the toolbox support different filename properties, like the classic ux000 or the newer ux-0000.bin, as well as combinations of them. For our case, we set the parameters as:

prm.dataset.filename_properties.set(
    separator="-",
    file_extension=".bin",
    number_of_digits=3,
)

Now we specify the parameters for our dataset: where it is located (data_path); whether any coordinate should be dropped (drop_coords; again to save space, since we are working with a span-wise averaged dataset, we drop z to work with xy planes); the parameter that controls the number of timesteps (snapshot_counting); and their step (snapshot_step). Consult the dataset documentation to see the different ways to customize your experience, and choose the ones that best suit your post-processing application. In this example, they are defined as:

prm.dataset.set(
    data_path="./data/",
    drop_coords="z",
    snapshot_counting="ilast",
    snapshot_step="ioutput",
)

Now we are good to go.

We can check the length of the dataset we are dealing with:

len(prm.dataset)
201

This means that our binary files range from 0 (e.g., ux-000.bin) to 200 (e.g., ux-200.bin), exactly as expected.
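The count itself comes from the two parameters we configured above. A minimal sketch, assuming prm exposes ilast and ioutput as attributes holding the values from input.i3d:

# One snapshot at step 0, plus one every ioutput steps:
assert len(prm.dataset) == prm.ilast // prm.ioutput + 1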

It is possible to load any given array:

epsilon = prm.dataset.load_array("./data/epsilon.bin", add_time=False)

Notice that load_array requires the entire path to the file, and we use add_time=False because this array does not evolve in time like the others, i.e., it is not numbered across several snapshots.

We can see it on the screen:

epsilon
<xarray.DataArray (x: 257, y: 128)> Size: 132kB
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
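As a quick visual check, the geometry can be plotted directly from the DataArray (this uses xarray’s matplotlib-based plotting, so matplotlib must be installed):

epsilon.plot(x="x", y="y")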

Let’s do it again, this time for ux and using add_time=True:

ux = prm.dataset.load_array("./data/ux-100.bin", add_time=True)

See that t is now a coordinate, and for this snapshot it was automatically computed as the dimensionless time 75.0:

ux
<xarray.DataArray 'ux' (x: 257, y: 128, t: 1)> Size: 132kB
array([[[1.        ],
        [1.        ],
        [1.        ],
        ...,
        [1.        ],
        [1.        ],
        [1.        ]],

       [[1.0000466 ],
        [0.99996716],
        [1.0000466 ],
        ...,
        [0.9999681 ],
        [1.0000459 ],
        [0.9999675 ]],

       [[1.0000602 ],
        [1.0000144 ],
        [1.0000602 ],
        ...,
...
        ...,
        [1.0140737 ],
        [1.0142432 ],
        [1.0144366 ]],

       [[1.0146521 ],
        [1.0148891 ],
        [1.0151588 ],
        ...,
        [1.0141058 ],
        [1.0142633 ],
        [1.0144445 ]],

       [[1.0146475 ],
        [1.0148702 ],
        [1.0151254 ],
        ...,
        [1.014144  ],
        [1.0142874 ],
        [1.014454  ]]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 75.0
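As a sketch of where 75.0 comes from (assuming prm exposes dt and ioutput as attributes holding the values from input.i3d), snapshot 100 corresponds to timestep 100 * ioutput:

t_100 = 100 * prm.ioutput * prm.dt  # expected: 75.0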

That is not all. If you have enough memory, you can load the entire time series for a given variable with load_time_series, or simply by:

ux = prm.dataset["ux"]

Let’s see it (note that 201 files are loaded and wrapped with the appropriate coordinates):

ux
<xarray.DataArray 'ux' (x: 257, y: 128, t: 201)> Size: 26MB
array([[[0.9999885 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.9999782 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.9999907 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        ...,
        [1.0000122 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.99999744, 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.9999945 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ]],

       [[0.99999535, 1.0000458 , 1.0000486 , ..., 1.0000464 ,
         1.0000468 , 1.0000476 ],
        [1.0000107 , 0.99997103, 0.99996907, ..., 0.9999675 ,
         0.99996674, 0.9999665 ],
        [1.0000069 , 1.0000455 , 1.0000486 , ..., 1.0000459 ,
         1.0000468 , 1.0000478 ],
...
        [0.9999908 , 1.0001078 , 1.000192  , ..., 1.0155317 ,
         1.0153327 , 1.0146557 ],
        [0.9999987 , 1.0001054 , 1.0001911 , ..., 1.0152053 ,
         1.0150691 , 1.0146141 ],
        [0.9999874 , 1.0000997 , 1.0001901 , ..., 1.0149139 ,
         1.0148364 , 1.0145978 ]],

       [[1.0000043 , 1.0000877 , 1.0001792 , ..., 1.0146514 ,
         1.0146394 , 1.0146087 ],
        [1.0000119 , 1.0000889 , 1.0001761 , ..., 1.014433  ,
         1.0144511 , 1.0146273 ],
        [1.0000004 , 1.0000932 , 1.000174  , ..., 1.0142416 ,
         1.0142918 , 1.0146769 ],
        ...,
        [1.0000123 , 1.0000978 , 1.0001832 , ..., 1.0154953 ,
         1.0153852 , 1.0147215 ],
        [0.99998474, 1.000096  , 1.0001816 , ..., 1.0151795 ,
         1.0151051 , 1.0146575 ],
        [0.99999154, 1.0000918 , 1.0001806 , ..., 1.0148991 ,
         1.0148568 , 1.0146192 ]]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 804B 0.0 0.75 1.5 2.25 3.0 ... 147.8 148.5 149.2 150.0

You can store each array in a different variable, like:

ux = prm.dataset["ux"]
uy = prm.dataset["uy"]
pp = prm.dataset["pp"]

Or organize many arrays in a dataset:

# create an empty dataset
ds = xr.Dataset()

# populate it
for var in ["ux", "uy", "pp"]:
    ds[var] = prm.dataset[var]

# show on the screen
ds
<xarray.Dataset> Size: 79MB
Dimensions:  (x: 257, y: 128, t: 201)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 804B 0.0 0.75 1.5 2.25 3.0 ... 147.8 148.5 149.2 150.0
Data variables:
    ux       (x, y, t) float32 26MB 1.0 1.0 1.0 1.0 ... 1.015 1.015 1.015 1.015
    uy       (x, y, t) float32 26MB 9.98e-06 8.496e-08 ... -0.0005357 0.003209
    pp       (x, y, t) float32 26MB 0.0 0.03264 0.03613 ... 0.03922 0.03859

We can also write a one-liner solution for the previous code:

ds = xr.Dataset({var: prm.dataset[var] for var in "ux uy pp".split()})

It is possible to load all the variables from a given snapshot with load_snapshot, or simply:

snapshot = prm.dataset[100]

We get an xarray.Dataset with all the variables and their coordinates. You can access each of them with the dot notation (e.g., snapshot.pp, snapshot.ux, snapshot.uy) or the dict-like notation (e.g., snapshot["pp"], snapshot["ux"], snapshot["uy"]). See the dataset:

snapshot
<xarray.Dataset> Size: 396kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 75.0
Data variables:
    pp       (x, y, t) float32 132kB 0.05232 0.05219 0.05243 ... 0.03986 0.03989
    ux       (x, y, t) float32 132kB 1.0 1.0 1.0 1.0 ... 1.014 1.014 1.014 1.014
    uy       (x, y, t) float32 132kB 3.407e-07 1.503e-07 ... 0.007724 0.007703

Do you need the snapshots in a given range? No problem. Let’s use a slice to load the last 100 and, just to exemplify, compute a time average:

time_averaged = prm.dataset[-100:].mean("t")
time_averaged
<xarray.Dataset> Size: 396kB
Dimensions:  (x: 257, y: 128)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
Data variables:
    pp       (x, y) float32 132kB 0.05351 0.05335 0.05356 ... 0.03886 0.03886
    ux       (x, y) float32 132kB 1.0 1.0 1.0 1.0 ... 1.015 1.015 1.015 1.015
    uy       (x, y) float32 132kB -6.206e-09 2.081e-09 ... -6.504e-05 -6.531e-05
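With labeled arrays, extending this is straightforward. For instance, a minimal sketch of computing the fluctuations around that time average (xarray broadcasts the subtraction over the t coordinate; note that this loads the slice again):

fluctuations = prm.dataset[-100:] - time_averaged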

You can even use the slice notation to load all the snapshots at once:

prm.dataset[:]
<xarray.Dataset> Size: 79MB
Dimensions:  (x: 257, y: 128, t: 201)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 804B 0.0 0.75 1.5 2.25 3.0 ... 147.8 148.5 149.2 150.0
Data variables:
    pp       (x, y, t) float32 26MB 0.0 0.03264 0.03613 ... 0.03922 0.03859
    ux       (x, y, t) float32 26MB 1.0 1.0 1.0 1.0 ... 1.015 1.015 1.015 1.015
    uy       (x, y, t) float32 26MB 9.98e-06 8.496e-08 ... -0.0005357 0.003209

Of course, unlike this tutorial’s, some simulations may not fit in memory. For these cases, we can iterate over all the snapshots, loading them one by one:

for ds in prm.dataset:
    # Computing the vorticity, just to exemplify
    vort = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")

Note that reversed(prm.dataset) also works.

Or, for finer control, we can iterate over a selected range of snapshots, loading them one by one. The arguments are the same as those of Python’s built-in range:

for ds in prm.dataset(100, 200, 1):
    # Computing the vorticity, just to exemplify
    vort = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")
# Result from the last iteration
vort
<xarray.DataArray (y: 128, t: 1, x: 257)> Size: 132kB
array([[[-0.00268999,  0.00181677,  0.0001196 , ...,  0.00324146,
          0.00302246, -0.01646465]],

       [[-0.0034743 , -0.00406752, -0.00344934, ...,  0.0029434 ,
          0.00254392, -0.01612649]],

       [[-0.00303078, -0.00340789, -0.0030852 , ...,  0.0026302 ,
          0.0024012 , -0.01718407]],

       ...,

       [[-0.00178402, -0.00190772, -0.00161242, ...,  0.00415556,
          0.00375753, -0.01497194]],

       [[-0.00232817, -0.00317189, -0.00255065, ...,  0.00385816,
          0.00362989, -0.01590095]],

       [[-0.00263563,  0.00207518,  0.00038911, ...,  0.00355052,
          0.00316822, -0.01547501]]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 149.2

Writing the results to binary files#

In the last example we computed the vorticity but did nothing with it. This time, let’s write it to the disc using write:

for ds in prm.dataset:
    vort = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")
    prm.dataset.write(data=vort, file_prefix="w3")

The example above works for an xarray.DataArray. We can do the same for an xarray.Dataset, but with one key difference: only the arrays with an attribute called file_name will be written. This is done to avoid accidentally overwriting the base fields (ux, uy, uz, …).

Let’s rewrite the previous example to store vort in the dataset ds. We set the attribute file_name to w3, so the arrays will be written as w3-000.bin, w3-001.bin, w3-002.bin, etc.

We also suppress warnings, because the application will tell us it cannot save pp, ux, and uy, since they do not have a file_name. In fact, we do not want to rewrite them anyway.

See the code:

with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=UserWarning)
    for ds in prm.dataset:
        ds["vort"] = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")
        ds["vort"].attrs["file_name"] = "w3"
        prm.dataset.write(ds)

The method prm.dataset.write() writes the files as raw binaries, in the same way XCompact3d would. This means you can read them back into the flow solver, and also process them with any other tool you are already familiar with, including the toolbox itself.
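As a quick check, one of the files we just wrote can be loaded back with load_array, in the same way as before:

w3_check = prm.dataset.load_array("./data/w3-100.bin", add_time=True)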

For instance, we get w3 if we load snapshot 0 again:

prm.dataset[0]
<xarray.Dataset> Size: 528kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 0.0
Data variables:
    pp       (x, y, t) float32 132kB 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    ux       (x, y, t) float32 132kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
    uy       (x, y, t) float32 132kB 9.98e-06 1.955e-05 ... -2.285e-05 1.962e-05
    w3       (x, y, t) float32 132kB 0.00038 -0.000863 ... -0.001097 4.864e-05

Update the xdmf file#

After computing and writing new results to the disc, you can open them in external tools, like ParaView or VisIt. To do so, update the xdmf file so it includes the recently computed w3. See the code:

prm.dataset.write_xdmf("xy-planes.xdmf")

Other formats#

Xarray objects can be exported to many other formats, depending on your needs.

For instance, xarray.DataArray and xarray.Dataset can be written as netCDF. In this way, they keep all dimensions, coordinates, and attributes. The format is easy to handle and share, since the files are self-contained. It is the format of the dataset we downloaded for this tutorial, and it is a good option for sharing the results of your research.

Just to give you an estimate of the disk usage: the dataset cylinder.nc that we downloaded for this tutorial is 75.8 MB, while the folder ./data/, with the binary files produced in the same way XCompact3d would write them, takes 75.7 MB.

To exemplify the use of netCDF, let’s take one snapshot:

snapshot = prm.dataset[0]
snapshot
<xarray.Dataset> Size: 528kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 0.0
Data variables:
    pp       (x, y, t) float32 132kB 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    ux       (x, y, t) float32 132kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
    uy       (x, y, t) float32 132kB 9.98e-06 1.955e-05 ... -2.285e-05 1.962e-05
    w3       (x, y, t) float32 132kB 0.00038 -0.000863 ... -0.001097 4.864e-05

Now, let’s include additional information for those who are going to use our data. You can set attributes for each array and coordinate, as well as global attributes for the dataset. They are stored in a dictionary.

See the example:

# Setting attributes for each coordinate
snapshot.x.attrs = {"name": "x", "long_name": "Stream-wise coordinate", "units": "-"}
snapshot.y.attrs = {"name": "y", "long_name": "Vertical coordinate", "units": "-"}
snapshot.t.attrs = {"name": "t", "long_name": "Time", "units": "-"}

# Setting attributes for each array
snapshot.ux.attrs = {"name": "ux", "long_name": "Stream-wise velocity", "units": "-"}
snapshot.uy.attrs = {"name": "uy", "long_name": "Vertical velocity", "units": "-"}
snapshot.pp.attrs = {"name": "p", "long_name": "Pressure", "units": "-"}
snapshot.w3.attrs = {"name": "w3", "long_name": "Vorticity", "units": "-"}

# Setting attributes for the dataset
snapshot.attrs = {
    "title": "An example from the tutorials",
    "url": "https://docs.fschuch.com/xcompact3d_toolbox/tutorial/io.html",
    "authors": "List of names",
    "doi": "maybe a fancy doi from zenodo",
}

Exporting it as a netCDF file:

snapshot.to_netcdf("snapshot-000.nc")
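If disk usage is a concern, netCDF also supports transparent compression. A minimal sketch, assuming the netcdf4 engine is installed (the output filename is illustrative):

encoding = {var: {"zlib": True} for var in snapshot.data_vars}
snapshot.to_netcdf("snapshot-000-compressed.nc", encoding=encoding)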

Importing the netCDF file:

snapshot_in = xr.open_dataset("snapshot-000.nc")

See the result; it keeps all dimensions, coordinates, and attributes:

snapshot_in
<xarray.Dataset> Size: 528kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 0.0
Data variables:
    pp       (x, y, t) float32 132kB ...
    ux       (x, y, t) float32 132kB ...
    uy       (x, y, t) float32 132kB ...
    w3       (x, y, t) float32 132kB ...
Attributes:
    title:    An example from the tutorials
    url:      https://docs.fschuch.com/xcompact3d_toolbox/tutorial/io.html
    authors:  List of names
    doi:      maybe a fancy doi from zenodo

We can compare them and see that their data, dimensions and coordinates are exactly the same:

xr.testing.assert_equal(snapshot, snapshot_in)
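For a stricter comparison, xr.testing.assert_identical also checks names and attributes; since netCDF preserves them, it should pass here as well:

xr.testing.assert_identical(snapshot, snapshot_in)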

Xarray is built on top of NumPy, so you can access the underlying numpy.ndarray with the values property (e.g., epsilon.values). It is compatible with numpy.save and many other functions from the NumPy/SciPy ecosystem (often, you do not even need to use .values explicitly). See the example:

np.save("epsi.npy", epsilon)
epsi_in = np.load("epsi.npy")

print(type(epsi_in))
epsi_in
<class 'numpy.ndarray'>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

You can use this for backwards compatibility with your previous post-processing tools. It is just not as effective, because we lose the metadata, like the coordinates and attributes.
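If needed, the metadata can be re-attached by hand. A minimal sketch, reusing the coordinates from the epsilon DataArray that is still in memory:

epsi_da = xr.DataArray(
    epsi_in,
    dims=("x", "y"),
    coords={"x": epsilon.x, "y": epsilon.y},
)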

If you manage to reduce the dataset’s dimensions with some integration, mean, or by selecting subsets of the data, you can convert it to a pandas.DataFrame and then export it to CSV, Excel, and many other formats.

For instance, let’s select a vertical profile for all variables where x = 20 and convert it to a dataframe:

snapshot_in.sel(x=20.0).to_dataframe()
                  x   pp        ux        uy        w3
y        t
0.00000  0.0   20.0  0.0  1.000004  0.000011 -0.000180
0.09375  0.0   20.0  0.0  1.000012 -0.000005 -0.000235
0.18750  0.0   20.0  0.0  1.000000  0.000024 -0.000465
0.28125  0.0   20.0  0.0  1.000002 -0.000019 -0.000969
0.37500  0.0   20.0  0.0  0.999997 -0.000018  0.002280
...      ...    ...  ...       ...       ...       ...
11.53125 0.0   20.0  0.0  1.000013  0.000023  0.000101
11.62500 0.0   20.0  0.0  1.000009 -0.000010 -0.002189
11.71875 0.0   20.0  0.0  1.000012 -0.000005  0.000339
11.81250 0.0   20.0  0.0  0.999985 -0.000023 -0.001097
11.90625 0.0   20.0  0.0  0.999992  0.000020  0.000049

128 rows × 5 columns
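From there, the export itself is a one-liner; for instance, writing this profile to a CSV file (the filename is arbitrary):

snapshot_in.sel(x=20.0).to_dataframe().to_csv("profile-x20.csv")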

Now you can refer to the pandas documentation for more details.