Reading and writing files#

This tutorial gives an overview of the different ways to load the binary arrays from the disc after running a numerical simulation with XCompact3d. It also presents some options for saving the results of our analysis, together with some tips and tricks.

Preparation#

Here we prepare the dataset for this notebook, so it can be reproduced on local machines or on the cloud; you are invited to test and interact with many of the concepts. It also provides nice support for courses and tutorials; let us know if you produce any of them.

The very first step is to import the toolbox and other packages:

import warnings

import numpy as np
import xarray as xr

import xcompact3d_toolbox as x3d

Then we can download an example from the online database, the flow around a cylinder in this case. We set cache=True and a local destination where it can be saved on our computer (cache_dir="./example/"), so there is no need to download it every time the kernel is restarted.

cylinder_ds, prm = x3d.tutorial.open_dataset("cylinder", cache=True, cache_dir="./example/")

Let’s take a look at the dataset:

cylinder_ds.info()
xarray.Dataset {
dimensions:
	i = 2 ;
	x = 257 ;
	y = 128 ;
	t = 201 ;

variables:
	float32 u(i, x, y, t) ;
	float32 pp(x, y, t) ;
	float32 epsi(x, y) ;
	float64 x(x) ;
	float64 y(y) ;
	float64 t(t) ;
	<U1 i(i) ;

// global attributes:
	:xcompact3d_version = v3.0-397-gff531df ;
	:xcompact3d_toolbox_version = 1.0.1 ;
	:url = https://github.com/fschuch/xcompact3d_toolbox_data ;
	:dataset_license = MIT License ;
}

We get an xarray.Dataset with the variables u (velocity vector), pp (pressure), and epsi (describing the geometry), their coordinates (x, y, t, and i), and some attributes, like the xcompact3d_version used to run this simulation, the url where the dataset can be found, and others.

In the next block, we configure the toolbox and some attributes on the dataset, so we can write all the binary fields to the disc. Do not worry about the details right now; this is just the preparation step, and we are going to discuss them later.

x3d.param["mytype"] = np.float32

prm.dataset.set(data_path="./data/", drop_coords="z")

cylinder_ds.u.attrs["file_name"] = "u"
cylinder_ds.pp.attrs["file_name"] = "pp"
cylinder_ds.epsi.attrs["file_name"] = "epsilon"

prm.write("input.i3d")

prm.dataset.write(cylinder_ds)

prm.dataset.write_xdmf("xy-planes.xdmf")

del cylinder_ds, prm

After that, the files are organized as follows:

tutorial
│   computing_and_plotting.ipynb
│   io.ipynb
│   input.i3d
│   parameters.ipynb
│   xy-planes.xdmf
│
└─── data
│       │   epsilon.bin
│       │   pp-000.bin
│       │   pp-001.bin
│       │   ... 
│       │   pp-199.bin
│       │   pp-200.bin
│       │   ux-000.bin
│       │   ux-001.bin
│       │   ... 
│       │   ux-199.bin
│       │   ux-200.bin
│       │   uy-000.bin
│       │   uy-001.bin
│       │   ... 
│       │   uy-199.bin
│       │   uy-200.bin
│       │   uz-000.bin
│       │   uz-001.bin
│       │   ... 
│       │   uz-199.bin
│       │   uz-200.bin
│
└─── example
│       │   cylinder.nc

It is very similar to what we get after successfully running a simulation, so now we can move on to the tutorial.

Why xarray?#

The data structures are provided by xarray, which introduces labels in the form of dimensions, coordinates, and attributes on top of raw NumPy-like arrays, allowing for a more intuitive, more concise, and less error-prone developer experience. It also integrates tightly with dask for parallel computing.

The goal here is to speed up the development of customized post-processing applications with the concise interface provided by xarray. Ultimately, we can compute solutions with fewer lines of code and better readability, so we spend less time testing and debugging and more time exploring our datasets and getting insights.

Additionally, xcompact3d-toolbox includes extra functionalities for DataArray and Dataset.

Before going forward, please take a look at Overview: Why xarray? and Quick overview to understand the motivation for using xarray’s data structures instead of just numpy-like arrays.
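To make the motivation concrete, here is a toy example (not part of the tutorial dataset) of how labeled indexing reads compared to positional indexing:

toy = xr.DataArray(
    np.random.rand(4, 3),
    dims=("x", "t"),
    coords={"x": [0.0, 0.5, 1.0, 1.5], "t": [0.0, 0.75, 1.5]},
)

toy.sel(t=0.75)  # select by coordinate value (label-based)
toy.isel(t=1)    # select by position, same result in this case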

Xarray objects on demand#

To start our post-processing, let’s load the parameters file:

prm = x3d.Parameters(loadfile="input.i3d")

Notice there is an entire tutorial dedicated to it.

To save space on the disc, our dataset was converted from double precision to single precision, so we have to configure the toolbox accordingly:

x3d.param["mytype"] = np.float32

The methods in the toolbox support different filename properties, like the classic ux000 or the newer ux-0000.bin, as well as combinations of them. For our case, we set the parameters as:

prm.dataset.filename_properties.set(
    separator="-",
    file_extension=".bin",
    number_of_digits=3,
)

Now we specify the parameters for our dataset: where it is located (data_path); whether any coordinate should be dropped (drop_coords; again to save space, since we are working with a span-wise averaged dataset, we drop z to work with xy planes); the parameter that controls the number of timesteps (snapshot_counting); and their step (snapshot_step). Consult the dataset documentation to see the different ways to customize your experience, and choose the ones that best suit your post-processing application. In this example, they are defined as:

prm.dataset.set(
    data_path="./data/",
    drop_coords="z",
    snapshot_counting="ilast",
    snapshot_step="ioutput",
)

Now we are good to go.

We can check the length of the dataset we are dealing with:

len(prm.dataset)
201

This means that our binary files range from 0 (e.g., ux-000.bin) to 200 (e.g., ux-200.bin), exactly as expected.
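The count itself comes from the two parameters we configured above. A minimal sketch, assuming prm exposes ilast and ioutput as attributes holding the values from input.i3d:

# One snapshot at step 0, plus one every ioutput steps:
assert len(prm.dataset) == prm.ilast // prm.ioutput + 1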

It is possible to load any given array:

epsilon = prm.dataset.load_array("./data/epsilon.bin", add_time=False)

Notice that load_array requires the entire path to the file, and we use add_time=False because this array does not evolve in time like the others, i.e., it is not numbered across several snapshots.

We can see it on the screen:

epsilon
<xarray.DataArray (x: 257, y: 128)> Size: 132kB
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
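As a quick visual check, the geometry can be plotted directly from the DataArray (this uses xarray’s matplotlib-based plotting, so matplotlib must be installed):

epsilon.plot(x="x", y="y")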

Let’s do it again, this time for ux and using add_time=True:

ux = prm.dataset.load_array("./data/ux-100.bin", add_time=True)

See that t is now a coordinate, and for this snapshot it was automatically computed as the dimensionless time 75.0:

ux
<xarray.DataArray 'ux' (x: 257, y: 128, t: 1)> Size: 132kB
array([[[1.        ],
        [1.        ],
        [1.        ],
        ...,
        [1.        ],
        [1.        ],
        [1.        ]],

       [[1.0000466 ],
        [0.99996716],
        [1.0000466 ],
        ...,
        [0.9999681 ],
        [1.0000459 ],
        [0.9999675 ]],

       [[1.0000602 ],
        [1.0000144 ],
        [1.0000602 ],
        ...,
...
        ...,
        [1.0140737 ],
        [1.0142432 ],
        [1.0144366 ]],

       [[1.0146521 ],
        [1.0148891 ],
        [1.0151588 ],
        ...,
        [1.0141058 ],
        [1.0142633 ],
        [1.0144445 ]],

       [[1.0146475 ],
        [1.0148702 ],
        [1.0151254 ],
        ...,
        [1.014144  ],
        [1.0142874 ],
        [1.014454  ]]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 75.0
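As a sketch of where 75.0 comes from (assuming prm exposes dt and ioutput as attributes holding the values from input.i3d), snapshot 100 corresponds to timestep 100 * ioutput:

t_100 = 100 * prm.ioutput * prm.dt  # expected: 75.0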

That is not all. If you have enough memory, you can load the entire time series for a given variable with load_time_series, or simply by:

ux = prm.dataset["ux"]

Let’s see it (note that 201 files are loaded and wrapped with the appropriate coordinates):

ux
<xarray.DataArray 'ux' (x: 257, y: 128, t: 201)> Size: 26MB
array([[[0.9999885 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.9999782 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.9999907 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        ...,
        [1.0000122 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.99999744, 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ],
        [0.9999945 , 1.        , 1.        , ..., 1.        ,
         1.        , 1.        ]],

       [[0.99999535, 1.0000458 , 1.0000486 , ..., 1.0000464 ,
         1.0000468 , 1.0000476 ],
        [1.0000107 , 0.99997103, 0.99996907, ..., 0.9999675 ,
         0.99996674, 0.9999665 ],
        [1.0000069 , 1.0000455 , 1.0000486 , ..., 1.0000459 ,
         1.0000468 , 1.0000478 ],
...
        [0.9999908 , 1.0001078 , 1.000192  , ..., 1.0155317 ,
         1.0153327 , 1.0146557 ],
        [0.9999987 , 1.0001054 , 1.0001911 , ..., 1.0152053 ,
         1.0150691 , 1.0146141 ],
        [0.9999874 , 1.0000997 , 1.0001901 , ..., 1.0149139 ,
         1.0148364 , 1.0145978 ]],

       [[1.0000043 , 1.0000877 , 1.0001792 , ..., 1.0146514 ,
         1.0146394 , 1.0146087 ],
        [1.0000119 , 1.0000889 , 1.0001761 , ..., 1.014433  ,
         1.0144511 , 1.0146273 ],
        [1.0000004 , 1.0000932 , 1.000174  , ..., 1.0142416 ,
         1.0142918 , 1.0146769 ],
        ...,
        [1.0000123 , 1.0000978 , 1.0001832 , ..., 1.0154953 ,
         1.0153852 , 1.0147215 ],
        [0.99998474, 1.000096  , 1.0001816 , ..., 1.0151795 ,
         1.0151051 , 1.0146575 ],
        [0.99999154, 1.0000918 , 1.0001806 , ..., 1.0148991 ,
         1.0148568 , 1.0146192 ]]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 804B 0.0 0.75 1.5 2.25 3.0 ... 147.8 148.5 149.2 150.0

You can store each array in a different variable, like:

ux = prm.dataset["ux"]
uy = prm.dataset["uy"]
pp = prm.dataset["pp"]

Or organize many arrays in a dataset:

# create an empty dataset
ds = xr.Dataset()

# populate it
for var in ["ux", "uy", "pp"]:
    ds[var] = prm.dataset[var]

# show on the screen
ds
<xarray.Dataset> Size: 79MB
Dimensions:  (x: 257, y: 128, t: 201)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 804B 0.0 0.75 1.5 2.25 3.0 ... 147.8 148.5 149.2 150.0
Data variables:
    ux       (x, y, t) float32 26MB 1.0 1.0 1.0 1.0 ... 1.015 1.015 1.015 1.015
    uy       (x, y, t) float32 26MB 9.98e-06 8.496e-08 ... -0.0005357 0.003209
    pp       (x, y, t) float32 26MB 0.0 0.03264 0.03613 ... 0.03922 0.03859

We can also write a one-liner solution for the previous code:

ds = xr.Dataset({var: prm.dataset[var] for var in "ux uy pp".split()})

It is possible to load all the variables from a given snapshot with load_snapshot, or simply:

snapshot = prm.dataset[100]

We get an xarray.Dataset with all the variables and their coordinates. You can access each of them with the dot notation (e.g., snapshot.pp, snapshot.ux, snapshot.uy) or the dict-like notation (e.g., snapshot["pp"], snapshot["ux"], snapshot["uy"]). See the dataset:

snapshot
<xarray.Dataset> Size: 396kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 75.0
Data variables:
    pp       (x, y, t) float32 132kB 0.05232 0.05219 0.05243 ... 0.03986 0.03989
    ux       (x, y, t) float32 132kB 1.0 1.0 1.0 1.0 ... 1.014 1.014 1.014 1.014
    uy       (x, y, t) float32 132kB 3.407e-07 1.503e-07 ... 0.007724 0.007703

Do you need the snapshots in a given range? No problem. Let’s use a slice to load the last 100 and, just to exemplify, compute a time average:

time_averaged = prm.dataset[-100:].mean("t")
time_averaged
<xarray.Dataset> Size: 396kB
Dimensions:  (x: 257, y: 128)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
Data variables:
    pp       (x, y) float32 132kB 0.05351 0.05335 0.05356 ... 0.03886 0.03886
    ux       (x, y) float32 132kB 1.0 1.0 1.0 1.0 ... 1.015 1.015 1.015 1.015
    uy       (x, y) float32 132kB -6.206e-09 2.081e-09 ... -6.504e-05 -6.531e-05
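With labeled arrays, extending this is straightforward. For instance, a minimal sketch of computing the fluctuations around that time average (xarray broadcasts the subtraction over the t coordinate; note that this loads the slice again):

fluctuations = prm.dataset[-100:] - time_averaged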

You can even use the slice notation to load all the snapshots at once:

prm.dataset[:]
<xarray.Dataset> Size: 79MB
Dimensions:  (x: 257, y: 128, t: 201)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 804B 0.0 0.75 1.5 2.25 3.0 ... 147.8 148.5 149.2 150.0
Data variables:
    pp       (x, y, t) float32 26MB 0.0 0.03264 0.03613 ... 0.03922 0.03859
    ux       (x, y, t) float32 26MB 1.0 1.0 1.0 1.0 ... 1.015 1.015 1.015 1.015
    uy       (x, y, t) float32 26MB 9.98e-06 8.496e-08 ... -0.0005357 0.003209

Of course, unlike this tutorial’s, some simulations may not fit in memory. For these cases, we can iterate over all the snapshots, loading them one by one:

for ds in prm.dataset:
    # Computing the vorticity, just to exemplify
    vort = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")

Note that reversed(prm.dataset) also works.

Or, for finer control, we can iterate over a selected range of snapshots, loading them one by one. The arguments are the same as those of Python’s built-in range:

for ds in prm.dataset(100, 200, 1):
    # Computing the vorticity, just to exemplify
    vort = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")
# Result from the last iteration
vort
<xarray.DataArray (y: 128, t: 1, x: 257)> Size: 132kB
array([[[-0.00268999,  0.00181677,  0.0001196 , ...,  0.00324146,
          0.00302246, -0.01646465]],

       [[-0.0034743 , -0.00406752, -0.00344934, ...,  0.0029434 ,
          0.00254392, -0.01612649]],

       [[-0.00303078, -0.00340789, -0.0030852 , ...,  0.0026302 ,
          0.0024012 , -0.01718407]],

       ...,

       [[-0.00178402, -0.00190772, -0.00161242, ...,  0.00415556,
          0.00375753, -0.01497194]],

       [[-0.00232817, -0.00317189, -0.00255065, ...,  0.00385816,
          0.00362989, -0.01590095]],

       [[-0.00263563,  0.00207518,  0.00038911, ...,  0.00355052,
          0.00316822, -0.01547501]]], dtype=float32)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 149.2

Writing the results to binary files#

In the last example we computed the vorticity but did nothing with it. This time, let’s write it to the disc using write:

for ds in prm.dataset:
    vort = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")
    prm.dataset.write(data=vort, file_prefix="w3")

The example above works for an xarray.DataArray. We can do the same for an xarray.Dataset, but with one key difference: only the arrays with an attribute called file_name will be written. This is done to avoid accidentally overwriting the base fields (ux, uy, uz, …).

Let’s rewrite the previous example to store vort in the dataset ds. We set the attribute file_name to w3, so the arrays will be written as w3-000.bin, w3-001.bin, w3-002.bin, etc.

We also suppress warnings, because the application will tell us it cannot save pp, ux, and uy, since they do not have a file_name. In fact, we do not want to rewrite them anyway.

See the code:

with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=UserWarning)
    for ds in prm.dataset:
        ds["vort"] = ds.uy.x3d.first_derivative("x") - ds.ux.x3d.first_derivative("y")
        ds["vort"].attrs["file_name"] = "w3"
        prm.dataset.write(ds)

The method prm.dataset.write() writes the files as raw binaries, in the same way XCompact3d would. This means you can read them back into the flow solver, and also process them with any other tool you are already familiar with, including the toolbox itself.
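As a quick check, one of the files we just wrote can be loaded back with load_array, in the same way as before:

w3_check = prm.dataset.load_array("./data/w3-100.bin", add_time=True)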

For instance, we get w3 if we load snapshot 0 again:

prm.dataset[0]
<xarray.Dataset> Size: 528kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 0.0
Data variables:
    pp       (x, y, t) float32 132kB 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    ux       (x, y, t) float32 132kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
    uy       (x, y, t) float32 132kB 9.98e-06 1.955e-05 ... -2.285e-05 1.962e-05
    w3       (x, y, t) float32 132kB 0.00038 -0.000863 ... -0.001097 4.864e-05

Update the xdmf file#

After computing and writing new results to the disc, you can open them in external tools, like ParaView or VisIt. To do so, update the xdmf file so it includes the recently computed w3. See the code:

prm.dataset.write_xdmf("xy-planes.xdmf")

Other formats#

Xarray objects can be exported to many other formats, depending on your needs.

For instance, xarray.DataArray and xarray.Dataset can be written as netCDF. In this way, they keep all dimensions, coordinates, and attributes. The format is easy to handle and share, since the files are self-contained. It is the format of the dataset we downloaded for this tutorial, and it is a good option for sharing the results of your research.

Just to give you an estimate of the disk usage: the dataset cylinder.nc that we downloaded for this tutorial is 75.8 MB, while the folder ./data/, with the binary files produced in the same way XCompact3d would write them, takes 75.7 MB.

To exemplify the use of netCDF, let’s take one snapshot:

snapshot = prm.dataset[0]
snapshot
<xarray.Dataset> Size: 528kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 0.0
Data variables:
    pp       (x, y, t) float32 132kB 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    ux       (x, y, t) float32 132kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
    uy       (x, y, t) float32 132kB 9.98e-06 1.955e-05 ... -2.285e-05 1.962e-05
    w3       (x, y, t) float32 132kB 0.00038 -0.000863 ... -0.001097 4.864e-05

Now, let’s include additional information for those who are going to use our data. You can set attributes for each array and coordinate, as well as global attributes for the dataset. They are stored in a dictionary.

See the example:

# Setting attributes for each coordinate
snapshot.x.attrs = {"name": "x", "long_name": "Stream-wise coordinate", "units": "-"}
snapshot.y.attrs = {"name": "y", "long_name": "Vertical coordinate", "units": "-"}
snapshot.t.attrs = {"name": "t", "long_name": "Time", "units": "-"}

# Setting attributes for each array
snapshot.ux.attrs = {"name": "ux", "long_name": "Stream-wise velocity", "units": "-"}
snapshot.uy.attrs = {"name": "uy", "long_name": "Vertical velocity", "units": "-"}
snapshot.pp.attrs = {"name": "p", "long_name": "Pressure", "units": "-"}
snapshot.w3.attrs = {"name": "w3", "long_name": "Vorticity", "units": "-"}

# Setting attributes for the dataset
snapshot.attrs = {
    "title": "An example from the tutorials",
    "url": "https://docs.fschuch.com/xcompact3d_toolbox/tutorial/io.html",
    "authors": "List of names",
    "doi": "maybe a fancy doi from zenodo",
}

Exporting it as a netCDF file:

snapshot.to_netcdf("snapshot-000.nc")
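If disk usage is a concern, netCDF also supports transparent compression. A minimal sketch, assuming the netcdf4 engine is installed (the output filename is illustrative):

encoding = {var: {"zlib": True} for var in snapshot.data_vars}
snapshot.to_netcdf("snapshot-000-compressed.nc", encoding=encoding)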

Importing the netCDF file:

snapshot_in = xr.open_dataset("snapshot-000.nc")

See the result; it keeps all dimensions, coordinates, and attributes:

snapshot_in
<xarray.Dataset> Size: 528kB
Dimensions:  (x: 257, y: 128, t: 1)
Coordinates:
  * x        (x) float32 1kB 0.0 0.07812 0.1562 0.2344 ... 19.84 19.92 20.0
  * y        (y) float32 512B 0.0 0.09375 0.1875 0.2812 ... 11.72 11.81 11.91
  * t        (t) float32 4B 0.0
Data variables:
    pp       (x, y, t) float32 132kB ...
    ux       (x, y, t) float32 132kB ...
    uy       (x, y, t) float32 132kB ...
    w3       (x, y, t) float32 132kB ...
Attributes:
    title:    An example from the tutorials
    url:      https://docs.fschuch.com/xcompact3d_toolbox/tutorial/io.html
    authors:  List of names
    doi:      maybe a fancy doi from zenodo

We can compare them and see that their data, dimensions and coordinates are exactly the same:

xr.testing.assert_equal(snapshot, snapshot_in)
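For a stricter comparison, xr.testing.assert_identical also checks names and attributes; since netCDF preserves them, it should pass here as well:

xr.testing.assert_identical(snapshot, snapshot_in)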

Xarray is built on top of NumPy, so you can access the underlying numpy.ndarray with the values property (e.g., epsilon.values). It is compatible with numpy.save and many other functions from the NumPy/SciPy ecosystem (often, you do not even need to use .values explicitly). See the example:

np.save("epsi.npy", epsilon)
epsi_in = np.load("epsi.npy")

print(type(epsi_in))
epsi_in
<class 'numpy.ndarray'>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

You can use this for backwards compatibility with your previous post-processing tools. It is just not as effective, because we lose the metadata, like the coordinates and attributes.
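If needed, the metadata can be re-attached by hand. A minimal sketch, reusing the coordinates from the epsilon DataArray that is still in memory:

epsi_da = xr.DataArray(
    epsi_in,
    dims=("x", "y"),
    coords={"x": epsilon.x, "y": epsilon.y},
)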

If you manage to reduce the dataset’s dimensions with some integration, mean, or by selecting subsets of the data, you can convert it to a pandas.DataFrame and then export it to CSV, Excel, and many other formats.

For instance, let’s select a vertical profile for all variables where x = 20 and convert it to a dataframe:

snapshot_in.sel(x=20.0).to_dataframe()
                  x   pp        ux        uy        w3
y        t
0.00000  0.0   20.0  0.0  1.000004  0.000011 -0.000180
0.09375  0.0   20.0  0.0  1.000012 -0.000005 -0.000235
0.18750  0.0   20.0  0.0  1.000000  0.000024 -0.000465
0.28125  0.0   20.0  0.0  1.000002 -0.000019 -0.000969
0.37500  0.0   20.0  0.0  0.999997 -0.000018  0.002280
...      ...    ...  ...       ...       ...       ...
11.53125 0.0   20.0  0.0  1.000013  0.000023  0.000101
11.62500 0.0   20.0  0.0  1.000009 -0.000010 -0.002189
11.71875 0.0   20.0  0.0  1.000012 -0.000005  0.000339
11.81250 0.0   20.0  0.0  0.999985 -0.000023 -0.001097
11.90625 0.0   20.0  0.0  0.999992  0.000020  0.000049

128 rows × 5 columns
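From there, the export itself is a one-liner; for instance, writing this profile to a CSV file (the filename is arbitrary):

snapshot_in.sel(x=20.0).to_dataframe().to_csv("profile-x20.csv")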

Now you can refer to the pandas documentation for more details.