GSD Documentation: Release 2.1.1
1 Installation
2 Change Log
3 User community
4 HOOMD
5 File layer
7 C API
8 Specification
9 Contribute
10 Code style
11 Credits
12 License
13 Index
Index
The GSD file format is the native file format for HOOMD-blue. GSD files store trajectories of the HOOMD-blue
system state in a binary file with efficient random access to frames. GSD allows all particle and topology properties
to vary from one frame to the next. Use the GSD Python API to specify the initial condition for a HOOMD-blue
simulation or analyze trajectory output with a script. Read a GSD trajectory with a visualization tool to explore the
behavior of the simulation.
• GitHub Repository: GSD source code and issue tracker.
• HOOMD-blue: Simulation engine that reads and writes GSD files.
• hoomd-users Google Group: Ask questions to the HOOMD-blue community.
• freud: A powerful set of tools for analyzing trajectories.
• OVITO: The Open Visualization Tool works with GSD files.
• gsd-vmd plugin: VMD plugin to support GSD files.
CHAPTER
ONE
INSTALLATION
gsd binaries are available in the glotzerlab-software Docker/Singularity images and in packages on conda-forge and
PyPI. You can also compile gsd from source, embed gsd.c in your code, or read gsd files with a pure Python reader
pygsd.py.
1.1 Binaries
gsd is available on conda-forge. To install, first download and install miniconda. Then add the conda-forge channel
and install gsd:
See the glotzerlab-software documentation for container usage information and cluster specific instructions.
1.1.3 PyPI
$ curl -O https://glotzerlab.engin.umich.edu/downloads/gsd/gsd-v2.1.1.tar.gz
When using a shared Python installation, create a virtual environment where you can install gsd:
Activate the environment before configuring and before executing gsd scripts:
$ source /path/to/environment/bin/activate
Note: Other types of virtual environments (such as conda) may work, but are not thoroughly tested.
gsd requires:
• C compiler (tested with gcc 4.8-9.0, clang 4-9, vs2017-2019)
• Python >= 3.5
• numpy >= 1.9.3
• Cython >= 0.22
Additional packages may be needed:
• pytest >= 3.9.0 (unit tests)
• Sphinx (documentation)
• IPython (documentation)
• an internet connection (documentation)
• CMake (for development builds)
Install these tools with your system or virtual environment package manager. gsd developers have had success with
pacman (arch linux), apt-get (ubuntu), Homebrew (macOS), and MacPorts (macOS):
Typical HPC cluster environments provide Python, numpy, and cmake via a module system:
Note: Packages may be named differently; check your system’s package list. Install any -dev packages as needed.
Tip: You can install numpy and other python packages into your virtual environment:
Use pip to install the python module into your virtual environment:
You can assemble a functional python module in the build directory. Configure with CMake and compile with make.
$ mkdir build
$ cd build
$ cmake ../
$ make
Add the build directory path to your PYTHONPATH to test gsd or build documentation:
$ export PYTHONPATH=$PYTHONPATH:/path/to/build
Run pytest in the source directory to execute all unit tests. This requires that the compiled python module is on the
python path.
$ cd /path/to/gsd
$ pytest
Build the user documentation with Sphinx. IPython is required to build the documentation, as is an active internet
connection. First, you need to compile and install gsd. If you compiled with CMake, add gsd to your PYTHONPATH
first:
$ export PYTHONPATH=$PYTHONPATH:/path/to/build
$ cd /path/to/gsd
$ cd doc
$ make html
$ open _build/html/index.html
gsd is implemented in a single C file. Copy gsd/gsd.h and gsd/gsd.c into your project.
If you only need to read files, you can skip installing and just extract the modules gsd/pygsd.py and
gsd/hoomd.py. Together, these implement a pure Python reader for gsd and HOOMD files; no C compiler is required.
CHAPTER
TWO
CHANGE LOG
2.1 v2.x
Fixed
• Add missing close method to HOOMDTrajectory.
• Documentation improvements.
Fixed
• List defaults in gsd.fl.open documentation.
Added
• Shape specification for sphere unions.
Note
• This release introduces a new file storage format.
• GSD >= 2.0 can read and write to files created by GSD 1.x.
• Files created or upgraded by GSD >= 2.0 cannot be opened by GSD 1.x.
Added
• The upgrade method converts a GSD 1.0 file to a GSD 2.0 file in place.
• Support arbitrarily long chunk names (only in GSD 2.0 files).
Changed
• gsd.fl.open accepts None for application, schema, and schema_version when opening files for
reading.
• Improve read latency when accessing files with thousands of chunk names in a frame (only for GSD 2.0 files).
• Buffer small writes to improve write performance.
• Improve performance and reduce memory usage in read/write modes ('rb+', 'wb+', and 'xb+').
• C API: functions return error codes from the gsd_error enum. v2.x integer error codes differ from v1.x; use
the enum to check. For example: if (retval == GSD_ERROR_IO).
• Python, Cython, and C code must follow strict style guidelines.
Removed
• gsd.fl.create - use gsd.fl.open.
• gsd.hoomd.create - use gsd.hoomd.open.
• GSDFile v1.0 compatibility mode - use gsd.fl.open.
• hoomdxml2gsd.py.
Fixed
• Allow more than 127 data chunk names in a single GSD file.
2.2 v1.x
• Correctly raise IndexError when attempting to read frames before the first frame.
• Raise RuntimeError when importing gsd in unsupported Python versions.
• Slicing a HOOMDTrajectory object returns a view that can be used to directly select frames from a subset or
sliced again.
• Raise IndexError when attempting to read frames before the first frame.
• Dropped support for Python 2.
• Documentation updates
• The length of sliced HOOMDTrajectory objects can be determined with the built-in len() function.
• Add pyproject.toml file that defines numpy as a proper build dependency (requires pip >= 10)
• Reorganize documentation
• Documentation fixes.
• Support reading and writing chunks with 0 length. No schema changes are necessary to support this.
• Add gsd.hoomd.open() method which can create and open hoomd gsd files.
• Add gsd.fl.open() method which can create and open gsd files.
• The previous create/class GSDFile instantiation is still supported for backward compatibility.
Initial release.
CHAPTER
THREE
USER COMMUNITY
GSD primarily exists as a file format for HOOMD-blue, so please use the hoomd-users mailing list. Subscribe for
release announcements, to post questions for advice on using the software, and to discuss potential new features.
CHAPTER
FOUR
HOOMD
In [1]: s = gsd.hoomd.Snapshot()
In [2]: s.particles.N = 4
gsd.hoomd represents the state of a single frame with an instance of the class gsd.hoomd.Snapshot. Instantiate
this class to create a system configuration. All fields default to None. A field is written to the file only if it is
not None and does not match the data in the first frame or the defaults specified in the schema.
In [12]: len(f)
Out[12]: 11
Use gsd.hoomd.open to open a GSD file with the high level interface gsd.hoomd.HOOMDTrajectory. It
behaves like a list, with append and extend methods.
Note: gsd.hoomd.HOOMDTrajectory currently does not support files opened in append mode.
Tip: When using extend, pass in a generator or generator expression to avoid storing the entire trajectory in memory
before writing it out.
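The tip above can be sketched as follows. This is a minimal illustration, not part of the gsd API: the helper name generate_snapshots and its make_snapshot parameter are hypothetical, and the commented usage assumes a trajectory opened for writing with gsd.hoomd.open.

```python
def generate_snapshots(make_snapshot, n_frames, n_particles):
    """Yield one snapshot per frame, building each one only when requested.

    make_snapshot is a zero-argument factory (hypothetical parameter);
    in real use, pass gsd.hoomd.Snapshot.
    """
    for step in range(n_frames):
        snap = make_snapshot()
        snap.configuration.step = step
        snap.particles.N = n_particles
        yield snap


# Intended use (requires gsd; shown as a comment, not executed here):
#   traj = gsd.hoomd.open(name='example.gsd', mode='wb')
#   traj.extend(generate_snapshots(gsd.hoomd.Snapshot, 10, 4))
```

Because the generator yields snapshots one at a time, only the frame currently being written needs to exist in memory.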
In [15]: snap.configuration.step
Out[15]: 5
In [16]: snap.particles.N
Out[16]: 9
In [17]: snap.particles.position
Out[17]:
array([[0.6291546 , 0.8447468 , 0.31500977],
[0.310166 , 0.66830224, 0.76962054],
[0.87307405, 0.04317319, 0.34521452],
[0.2597723 , 0.914238 , 0.8930648 ],
[0.9215674 , 0.46188784, 0.03385989],
[0.8863687 , 0.20264274, 0.7333709 ],
[0.9161243 , 0.32221183, 0.42147696],
[0.0146919 , 0.15931436, 0.13138972],
[0.2875532 , 0.24792461, 0.8465807 ]], dtype=float32)
gsd.hoomd.HOOMDTrajectory supports random indexing of frames in the file. Indexing into a trajectory returns
a gsd.hoomd.Snapshot.
Slicing a trajectory creates a trajectory view, which can then be queried for length or sliced again. Selecting individual
frames from a view works exactly like selecting individual frames from the original trajectory object.
In [23]: t = gsd.hoomd.HOOMDTrajectory(f);
In [24]: t[3].particles.position
Out[24]:
array([[0.39207488, 0.54220635, 0.41176417],
[0.13017783, 0.45084673, 0.21670276],
[0.16392069, 0.8004633 , 0.8524705 ],
[0.64471924, 0.02428709, 0.692458 ],
[0.21873572, 0.81648225, 0.2876905 ],
[0.67504257, 0.38301533, 0.78710103],
[0.975542 , 0.9488523 , 0.25835207]], dtype=float32)
To read GSD files without compiling C code, use gsd.pygsd.GSDFile in combination with
gsd.hoomd.HOOMDTrajectory. It only supports the rb mode and does not read files as fast as the C
implementation. It takes a Python file-like object, so it can be used with in-memory IO classes and with grid file
classes that access data over the internet.
Logged data is stored in the log dictionary as numpy arrays. Place data into this dictionary directly without the ‘log/’
prefix and gsd will include it in the output. Store per-particle quantities with the prefix particles/. Choose another
prefix for other quantities.
In [26]: f = gsd.hoomd.open(name='example.gsd', mode='rb')
In [27]: s = f[0]
In [28]: s.log['particles/net_force']
Out[28]:
array([[-1., 2., -3.],
[ 0., 2., -4.],
[-3., 2., 1.],
[ 1., 2., 3.]], dtype=float32)
In [29]: s.log['value/potential_energy']
Out[29]: array([1.5])
State data is stored in the state dictionary as numpy arrays. Place data into this dictionary directly without the
‘state/’ prefix and gsd will include it in the output. Shape vertices are stored in a packed format. In this example, type
‘A’ has 3 vertices (the first 3 in the list) and type ‘B’ has 4 (the next 4).
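The packed layout described above can be unpacked with numpy. The sketch below is illustrative only: the helper unpack_vertices and the variable names counts and packed are hypothetical, and the vertex values are made up. It assumes the per-type vertex counts and the packed vertex array have already been read from the state dictionary.

```python
import numpy


def unpack_vertices(counts, vertices):
    """Split a packed (sum(counts), D) vertex array into one array per type."""
    counts = numpy.asarray(counts, dtype=numpy.uint32)
    vertices = numpy.asarray(vertices, dtype=numpy.float32)
    if int(counts.sum()) != len(vertices):
        raise ValueError('vertex counts do not match the packed array length')
    # Split points are the running totals, excluding the final one.
    offsets = numpy.cumsum(counts)[:-1].tolist()
    return numpy.split(vertices, offsets)


# Type 'A' owns the first 3 vertices, type 'B' the next 4 (illustrative values).
counts = [3, 4]
packed = [[-1, -1], [1, -1], [1, 1],
          [-2, -2], [2, -2], [2, 2], [-2, 2]]
per_type = unpack_vertices(counts, packed)
```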
In [36]: result
Out[36]: [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
gsd.hoomd.HOOMDTrajectory can be pickled when in read mode to allow for multiprocessing through Python's
native multiprocessing library. Here cnt_part finds the number of particles in each frame and appends it to a list.
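A minimal sketch of this pattern follows. The function name cnt_part comes from the text above, but its body here is a reconstruction, and count_all is a hypothetical helper. The trajectory t is assumed to have been opened in read mode.

```python
import multiprocessing


def cnt_part(args):
    """Return the number of particles in one frame of a trajectory."""
    t, frame = args
    return len(t[frame].particles.position)


def count_all(t, processes=2):
    """Count particles in every frame in parallel.

    The trajectory is pickled and sent to the workers along with each
    frame index, which is why it must be open in read mode.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(cnt_part, [(t, frame) for frame in range(len(t))])
```

On platforms that start worker processes with spawn, call count_all from inside an `if __name__ == '__main__':` guard.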
CHAPTER
FIVE
FILE LAYER
The file layer python module gsd.fl allows direct low level access to read and write GSD files of any schema. The
HOOMD reader (gsd.hoomd) provides higher level access to HOOMD schema files, see HOOMD.
View the page source to find unformatted example code.
In [1]: f = gsd.fl.open(name="file.gsd",
...: mode='wb',
...: application="My application",
...: schema="My Schema",
...: schema_version=[1,0])
...:
Note: When creating a new file, you must specify the application name, schema name, and schema version.
Warning: Opening a gsd file with a ‘w’ mode overwrites any existing file with the given name. The ‘x’ modes raise
a FileExistsError instead.
In [2]: f.close()
In [3]: f = gsd.fl.open(name="file.gsd",
...: mode='wb',
...: application="My application",
...: schema="My Schema",
...: schema_version=[1,0]);
...:
In [6]: f.end_frame()
In [9]: f.end_frame()
In [10]: f.close()
Add any number of named data chunks to each frame in the file with write_chunk. The data must be a 1 or 2
dimensional numpy array of a simple numeric type (or a data type that will automatically convert when passed to
numpy.array(data)). Call end_frame to end the frame and start the next one.
Note: While supported, implicit conversion to numpy arrays creates a copy of the data in memory and adds conversion
overhead.
Warning: Call end_frame to write the last frame before closing the file.
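The write pattern above can be wrapped in a small helper. This is a sketch under assumptions: write_frames is a hypothetical name, and f is any open, writable file object that provides write_chunk, end_frame, and nframes as described in this chapter.

```python
def write_frames(f, name, frames):
    """Write each item of frames as the chunk `name` in its own frame.

    Returns the number of frames in the file after writing.
    """
    for data in frames:
        f.write_chunk(name=name, data=data)
        # Commit the frame; an unfinished last frame is not written to disk.
        f.end_frame()
    return f.nframes


# Intended use (requires gsd and numpy; shown as a comment, not executed here):
#   f = gsd.fl.open(name='file.gsd', mode='wb', application='My application',
#                   schema='My Schema', schema_version=[1, 0])
#   write_frames(f, 'chunk1', [numpy.array([1, 2, 3, 4], dtype=numpy.float32)])
#   f.close()
```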
In [14]: f.close()
read_chunk reads the named chunk at the given frame index in the file and returns it as a numpy array.
In [19]: f.close()
chunk_exists tests to see if a chunk by the given name exists in the file at the given frame.
In [21]: f.find_matching_chunk_names('')
Out[21]: ['chunk1', 'chunk2']
In [22]: f.find_matching_chunk_names('chunk')
Out[22]: ['chunk1', 'chunk2']
In [23]: f.find_matching_chunk_names('chunk1')
Out[23]: ['chunk1']
In [24]: f.find_matching_chunk_names('other')
Out[24]: []
find_matching_chunk_names finds all chunk names present in a GSD file that start with the given string.
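The matching rule is a plain prefix test. The pure Python sketch below reproduces the results shown above; it illustrates the semantics only and is not how gsd implements the lookup. The function name matching_names is hypothetical.

```python
def matching_names(names, match):
    """Return the names that start with the string match, in their original order."""
    return [name for name in names if name.startswith(match)]


names = ['chunk1', 'chunk2']
everything = matching_names(names, '')         # the empty prefix matches every name
only_first = matching_names(names, 'chunk1')   # a full name matches itself
nothing = matching_names(names, 'other')       # no name starts with 'other'
```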
In [27]: data
Out[27]: array([1., 2., 3., 4.], dtype=float32)
In [29]: f.close()
In [31]: f.name
Out[31]: 'file.gsd'
In [32]: f.mode
Out[32]: 'rb'
In [33]: f.gsd_version
Out[33]: (2, 0)
In [34]: f.application
Out[34]: 'My application'
In [35]: f.schema
Out[35]: 'My Schema'
In [36]: f.schema_version
Out[36]: (1, 0)
In [37]: f.nframes
Out[37]: 2
In [38]: f.close()
In [39]: f = gsd.fl.open(name="file.gsd",
....: mode='wb+',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [42]: f.nframes
Out[42]: 1
In [46]: f.end_frame()
In [47]: f.nframes
Out[47]: 2
In [49]: f.close()
Open a file in append mode to write additional chunks to an existing file, but prevent reading.
In [51]: data
Out[51]: array([1., 2., 3., 4.])
Use gsd.fl.GSDFile as a context manager for guaranteed file closure and cleanup when exceptions occur.
In [52]: f = gsd.fl.open(name="file.gsd",
....: mode='wb+',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [53]: f.mode
Out[53]: 'wb+'
In [56]: b = b.view(dtype=numpy.int8)
In [57]: b
Out[57]:
array([ 84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 116, 114,
105, 110, 103, 0], dtype=int8)
In [59]: f.end_frame()
In [61]: r
Out[61]:
array([ 84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 116, 114,
105, 110, 103, 0], dtype=int8)
In [63]: r[0].decode('UTF-8')
Out[63]: 'This is a string'
In [64]: f.close()
To store a string in a gsd file, convert it to a numpy array of bytes and store that data in the file. Decode the byte
sequence to get back a string.
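The conversion in both directions can be sketched with numpy alone. The helper names encode_string and decode_string are hypothetical. The trailing null byte mirrors the array shown above, and the resulting array is what would be passed to write_chunk.

```python
import numpy


def encode_string(text):
    """Convert text to a null-terminated numpy array of int8 bytes."""
    raw = text.encode('UTF-8') + b'\x00'
    # frombuffer returns a read-only view; copy to get an independent array.
    return numpy.frombuffer(raw, dtype=numpy.int8).copy()


def decode_string(array):
    """Convert an int8 byte array back to text, stopping at the first null."""
    raw = numpy.asarray(array, dtype=numpy.int8).tobytes()
    return raw.split(b'\x00', 1)[0].decode('UTF-8')


encoded = encode_string('This is a string')
decoded = decode_string(encoded)
```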
5.13 Truncate
In [66]: f.nframes
Out[66]: 1
In [69]: f.nframes
Out[69]: 0
In [71]: f.close()
Truncating a gsd file removes all data chunks from it, but retains the same schema, schema version, and application
name. The file is not closed during this process. This is useful when writing restart files on a Lustre file system when
file open operations need to be kept to a minimum.
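A restart writer built on truncate might look like the sketch below. The helper write_restart is hypothetical. It assumes f is a gsd file object that stays open in a writable mode for the life of the job and that chunks maps chunk names to arrays.

```python
def write_restart(f, chunks):
    """Replace the contents of an already open GSD file with one new frame."""
    # Drop all existing frames without closing and reopening the file.
    f.truncate()
    for name, data in chunks.items():
        f.write_chunk(name=name, data=data)
    # Commit the single restart frame.
    f.end_frame()
    return f.nframes
```

Calling write_restart repeatedly on the same handle keeps exactly one frame in the file and performs no additional file open operations.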
CHAPTER
SIX
GSD provides a Python API intended for most users. Developers, or users not working with the Python language,
may want to use the C API.
6.1 Submodules
schema_version
Schema version number (major, minor).
Type typing.Tuple [int, int]
nframes
Number of frames.
Type int
chunk_exists(frame, name)
Test if a chunk exists.
Parameters
• frame (int) – Index of the frame to check
• name (str) – Name of the chunk
Returns True if the chunk exists in the file. False if it does not.
Return type bool
Example
In [7]: f.close()
close()
Close the file.
Once closed, any other operation on the file object will result in a ValueError. close() may be called
more than once. The file is automatically closed when garbage collected or when the context manager exits.
Example
In [2]: f.write_chunk(name='chunk1',
...: data=numpy.array([1,2,3,4], dtype=numpy.float32))
...:
In [3]: f.end_frame()
In [5]: f.close()
end_frame()
Complete writing the current frame. After calling end_frame() future calls to write_chunk() will
write to the next frame in the file.
Danger: Call end_frame() to complete the current frame before closing the file. If you fail to call
end_frame(), the last frame will not be written to disk.
Example
In [2]: f.write_chunk(name='chunk1',
...: data=numpy.array([1,2,3,4], dtype=numpy.float32))
...:
In [3]: f.end_frame()
In [4]: f.write_chunk(name='chunk1',
...: data=numpy.array([9,10,11,12],
...: dtype=numpy.float32))
...:
In [5]: f.end_frame()
In [6]: f.write_chunk(name='chunk1',
...: data=numpy.array([13,14],
...: dtype=numpy.float32))
...:
In [7]: f.end_frame()
In [8]: f.nframes
Out[8]: 3
In [9]: f.close()
find_matching_chunk_names(match)
Find all the chunk names in the file that start with the string match.
Parameters match (str) – Start of the chunk name to match
Returns Matching chunk names
Return type typing.List[str]
Example
In [3]: f.find_matching_chunk_names('')
Out[3]: ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']
In [4]: f.find_matching_chunk_names('data')
Out[4]: ['data/chunk1', 'data/chunk2']
In [5]: f.find_matching_chunk_names('input')
Out[5]: ['input/chunk3', 'input/chunk4']
In [6]: f.find_matching_chunk_names('other')
Out[6]: []
In [7]: f.close()
read_chunk(frame, name)
Read a data chunk from the file and return it as a numpy array.
Parameters
• frame (int) – Index of the frame to read
• name (str) – Name of the chunk
Returns Data read from file. type is determined by the chunk metadata. If the data is NxM in
the file and M > 1, return a 2D array. If the data is Nx1, return a 1D array.
Return type numpy.ndarray[type, ndim=?, mode='c']
Tip: Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead,
don’t call read_chunk() on the same chunk repeatedly. Cache the arrays instead.
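One way to follow this tip is a small cache keyed on frame and chunk name. The class ChunkCache below is a hypothetical sketch, not part of gsd. It assumes f provides read_chunk(frame, name) as documented above and that the file contents do not change while the cache is in use.

```python
class ChunkCache:
    """Remember arrays returned by read_chunk so each chunk is read from disk once."""

    def __init__(self, f):
        self._f = f
        self._cache = {}

    def read(self, frame, name):
        key = (frame, name)
        if key not in self._cache:
            # First request for this chunk: read from disk and store the array.
            self._cache[key] = self._f.read_chunk(frame=frame, name=name)
        return self._cache[key]
```

The cached arrays are shared between callers, so copy an array before modifying it in place.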
Example
In [6]: f.close()
truncate()
Truncate all data from the file. After truncation, the file has no frames and no data chunks. The application,
schema, and schema version remain the same.
Example
In [3]: f.nframes
Out[3]: 10
In [5]: f.truncate()
In [6]: f.nframes
Out[6]: 0
In [8]: f.close()
upgrade()
Upgrade a GSD file to the v2 specification in place. The file must be open in a writable mode.
write_chunk(name, data)
Write a data chunk to the file. After writing all chunks in the current frame, call end_frame().
Parameters
• name (str) – Name of the chunk
• data – Data to write into the chunk. Must be a numpy array, or array-like, with 2 or fewer
dimensions.
Warning: write_chunk() implicitly converts array-like and non-contiguous numpy arrays
to contiguous numpy arrays with numpy.ascontiguousarray(data). This may or may not
produce desired data types in the output file and incurs overhead.
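The effect of this implicit conversion can be checked with numpy alone, without touching a gsd file. The variable names below are illustrative.

```python
import numpy

a = numpy.arange(12, dtype=numpy.float32).reshape(4, 3)

# Taking every other column gives a strided view that is not C-contiguous.
view = a[:, ::2]

# The conversion copies the data into a new contiguous array (the overhead).
converted = numpy.ascontiguousarray(view)

# For a plain list, numpy picks the dtype; the integer width depends on the platform.
from_list = numpy.ascontiguousarray([1, 2, 3])

# Building the array with an explicit dtype avoids the guesswork.
explicit = numpy.array([1, 2, 3], dtype=numpy.float32)
```

Passing arrays like explicit, which are already contiguous and have the intended dtype, avoids both the copy and any surprise in the stored type.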
Example
In [2]: f.write_chunk(name='float1d',
...: data=numpy.array([1,2,3,4],
...: dtype=numpy.float32))
...:
In [3]: f.write_chunk(name='float2d',
...: data=numpy.array([[13,14],[15,16],[17,19]],
...: dtype=numpy.float32))
...:
In [4]: f.write_chunk(name='double2d',
...: data=numpy.array([[1,4],[5,6],[7,9]],
...: dtype=numpy.float64))
...:
In [5]: f.write_chunk(name='int1d',
...: data=numpy.array([70,80,90],
...: dtype=numpy.int64))
In [6]: f.end_frame()
In [7]: f.nframes
Out[7]: 1
In [8]: f.close()
mode    description
'rb'    Open an existing file for reading.
'rb+'   Open an existing file for reading and writing.
'wb'    Open a file for writing. Creates the file if needed, or overwrites an existing file.
'wb+'   Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.
'xb'    Create a gsd file exclusively and open it for writing. Raises a FileExistsError exception if it already exists.
'xb+'   Create a gsd file exclusively and open it for reading and writing. Raises a FileExistsError exception if it already exists.
'ab'    Open an existing file for writing. Does not create or overwrite existing files.
When opening a file for reading ('r' and 'a' modes): application and schema_version
are ignored and may be None. When schema is not None, open() throws an exception if the file’s schema
does not match schema.
When opening a file for writing ('w' or 'x' modes): The given application, schema, and
schema_version are saved in the file and must not be None.
Example
In [4]: data
Out[4]: array([1., 2., 3., 4.], dtype=float32)
In [5]: f.close()
Note: M varies depending on the type of bond. BondData represents all types of bonds.
Type M
Bond 2
Angle 3
Dihedral 4
Improper 4
Pair 2
N
Number of bonds, angles, dihedrals, impropers, or pairs in the snapshot (bonds/N, angles/N, dihedrals/N,
impropers/N, pairs/N).
Type int
types
Names of the bond types (bonds/types, angles/types, dihedrals/types, impropers/types,
pairs/types).
Type typing.List [str]
typeid
Bond type id (bonds/typeid, angles/typeid, dihedrals/typeid, impropers/typeid,
pairs/typeid).
Type (N, ) numpy.ndarray of numpy.uint32
group
Tags of the particles in the bond (bonds/group, angles/group, dihedrals/group,
impropers/group, pairs/group).
Type (N, M) numpy.ndarray of numpy.uint32
validate()
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have
the correct dimensions.
Ignore any attributes that are None.
Warning: Array attributes that are not contiguous numpy arrays will be replaced with contiguous
numpy arrays of the appropriate type.
class gsd.hoomd.ConfigurationData
Store configuration data.
Use the Snapshot.configuration attribute to access the configuration.
step
Time step of this frame (configuration/step).
Type int
dimensions
Number of dimensions (configuration/dimensions).
Type int
box
Box dimensions (configuration/box) [lx, ly, lz, xy, xz, yz].
Type (6, 1) numpy.ndarray of numpy.float32
validate()
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have
the correct dimensions.
Ignore any attributes that are None.
Warning: Array attributes that are not contiguous numpy arrays will be replaced with contiguous
numpy arrays of the appropriate type.
class gsd.hoomd.ConstraintData
Store constraint data chunks.
Use the Snapshot.constraints attribute to access the constraints.
Instances resulting from file read operations will always store array quantities in numpy.ndarray objects of
the defined types. User created snapshots may provide input data that can be converted to a numpy.ndarray.
N
Number of constraints in the snapshot (constraints/N).
Type int
value
Constraint length (constraints/value).
Type (N, ) numpy.ndarray of numpy.float32
group
Tags of the particles in the constraint (constraints/group).
Type (N, 2) numpy.ndarray of numpy.uint32
validate()
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have
the correct dimensions.
Ignore any attributes that are None.
Warning: Array attributes that are not contiguous numpy arrays will be replaced with contiguous
numpy arrays of the appropriate type.
class gsd.hoomd.HOOMDTrajectory(file)
Read and write hoomd gsd files.
Parameters file (gsd.fl.GSDFile) – File to access.
Open hoomd GSD files with open.
append(snapshot)
Append a snapshot to a hoomd gsd file.
N
Number of particles in the snapshot (particles/N ).
Type int
types
Names of the particle types (particles/types).
Type typing.List [str]
position
Particle position (particles/position).
Type (N, 3) numpy.ndarray of numpy.float32
orientation
Particle orientation (particles/orientation).
Type (N, 4) numpy.ndarray of numpy.float32
typeid
Particle type id (particles/typeid).
Type (N, ) numpy.ndarray of numpy.uint32
mass
Particle mass (particles/mass).
Type (N, ) numpy.ndarray of numpy.float32
charge
Particle charge (particles/charge).
Type (N, ) numpy.ndarray of numpy.float32
diameter
Particle diameter (particles/diameter).
Type (N, ) numpy.ndarray of numpy.float32
body
Particle body (particles/body).
Type (N, ) numpy.ndarray of numpy.int32
moment_inertia
Particle moment of inertia (particles/moment_inertia).
Type (N, 3) numpy.ndarray of numpy.float32
velocity
Particle velocity (particles/velocity).
Type (N, 3) numpy.ndarray of numpy.float32
angmom
Particle angular momentum (particles/angmom).
Type (N, 4) numpy.ndarray of numpy.float32
image
Particle image (particles/image).
Type (N, 3) numpy.ndarray of numpy.int32
type_shapes
Shape specifications for visualizing particle types (particles/type_shapes).
Type typing.List [typing.Dict]
validate()
Validate all attributes.
Convert every array attribute to a numpy.ndarray of the proper type and check that all attributes have
the correct dimensions.
Ignore any attributes that are None.
Warning: Array attributes that are not contiguous numpy arrays will be replaced with contiguous
numpy arrays of the appropriate type.
class gsd.hoomd.Snapshot
Snapshot of a system state.
configuration
Configuration data.
Type ConfigurationData
particles
Particles.
Type ParticleData
bonds
Bonds.
Type BondData
angles
Angles.
Type BondData
dihedrals
Dihedrals.
Type BondData
impropers
Impropers.
Type BondData
pairs
Special pair.
Type BondData
constraints
Distance constraints.
Type ConstraintData
state
State data.
Type typing.Dict
log
Logged data (values must be numpy.ndarray or array_like)
Type typing.Dict
validate()
Validate all contained snapshot data.
gsd.hoomd.open(name, mode='rb')
Open a hoomd schema GSD file.
The return value of open can be used as a context manager.
Parameters
• name (str) – File name to open.
• mode (str) – File open mode.
Returns An HOOMDTrajectory instance that accesses the file name with the given mode.
Valid values for mode:
mode    description
'rb'    Open an existing file for reading.
'rb+'   Open an existing file for reading and writing.
'wb'    Open a file for writing. Creates the file if needed, or overwrites an existing file.
'wb+'   Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.
'xb'    Create a gsd file exclusively and open it for writing. Raises a FileExistsError exception if it already exists.
'xb+'   Create a gsd file exclusively and open it for reading and writing. Raises a FileExistsError exception if it already exists.
'ab'    Open an existing file for writing. Does not create or overwrite existing files.
class gsd.pygsd.GSDFile(file)
GSD file access interface.
Implemented in pure python and accepts any python file-like object.
Parameters file – File-like object to read.
GSDFile implements an object oriented class interface to the GSD file layer. Use it to open an existing file in a
read-only mode. For read-write access to files, use the full featured C implementation in gsd.fl. Otherwise,
this implementation has all the same methods and the two classes can be used interchangeably.
Examples
f = GSDFile(open('file.gsd', mode='rb'))
if f.chunk_exists(frame=0, name='chunk'):
data = f.read_chunk(frame=0, name='chunk')
f = GSDFile(open('file.gsd', mode='rb'))
print(f.name, f.mode, f.gsd_version)
print(f.application, f.schema, f.schema_version)
print(f.nframes)
property application
Name of the generating application.
Type str
chunk_exists(frame, name)
Test if a chunk exists.
Parameters
• frame (int) – Index of the frame to check
• name (str) – Name of the chunk
Returns True if the chunk exists in the file. False if it does not.
Return type bool
Example
close()
Close the file.
Once closed, any other operation on the file object will result in a ValueError. close() may be called
more than once. The file is automatically closed when garbage collected or when the context manager exits.
end_frame()
Not implemented.
property file
File-like object opened.
find_matching_chunk_names(match)
Find chunk names in the file that start with the string match.
Parameters match (str) – Start of the chunk name to match
Returns Matching chunk names
Return type list[str]
property gsd_version
GSD file layer version number.
The tuple is in the order (major, minor).
Type typing.Tuple [int, int]
property mode
Mode of the open file.
Type str
property name
file.name.
Type str
property nframes
Number of frames in the file.
Type int
read_chunk(frame, name)
Read a data chunk from the file and return it as a numpy array.
Parameters
• frame (int) – Index of the frame to read
• name (str) – Name of the chunk
Returns Data read from file.
Return type numpy.ndarray
Examples
Read a 1D array:
Read a 2D array:
Tip: Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead,
don’t call read_chunk() on the same chunk repeatedly. Cache the arrays instead.
property schema
Name of the data schema.
Type str
property schema_version
Schema version number.
The tuple is in the order (major, minor).
Type typing.Tuple [int, int]
truncate()
Not implemented.
write_chunk(name, data)
Not implemented.
import gsd.fl
f = gsd.fl.open(name='filename', mode='rb')
__version__
GSD software version number. This is the version number of the software package as a whole, not the file layer
version it reads/writes.
Type str
6.3 Logging
All python modules in GSD use the python standard library module logging to log events. Use this module to
control the verbosity and output destination:
import logging
logging.basicConfig(level=logging.INFO)
See also:
Module logging Documentation of the logging standard module.
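To change verbosity for GSD alone, adjust a named logger instead of the root logger. The sketch below assumes that the GSD modules create their loggers under the 'gsd' namespace (for example 'gsd.hoomd'). That name is an assumption to verify against your installed version.

```python
import logging

# Send INFO and above from every module to the default handler.
logging.basicConfig(level=logging.INFO,
                    format='%(name)s %(levelname)s: %(message)s')

# Assumed logger namespace: silence GSD messages below WARNING.
logging.getLogger('gsd').setLevel(logging.WARNING)
```

Child loggers without their own level inherit the level set on 'gsd', so a single call covers every module under that namespace.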
CHAPTER
SEVEN
C API
The GSD C API consists of a single header and source file. Developers can drop the implementation into any package
that needs it.
7.1 Functions
int gsd_create(const char *fname, const char *application, const char *schema, uint32_t
schema_version)
Create an empty gsd file with the given name. Overwrite any existing file at that location. The generated gsd
file is not opened. Call gsd_open() to open it for writing.
Parameters
• fname – File name.
• application – Generating application name (truncated to 63 chars).
• schema – Schema name for data to be written in this GSD file (truncated to 63 chars).
• schema_version – Version of the schema data to be written (make with
gsd_make_version()).
Returns
• GSD_SUCCESS (0) on success. Negative value on failure:
• GSD_ERROR_IO: IO error (check errno).
int gsd_create_and_open(struct gsd_handle *handle, const char *fname, const char *application,
const char *schema, uint32_t schema_version, gsd_open_flag flags, int exclusive_create)
Create an empty gsd file with the given name. Overwrite any existing file at that location. Open the generated
gsd file in handle.
Parameters
• handle – Handle to open.
• fname – File name.
• application – Generating application name (truncated to 63 chars).
• schema – Schema name for data to be written in this GSD file (truncated to 63 chars).
• schema_version – Version of the schema data to be written (make with
gsd_make_version()).
• flags – Either GSD_OPEN_READWRITE or GSD_OPEN_APPEND.
Parameters
• handle – GSD file to close.
Warning: Ensure that all gsd_write_chunk() calls are committed with gsd_end_frame() before
closing the file.
Returns
• GSD_SUCCESS (0) on success. Negative value on failure:
• GSD_ERROR_IO: IO error (check errno).
• GSD_ERROR_INVALID_ARGUMENT: handle is NULL.
Note: If the GSD file is version 1.0, the chunk name is truncated to 63 bytes. GSD version 2.0 files support
arbitrarily long names.
Returns
• GSD_SUCCESS (0) on success. Negative value on failure:
7.2 Constants
gsd_type GSD_TYPE_UINT8
Type ID: 8-bit unsigned integer.
gsd_type GSD_TYPE_UINT16
Type ID: 16-bit unsigned integer.
gsd_type GSD_TYPE_UINT32
Type ID: 32-bit unsigned integer.
gsd_type GSD_TYPE_UINT64
Type ID: 64-bit unsigned integer.
gsd_type GSD_TYPE_INT8
Type ID: 8-bit signed integer.
gsd_type GSD_TYPE_INT16
Type ID: 16-bit signed integer.
gsd_type GSD_TYPE_INT32
Type ID: 32-bit signed integer.
gsd_type GSD_TYPE_INT64
Type ID: 64-bit signed integer.
gsd_type GSD_TYPE_FLOAT
Type ID: 32-bit single precision floating point.
gsd_type GSD_TYPE_DOUBLE
Type ID: 64-bit double precision floating point.
gsd_open_flag GSD_OPEN_READWRITE
Open file in read/write mode.
gsd_open_flag GSD_OPEN_READONLY
Open file in read only mode.
gsd_open_flag GSD_OPEN_APPEND
Open file in append only mode.
gsd_error GSD_SUCCESS
Success.
gsd_error GSD_ERROR_IO
IO error. Check errno for details.
gsd_error GSD_ERROR_INVALID_ARGUMENT
Invalid argument passed to function.
gsd_error GSD_ERROR_NOT_A_GSD_FILE
The file is not a GSD file.
gsd_error GSD_ERROR_INVALID_GSD_FILE_VERSION
The GSD file version cannot be read.
gsd_error GSD_ERROR_FILE_CORRUPT
The GSD file is corrupt.
gsd_error GSD_ERROR_MEMORY_ALLOCATION_FAILED
GSD failed to allocate memory.
gsd_error GSD_ERROR_NAMELIST_FULL
The GSD file cannot store any additional unique data chunk names.
gsd_error GSD_ERROR_FILE_MUST_BE_WRITABLE
This API call requires that the GSD file be opened with the mode GSD_OPEN_APPEND or
GSD_OPEN_READWRITE.
gsd_error GSD_ERROR_FILE_MUST_BE_READABLE
This API call requires that the GSD file be opened with the mode GSD_OPEN_READONLY or GSD_OPEN_READWRITE.
uint8_t type
Data type of the chunk. See Data types.
type gsd_open_flag
Enum defining the file open flag. Valid values are GSD_OPEN_READWRITE, GSD_OPEN_READONLY, and
GSD_OPEN_APPEND.
type gsd_type
Enum defining the data type of a GSD data chunk.
type gsd_error
Enum defining the possible error return values.
type uint8_t
8-bit unsigned integer (defined by C compiler).
type uint32_t
32-bit unsigned integer (defined by C compiler).
type uint64_t
64-bit unsigned integer (defined by C compiler).
type int64_t
64-bit signed integer (defined by C compiler).
type size_t
unsigned integer (defined by C compiler).
CHAPTER
EIGHT
SPECIFICATION
HOOMD-blue supports a wide variety of per particle attributes and properties. Particles, bonds, and types can be
dynamically added and removed during simulation runs. The hoomd schema can handle all of these situations in a
reasonably space efficient and high performance manner. It is also backwards compatible with previous versions of
itself, as we only add new additional data chunks in new versions and do not change the interpretation of the existing
data chunks. Any newer reader will initialize new data chunks with default values when they are not present in an
older version file.
Schema name hoomd
Schema version 1.4
8.1.1 Use-cases
In each frame, the hoomd schema may contain one or more data chunks. The layout and names of the chunks match that
of the binary snapshot API in HOOMD-blue itself. Data chunks are organized in categories. These categories have no
meaning in the hoomd schema specification, and are simply an organizational tool. Some file writers may implement
options that act on categories (e.g. write attributes out to every frame, or just frame 0).
Values are well defined for all fields at all frames. When a data chunk is present in frame i, it defines the values for the
frame. When it is not present, the data chunk of the same name at frame 0 defines the values for frame i (when N is
equal between the frames). If the data chunk is not present in frame 0, or N differs between frames, values are default.
Default values allow file sizes to remain small. For example, a simulation with point particles where orientation is
always (1,0,0,0) would not write any orientation chunk to the file.
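The fallback rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the gsd reader itself: frames are modelled as plain dictionaries mapping chunk names to lists, and the names resolve_chunk, n_by_frame, and DEFAULTS are hypothetical.

```python
# Hypothetical table of per-particle defaults for this sketch
# (values copied from the schema tables in this chapter).
DEFAULTS = {"particles/mass": 1.0, "particles/typeid": 0}


def resolve_chunk(frames, n_by_frame, name, i):
    """Return the values of chunk `name` in frame `i` using the fallback rule.

    frames     -- list of dicts, one per frame, mapping chunk name -> list of values
    n_by_frame -- list giving N (the number of particles) for each frame
    """
    if name in frames[i]:
        # The chunk is present in frame i, so it defines the values directly.
        return frames[i][name]
    if name in frames[0] and n_by_frame[0] == n_by_frame[i]:
        # Fall back to frame 0 only when N matches between the two frames.
        return frames[0][name]
    # Otherwise every particle takes the default value.
    return [DEFAULTS[name]] * n_by_frame[i]


frames = [
    {"particles/mass": [2.0, 3.0]},  # frame 0 writes mass
    {},                              # frame 1 writes nothing, N unchanged
    {},                              # frame 2 writes nothing, N changed
]
n_by_frame = [2, 2, 3]

print(resolve_chunk(frames, n_by_frame, "particles/mass", 1))    # [2.0, 3.0]
print(resolve_chunk(frames, n_by_frame, "particles/mass", 2))    # [1.0, 1.0, 1.0]
print(resolve_chunk(frames, n_by_frame, "particles/typeid", 1))  # [0, 0]
```

Frame 2 takes the default rather than the frame 0 values because N differs between the two frames.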
N may be zero. When N is zero, an index entry may be written for a data chunk with no actual data written to the file
for that chunk.
8.1.3 Configuration
configuration/step
Type uint64
Size 1x1
Default 0
Units number
Simulation time step.
configuration/dimensions
Type uint8
Size 1x1
Default 3
Units number
Number of dimensions in the simulation. Must be 2 or 3.
configuration/box
Type float
Size 6x1
Default [1,1,1,0,0,0]
Units varies
Simulation box. Each array element defines a different box property. See the hoomd documentation for a full
description of how these box parameters map to a triclinic geometry.
• box[0:3]: (𝑙𝑥 , 𝑙𝑦 , 𝑙𝑧 ) the box length in each direction, in length units
• box[3:]: (𝑥𝑦, 𝑥𝑧, 𝑦𝑧) the tilt factors, unitless values
Within a single frame, the number of particles N and NT are fixed for all chunks. N and NT may vary from one frame
to the next. All values are stored in hoomd native units.
Attributes
particles/N
Type uint32
Size 1x1
Default 0
Units number
Define N, the number of particles, for all data chunks particles/*.
particles/types
Type int8
Size NTxM
Default [‘A’]
Units UTF-8
Implicitly define NT, the number of particle types, for all data chunks particles/*. M must be large enough
to accommodate each type name as a null terminated UTF-8 character string. Row i of the 2D matrix is the type
name for particle type i.
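As an illustration of the NTxM layout, the sketch below packs a list of type names into rows of null-terminated UTF-8 bytes and reads them back. It assumes the matrix is stored row by row (row-major), which this section does not state explicitly; encode_types and decode_types are hypothetical helper names, not part of the gsd API.

```python
def encode_types(names):
    """Pack type names into an NT x M byte matrix of null-terminated UTF-8 strings."""
    encoded = [name.encode("utf-8") for name in names]
    # M must hold the longest name plus its terminating null byte.
    m = max(len(e) for e in encoded) + 1
    rows = [e.ljust(m, b"\x00") for e in encoded]
    return len(rows), m, b"".join(rows)


def decode_types(nt, m, data):
    """Recover the list of type names from an NT x M byte matrix."""
    names = []
    for i in range(nt):
        row = data[i * m:(i + 1) * m]
        # Row i holds the name of type i, up to the first null byte.
        names.append(row.split(b"\x00", 1)[0].decode("utf-8"))
    return names


nt, m, data = encode_types(["A", "polymer", "B"])
print(nt, m)                      # 3 8
print(decode_types(nt, m, data))  # ['A', 'polymer', 'B']
```

M is sized in bytes, not characters, so names containing non-ASCII characters need more than one byte per character.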
particles/typeid
Type uint32
Size Nx1
Default 0
Units number
Store the type id of each particle. All ids must be less than NT. A particle with type id i has the type name stored in
row i of particles/types.
particles/type_shapes
Type int8
Size NTxM
Default empty
Units UTF-8
Store a per-type shape definition for visualization. A dictionary is stored for each of the NT types, corresponding
to a shape for visualization of that type. M must be large enough to accommodate the shape definition as a null-
terminated UTF-8 JSON-encoded string. See: Shape Visualization for examples.
particles/mass
Type float (32-bit)
Size Nx1
Default 1.0
Units mass
Store the mass of each particle.
particles/charge
Properties
particles/position
Type float (32-bit)
Size Nx3
Default 0,0,0
Units length
Store the position of each particle (x, y, z).
All particles in the simulation are referenced by a tag. The position data chunk (and all other per particle data
chunks) list particles in tag order. The first particle listed has tag 0, the second has tag 1, . . . , and the last has tag
N-1 where N is the number of particles in the simulation.
All particles must be inside the box:
Momenta
particles/velocity
Type float (32-bit)
Size Nx3
Default 0,0,0
Units length/time
Store the velocity of each particle (𝑣𝑥 , 𝑣𝑦 , 𝑣𝑧 ).
particles/angmom
Type float (32-bit)
Size Nx4
Default 0,0,0,0
Units quaternion
Store the angular momentum of each particle as a quaternion. See the HOOMD documentation for information
on how to convert to a vector representation.
particles/image
Type int32
Size Nx3
Default 0,0,0
Units number
Store the number of times each particle has wrapped around the box (𝑖𝑥 , 𝑖𝑦 , 𝑖𝑧 ). In constant volume simulations,
the unwrapped position in the particle’s full trajectory is
• 𝑥𝑢 = 𝑥 + 𝑖𝑥 · 𝑙𝑥 + 𝑥𝑦 · 𝑖𝑦 · 𝑙𝑦 + 𝑥𝑧 · 𝑖𝑧 · 𝑙𝑧
• 𝑦𝑢 = 𝑦 + 𝑖𝑦 · 𝑙𝑦 + 𝑦𝑧 · 𝑖𝑧 · 𝑙𝑧
• 𝑧𝑢 = 𝑧 + 𝑖𝑧 · 𝑙𝑧
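The three expressions above translate directly into code. The sketch below is only an illustration for a single particle with hypothetical argument names; the box argument is assumed to use the same element order as configuration/box.

```python
def unwrap(position, image, box):
    """Return the unwrapped (x, y, z) position of one particle.

    position -- (x, y, z) wrapped position
    image    -- (ix, iy, iz) box image counts
    box      -- [lx, ly, lz, xy, xz, yz], in the order used by configuration/box
    """
    x, y, z = position
    ix, iy, iz = image
    lx, ly, lz, xy, xz, yz = box
    xu = x + ix * lx + xy * iy * ly + xz * iz * lz
    yu = y + iy * ly + yz * iz * lz
    zu = z + iz * lz
    return (xu, yu, zu)


# Orthorhombic box: each image count shifts the particle by a whole box length.
print(unwrap((1.0, -2.0, 0.5), (1, 0, -1), [10.0, 10.0, 10.0, 0.0, 0.0, 0.0]))
# Tilted box: one image in y also shifts x by xy * ly.
print(unwrap((0.0, 0.0, 0.0), (0, 1, 0), [10.0, 20.0, 10.0, 0.5, 0.0, 0.0]))
```

The first call returns (11.0, -2.0, -9.5) and the second returns (10.0, 20.0, 0.0). As the text notes, the result is only meaningful when the box does not change over the trajectory.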
8.1.5 Topology
bonds/N
Type uint32
Size 1x1
Default 0
Units number
Define N, the number of bonds, for all data chunks bonds/*.
bonds/types
Type int8
Size NTxM
Default empty
Units UTF-8
Implicitly define NT, the number of bond types, for all data chunks bonds/*. M must be large enough to
accommodate each type name as a null terminated UTF-8 character string. Row i of the 2D matrix is the type
name for bond type i. By default, there are 0 bond types.
bonds/typeid
Type uint32
Size Nx1
Default 0
Units number
Store the type id of each bond. All ids must be less than NT. A bond with type id i has the type name stored in
row i of bonds/types.
bonds/group
Type uint32
Size Nx2
Default 0,0
Units number
Store the particle tags in each bond.
angles/N
Type uint32
Size 1x1
Default 0
Units number
Define N, the number of angles, for all data chunks angles/*.
angles/types
Type int8
Size NTxM
Default empty
Units UTF-8
Implicitly define NT, the number of angle types, for all data chunks angles/*. M must be large enough to
accommodate each type name as a null terminated UTF-8 character string. Row i of the 2D matrix is the type
name for angle type i. By default, there are 0 angle types.
angles/typeid
Type uint32
Size Nx1
Default 0
Units number
Store the type id of each angle. All ids must be less than NT. An angle with type id i has the type name stored in
row i of angles/types.
angles/group
Type uint32
Size Nx3
Default 0,0,0
Units number
Store the particle tags in each angle.
dihedrals/N
Type uint32
Size 1x1
Default 0
Units number
Define N, the number of dihedrals, for all data chunks dihedrals/*.
dihedrals/types
Type int8
Size NTxM
Default empty
Units UTF-8
Implicitly define NT, the number of dihedral types, for all data chunks dihedrals/*. M must be large enough
to accommodate each type name as a null terminated UTF-8 character string. Row i of the 2D matrix is the type
name for dihedral type i. By default, there are 0 dihedral types.
dihedrals/typeid
Type uint32
Size Nx1
Default 0
Units number
Store the type id of each dihedral. All ids must be less than NT. A dihedral with type id i has the type name
stored in row i of dihedrals/types.
dihedrals/group
Type uint32
Size Nx4
Default 0,0,0,0
Units number
Store the particle tags in each dihedral.
impropers/N
Type uint32
Size 1x1
Default 0
Units number
Define N, the number of impropers, for all data chunks impropers/*.
impropers/types
Type int8
Size NTxM
Default empty
Units UTF-8
Implicitly define NT, the number of improper types, for all data chunks impropers/*. M must be large
enough to accommodate each type name as a null terminated UTF-8 character string. Row i of the 2D matrix is
the type name for improper type i. By default, there are 0 improper types.
impropers/typeid
Type uint32
Size Nx1
Default 0
Units number
Store the type id of each improper. All ids must be less than NT. An improper with type id i has the type name
stored in row i of impropers/types.
impropers/group
Type uint32
Size Nx4
Default 0,0,0,0
Units number
Store the particle tags in each improper.
constraints/N
Type uint32
Size 1x1
Default 0
Units number
Define N, the number of constraints, for all data chunks constraints/*.
constraints/value
Type float
Size Nx1
Default 0
Units length
Store the distance of each constraint. Each constraint defines a fixed distance between two particles.
constraints/group
Type uint32
Size Nx2
Default 0,0
Units number
Store the particle tags in each constraint.
pairs/N
Type uint32
Size 1x1
Default 0
Units number
Define N, the number of special pair interactions, for all data chunks pairs/*.
New in version 1.1.
pairs/types
Type int8
Size NTxM
Default empty
Units UTF-8
Implicitly define NT, the number of special pair types, for all data chunks pairs/*. M must be large enough
to accommodate each type name as a null terminated UTF-8 character string. Row i of the 2D matrix is the type
name for special pair type i. By default, there are 0 special pair types.
New in version 1.1.
pairs/typeid
Type uint32
Size Nx1
Default 0
Units number
Store the type id of each special pair interaction. All ids must be less than NT. A pair with type id i has the type
name stored in row i of pairs/types.
New in version 1.1.
pairs/group
Type uint32
Size Nx2
Default 0,0
Units number
Store the particle tags in each special pair interaction.
New in version 1.1.
Users may store logged data in log/* data chunks. Logged data encompasses values computed at simulation time
that are too expensive or cumbersome to re-compute in post processing. This specification does not define specific
chunk names or define logged data. Users may select any valid name for logged data chunks as appropriate for their
workflow.
For any named logged data chunks present in any frame of the file: If a chunk is not present in a given frame i !=
0, the implementation should provide the quantity as read from frame 0 for that frame. GSD files that include a logged
data chunk only in some frames i != 0 and not in frame 0 are invalid.
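The validity rule above can be checked mechanically once the chunk names in each frame are known. The sketch below assumes the caller has already collected those names; invalid_log_chunks is a hypothetical helper and is not part of the gsd API.

```python
def invalid_log_chunks(frames):
    """Return the sorted names of log/* chunks found in a frame i != 0 but not in frame 0.

    frames -- list of collections of chunk names, one collection per frame
    """
    if not frames:
        return []
    in_frame_0 = {name for name in frames[0] if name.startswith("log/")}
    later = set()
    for chunk_names in frames[1:]:
        later.update(name for name in chunk_names if name.startswith("log/"))
    # Anything logged later without a frame 0 counterpart makes the file invalid.
    return sorted(later - in_frame_0)


frames = [
    {"log/energy", "particles/position"},
    {"log/energy"},
    {"log/pressure"},
]
print(invalid_log_chunks(frames))  # ['log/pressure']
```

An empty result means every logged chunk has a frame 0 value that a reader can fall back to.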
By convention, per-particle and per-bond logged data should have a chunk name starting with log/particles/
and log/bonds/, respectively. Scalar, vector, and string values may be stored under a different prefix starting with
log/. This specification may recognize additional conventions in later versions without invalidating existing files.
log/particles/user_defined
Type user-defined
Size NxM
Units user-defined
This chunk is a place holder for any number of user defined per-particle quantities. N is the number of particles
in this frame. M, the data type, the units, and the chunk name (after the prefix log/particles/) are user-
defined.
New in version 1.4.
log/bonds/user_defined
Type user-defined
Size NxM
Units user-defined
This chunk is a place holder for any number of user defined per-bond quantities. N is the number of bonds in
this frame. M, the data type, the units, and the chunk name (after the prefix log/bonds/) are user-defined.
New in version 1.4.
log/user_defined
Type user-defined
Size NxM
Units user-defined
This chunk is a place holder for any number of user defined quantities. N, M, the data type, the units, and the
chunk name (after the prefix log/) are user-defined.
New in version 1.4.
HOOMD stores auxiliary state information in state/* data chunks. Auxiliary state encompasses internal state to
any integrator, updater, or other class that is not part of the particle system state but is also not a fixed parameter, such as
the internal degrees of freedom in an integrator. Auxiliary state is useful when restarting simulations.
HOOMD only stores state in GSD files when requested explicitly by the user. Only a few of the documented state data
chunks will be present in any GSD file and not all state chunks are valid. Thus, state data chunks do not have default
values. If a chunk is not present in the file, that state does not have a well-defined value.
Note: HOOMD-blue versions 3.0 and newer write state data in an application defined format in log/*, not in
state/*. See the HOOMD-blue documentation for details on the data chunks it reads and writes.
Units length
Sweep radius for each type.
New in version 1.2.
state/hpmc/convex_polygon/N
Type uint32
Size NTx1
Units number
Number of vertices defined for each type.
New in version 1.2.
state/hpmc/convex_polygon/vertices
Type float
Size sum(N)x2
Units length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for
type 1 is the next N[1] vertices, and so on.
New in version 1.2.
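The sum(N)x2 layout concatenates the vertices of every type into one array. A reader can recover the per-type shapes by slicing with the N chunk, as in the sketch below. split_vertices is a hypothetical helper name and the vertex values are made up for illustration.

```python
def split_vertices(n_per_type, vertices):
    """Split a concatenated vertex list into one list of vertices per type.

    n_per_type -- number of vertices for each type (the N chunk)
    vertices   -- sum(N) rows of (x, y) pairs (the vertices chunk)
    """
    shapes = []
    start = 0
    for n in n_per_type:
        # Type k owns the next N[k] rows of the vertices array.
        shapes.append(vertices[start:start + n])
        start += n
    if start != len(vertices):
        raise ValueError("vertices must contain exactly sum(N) rows")
    return shapes


n_per_type = [3, 4]
vertices = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0),               # triangle for type 0
    (-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0),  # square for type 1
]
shapes = split_vertices(n_per_type, vertices)
print(len(shapes[0]), len(shapes[1]))  # 3 4
```

The same slicing applies to the other polygon state chunks that use a sum(N)x2 vertices array.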
state/hpmc/convex_spheropolygon/N
Type uint32
Size NTx1
Units number
Number of vertices defined for each type.
New in version 1.2.
state/hpmc/convex_spheropolygon/vertices
Type float
Size sum(N)x2
Units length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for
type 1 is the next N[1] vertices, and so on.
New in version 1.2.
state/hpmc/convex_spheropolygon/sweep_radius
Type float
Size NTx1
Units length
Sweep radius for each type.
New in version 1.2.
state/hpmc/simple_polygon/N
Type uint32
Size NTx1
Units number
Number of vertices defined for each type.
New in version 1.2.
state/hpmc/simple_polygon/vertices
Type float
Size sum(N)x2
Units length
Position of the vertices in the shape for all types. The shape for type 0 is the first N[0] vertices, the shape for
type 1 is the next N[1] vertices, and so on.
New in version 1.2.
The chunk particles/type_shapes stores information about shapes corresponding to particle types. Shape
definitions are stored for each type as a UTF-8 encoded JSON string containing key-value pairs. The class of a shape
is defined by the type key. All other keys define properties of that shape. Keys without a default value are required
for a valid shape specification.
An empty dictionary can be used for undefined shapes. A visualization application may choose how to interpret this,
e.g. by drawing nothing or drawing spheres.
Example:
{}
8.2.2 Spheres
Type: Sphere
Spheres’ dimensionality (2D circles or 3D spheres) can be inferred from the system box dimensionality.
Example:
{
"type": "Sphere",
"diameter": 2.0
}
8.2.3 Ellipsoids
Type: Ellipsoid
The ellipsoid class has principal axes a, b, c corresponding to its radii in the x, y, and z directions.
Example:
{
"type": "Ellipsoid",
"a": 7.0,
"b": 5.0,
"c": 3.0
}
8.2.4 Polygons
Type: Polygon
A simple polygon with its vertices specified in a counterclockwise order. Spheropolygons can be represented using
this shape type, through the rounding_radius key.
Example:
{
"type": "Polygon",
"rounding_radius": 0.1,
"vertices": [[-0.5, -0.5], [0.5, -0.5], [0.5, 0.5]]
}
Type: ConvexPolyhedron
A convex polyhedron with vertices specifying the convex hull of the shape. Spheropolyhedra can be represented using
this shape type, through the rounding_radius key.
Example:
{
"type": "ConvexPolyhedron",
"rounding_radius": 0.1,
"vertices": [[0.5, 0.5, 0.5], [0.5, -0.5, -0.5], [-0.5, 0.5, -0.5], [-0.5, -0.5, 0.5]]
}
Type: Mesh
A list of lists of indices are used to specify faces. Faces must contain 3 or more vertex indices. The vertex indices
must be zero-based. Faces must be defined with a counterclockwise winding order (to produce an “outward” normal).
Example:
{
"type": "Mesh",
"vertices": [[0.5, 0.5, 0.5], [0.5, -0.5, -0.5], [-0.5, 0.5, -0.5], [-0.5, -0.5, 0.5]],
Type: SphereUnion
A collection of spheres, defined by their diameters and centers.
Example:
{
"type": "SphereUnion",
"centers": [[0, 0, 1.0], [0, 1.0, 0], [1.0, 0, 0]],
"diameters": [0.5, 0.5, 0.5]
}
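A visualization tool might dispatch on the type key of each JSON string, treating an empty dictionary as an undefined shape. The sketch below uses only the Python standard library; shape_class is a hypothetical helper name, and the strings are written by hand here rather than read from a particles/type_shapes chunk.

```python
import json


def shape_class(shape_json):
    """Return the shape class named by the "type" key, or None for an undefined shape."""
    shape = json.loads(shape_json)
    if not shape:
        # An empty dictionary marks an undefined shape.
        return None
    return shape["type"]


type_shapes = [
    '{"type": "Sphere", "diameter": 2.0}',
    '{"type": "Polygon", "rounding_radius": 0.1,'
    ' "vertices": [[-0.5, -0.5], [0.5, -0.5], [0.5, 0.5]]}',
    '{}',
]
print([shape_class(s) for s in type_shapes])  # ['Sphere', 'Polygon', None]
```

How a None result is drawn is left to the application, for example as nothing or as a sphere.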
Version: 2.0
General simulation data (GSD) file layer design and rationale. These use cases and design specifications define the
low level GSD file format.
Differences from the 1.0 specification are noted.
8.3.1 Use-cases
• capabilities
– efficiently store many frames of data from simulation runs
– high performance file read and write
– support arbitrary chunks of data in each frame (position, orientation, type, etc.)
– variable number of named chunks in each frame
– variable size of chunks in each frame
– each chunk identifies data type
– common use cases: NxM arrays in double, float, int, char types.
– generic use case: binary blob of N bytes
– can be integrated into other tools
– append frames to an existing file with a monotonically increasing frame number
– resilient to job kills
• queries
– number of frames
– is named chunk present in frame i
– type and size of named chunk in frame i
– read data for named chunk in frame i
– read only a portion of a chunk
– list chunk names in the file
• writes
– write data to named chunk in the current frame
– end frame and commit to disk
These capabilities enable a simple and rich higher level schema for storing particle and other types of data. The schema
determines which named chunks exist in a given file and what they mean.
These capabilities are use-cases that GSD does not support, by design.
1. Modify data in the file: GSD is designed to capture simulation data.
2. Add chunks to frames in the middle of a file: See (1).
3. Transparent conversion between float and double: Callers must take care of this.
4. Transparent compression: this gets in the way of parallel I/O. Disk space is cheap.
8.3.3 Dependencies
The file layer is implemented in C (not C++) with no dependencies to enable trivial installation and incorporation into
existing projects. A single header and C file completely implement the entire file layer. Python based projects that
need only read access can use gsd.pygsd, a pure Python gsd reader implementation.
A Python interface to the file layer allows reference implementations and convenience methods for schemas. Most
non-technical users of GSD will probably use these reference implementations directly in their scripts.
The low level C library is wrapped with Cython. A Python setup.py file provides simple installation on as many
systems as possible. Cython C++ output is checked in to the repository so users do not even need Cython as a
dependency.
8.3.4 Specifications
Support:
• Files as large as the underlying filesystem allows (up to 64-bit address limits)
• Data chunk names of arbitrary length (v1.0 limits chunk names to 63 bytes)
• Reference up to 65535 different chunk names within a file
• Application and schema names up to 63 characters
• Store as many frames as can fit in a file up to file size limits
• Data chunks up to (64-bit) x (32-bit) elements
The limits of 16-bit name indices and 32-bit column indices keep each index entry as small
as possible and avoid wasting space in the file index. The primary use cases in mind for column indices are Nx3 and
Nx4 arrays for position and quaternion values. Schemas that need to store larger, truly N-dimensional arrays can store
their dimensionality as metadata in another chunk and store the data as an Nx1 index entry, or use a file format better
suited to N-dimensional arrays, such as HDF5.
2. Index block
• Index the frame data, size information, location, name id, etc.
• The index contains space for any number of index_entry structs
• The first index entry in the list with a location of 0 marks the end of the list.
• When the index fills up, a new index block is allocated at the end of the file with more space and all current
index entries are rewritten there.
• Index entry size: 32 bytes
3. Name list
• List of string names used by index entries.
• v1.0 files: Each name is a 64-byte character string.
• v2.0 files: Names may have any length and are separated by 0 terminators.
• The first name that starts with the 0 byte marks the end of the list
• The header stores the total size of the name list block.
4. Data chunk
• Raw binary data stored for the named frame data blocks.
Header, index, and name blocks are stored in memory as C structs (or arrays of C structs) and written to disk in whole
chunks.
Header block
struct gsd_header
{
uint64_t magic;
uint64_t index_location;
uint64_t index_allocated_entries;
uint64_t namelist_location;
uint64_t namelist_allocated_entries;
uint32_t schema_version;
uint32_t gsd_version;
char application[64];
char schema[64];
char reserved[80];
};
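To make the layout concrete, the sketch below unpacks a header with Python's struct module. The field order follows struct gsd_header, which adds up to 256 bytes. The little-endian, unpadded format string is an assumption (the struct is written to disk as it sits in memory), unpack_header is a hypothetical helper, and the values packed for the demonstration are made up rather than taken from a real file.

```python
import struct

# Field order follows struct gsd_header. "<" (little-endian, no padding) is an assumption.
HEADER_FORMAT = "<5Q2I64s64s80s"


def unpack_header(raw):
    """Unpack the raw bytes of a header block into a dictionary."""
    (magic, index_location, index_allocated_entries, namelist_location,
     namelist_allocated_entries, schema_version, gsd_version,
     application, schema, _reserved) = struct.unpack(HEADER_FORMAT, raw)
    return {
        "magic": magic,
        "index_location": index_location,
        "index_allocated_entries": index_allocated_entries,
        "namelist_location": namelist_location,
        "namelist_allocated_entries": namelist_allocated_entries,
        "schema_version": schema_version,
        "gsd_version": gsd_version,
        # The character arrays are null padded; keep the text before the first null.
        "application": application.split(b"\x00", 1)[0].decode("utf-8"),
        "schema": schema.split(b"\x00", 1)[0].decode("utf-8"),
    }


# Made-up header contents, used only to exercise the parser.
raw = struct.pack(HEADER_FORMAT, 0x1234, 256, 128, 4352, 128, 7, 9,
                  b"my-app", b"hoomd", b"")
header = unpack_header(raw)
print(struct.calcsize(HEADER_FORMAT))          # 256
print(header["application"], header["schema"])  # my-app hoomd
```

The two version fields are left as raw integers here; use gsd_make_version() on the C side to build them.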
Index block
An index block is made of a number of line items, each of which stores a pointer to a single data chunk:
struct gsd_index_entry
{
uint64_t frame;
uint64_t N;
int64_t location;
uint32_t M;
uint16_t id;
uint8_t type;
uint8_t flags;
};
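The 32-byte entry size quoted earlier follows from the struct fields (8 + 8 + 8 + 4 + 2 + 1 + 1). The sketch below walks a raw index block and stops at the first entry whose location is 0, as the file layout describes. The little-endian, unpadded format string is an assumption, unpack_index_entries is a hypothetical helper, and the packed entries are made up for the demonstration.

```python
import struct

# frame, N, location, M, id, type, flags. "<" (little-endian, no padding) is an assumption.
INDEX_ENTRY_FORMAT = "<QQqIHBB"
INDEX_ENTRY_SIZE = struct.calcsize(INDEX_ENTRY_FORMAT)


def unpack_index_entries(raw):
    """Return the index entries in `raw`, stopping at the first entry with location 0."""
    entries = []
    for offset in range(0, len(raw) - INDEX_ENTRY_SIZE + 1, INDEX_ENTRY_SIZE):
        frame, n, location, m, name_id, type_id, flags = struct.unpack_from(
            INDEX_ENTRY_FORMAT, raw, offset)
        if location == 0:
            # A location of 0 marks the end of the list.
            break
        entries.append({"frame": frame, "N": n, "location": location, "M": m,
                        "id": name_id, "type": type_id, "flags": flags})
    return entries


# Two made-up entries followed by unused (zeroed) space in the index block.
raw = (struct.pack(INDEX_ENTRY_FORMAT, 0, 100, 4096, 3, 0, 9, 0)
       + struct.pack(INDEX_ENTRY_FORMAT, 1, 100, 5296, 3, 0, 9, 0)
       + bytes(2 * INDEX_ENTRY_SIZE))
entries = unpack_index_entries(raw)
print(INDEX_ENTRY_SIZE, len(entries))  # 32 2
```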
Namelist block
In v2.0 files, the namelist block stores a list of strings separated by 0 terminators.
In v1.0 files, the namelist block stores a list of 0-terminated strings in 64-byte segments.
The first string that starts with 0 marks the end of the list.
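Both namelist layouts can be decoded with a few lines of standard-library Python. The sketch below assumes the raw namelist block has already been read into a bytes object; the function names are hypothetical and the sample blocks are made up.

```python
def parse_namelist_v2(block):
    """Decode a v2.0 namelist: strings separated by 0 terminators."""
    names = []
    for raw in block.split(b"\x00"):
        if not raw:
            # A string that starts with 0 (an empty string) ends the list.
            break
        names.append(raw.decode("utf-8"))
    return names


def parse_namelist_v1(block):
    """Decode a v1.0 namelist: 0-terminated strings in 64-byte segments."""
    names = []
    for offset in range(0, len(block), 64):
        segment = block[offset:offset + 64]
        if segment[0] == 0:
            # A segment that starts with 0 ends the list.
            break
        names.append(segment.split(b"\x00", 1)[0].decode("utf-8"))
    return names


v2_block = b"particles/N\x00particles/position\x00\x00\x00ignored"
v1_block = (b"configuration/step".ljust(64, b"\x00")
            + b"particles/N".ljust(64, b"\x00")
            + bytes(64))
print(parse_namelist_v2(v2_block))  # ['particles/N', 'particles/position']
print(parse_namelist_v1(v1_block))  # ['configuration/step', 'particles/N']
```

The position of a name in the decoded list corresponds to the id field of the index entries that reference it.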
Data block
A data block stores raw data bytes on the disk. For a given index entry entry, the data starts at location
entry.location and spans the next entry.N * entry.M * gsd_sizeof_type(entry.type) bytes.
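The byte range of a chunk follows directly from this formula. The sketch below stands in for gsd_sizeof_type() with a small lookup table keyed by constant name, because this section does not list the numeric type ids; the sizes come from the bit widths given in the Constants section. TYPE_SIZES and chunk_byte_range are hypothetical names.

```python
# Size in bytes of one element of each GSD data type, from the documented bit widths.
TYPE_SIZES = {
    "GSD_TYPE_UINT8": 1, "GSD_TYPE_UINT16": 2, "GSD_TYPE_UINT32": 4, "GSD_TYPE_UINT64": 8,
    "GSD_TYPE_INT8": 1, "GSD_TYPE_INT16": 2, "GSD_TYPE_INT32": 4, "GSD_TYPE_INT64": 8,
    "GSD_TYPE_FLOAT": 4, "GSD_TYPE_DOUBLE": 8,
}


def chunk_byte_range(location, n, m, type_name):
    """Return (start, end) file offsets of a chunk; end is one past the last byte."""
    size = n * m * TYPE_SIZES[type_name]
    return location, location + size


# A made-up Nx3 float chunk with N = 1000 stored at offset 4096.
start, end = chunk_byte_range(4096, 1000, 3, "GSD_TYPE_FLOAT")
print(start, end, end - start)  # 4096 16096 12000
```

With N equal to zero the range is empty, which matches the note that an index entry may exist for a chunk with no data written.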
CHAPTER
NINE
CONTRIBUTE
GSD is an open source project. Contributions are accepted via pull request to GSD's GitHub repository. Please review
CONTRIBUTING.MD in the repository before starting development. You are encouraged to discuss your proposed
contribution with the GSD user and developer community who can help you design your contribution to fit smoothly
into the existing ecosystem.
CHAPTER
TEN
CODE STYLE
All code in GSD must follow a consistent style to ensure readability. We provide configuration files for linters (speci-
fied below) so that developers can automatically validate and format files.
10.1 Python
Python code in GSD should follow PEP8 with the formatting performed by yapf (configuration in setup.cfg).
Code should pass all flake8 tests and be formatted by yapf.
10.1.1 Tools
• Linter: flake8
– With these plugins:
* pep8-naming
* flake8-docstrings
* flake8-rst-docstrings
– Run: flake8 to see a list of linter violations.
• Autoformatter: yapf
– Run: yapf -d -r . to see needed style changes.
– Run: yapf -i file.py to apply style changes to a whole file, or use your IDE to apply yapf to a
selection.
10.1.2 Documentation
Python code should be documented with docstrings and added to the Sphinx documentation index in doc/. Docstrings
should follow Google style formatting for use in Napoleon.
10.2 C
10.2.1 Tools
• Autoformatter: clang-format.
– Run: ./run-clang-format.py -r . to see needed changes.
– Run: clang-format -i file.c to apply the changes.
• Linter: clang-tidy
– Compile GSD with CMake to see clang-tidy output.
10.2.2 Documentation
Documentation comments should be in Javadoc format and precede the item they document for compatibility with
Doxygen and most source code editors. Multi-line documentation comment blocks start with /** and single line ones
start with ///.
See gsd.h for an example.
Use your best judgment and follow existing patterns when styling CMake and other file types. The following general
guidelines apply:
• 100 character line width.
• 4 spaces per indent level.
Visual Studio Code users: Open the provided workspace file (gsd.code-workspace) which provides configura-
tion settings for these style guidelines.
ELEVEN
CREDITS
TWELVE
LICENSE
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THIRTEEN
INDEX
• genindex
• modindex