GNU datamash
GNU datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.
Examples:
Calculate the sum and mean of values 1 to 10:
    $ seq 10 | datamash sum 1 mean 1
    55 5.5
Group text file by one column and calculate mean and sample standard deviation on another, with automatic sorting and header line processing:
    $ datamash --sort --headers groupby 2 mean 3 sstdev 3 < scores_h.txt
    GroupBy(Major) mean(Score) sstdev(Score)
    Arts 68.94 10.42
    ...
File validation for pipeline automation and troubleshooting:
    $ datamash check < snp147Common.txt && echo ok || echo fail
    15189820 lines, 26 fields
    ok
    $ datamash check < tmp2.txt && echo ok || echo fail
    line 3816 (7 fields):
      chrY 9544432 9552871 NR_001534 0 - 0.5
    line 3817 (6 fields):
      chrY 9544432 9552871 NR_003592 0 -
    datamash: check failed: line 3817 has 6 fields (previous line had 7)
    fail
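The grouping and validation examples above read files (scores_h.txt, snp147Common.txt) that are not included here. The following is a minimal sketch of the same two invocations on inline data. The three-column sample (Name, Major, Score) and its values are invented for illustration, and the snippet assumes datamash is installed and on PATH.

```shell
#!/bin/sh
# Hypothetical sample data: tab-separated, with a header line.
# The names, majors and scores are made up for illustration only.
sample=$(printf 'Name\tMajor\tScore\nAnn\tArts\t70\nBob\tMath\t90\nCid\tArts\t80\n')

if ! command -v datamash >/dev/null 2>&1; then
    echo "datamash not found in PATH" >&2
elif printf '%s\n' "$sample" | datamash check >/dev/null; then
    # Every line has the same number of fields, so aggregate:
    # sort the input, treat the first line as a header,
    # group by column 2 (Major) and average column 3 (Score).
    printf '%s\n' "$sample" | datamash --sort --headers groupby 2 mean 3
else
    echo "input has inconsistent field counts" >&2
fi
```

Running the validation step first mirrors the "check ... && echo ok || echo fail" pattern shown above: the aggregation runs only when the input is structurally consistent.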
Downloading datamash
Datamash runs on a wide variety of UNIX platforms, Windows, and macOS. See the download section for more details.
Documentation and Help
- Usage Examples
- Alternative one-liners and more examples
- Online Datamash Manual
- Brief help screen: datamash --help
- Usage details and examples: man datamash
- For the complete manual in info format, run: info datamash
- Please send questions, suggestions, patches and bug reports to [email protected]
- Searchable archive of questions and discussions at: https://lists.gnu.org/archive/html/bug-datamash/ .
- Subscribe at: https://lists.gnu.org/mailman/listinfo/bug-datamash
Source Code
- Stable source releases: https://ftp.gnu.org/gnu/datamash/ .
- Development snapshots: https://alpha.gnu.org/gnu/datamash/ .
- Miscellaneous downloads (e.g. pre-compiled binaries): https://download.savannah.gnu.org/releases/datamash/ .
- View GIT Code Repository: https://git.savannah.gnu.org/gitweb/?p=datamash.git .
- Clone GIT repository: git clone git://git.sv.gnu.org/datamash.git
Development
Development of Datamash, and GNU in general, is a volunteer effort, and you can contribute. For information, please read
Datamash is currently being maintained by Assaf Gordon and Tim Rice.
Licensing
GNU Datamash is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
Maintainer
For any questions, please send email to [email protected].