datamash - GNU Project - Free Software Foundation

GNU datamash

GNU datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.

Examples:

calculate the sum and mean of values 1 to 10:

  $ seq 10 | datamash sum 1 mean 1
  55 5.5

group text file by one column and calculate
mean and sample standard deviation on another,
with automatic sorting and header line processing:

  $ datamash --sort --headers groupby 2 mean 3 sstdev 3 < scores_h.txt
  GroupBy(Major)  mean(Score) sstdev(Score)
  Arts            68.94       10.42
  ...

file validation for pipeline automation and troubleshooting:

  $ datamash check < snp147Common.txt && echo ok || echo fail
  15189820 lines, 26 fields
  ok

  $ datamash check < tmp2.txt && echo ok || echo fail
  line 3816 (7 fields):
    chrY  9544432 9552871 NR_001534 0 - 0.5
  line 3817 (6 fields):
    chrY  9544432 9552871 NR_003592 0 -
  datamash: check failed: line 3817 has 6 fields (previous line had 7)
  fail
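
Because check signals failure through its exit status, which is what the && / || idiom above relies on, it can gate later pipeline steps. A minimal sketch, assuming two hypothetical files created here for illustration (good.tsv with a consistent field count, bad.tsv with a short last line) and a hypothetical helper named validate_then_sum:

```shell
# Hypothetical inputs: good.tsv has 2 fields on every line, bad.tsv does not.
printf '1\t2\n3\t4\n' > good.tsv
printf '1\t2\n3\n'    > bad.tsv

# Hypothetical helper: sum column 1 only if the file passes validation.
validate_then_sum() {
  if datamash check < "$1" > /dev/null 2>&1; then
    datamash sum 1 < "$1"
  else
    echo "skipping $1: inconsistent field count" >&2
    return 1
  fi
}

validate_then_sum good.tsv
validate_then_sum bad.tsv || echo "bad.tsv rejected"
```

The helper discards the report printed by check and keeps only the exit status. A real pipeline would probably log that report instead, since it names the offending line.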

Downloading datamash

Datamash runs on a wide variety of UNIX platforms, Windows, and macOS.
See the download section for more details.

Documentation and Help

Usage Examples
Alternative one-liners and more examples
Online Datamash Manual
Brief help screen: datamash --help
Usage details and examples: man datamash
For the complete manual in info format run: info datamash
Please send questions, suggestions, patches and bug reports to [email protected]
Searchable archive of questions and discussions at: https://lists.gnu.org/archive/html/bug-datamash/ .
Subscribe at: https://lists.gnu.org/mailman/listinfo/bug-datamash

Source Code

Stable source releases: https://ftp.gnu.org/gnu/datamash/ (FTP).
Development snapshots: https://alpha.gnu.org/gnu/datamash/ .
Miscellaneous downloads (e.g. pre-compiled binaries): https://download.savannah.gnu.org/releases/datamash/ .
View GIT Code Repository: https://git.savannah.gnu.org/gitweb/?p=datamash.git .
Clone GIT repository: git clone git://git.sv.gnu.org/datamash.git

Development

Development of Datamash, and GNU in general, is a volunteer effort, and you can contribute. For information, please read

Savannah Project Homepage: https://savannah.gnu.org/projects/datamash/

To translate Datamash's messages into other languages, please see the Translation Project page for datamash.

Send bug reports to [email protected].

Maintainer

Datamash is currently maintained by Assaf Gordon and Tim Rice.
For any questions, please send email to [email protected].

Licensing

GNU Datamash is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This page is licensed under a Creative Commons Attribution-NoDerivs 3.0 United States License.

Updated: 2022/05/24 21:49:32