Simple Data Processing

compute is a command-line calculator

compute is a command-line program that performs simple calculations (e.g. count, sum, min, max, mean, stdev, string coalescing) on input files. A simple example is summing the values in the first column of the input:

# sum the values in the first column
$ seq 10 | compute sum 1
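
For comparison, the same calculation can be sketched with awk. This is only an illustration of what the sum operation computes, not how compute is implemented; the function name sum_first_column is hypothetical.

```shell
# Hypothetical helper (not part of compute): add up the first
# whitespace-separated field of every input line.
sum_first_column() {
    # "total + 0" forces a numeric result, so empty input prints 0
    awk '{ total += $1 } END { print total + 0 }'
}
```

With the function defined, `seq 10 | sum_first_column` prints 55.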

compute can sort and group the input, based on user specification.

Example: find the average score of college students in a statistics course, grouped by college major. The input file has three fields (Name, Major, Score):

# The input file has three columns, and a header line
$ cat scores.txt
Name        Major            Score
Bryan       Arts             68
Isaiah      Arts             80
Gabriel     Health-Medicine  100
Tysza       Business         92
Zackery     Engineering      54

# Sort the input file and group by the second column (Major),
# then calculate the mean score (third column) and sample-standard-deviation.

$ compute --sort --headers --group 2 mean 3 sstdev 3 < scores.txt
GroupBy(Major)     mean(Score)   sstdev(Score)
Arts               68.9474       10.4215
Business           87.3636       5.18214
Engineering        66.5385       19.8814
Health-Medicine    90.6154       9.22441
Life-Sciences      55.3333       20.606
Social-Sciences    60.2667       17.2273
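
To make the grouping step concrete, here is a rough awk and sort sketch of a group-by with mean and sample standard deviation. It assumes whitespace-separated input with exactly one header line; the function name group_mean_sstdev and the "nan" placeholder for single-value groups are choices made for this sketch, not compute's behaviour.

```shell
# Hypothetical helper (not part of compute): group by column 2 and print
# the mean and sample standard deviation of column 3 for each group.
group_mean_sstdev() {
    awk 'NR > 1 {                       # skip the header line
             n[$2]++
             sum[$2] += $3
             sumsq[$2] += $3 * $3
         }
         END {
             for (g in n) {
                 mean = sum[g] / n[g]
                 if (n[g] > 1) {
                     # sample variance divides by n-1
                     var = (sumsq[g] - n[g] * mean * mean) / (n[g] - 1)
                     if (var < 0) var = 0   # guard against rounding error
                     printf "%s\t%g\t%g\n", g, mean, sqrt(var)
                 } else {
                     # sample deviation is undefined for a single value
                     printf "%s\t%g\tnan\n", g, mean
                 }
             }
         }' | sort                      # awk iterates groups in no fixed order
}
```

It could be run as `group_mean_sstdev < scores.txt`. The sum-of-squares formula is short but loses precision on large values, which is one reason to prefer a dedicated tool over an ad-hoc script.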

compute is well suited to interactive exploration of textual data and to automating tasks in shell scripts. See the Examples section for real-world use cases in system administration, bioinformatics, shell scripting, and more.

compute has a rich set of statistical functions for quickly assessing the information in textual input files. An example of calculating basic statistics (mean, 1st quartile, median, 3rd quartile, IQR, sample standard deviation, and the p-value of the Jarque-Bera test for normal distribution):

$ compute -H mean 1 q1 1 median 1 q3 1 iqr 1 sstdev 1 jarque 1 < FILE.TXT
mean(x)   q1(x)  median(x)  q3(x)   iqr(x)  sstdev(x)  jarque(x)
45.32     23     37         61.5    38.5    30.4487    8.0113e-09
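
As a point of reference for one of these statistics, the median can be sketched with sort and awk. This assumes one numeric value per line in the first column and no header line; the function name median_first_column is hypothetical, and quartile definitions vary between tools, so only the median is shown.

```shell
# Hypothetical helper (not part of compute): median of the first column.
median_first_column() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit               # no input, no output
            if (NR % 2)
                print v[(NR + 1) / 2]       # odd count: middle value
            else
                print (v[NR / 2] + v[NR / 2 + 1]) / 2   # even count: mean of the two middle values
        }'
}
```

Each additional statistic needs its own sorting and bookkeeping in a script like this, whereas the compute command above requests them all in a single invocation.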

See the Statistics Examples page for more details about the statistical functions in compute.

compute is closely modelled after existing Unix utilities (e.g. sort, join, awk, sed) and integrates well with these programs. See the Manual for more details and examples.

compute is written in portable C, compiles on many platforms (Linux, Mac OS X, FreeBSD), and includes a comprehensive test suite to ensure correctness and robustness. A single pre-compiled binary can be downloaded and used on most modern systems without the need to compile from source. See the Download section for more details.


compute is developed by Assaf Gordon.

compute is a Free and Open Source program, distributed under GPLv3 or later.

compute is modelled after Aaron Quinlan's GroupBy program and other (unreleased) tools by Assaf Gordon.

This website was generated by GitHub Pages, based on the Architect theme by Jason Long.