When comparing mlr sort on a CSV file against the system sort, it's not
relevant to say which is faster by how many percent: Miller will respect the
header line, leaving it in place, while the system sort will move it, sorting
it along with all the data lines. This would be comparing the run times of two
programs that produce different outputs. Likewise, awk doesn’t respect header
lines, although you can code up some CSV handling using
if (NR==1) { ... } else { ... }. And that’s just CSV: I don’t know any simple
way to get sort, awk, etc. to handle DKVP, JSON, etc., which is the main
reason I wrote Miller.
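As a rough illustration of the header problem described above, the sketch below sorts a small CSV file two ways. The sample data, the temporary directory, and the function names naive_sort and header_aware_sort are assumptions made for this example; they are not part of Miller.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf 'name,count\npan,3\neks,1\nwye,2\n' > "$workdir/example.csv"

# Plain sort treats the header as just another line and moves it.
naive_sort() {
  LC_ALL=C sort "$1"
}

# The NR==1 idea: emit the header first, then pipe the remaining lines to sort.
# fflush() pushes the header out before the sort child process writes anything.
header_aware_sort() {
  awk '{
    if (NR == 1) { print; fflush() }
    else         { print | "LC_ALL=C sort" }
  }' "$1"
}

naive_sort "$workdir/example.csv"         # header ends up on the second line
header_aware_sort "$workdir/example.csv"  # header stays on the first line
```

With Miller, the comparable command would be mlr --csv sort -f name example.csv, which keeps the header in place without any special-casing.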
Implementations differ by platform: one awk may be fundamentally faster than
another, and mawk has a very efficient bytecode implementation, which handles
positionally indexed data far faster than Miller does.
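To make "positionally indexed" concrete, here is a minimal sketch contrasting awk's native access by column position with a name-based lookup emulated in awk. The function names by_position and by_name and the sample data are hypothetical, chosen only for illustration.

```shell
# Positional access: awk refers to the second column as $2, whatever it is called.
by_position() {
  awk -F, 'NR > 1 { print $2 }'
}

# Name-based access emulated in awk: map header names to positions on line 1,
# then look up the wanted column by name on every later line.
by_name() {
  awk -F, -v want="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    { print $(col[want]) }
  '
}

printf 'name,count\npan,3\neks,1\n' | by_position
printf 'name,count\npan,3\neks,1\n' | by_name count
```

Both calls print the same column; the second pays for an extra name-to-position lookup, which is the kind of work a name-indexed tool does for every record.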
The system sort command will, on some systems, handle too-large-for-RAM
datasets by spilling to disk; Miller (as of version 5.2.0, mid-2017) does not.
Miller sorts are always stable; GNU sort supports both stable and unstable
variants.
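The stable/unstable distinction can be seen with a small sketch, assuming a sort that supports the -s flag (GNU sort does). The function names and input lines are made up for this example.

```shell
# Default: records that tie on the key are ordered by a last-resort
# comparison of the whole line, so input order is not preserved.
sort_default() {
  LC_ALL=C sort -k1,1
}

# -s (stable): records that tie on the key keep their input order.
sort_stable() {
  LC_ALL=C sort -s -k1,1
}

# "a 9" and "a 1" tie on the first field.
printf 'b 2\na 9\na 1\n' | sort_default   # a 1, a 9, b 2
printf 'b 2\na 9\na 1\n' | sort_stable    # a 9, a 1, b 2
```

Stability matters when sorts are chained: sorting by a secondary key first and then stably by a primary key yields records ordered by both.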
Etc.
grep, sed, etc.