Swapping the rows and columns in a table on the command line would seem to be a fairly easy thing to do. For example, with this tab-separated table (called 'tabbed') as input:
get this tab-separated table ('swapped') as output:
Transposing rows and columns is indeed easy if you have the GNU datamash utility on your system:
The default separator for columns in datamash is the tab character, so if your table is comma-separated ('commas' table in the following example), you need to specify that separator with a -t option:
Note that datamash needs special instructions if there are blanks anywhere in the table — more on this later.
The mighty text-processing program AWK can also do transposing, but the only commands I've seen are far from simple. Here's a command used by 'fedorqui' in this Stack Overflow post. It puts the whole table in an array before printing the columns as rows:
AWK's default column separator is whitespace (including the tab character), so if the separator is a comma you need to specify that:
Let's build something from simple shell tools to do this transposing job. We can use the cut command to cut out the first column in the table:
and the paste command with its -s option ('single line') to make that column into a row:
We can loop through all 6 columns in the table with a 'for' loop, doing that cut-and-paste job on each:
The cut and paste commands both use the tab character as default separator, so if the table is comma-separated we need to specify that separator for each command with the -d option:
That 'for' loop is fine and good, but notice that it specifies the number of columns, 6, in the table. What if we don't know the number of columns, or can't be bothered counting them?
We can find the number of columns ('numc') by adding 1 to the number of separators with BASH arithmetic. The number can be found with grep and *wc** from the first line of the table, like this for 'commas':
The variable 'numc' can then be fed into the 'for' loop:
These two steps would go nicely in a shell script, here called 'trans'. I'll feed the name of the table to be transposed into the script as the the first argument ($1) for the 'trans' command:
#!/bin/bash numc=$(($(head -n 1 "$1" | grep -o , | wc -l)+1)) for ((i=1; i<="$numc"; i++)) do cut -d, -f"$i" "$1" | paste -s -d, done
What if the separator isn't a comma? No problem. We can feed the separator to a modified 'trans' as a second argument, $2:
#!/bin/bash numc=$(($(head -n 1 "$1" | grep -o "$2" | wc -l)+1)) for ((i=1; i<="$numc"; i++)) do cut -d "$2" -f"$i" "$1" | paste -s -d "$2" done
Note that in the improved 'trans', a tab character would be fed in as a quoted variable ($'\t'):
Three methods for transposing rows and columns, but which is fastest? To do a test, I'll use a tab-separated data table called 'table1' with 15 columns and 10004 rows. Not a huge table, but big enough to detect speed differences in the methods. Because there's whitespace within data items in 'table1', I'll need to specify a tab character as separator for AWK.
And here's the time test:
Looks like our shell script 'trans' is the winner. But is its output correct? Yes:
If there are blank entries in the table, as there are in 'table2', datamash complains:
The solution in datamash version 1.0.6 (the one I got from the Debian 8 repository) is to use the --no-strict option. This is supposed to replace each blank entry with a string, and the default string is 'N/A'. Unfortunately, this doesn't work:
This was a known bug in version 1.0.6 and is said to have been fixed in the next datamash version. Anyway, both AWK and 'trans' will handle blank entries without problems, so let's do a time test again:
And once again, AWK and 'trans' give the same output:
The shell script
I wrote 'trans' as throwaway code for transposing my own tables. It could be extended so that argument 2 specifies the input column separator, and argument 3 a different column separator for the output.
There are other ways to swap rows and columns in tables, for example using Perl, but for the time being I'm happy with the script!