The Linux Rain Linux General/Gaming News, Reviews and Tutorials

Gnumeric: a filter-and-export script

By Bob Mesibov, published 28/07/2015 in Tutorials

My wife likes to organise and store information in Gnumeric spreadsheets. Every now and then she needs to filter her data and save the result to a new spreadsheet. Gnumeric doesn't have a simple way to do that. It has a complicated way to do it ('Advanced Filter'), but nothing so straightforward as the 'Only Copy Visible Cells ' feature in LibreOffice Calc.

I wrote a shell script that makes the filtering a little easier, and that opens the filtered rows in a new Gnumeric spreadsheet, ready to be saved as a new file. The script is in copy-and-paste form at the end of this article. In what follows I show the script in action, but please note that it requires YAD dialog and the xclip utility.

Step one

My wife launches the script by clicking a panel icon on her desktop. The script puts a YAD dialog on top of the window with an open Gnumeric file. The '--on-top' YAD option ensures that the dialog box remains visible while the user interacts with the underlying spreadsheet.

The instructions in the dialog tell the user to get focus on the spreadsheet, then select all occupied rows and columns (Ctrl+Home, Ctrl+Shift+End), then copy those rows and columns to the clipboard (Ctrl+c). If the dialog is cancelled or closed, the script exits.

Step two

If 'OK' was clicked, the clipboard contents are sent to a temp file by xclip. By default, copy-pasting from a Gnumeric (or Calc, or Excel) spreadsheet gives a plain text file in which the cell contents in each row are separated by tabs.

The header line in the temp table is extracted and the tabs between column names changed to newlines. The resulting list is sent to YAD in a second dialog, this time asking the user to pick a column name.

Once again, there's a bail-out-of-the-script option if the YAD dialog is closed or cancelled. If a column-name string was chosen and 'OK' clicked, the script continues. In the image above, I've chosen the 'GENUS' column in the spreadsheet.

Step three

The script next needs to retrieve all the entries in the 'GENUS' column, so that one or more can be filtered. For this purpose the script needs the column (field) number, so an AWK command is used to find it and the column number is stored in a variable.

The chosen column is then extracted from /tmp/table, the column name stripped off and the entries sorted and 'uniq-ed' to get a list of the unique entries under 'GENUS'. The list is saved in a variable.

The list of entries is offered to the user in a third YAD dialog, and here I've allowed up to 3 choices to be made (with the '--multiple' option). In the image below, I've chosen the entries 'Cantua', 'Carpobrotus' and 'Casuarina'.

The YAD default is to output multiple choices with a trailing pipe character (|). These are put on a single line with the paste command and the final trailing pipe deleted with sed. The resulting string of entry choices is saved in a variable.

Step four

There's another bail-out-of-the-script option, but if 1, 2 or 3 entry choices were made, the individual choices are pulled out of the modified YAD output with a series of cut commands and stored in variables, one or two of which might be empty strings.

First, though, the original header line in /tmp/table is sent to the temp file /tmp/entry.txt, which is created by this redirection. The suffix '.txt' is needed here because this file is going to be opened directly by Gnumeric, and Gnumeric likes to know what MIME type it's being fed with.

All rows in /tmp/table which have the selected entry values ('Cantua', 'Carpobrotus', 'Casuarina') in the chosen column ('GENUS') are found by AWK and appended to /tmp/entry.txt, which is then opened by Gnumeric in a new window over the original spreadsheet.

Note that Gnumeric on the command line automatically opens a tab-separated text file with fields in the right places.

That's it

The Gnumeric entry.txt table can now be renamed and saved with its filtered results as a Gnumeric file, or the script can be launched again to do further filtering on entry.txt, to produce yet another entry.txt Gnumeric spreadsheet in a new top window. When entry.txt windows are closed, the script deletes the temp files and exits.

It's not a pretty hack, but it works fine for my wife's purposes and I hope it will keep her loyal to the excellent Gnumeric!

The script:


echo -e "1. Click anywhere in spreadsheet.\n2. Press Ctrl + Home.\n3. Press Ctrl + Shift + End.\n4. Press Ctrl + c.\n5. Click OK to continue, or Cancel to abort." | yad --text-info --center --on-top --width=400 --height=150

if [[ $? -ne 0 ]]; then
    exit 0
    xclip -selection clipboard -o > /tmp/table
    fieldlist=$(head -n 1 /tmp/table | tr '\t' '\n')
    field=$(echo "$fieldlist" | yad --list --column=Field --separator="" --center --width=300 --height=400  --text="Choose a field:")

    if [[ $? -ne 0 ]]; then
        rm /tmp/table && exit 0

        fieldno=$(awk -F"\t" -v FIELD="$field" 'NR==1 {for (i=1;i<=NF;++i) if ($i == FIELD) print i}' /tmp/table)
        entrylist=$(cut -f"$fieldno" /tmp/table | tail -n +2 | sort | uniq)
        entries=$(echo "$entrylist" | yad --list --multiple --column=Item --center --width=400 --height=400 --text="Choose up to 3 items:\n(Hold down Ctrl for multiple selections)" | paste -s -d "" | sed 's/|$//')

        if [[ -z $entries ]]; then
            rm /tmp/table && exit 0

            head -n 1 /tmp/table > /tmp/entry.txt
            entry1=$(cut -d "|" -f1 <(echo "$entries"))
            entry2=$(cut -d "|" -f2 <(echo "$entries"))
            entry3=$(cut -d "|" -f3 <(echo "$entries"))
            awk  -v ENTRY1="$entry1" -v ENTRY2="$entry2" -v ENTRY3="$entry3" -v NUM="$fieldno" 'BEGIN {FS=OFS="\t"} ($NUM == ENTRY1) || ($NUM == ENTRY2) || ($NUM == ENTRY3) {print}' /tmp/table >> /tmp/entry.txt
            gnumeric /tmp/entry.txt
            rm /tmp/table /tmp/entry.txt

exit 0

About the Author

Bob Mesibov is Tasmanian, retired and a keen Linux tinkerer.

Tags: tutorials scripting bash awk xclip gnumeric spreadsheets
blog comments powered by Disqus