9

What is the most efficient way to convert tab-separated data such as this:

a   b   c   d   cat
NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    d   d
NULL    NULL    c   NULL    c
NULL    NULL    c   d   c; d
NULL    b   NULL    NULL    b
NULL    b   NULL    d   b; d
NULL    b   c   NULL    b; c
NULL    b   c   d   b; c; d
a   NULL    NULL    NULL    a
a   NULL    NULL    d   a; d
a   NULL    c   NULL    a; c
a   NULL    c   d   a; c; d
a   b   NULL    NULL    a; b
a   b   NULL    d   a; b; d
a   b   c   NULL    a; b; c
a   b   c   d   a; b; c; d

into something close to this:

a    | b    | c    | d    | cat
-----+------+------+------+-----------
NULL | NULL | NULL | NULL | NULL
NULL | NULL | NULL | d    | d
NULL | NULL | c    | NULL | c
NULL | NULL | c    | d    | c; d
NULL | b    | NULL | NULL | b
NULL | b    | NULL | d    | b; d
NULL | b    | c    | NULL | b; c
NULL | b    | c    | d    | b; c; d
a    | NULL | NULL | NULL | a
a    | NULL | NULL | d    | a; d
a    | NULL | c    | NULL | a; c
a    | NULL | c    | d    | a; c; d
a    | b    | NULL | NULL | a; b
a    | b    | NULL | d    | a; b; d
a    | b    | c    | NULL | a; b; c
a    | b    | c    | d    | a; b; c; d

Currently I use Notepad++ as follows:

  1. Convert tabs to spaces
  2. Align the data manually
  3. Use column mode to insert the pipes

The second step is the most tedious one and I would rather have at least this part automated.

Note: I work in a browser and sometimes have a text editor open alongside it. By "efficient" I mean the solution that requires the least amount of effort. I can use:

  • Notepad++
  • Generic text editor with regexp find/replace support
  • JavaScript typed inside browser console
  • Online web service
  • PHP on command line (php -a)
  • What environment are you in? What tools do you have available? Which of those are you familiar with? Which ones are you willing - or unwilling - to use? How do you define "efficiency" for the purposes of this question? There are probably almost as many ways to do the job as there are people who want to do it; you need to provide additional information. See How to Ask a Good Question. Commented Dec 20, 2017 at 12:28
  • @JeffZeitlin I'll update question. Commented Dec 20, 2017 at 12:53
  • It's a simple awk script.
    – Barmar
    Commented Dec 20, 2017 at 17:30
  • @Barmar I am not using awk but I am sure someone else will find it useful. Commented Dec 20, 2017 at 20:30
  • ask a PCG question about it - lulz will ensue. Wait, it's already been asked... codegolf.stackexchange.com/questions/100613/… (note that TSV->CSV is only a single char difference... {{(⊃⍵)⍪⍉⍪↑¨↓⍉↑1↓⍵}s¨'⎕T'⎕T¨(s←1↓¨⊢⊂⍨⊢=⊃)¯1⌽⍵} seems nice enough to work on, eh?)
    – user201265
    Commented Dec 21, 2017 at 3:50

3 Answers

10

How can I convert tab separated values to an ASCII table?

I use Text Tables Generator for this kind of task.

I pasted your data on that page and it created the following table:

+------+------+------+------+------------+
| a    | b    | c    | d    | cat        |
+------+------+------+------+------------+
| NULL | NULL | NULL | NULL | NULL       |
+------+------+------+------+------------+
| NULL | NULL | NULL | d    | d          |
+------+------+------+------+------------+
| NULL | NULL | c    | NULL | c          |
+------+------+------+------+------------+
| NULL | NULL | c    | d    | c; d       |
+------+------+------+------+------------+
| NULL | b    | NULL | NULL | b          |
+------+------+------+------+------------+
| NULL | b    | NULL | d    | b; d       |
+------+------+------+------+------------+
| NULL | b    | c    | NULL | b; c       |
+------+------+------+------+------------+
| NULL | b    | c    | d    | b; c; d    |
+------+------+------+------+------------+
| a    | NULL | NULL | NULL | a          |
+------+------+------+------+------------+
| a    | NULL | NULL | d    | a; d       |
+------+------+------+------+------------+
| a    | NULL | c    | NULL | a; c       |
+------+------+------+------+------------+
| a    | NULL | c    | d    | a; c; d    |
+------+------+------+------+------------+
| a    | b    | NULL | NULL | a; b       |
+------+------+------+------+------------+
| a    | b    | NULL | d    | a; b; d    |
+------+------+------+------+------------+
| a    | b    | c    | NULL | a; b; c    |
+------+------+------+------+------------+
| a    | b    | c    | d    | a; b; c; d |
+------+------+------+------+------------+

You can then copy this output (the generator has done most of the hard work), paste it into Notepad++ and clean it up as needed.

5

If you need a command-line solution, you can also use pandoc with the pandoc-placetable filter.

Place your table in foo.txt and execute:

pandoc-placetable --file=foo.txt --delimiter="\t" --header | pandoc -f json -t markdown-simple_tables-multiline_tables -o output.md

This produces the following output.md:

| a    | b    | c    | d    | cat        |
|------|------|------|------|------------|
| NULL | NULL | NULL | NULL | NULL       |
| NULL | NULL | NULL | d    | d          |
| NULL | NULL | c    | NULL | c          |
| NULL | NULL | c    | d    | c; d       |
| NULL | b    | NULL | NULL | b          |
| NULL | b    | NULL | d    | b; d       |
| NULL | b    | c    | NULL | b; c       |
| NULL | b    | c    | d    | b; c; d    |
| a    | NULL | NULL | NULL | a          |
| a    | NULL | NULL | d    | a; d       |
| a    | NULL | c    | NULL | a; c       |
| a    | NULL | c    | d    | a; c; d    |
| a    | b    | NULL | NULL | a; b       |
| a    | b    | NULL | d    | a; b; d    |
| a    | b    | c    | NULL | a; b; c    |
| a    | b    | c    | d    | a; b; c; d |

To read from STDIN, leave out the --file argument. To print to STDOUT, leave out the -o argument.

4

Ruslan’s idea of using the Unix/Linux column command is a good one, but the command line given in their answer doesn’t quite work. First of all, column doesn’t recognize \t (or \\t) on the command line as a tab. If you have bash, you can write the tab with ANSI-C quoting:

column -t -s$'\t' foo.txt

Otherwise, you can do

column -t -s"$(printf '\t')" foo.txt

But even that doesn’t answer the question, because it only aligns the columns. You can get the vertical bars by also setting the output separator with -o:

column -t -s$'\t' -o' | ' foo.txt

which produces output like

a    | b    | c    | d    | cat
NULL | NULL | NULL | NULL | NULL
NULL | NULL | NULL | d    | d
NULL | NULL | c    | NULL | c
NULL | NULL | c    | d    | c; d
NULL | b    | NULL | NULL | b
NULL | b    | NULL | d    | b; d
NULL | b    | c    | NULL | b; c
NULL | b    | c    | d    | b; c; d
a    | NULL | NULL | NULL | a
a    | NULL | NULL | d    | a; d
a    | NULL | c    | NULL | a; c
a    | NULL | c    | d    | a; c; d
a    | b    | NULL | NULL | a; b
a    | b    | NULL | d    | a; b; d
a    | b    | c    | NULL | a; b; c
a    | b    | c    | d    | a; b; c; d

Adding the dash line after the header manually isn’t so tedious.
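
If you would rather not add that line by hand, the step can be sketched as a small filter. This is only a sketch, assuming plain ASCII input and a POSIX awk; the function name add_rule is made up for illustration. It copies the header line, pads it to the width of the longest line, then turns every | into + and every other character into -.

```shell
# add_rule: hypothetical helper, not part of column or any other tool.
# Reads an aligned, pipe-separated table on stdin and prints it with a
# dash line inserted after the first (header) line.
add_rule() {
    awk '
    {
        line[NR] = $0                        # buffer every line
        if (length($0) > width) width = length($0)
    }
    END {
        if (NR == 0) exit                    # nothing to do on empty input
        rule = line[1]
        while (length(rule) < width)         # pad header to the widest line
            rule = rule " "
        gsub(/[^|]/, "-", rule)              # non-pipe characters become dashes
        gsub(/[|]/, "+", rule)               # pipes become plus signs
        print line[1]
        print rule
        for (i = 2; i <= NR; i++) print line[i]
    }'
}
```

With the function defined in the current shell, it can be appended to the command above, for example: column -t -s$'\t' -o' | ' foo.txt | add_rule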


If you don’t have access to a full Unix/Linux system, you can use Cygwin or one of the other Unix-likes for this.
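
As far as I know the -o option is specific to the util-linux version of column, so it may be missing on other systems. In that case the alignment itself can be approximated with awk alone. The following is a rough sketch, assuming ASCII input and a POSIX awk; the name tsv_to_table is hypothetical. It reads the whole input, records the widest cell of each column, then prints every row with the cells padded and joined by " | ". The last column is left unpadded, as in the output above.

```shell
# tsv_to_table: hypothetical helper; reads tab-separated text on stdin
# and prints it as left-aligned columns separated by " | ".
tsv_to_table() {
    awk -F '\t' '
    {
        if (NF > ncols) ncols = NF
        for (i = 1; i <= NF; i++) {
            cell[NR, i] = $i                 # remember every cell
            if (length($i) > width[i]) width[i] = length($i)
        }
    }
    END {
        for (r = 1; r <= NR; r++) {
            out = ""
            for (i = 1; i <= ncols; i++) {
                if (i < ncols)               # pad all but the last column
                    out = out sprintf("%-" width[i] "s | ", cell[r, i])
                else
                    out = out cell[r, i]
            }
            print out
        }
    }'
}
```

Usage would be: tsv_to_table < foo.txt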

  • You didn't even comment on my answer to point out that it might not work. I was misled by the terminal output which aligned the text due to tabs being 8 chars by default (unlike my Vim set ts=4 setting).
    – Ruslan
    Commented Jan 6, 2018 at 18:36
