YAML indentation for array in hash

Question

I think indentation is important in YAML.

I tested the following in irb:

> puts({1=>[1,2,3]}.to_yaml)
--- 
1: 
- 1
- 2
- 3
 => nil

I expected something like this:

> puts({1=>[1,2,3]}.to_yaml)
--- 
1: 
  - 1
  - 2
  - 3
 => nil

Why isn't there indentation for the array?

I found this at http://www.yaml.org/YAML_for_ruby.html#collections.

The dash in a sequence counts as indentation, so you can add a sequence inside of a mapping without needing spaces as indentation.

apparently it does not need indentation when mapping a scalar to a sequence. — akonsu, Commented Jun 9, 2013 at 21:54
Both are valid. I agree with you that they should not be. Even The Official YAML Web Site has both... yaml.org — nroose, Commented Mar 20, 2019 at 22:38

Victor Schröder · Accepted Answer · 2022-12-09 02:44:30Z

The short answer is that both are valid because they are unambiguous for the YAML parser. The other answers already pointed this out, but allow me to add some fuel to this discussion.

YAML uses indentation not only for aesthetics or readability; it also carries crucial meaning when composing and nesting data structures:

# YAML:         # JSON equivalent:
---             # {
one:            #   "one": {
  two:          #     "two": null,
  three:        #     "three": null
                #   }
                # }
                
---             # {
one:            #   "one": {
  two:          #     "two": {
    three:      #       "three": null
                #     }
                #   }
                # }

As we can see, the simple addition of an indentation level before three changes its nesting level and removes the previous null value assignment we had for two.
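A quick way to check this in Ruby, as a minimal sketch using the standard `yaml` library (the variable names `siblings` and `nested` are just for illustration):

```ruby
require 'yaml'

# "three" at the same indentation as "two": both are keys of "one".
siblings = YAML.load(<<~DOC)
  one:
    two:
    three:
DOC

# "three" indented one level deeper: it becomes a key of "two".
nested = YAML.load(<<~DOC)
  one:
    two:
      three:
DOC

# Compare against the structures shown in the JSON columns above.
puts(siblings == { "one" => { "two" => nil, "three" => nil } })
puts(nested == { "one" => { "two" => { "three" => nil } } })
```

Both comparisons should print `true`: the only difference between the two inputs is two spaces in front of `three`.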

Lists, however, do not follow this rule consistently. They tolerate dropping the level of indentation that we would naturally expect (as the OP did) to reflect the nesting level of the items, and both forms below parse to the same structure:

# YAML:         # JSON equivalent:
---             #
one:            #
  two:          #
    - foo       # {            
    - bar       #   "one": {   
                #     "two": [ 
                #       "foo", 
                #       "bar"  
                #     ]        
---             #   }          
one:            # }            
  two:          #
  - foo         #
  - bar         #

The second form above is somewhat unexpected and breaks with the idea that the indentation level is connected to the nesting level: the key two and the items of its list are clearly written at the same indentation, yet they sit at different nesting levels.

What is even worse, it doesn't always work, but only when the list is placed immediately under an object key. Lists nested inside other lists can't freely drop a level of indentation because doing so would, obviously, move the nested elements into the parent list:

# YAML:         # JSON equivalent:
---             # {
one:            #   "one": {
  two:          #     "two": [
    -           #       null,
    -           #       [
      -         #         null,
      -         #         null
                #       ]
                #     ]
                #   }
                # }
                #
---             # {           
one:            #   "one": {  
  two:          #     "two": [
    -           #       null, 
    -           #       null, 
    -           #       null, 
    -           #       null  
                #     ]       
                #   }         
                # }
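The same two documents can be loaded in Ruby to see the difference. This is only a sketch with the standard `yaml` library; the names `nested` and `flattened` are mine:

```ruby
require 'yaml'

# The last two dashes are indented deeper than the second one,
# so they form a list nested inside the second entry.
nested = YAML.load(<<~DOC)
  one:
    two:
      -
      -
        -
        -
DOC

# With that indentation dropped, all four dashes belong to the same list.
flattened = YAML.load(<<~DOC)
  one:
    two:
      -
      -
      -
      -
DOC

puts(nested["one"]["two"] == [nil, [nil, nil]])
puts(flattened["one"]["two"] == [nil, nil, nil, nil])
```

Here the missing indentation is not tolerated as a shorthand; it silently produces a different structure.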

I know, I know... Don't even start and say that the example above is a bit extreme and could be considered an edge case. Both are perfectly valid data structures and prove my point. Things get more complicated when mixing objects and nested lists of objects, especially if the objects have a single key. Not only may this lead to errors in the data structure declaration, it also becomes extremely hard to read.

The following two YAML documents are equivalent:

# YAML:             # JSON equivalent
---                 # 
one:                # {
  two:              #   "one": {
  - three: foo      #     "two": [
  - bar             #       {"three": "foo"},
  - four:           #       "bar",
    - baz           #       {
    five:           #         "four": ["baz"],
    - fizz          #         "five": ["fizz", "buzz"],
    - buzz          #         "six": null
    six:            #       }
  seven:            #     ],
                    #     "seven": null
---                 #   }
one:                # }
  two:              #      
    - three: foo    # 
    - bar           #
    - four:         #
        - baz       #
      five:         #
        - fizz      #
        - buzz      #
      six:          #
  seven:            #

I don't know about you, but I find the second one much easier to read and follow, especially in a very large document. It's very easy to get lost in the first one, especially when the beginning of a given object declaration is no longer visible. There is simply no clear connection between the indentation level and the nesting level.
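To confirm that the two layouts above really describe the same data, they can be loaded and compared in Ruby. This is a rough sketch using the standard `yaml` and `json` libraries, with `compact` and `indented` as illustrative names:

```ruby
require 'yaml'
require 'json'

# First layout: list dashes at the same indentation as the parent key.
compact = YAML.load(<<~DOC)
  one:
    two:
    - three: foo
    - bar
    - four:
      - baz
      five:
      - fizz
      - buzz
      six:
    seven:
DOC

# Second layout: list dashes indented one level below the parent key.
indented = YAML.load(<<~DOC)
  one:
    two:
      - three: foo
      - bar
      - four:
          - baz
        five:
          - fizz
          - buzz
        six:
    seven:
DOC

# Both layouts should load to the same nested structure.
puts(compact == indented)

# Print the shared structure as JSON for comparison with the table above.
puts JSON.pretty_generate(indented)
```

The comparison is expected to print `true`, so the choice between the two layouts is purely about readability.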

Keeping the indentation level consistently tied to the nesting level greatly improves readability. The fact that dropping an indentation level is optional for lists, and only sometimes allowed, is something you have to be very careful about.

It would be really helpful if you would add the JSON equivalent for the last example in the same way that you did for all the other examples. But yes I agree YAML sucks. — Neutrino, Commented Jul 14, 2022 at 8:16
Actually, I feel exactly the opposite. Take a look at the last example in your answer, especially after syntax highlighting is applied - the distance between "two" and "three" is double of that between "one" and "two", somehow making "three" feel like two levels down rather than one. It's like mixing 2-space indentation with 4-space. The idea is that the actual word should be indented (or aligned) rather than the dashes themselves. - is two characters wide, so are two spaces. — zypA13510, Commented Jul 24 at 5:58
@zypA13510 "somehow making "three" feel like two levels down rather than one." That's because IT IS two levels down: three is an element of an object that is in two's list. — Luke Nelson, Commented Jul 30 at 10:28
@LukeNelson No, it is not. It doesn't matter if "three" is an object. You can take the "bar" on the next line as an example; it is just one level below "two", yet there are four characters separating them. — zypA13510, Commented Aug 4 at 7:13

Darshan Rivka Whittle · Accepted Answer · 2013-06-09 22:54:54Z


Both ways are valid, as far as I can tell:

require 'yaml'

YAML.load(%q{--- 
1:
- 1
- 2
- 3
})
# => {1=>[1, 2, 3]}

YAML.load(%q{--- 
1:
  - 1
  - 2
  - 3
})
# => {1=>[1, 2, 3]}

It's not clear why you think there should be spaces before the hyphens. If you think this is a violation of the spec, please explain how.

Why isn't there indentation for the array?

There's no need for indentation before the hyphens, and it's simpler not to add any.


while there is no need for spaces I find it to be more readable – random-forest-cat, Commented Apr 5, 2015 at 23:49
quite the opposite, when you have objects like kubernetes specs, the more indentations - the less readable it is, due to extra whitespace\scrolling\wrapping – 4c74356b41, Commented Dec 4, 2020 at 13:06
I dare to disagree. The suppression of an expected level of indentation for lists makes the document much harder to read, mainly for big files such as k8s specs as you mentioned. Keeping the indentation in sync with nesting level is gold. – Victor Schröder, Commented Apr 22, 2022 at 12:24


Narfanator · Accepted Answer · 2024-09-30 21:45:43Z


It's so you can do:

1: 
- 2: 3
  4: 5
- 6: 7
  8: 9
- 10
=> {1 => [{2 => 3, 4 => 5}, {6 => 7, 8 => 9}, 10]}

Basically, dashes delimit objects, and indentation denotes the "value" of the key-value pair.

That's the best I can do; I haven't managed to find any of the reasons behind this or that aspect of the syntax.
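For what it's worth, the example can be checked in Ruby with the standard `yaml` library. This is just a sketch (the names `flush` and `indented` are made up here), and it also loads a variant with the dashes indented under the key:

```ruby
require 'yaml'

# Dashes flush with the key "1", as in the example above.
flush = YAML.load(<<~DOC)
  1:
  - 2: 3
    4: 5
  - 6: 7
    8: 9
  - 10
DOC

# Same data with every line below the key indented by two spaces.
indented = YAML.load(<<~DOC)
  1:
    - 2: 3
      4: 5
    - 6: 7
      8: 9
    - 10
DOC

expected = { 1 => [{ 2 => 3, 4 => 5 }, { 6 => 7, 8 => 9 }, 10] }
puts(flush == expected)
puts(indented == expected)
```

Both lines should print `true`: each dash starts a new list item, and the keys aligned under the first key of an item belong to the same hash.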

edited Sep 30 at 21:45

answered Jun 9, 2013 at 22:21


hm but you can do that anyway ... the equivalent (indenting all lines bar the top by 2 spaces) is the same result – Nick, Commented Aug 14, 2018 at 10:57
Unfortunately, this is not compatible with the Perl 5.18 (the version I am bound to) built-in YAML parser. Without indentation, I get "YAML Error: Invalid element in map". I am not sure if newer versions of Perl have adapted to this apparently legal syntax. – Myles Prather, Commented Oct 31, 2019 at 17:20
Correction: If I 'use YAML::Syck;' in Perl, I am able to read Ruby's default flavor of YAML. The best thing about standards is that there are so many to choose from :). – Myles Prather, Commented Oct 31, 2019 at 17:47

