• BASH > json

       

      $ cat json
      [
        {
          "item1": "value1",
          "item2": "value2",
          "sub items": [
            {
              "subitem": "subvalue"
            }
          ]
        },
        {
          "item1": "value1_2",
          "item2": "value2_2",
          "sub items_2": [
            {
              "subitem_2": "subvalue_2"
            }
          ]
        }
      ]
      arr=( $(jq -r '.[].item2' json) )
      printf '%s\n' "${arr[@]}"
      value2
      value2_2
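
The unquoted `$(...)` above only works because none of the values contain whitespace or glob characters. As a minimal sketch, assuming bash 4 or later (for readarray) and a hypothetical inline sample rather than the file above, this shows how word splitting breaks a value apart. Line-based reading avoids that, as long as no value contains a newline.

```shell
#!/bin/bash
# Hypothetical sample data: one value contains a space.
json='[{"item2":"two words"},{"item2":"value2_2"}]'

# Unquoted command substitution splits on whitespace (and expands globs).
split=( $(jq -r '.[].item2' <<<"$json") )
echo "word-split: ${#split[@]} elements"

# readarray -t keeps one array element per output line.
readarray -t lines < <(jq -r '.[].item2' <<<"$json")
echo "line-based: ${#lines[@]} elements"
printf '<%s>\n' "${lines[@]}"
```

Values that may contain newlines need one of the stronger approaches discussed next.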

       

      To handle values that may contain whitespace or newlines, you’ve got two choices: either use "\u0000" (NULs) as separators, or use @sh.

      Note that bash’s readarray (since bash 4.4) supports readarray -td '' to read NUL-delimited data into an array.

      The @sh approach is the only one so far that handles arbitrary values. jq even happens to support non-text values in its strings, as an extension over standard JSON. It seems to use the safest form of quoting ('...'). Note that it transforms a NUL byte to \0 and quotes neither numbers nor false/true, though that should be fine. As usual, note that jq may transform numbers (for example, changing 1e2 to 100 or infinity to 1.7976931348623157e+308).

      If unquoted numbers were a problem, one could pass the data through tostring before letting @sh format it.
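
A minimal sketch of that suggestion, assuming a hypothetical input that mixes a number, a boolean and a string. Piping each element through tostring first means @sh only ever sees strings, so every element ends up single-quoted.

```shell
#!/bin/bash
# Hypothetical sample data mixing types.
json='[42, true, "a b"]'

# Without tostring, @sh leaves numbers and booleans unquoted.
plain=$(jq -r '.[] | @sh' <<<"$json")
# With tostring, every element becomes a string and gets single quotes.
quoted=$(jq -r '.[] | tostring | @sh' <<<"$json")

printf 'plain:\n%s\n\nquoted:\n%s\n\n' "$plain" "$quoted"

# The quoted form can be passed to eval, as in the script below.
eval "arr=( $quoted )"
printf '<%s>\n' "${arr[@]}"
```

The cost is that type information is lost: once in the shell array, 42 and "42" are indistinguishable.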

      I don’t expect unquoted numbers to be a problem here. There are contexts where not quoting numbers can be an issue (as in echo '2'>file vs echo 2>file), but not here, I’d say.

      #!/bin/bash
      v1='a\nb  \n'
      v2='c'\''\"d  '   # v2 will contain <c'\"d  >
      printf '$v1=<%s>\n$v2=<%s>\n\n' "$v1" "$v2"
      
      >json printf "%s\n" "[ \"$v1\", \"$v2\" ]"
      printf 'JSON data: '; cat json
      printf '\n'
      
      eval "arr=( $( cat json | jq -r '.[] | @sh ' ) )"
      printf '$arr[0]:<%s>\n$arr[1]:<%s>\n\n' "${arr[@]}"
      
      set --
      eval "set -- $( cat json | jq -r '[.[]] | @sh ' )"
      printf '$1:<%s>\n$2:<%s>\n\n' "$1" "$2"
      
      { readarray -td '' arr2 && wait "$!"; } < <(
         cat json | jq -j '.[] | (., "\u0000") '
      )
      printf 'rc=%s\n$arr[0]:<%s>\n$arr[1]:<%s>\n\n' "$?" "${arr2[@]}"
      
      { readarray -td '' arr3 && wait "$!"; } < <(
         { echo x; cat json; } | jq -j '.[] | (., "\u0000") '
      )
      printf 'rc=%s\n' "$?"

      Output:

      $v1=<a\nb  \n>
      $v2=<c'\"d  >
      
      JSON data: [ "a\nb  \n", "c'\"d  " ]
      
      $arr[0]:<a
      b
      >
      $arr[1]:<c'"d  >
      
      $1:<a
      b
      >
      $2:<c'"d  >
      
      rc=0
      $arr[0]:<a
      b
      >
      $arr[1]:<c'"d  >
      
      parse error: Invalid numeric literal at line 2, column 0
      rc=4

      The same extraction with jtc:

      bash $ arr=( $(jtc -w '<item2>l+0' file.json) )
      bash $ printf '%s\n' "${arr[@]}"
      "value2"
      "value2_2"
      

      Explanation of the -w option: angle brackets <...> specify a search over the entire JSON, the suffix l instructs jtc to search labels rather than values, and +0 instructs it to find all occurrences (rather than just the first one).

      LINKS

       

      https://unix.stackexchange.com/questions/tagged/jq

       

       

       

      Tutorial

       

      GitHub has a JSON API, so let’s play with that. This URL gets us the last 5 commits from the jq repo.

      curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5'

      GitHub returns nicely formatted JSON. For servers that don’t, it can be helpful to pipe the response through jq to pretty-print it. The simplest jq program is the expression ., which takes the input and produces it unchanged as output.

      curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | jq '.'

      We can use jq to extract just the first commit.

      curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | jq '.[0]'

      For the rest of the examples, I’ll leave out the curl command - it’s not going to change.

      There’s a lot of info we don’t care about there, so we’ll restrict it down to the most interesting fields.

      jq '.[0] | {message: .commit.message, name: .commit.committer.name}'

      The | operator in jq feeds the output of one filter (.[0] which gets the first element of the array in the response) into the input of another ({...} which builds an object out of those fields). You can access nested attributes, such as .commit.message.
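
To try this without calling the API, here is a minimal sketch that uses a hypothetical, trimmed-down stand-in for the GitHub response (the messages and names are invented). It shows .[0] selecting the first element and | feeding it to the object constructor.

```shell
#!/bin/bash
# Hypothetical, trimmed-down stand-in for the GitHub API response.
commits='[
  {"commit": {"message": "first",  "committer": {"name": "Alice"}}},
  {"commit": {"message": "second", "committer": {"name": "Bob"}}}
]'

# .[0] picks the first array element; | passes it to {...},
# which reads the nested .commit.message and .commit.committer.name.
first=$(jq -c '.[0] | {message: .commit.message, name: .commit.committer.name}' <<<"$commits")
echo "$first"
```

The -c flag only makes the output compact; the filter is the same as the one above.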

      Now let’s get the rest of the commits.

      jq '.[] | {message: .commit.message, name: .commit.committer.name}'

      .[] returns each element of the array returned in the response, one at a time, which are all fed into {message: .commit.message, name: .commit.committer.name}.

      Data in jq is represented as streams of JSON values - every jq expression runs for each value in its input stream, and can produce any number of values to its output stream.

      Streams are serialised by just separating JSON values with whitespace. This is a cat-friendly format - you can just join two JSON streams together and get a valid JSON stream.
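
A minimal sketch of that property, assuming two hypothetical streams held in shell variables. Joining them the way cat would yields one valid stream, and the filter runs once per value.

```shell
#!/bin/bash
# Two separate JSON streams (hypothetical sample values).
stream_a='{"n": 1} {"n": 2}'
stream_b='{"n": 3}'

# jq reads the concatenation as a single stream of three values
# and applies the filter .n to each one in turn.
result=$(printf '%s\n%s\n' "$stream_a" "$stream_b" | jq -c '.n')
echo "$result"
```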

      If you want to get the output as a single array, you can tell jq to "collect" all of the answers by wrapping the filter in square brackets:

      jq '[.[] | {message: .commit.message, name: .commit.committer.name}]'


      Next, let’s try getting the URLs of the parent commits out of the API results as well. In each commit, the GitHub API includes information about "parent" commits. There can be one or many.

      "parents": [
        {
          "sha": "54b9c9bdb225af5d886466d72f47eafc51acb4f7",
          "url": "https://api.github.com/repos/stedolan/jq/commits/54b9c9bdb225af5d886466d72f47eafc51acb4f7",
          "html_url": "https://github.com/stedolan/jq/commit/54b9c9bdb225af5d886466d72f47eafc51acb4f7"
        },
        {
          "sha": "8b1b503609c161fea4b003a7179b3fbb2dd4345a",
          "url": "https://api.github.com/repos/stedolan/jq/commits/8b1b503609c161fea4b003a7179b3fbb2dd4345a",
          "html_url": "https://github.com/stedolan/jq/commit/8b1b503609c161fea4b003a7179b3fbb2dd4345a"
        }
      ]
      

      We want to pull out all of the "html_url" fields inside that array of parent commits and make a simple list of strings to go along with the "message" and "name" fields we already have.

      jq '[.[] | {message: .commit.message, name: .commit.committer.name, parents: [.parents[].html_url]}]'

      Here we’re making an object as before, but this time the parents field is being set to [.parents[].html_url], which collects all of the parent commit URLs defined in the parents array.
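
A minimal offline sketch of the same filter, run against a hypothetical one-commit sample with placeholder URLs (example.com stands in for the real GitHub links). It shows the inner [...] collecting the parent URLs into an array of strings.

```shell
#!/bin/bash
# Hypothetical sample: one commit with two parents, placeholder URLs.
commits='[
  {
    "commit": {"message": "merge", "committer": {"name": "Alice"}},
    "parents": [
      {"html_url": "https://example.com/commit/aaa"},
      {"html_url": "https://example.com/commit/bbb"}
    ]
  }
]'

# The outer [...] collects one object per commit; the inner [...]
# collects every parent html_url of that commit.
out=$(jq -c '[.[] | {message: .commit.message, name: .commit.committer.name, parents: [.parents[].html_url]}]' <<<"$commits")
echo "$out"
```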


      Here endeth the tutorial! There’s lots more to play with. Go read the manual if you’re interested, and download jq if you haven’t already.

       

 
