I’m a complete noob when it comes to bash. I extracted some timestamps from an XML file using xmlstarlet, but the result is a space-separated string rather than an array. I need them as an array so that I can use them in a for loop. As far as I can tell you can’t do that with xmlstarlet, so I need to convert the string.
I found this thread, which says I can use IFS=' ' read -a arr <<< "$line". That works for a basic string but not for the one I’m trying to use it with:
2023-06-19T00:00:00+01:00 2023-06-18T00:00:00+01:00 2023-06-17T00:00:00+01:00 2023-06-16T00:00:00+01:00 2023-06-15T00:00:00+01:00 2023-06-14T00:00:00+01:00 2023-06-13T00:00:00+01:00 2023-06-10T00:00:00+01:00 2023-06-03T00:00:00+01:00 2023-05-31T00:00:00+01:00 2023-05-27T00:00:00+01:00
If I use the command on this string it only puts the first part in the array so I get 2023-06-19T00:00:00+01:00
I also tried arr=( $line ) which is also suggested in the thread but that does the same thing. Is there another way I can try to convert this, or a way to export from xmlstarlet straight to an array?
Can’t reproduce your issue:
$ line='2023-06-19T00:00:00+01:00 2023-06-18T00:00:00+01:00 2023-06-17T00:00:00+01:00 2023-06-16T00:00:00+01:00 2023-06-15T00:00:00+01:00 2023-06-14T00:00:00+01:00 2023-06-13T00:00:00+01:00 2023-06-10T00:00:00+01:00 2023-06-03T00:00:00+01:00 2023-05-31T00:00:00+01:00 2023-05-27T00:00:00+01:00'
$ arr=( $line )
$ echo "${arr[4]}"
2023-06-15T00:00:00+01:00
Are you sure it’s not working? Does echo ${arr[@]} return the whole array? Does echo ${arr[1]} return the second value? I think just echoing $arr without specifying an index returns only the first value.

Using Bash for this when source data is XML is simply wrong. It’s not the right tool for the job. You should consider learning basics of Python, which you could use to work with the source data.
Unless you already know how to do that in Python, and are simply interested in figuring out how to do that in Bash, in which case godspeed to you.
While I do agree that bash isn’t really the best tool for the job when working with XML data, there is a simple enough solution.
read -r -a arraytest <<< "2023-06-19T00:00:00+01:00 2023-06-18T00:00:00+01:00 2023-06-17T00:00:00+01:00 2023-06-16T00:00:00+01:00 2023-06-15T00:00:00+01:00 2023-06-14T00:00:00+01:00 2023-06-13T00:00:00+01:00 2023-06-10T00:00:00+01:00 2023-06-03T00:00:00+01:00 2023-05-31T00:00:00+01:00 2023-05-27T00:00:00+01:00"
$ for ds in "${arraytest[@]}"; do echo "$ds"; done;
2023-06-19T00:00:00+01:00
2023-06-18T00:00:00+01:00
2023-06-17T00:00:00+01:00
2023-06-16T00:00:00+01:00
2023-06-15T00:00:00+01:00
2023-06-14T00:00:00+01:00
2023-06-13T00:00:00+01:00
2023-06-10T00:00:00+01:00
2023-06-03T00:00:00+01:00
2023-05-31T00:00:00+01:00
2023-05-27T00:00:00+01:00
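One possible explanation for the "only the first part" symptom, as suggested earlier in the thread, is how the array is being inspected rather than how it is built: in bash, a bare $arr expands to element 0 only. The sketch below is just an illustration with three of the timestamps from the question; the variable names first_only, count and all are made up for the demo.

```bash
#!/usr/bin/env bash
# Sample input: three of the timestamps from the question.
line='2023-06-19T00:00:00+01:00 2023-06-18T00:00:00+01:00 2023-06-17T00:00:00+01:00'

# Split the string on whitespace into an array.
read -r -a arr <<< "$line"

first_only="$arr"      # same as ${arr[0]}: the first element only
count="${#arr[@]}"     # number of elements in the array
all="${arr[*]}"        # every element, joined with spaces

echo "first element: $first_only"
echo "element count: $count"
echo "all elements:  $all"

# declare -p prints the whole array together with its indices.
declare -p arr
```

If declare -p shows every timestamp at its own index, the split worked and only the way of printing it was misleading.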
I need them as an array so that I can use them in a for loop
I don’t think you need an array then. Arrays are very cool, but in general a bash for loop over an unquoted command substitution is already split on the default IFS (space, tab and newline):
$ your_xmlstarlet_command
2023-06-19T00:00:00+01:00 2023-06-18T00:00:00+01:00 2023-06-17T00:00:00+01:00 2023-06-16T00:00:00+01:00 2023-06-15T00:00:00+01:00 2023-06-14T00:00:00+01:00 2023-06-13T00:00:00+01:00 2023-06-10T00:00:00+01:00 2023-06-03T00:00:00+01:00 2023-05-31T00:00:00+01:00 2023-05-27T00:00:00+01:00
$ for D in `your_xmlstarlet_command`; do echo "line begin - $D - line end";done
line begin - 2023-06-19T00:00:00+01:00 - line end
line begin - 2023-06-18T00:00:00+01:00 - line end
line begin - 2023-06-17T00:00:00+01:00 - line end
line begin - 2023-06-16T00:00:00+01:00 - line end
line begin - 2023-06-15T00:00:00+01:00 - line end
line begin - 2023-06-14T00:00:00+01:00 - line end
line begin - 2023-06-13T00:00:00+01:00 - line end
line begin - 2023-06-10T00:00:00+01:00 - line end
line begin - 2023-06-03T00:00:00+01:00 - line end
line begin - 2023-05-31T00:00:00+01:00 - line end
line begin - 2023-05-27T00:00:00+01:00 - line end
If you really need an array:
$ arr=(`your_xmlstarlet_command`)
$ echo "${arr[0]}"
2023-06-19T00:00:00+01:00
$ echo "${arr[1]}"
2023-06-18T00:00:00+01:00
$ for D in "${arr[@]}"; do echo "begin - $D - end"; done
begin - 2023-06-19T00:00:00+01:00 - end
begin - 2023-06-18T00:00:00+01:00 - end
begin - 2023-06-17T00:00:00+01:00 - end
begin - 2023-06-16T00:00:00+01:00 - end
begin - 2023-06-15T00:00:00+01:00 - end
begin - 2023-06-14T00:00:00+01:00 - end
begin - 2023-06-13T00:00:00+01:00 - end
begin - 2023-06-10T00:00:00+01:00 - end
begin - 2023-06-03T00:00:00+01:00 - end
begin - 2023-05-31T00:00:00+01:00 - end
begin - 2023-05-27T00:00:00+01:00 - end
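If the real command prints one value per line instead of one space-separated line (xmlstarlet sel can do that with its -n option, as far as I know), read -a will only pick up the first line, which would also produce a one-element array. A rough sketch of an alternative for that case uses mapfile, which needs bash 4 or later. Note that get_timestamps is a hypothetical stand-in for the actual xmlstarlet call, not something from the original post.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the real xmlstarlet command;
# it prints one timestamp per line.
get_timestamps() {
    printf '%s\n' \
        '2023-06-19T00:00:00+01:00' \
        '2023-06-18T00:00:00+01:00' \
        '2023-06-17T00:00:00+01:00'
}

# mapfile (bash 4+) stores each input line in its own array element;
# -t strips the trailing newline from every element.
mapfile -t arr < <(get_timestamps)

# For comparison: read -a consumes only the first line of its input.
read -r -a first_line < <(get_timestamps)

echo "mapfile: ${#arr[@]} elements"
echo "read -a: ${#first_line[@]} element(s)"

for ts in "${arr[@]}"; do
    echo "begin - $ts - end"
done
```

Because nothing is expanded unquoted here, the values are not subject to globbing or to whatever IFS happens to be set to.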
And just for completeness: when you change IFS, it is a good idea to save it in OLDIFS and restore it after the loop. IFS is a special variable, and changing it changes how bash splits words on the characters it contains.
OLDIFS=$IFS
IFS=$'\n' # when you want to pass a list separated with newlines instead of spaces
for ELEMENT in <list of elements separated with \n>; do <your commands>; done
IFS=$OLDIFS
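The save-and-restore pattern above can be sketched as a runnable script. The sample list (including the deliberate "entry with spaces") and the helper split_lines are invented for illustration. The second half shows one alternative, assuming the splitting can live in a function: declaring IFS with local means bash restores it automatically when the function returns.

```bash
#!/usr/bin/env bash
# Sample newline-separated list; the second entry contains spaces on purpose.
list=$'2023-06-19T00:00:00+01:00\nentry with spaces\n2023-06-17T00:00:00+01:00'

# Pattern from the post: save IFS, change it, loop, restore it.
OLDIFS=$IFS
IFS=$'\n'                # split on newlines only
items=()
for element in $list; do # unquoted on purpose so that IFS splitting applies
    items+=("$element")
done
IFS=$OLDIFS

# Alternative (hypothetical helper): keep the IFS change local to a function.
split_lines() {
    local IFS=$'\n'      # restored automatically when the function returns
    scoped=( $1 )
}
split_lines "$list"

echo "loop:     ${#items[@]} items, second is '${items[1]}'"
echo "function: ${#scoped[@]} items"
```

Either way, the entry containing spaces stays in one piece, and IFS is back to its previous value afterwards.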
What exactly is your end goal? There might be other utilities that can get you to the solution you want more easily. Like learnbyexample, I could not reproduce the problem.
[\ #27] line="2023-06-19T00:00:00+01:00 2023-06-18T00:00:00+01:00 2023-06-17T00:00:00+01:00 2023-06-16T00:00:00+01:00 2023-06-15T00:00:00+01:00 2023-06-14T00:00:00+01:00 2023-06-13T00:00:00+01:00 2023-06-10T00:00:00+01:00 2023-06-03T00:00:00+01:00 2023-05-31T00:00:00+01:00 2023-05-27T00:00:00+01:00"
[\ #28] IFS=' ' read -r -a array <<< "$line"
[\ #29] for dt in "${array[@]}"; do echo "$dt"; done
2023-06-19T00:00:00+01:00
2023-06-18T00:00:00+01:00
2023-06-17T00:00:00+01:00
2023-06-16T00:00:00+01:00
2023-06-15T00:00:00+01:00
2023-06-14T00:00:00+01:00
2023-06-13T00:00:00+01:00
2023-06-10T00:00:00+01:00
2023-06-03T00:00:00+01:00
2023-05-31T00:00:00+01:00
2023-05-27T00:00:00+01:00