Parsing Data Formats (XML / JSON / YAML) using Python 3 – A little patchwork, but I found some good demos!


One big note on this – It will need some updating, at least in the YAML section, as the training material I currently use teaches Python 2, and a Cisco Instructor for DevNet has confirmed that DevNet exams will focus on Python 3.

So this article will need a bit of polishing up, but overall


 

NikoPalmLab4

This is an example of using Data Formatting (JSON) from a previous Python Lab!

This lab, which can be found here, uses Python NAPALM to “get_facts” from a host, and I demonstrate the difference in output between a Python Dictionary and JSON format:

NikoPalmLab5

This is called “Serialization” of output using Python (going the other way, reading a Data Format back into Python objects, is “Parsing” or De-Serialization). As can be seen here, NAPALM was returning its “get_facts” information as a Python Dictionary, whereas dumping it to JSON made it immediately more human readable.

What I will be reviewing is using Python to Parse the files I created in the different Data Formats and pull certain information from them, but I wanted to show above a practical way that Serialization can also be used: turning Router output into a more human readable format using JSON in my NAPALM script, rather than the raw Python Dictionary output!
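As a rough sketch of that difference (not my NAPALM script itself), the snippet below prints the same data once as a plain Python Dictionary and once serialized with json.dumps. The dictionary is a made-up stand-in for what “get_facts” returns, so the keys and values are assumptions for illustration only.

```python
import json

# Hypothetical stand-in for a NAPALM get_facts() result (values are made up)
facts = {
    "hostname": "R1",
    "vendor": "Cisco",
    "uptime": 4200,
    "interface_list": ["GigabitEthernet0/0", "Loopback0"],
}

# Printing the dictionary directly gives Python's own one-line representation
print(facts)

# Serializing to a JSON string with indent=4 spreads it over several lines
facts_json = json.dumps(facts, indent=4)
print(facts_json)
```

The second print is the one that reads nicely in a terminal, which is all I was after in the lab.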

Now I’ll go through in the same order, starting with XML Parsing with Python!

For reference this is the final version of my XML file I wrote in a previous post:

ParsingXML1

This is what we will be “Parsing” with Python, and as will be seen you can use Python to essentially pick out pieces of this file that you need, by using “Python Libraries” for the different Data Formats that will understand which parts you are looking for.

To create my parsing script I will be using the following link from Python.org:

https://docs.python.org/3/library/xml.etree.elementtree.html

Also some help in printing the Sub-Elements from this web blogger:

https://www.guru99.com/manipulating-xml-with-python.html

^^^ We all have to use each other’s blogs for resources at times 🙂

The first Script I pieced together using several sources sort of kind of worked:

ParsingXML2

This is kind of smashed together. Given the “Element” names are unique, I don’t really need the Attribute (the ID=#) in my output, but I was also looking for a way to parse this a little nicer, and found it:

ParsingXML3

The funny thing is I was actually trying to see if I could get it to print the actual “Root” name of “AwesomeFood”, but it ended up putting this nice spacing between categories instead. To fix that little mash up at the end, where the output sits right on top of the PowerShell prompt, I made a small change to print a blank line to give a line of separation:

ParsingXML4

Maybe not absolutely the best possible outcome, but I am happy with the progress I made by piecing together a lot of different sources of information to create this tiny script, which I will review here line by line to finish off XML Parsing via Python 3:

ParsingXML5

You will want to review the two URLs at the top of this segment to fully get this if I don’t explain it well or you want to know more, but essentially with all of these Data Formats we are going to “import” the matching Python Library into the script to use functions from that Library (as seen, I “import json” in my Python NAPALM script at the top of this post).

Then I define my tree as ET.parse(‘Filename.xml’) so it knows which XML file it will be parsing, and define root as tree.getroot(). After that I use a “for” loop, “for child in root:”, which essentially means: for every “Element” directly under the “root”, print or “Parse” out the following values using what the XML Library provides.

Now the (child.x) variables I found on the Python site, but I had to use Guru99’s blog post to find the remaining “for subelem in child:” loop with its print(subelem.text) statement, which prints all the sub-element values (I want to know the flavors and toppings dang it!).

The print(root.text) line in the original “for” loop was me just playing with the script to see what that would do. It turns out root.text only holds the text sitting directly inside the root tag before its first child, which in my file is just white space, hence the spacing; the root name itself lives in root.tag. Then of course I know I can print a blank line with print().
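Since the script itself only appears as a screenshot, here is a minimal sketch of the same idea. It is not a copy of my script: the XML is a hypothetical stand-in (only the root name “AwesomeFood” comes from my file, the element names and values are made up), and it uses ET.fromstring on a string so it runs without a file on disk.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML; only the root name "AwesomeFood" is taken from the post
XML_DATA = """<AwesomeFood>
    <IceCream id="1">
        <Flavor>Chocolate</Flavor>
        <Topping>Sprinkles</Topping>
    </IceCream>
    <Pizza id="2">
        <Flavor>Pepperoni</Flavor>
        <Topping>Extra Cheese</Topping>
    </Pizza>
</AwesomeFood>"""

# With a real file this would be:
#   tree = ET.parse('Filename.xml')
#   root = tree.getroot()
root = ET.fromstring(XML_DATA)

print(root.tag)  # the root element's name

parsed = []  # collected (element, sub-element, text) tuples
for child in root:
    # child.tag is the element name, child.attrib is a dict of its attributes
    print(child.tag, child.attrib)
    for subelem in child:
        # subelem.text is the value between the opening and closing tags
        print(subelem.text)
        parsed.append((child.tag, subelem.tag, subelem.text))
    print()  # blank line between categories
```

Collecting the values into a list like “parsed” is optional, but it makes it easier to reuse them later instead of only printing.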

The moral of the story – This is something you will want to practice. Look at official documents like Python’s site, which is the equivalent of Cisco White Papers, and other people’s blogs / YouTube videos / Reddit threads to understand the concept!

No training solution is 100% perfect, and everyone seems to teach Python a bit differently, so it’s good to really dive in and look at different people’s examples to familiarize yourself!

With that out of the way, I will now see if I can work my way through JSON Parsing!

Again my final file called “LoopedBackEmployees” in JSON Data Formatting is below:

ParsingJSON1

In looking at different ways to print this in Python 3, I completely stole my initial Python script from, and give total credit to:

https://www.geeksforgeeks.org/read-json-file-using-python/

I just plugged in my own values, but otherwise copied and pasted the script into VSC, and it runs fine:

ParsingJSON3
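For anyone who can’t make out the screenshot, the general pattern looks roughly like the sketch below. The JSON content and the “employees” key are assumptions for illustration, not the real contents of my file, and the sketch parses a string with json.loads so it runs on its own.

```python
import json

# Hypothetical contents; the real LoopedBackEmployees file has different data
JSON_TEXT = """
{
    "employees": [
        {"name": "Alice", "role": "Network Engineer"},
        {"name": "Bob", "role": "Developer"}
    ]
}
"""

# For a file on disk the equivalent would be (file name assumed):
#   with open("LoopedBackEmployees.json") as f:
#       data = json.load(f)
data = json.loads(JSON_TEXT)

names = []
for employee in data["employees"]:
    # Each list item is now an ordinary Python dictionary
    print(employee["name"], "-", employee["role"])
    names.append(employee["name"])
```

Once the data is loaded it is just dictionaries and lists, so picking out a single value is normal Python indexing.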

However, looking back at my NAPALM script, I toyed with this a bit to see if I could print it a bit differently or make it more readable:

ParsingJSON5

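I tried to utilize print(json.dumps(...)) to make it a bit more readable, but it honestly just prints it out in JSON formatting no matter what you put here, which makes sense since json.dumps turns the Python object back into a JSON string.

The funny thing is the “sort_keys=True” on this statement is actually what changes the order: it sorts the keys alphabetically instead of keeping the order from the file. If removed: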

ParsingJSON6

Now it is in the original order. Note that JSON itself does not require indentation; white space between values is ignored when the file is read, which is why a file can have all sorts of white space and blank lines all over it and still load. The indent=# argument to json.dumps is optional and only controls how the output is pretty-printed: leave it out and everything prints on one line, set it to 1 or more and each nested level is indented by that many spaces.

I may research this some more, but after looking at different resources, I feel that knowing these two ways is sufficient for exam day. Also, there is a really good resource below on how JSON values map to Python values that you may want to review:
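To see what the two json.dumps arguments actually do, here is a small sketch with a made-up record (the keys and values are assumptions, not my employee file):

```python
import json

# Hypothetical record, keys deliberately not in alphabetical order
data = {"name": "Alice", "id": 7, "department": "NetOps"}

# No indent: still valid JSON, printed on a single line in insertion order
compact = json.dumps(data)

# indent=4: one key per line, insertion order kept
pretty = json.dumps(data, indent=4)

# sort_keys=True: keys are emitted in alphabetical order
sorted_pretty = json.dumps(data, indent=4, sort_keys=True)

print(compact)
print(pretty)
print(sorted_pretty)
```

All three strings load back into the same dictionary; only the layout and key order of the text differ.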

https://realpython.com/python-json/#deserializing-json

^^^ This site has a TON of good info about writing to a JSON file, Serialization / De-Serialization (writing / reading), different formatting tricks and lots of good stuff!

Finally, onto the last but not least Data Format to Parse in Python 3 – YAML!

NOTE – Python has no built-in YAML Library, so you do need to “pip install pyyaml” first. I did this from the Bash prompt on Linux because I had not found a way to easily make it work on my Windows machine, so I spun up an Ubuntu 20.04 Desktop VM. PyYAML itself is not Linux-only though, so pip on Windows should work as well if your Python install is set up properly.

With that being said, below is my YAML Doc I’ve now loaded into my Ubuntu VM VSC:

Parsingyaml1

This is two documents inside one file, the second just being 3 lines of, really, I don’t know what (Key / Value pairs), so I am interested in how this will Parse. Let’s get to it!

I actually just clipped off the secondary document to get it working in Linux, as it was throwing some error that had me beating my head against a wall, and this is what I got from the Python script used to Parse the YAML Document:

parsingyaml2

Which… is really ugly. I apologize for the stretched view, but wanted to show the output in my Visual Studio Code that ran the Py script DID show the output just in a very gross looking clump of information that looks worse than the formatting itself!

There are literally too many resources that I gathered info from for me to start listing here. YAML is a real pain in the grass to Parse due to the “pip install pyyaml” requirement, so I am not sure how far I will get with this – But I will see if we can make that a little less ugly.

Here is the initial code from the Python Script written for Parsing:

parsingyaml3

The more I look, the fewer better ways to parse it I find, and at this point I am spinning my wheels on this subject, so I will leave this as the Python 3 way of Parsing YAML. It is surprisingly not documented at all on Python.org, since PyYAML is a third-party Library rather than part of Python itself, and most of the information I found was in threads on StackOverflow.

One thing to note: when I was loading my original YAML file with multiple documents in it, the script would throw an error, as it was not expecting a second YAML document within the data it was given, as shown here:
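For reference, a minimal sketch of single-document YAML Parsing with PyYAML is below. The YAML content is hypothetical (not my actual document), and re-printing the result through json.dumps is just one assumed way to get less clumpy output, not the only one.

```python
import json

import yaml  # third-party Library: pip install pyyaml

# Hypothetical single YAML document
YAML_TEXT = """
---
employees:
  - name: Alice
    role: Network Engineer
  - name: Bob
    role: Developer
"""

# safe_load parses one YAML document into plain Python dicts and lists
data = yaml.safe_load(YAML_TEXT)

# Printing the raw result gives the one-line Python representation
print(data)

# Re-serializing with json.dumps gives an indented, easier to read view
print(json.dumps(data, indent=4))

for employee in data["employees"]:
    print(employee["name"], "-", employee["role"])
```

yaml.safe_load is used here instead of plain yaml.load because it only builds basic Python types and does not need a Loader argument.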

parsingyaml4

Note that both errors refer to the same thing. What I find odd is that it errors on Line #2 where I start my first document (below the --- on Line #1), however it throws the same error on Line #24, which is actually the --- notation marking the beginning of the next YAML document in my file.

At this time I am not sure how to fix this but wanted to make anyone aware before I get a chance to ask my instructor of my 8-week DevNet bootcamp about this, that there is some way to open
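One approach that looks promising from the PyYAML documentation is yaml.safe_load_all, which is meant for streams holding more than one document. I have not confirmed this against my own file, so treat the following as a sketch; the YAML below is a made-up two-document example, not my actual data.

```python
import yaml  # third-party Library: pip install pyyaml

# Hypothetical file content: two documents separated by ---
MULTI_DOC = """---
first: document
items:
  - a
  - b
---
key1: value1
key2: value2
key3: value3
"""

# safe_load expects exactly one document, so this raises a YAMLError
try:
    yaml.safe_load(MULTI_DOC)
except yaml.YAMLError as err:
    print("safe_load failed:", err)

# safe_load_all yields one Python object per document in the stream
documents = list(yaml.safe_load_all(MULTI_DOC))

for number, doc in enumerate(documents, start=1):
    print("Document", number, "->", doc)
```

Wrapping the call in list() is only there so the documents can be counted and indexed; looping over safe_load_all directly works too.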

Since this page will be linked in my DevNet Associate Blueprint page I have stickied, I will make sure it gets updated once I have a better understanding of YAML Parsing using Python 3 (my study materials are unfortunately using Python 2).

That wraps up my review of Parsing the Data Formats using Python!

That was a lot more work than I had anticipated, since my main study source presents it in Python 2 and I am not teaching Python 2 here. I was told straight from the horse’s mouth (a Cisco Instructor) not to bother studying Python 2 for the exam, as it is sunset and Cisco DevNet will focus on Python 3.

Until next time my fellow geeks!!!

 
