DEVASC Python – Working with Data Types Raw Text / JSON / XML / YAML, tons of MUST KNOW details for DEVASC exam day! Deep Dive!

I feel I should warn up front, this is a very long, detailed post filled with things that will appear in DEVASC exam questions – Do not miss these easy points!

All cute stuff aside, let’s get right into more Python – Working with Data Types

Python can work with raw text files using only its built-in functions. Structured data formats such as JSON, XML and YAML are also plain text (not binary), but parsing them requires importing a module to perform Serialization or De-Serialization (difference explained in the JSON segment): “json” ships with Python, while “xmltodict” and “PyYAML” are installed with pip, and each module’s functions are then used to parse the data.

Going back to PEP8, lines of Python code should be no longer than 79 characters. Within a text file, the newline character “\n” ends a line and tells whatever reads the file to move down a line.

When working with any data file it must be opened with “open()” and closed with “close()”, as manually opened files will not close themselves. Some examples of code you might see and what their arguments mean:

readsomething = open("filename.txt", "r") opens the file in read-only mode
writesomething = open("filename.txt", "w") opens the file in write mode

Some other notable mode characters for working with files are shown below:

  • r = Open for reading the file
  • w = Open for writing to file, truncates the file first
  • x = Opens for exclusive creation but fails if file exists
  • a = Open for writing, Appending the file!
  • b = Open file in “binary” mode
  • t = Open in text mode (Default mode)
  • + = Opens the file for updating (reading and writing), combined with r / w / a, as in “r+”, “w+” or “a+”

The ones I would know for working with Data in Python for DEVASC day are “r”, “w”, “a”, “a+”, and that “t” is text, the default mode a file will open / be created in.

To demonstrate this I jump on the Ubuntu VM with Python and a quick way to make txt files:

The file can also be read just using a built in Python Module as shown below:

Note that when you open a file, you need to “file.close()” it, or you could lock up the file within the Python script – For exam day it is important to note files MUST be closed!
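A minimal sketch of the manual open / close pattern, assuming a scratch file in a temporary directory (the file name and contents are made up for illustration, they are not the files from my Ubuntu VM):

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# "w" creates the file (or truncates an existing one) for writing.
writer = open(path, "w")
try:
    writer.write("first line\n")
finally:
    writer.close()  # a manually opened file must be closed explicitly

# "r" opens the same file read-only.
reader = open(path, "r")
try:
    contents = reader.read()
finally:
    reader.close()

print(contents)
print(writer.closed, reader.closed)
```

The try / finally is only there so the file still gets closed if the read or write raises an error.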

Python also has the “with” statement, which works with a “context manager” such as the file object returned by “open()”!

A demo of the “with” statement below shows that this “Context Manager” performs a file open in the “with” statement and automatically closes the file after performing the action:

I highlighted the “…” to really underscore the fact that the interactive Python prompt switches to its “…” continuation prompt while it waits for an indented block (the same thing happens when defining a function), so the “with open(...) as data:” statement is a Context Manager that automatically opens and closes the file after the indented block runs its course.
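As a rough sketch of the same idea in script form (the file path and text are placeholders, not the ones from my screenshots), the file is closed as soon as the indented block ends:

```python
import os
import tempfile

# Hypothetical scratch file for the demo.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# The file is opened here and closed automatically when the block ends.
with open(path, "w") as data:
    data.write("written inside the with block\n")

# No close() call was needed.
print(data.closed)

with open(path, "r") as data:
    text = data.read()

print(text)
```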

We can also use this to append to the file, by opening it “as data:” with the “a+” mode:

Note that when I finish appending, Python shows the number of characters written (this is the return value of “write()”, and has nothing to do with the 79 character PEP8 guidance for code). I used a “\n” to kick down a line. The appended text would have landed directly on the end of the original line had I not left a blank line at the end of the original text files when creating them.
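A small sketch of appending with “a+”, again using a made-up scratch file rather than my writeme.txt, that also captures the number “write()” returns:

```python
import os
import tempfile

# Hypothetical scratch file for the demo.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Start with one line that ends in a newline.
with open(path, "w") as data:
    data.write("original line\n")

# "a+" appends to the end of the file instead of truncating it.
with open(path, "a+") as data:
    count = data.write("appended line\n")

# write() returns how many characters were written, "\n" included.
print(count)

with open(path) as data:
    lines = data.readlines()

print(lines)
```

Because the original line ended with “\n”, the appended text starts on its own line.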

I used “exit()” to back out of Python back to Ubuntu, and “cat writeme.txt” to verify that the appended text via Python is indeed in the txt file stored in the Home Directory in Ubuntu.

Remember – Manually opened files MUST be closed, however “with” statements are a ‘Context Manager’ that performs opening and closing for you, I would be amazed if this is not a question on your DEVASC exam!

Parsing Data using Python and Python Modules!

This will be a fairly quick review of the exact syntax for parsing data properly for exam day. I have beaten this topic to death, but I’m interested in the exact syntax from the Cisco OCG!

Parsing CSV Files

A CSV file is plain text with values separated by commas. It is commonly produced as an export from a spreadsheet program like Excel or from a database, or as a plain-text bulk export.

First things first, I created a .csv and a .txt to play with both equally to ensure both work:

As shown, regardless of the CSV or TXT file extension, the files have exactly the same format / content, so I want to manipulate both and prove that a TXT extension file works the same as an actual CSV extension file.

First I will go through just creating the objects / lists with “routers.csv” as there’s a lot there:

Here you probably need to zoom in to see the wording correlation, though I tried to visualize it, so I will bullet point it quickly to clarify how things break down:

  • import csv – Imports the module to work with CSV files in Python
  • rtrsfile = open('routers.csv') – Assigns a variable to the opened local CSV file on my Ubuntu VM
  • rtrsreader = csv.reader(rtrsfile) – Assigns a variable to the CSV module’s reader, built on the file object opened above
  • rtrslist = list(rtrsreader) – Creates a list of all the rows to Parse through!
  • rtrslist – Typing the variable name at the prompt displays the contents of “routers.csv” as a Python List of Lists!

I did the “rtrslist” first just to show that Python turns a CSV plain text file into a Python List of objects, so we have officially parsed a Data Format into Python Objects! 🙂

I then do a “rtrslist[0]” / “rtrslist[1]” / “rtrslist[2]” to demonstrate that these are index 0-2 when I list out the items individually; this will be important later when looking at the “with” statement!

To further demonstrate this by manipulating this CSV file in Python one step further:

In terms of the list I numbered 0-2 both downward and sideways by its list index position!

I then call out lists as objects with the first set of square brackets [ ], then I call out an object within that list with a second set of square brackets [ ][ ], highlighting the list / object index # and highlighting those objects within their list above it in this picture to show how it works.
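Here is a sketch of that list-of-lists indexing. The three rows below are invented sample data (my real routers.csv is only in the screenshots), written to a temporary file so the example runs on its own:

```python
import csv
import os
import tempfile

# Hypothetical routers.csv with device, location, ip_addr on each line.
path = os.path.join(tempfile.mkdtemp(), "routers.csv")
with open(path, "w", newline="") as f:
    f.write("rtr1, London, 10.1.1.1\n")
    f.write("rtr2, Paris, 10.1.1.2\n")
    f.write("rtr3, Berlin, 10.1.1.3\n")

rtrsfile = open(path, newline="")
rtrsreader = csv.reader(rtrsfile)
rtrslist = list(rtrsreader)   # every row becomes its own list
rtrsfile.close()

print(rtrslist)        # the whole file as a list of lists
print(rtrslist[0])     # one index picks a row (a list)
print(rtrslist[1][2])  # two indexes pick one item inside a row
```

Note the items after index 0 keep the space that followed each comma, which is the spacing issue covered next.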

One big concept leading into “with” statements and parsing CSV files is the spacing, that the object in the index 0 spot does NOT have a space in front of it, while the other 2 objects indexed at 1 and 2 DO have leading space, which will need to be formatted!

Speaking of “with” statements to open / close, let’s take a look at an example here:

There are a few key things in the above “with” statement that allow white space to be stripped from the above output, and to understand this you break down the statement:

  • with open – Opens the routers.csv file to run the with statement
  • variable = csv.reader(data) – Assigns the CSV reader object, built on the open file, to a variable
  • for row in variable: – Defines another variable “row” that represents one line of the file as a list, with device / location / ip_addr each being a String Data Type
  • You can now use the “.strip()” method on the strings after index 0 that have a leading white space

I will demonstrate this with a series of slight differences in the “with” statement for overkill:

I was confused by this one, as “.rstrip()” didn’t do anything at all, so I tried just using “.strip()”:

This Method does strip white space from both the front and back of a String, but here it was not stripping both of the strings that have that leading white space as shown, so once more:

And there is the proper formatting. “.rstrip()” did nothing because it only removes trailing (right side) white space, and the white space here is leading; “.lstrip()” or “.strip()” is needed for that.
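To make the difference between the strip methods concrete, here is a short sketch on one invented CSV line. The “skipinitialspace=True” option at the end is an extra that is not in my lab above; it is a csv.reader setting that drops the space after each comma for you:

```python
import csv
import io

# One hypothetical line in the same shape as the routers file.
sample = "rtr1, London, 10.1.1.1\n"
row = next(csv.reader(io.StringIO(sample)))

location = row[1]               # " London" with a leading space
rstripped = location.rstrip()   # trailing only, so nothing changes here
lstripped = location.lstrip()   # leading only
stripped = location.strip()     # both sides

print(repr(location), repr(rstripped), repr(lstripped), repr(stripped))

# Strip every item in the row in one go.
cleaned = [item.strip() for item in row]
print(cleaned)

# Alternative: let the reader ignore the space after each delimiter.
auto = next(csv.reader(io.StringIO(sample), skipinitialspace=True))
print(auto)
```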

What happens if we start mixing things up / printing in different order / assigning different index #’s to items in the “open” statement? Let’s see! This is important to know! :

This is such a critical principle of 0 index, this is important to understand (below)!

Note that it still prints the “Lists” from 0-2 index as they appear in the file in this example, so changing THIS order / #’s will only impact where list objects appear in the statement, but it will print the lists in order from the original CSV file – However if we revisit this graphic:

THIS IS SO CRITICAL – When doing manual List object manipulation a single [#] is going to print lists in certain orders, two [#][#] are picking objects out of the list itself, but using the “with” statement this will iterate through lists in order of the file!
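A sketch of that point with invented data: the items inside each row can be printed in any order, but the loop still walks the rows in file order. An in-memory “io.StringIO” stands in for the opened file here, and the field names are the ones I used above:

```python
import csv
import io

# Hypothetical contents standing in for routers.csv.
sample = (
    "rtr1, London, 10.1.1.1\n"
    "rtr2, Paris, 10.1.1.2\n"
    "rtr3, Berlin, 10.1.1.3\n"
)

lines = []
with io.StringIO(sample) as data:
    reader = csv.reader(data, skipinitialspace=True)
    for row in reader:
        # Unpack by position: index 0, 1, 2 of each row.
        device, location, ip_addr = row
        # Items are rearranged within the line, rows stay in file order.
        lines.append(f"{ip_addr} is {device} in {location}")

for line in lines:
    print(line)
```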

If nothing else, remember “with open” is a Context Manager; files otherwise need to be manually opened and closed to avoid jamming up the script. Each line of Comma Separated Values within a CSV or TXT file (both are plain text) becomes a separate list in Python, and each list is assigned an index # starting at 0, as are the objects it contains!

I didn’t touch the text file I created as this is getting drawn out, but it works the same as long as it is formatted the same way – The file extension csv or txt has no bearing on Parsing!


Working with JSON in Python / Parsing Data

JSON is a text data format derived from JavaScript’s object syntax – JavaScript Object Notation – !!!!AND IT DOES NOT SUPPORT COMMENTS WITHIN THE FILE IF ASKED ON EXAM DAY!!!! NONE!

Again – If you are asked “What Data Format does not allow comments?” the answer is JSON. XML does allow comments, but only in the <!-- comment --> form, not with #!!!! Important!!

Like how XML has “legacy” support for SOAP API’s (more on that in API post), if you see a question on which Data Format is derived from JavaScript it is JSON (like if you see which Data Format is used with SOAP it is XML).

JSON is by far the most widely used / Language of Web Services and Cisco Infrastructure, so this is possibly one of the most important to know for DEVASC, and it gets this designation as the Data Format king (essentially) because of both its human readability and its portable data structure that can be used with several different applications and programs.

I will go through a quick JSON formatting / Key:Value / Array / Nesting crash course.

If you are comfortable typing and reading JSON, go ahead and skip right over this part, I will mark down post where the crash course ends and DEVASC Parsing begins!

Below is an actual copy of my updated Resume I coded in JSON out of boredom:

To zoom in on this a bit more to see the bracket usage clearly I’ll zoom a bit here:

To add just a bit more context and clarity to JSON, here is the top of my Resume file:

There are a few huge things I want to illustrate in these graphics very clearly:

  1. This entire 113 line file is encapsulated by a Key:Value pair start with {My_Resume: on Line 1, and then I open a List (in Python terms) or Array in JSON terms, so this entire file is a single Key:Value pair that begins with {My_Resume: and then the Array continues downwards until the Array closes on Line 112 and the {My_Resume: closes on Line 113!
  2. Note that I use a simple Key:Value pair for “About Me” section, then in Job History I nest multiple Arrays as the Value to the Key like Work History and Job Duties
  3. When a Key:Value pair is closed, it requires a comma, unless it is the last Key:Value item
  4. For every Bracket opened, the bracket must be closed up to the final } of the file!

One thing I had to really look at there, is that an Array or square bracket is only used if the following list of items is the Value to the Key, otherwise you can use curly brackets if you are listing off several straight forward Key:Value Pairs – For example:

"Item1" : "Item1 Value",
"Item2" : "Item2 Value",
"Item3" : "Item3 Value"

The initial Key does not need to be on Line 1 with the beginning open curly bracket. A JSON file like these opens with a Curly Bracket and ends with a Curly Bracket because its top level is an object; the JSON standard also allows an Array (square brackets) or a single value at the top level.

Below is the first JSON file I ever made, about Loopedback Employees:

One big note with JSON Formatting – a common indentation convention is 4 spaces, however outside of “String Literals” (shown here in Orange in VSC) JSON ignores white space in the file!

^^^ This file shows that a curly bracket Key:Value pair can be actually nested as the Value of a Key:Value Pair as shown in the “Name” Values on Lines 5 / 11 / 17 in this file, but they must be separated by a comma as shown until the last Key:Value Pair or its not valid!

JSON accepts the basic Data Types for the purposes of DEVASC such as numbers (integers / floats) / booleans / string literals / null. There is really only one catch, and that is JSON has no date type, so dates cannot be numeric:

Dates must have quotes to make it a String Literal rather than a Numeric Value, this also goes for all dotted decimal # values in Network Devices like IP’s / Subnet Masks / Etc, they cannot be input to JSON with dots between the #’s and must be a String Literal with “”.
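A quick sketch of those rules using Python’s json module. The keys and values are made up (loosely modeled on my employees file, whose real contents are only in the screenshots):

```python
import json

# Hypothetical JSON text: nested object, array, number types, boolean,
# plus a date and an IP address written as String Literals.
text = """
{
    "LoopedBack_Employees": [
        {
            "Name": {"First": "Sample", "Last": "Person"},
            "Start_Date": "2020-01-15",
            "Mgmt_IP": "10.1.1.1",
            "Years": 3,
            "Rating": 4.5,
            "FIRED!!!": false
        }
    ]
}
"""

parsed = json.loads(text)
employee = parsed["LoopedBack_Employees"][0]
print(type(employee["Start_Date"]), type(employee["Years"]),
      type(employee["Rating"]), type(employee["FIRED!!!"]))

# An unquoted dotted value is not a valid JSON number, so parsing fails.
try:
    json.loads('{"Mgmt_IP": 10.1.1.1}')
    rejected = False
except json.JSONDecodeError:
    rejected = True

print(rejected)
```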

/End Crash Course on JSON Formatting, time to hit Parsing!

To start this off I want to rattle off some quick bullet points for main pointers:

  • “import json” – Imports JSON module and functions to work with JSON data
  • load() – De-Serializes JSON Data from an open file object into a Python object (a Dictionary for a JSON object)
  • loads() – De-Serializes JSON Data from a String (the “s” is for string) to Parse / Manipulate within the program
  • dump() – Serializes Python Objects as JSON Data and writes it to an open file object
  • dumps() – Serializes a Python object such as a Dictionary into a JSON formatted String (again “s” for string)

This introduces “Serialization” and “De-Serialization”, which are critical terms to have straight for exam day and future studies in Python!

Serialization = Converting an in-memory Data Structure (like a Python Dictionary) into a format that can be stored or shared (like a JSON string or file), and is recoverable back to its original Data Structure

De-Serialization = Converting the Serialized Data (like JSON text) back into an in-memory Data Structure to work with in the program

^^^^ The wording is easy to get backwards: the JSON text sitting in a file or string is the Serialized form, and the Python Dictionary you get from “load” / “loads” is the De-Serialized form. Having a file open or closed has nothing to do with it.

I’ve always thought of Serialization / De-Serialization as a one command thing, and it is: “dump” / “dumps” Serialize, “load” / “loads” De-Serialize.

Now if I were writing the DEVASC, I’d ask about load/loads/dump/dumps, so be sure to remember the version without the “s” works on a file object while the version with the “s” works on a String.
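All four functions side by side, as a minimal sketch with an invented dictionary and a temporary file:

```python
import json
import os
import tempfile

# Hypothetical Python object to round-trip.
record = {"device": "rtr1", "up": True}

# dumps / loads work with Strings.
text = json.dumps(record)     # Python dict -> JSON string (Serialize)
back = json.loads(text)       # JSON string -> Python dict (De-Serialize)
print(text)

# dump / load work with open file objects.
path = os.path.join(tempfile.mkdtemp(), "record.json")
with open(path, "w") as fh:
    json.dump(record, fh)     # Python dict -> JSON in a file
with open(path) as fh:
    from_file = json.load(fh) # JSON in a file -> Python dict

print(from_file)
```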

To lab this up I just pasted the LoopedbackEmployees file into a file on the Ubuntu VM:

I then jump into the Python3 prompt and get to work, bring back the “with” statement:

Using a “with” statement as our Context Manager for opening and closing the file, I read the file contents into the variable “json_data”, then I assign “json_dict = json.loads(json_data)”, then verify Python sees this class as a Dictionary, which the Python print function also shows by dumping out a Dictionary of output.

So if we leave off the s on loads, will it print the file not just as a string? Let’s see:

Nope. “json.load” expects an open file object rather than a String, so it needs to be called on the file handle inside the “with” statement. I will get to this shortly.
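A sketch of why that fails and what works instead, using a tiny made-up JSON file in a temporary directory:

```python
import json
import os
import tempfile

# Hypothetical one-key JSON file for the demo.
path = os.path.join(tempfile.mkdtemp(), "employee.json")
with open(path, "w") as fh:
    fh.write('{"Name": "Sample"}')

# Reading the file gives a String.
with open(path) as fh:
    json_data = fh.read()

# json.load() calls .read() on its argument, which a String does not have.
try:
    json.load(json_data)
    message = ""
except AttributeError as err:
    message = str(err)
print(message)

# Passing the open file object itself is what json.load() expects.
with open(path) as fh:
    json_dict = json.load(fh)
print(json_dict)
```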

However I now have “json_dict” that now turned JSON into a Python Dictionary to manipulate, in this case I am going to take it easy on the employees and set FIRED!!! to False for all employees so they get their job back – I looked at changing the “Key” in the Key:Value pair in JSON via Python and that is beyond the scope of DEVASC so I will just re-hire them!

It turns out that manipulating Arrays / Keys / multiple Key:Value pairs was harder than I expected!

I’ve looked into all of these different ways of splicing it apart; the ones I found involved importing multiple modules / making classes / making functions, which is well beyond the scope of DEVASC.

For this reason, I’ve whittled my JSON file down to a single employee to fire:

Not to give up, I will revisit this, but this is starting to drag so time to get this moving along:

What we are doing is calling out first the Key:Value pair of LoopedBack_Employees:, but then we call out the Key within that Value of “FIRED!!!” to change its boolean value to True!

I am sure there is a “for” loop or something to iterate through multiple JSON data Key:Value Pairs / Arrays, but that is so complex (not terribly but its not something you’d memorize), so I just wanted to make this one point of adjusting this Value.
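For what it’s worth, a plain “for” loop over the Array is enough when the employees sit in a list. This is a sketch under that assumption; the structure and names below are invented, not my actual file:

```python
import json

# Hypothetical layout: one key whose value is an Array of employee objects.
text = """
{
    "LoopedBack_Employees": [
        {"Name": "Sample One", "FIRED!!!": true},
        {"Name": "Sample Two", "FIRED!!!": true}
    ]
}
"""

json_dict = json.loads(text)

# Each item in the Array is a Dictionary, so the key can be set directly.
for employee in json_dict["LoopedBack_Employees"]:
    employee["FIRED!!!"] = False

print(json_dict)
```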

Now we can print the Python Dictionary to verify, dump it back into its JSON file, and verify:

Note that when going from JSON file -> Python Dictionary it was “loads” to load a String to manipulate, whereas when “Serializing” it back to JSON the “open” statement is a bit different, using the “w” mode to write the file, and we “json.dump” into the file.
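A sketch of that change-and-write-back step, assuming a single employee nested under the top-level key (values invented) and a temporary file in place of my real one:

```python
import json
import os
import tempfile

# Hypothetical single-employee dictionary, as if already loaded from JSON.
json_dict = {"LoopedBack_Employees": {"Name": "Sample Person", "FIRED!!!": False}}

# First the outer key, then the key inside its value.
json_dict["LoopedBack_Employees"]["FIRED!!!"] = True

# "w" truncates the file, json.dump() Serializes the dictionary into it.
path = os.path.join(tempfile.mkdtemp(), "employees.json")
with open(path, "w") as fh:
    json.dump(json_dict, fh, indent=4)

# Read the raw text back, the script equivalent of "cat" on the file.
with open(path) as fh:
    raw_text = fh.read()

print(raw_text)
verified = json.loads(raw_text)
```

The “indent=4” argument is optional and only makes the written file easier to read.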

On white space: JSON does not require any particular indentation. White space outside of String Literals is ignored by the parser, so the 4 space indent is a readability convention and not a rule of the data format.
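One way to check the white space question directly is to parse the same (made-up) data written compactly and written with indentation, and compare the results:

```python
import json

# Same hypothetical data, once with no white space, once indented.
compact = '{"device":"rtr1","ports":[1,2]}'
indented = """
{
    "device": "rtr1",
    "ports": [
        1,
        2
    ]
}
"""

same = json.loads(compact) == json.loads(indented)
print(same)

# White space inside a String Literal is data and is preserved.
spaced = json.loads('{"device": "rtr 1"}')
print(repr(spaced["device"]))
```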

XML Parsing with Python

The webpage keyboard input is starting to slow down, so I’ll try to make XML and YAML Parsing a bit briefer sections here, with no crash course on formatting to finish up 🙂

HUGE NOTE – XML DOES support adding comments in the data file using the <!-- comment --> syntax, but will not accept #comment

Also as shown in the file I made in Ubuntu, note the top line that is needed for XML:

This is the XML Data set I’ll be working with. Really with XML the key things to remember for formatting are the XML declaration at the top of the file (recommended, though the XML spec makes it optional), that it is HTML-like and has “legacy” support for SOAP API, and that each <tag> requires a closing </tag> to be valid.

White space between tags will not break the file like a #comment would; indentation in XML is only there for readability, whereas in YAML indentation is part of the structure.

I hit a really odd snag I forgot I hit before after the “pip install xmltodict” in Python:

So at this point I am freestyling from Pythons 3.x site for XML, I will need to see if I can hammer out the “import xmltodict” issue however what it should look like:

I am not sure I will revisit this as there isn’t a lot to it: the “with” statement opens the file to read, “xml_dict” is the variable holding the Python Dictionary returned by “xmltodict.parse”, and you use “xmltodict.unparse” along with “pretty=True” to print it back in XML Format.

Then at the bottom, instead of a “json.dump” into “fh” as with JSON, the writing “with open” statement writes the output of “xmltodict.unparse” with xml_dict and “pretty=True” back into the local file.

The site I am using :

I am kind of free-styling from this website in Python, however I’ll bullet point it:

  • “import xml.etree.ElementTree as ET” is the module being imported to Parse
  • tree = ET.parse('FileName.xml') parses the specified file into an XML Tree
  • root = tree.getroot() assigns “root” as what I’d call the “Parent” of the entire XML Tree
  • root.tag shows 'AwesomeFood', the tag name of the Root element, which is the Parent of its child objects
  • root.attrib shows the Root element’s attributes as a Python Dictionary, which is why they print within curly brackets
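The bullets above can be sketched with the standard library only. The XML below is invented for the demo (only the “AwesomeFood” root name comes from my lab file; the child tags and attributes are assumptions):

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical XML file with a declaration, a comment, and nested tags.
xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<!-- XML comments use this syntax, not # -->
<AwesomeFood category="snacks">
    <item name="pizza">
        <rating>10</rating>
    </item>
    <item name="tacos">
        <rating>9</rating>
    </item>
</AwesomeFood>
"""

path = os.path.join(tempfile.mkdtemp(), "FileName.xml")
with open(path, "w", encoding="utf-8") as fh:
    fh.write(xml_text)

tree = ET.parse(path)
root = tree.getroot()

print(root.tag)      # tag name of the root element
print(root.attrib)   # attributes as a dictionary

# Iterating over an element walks its direct children.
names = [child.attrib["name"] for child in root]
ratings = [int(child.find("rating").text) for child in root]
print(names, ratings)
```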

This has already gone off the rails enough for XML I will end this section here, as my input is barely moving on the page at this point, and tackle YAML to finish it off (if it works)!

YAML (YAML Ain’t Markup Language) Parsing with Python

YAML is pretty simple, with indentation similar to Python. It doesn’t require 4 spaces exactly, however the indentation must be consistent (spaces, not tabs), and everything should be clean such as this YAML file:

  • Three hyphens define the beginning of a YAML document, 3 periods optionally end it!
  • Allows for #comments within the file, and they work fine
  • Indentation can be minimal, but items at the same level need the same indentation
  • The values themselves do not care about white space, only indentation
  • This language is used to make Ansible Plays / Playbooks, and can be loaded in Python scripts
  • Uses Hyphens to define the items of a list (“Elements”) within its Structure

Actually I have a second YAML demo in this same VSC to show some Hyphens as Elements:

This is a very simple formatting of YAML, and it is actually very simple to work with, which is awesome when you are making Ansible files, so we love YAML!

I install PyYaml with “pip install pyyaml” and create the YAML file called “markuplanguage”:

It may be hard to see but there are just little hyphens for different “Elements” for each piece of this structure, again white space doesn’t matter outside of indentation, so when you see Ansible YAML files that are not THIS air tight its no problem.

And yes because I’m tired, study fried, and juvenile I named it markuplanguage.yaml 🙂

Now let’s get to Parsing to land this plane and get some sleep for another day of study:

I hit a snag but saved me:

The YAML File is now loaded into Python as a Dictionary (De-Serialized), and manipulating attributes is similar to the other formats, but due to me having only one level of elements and attributes attached to them this may be a fairly watered down demo:

After working through error after error, I found that doing a “yaml.load(file_var)” lets me view how Python sees the data, and there is a Square Bracket right at the front, meaning the top level is a Python List rather than a Dictionary, because my test file begins with hyphenated list items – I might come back to re-lab but at this point I am just going to bullet point important commands for exam day!

  • yaml.load(var_name) – This is a huge help to see how Python is interpreting the Data! (newer PyYAML versions need a Loader argument, or use yaml.safe_load(var_name) instead)
  • data.write(yaml.dump(yaml_dict, default_flow_style=False)) to write back to file
  • ^^^ That is the second line of the with open("file", "w") as data: statement
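Putting those commands together as a rough sketch. It assumes PyYAML is installed (“pip install pyyaml”), swaps in “yaml.safe_load” for “yaml.load”, and uses an invented two-item YAML file rather than my markuplanguage.yaml:

```python
import os
import tempfile

import yaml  # third-party: pip install pyyaml

# Hypothetical YAML: a document marker, a comment, and two list items.
yaml_text = """---
# comments are allowed in YAML
- name: rtr1
  location: London
- name: rtr2
  location: Paris
"""

path = os.path.join(tempfile.mkdtemp(), "sample.yaml")
with open(path, "w") as data:
    data.write(yaml_text)

with open(path) as data:
    yaml_data = yaml.safe_load(data)

# The top level is a list because the file starts with hyphenated items.
print(type(yaml_data))
print(yaml_data)

# Each list item is a dictionary that can be changed like any other.
yaml_data[0]["location"] = "Berlin"

with open(path, "w") as data:
    data.write(yaml.dump(yaml_data, default_flow_style=False))

with open(path) as data:
    reloaded = yaml.safe_load(data)
print(reloaded)
```

Note the comment in the original text is not carried through the round trip, since comments are not part of the loaded data.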

I will get back to YAML once I get into Ansible / Network Automation more, I apologize for that crappy ending for YAML Serialization, but after 9 hours of work and 8 of labbing this entire post my brain is fried like chicken! Might come back to clean that up.

I have been studying since the minute I clocked out of work, so I am calling it here!

This should be more than enough of the hot topics for DEVASC to get those “gimme” points that are too easy to get wrong, I will have a shorter “Error Handling in Python” and “Code Testing” article but I am fried and this page is hardly moving when I type so I’m calling it.

I’d know both ways of XML for exam day, even though my xmltodict for some reason just didn’t work, most of these are fairly similar, you could probably just jot down key words and key notes from this page and be good for any Data Formatting with Python on exam day.

Anyways, I am rambling, until next time! The march to exam day is officially on! 🙂
