In my previous post, I showed how easy it is to import data from CSV, JSON, and Excel files using the Pandas package. Another popular format for exchanging data is XML. Unfortunately, Pandas versions before 1.3 have no function to import data from XML (pandas.read_xml arrived in 1.3), so here we use Python's standard XML package and do some extra work to convert the data to Pandas DataFrames.
Here’s a sample XML file (save it as test.xml):
<customer name="gokhan" >
<customer name="mike" >
<customer name="john" >
<customer name="david" >
We want to convert this to a DataFrame that contains the customer name, email, phone and street:
     name                           email     phone        street
0  gokhan  firstname.lastname@example.org  555-1234          None
1    mike               email@example.com      None          None
2    john  firstname.lastname@example.org  555-4567          None
3   david                            None  555-6472  Fifth Avenue
As you can see, we need to read both an attribute of an XML tag (the customer name) and the text values of sub-elements, including a nested one (address/street). So although we will use a very simple method, it shows how to parse even complex XML files using Python.
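To make the idea concrete, here is a minimal sketch of how the attribute and the sub-element texts could be read with the standard xml.etree.ElementTree module. The sample listing above shows only the customer tags, so the <customers> root element and the exact nesting of <email>, <phone> and <address>/<street> are assumptions chosen to match the target table, not the original test.xml. The helper names parse_customers and to_dataframe are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the <customers> root and the child tag names below are guesses
# that reproduce the target table; the real test.xml may be laid out differently.
SAMPLE_XML = """
<customers>
  <customer name="gokhan">
    <email>firstname.lastname@example.org</email>
    <phone>555-1234</phone>
  </customer>
  <customer name="mike">
    <email>email@example.com</email>
  </customer>
  <customer name="john">
    <email>firstname.lastname@example.org</email>
    <phone>555-4567</phone>
  </customer>
  <customer name="david">
    <phone>555-6472</phone>
    <address>
      <street>Fifth Avenue</street>
    </address>
  </customer>
</customers>
"""

COLUMNS = ["name", "email", "phone", "street"]


def parse_customers(xml_text):
    """Return one dict per <customer>, using None for missing values."""
    root = ET.fromstring(xml_text)
    rows = []
    for customer in root.iter("customer"):
        rows.append({
            # Attribute of the tag itself; None if the attribute is absent.
            "name": customer.get("name"),
            # Text of direct children; findtext returns None if not found.
            "email": customer.findtext("email"),
            "phone": customer.findtext("phone"),
            # Text of a nested child, addressed with a simple path.
            "street": customer.findtext("address/street"),
        })
    return rows


def to_dataframe(rows):
    """Build a DataFrame from the parsed rows (requires pandas)."""
    import pandas as pd  # imported here so the parsing part runs without pandas
    return pd.DataFrame(rows, columns=COLUMNS)


rows = parse_customers(SAMPLE_XML)
```

Calling to_dataframe(rows) then gives a DataFrame with the four columns in the order shown earlier. To read from the saved file instead of a string, ET.parse("test.xml").getroot() can replace the ET.fromstring call. Keeping the parsing separate from the DataFrame step means missing elements simply become None in the row dictionaries, with no special handling needed.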