Python Tutorial 4: Data Structures


Data structures are just what the name suggests: structures that hold data together. In other words, they store a collection of related data. There are three built-in data structures we often use in Python: set, dictionary and list.

  1. Set
  2. Dictionary
  3. List

Detailed Explanation (optional)


Set

A set is an unordered collection with no duplicate elements.

This is the same ‘set’ as in mathematical set theory: a collection of distinct objects. A set is written with curly braces.

In [4]:
my_distinct_set = {3,'two',3,1}
print(my_distinct_set)
{1, 3, 'two'}

As you may notice, duplicates are removed automatically. The elements happen to print in a tidy order here, but sets are unordered, so the display order is arbitrary and should not be relied on. Sets can also be created with the built-in set function.

In [5]:
my_other_set = set(('a','l','e','x'))
print(my_other_set)
{'a', 'l', 'x', 'e'}
In [6]:
me = set('alex')
print(me)
{'a', 'l', 'x', 'e'}
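
As a small sketch of the points above (the variable names here are illustrative and not part of the original notebook), the following shows duplicate removal, membership testing, and how to get a predictable order with sorted when one is needed:

```python
# Duplicates are dropped when a set is built from any iterable.
letters = set('hello')
print(len(letters))      # 4 distinct letters: h, e, l, o

# Membership tests are one of the most common uses of a set.
print('h' in letters)    # True
print('z' in letters)    # False

# The display order of a set is arbitrary; sorted() returns a list
# in a stable order when one is needed.
ordered_letters = sorted(letters)
print(ordered_letters)   # ['e', 'h', 'l', 'o']

# {} creates an empty dictionary, so an empty set is written as set().
empty_set = set()
print(type(empty_set))   # <class 'set'>
```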

Methods

Sets support the usual set theory operations: union, intersection, difference and symmetric difference.

In [7]:
not_me = set('ting wei')
print(not_me)
{'g', 'w', 'n', ' ', 't', 'i', 'e'}

Difference – In my name but not in his name

In [8]:
difference = me - not_me
print(difference)
{'a', 'l', 'x'}

Union – In mine or his or both

In [9]:
union = me | not_me
print(union)
{'x', 'g', 'w', 't', 'n', 'i', ' ', 'l', 'e', 'a'}

Intersection – In both mine and his

In [10]:
intersection = me & not_me
print(intersection)
{'e'}

Symmetric difference – In either my name or his but not both

In [11]:
symmetricdifference = me ^ not_me
print(symmetricdifference)
{'x', 'l', 'n', 'g', 'w', 'i', ' ', 't', 'a'}
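
Each operator above also has a method form. The sketch below rebuilds the same two sets and checks that the operator and method spellings agree; sorted is used only so the printed results have a predictable order.

```python
me = set('alex')
not_me = set('ting wei')

# Each operator has an equivalent method.
print(me.difference(not_me) == me - not_me)              # True
print(me.union(not_me) == me | not_me)                   # True
print(me.intersection(not_me) == me & not_me)            # True
print(me.symmetric_difference(not_me) == me ^ not_me)    # True

# Unlike the operators, the methods accept any iterable, not just sets.
print(me.union('ting wei') == me | not_me)               # True

# sorted() gives a stable order for display.
print(sorted(me - not_me))   # ['a', 'l', 'x']
print(sorted(me & not_me))   # ['e']
print(sorted(me | not_me))   # [' ', 'a', 'e', 'g', 'i', 'l', 'n', 't', 'w', 'x']
```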

Dictionary

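A dictionary is a collection of key: value pairs in which every key is unique. Since Python 3.7, dictionaries keep their items in insertion order. You can think of it as an actual dictionary, where each word (the key) is associated with its definition (the value).

A dictionary is written with curly braces and a “:” separating each key from its value.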

In [12]:
telephonebook = {'jack': 81244098, 'sape': 92344139}
print(telephonebook)
{'jack': 81244098, 'sape': 92344139}
In [13]:
print(type(telephonebook))
<class 'dict'>

Methods

In [14]:
print(telephonebook.items())
dict_items([('jack', 81244098), ('sape', 92344139)])
In [15]:
print(telephonebook.keys())
dict_keys(['jack', 'sape'])
In [16]:
print(telephonebook.values())
dict_values([81244098, 92344139])
In [17]:
telephonebook.clear()
print(telephonebook)
{}
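
The methods above work on the dictionary as a whole. Day to day, most work happens one key at a time, so here is a minimal sketch of looking up, adding, updating and deleting entries. The phone book is rebuilt first because clear emptied it, and the number for 'guido' is a made-up value for illustration.

```python
telephonebook = {'jack': 81244098, 'sape': 92344139}

# Look up a value by its key.
print(telephonebook['jack'])         # 81244098

# Assigning to a new key adds an entry (hypothetical number).
telephonebook['guido'] = 91234567

# Assigning to an existing key overwrites its value.
telephonebook['jack'] = 80000000

# A missing key raises KeyError with [], while get() returns None
# or a default of your choice.
print(telephonebook.get('alex'))             # None
print(telephonebook.get('alex', 'unknown'))  # unknown

# "in" checks the keys, not the values.
print('sape' in telephonebook)       # True

# del removes an entry.
del telephonebook['sape']

# items() is handy for looping over keys and values together.
for name, number in telephonebook.items():
    print(name, number)
```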

List

Python knows a number of compound data types, used to group together other values. The most versatile is the list, which can be written as a list of comma-separated values (items) between square brackets. Lists might contain items of different types, but usually the items all have the same type. This is the most used data structure when you are doing analytics!

As with sets, the built-in list function can be used instead of writing square brackets by hand.

In [18]:
sq = [1,4,9,16,25]
print(sq)
[1, 4, 9, 16, 25]
In [19]:
sq2 = list((1,4,9,16,25))
print(sq2)
[1, 4, 9, 16, 25]

Elements are accessed by index, and the first element has index 0.

In [51]:
newlist = ["this","is","a","string"]
print(newlist[0])
this

Python conveniently allows for negative indexes, useful for long lists or to skip checking the length of a data structure

In [65]:
#The normal way of getting to the last item of the list
print(newlist[3])
#Alternatively, you could access from the back
print(newlist[-1])
string
string
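
To make the relationship between positive and negative indexes concrete, here is a short sketch using the same list. The variable names are illustrative. It also shows what happens when an index falls outside the list.

```python
newlist = ["this", "is", "a", "string"]

# Without negative indexes, reaching the last item needs the length.
last_by_length = newlist[len(newlist) - 1]

# -1 is the last item, -2 the one before it, and so on.
last_by_negative = newlist[-1]
second_last = newlist[-2]

print(last_by_length)    # string
print(last_by_negative)  # string
print(second_last)       # a

# An index outside the list raises IndexError.
try:
    newlist[4]
except IndexError as error:
    print("IndexError:", error)
```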

Elements of a list can also be changed by assigning a new value to an index.

In [66]:
newlist = ["this","is","a","string"]
newlist[0] = "that"
print(newlist)
['that', 'is', 'a', 'string']

Methods

Lists have many useful methods for processing data.

In [2]:
#Appending new elements to the list
lst1 = [1,2,3,4,5]
newnumber = 6
lst1.append(newnumber)
newstring = "a string"
lst1.append(newstring)
print(lst1)
[1, 2, 3, 4, 5, 6, 'a string']
In [3]:
#Inserting elements to specific position of the list
lst1.insert(5,"inserted item")
print(lst1)
[1, 2, 3, 4, 5, 'inserted item', 6, 'a string']

The .insert call above places "inserted item" at index 5, and everything from that position onward (6 and 'a string') shifts one place to the right. Indexing starts at 0, so index 5 is the sixth position in the list. This is the same convention seen in the item assignment earlier: newlist[0] = "that" replaced the first element, because the first element sits at index 0. In the same way, newlist[index] returns the element at that index.
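
A short sketch of how insert treats different index values, using a fresh list (the name numbers is illustrative) so the result is easy to follow:

```python
numbers = [1, 2, 3, 4, 5, 6]

# Insert at index 5: the element already there (6) moves right.
numbers.insert(5, "inserted item")
print(numbers)   # [1, 2, 3, 4, 5, 'inserted item', 6]

# Index 0 puts the new element at the very front.
numbers.insert(0, "first")
print(numbers)   # ['first', 1, 2, 3, 4, 5, 'inserted item', 6]

# An index past the end does not raise an error; the element
# is simply added at the end, like append.
numbers.insert(100, "last")
print(numbers)   # ['first', 1, 2, 3, 4, 5, 'inserted item', 6, 'last']
```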

 

Next: Python Tutorial 5 Basic Flow Control
