VBA Internship Tutorial 3: Basic Data Analysis

Welcome to the third part of our series. In this section, we will be covering the some basic VBA tools that can be used for simple Data Analytics in Excel.

While R and Python have emerged as being the tool of choice for many professional Data Analysts, it remains a fact that for many large organisations, especially those who have been in operation for decades or longer, Excel remains the primary tool for data management and analysis. This is due in part to its resilience and adaptability, but also to the complexities of R, Python and other data analysis tools.

As a result, many organisations still rely on Excel to analyse data, and it is important for us to understand how this can be done and automated with Excel VBA.

Slide7

Click to download the VBA Internship Tutorial – Working File

Areas covered in this tutorial:

  1. First Row/Column
  2. Do While Loops
  3. Filtering
  4. Nested If Statements
  5. And Statements
  6. Searching Within A Selected Range

Visual Guide

1. First Row/Column

Building upon our previous knowledge of defining the last used row or column within an Excel sheet, we will now look at how to identify a potential first column or row that we may desire to start with.

This is sometimes done because in certain cases, a database or spreadsheet may be incomplete or may not have all the necessary values needed among all its different categories of say, products, for us to make a meaningful comparison. Therefore, we may want to specify a first row or column that we will start our analysis on.

The format for how this is accomplished can be found below:

Dim firstrow As Long
firstrow = Range("LetterNumber").End(xlDown).Row

Dim firstCol as Integer
firstCol = Cells(RowNo,Columns.Count).End(xlToRight).Column

Here is an example:

First Row 1.PNG

In the code above, we have declared firstrow to be our variable that will store the location of the first row. Then, we specified it to be the first non-empty cell from the second row in column C. This brings us to row 5, as can be seen in the workbook:

First Row 2.PNG

2. Do While Loops

Now that we have specified our first row, we will now introduce a new form of variable loops using the “Do While” function. With this function, we are specifying that our code is to be executed until a certain condition is met. The format for doing this is as follows:

Do while Cells(RowNo, ColNo) Condition
Code to be Executed
Loop

In our example, we would want to replace all of the “Y” markers in our active column with the full word “Active”, so our code would then be:

Do While

In this manner,  we have specified that while the first row is to be row 5. In addition, from cell C5 till the first empty cell, all of the cells in column C will have the word “Active” printed into them. However, as you will see, this leaves a few entries at the bottom that are unchanged because between them exists a row with an empty cell, and this breaks the loop. This could pose an issue if we only want to analyse only those entries which have a full data set, which in this case means having a name and also being marked as active.

This is actually a rather common occurrence in real world datasets among organisations that have not yet fully integrated all their data into a centralized database, with different products and divisions often have different reporting requirements and features which leads them to having different data points. This brings us to our next learning point….

3. Filters

An easy way to resolve this issue is to apply filters, and then simply filter out those cells which are empty. The format for doing this is as follows:

Range("Range to be filtered").Select
Selection.AutoFilter
    ActiveWorkbook.Sheets("Worksheet Name").AutoFilter.Sort.SortFields.Add Key:= _
        Range("Range of Data"), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption _
        :=xlSortNormal
    Selection.AutoFilter field:=ColNo, Criteria1:= Criteria to be fulfilled

Here is an example:

Range("A2:C13").Select
Selection.AutoFilter
    ActiveWorkbook.Sheets("Module 1").AutoFilter.Sort.SortFields.Add Key:= _
        Range("A2:C13"), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption _
        :=xlSortNormal
    Selection.AutoFilter field:=3, Criteria1:="<>"

In the above example, we have specified that our data range is from cells A2 to C13, with the first row being the row with our data labels. Subsequently, we then filtered our data, specifying that it is to be sorted on ascending values, which by default is based on the data in the leftmost column, as well as removing those whom have a blank cell in column 3, as seen in the last line of code (“Autofilter field:=3).

We now have our full list of names that are active:

Filter.PNG

To disable filters, simply use the code “Selection.Autofilter” again to remove it.

4. Nested If Statements

Filtering is 1 way of working around incomplete datasets, however, sometimes we wish to fill in missing data with markers or placeholder data. In this case we can utilize nested if statements. The format is as follows:

If (condition) then
    (statement)
Elseif
    (statement)
Else
    (statement)
End If

In the above example, we specify that if the first condition is met, then we check if a second condition is also met. If it is, we execute a set of instructions, if not then we execute another set.

Here is an example:

Sheets("Module 1").Select
Dim i As Integer
i = 3
Do While Cells(i, 2).Value <> ""
    If Cells(i, 3).Value = "" Then
        Cells(i, 3).Value = "Reserve"
    ElseIf Cells(i, 3).Value = "Active" Then
        Cells(i, 3).Value = "Deployed"
    Else
        Cells(i, 3).Value = "Manager"
    End If
i = i + 1
Loop

In the above example, we specify we will loop our if statements until the cells in column B are empty. Meanwhile, we look in column C of the same row, and if the value isn’t empty, we input the word Reserve, if its Active then we change the word to Deployed, and if its anything else we input the value Manager.

Nested If.PNG

As we can now see, our new data table now reflects the active status of everyone.

5. And Statements

Sometimes, we want to specify 2 conditions that must be met. While this can be done using a nested if, it can also be done using the “And” function. The format for this is as follows:

If (condition) And (condition) then
    (statement)
End If

In the above example, we specify 2 conditions to be met before we execute our sequence of code. Since we are specifying both conditions on the same line, it also helps us to save space because unlike with nested if statements, they do not need to be on seperate lines. Observe:

ments. The format is as follows:

Dim i As Integer
i = 3
Do While Cells(i, 2).Value <> ""
    If Cells(i, 2).Value = "Soldat" And Cells(i, 3) = "Manager" Then
        Cells(i, 1).Value = "M01"
    End If
i = i + 1
Loop

Here we specify that we are looking specifically for 1 specific individual, and that if he/she is marked as a manager, then we change his/her ID number to M01.

And

This is a very simplistic example of what can be achieved using the “And” function. In reality, we can chain together multiple conditions to perform a variety of functions ranging from data formatting and analysis to error checking.

6. Searching Within A Selected Range

In certain cases, we may be looking to find a specific data entry for extraction. This could be due to a variety of reasons ranging from close monitoring of specific business expenses to the flagging of specific individuals for close review. The format is as follows:

Range(Specify Range).Select

Selection.Find(Specify Search Criteria)

Activecell.select

As you can see, we first have to specify a range for our search criteria to run against, after which we specify what is our search criteria before finally selecting the cell that meets it. A working example is as follows:

Range("B:B").Select
Selection.Find(What:="John", After:=ActiveCell, LookIn:=xlFormulas, _
        LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
        MatchCase:=False, SearchFormat:=False).Activate
ActiveCell.Select
Selection.End(xlToRight).Select
ActiveCell.Value = "Suspicious Transactions Detected"

Here, we specify that the column B is our search area, and that we are looking for 1 specific name, John, that we want to find. The search order is by rows and we search the column sequentially from top to bottom, looking only for entries that are an exact match.

These features are explored in greater detail in the working file, which you can use to gain a greater understanding of the topics covered in this section. In the next part of our series, we will be looking at our first assignment tutorial, which will require us to tie in everything that we have covered so far to create a periodic report.

This tutorial was put together by Goh Jun Hui.

Next: Practice Assignment

Previous: Tutorial 2 Data Consolidation

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s