Excel With Yan: 2015

sábado, 19 de dezembro de 2015

The VBA Modules Workbook - what a deal!

Today I decided that I want to share all the general purpose modules I have created so far all at once. And to simplify even more, I decided to share a workbook that has all the modules.

Here is the link to download the workbook:

https://www.dropbox.com/s/pxxwgap8hl7m9xw/VBA_MODULES_43.xlsm?raw=1

I suggest you to download it, and check the module's lines of code. Most of them have comments (I intend to comment all the code in the following versions, but since this might take some time, I would rather share everything now).

What can be done with those modules one might ask:

1. Perform queries and commands in SQL DBMS such as SQL Server, Oracle Sever, MySQL, Sharepoint 2013 and Access using very few lines of code and with standardized input and output of data)

2. Send outlook e-mails as easily as it can be.

3. Handle 2-dimensional arrays in a very robust manner (something that VBA is poor at doing - Read the post "Explaining VBA Arrays (and its issues)" to see why).

4. Handling files more easily (opening, reading, writing etc.)

5. Generating log as easily as it can be done in VBA.

6. Handling data from the web and so on.

Although these modules are not something that will cause any revolution in the programmer's life, I'm sure it has many useful things for the daily activities since it simplifies basic stuff such as connecting to databases and handling arrays.

I intend to explain each module in detail in other posts. Until then, please be free to use everything.

domingo, 15 de novembro de 2015

Explaining the ACCESS_CLASS - And how to easily perform SQL Queries in VBA

Today I want to talk about SQL queries. Database management systems (DBMSs) are present in pretty much any company of the globe and it is very important to be able to fetch data from them (an to write on them as well) as easily as it can possible be.

Many people don't know how to get data from a DBMS such as the Access application using VBA. The reason for that is very simple: it is complicated to do it using VBA. You have to know the connection string, how to use the connection object, the record set object and to manipulate the output only to perform a simple query.

All this complication is reduced by using the ACCESS_CLASS which I have developed. It contains basic methods to connect to the database and executing commands on an access database file.

I have created an example workbook that gets data from an Access database. To use it, you need to download the Access database file (world.accdb) and copy it to the C:\TEMP folder and to download the ACCESS_EXAMPLE.xlsm workbook, that contains the example routine, and press the 2 buttons you will see when you open it. The link for both the files is displayed below. This example works in windows 2007+, Excel 2010+ and Access 2013 (you must be sure those are the programs you are using).

world.accdb file link:

https://www.dropbox.com/s/2vn10qa7p81nnqv/world.accdb?raw=1

ACCESS_EXAMPLE.xlsm file link:

https://www.dropbox.com/s/yqw5a7hydi6sgxz/ACCESS_EXAMPLE.xlsm?raw=1

The example routines (in the "MAIN" module) have many more lines of code than it is actually necessary to perform the task. The reason for that is that I commented the code using the log file, this is, the log is my substitute for the comments. I also have used many of the best-practices I talk so much in this blog to make the code as professional as it can be. So please don't be scared by the number of lines of code. Try to read them at least once and you will understand how easy is to perform SQL queries in Access files using the ACCESS_CLASS.

Another interesting point is that most of the methods present in the ACCESS_CLASS are also present in other classes to connect to other kinds of DBMSs such as Sharepoint 2013, MySQL, SQL Server and so on. So once you are familiarized with the methods (connect, execute, transaction etc.) you can use the same knowledge in other classes. The other classes differ basically only by their attributes. For instance: the access database requires only the file path to be set, while a SQL Server requires the driver name, the hostname, the user, the password and the database name. The Sharepoint, on the other hand, requires the url of the site and the GUID of the list to be passed as attribute.

domingo, 25 de outubro de 2015

Excel Tables - and how useful they are

Today we will talk about a topic that is useful for both VBA and Excel. The tables are one of the most useful Excel features and, we have noticed, are sometimes left aside by Excel users. Believing that the reason for that is the lack of real understanding of the potential of the tables, specially regarding scalability, we are writing this article to show their importance and how to use them.

Table is an Excel object that provide more tools to work on Excel ranges. The typical layout of an Excel table is shown in the Image 01 below.

Name	Age	Company
John	30	ABC
Peter	40	XYZ
Mary	24	XYZ
Frank	56	XYZ
Mark	34	ABC

Image 01

When a range is transformed into a table, it is automatically formated so it becomes similar to the Image 01. This formatting can be easily changed, but the layout in Image 01 is the most common.

Once a table like in Image 01 is created, some very useful tools become available. You can easily sort the table, filter it, show totals at the bottom of it (total count/total sum etc.). All this and other features are explained in many topics that are spread around the web, but those are not the foccus of this article. The foccus of this article is scalability.

Let's suppose we have the very same data of Image 01, but in the form of ranges. See Image 02

Name	Age	Company
John	30	ABC
Peter	40	XYZ
Mary	24	XYZ
Frank	56	XYZ
Mark	34	ABC
Paul	56	ABC

Image 02
And lets supose we have a code to add up the ages (we know this is something that could be easily done by the =SUM() function, but this is just an example). Let's see how the code could be:Image 02

Option Explicit

Public Sub Sum_Ages()

Dim i As Long
Dim aux As Double

For i = 2 To 6
aux = aux + int(Range("B" & i).Value2)
Next i
MsgBox aux

Exit Sub

End Sub

This is not a really terrible code. It does work and it does solve the problem. There are many other ways to solve this problem, but this code is as good (or as bad) as any other that do not use tables.
If the reader has not seen the issue yet, let's imagine that a new row is created. Let's say that the ranges is now like in the image 03 below.

Name	Age	Company
John	30	ABC
Peter	40	XYZ
Mary	24	XYZ
Frank	56	XYZ
Mark	34	ABC
Paul	56	ABC

Image 03

As it can be seen, a new row, was added. In that case, would the code above still give us the answer we want? The answer is NO. The code adds only the ages from row 2 to 6, so the new line, the 7th, would be left out. To solve this one would have to rewrite the code to get it - something we are keen to avoid. So what would be the solution for this problem? The answer: tables.

Let's now convert this very range into a table and name it "tbl_employee" (to name the table, after you convert the range, you should go to "Design" and then "Table Name" and put the name you want).

Now we's have a table like in the image 04.

Name	Age	Company
John	30	ABC
Peter	40	XYZ
Mary	24	XYZ
Frank	56	XYZ
Mark	34	ABC
Paul	56	ABC

Image 04

Again, the name of the table is "tbl_employee". With that in mind, let's write a new code that does the same of the previous one.

Option Explicit

Public Sub Sum_Ages2()

Dim myTable As Range
Dim m As Long
Dim i As Long
Dim aux As Double

Set myTable = Range("tbl_employee")
m = myTable.Rows.Count

For i = 1 To m
aux = aux + int(myTable(i, 2).Value2)
Next i
MsgBox aux

Exit Sub

End Sub

The difference between this code and the previous one is that now we do not fix the number "6" as the number of rows. Instead, we set the range object as the table we have just created and ask for the number of rows of it. So for as many rows as the table has, we are certain to get the right value. We have just made our table scalable vertically. The same worths, though, for horizontal scalability, since a new column could be easily captured by the ".Columns.Count" method of the range object.

As you can see, all we had to do was to convert the range into a table, name it and then use it in our code as a normal range. The outcome, though, was very different from not using a table, since we couldn't get the updated number of rows or columns that easily.

It is important to notice that the tables automatically understand new rows and new columns. A new value right below the last row is immediately understood as a new row. This makes adding new content to the tables extremely easy and, once again, all the objects related to the table are also automatically updated.

Now, again, this is a very VBA related subject. But using tables has many benefits that go beyond VBA. Once the range is converted into table, any object that references it will also become scalable by definition, since the references maps not the cells addresses (which are absolute and fixed) but the table's relative addresses, which are not fixed and are dynamic. Graphics, formulas and VBA are made scalable when using tables.

Bottom line: there is no reason for not to use tables instead of regular ranges. It is a powerful tool Excel provides and it should be used to make our projects more robust and easy to develop.

domingo, 18 de outubro de 2015

Excel & VBA Best Practices

Here is our list of Excel and VBA best practices. This is only a suggestion, not a definite set of rules. We believe using these tips can make Excel and VBA much easier.

Use the "Option Explicit" clause on the top of every module. This has been explained in another post and makes the programmer debugging much easier

Don't use the type of data as the prefix of a variable name (hungarian notation). This also has been said in this blog. Once you use the Option Explicit clause and have declared your variables, there is no need to use hungarian notation.

Handle Errors. Using the "On Error GoTo error_label" clause is not only useful, but fundamental. It makes sure your code will not have unexpected behavior. The "On Erro Resume Next" should be avoided though.

Raise Errors. Once the exception happened and the error handling code is executed, if the function is not the main sub, the error must be raised, otherwise the parent function will not know about the error and this may affect the rest of the execution.

Log. Having a log module is crucial. We provided the LOG_MODULE that is perfect for the task and logging should be made with no restriction (performance issues are easily addressed by turning the log off).

Be explicit in every possible way when programming. Another subject extensively explained in this blog, being explicit is very important for the sake of the quality of the code. Using full addresses of ranges, declaring if the routines are public or private and using the "ByVal" clauses are just some examples of explicit information that should be in your code.

Program with simple organization and indentation.

Avoid data manipulation in cells - use arrays whenever is possible.This also has been explained in this blog. Data manipulation in cells is slow and should be avoided.

Use modules. Modules are great to organize your code and to make it portable. Actually this is the core of this blog.

Use the "Private" and "Public" clauses. If the function or sub will be used only in its module, it should be private, otherwise it will be listed outside its module for no reason.

Be aware of the performance. VBA programs are single-threaded. This means you can not make anything in parallel. So keep in mind that small improvements in program performance add up and make your programs much faster.

Log the error description. Everytime an exception occurs and is handled, the description of the exception should be printed to the log file so this information is not lost. This is particularly useful for connections with backends. Err.Description is the attribute to be used.

Use full addresses of objects. It is important to uset, for example, "Thisworkbook.Worksheets("Sheet1").Range("A1")" instead of simply "Range("A1")". Implicit information is one of the most problematic things in VBA.

Use "Long" data type instead of "Integer"

Use "vbNullString" instead of "". Not only because it is more elegant, but more efficient in some cases.

Give every workbook object that you use a name. This means that every worksheet that you use, every table and every cell that needs to be refereced in VBA should receive an unique and descriptive name. This way it will be much easier to reference those objects in VBA instead of using, for example, the "A1" or "E3" address references - which can can change when we add or remove rows and columns.

Use hungarian notation with workbook objects names (objects that are not VBA variables). In the other direction of VBA variables, the objects spreaded around the workbook should be referenced with names that contain a tip about its nature. For example, a worksheet could be named "wks_main" or "wks_something" where "wks" is the prefix that categorize it as a worksheet. A table could be "tbl_something", a cell "cell_something" and so on. This, believe us, can make programming much easier .

Use tables. For sure one of the most useful Excel object, the tables as also known as dynamic ranges. They should be used everytime you have a list of related objects - a table! Once you have transformed the range into table (and, of course, named it), any update on it is reflected immediately on related objects such as pivot tables and graphics. Not only that, adding, for example, a new row changes the table size automatically, so you will always get the correct size of the table in VBA using the ".rows.count" range method. This is one of the best tips we have to give.

Exception Handling in VBA

This post is about something often ignored by VBA programmers. Many people think that handling exceptions in VBA is simply useless. It is much better, they say, to see the error, open the VBA Editor and go fix it right away.

For those we ask: would you do the same if it was a Java application? What about a C++ application?

Of course one of the main advantages of VBA is the fact that the programming environment is right there next to the application and ready to receive modifications on the fly. But our purpose here is to create programs that can run as stable as a python application, with no need of any kind of manual intervention on the code whatsoever.

One of the best ways to avoid your program to bizarrely crash is to use exception handling.

We don't want to spend time explaining what exception handling is because there are many other source on the internet that can explain it better than we. But for those who have no idea what exception handling is, let us try to explain it in our on way:

Exception Handling is the programmers response to an unexpected failure event of the system. In a more practical way: it is a part of code that is executed when another part of the code fails by any reason.

VBA has only one way that we know of dealing with exceptions: the On Error clause. Let us give an example of how it is used:

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Option Explicit

Public Sub Exception_Handling_Example()

Dim x as Double

On Error GoTo label_01

x = 1000 / 0

Exit sub

label_01:

Exit Sub

End sub

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

The code below, when compiled, will present no errors. But when executed, it should generate an exception because we are trying to divide a number by zero. But since we've put the "On Error GoTo" clause, instead of halting the program and showing an error message, the program will simply exit.

The process that was developed for this program to end without halting is the Exception Handling. The On Error GoTo label_01 clause tells the program to jump to the label_01 part of the code in the occurrence of an exception.

All the lines of code below the "On Error" clause will be affected by it until another "On Error" clause is used. This means that one "On Error" clause invalidates the previous one so you can use different labels for exceptions different parts of the code.

Now let us show you our suggestion of what a program with an Exception Handling should be.

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Option Explicit

Public Sub MAIN_Exception_Handling_Example2()

Dim function_name as String

function_name = "MAIN.MAIN_Exception_Handling_Example2"

LOG_MODULE.Log function_name

Dim x, y as Double

On Error GoTo error_1000

x = 1000

On Error Goto error_2000

y = 1000 / 0

LOG_MODULE.Log function_name, 0

Exit sub

error_1000:

LOG_MODULE.Log function_name, Err.Description

LOG_MODULE.Log function_name, "The x caused a problem"

LOG_MODULE.Log function_name, 1

err.Raise(1)

Exit Sub

error_2000:

LOG_MODULE.Log function_name, Err.Description

LOG_MODULE.Log function_name, "The y caused a problem"

LOG_MODULE.Log function_name, 1

err.Raise(1)

Exit Sub

End sub

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

The program above has elements that we use extensively in all of our codes. This is really the basic layout of pretty much all of the functions we created and that are in our modules. Let us try to explain what is going on.

The first thing one must notice is that this program generates log. This means that the LOG_MODULE must be imported (the LOG_MODULE is available for download in this blog).

The first thing the function does is to log its name. To do so we create an variable, "function_name" and give it the name of the function module plus the function name - the function is in the MAIN module. After that we call the "LOG_MODULE.Log" function and pass the name of the function as a parameter. This prints the parameter - the function name - to the log file, which is generated by the LOG_MODULE.Log function. This is the explanation of the 3 lines below.

Dim function_name as String

function_name = "MAIN.MAIN_Exception_Handling_Example2"

LOG_MODULE.Log function_name

After that, we declare 2 variables, x and y as doubles. After that we start our 1st error handling scope: we use the "On Error GoTo error_1000" clause which means that if any part of the code from that point on generates an exception, the program must jump to the label "error_1000". This is true until the second error handling scope, which starts with the "On Error GoTo error_2000" clause, which means that if any part of the code from that point on generates an exception, the program must jump to the label "error_2000".

This gives our function 2 possible outcomes in the presence of exceptions: the code below the error_1000 label and the code below the error_2000.

One important thing to notice is that this program WILL halt in the event of an exception. This is because we put the "Err.Raise(1)" method in the end of each error handling. This method "raises" the exception so that if the function is being called by another function, the parent function will get an exception. Had we not put this clauses, the function would never halt, it would just log the err.Description message, log the "ERROR" message and exit the function.

This might be a little confusing for those who are not used to handle exception. But think this way: sometimes we can not predict what will happen to our program. For example, we can ask for a database connection and the database management system might be out, so we would get an connection error - which would cause an exception. In that case, we might not want the program to enter debug mode and show an error message, but rather show us, say, a friendly message or even to try to connect again without showing any messages. We would also like to see what happened written in a log file, so we wouldn't have to go into the source code and execute the program step by step to debug it.

That is exactly what this way of handling exception does. It gives the tools for the programmer to decide what to do in the event of a exception. And we strongly recommend the logging, so that the exception messages are stored in he log file so that the programmer can understand what happened.

Keep in mind that all the modules we will publish have this structure. They all generate log and they all handle exceptions. This makes them as robust as, say, Java or Python libraries (of course, giving VBA's limitations) and this strucuture is ideal to create real VBA application that are designed to work like any other real application.

Explaining the "Option Explicit"

Many of you may have noticed the "Option Explicit" code written on the beginning of our programs. Some of you may know its purpose, many of you don't.

The Option Explicit is a "compiler directive" which is just a fancy name to say that this has nothing to do with the program itself (it will not become part of the program binary code) but rather will be only used during the time of compiling (the time where code will be transformed into machine code - binary numbers).

To understand why we use this directive, read the code below

public sub awarenessTest()

auxiliar = 30
auxiliar = auxiliar - 15
auxiliar = auxiliar + 20
auxiliar = auxiliar +8
auxiliar = auxiliar - 3
msgbox cstr(auxilliar)
exit sub

end sub

Now if we asked the reader to tell me what would be the value that would appear in the message box by the end of the execution of this program, unless you skipped 2nd grade classes, I'm quite sure your answer would be "40". But that is not the correct answer. The answer is "0".

The reason why the answer is "0" is because the message box does not print the "auxiliar" variable; it, instead, prints the "auxilliar" variable (with 2 letters "L"). This is a common "typo" mistake, hard to find in the code and that causes terrible crashes in programs. This kind of typo is so common that it, alone, justifies the use of the "Option Explicit".

The "Option Explicit" tells the compiler to only accept variables that are previously declared by the programmer (using the "Dim" word). So although declaring variables may seem like a wast of time, not declaring them (together with not using the "Option Explicit") may cause us a much greater wast of time debugging our code. If, in our example, we had used the "Option Explicit" and had declared the auxiliar variable, the message box line would make the compiler prompt an error message, since the variable "auxilliar" is not defined, making our mistake clear.

Another reason for using the "Option Explicit" is making the use of variable prefix unnecessary. Let us explain: some programmers like to put the type of the variable as a prefix of the variable name. For example, if we had a variable x as an integer, its name would be, say, "intx" or "int_x". Let us show a few examples of how this is done:

Dim dbl_age as Double
Dim obj_mail as Object
dim intX as Integer

This nomenclature - which even has a name: "hungarian notation" - they say, makes the program easier to be read. We strongly disagree.

Once you have declared your variable and told its type, there is no need to put its type again on its name. It is just... redundant. So if you use the "Option Explicit", you are obliged to declare it and therefore don't have to use hungarian notation.

We strongly recommend you to always use the "Option Explicit" code on the top of every module.

Why you should be explicit when coding

This is one of those subjects that people who suffer with VBA would ignore, since it seems like just a best practice (and who need those, right?), but that makes all the difference.

Being explicit is writing

''''''''''''''''''''''''''''''''''
dim x as String
x = Thisworkbook.Worksheets("Sheet1").Range("A1").value2
'''''''''''''''''''''''''''''''''''
instead of
'''''''''''''''''''''''''''''''''''
x = Range("A1")
''''''''''''''''''''''''''''''''''
The idea here is that while the 1st gives an exact and undoubtedly address for the cell, the 2nd makes assumptions: it assumes that the correct workbook and worksheet are active.Although the programmer might had made this true in the previous lines by activating the workbook and worksheet, the odds are the he or she didn't.

What usually happens is that we write a code today and we are certain that, for example, there will be only one workbook opened or that that worksheet will always be activated, so there is no need to specify the full address of the range. But, in a point of the future, we change the code without remembering this assumptions and the program crashes with no apparent reason - and this kind of frustration is one of the main causes people hate VBA.

The ".value2", which is related to the content of the cell, is also a form to avoid unpleasant surprises like getting a formatted value of a cell instead of its content. It also makes the code clearer : "x", in the example, is not a range object, but a string so, although the compiler is smart enough to understand that when assigning the range to x you actually meant its value, you shouldn't do that. This is one of the most important things we have to teach here: do not let the compiler make assumptions for you. This WILL cause problems in the future. It is important to always be 100% straight about what you are doing. So please do yourself a favor and use the full address of objects and make explicit that you want values instead of the object itself.

Another thing we would like to point it out can be seen in the two functions below:

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
function f1(x)
    y = x*2
    f1 = y
end function
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Option Explicit

public function f2(ByVal x as Double) as Double

    dim y as Double

    y = x * 2

    f2 = y

    exit function

end function
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Although both functions would work perfectly, it is clear that the f2 function was written with much more information. We know, by reading the code, that the function is public (which means it can be accessed from other modules), that x and y are doubles, that x is passed by value (which means that x has the same value of the argument passed when the function is called, but it is not the argument itself) and that the function returns a double value. All this information is implied in f1, which makes much harder to understand what the programmer really wanted - we mean: did the programmer actually wanted the f1 function to be public or did he simply forget to declare it as private?

Being explicit is one of the best ways to avoid small mistakes when programming in VBA. This goes in line with our statement in the MAIN post (the 1st post of this blog) which says that we do believe a 10% overhead to implement best practices in our code is rewarding. The reward is fewer errors and an easier to maintain code.

Sub or Function?

Now this is a topic you can easily find answers online. So why do we bother mention it? Well, we believe there are a few more things that must be said.

If you read the posts around the subject, you will see that the main difference of a sub from a function is that "functions return things and subs don't". This is in fact true, but it may not even be the principal difference.

In our opinion, the core difference between a function and a sub is that a sub can be executed alone while functions need to be called from other functions or subs.
This means that if you want to execute, for example, a program by pressing a button (assigning a macro to a shape, for example), the code must be inside a sub and not a function (the sub must also be declared as public). If you have a function without parameters and try to execute it (clicking the Run button or pressing F5) the code will not run (it will actually ask you for a sub to run), but a sub will run normally (if a sub has parameters, it can not be executed like that).

On the other hand, functions, when declared as public in a module, can be called in cells like any other Excel default formula (like "average" or "vlookup"). This means that if you have a public function named, say, "multiply3", declared in a module that returns a number multiplied by 3, you can call it, for example, by typing "=multiply3(A1)" in a cell.
This would pass the A1 cell as a parameter and execute the function, returning the value multiplied by 3 to the cell. This can not be done with subs. This is very useful when we have a big formulas combinations in a cell that has become too hard to understand and we can write a function that does exactly the same as the formulas combination, give it a nice name and use it in the cell instead. This is definitely a best practice and it avoids those unreadable formulas across the worksheets.

This would make us wonder then: what about when I have a problem to which the solution returns nothing and that will be called by another sub? This problem can be solved by both functions or subs. In that case, the programmer is free to choose from them, but if you want our opinion, in that case, we would use functions if the solution required parameters and subs otherwise. The reason for that is that subs with parameters have two syntax for being callled (with or without parenthesis around its parameters) and mixing those two syntax on a code is a little annoying - this is REALLY the only reason.

Bottom line: make sure you really understand the differences between functions and subs before coding. If you are comfortable with the concepts, you will be able to think in more solutions for individual problems such as big formulas combinations in a cell or calling programs from a button.

Why you should use arrays a lot

Although we will talk a lot about VBA and programming, this blog, as the name indicates, is about Excel - of course we intend to publish articles about VBA in other platforms such as Outlook (we have a great code for Outlook that you MUST see!) - and that what this post is about (we will have posts talking exclusively about Excel - with no VBA involved - but this is not the case yet).

Let's suppose that, in the worksheet named "Sheet1", you have the ranges A1 to E100000 filled with random numbers (this means you would have 500,000 cells filled with numbers).

Now lets suppose you would like to sum all the values and print the result in a message box. The code below would accomplish that.

option explicit

public sub MAIN_Sum_And_Message()

    dim myRange as Object
    dim i, j, m, n as Long
    dim value as Double

    value =0
    set myRange = thisworkbook.Worksheets("Sheet1").Range("A1:E100000")

    m = myRange.Rows.Count
    n = myRange.Columns.Count

    for i = 1 to m
        for j = 1 to n
            value = value + cDbl(myRange(i,j).value2)
        next j
    next i

    msgbox cStr(value)

    exit sub
end sub

This code, although it uses very good best practices like using the range full address and converting the values to double before adding them, would be very slow. The reason for that is because we are accessing the cells, one by one, to get their values - this means that for each cell value, we are going after the range object and reading its value.

Now let's try another approach.

option explicit

public sub MAIN_Sum_And_Message2()

    dim myArray as Variant
    dim i, j, m, n as Long
    dim value as Double

    value = 0
    myArray = Thisworkbook.Worksheets("Sheet1").Range("A1:E100000").Value

    m = ubound(myArray, 1)
    n = ubound(myArray, 2)

    for i = 1 to m
        for j = 1 to n
            value = value + cDbl(myArray(i,j))
        next j
    next i

    msgbox cStr(value)

    exit sub
end sub

The code above would be ridiculously faster than the previous one. The reason for that is that the ".Value" property of the range object is the most efficient way to read data from a range and, once the data is read and put in an array, the data is loaded in the computer memory and it becomes available in the fastest way possible.

Now this may all sound too extreme. Why would we bother with a few seconds more of execution? Well, this is one of the things that makes all the difference when we have operations in large sets of data. It may be the difference between a program be feasible to be executed instead of taking as long as hours to be executed.

So the bottom line here is to load the range values to arrays everytime you have more than a few lines to work on. This will spare you execution time and even make your code more elegant. Another reason for that is because when we introduce the ARRAY_MODULE (which simplifies the use of arrays), you will see that most of the functions receive arrays as parameters, so you better get used with arrays - we will use them all the time.

One more thing: if you think this is the only thing you can do to make your code run faster, your are wrong. We'll talk about performance best practices in another post and introduce the PERFORMANCE_MODULE module, that abstracts all the performance improvement best practices in a single function.

Explaining VBA Arrays (and its issues)

First of all, we'd like to make clear that this is not a blog to learn Excel and VBA basics. Our purpose is to give more tools to make Excel and VBA simple and easy.

We are saying this because now we will talk about a basic concept of the VBA language, the Array, but the reason for that is to explain the need of a whole module, the ARRAY_MODULE, to deal with arrays instead of simply using them without any treatment.

Array definition: a VBA data type which contains a group of variables of the same data type, which can be addressed by indexes.

Here is an example of an array declaration:

Dim myArray(2 to 12) as string

This line creates a uni-dimensional array with 10 strings (index goes from 2 to 12). So the following lines of code would work:

myArray(2) = "john"

myArray(12) = "peter"

myArray(6) = "0"

On the other hand, the following lines of code would NOT work:

myArray(1) = "mary"

myArray(13) = "yan"

The reason why this would not work is because the index is smaller than the lower bound (in the first line) and greater than the upper bound (in the second line).

This happened because we defined the upper bound and the lower bound when we declared the array. The "(2 to 12)" meant that. Another way to declare this array would be like this:

Dim myArray(11) as string

This would give us an array also with 12 strings, but with the index going from 0 to 11 (0 is the default lower bound - this means VBA uses it when it is not explicitly declared by the programmer).

Once you have an array, you can check its boundaries using the Ubound and the Lbound functions, which return the upper bound and the lower bound of an arrray dimension respectively. The code below would print 0 and 11 to the immediate window:

debug.print Lbound(myArray, 1)

debug.print Ubound(myArray, 1)

Notice the "1" parameter in the ubound and the lbound functions. This is the dimension of the array we are asking for the boundaries. Since we are using a 1-dimension array, we can only ask for the boundaries of dimension number 1. This means that the following line of code would give us an error:

debug.print lbound(myArray, 2)

This causes an error because we are trying to access the second dimension of a 1 dimension array. To declare a 2 dimensions array, we would have to do something like this:

Dim myArray2(0 to 99, 0 to 9) as string

This would give as an array with 100 rows and 10 columns. A 3 dimensions array could be defined as such:

Dim myArray3 (0 to 9, 0 to 9, 0 to 9) as String

Or, equivalently,

Dim myArray3(9,9,9) as String

This would give us a 10x10x10 = 1000 elements array. (0 to 9 accounts for 10 elements, 10 for each dimension of the 3 gives us 10x10x10 = 1000 elements)

(Please notice that we are using string arrays, but any other data type could be used.)

The following code could be used:

myArray3(5,5,5) = "test"

myArray3(0,0,0) = "something"

So what is the big deal with that?

Let's suppose we used a function that returned an array and we attributed its return to an array named, say, "misteryArray' , like in the line below:

mysteryArray = someFunctionThatReturnsAnArray()

and we don't know how many dimensions it has (the misteryArray would had to be declared as a variant - there are many blogs that explain what a variant data type is, so I don't think it is necessary to explain it here). If we tried, say

msgbox ubound(misteryArray, 2)

and the array is a 1 dimension array, we would get an error message (an exception).

A person who has programmed in VBA knows this is a recurrent and annoying issue because it is very common to get arrays as outputs of functions so we don't really know how many dimension they have. So when we try to get the boundaries of the array (and, consequently, its size), we can ask for the boundary of a dimension that does not exist and get an error - and this is terrible.

Another major issue is the lower boundary. By default, the lower boundary of an array is zero, but this is not always the case. If we, for example, get the values of a range and put them on an array using the "value" method, like in the code line below, the lower boundary is, by default, 1.

misteryArray= Range("A1:C10").value

debug.print lbound(misteryArray, 1)

The code above would print "1" to the immediate window. This means that our array would have 10 rows, indexed from 1 to 10 and 3 columns indexed from 1 to 3.

The problem with that is that we are never really sure whether a N elements array dimension is indexed from 1 to N or from 0 to N-1 or from k to N+k-1.

Those issues are some of the things that makes VBA programming more difficult than most of the other languages.

To address those issues, we have created the ARRAY_MODULE, which will standardize all the arrays as 2 dimension arrays with each dimension being indexed from 0 to N-1. So every array will become a N x M matrix that can have its boundaries asked with no risk of getting an error.

We have chosen 2 dimensions as default because the Excel cell ranges are organized as 2 dimensions matrices - so this will give us a straightforward mapping from ranges to arrays and from arrays to ranges.

The reason why we chose zero as the lower boundary (instead of 1) is because most of the other programming languages work this way (this will create some issues when dealing with ranges though - ranges are indexed from 1 to N).

We will introduce the ARRAY_MODULE in another post. It is a very interesting module (and with a lot of room for improvement), with many useful functions that clear the road for other powerful modules.

sábado, 17 de outubro de 2015

Sending an Outlook E-mail From Excel Using VBA

There are many posts around the web that explain how to send an e-mail from Excel using VBA. All of them use the same approach, which is to give a sub code where the user should edit some parameters and paste the code in his or her code. Our approach is a little different.

In our approach, the programmer will be able to use a single line of code of a function whose parameters are the obvious parameters of sending an e-mail: recipients, subject, message, cc, attachment file path and if you wish to send it automatically or wants to see the send e-mail screen.

The only line of code the programmer will be required to add to its code is the one below (please notice that this is not a full code of an application, but just a line of code to be added to a function or a sub):

Call EMAIL_MODULE.Send_Email(recipient, subject, message, cc, attachment, False)

The recipient, subject, message, cc and attachment parameters are strings variables with the desired information and the "False" means the program should not display the send e-mail screen (it should send it automatically - if True, a pop-up will appear).

But how can we achieve such high degree of simplicity (abstraction) to send an e-mail using VBA? For this line of code to work, all the programmer has to do is to import 3 modules: LOG_MODULE, EMAIL_MODULE, EMAIL_CLASS. The code of each module is published in this very blog and the explanation of how to import them are also in a post. So instead of learning the details of how to send an e-mail with VBA, simply import the modules to your project and add the line of code above to your function or sub (with your parameters) and an e-mail is certain to be sent (unless, fo course, you don't have Outlook running in your machine).

The important thing to notice is that once the LOG_MODULE, the EMAIL_CLASS and the EMAIL_MODULE modules are imported, this function (in the code line above) will work in any module of the user's VBA project. Not only that, but the function "Send_Email" can even be called as a regular excel function within the worksheets! This means that if you select a cell in a worksheet and type "=Send_Email(A1, A2, A3)" and put the recipients e-mail address, the subject and the message in A1, A2 and A3 cells respectively, the e-mail will be sent too (the outlook application must be running).

The details of why this works will be explained further, but the reason why this way of sending is, we believe, better than the other ways of sending the e-mail is exactly that: you don't have to learn the details - so it is simple and straightforward.

-----------------------------------------------------------------------------
DOWNLOAD :

File Name: SEND_EMAIL_v01.xlsm

Link for Download: https://www.dropbox.com/s/tkchcm2c0qeptk1/SEND_EMAIL_v01.xlsm?raw=1

Explaining VBA Modules

The main difference of this blog from the others, in a more technical explanation, is how we deal with vba modules. Instead of publishing lines of code of functions or macros, that solve individual problems, we publish full module lines of code, so one can import the module and use all the routines and functions available on it.

This is the same approach used in the other languages. All of them use this "library" model. Even Excel uses it too. Let me explain.

All the API's (application programming interfaces) a user can write in a vba program (this is, the name of the functions, variable types etc.) are defined in the VBA references. Those references are found accessing the VBA Editor, clicking in 'Tools" then in "References" (Image 01)

Image 01: VBA References

The references in the image that are checked are, in a simplistic way, files with the definition of the codes the user can type into the editor. If the user wants to, say, use a ADODB connection function to connect to a SQL Server, he or she must check one of the "Microsoft Active X Data Objects" libraries available to add this reference so the APIs for this kind of connection are available - otherwise the compiler will not find the functions the user coded, and a error message will be displayed (Image 02).

Image 02: A reference is missing

This is how VBA abstracts the programming for the user: it makes references available for the programmer so he or she can add to solve specific problems such as dealing with Outlook objects or Word objects.

By default - as Image 01 shows - not all the references are checked. This means that for some programs, the user must add the references before coding - an example is when dealing with Word objects, which require the "Microsoft Word" reference checked, which is not checked by default.

So those references are all the libraries the VBA Editor makes available for the programmer. So our idea is to give the programmer more references - more APIs -, but, instead of really using references, which require a special compiler and some complex work to be created, we use modules, that can be added almost as easily as references.

So let's start explainig what are modules. Modules are TEXT files (very important to know that) that contain regular VBA code but that can be imported from and exported to Microsoft applications (such as Excel).

For the sake of simplicity, for now we will assume that there are only 2 types of modules: CLASS MODULES and REGULAR MODULES (or simply MODULES). See image 03 below.

Image 03: modules and class modules

The only 2 real differences, from the user's perspective, of this 2 kinds of modules are the text file layout (presence of the header) and the file extension. The programming differences will be discussed in another post.

When we export a regular module (we'll teach how to do that in a minute), the file exported has the ".BAS" extension and, when you open this file in a text editor, you will find the vba code but with a first line header similar to this:

Attribute VB_Name = "THIS_IS_AN_EXAMPLE_MODULE"

On the other hand, when you export a CLASS MODULE, the file has the extension ".CLS" and a text header like this:

VERSION 1.0 CLASS

BEGIN

MultiUse = -1 'True

END

Attribute VB_Name = "ACCESS_CLASS"

Attribute VB_GlobalNameSpace = False

Attribute VB_Creatable = False

Attribute VB_PredeclaredId = False

Attribute VB_Exposed = False

As you can notice, these headers are not in italic because they ARE NOT VBA CODE. This information is only necessary to import and export modules using the import file/export file feature of the VBA Editor and they only appear in the files themselves, not in the text of the code after they are imported to the VBA editor.

IMPORTING/EXPORTING MODULES

There are 2 ways to import a module: manually or using the Import File feature.

Importing Manually: To import a module named, say, TEST_MODULE, manually, the user must create a new empty module (Insert - Module), change the module name to "TEST_MODULE", open its text editor and paste the code of it WITHOUT the "Attribute VB_Name" header (in case you are importing it from a .bas file or without the ".cls" header file (mentioned above) in case it is a ".cls" file. This is: paste only the VBA code, not those flags.

Importing using Import File: To import using the import file feature the ".bas" or ".cls", the user must click File - Import File and select the ".cls" or ".bas" file he or she wish to import which has the module code and the header - which will not be imported to the VBA Editor, since it is not VBA code.

To export manually, the user should copy the module code and save it on a text file.

To export using the Export File feature, the user should select the module to be exported by clicking on it, then File - Export File and saving it as a ".cls' or ".bas" file in case it is a class module or a regular module respectively.

Now that you know how to import and to export a module, let's make an example.

Open your excel 2013 and go to the VBA Editor. Create a new regular module (Insert - Module) and name it TEST_MODULE. Open its text editor and paste the code below.

Option Explicit

' This is an example routine

Public Function Module_Test()

MsgBox "I am a module test", vbInformation, "Module Test"

Exit Function

End Function

Now let's export this module by clicking on the module's name, then File - Export File and saving it with the name TEST_MODULE.bas.

Close the workbook without saving it.

Now open a new excel workbook. Create a new module named "MAIN" (just a best practice) and insert the code below.

' This is an example routine
Public Sub MAIN_Test()

Call TEST_MODULE.Module_Test

Exit Sub

End Sub

Run it (Run - Run Sub/User Form).

An error message should appear with the message "Variable not defined". This means that the compiler didn't fnd the TEST_MODULE module neither the Module_Test function. So let's import them.

Click File - Import File and select the TEST_MODULE.bas file. Click Open. This imports the TEST_MODULE module (which should appear in the list of modules in the Project Explorer).

Now run the MAIN_Test sub in the MAIN module again. A message box with the message "I am a module test" will be displayed. This means that the routine in the MAIN module was able to find the Test function in the TEST_MODULE module. You have just used a module as it should be used. Congratulations!