How to Connect to Your Data and Generated Values of Tableau
HOW TO CONNECT TO YOUR DATA
The first step to getting started with Tableau Desktop is to connect to the data you want to explore. There are several types of data you can connect to and several ways to connect to your data.
When you open Tableau, you are taken to the home page where you can easily select from previous workbooks, sample workbooks, and saved data sources. You can also connect to new data sources by selecting Connect to Data. Figure 2.1 displays the screen.
The option In a File is for connecting to locally stored data or file based data. Tableau Personal edition can only access Excel, Access, and text files (txt, csv). You can also import from data sources stored in other workbooks.
The options, listed beneath On a Server, link to data stored in a database, data cube, or a cloud service. Although all of these databases have very different ways of storing and looking up data, the pop-up window is very user friendly and requires little or no understanding of the underlying technology. Most of these databases will require you to install a driver particular to each tool. Installation normally requires a few minutes and you can find all the connectors at:
http:// www.tableausoftware.com/ support/ drivers.
Figure 2.1 Connect to data screen
If your database isn’t listed, try the other database connector (ODBC) that utilizes the open database connectivity standard. You will also see a list of saved data sources on the right. Saving data sources that you use frequently saves time. I’ll explain how to save a data source in the Tableau data source files section later in this chapter.
Saved data source files (.tds) are found on your computer’s hard disk in the datasources directory under the My Tableau repository. If you are logged into Tableau Server you may also see saved data sources on your server’s repository.
CONNECTING TO DESKTOP SOURCES
If you click on one of the desktop source options under the In a File list, you will get a directory window to select the desired file. Once you have chosen your file, you will be taken to the Connection Options window. There are small differences in the connection dialog depending on the datasource you are connecting to but the menus are self-explanatory. Figure 2.2 shows the connection window with the Superstore sample spreadsheet being the file that is being accessed.
Figure 2.2: The connection window
There are three tabs in the spreadsheet file. Tableau interprets these tabs the same way as it views different tables in a database. The same is true for text files stored within the same folder. If the tabs contain related information, Tableau can join these just like it can join tables in a database. Joining options are the same regardless of the file or database type. Join options will be covered in the Joining Database Tables with Tableau section later in this chapter.
Once you have selected and customized your data connection, you will be taken to the second Data Connection window where you must decide whether or not to create an extract. There are advantages to extract the data into the Tableau’s data engine, particularly when you are using Excel, Access, or text files as your data source. The nuances of data extracts will be covered in The Advantages of a Data Extract section later in the chapter as well. Clicking the OK button creates the connection and opens the workbook authoring environment.
CONNECTING TO DATABASE SOURCES
Databases have an additional level of security— requiring you to enter a server name and user credentials to access the data. The username and password you enter are assigned in the database, meaning the security credentials and the amount of access granted are controlled by the database— and not by Tableau. Figure 2.3 shows the connection window to a MySQL database.
Figure 2.3: Database connection window
The remaining steps in the connection window, guide you through the process of selecting the database, database tables, and defining the joins between the tables in the data source. The final step is to decide whether you want to directly connect to the data or to extract data from the database into the Tableau’s data engine. Following these steps completes the process of connecting to a database.
CONNECTING TO PUBLIC DATA SOURCES
The increasing quantity and variety of data available via the Internet fall into three categories:
- Public domain data set
- Commercial data services
- Cloud database platforms
For example, the United States Census provides free data via the Internet. The World Bank provides a variety of data, and many other government public data repositories have sprouted all over the world. This data can be accessed by downloading files and then connecting Tableau to those files.
There are also a growing number of commercial data sources. At this time, Tableau provides connectors to several, including:
Figure 2.4: Tableau Public
- Google Analytics
- Google Big Query
- Amazon Redshift
- Open Data Protocol (ODATA)
- Windows Azure Marketplace
The Google Analytics connector can be used to create customized click stream analysis of web pages. Google Big Query and Amazon Redshift connectors allow you to leverage the computing capacity of Google and Amazon. Both are designed to allow you to purchase petabyte-scale database processing capacity for a fee. There is also a connector for the popular cloud-based CRM tool— Salesforce. Microsoft supplies data over the web via the Windows Azure Marketplace and ODATA. Tableau’s own free cloud service— Tableau Public— allows you to create and share your workbooks and dashboards on the web. Figure 2.4 shows an example dashboard published on Tableau Public that was created for and embedded in a blog post.
Tableau Public is a great way to embed live/ interactive dashboards on the web. Be careful not to publish proprietary data there, as it is available to everyone without restriction.
Tableau has built-in fields that make difficult tasks easier. These are found on the left side of the screen at the bottom of the dimensions list and at the bottom of the measures list. When you perform an operation (such as double-clicking on a geographic field) these Tableau generated fields are automatically added to the design window. Generated values include:
- Measure Names and Measure Values
- Longitude and Latitude
- Number of Records
Measure Names, Measure Values and Number of Records are always present. If your dimensions include standard geographic place names, Tableau will also automatically generate center-point geocodes.
MEASURE NAMES AND MEASURE VALUES
Measure Names and Measure Values can be used to quickly express all the different measures in your data set or to express multiple measures on a single axis. These fields are automatically generated when you connect to a data source and can be used in a variety of ways. When you connect to a data source, Tableau automatically creates a field that contains all of the measure names and measure values.
In Figure 2.5, you can see that the two measures are shown, SUM (Profit) and SUM (Sales). These are shown as separate columns in the same bar chart. The generated value, Measure Names, is used in the column shelf to separate the bars. Measure Names are also used on the marks card to distinguish color and on the filters shelf to limit the number of measures shown in the view. Measure Values contains the data and this is shown as rows as you would expect from this type of bar chart.
The side-by-side bar chart in Figure 2.5 was created by multi-selecting one dimension Container and two measures Profit and Sales. Using the Show Me button, the side-by-side bar chart was selected. Tableau automatically applies Measure Names to the column shelf and separates the two measures being plotted on the horizontal axis. The Measure Names Quick Filter was exposed by right-clicking on the Measure Names dimension on the Filter Shelf. Other measures can be added to the axis using the Quick Filter.
The view could also be created by dragging Container to the column shelf, dragging the Sales Measure to the row shelf, then dragging Profit on to the left axis and dropping the measure when a light green ruler appears. The Measure Names and Measure Values pills will automatically appear when the second measure is placed on the vertical axis.
Figure 2.5 Measure values bar chart
If your data includes standard geographic fields like country, state, province, city, or postal codes – denoted by a small globe icon – Tableau will automatically generate the longitude and latitude values for the center points of each geographic entity displayed in your visualization. If for some reason, Tableau doesn’t recognize a geographic dimension, you can change the geographic role of the field by right-clicking on the field and selecting the appropriate geographic role. Figure 2.6 shows a map created by using country, state, and city, then using Show Me to display the symbol map.
The Map Option menu seen on the left was exposed from the map menu, Map Option Selection. The marks in the map were styled from the Color button— changing the color transparency and adding a black border. Overlapping clusters of marks are easier to see. Hovering over any mark exposes the Tooltip that includes the geographic entities exposed in the marks card. The summary card was exposed in the view so that you can see that 1,726 marks are plotted.
If Tableau failed to recognize any location, a small gray pill would appear in the lower right of the map. Clicking on that pill would expose a menu that would help you identify and correct the geocoding. Chapter 5 will cover Tableau’s mapping capabilities in detail.
Figure 2.6 Longitude and latitude generated measures
NUMBER OF RECORDS
When you build a view using Measure Names as a filter to allow the user to select from a subset of measures. You want “Number of Records” in the tooltip, but NOT in the filter values.
The final generated value provided is a calculated field near the bottom of the measures shelf called Number of Records. Any icon that includes an equals sign denotes a calculated field. The number of the records calculation formula includes only the number one. This is how Tableau generates record counts. The bar chart in Figure 2.7 displays the record count for each customer segment and grand total.
Number of Records help you understand the row count in your data set. It is particularly helpful when you begin to join other tables. Monitoring how the record count changes, helps you to understand data quality issues or design challenges that you may need to address.
Figure 2.7 Number of records