Using performance recorder to improve performance
This section includes information on creating performance recordings, and suggestions for improving Tableau’s performance. One of the important things to understand is that Tableau is only as fast as your data source. So if your data source responds slowly to queries, then Tableau must wait for the data source before displaying results.
Earlier, at the conclusion of the “Bringing it all together with dashboards” post, you learned how to use Tableau’s performance recorder in Tableau Desktop. There is also a separate performance recorder that allows you to record and view information about Tableau Server performance at the workbook level.
Prior to Tableau version 8, this data had to be collected and scrutinized manually from log files or via a third-party application created by InterWorks. The performance recorder essentially generates a Tableau workbook about your workbook’s performance. It captures information about a range of events and displays it visually.
Figure 9.10 Enabling performance recording
The performance recorder is disabled on Tableau Server by default. To use it, you must enable it on a per-site basis. To activate the performance recorder on the server, navigate to the administration Sites page and select the site on which you want to enable it. Click Edit. In the Edit Site dialog box, select the Allow Performance Recording check box and click OK. Figure 9.10 shows the Edit Site dialog box with the option selected.
To use the performance recorder on a view, you must append the following string to the end of the view’s URL:
?:record_performance=yes
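The string is easy to add by hand in the browser. If you want to script the same step, a minimal Python sketch could look like the following. The function name and the example server and view names are hypothetical, and the sketch simply appends the flag at the end of the URL.

```python
def add_record_performance(view_url: str) -> str:
    """Return view_url with the performance recording flag appended."""
    flag = ":record_performance=yes"
    if flag in view_url:
        return view_url  # already enabled, leave the URL untouched
    # Use "?" for the first parameter, "&" if the URL already has parameters.
    separator = "&" if "?" in view_url else "?"
    return view_url + separator + flag


# Hypothetical server and view names, for illustration only.
print(add_record_performance("http://tableau.example.com/views/Sales/Overview"))
```

Plain string handling is used here on purpose: a general-purpose URL encoder would escape the leading colon in the parameter name.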
A Show Performance Recording command then appears in the view’s status bar. Clicking this link opens a view that is generated from the recorded performance data. This view does not update automatically. To see the most current data, close the view and open it again. The performance recording continues capturing data about interactions with the view until the user navigates away or removes the string from the URL. Figure 9.11 displays an example of the information available in the performance recording summary display.
Figure 9.11 A performance summary workbook
The example recording was taken while interacting with one of the sample Tableau Server workbooks supplied with your server license. The dashboard that the performance recorder generates contains three views:
Timeline Gantt chart
The timeline Gantt chart shows when each event occurs, by workbook, dashboard, or worksheet. Each bar’s horizontal position indicates the event’s start time, and its length indicates the event’s duration. More generally, Gantt bar marks in Tableau use days as the time unit, so if your data is in smaller increments, you can use a calculated field to convert the time period to days. Gantt charts are a great way to visualize the timeline of a project, allowing you to see the duration of separate tasks within it.
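The conversion to days mentioned above is plain arithmetic. In Tableau it would live in a calculated field; the Python sketch below only illustrates the calculation, and the function name is an assumption made for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in one day


def seconds_to_days(seconds: float) -> float:
    """Convert a duration in seconds to a fraction of a day."""
    return seconds / SECONDS_PER_DAY


# Six hours expressed in days.
print(seconds_to_days(6 * 60 * 60))  # 0.25
```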
The uppermost view in a performance recording dashboard shows the events that occurred during the recording, arranged chronologically from left to right. The bottom axis shows elapsed time since Tableau started, in seconds.
In the Timeline view, the Workbook, Dashboard, and Worksheet columns identify the context for the events. The Event column identifies the nature of the event, and the final column shows each event’s duration and how it compares chronologically to other recorded events.
The events sorted by time
This section of the workbook shows the duration of recorded events in descending order. This is useful for observing the execution time of each event that occurs during the performance recording. This will help you identify any lengthy events that may be the cause of performance problems.
Events with longer durations can help you identify where to look first if you want to speed up your workbook.
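If you copy the recorded events out of the summary workbook, the same "slowest first" ordering is easy to reproduce. The sketch below is illustrative only: the `Event` structure is an assumption, and the durations are made-up values rather than real measurements.

```python
from typing import NamedTuple


class Event(NamedTuple):
    worksheet: str
    name: str
    duration_s: float


# Made-up values for illustration; real values come from a recording.
events = [
    Event("Sales Map", "Geocoding", 0.4),
    Event("Sales Map", "Executing Query", 3.2),
    Event("Overview", "Computing Layout", 1.1),
]


def slowest_first(recorded):
    """Return events sorted by duration, longest first."""
    return sorted(recorded, key=lambda e: e.duration_s, reverse=True)


for event in slowest_first(events):
    print(f"{event.duration_s:5.1f}s  {event.name}  ({event.worksheet})")
```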
Different colors indicate different types of events. The events that can be recorded are computing layouts, connecting to the data source, executing a query, generating an extract, geocoding, blending data, and server rendering (Tableau Server only).
The workbook also displays the query text for any specific event that you want to examine in detail. You can access the detail by clicking any of the green Executing Query events in the bar chart. This handy feature lets you review any query text of interest without leaving the Tableau performance summary dashboard.
If you click an Executing Query event in either the Timeline or Events section of a performance recording dashboard, the text for that query is displayed in the Query section.
Sometimes the query is truncated and you’ll need to look at the Tableau log to find the full query. Most database servers can give you advice about how to optimize a query by adding indexes or other techniques. See your database server documentation for details.
Sometimes for efficiency, Tableau intelligently combines multiple queries into a single query against the data. In this case, you may see an Executing Query event for the Null worksheet and zero queries being executed for your named worksheets.
The performance summary report generated by the performance recorder tells you which specific events may be responsible for slow performance. Once you understand which events affect performance the most, try the following tactics to address the problem.
Query execution represents the time that it takes for the data source to execute a query and return the data requested by the worksheet. If the data source is a database, it is very helpful to see the queries issued by Tableau in order to identify inefficiencies. Common issues include poor indexing strategies, fragmented indexes, database contention, insufficient database resources, and inefficient SQL queries. If the data source is the Tableau data engine, there are fewer troubleshooting options.
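To illustrate how an indexing problem shows up, here is a small sketch that uses SQLite (from Python's standard library) as a stand-in for your real database. The table, column, and index names are invented for this example, and your own database will have its own tools for inspecting query plans.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)  # 4th column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("West", 10.0), ("East", 20.0), ("West", 5.0)],
)

sql = "SELECT SUM(amount) FROM sales WHERE region = 'West'"
plan_before = query_plan(conn, sql)  # no index yet: full table scan
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = query_plan(conn, sql)   # the filter can now use the index

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should describe a scan of the whole table, while the second should mention the new index. The exact wording of the plan text varies between SQLite versions.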
Tableau Desktop recognizes a set of geographic roles that can be used to automatically geocode your data and create map views. For example, Tableau Desktop recognizes country names, state/province names, city names, and area codes.
The geocoding event represents the time Tableau needs to locate geographical dimensions. If this event type is consuming too much time, consider geocoding your records in the source data set and passing a pre-calculated latitude and longitude to Tableau rather than having Tableau generate the geocodes while rendering the map view. Once you have created a .csv file with custom geocoding, you can import that file into Tableau.
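One way to pre-calculate coordinates is to join your records against a lookup table before Tableau ever sees them. The sketch below assumes a small in-memory lookup with approximate city coordinates. The column names are illustrative and are not a specification of Tableau's custom geocoding file format.

```python
import csv
import io

# Lookup of city -> (latitude, longitude); approximate coordinates.
CITY_COORDS = {
    "Seattle": (47.6062, -122.3321),
    "Austin": (30.2672, -97.7431),
}


def add_coordinates(rows):
    """Yield each row with Latitude and Longitude columns added."""
    for row in rows:
        lat, lon = CITY_COORDS.get(row["City"], (None, None))
        yield {**row, "Latitude": lat, "Longitude": lon}


# Made-up sales figures, for illustration only.
source = [{"City": "Seattle", "Sales": 120}, {"City": "Austin", "Sales": 95}]
geocoded = list(add_coordinates(source))

# Write to an in-memory buffer; a real script would write to a .csv file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["City", "Sales", "Latitude", "Longitude"])
writer.writeheader()
writer.writerows(geocoded)
print(buffer.getvalue())
```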
Connecting to the data source
Connecting to the data source represents the time required for Tableau to connect to the data source. This event is typically not a large percentage of total worksheet time. In rare cases, network or data source issues might extend connection times. To rule out these issues, examine the network topology between the Tableau Server machine and the data source server.
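As a rough first check, you can time a bare TCP connection from the Tableau Server machine to the data source. The sketch below is an assumption about how you might do that. It measures only the network connection, not database authentication or driver start-up, and the host and port in the comment are hypothetical.

```python
import socket
import time


def time_tcp_connect(host: str, port: int, timeout: float = 5.0) -> float:
    """Return the seconds taken to open (and close) a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it immediately
    return time.perf_counter() - start


# Example call (hypothetical host and port):
#   time_tcp_connect("db.example.com", 5432)
```

If this number is consistently high, the delay is more likely in the network path than in Tableau or the database itself.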
Computing the view layout can sometimes take a very long time. It would help if Tableau allowed the user to toggle automatic view layout computation off, so that multiple changes (for example, axis settings) could be made before the view layout is recomputed. This would avoid time-consuming screen updates that are wasted when the user needs to make further edits anyway.
The computing layout event is the time needed for Tableau Server to compute the visual layout of the worksheet. This can be influenced by server resource contention as well as worksheet complexity. The more marks that are visualized within the workbook, the more time the workbook will require to load and refresh. Restrict the number of marks displayed at once through techniques such as actions, filters, and aggregation. Large crosstabs can be particularly costly. If simplifying the visuals does not yield a noticeable improvement, the server may need additional resources.
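To see why aggregation reduces the number of marks, consider the simplified sketch below. It treats each distinct dimension value as one mark, and the rows are made-up values used only for illustration.

```python
from collections import defaultdict

# Made-up order rows: (region, city, sales).
rows = [
    ("West", "Seattle", 120.0),
    ("West", "Portland", 80.0),
    ("East", "Boston", 95.0),
    ("East", "Albany", 40.0),
]


def aggregate(rows, key_index):
    """Sum sales per distinct value of the chosen dimension column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


by_city = aggregate(rows, 1)    # one mark per city
by_region = aggregate(rows, 0)  # one mark per region
print(len(by_city), "marks by city;", len(by_region), "marks by region")
```

Moving from the city level to the region level halves the number of marks in this toy example. On real data the reduction can be far larger.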
Extracts are saved subsets of a data source that you can use to improve performance or to take advantage of Tableau functionality that is not available in the underlying data. When you create an extract, you can reduce the total amount of data by defining filters and limits. After you create an extract, you can refresh it with data from the original data source. When refreshing, you can either fully refresh the data, which replaces all of the extract contents, or incrementally refresh the extract, which only adds rows that are new since the previous refresh.
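The difference between the two refresh modes can be sketched in a few lines of Python. This is a conceptual illustration, not Tableau's implementation. It assumes each row carries an ever-increasing `id` column that identifies new rows, and the function names are invented for this example.

```python
def full_refresh(source_rows):
    """Replace the extract with a fresh copy of every source row."""
    return list(source_rows)


def incremental_refresh(extract_rows, source_rows, key="id"):
    """Append only source rows whose key is greater than the extract's maximum."""
    last_seen = max((row[key] for row in extract_rows), default=None)
    new_rows = [r for r in source_rows if last_seen is None or r[key] > last_seen]
    return extract_rows + new_rows


# Made-up rows for illustration.
source = [{"id": 1, "sales": 10}, {"id": 2, "sales": 20}]
extract = full_refresh(source)

source.append({"id": 3, "sales": 30})            # a new row arrives
extract = incremental_refresh(extract, source)   # only id 3 is added
print([row["id"] for row in extract])  # [1, 2, 3]
```

Note that in this sketch an incremental refresh never revisits rows already in the extract, so changes to existing rows are only picked up by a full refresh.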
The amount of time that the data engine spends generating an extract is called the generating extract event. The size of the data source (the number of rows and columns), along with the time Tableau spends compressing and sorting the data, are the major factors affecting the time required to generate extract files.
If your extract file is taking too long to refresh in your environment, it may be possible to speed up the process by removing unnecessary columns from the extract. This will reduce the time required for generating, sorting, and compressing the remaining columns. If the problem persists, make sure that all fields have the appropriate data type assigned to them in the underlying database. Poorly defined field types in the source database can affect the performance of queries run against the extract file.
If extract generation speeds are still not good enough, try running more data engine processes or placing them on their own worker instance.
The amount of time that Tableau Server spends performing data blends is the blending data event. This event can take a long time when working with larger amounts of data from the blended data sources. Filtering before the blend, at the data source level, can be more effective. If possible, consider moving the data into a single data source so that joins can be used instead of blending.
The amount of time that Tableau Server spends rendering the computed layouts into a format that it can send to the client browser is the server rendering event. The time it takes to complete this event can be affected by the load on the VizQL processes as well as by the complexity of the layouts. Refer to the computing layouts event for guidance.
Whether specifically mentioned or not, most of these events can be sped up by restricting the amount of data visualized through filtering or aggregation. Faster hardware or more resources on Tableau Server can also help. As far as workbook performance is concerned, if a workbook does not perform well in Tableau Desktop, it will not perform well on Tableau Server either. For this reason, you should use the performance recorder in Tableau Desktop to troubleshoot performance issues there before publishing an under-performing workbook to the server.