For today’s blog post I wanted to tackle how to practice using business intelligence tools without spending any money. There are certain restrictions with Tableau Desktop Public (the free version of Tableau), but there are workarounds to practice using SQL, Python, and Tableau for free.
Free Option for SQL: MySQL
One good option to practice SQL and build your own scripts or create views based on the tables you’re playing around with is MySQL. MySQL is an open source relational database management system. Through a client-server model you’re able to perform various data operations from your computer (client), which are then directed to the server to obtain results. To apply this, you will be installing MySQL Workbench and MySQL Server. The best option is to go with the MySQL Installer in MySQL Community downloads:
Once downloaded, in setup type you will opt for the custom option. This option allows you to select MySQL Workbench under Application->MySQL Workbench->MySQL Workbench version number and move it to the right side. Same goes for the server, to do so click on MySQL Servers->MySQL Server->the version you prefer and add it to the list on the right side. In the next section unless you face an issue, you shouldn’t be prompted to download additional software or have “manual” indicated under status. The rest of the installation and configuration is pretty straightforward, just remember to leave “Start the MySQL Server at System Startup” checked in the Windows Service section or you will have to setup the server manually every time you use it. Now when you want to work with SQL, all you need to do is open MySQL Workbench and enter the login information you established:
Free Option for Python: Jupyter Notebook
You’ve probably heard of Jupyter before as it is one of the best free options to work with Python by allowing you to edit your code through a web browser. Code is saved in a file with extension .ipynb and you can easily view the corresponding markdown, code and output of what you’ve written. One useful option is to install Anaconda, which is a toolkit that is not only handy to work with Python but also R since it contains RStudio.
Free Option for BI Tool: Tableau Desktop Public
Tableau is a business intelligence tool that allows to load data and play around with it to create intuitive charts. Unfortunately, Tableau has a paid version and an unpaid version with some limitations. The unpaid version is called Tableau Desktop Public and your data is limited to one million rows. You can only save your file on a public server (so you shouldn’t be working with confidential information since anyone can stumble upon your visualizations) and you can connect to data sources such as Microsoft Excel, text, JSON, Microsoft Access, PDF, Google Sheets, and even web data connectors. What you can’t do is connect to a live feed or database, which means you only work with extracts. It should be mentioned that if you work with Google Sheets it updates once a day. To download:
So how do I practice for free?
Given the intent is to only be able to practice using the tools and not build a business setup you can create a workflow from SQL to Python to Tableau through the use of CSV files:
It’s true that there are ways such as using TabPy to use Python code in Tableau or working with SQL from Python by establishing a connection, but the way illustrated above allows to easily focus on using the tools without complications or dealing with restrictions due to Tableau’s free version.
For the SQL part you can either download a SQL text file that contains the script to create a database structure with various tables or copy paste and build your own as instructed in these great examples:
There are also some random data generators like https://www.mockaroo.com/ or http://filldb.info/. The former focuses on table creation, so you would need to nest them under your personalized database that you can build with the CREATE DATABASE statement. The latter one allows you to upload your own schema to start from there.
Once you have a database with data you need to devise a business question that would be answered with business intelligence, remembering that business intelligence looks at the analysis of past and present data to look at what happened and why did it happen. What will your practice consist of in regards to SQL? Writing the necessary queries to build a view (a virtual table containing the results from your SQL statements), which you can then save as an CSV file.
In Python you can then take that CSV file and load it as a dataframe using the pandas library. As easy as typing:
Import pandas as pd
dfName = pd.read_csv(‘fileName.csv’) <- if the file is located in a different directory from the notebook then you will have to type the whole address.
One way of ensuring you’re opening your notebook on the right location is using the Anaconda prompt and type:
\folder\folder> jupyter notebook
Python comes into play as the tool to do some exploratory data analysis (EDA) through plots to visually understand the data based on your business question. It will allow you to detect outliers, see its structure, test assumptions, and preliminary insights. For visualization you can use libraries such as seaborn or matplotlib. Once you gain the necessary understanding and maybe done additional processing you can proceed to saving the dataframe as a CSV file using:
Then you can load this saved CSV file to Tableau under the Connect options in the starting menu by clicking on Text file. This will load a preview of your table, in this step you can join additional columns from other files or use the Data Interpreter to clean the file if necessary.
To start working simply click on the sheet tab located at the bottom:
This will take you to the workspace where the dimensions and even some default measures have been established. Now you’re ready to practice and play around with Tableau. All of this without spending anything to get to know these tools.