Quick Start to Home Assistant Data Science for Home Assistant Core users
In this quick start guide, we're going to show you how to set up and use JupyterLab, a data science environment. JupyterLab is the tool of choice for data scientists around the globe. Using JupyterLab we will run some reports on your own data. All reports are editable so you can quickly start experimenting and exploring more!
This guide explains how to install and set up JupyterLab and assumes a fresh Ubuntu 18.10 installation. However, it should be easy to adapt to most other platforms. In case you are using Home Assistant, please check out the quick start to Home Assistant Data Science.
Preparing the system
Update the system and install the required dependencies.
Preparing the user environment
Create a directory for your notebooks and check out the Home Assistant data science notebooks.
Create a virtual environment and activate it.
Install the Python requirements.
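If you would rather script these two steps than type them by hand, the following is a minimal sketch using Python's standard `venv` module. The function names and the requirements file you pass in are assumptions made for illustration; they are not taken from the notebooks repository.

```python
import os
import subprocess
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    env_dir = Path(env_dir)
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # The interpreter lives in "Scripts" on Windows and in "bin" elsewhere.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_requirements(env_python, requirements_file):
    """Install packages into the environment by calling its own pip."""
    subprocess.run(
        [str(env_python), "-m", "pip", "install", "-r", str(requirements_file)],
        check=True,  # raise an error if pip fails
    )
```

Calling `create_environment("venv")` and then `install_requirements(...)` with the returned interpreter and the path to your requirements file installs the packages without activating the environment, because pip is run through the environment's own interpreter.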
Launch JupyterLab
Launch the JupyterLab server from the command line.

If you use the built-in database rather than a separate one, you can skip the next section.
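The built-in database is a SQLite file, so you can check that it is readable before opening any notebook. The sketch below uses only Python's standard library and opens the file read-only so it cannot modify your data. The function name `list_tables` is made up for this example, and the location of the database file on your system is an assumption you need to fill in yourself.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the table names in a SQLite file, opened read-only."""
    # Build a file: URI so that SQLite's read-only mode can be requested.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

For example, `list_tables("/path/to/your/config/home-assistant_v2.db")` should return a list of table names if the file can be read; the path shown here is a placeholder.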
PostgreSQL
You need a user to access your database. This guide assumes a PostgreSQL database, but other databases should work as well.
If you are using PostgreSQL:
Run where your Home Assistant database lives:
The user needs sufficient rights to the Home Assistant database.
Running your first report
JupyterLab works with Jupyter Notebooks. Think of a notebook like a Word document that can also contain code to explore your data.
We have prepared a few notebooks that will help you get started. Let's start with the notebook GETTING STARTED.ipynb that is available in the cloned repository. You can find it in the home-assistant-notebooks directory. This notebook automatically reads your Home Assistant data and generates a few interesting statistics about it.
Open the GETTING STARTED.ipynb notebook.

For PostgreSQL you have to modify the database connection settings, replacing the password and <host> with your own values.
db = HassDatabase("postgresql://datascience:my_secret_password@<host>/hass")
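A password containing characters such as `@` or `/` can break a connection URL of this shape. The sketch below assembles the URL with the password percent-encoded, using only the standard library. The helper name `build_postgres_url` and the example host are hypothetical, and it assumes the URL is parsed the way SQLAlchemy-style database URLs usually are.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, host, database, port=None):
    """Assemble a postgresql:// URL, percent-encoding user and password."""
    location = host if port is None else f"{host}:{port}"
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{location}/{database}"
    )


# Example with a made-up password and host:
url = build_postgres_url("datascience", "p@ss/word", "db.example.org", "hass")
print(url)  # postgresql://datascience:p%40ss%2Fword@db.example.org/hass
```

The resulting string can then be passed to `HassDatabase(...)` in place of the hand-written URL.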
To run the report, press the ▶️ button in the toolbar. This will take you through the notebook step by step.

If the connection works, you can click "Run" in the top menu bar and choose "Run All Cells".
The notebook will now generate the full report. Depending on the size of your database, this might take some time. The little square brackets with an asterisk ([*]) to the left of the Python code (a cell) indicate that the cell is currently being executed or is about to be executed. Once executed, it changes to [<number>] (the number represents the order in which cells were executed).
The cool thing about these reports is that you can edit the Python code and execute it again to get the latest results. You don't even need to execute the whole report again to see most changes. After each change, just run the cell by clicking the ▶️ button in the toolbar. Executing a cell runs its Python code and shows the latest results.
What's next
You now have all the tools you need to do data science. If you want to see more notebooks that people have created for Home Assistant, check out the HASS Data Detective usage examples.
If you want to learn more about what data Home Assistant tracks, check out the data primer.