On this page, we’re going to start actually doing things with Python. Before we get started, let’s do a quick `uv` refresher.
To work with Python, use our Python library, and call on dependencies, we need a Python virtual environment, which `uv` will handle for us. To make a fresh environment, run `uv venv`. To “activate” it and use anything installed in it, run `source .venv/bin/activate`. To sync it up with the dependencies in `pyproject.toml`, run `uv sync`. Note also that whenever you use `uv add` to add a dependency or `uv remove` to do the reverse, `uv` will automatically create the environment if it needs to and sync it up.
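If you are ever unsure whether the environment is actually active, a quick check from Python itself can help. The following is a minimal sketch using only the standard library; the function names `in_virtualenv` and `describe_environment` are made up for this illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_environment() -> str:
    """Summarize which interpreter is running and whether it is in a venv."""
    status = "inside a virtual environment" if in_virtualenv() else "using the base interpreter"
    return f"{sys.executable} ({status})"


if __name__ == "__main__":
    print(describe_environment())
```

After `source .venv/bin/activate`, the reported interpreter path should sit inside the project's `.venv` directory.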
Quarto was developed by the RStudio company, now called Posit, to be a more broadly useful successor to RMarkdown notebooks. Quarto documents still look mostly the same as RMarkdown, which is to say, the notebooks are valid Markdown, with code blocks that can be run.
Editors like RStudio, VSCode, and Positron will allow you to run each block on its own, almost exactly like a Jupyter notebook, but by default, all code will be run each time the document is rendered, which guarantees that the final output matches the code in the document.
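To make the shape of such a document concrete, here is a small sketch that builds a minimal Quarto source file as a Python string and writes it to disk. The file name `hello.qmd`, the title, and the cell contents are all hypothetical; the point is only the layout: YAML front matter, ordinary Markdown, and a fenced cell tagged `{python}`.

```python
from pathlib import Path

# Three backticks, built here so this example can itself sit inside a fence.
FENCE = "`" * 3

# Hypothetical minimal Quarto document: YAML front matter, Markdown prose,
# and one executable Python cell marked with {python}.
QMD_SOURCE = "\n".join(
    [
        "---",
        'title: "Hello, Quarto"',
        "format: html",
        "---",
        "",
        "Some ordinary Markdown text.",
        "",
        FENCE + "{python}",
        "print(1 + 1)",
        FENCE,
        "",
    ]
)


def write_example(path: Path) -> Path:
    """Write the example document to disk and return its path."""
    path.write_text(QMD_SOURCE, encoding="utf-8")
    return path


if __name__ == "__main__":
    print(write_example(Path("hello.qmd")).read_text(encoding="utf-8"))
```

Running `quarto render hello.qmd` on the resulting file should execute the cell and place its output in the rendered HTML.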
I mentioned Quarto is a successor to RMarkdown and also mentioned that it works like Jupyter. Why not stick with RMarkdown or Jupyter?
RMarkdown’s primary downside on its own, of course, is that it is built around R: its knitr engine can run chunks in a few other languages, such as bash, but R is the one it is really designed for. It’s great, though, that its source code is just Markdown. Jupyter doesn’t share the same weakness, as it can run Julia, Python, and R (Jupyter is a portmanteau of those three languages’ names). That said, Jupyter’s source format is fairly complicated JSON. It also embeds outputs in that JSON, which makes it very easy to break the reproducibility chain by updating a code block but forgetting to re-run it and update its outputs. RMarkdown wins here because everything is recomputed at render time, and outputs are not included in the source format.
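The stale-output problem is easier to see with a concrete notebook in hand. The sketch below hand-writes a one-cell notebook in the nbformat 4 JSON layout (the cell contents are invented for illustration), edits the code without re-running it, and shows that the stored output no longer matches.

```python
import json

# A hand-written notebook in the nbformat 4 layout: each code cell stores
# its source *and* the outputs from the last time it was executed.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Edit the code without re-running the cell, as a text edit or a merge might.
cell = notebook["cells"][0]
cell["source"] = ["print(1 + 2)"]

# The saved output still shows the old result.
stored_output = "".join(cell["outputs"][0]["text"])
print("source:", "".join(cell["source"]))
print("stored output:", stored_output.strip())

# This is roughly what ends up on disk, and in every git diff.
serialized = json.dumps(notebook, indent=1)
```

Nothing in the file format flags the mismatch; only re-executing the notebook brings source and output back in sync.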
So, to summarize, RMarkdown is simpler in that it’s just markdown, more reproducible, but also less useful across the big three data science languages. Jupyter is a bit more flexible and well-supported across editors and languages, but also less reproducible and with a more complicated, git-unfriendly source format.
Quarto attempts to take the good parts of both systems, blend them together, and then add a huge number of additional features. It’s just Markdown, like RMarkdown, and it also adopts RMarkdown’s render-time reproducibility. But like Jupyter, it supports multiple languages: it can run Python, R, Julia, and a JavaScript data visualization framework called Observable, and it can syntax-highlight most other languages. That’s why in our setup tutorial, YAML and TOML code blocks are highlighted rather than plain text.
On top of that, Quarto extends RMarkdown’s rendering capabilities to virtually every format you might want to render to, including `.epub` books, HTML, PowerPoint presentations, Word documents, `reveal.js` presentations, wikis, etc. A full list of supported formats is here.
In this document, we’re going to use Python in Markdown with some of the dependencies we install with `uv` to display some interactive dashboards. We’ll also show how to use a GitHub Action to check for new data from LabKey at a set interval.
For this demo, we’re going to reach for the Altair package for generating interactive data visualizations and `itables` for browsing tables. Rather than using Shiny or some other dashboard layout system, we’re just going to use Quarto’s built-in layout system, which, surprise surprise, is just Markdown. It has functionality for tabs, dividing each page into rows and columns, sidebars, etc. In short, everything we need.
While I chose Altair here because it’s much faster than the competition and has a nice syntax, there’s a smorgasbord of data viz options for Python users, the most popular of which are Plotly for interactive visualizations and Matplotlib for static visualizations.
To install these, run `uv add` with the package names, e.g. `uv add altair itables pandas geopandas palmerpenguins`.
As you can see, I also install pandas, geopandas, and palmerpenguins for demonstration purposes.
Note that the following tutorials from the Quarto docs are extremely helpful:
[Interactive table of the Palmer Penguins dataset with columns species, island, bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, sex, and year.]