0. Prerequisites
Knowledge
"But you know what I like more than materialistic things? Knowledge." Tai Lopez
How much Rust do you need to know to write your own Polars plugin? Less than you think.
I'd suggest starting out with the Rustlings course, which provides some fun and interactive exercises designed to make you familiar with the language. I'd suggest working through the following sections:
- 00 intro
- 01 variables
- 02 functions
- 03 if
- 05 vecs
- 12 options
- 13 error handling
You'll also need basic Python knowledge: classes, decorators, and functions.
Alternatively, you could just clone this repo and then hack away at the examples trial-and-error style until you get what you're looking for - the compiler will probably help you more than you're expecting.
Software
To get started, please install cookiecutter.
Then, from your home directory (or wherever you store your Python projects) please run
When prompted, please enter the following (let's suppose your name is "Maja Anima", but replace that with your preferred name):
[1/3] plugin_name (Polars Cookiecutter): Minimal Plugin
[2/3] project_slug (polars_minimal_plugin):
[3/3] author (anonymous): Maja Anima
This will create a folder called minimal_plugin. Please navigate to it with cd minimal_plugin.
Next, create a Python 3.8+ virtual environment, and install:
- polars>=1.3.0
- maturin>=1.4.0
Finally, you'll also need to install Rust.
That's it! However, you are highly encouraged to also install rust-analyzer if you want to improve your Rust-writing experience by exactly 120%.
What's in a Series?
If you take a look at a Series such as
In [9]: s = pl.Series([None, 2, 3]) + 42
In [10]: s
Out[10]:
shape: (3,)
Series: '' [i64]
[
null
44
45
]
then it looks as though it holds three values: [null, 44, 45].
However, if you print out s._get_buffers(), you'll see something different:
- s._get_buffers()["values"]: [42, 44, 45]. These are the values.
- s._get_buffers()["validity"]: [False, True, True]. These are the validities.
So we don't really have integers and null mixed together into a single array - we have a pair of arrays, one holding values and another one holding booleans indicating whether each value is valid or not.
If a value appears as null to you, then there's no guarantee about what physical number is behind it! It was 42 here, but it could well be 43, or any other number, in another example.
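To make the values/validity split concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how Polars is implemented: the NullableInts class and its methods are hypothetical names, and the choice of 0 as the placeholder behind a null is an assumption made so that the numbers match the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NullableInts:
    """Toy nullable integer column: physical values plus a validity mask."""

    values: List[int]
    validity: List[bool]

    @classmethod
    def from_list(cls, items: List[Optional[int]], placeholder: int = 0) -> "NullableInts":
        # Assumption: a null slot still stores *some* number; we pick 0 here.
        return cls(
            values=[placeholder if x is None else x for x in items],
            validity=[x is not None for x in items],
        )

    def add_scalar(self, n: int) -> "NullableInts":
        # Arithmetic touches every physical value, valid or not;
        # the validity mask is carried over unchanged.
        return NullableInts([v + n for v in self.values], list(self.validity))

    def to_list(self) -> List[Optional[int]]:
        # The logical view hides whatever number sits behind an invalid slot.
        return [v if ok else None for v, ok in zip(self.values, self.validity)]


s = NullableInts.from_list([None, 2, 3]).add_scalar(42)
print(s.values)     # [42, 44, 45]
print(s.validity)   # [False, True, True]
print(s.to_list())  # [None, 44, 45]
```

With a different placeholder the first physical value would differ while the logical view stays the same, which is why code should never rely on the number behind a null.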
What's a chunk?
A Series is backed by chunked arrays, each of which holds data which is contiguous in memory.
Here's an example of a Series backed by multiple chunks:
Chunked arrays will come up in several examples in this tutorial.