Package-level reference for beautifulsoup4 on PyPI — install variants, parser-backend selection (lxml/html5lib/html.parser), and alternatives.
Package-level reference for Dagster on PyPI — install variants, the dagster-* plugin family, version policy, and alternatives.
Package-level reference for duckdb — install, versioning, extensions, and gotchas. In-process columnar OLAP for Python.
Package-level reference for the jupyter meta-package on PyPI — install variants, what it pulls in, version policy, and alternatives.
Package-level reference for matplotlib on PyPI — install variants, backends, version policy, extras, and alternatives.
Package-level reference for modin — install, backend extras, versioning, and gotchas. Speeds up existing pandas code with a one-line import swap.
Package-level reference for numpy — install, versioning, ABI breaks, extras, and gotchas. The bedrock of the Python scientific stack.
Package-level reference for pandas — install, versioning, Python compatibility, extras, and gotchas. The de-facto DataFrame library on PyPI.
Package-level reference for Pillow on PyPI — install variants, format-specific native deps, version policy, and alternatives.
Package-level reference for polars — install, versioning, extras, and gotchas. The Rust-powered Arrow-native alternative to pandas.
Package-level reference for Prefect on PyPI — install variants, version policy, cloud-vs-OSS extras, and alternatives.
Package-level reference for scikit-learn — install, versioning, extras, and gotchas. The de-facto classical-ML library on PyPI.
Package-level reference for scipy — install, versioning, submodules, license caveats, and gotchas. Optimization, statistics, signal processing, and linear algebra.
Package-level reference for the streamlit framework on PyPI — install variants, version policy, extras, and alternatives.
Package-level reference for unstructured on PyPI — install variants, the huge extras tree, system-level dependencies, and alternative parsers.
Run SQL through SPUFI, drive Db2 with DSN subsystem commands, BIND packages and plans, schedule DSNTEP2 in JCL, query the SYSIBM catalog, and generate declarations with DCLGEN.
Slice, filter, map, and transform JSON data from the command line. Covers all essential filters, built-in functions, select, map, reduce, streaming, jq 1.7/1.8 additions, and real-world API response processing.
Comprehensive reference for qsv: count, headers, stats, moarstats, select, search, sort, dedup, frequency, join, sqlp, luau, apply, schema, validate, sample, split, MCP server, and more — with examples and outputs.
Parse, search, and mutate HTML/XML with BeautifulSoup 4. Covers parser choice (html.parser/lxml/html5lib), find/find_all/select, tree navigation, attribute access, and pairing with requests/httpx/playwright for end-to-end scraping.
Encode and decode JSON in Python with the stdlib json module. Covers dumps/loads, indent/sort_keys/separators, custom default= and JSONEncoder, object_hook decoding, JSONL streaming, and orjson/ujson/msgspec comparison.
Build classical ML pipelines with scikit-learn. Covers the estimator API, train_test_split, Pipeline, ColumnTransformer, cross-validation, metrics, and model persistence.
Build interactive web apps for data and ML in pure Python with Streamlit. Covers widgets, layout, session state, caching, multipage apps, and deployment patterns.
Build, schedule, and observe data pipelines as software-defined assets with Dagster. Covers assets, jobs, schedules, sensors, resources, partitions, and the Dagster UI.
Run fast analytical SQL queries in-process with DuckDB. Covers Python API, CSV/Parquet ingestion, pandas interop, Arrow, window functions, and persistent databases.
Speed up pandas workloads across all CPU cores with a one-line import swap. Covers Ray and Dask backends, config tuning, pandas interop, and when modin wins vs polars.
High-performance DataFrames with Polars and its lazy expression API. Covers read/write, select, filter, group_by, joins, LazyFrame, datetime, string operations, and pandas interop.
Build, schedule, and observe Python workflows with Prefect. Covers flows, tasks, retries, schedules, deployments, caching, concurrency, and Prefect Cloud.
Run interactive Python notebooks with Jupyter. Covers JupyterLab setup, cell types, keyboard shortcuts, magic commands, nbconvert export, and common pitfalls.
Create publication-quality 2-D plots with matplotlib. Covers pyplot basics, subplots, savefig, common chart types, and the show-vs-save pitfall.
Create and manipulate N-dimensional arrays with NumPy. Covers array creation, broadcasting, vectorized math, indexing, and matrix operations.
Load, filter, transform, and aggregate tabular data with pandas. Covers DataFrame creation, read_csv, groupby, merge, and the SettingWithCopyWarning pitfall.
Open, resize, crop, convert, and save images with Pillow (PIL fork). Covers format conversion, filters, drawing, and EXIF handling.
Statistical distributions, optimization, integration, signal processing, and linear algebra with SciPy. Builds on NumPy arrays.