data (33)

beautifulsoup4 — HTML/XML Parsing Library

Package-level reference for beautifulsoup4 on PyPI — install variants, parser-backend selection (lxml/html5lib/html.parser), and alternatives.

pip package scraping html parsing

dagster — Asset-Oriented Data Orchestration

Package-level reference for Dagster on PyPI — install variants, the dagster-* plugin family, version policy, and alternatives.

pip package orchestration pipelines

duckdb — Embedded Analytical SQL

Package-level reference for duckdb — install, versioning, extensions, and gotchas. In-process columnar OLAP for Python.

pip package duckdb sql analytics

jupyter — Meta-Package for Interactive Notebooks

Package-level reference for the jupyter meta-package on PyPI — install variants, what it pulls in, version policy, and alternatives.

pip package notebooks interactive data

matplotlib — Foundational Python Plotting Library

Package-level reference for matplotlib on PyPI — install variants, backends, version policy, extras, and alternatives.

pip package plotting visualization data

modin — Drop-in Parallel pandas

Package-level reference for modin — install, backend extras, versioning, and gotchas. Speeds up existing pandas code with a one-line import swap.

pip package modin pandas parallel

numpy — N-Dimensional Array Foundation

Package-level reference for numpy — install, versioning, ABI breaks, extras, and gotchas. The bedrock of the Python scientific stack.

pip package numpy arrays scientific

pandas — DataFrames for Python

Package-level reference for pandas — install, versioning, Python compatibility, extras, and gotchas. The de facto DataFrame library on PyPI.

pip package pandas dataframes data

Pillow — Friendly PIL Fork for Image I/O

Package-level reference for Pillow on PyPI — install variants, format-specific native deps, version policy, and alternatives.

pip package images imaging filesystem

polars — Fast DataFrames on Rust + Arrow

Package-level reference for polars — install, versioning, extras, and gotchas. The Rust-powered Arrow-native alternative to pandas.

pip package polars dataframes arrow

prefect — Python Workflow Orchestration

Package-level reference for Prefect on PyPI — install variants, version policy, cloud-vs-OSS extras, and alternatives.

pip package orchestration pipelines

scikit-learn — Classical Machine Learning

Package-level reference for scikit-learn — install, versioning, extras, and gotchas. The de facto classical-ML library on PyPI.

pip package scikit-learn ml modelling

scipy — Scientific Algorithms on NumPy

Package-level reference for scipy — install, versioning, submodules, license caveats, and gotchas. Optimization, statistics, signal processing, and linear algebra.

pip package scipy scientific statistics

streamlit — Data Apps Framework on PyPI

Package-level reference for the streamlit framework on PyPI — install variants, version policy, extras, and alternatives.

pip package web dataviz ui

unstructured — Document Parsing for RAG Pipelines

Package-level reference for unstructured on PyPI — install variants, the huge extras tree, system-level dependencies, and alternative parsers.

pip package ai rag filesystem

Db2 SPUFI — Interactive SQL, DSN commands, BIND, and catalog queries on z/OS

Run SQL through SPUFI, drive Db2 with DSN subsystem commands, BIND packages and plans, schedule DSNTEP2 in JCL, query the SYSIBM catalog, and generate DCLGEN.

db2 sql spufi dsn zos mainframe web-researched

jq — JSON Processor

Slice, filter, map, and transform JSON data from the command line. Covers all essential filters, built-in functions, select, map, reduce, streaming, jq 1.7/1.8 additions, and real-world API response processing.

jq json data cli api scripting web-researched

qsv — CSV Toolkit

Comprehensive reference for qsv: count, headers, stats, moarstats, select, search, sort, dedup, frequency, join, sqlp, luau, apply, schema, validate, sample, split, MCP server, and more — with examples and outputs.

qsv csv cli data rust polars xsv tabular mcp web-researched

BeautifulSoup — HTML Parsing & Scraping

Parse, search, and mutate HTML/XML with BeautifulSoup 4. Covers parser choice (html.parser/lxml/html5lib), find/find_all/select, tree navigation, attribute access, and pairing with requests/httpx/playwright for end-to-end scraping.

python beautifulsoup scraping html parsing web

json — Stdlib JSON Encoder/Decoder

Encode and decode JSON in Python with the stdlib json module. Covers dumps/loads, indent/sort_keys/separators, custom default= and JSONEncoder, object_hook decoding, JSONL streaming, and orjson/ujson/msgspec comparison.

python stdlib json data serialization parsing

scikit-learn — Classical Machine Learning

Build classical ML pipelines with scikit-learn. Covers the estimator API, train_test_split, Pipeline, ColumnTransformer, cross-validation, metrics, and model persistence.

python scikit-learn ml data-science pipelines modeling classification

streamlit — Data Apps in Pure Python

Build interactive web apps for data and ML in pure Python. Covers widgets, layout, session state, caching, multipage apps, and deployment patterns.

python streamlit ui dataviz prototyping web frontend

dagster — Modern Data Orchestration

Build, schedule, and observe data pipelines as software-defined assets with Dagster. Covers assets, jobs, schedules, sensors, resources, partitions, and the Dagster UI.

python dagster orchestration data pipelines assets etl mlops

DuckDB — Embedded Analytics Database

Run fast analytical SQL queries in-process with DuckDB. Covers Python API, CSV/Parquet ingestion, pandas interop, Arrow, window functions, and persistent databases.

python duckdb sql analytics olap parquet pandas arrow

modin — Drop-in pandas at Scale

Speed up pandas workloads across all CPU cores with a one-line import swap. Covers Ray and Dask backends, config tuning, pandas interop, and when modin wins vs polars.

python modin pandas dataframes parallel ray dask data

polars — Fast DataFrames

High-performance DataFrames with a lazy expression API. Covers read/write, select, filter, group_by, joins, LazyFrame, datetime, string operations, and pandas interop.

python polars dataframes data csv parquet sql

prefect — Workflow Orchestration

Build, schedule, and observe Python workflows with Prefect. Covers flows, tasks, retries, schedules, deployments, caching, concurrency, and Prefect Cloud.

python prefect orchestration workflows scheduling etl pipelines

jupyter — Interactive Notebooks

Run interactive Python notebooks with Jupyter. Covers JupyterLab setup, cell types, keyboard shortcuts, magic commands, nbconvert export, and common pitfalls.

python jupyter notebooks data analysis interactive

matplotlib — Plotting

Create publication-quality 2-D plots with matplotlib. Covers pyplot basics, subplots, savefig, common chart types, and the show-vs-save pitfall.

python matplotlib plotting visualization charts graphs

numpy — Numerical Arrays

Create and manipulate N-dimensional arrays with NumPy. Covers array creation, broadcasting, vectorized math, indexing, and matrix operations.

python numpy arrays math data scientific

pandas — DataFrames

Load, filter, transform, and aggregate tabular data with pandas. Covers DataFrame creation, read_csv, groupby, merge, and the SettingWithCopyWarning pitfall.

python pandas dataframes data csv analysis

Pillow — Image Processing

Open, resize, crop, convert, and save images with Pillow (PIL fork). Covers format conversion, filters, drawing, and EXIF handling.

python pillow pil images imaging graphics

scipy — Scientific Computing

Statistical distributions, optimization, integration, signal processing, and linear algebra with SciPy. Builds on NumPy arrays.

python scipy statistics optimization scientific math
