PxApi

PxApi(
    url: str | KnownApi,
    language: str | None = None,
    disable_cache: bool = False,
    timeout: int = 30,
    max_workers: int | None = None,
)

A wrapper around the PxWeb API for exploring available datasets interactively and for getting table data, variables, and other metadata.

Parameters

url : str | KnownApi

Either the URL of a PxWeb API or a shorthand name for a builtin API, e.g. “scb”. To list the available APIs, use get_known_apis().

language : str | None = None

The language to be used with the API. You can check available languages using the .get_config() method.

disable_cache : bool = False

Disable the in-memory cache that is used for API responses.

timeout : int = 30

The timeout in seconds to use when calling the API.

max_workers : int | None = None

Maximum number of workers to use for parallel execution when fetching data. Set to 1 for sequential execution, or set a fixed number to limit the number of concurrent requests being sent. None uses automatic sizing.
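To make the three max_workers settings concrete, here is a minimal sketch built on the standard library's ThreadPoolExecutor. It illustrates the general pattern only and is not necessarily how PxApi works internally; fetch_all and fake_fetch are hypothetical names, and fake_fetch stands in for a real API request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(chunks, fetch, max_workers=None):
    """Run `fetch` over `chunks` and return results in input order.

    max_workers=1 processes one chunk at a time, a fixed number caps
    concurrency, and None lets the executor pick a default pool size.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(fetch, chunks))


def fake_fetch(chunk):
    # Hypothetical stand-in for a request to the API.
    return {"chunk": chunk, "rows": len(chunk)}


results = fetch_all(["ab", "cde", "f"], fake_fetch, max_workers=2)
```

A lower cap is the safer choice when an API enforces rate limits, at the cost of slower downloads for large tables.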

Examples

Get the SCB PxWeb API using the shorthand:

>>> api = PxApi("scb")
>>> api
PxApi(url='https://api.scb.se/ov0104/v2beta/api/v2',
           language='sv',
           disable_cache=False,
           timeout=30)

Methods

Name Description
all_tables Get a list of all tables available with some basic metadata. Use .get_table_metadata() for extensive metadata about a specific table.
get_code_list Get information about a code list.
get_config Retrieve the configuration for the API.
get_paths List all paths available to explore. Use the ID to list tables on a specific path with .tables_on_path().
get_table_data Get table data that can be used with dataframes like polars or pandas. The query is constructed with the method parameters.
get_table_data_all Get table data that can be used with dataframes like polars or pandas. The query is constructed from the metadata.
get_table_metadata Get the complete set of metadata for a table.
get_table_variables Get the specific metadata for variables and value codes. Also indicates whether a variable can be eliminated and which code lists are available.
search Search for tables.
tables_on_path List all the tables available on the path.

all_tables

PxApi.all_tables()

Get a list of all tables available with some basic metadata. Use .get_table_metadata() for extensive metadata about a specific table.

Returns

: list[dict[str, str]]

All tables with some metadata.

get_code_list

PxApi.get_code_list(code_list_id: str)

Get information about a code list.

Parameters

code_list_id : str

The ID of a code list.

Returns

: dict

The API response with the code list information.

Examples

The .get_table_variables() method shows which code lists are available for a table.

>>> meta = api.get_table_variables("TAB638")

With the metadata, get the code lists available for “Region”.

>>> meta.get("Region").get("codelists")
[{'id': 'agg_RegionA-region_2', 'label': 'A-regioner'},
{'id': 'agg_RegionKommungrupp2005-_1', 'label': 'Kommungrupper (SKL:s) 2005'},
{'id': 'agg_RegionKommungrupp2011-', 'label': '...'},
{'id': 'vs_RegionKommun07', 'label': 'Kommuner'},
{'id': 'vs_RegionLän07', 'label': 'Län'},
{'id': 'vs_RegionRiket99', 'label': 'Riket'},
...]

Now we can take a closer look at a specific code list by passing its ID to .get_code_list().

>>> api.get_code_list("vs_RegionLän07")
{
...     'id': 'vs_RegionLän07',
...     'label': 'Län',
...     'language': 'sv',
...     'type': 'Valueset',
...     'values': [
...         {'code': '01', 'label': 'Stockholms län'},
...         {'code': '03', 'label': 'Uppsala län'},
...         {'code': '04', 'label': 'Södermanlands län'},
...         ...
...     ]
... }
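A code list response like the one above is often easier to work with as a plain code-to-label lookup. The following is a small sketch under the assumption that the response has the shape shown in the example; code_labels is a hypothetical helper, and the sample holds only the three values visible in the truncated output.

```python
# Sample copied from the documented output above (values truncated there too).
code_list = {
    "id": "vs_RegionLän07",
    "label": "Län",
    "language": "sv",
    "type": "Valueset",
    "values": [
        {"code": "01", "label": "Stockholms län"},
        {"code": "03", "label": "Uppsala län"},
        {"code": "04", "label": "Södermanlands län"},
    ],
}


def code_labels(code_list):
    """Map each value code in a code list to its label."""
    return {value["code"]: value["label"] for value in code_list["values"]}


labels = code_labels(code_list)
```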

get_config

PxApi.get_config()

Retrieve the configuration for the API.

Returns

: dict

The API response containing the configuration.

Examples

>>> conf = api.get_config()

Check the languages available.

>>> conf.get("languages")
[{'id': 'sv', 'label': 'Svenska'},
 {'id': 'en', 'label': 'English'}]
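As a sketch of how the configuration could be used to validate a language choice, the snippet below works on a sample dict copied from the output above. The helpers language_ids and pick_language are hypothetical and not part of PxApi.

```python
# Sample copied from the documented .get_config() output above.
conf = {
    "languages": [
        {"id": "sv", "label": "Svenska"},
        {"id": "en", "label": "English"},
    ]
}


def language_ids(config):
    """Return the language IDs listed in a config response."""
    return [lang["id"] for lang in config.get("languages", [])]


def pick_language(config, preferred):
    """Return `preferred` if the API offers it, otherwise None."""
    return preferred if preferred in language_ids(config) else None


chosen = pick_language(conf, "en")
```

The chosen ID could then be passed as the language argument when constructing PxApi.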

get_paths

PxApi.get_paths(path_id: str | None = None)

List all paths available to explore. Use the ID to list tables on a specific path with .tables_on_path().

Parameters

path_id : str | None = None

The ID of a path to inspect. If None, all paths are listed.

Returns

: list[dict[str, str]]

Paths available.

Examples

>>> api.get_paths()
[
... {'id': 'AA', 'label': 'Ämnesövergripande statistik'},
... {'id': 'AA0003', 'label': 'Registerdata för integration'},
... {'id': 'AA0003B', 'label': 'Statistik med inriktning mot arbetsmarknaden'},
... {'id': 'AA0003C', 'label': 'Statistik med inriktning mot flyttmönster'},
... {'id': 'AA0003D', 'label': 'Statistik med inriktning mot boende'},
... ...
]

You can also inspect a subpath by supplying a path_id.

>>> api.get_paths("AM0101")
[
... {'id': 'AM', 'label': 'Arbetsmarknad'},
... {'id': 'AM0101',
...  'label': 'Konjunkturstatistik, löner för privat sektor (KLP)'},
... {'id': 'AM0101A', 'label': 'Arbetare: Timlön efter näringsgren'},
... {'id': 'AM0101B', 'label': 'Tjänstemän: Månadslön efter näringsgren'},
... {'id': 'AM0101C', 'label': 'Äldre tabeller som inte uppdateras'},
... {'id': 'AM0101X', 'label': 'Nyckeltal'},
]
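The result for a subpath includes the path itself and its ancestors as well as its children. The sketch below picks out only the entries beneath a given path, assuming that child IDs extend the parent ID as a prefix, which is what the sample output suggests but is not a documented guarantee. child_paths is a hypothetical helper.

```python
# Sample copied from the documented api.get_paths("AM0101") output above.
paths = [
    {"id": "AM", "label": "Arbetsmarknad"},
    {"id": "AM0101", "label": "Konjunkturstatistik, löner för privat sektor (KLP)"},
    {"id": "AM0101A", "label": "Arbetare: Timlön efter näringsgren"},
    {"id": "AM0101B", "label": "Tjänstemän: Månadslön efter näringsgren"},
    {"id": "AM0101C", "label": "Äldre tabeller som inte uppdateras"},
    {"id": "AM0101X", "label": "Nyckeltal"},
]


def child_paths(paths, parent_id):
    """Return entries whose ID starts with, but is not equal to, parent_id."""
    return [
        path
        for path in paths
        if path["id"].startswith(parent_id) and path["id"] != parent_id
    ]


children = child_paths(paths, "AM0101")
```

Each child ID can then be handed to .tables_on_path() to list its tables.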

get_table_data

PxApi.get_table_data(
    table_id: str,
    value_codes: dict[str, list[str]] | None = None,
    code_list: dict[str, str] | None = None,
    show: Literal['code', 'value', 'code_value'] | None = None,
)

Get table data that can be used with dataframes like polars or pandas. The query is constructed with the method parameters. An empty value code selection returns a default selection for the table.

Parameters

table_id : str

An ID of a table to get data from.

value_codes : dict[str, list[str]] | None = None

The value codes to use for data selection, where the keys are the variable codes. Use .get_table_variables() to explore what’s available.

code_list : dict[str, str] | None = None

Any named code list to use with a variable for code selection.

show : Literal['code', 'value', 'code_value'] | None = None

Set to “code_value”, “code” or “value”, to specify what to show in the categorical columns.

Returns

: list[dict]

A dataset in a native format that can be loaded into a dataframe.

Examples

A simple query to get the 2024 population for all municipalities in Stockholm County, using 5-year age groups.

>>> dataset = api.get_table_data(
...     table_id="TAB638",
...     value_codes={
...         "ContentsCode": ["BE0101N1"],
...         "Region": ["01*"],
...         "Alder": ["*"],
...         "Tid": ["2024"],
...     },
...     code_list={
...         "Alder": "agg_Ålder5år",
...         "Region": "vs_RegionKommun07",
...     },
... )

This dataset can then easily be turned into a dataframe, for example with polars.

>>> pl.DataFrame(dataset)
shape: (572, 5)
┌─────────────────────┬────────────────┬────────────────┬──────┬───────┐
│ region              ┆ ålder          ┆ tabellinnehåll ┆ år   ┆ value │
│ ---                 ┆ ---            ┆ ---            ┆ ---  ┆ ---   │
│ str                 ┆ str            ┆ str            ┆ str  ┆ i64   │
╞═════════════════════╪════════════════╪════════════════╪══════╪═══════╡
│ 0114 Upplands Väsby ┆ 0-4 år         ┆ Folkmängd      ┆ 2024 ┆ 2931  │
│ 0114 Upplands Väsby ┆ 5-9 år         ┆ Folkmängd      ┆ 2024 ┆ 3341  │
│ 0114 Upplands Väsby ┆ 10-14 år       ┆ Folkmängd      ┆ 2024 ┆ 3237  │
│ 0114 Upplands Väsby ┆ 15-19 år       ┆ Folkmängd      ┆ 2024 ┆ 3083  │
│ 0114 Upplands Väsby ┆ 20-24 år       ┆ Folkmängd      ┆ 2024 ┆ 2573  │
│ …                   ┆ …              ┆ …              ┆ …    ┆ …     │
│ 0192 Nynäshamn      ┆ 85-89 år       ┆ Folkmängd      ┆ 2024 ┆ 554   │
│ 0192 Nynäshamn      ┆ 90-94 år       ┆ Folkmängd      ┆ 2024 ┆ 230   │
│ 0192 Nynäshamn      ┆ 95-99 år       ┆ Folkmängd      ┆ 2024 ┆ 51    │
│ 0192 Nynäshamn      ┆ 100+ år        ┆ Folkmängd      ┆ 2024 ┆ 7     │
│ 0192 Nynäshamn      ┆ uppgift saknas ┆ Folkmängd      ┆ 2024 ┆ 0     │
└─────────────────────┴────────────────┴────────────────┴──────┴───────┘
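The dataset can also be processed without a dataframe library. The sketch below assumes each row is a dict keyed by the column names shown above, and sums the value column per region using only the standard library. The sample contains just the ten rows visible in the output, so the totals cover those rows rather than the full table; total_by is a hypothetical helper.

```python
from collections import defaultdict

# Rows copied from the visible part of the documented output above.
visible = [
    ("0114 Upplands Väsby", "0-4 år", 2931),
    ("0114 Upplands Väsby", "5-9 år", 3341),
    ("0114 Upplands Väsby", "10-14 år", 3237),
    ("0114 Upplands Väsby", "15-19 år", 3083),
    ("0114 Upplands Väsby", "20-24 år", 2573),
    ("0192 Nynäshamn", "85-89 år", 554),
    ("0192 Nynäshamn", "90-94 år", 230),
    ("0192 Nynäshamn", "95-99 år", 51),
    ("0192 Nynäshamn", "100+ år", 7),
    ("0192 Nynäshamn", "uppgift saknas", 0),
]
rows = [
    {"region": region, "ålder": age, "tabellinnehåll": "Folkmängd",
     "år": "2024", "value": value}
    for region, age, value in visible
]


def total_by(rows, key):
    """Sum the 'value' field of the rows, grouped by the given column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["value"]
    return dict(totals)


totals = total_by(rows, "region")
```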

get_table_data_all

PxApi.get_table_data_all(
    table_id: str,
    show: Literal['code', 'value', 'code_value'] | None = None,
)

Get table data that can be used with dataframes like polars or pandas. The query is constructed from the metadata. This method tries to fetch all the data in the target table by using a wildcard for every variable in value_codes.

Parameters

table_id : str

An ID of a table to get data from.

show : Literal['code', 'value', 'code_value'] | None = None

Set to “code_value”, “code” or “value”, to specify what to show in the categorical columns.

Returns

: list[dict]

A dataset in a native format that can be loaded into a dataframe.
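To illustrate what a wildcard for every variable looks like, here is a minimal sketch that builds such a selection from variable metadata. It is an assumption about the shape of the query, not the library's actual implementation; wildcard_selection is a hypothetical helper, and the sample metadata lists only the three variables shown in the .get_table_variables() example, with their details left out.

```python
# Variable codes taken from the documented metadata for "TAB638";
# only the keys matter for this sketch, so the details are left empty.
variables = {
    "Region": {},
    "Alder": {},
    "Tid": {},
}


def wildcard_selection(variables):
    """Build a selection that requests every value of every variable."""
    return {code: ["*"] for code in variables}


selection = wildcard_selection(variables)
```

The resulting dict has the same shape as the value_codes argument of .get_table_data().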

get_table_metadata

PxApi.get_table_metadata(table_id: str)

Get the complete set of metadata for a table.

Parameters

table_id : str

The ID of a table to get metadata from.

Returns

: dict

The API response containing the metadata.

Examples

>>> meta = api.get_table_metadata("TAB638")
>>> meta.keys()
dict_keys(['version', 'class', 'href', 'label', 'source',
...         'updated', 'link', 'note', 'role', 'id',
...         'size', 'dimension', 'extension'])
>>> meta.get("label")
'Folkmängden efter region, civilstånd, ålder, kön, tabellinnehåll och år'

get_table_variables

PxApi.get_table_variables(table_id: str)

Get the specific metadata for variables and value codes. Also indicates whether a variable can be eliminated and which code lists are available. The returned information is unpacked and somewhat easier to navigate than the output of the .get_table_metadata() method.

Parameters

table_id : str

The ID of a table to get metadata from.

Returns

: dict

The API response containing the metadata.

Examples

>>> api.get_table_variables("TAB638")
{
...     'Region': {
...         'label': 'region',
...         'category': {'label': {'00': 'Riket', '01': 'Stockholms län', ...}},
...         'elimination': True,
...         'codelists': [{'id': 'vs_RegionKommun07', 'label': 'Kommuner'}, ...]
...     },
...     'Alder': {
...         'label': 'ålder',
...         'category': {'label': {'0': '0 år', '1': '1 år', ...}},
...         'elimination': True,
...         'codelists': [{'id': 'agg_Ålder5år', 'label': '5-årsklasser'}, ...]
...     },
...     'Tid': {
...         'label': 'år',
...         'category': {'label': {'2022': '2022', '2023': '2023', ...}},
...         'elimination': False,
...         'codelists': []
...     },
...     ...
}
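The elimination flag can be used to separate variables that may be left out of a selection from those that may not. The sketch below assumes the usual PxWeb meaning of elimination, namely that a variable that can be eliminated may be omitted from a query; split_by_elimination is a hypothetical helper, and the sample keeps only the fields it needs from the output above.

```python
# Reduced sample based on the documented output above.
variables = {
    "Region": {"label": "region", "elimination": True},
    "Alder": {"label": "ålder", "elimination": True},
    "Tid": {"label": "år", "elimination": False},
}


def split_by_elimination(variables):
    """Return (optional, required) lists of variable codes."""
    optional = [code for code, meta in variables.items() if meta.get("elimination")]
    required = [code for code, meta in variables.items() if not meta.get("elimination")]
    return optional, required


optional, required = split_by_elimination(variables)
```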

search

PxApi.search(
    query: str | None = None,
    past_days: int | None = None,
    include_discontinued: bool | None = None,
    page_size: int | None = None,
)

Search for tables.

Parameters

query : str | None = None

A string to search for.

past_days : int | None = None

Only return tables that have been updated within the past n days.

include_discontinued : bool | None = None

Include any tables that are discontinued.

page_size : int | None = None

Number of results per page in the returned dict. Results will be paginated if they exceed this value.

Returns

: dict

The API response of the search query.

Examples

>>> api = PxApi("scb")
>>> search = api.search(query="arbetsmarknad", past_days=180)
>>> len(search.get("tables"))
4

tables_on_path

PxApi.tables_on_path(path_id: str)

List all the tables available on the path.

Parameters

path_id : str

The ID of a path, as returned by .get_paths().

Returns

: list[dict[str, str]]

All tables on the path.

Examples

>>> api.tables_on_path("AM0101C")
[
... {'id': 'TAB2566',
...  'label': 'Genomsnittlig månadslön för tjänstemän, privat sektor (KLP) efter näringsgren SNI2002 ...'},
... {'id': 'TAB2552',
...  'label': 'Genomsnittlig timlön för arbetare, privat sektor (KLP) efter näringsgren SNI2002 ...'},
... {'id': 'TAB386',
...  'label': 'Antal arbetare inom industrin efter näringsgren SNI92 ...'},
... {'id': 'TAB2565',
...  'label': 'Genomsnittlig månadslön för tjänstemän, privat sektor (KLP) efter provision och näringsgren SNI92 ...'},
... {'id': 'TAB2551',
...  'label': 'Genomsnittlig timlön för arbetare, privat sektor (KLP) efter näringsgren SNI92 ...'},
... ...
]
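Since the result is a plain list of dicts, it can be filtered with ordinary Python. The sketch below searches the labels for a keyword, ignoring case; find_tables is a hypothetical helper, and the sample is copied from the output above with the truncated labels left as they are.

```python
# Sample copied from the documented output above (labels are truncated there).
tables = [
    {"id": "TAB2566",
     "label": "Genomsnittlig månadslön för tjänstemän, privat sektor (KLP) efter näringsgren SNI2002 ..."},
    {"id": "TAB2552",
     "label": "Genomsnittlig timlön för arbetare, privat sektor (KLP) efter näringsgren SNI2002 ..."},
    {"id": "TAB386",
     "label": "Antal arbetare inom industrin efter näringsgren SNI92 ..."},
    {"id": "TAB2565",
     "label": "Genomsnittlig månadslön för tjänstemän, privat sektor (KLP) efter provision och näringsgren SNI92 ..."},
    {"id": "TAB2551",
     "label": "Genomsnittlig timlön för arbetare, privat sektor (KLP) efter näringsgren SNI92 ..."},
]


def find_tables(tables, keyword):
    """Return the tables whose label contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [table for table in tables if needle in table["label"].casefold()]


hourly_wage_ids = [table["id"] for table in find_tables(tables, "timlön")]
```

Any matching ID can be passed on to .get_table_metadata() or .get_table_data().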