Examples

Note

This library provides a Python client for the PxWeb API, but is not affiliated with the PxWeb project. The examples do not go into detail about how the PxWeb API behaves or responds. For more information about the API itself, check out the official specification.

Basic setup and exploration

The first step is to create a PxApi object.

from pxweb import PxApi

# Use the name of a built-in, known API instead of a full URL
api = PxApi("scb")

api

PxApi(url='https://statistikdatabasen.scb.se/api/v2',
        language=sv,
        disable_cache=False,
        timeout=30,
        number_of_tables=5144)

Information about the API, including the supported languages, is available through .get_config().

api.get_config()

{
    'apiVersion': '2.0.0',
    'appVersion': '1.0.0',
    'languages': [
        {
            'id': 'en',
            'label': 'English'
        },
        {
            'id': 'sv',
            'label': 'Svenska'
        }
    ],
    'defaultLanguage': 'sv',
    'maxDataCells': 150000,
    ... +7
}
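
Before switching languages it can help to check which ones the API reports. The following is a minimal sketch, assuming the configuration is a plain dict shaped like the output above. `sample_config` is a trimmed copy of that output and `supported_languages` is a hypothetical helper, not part of pxwebpy.

```python
# Trimmed copy of the configuration shown above; a real call to
# api.get_config() returns more keys than these.
sample_config = {
    "apiVersion": "2.0.0",
    "languages": [
        {"id": "en", "label": "English"},
        {"id": "sv", "label": "Svenska"},
    ],
    "defaultLanguage": "sv",
    "maxDataCells": 150000,
}


def supported_languages(config: dict) -> list[str]:
    """Return the language ids listed in a configuration dict (hypothetical helper)."""
    return [lang["id"] for lang in config.get("languages", [])]


languages = supported_languages(sample_config)
print(languages)
print("en" in languages)
```

With a live client, the same helper could be applied to the result of api.get_config() instead of the sample dict.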

If we want to change the language, we can do so by setting an attribute on the PxApi object, like so: api.language = "en". This changes the response language for all subsequent queries to the API.

From here we can browse around and get data. All available tables can be listed with .all_tables(), although the result is probably a bit overwhelming.

api.all_tables()

[
    {
        'id': 'TAB4707',
        'label': 'Antal pågående anställningar efter anställningstid'+39,
        'description': '',
        'updated': '2025-12-22T07:00:00Z',
        'firstPeriod': '2015M04',
        ... +8
    },
    {
        'id': 'TAB4714',
        'label': 'Antal pågående anställningar efter anställningstid'+58,
        'description': '',
        'updated': '2025-12-22T07:00:00Z',
        'firstPeriod': '2015M04',
        ... +8
    },
    {
        'id': 'TAB4718',
        'label': 'Antal pågående anställningar efter kön, region och'+30,
        'description': '',
        'updated': '2025-12-22T07:00:00Z',
        'firstPeriod': '2020M01',
        ... +8
    },
    {
        'id': 'TAB4723',
        'label': 'Antal pågående anställningar i näringslivet efter '+56,
        'description': '',
        'updated': '2025-12-22T07:00:00Z',
        'firstPeriod': '2020M01',
        ... +8
    },
    {
        'id': 'TAB4344',
        'label': 'Antal pågående anställningar i näringslivet efter '+65,
        'description': '',
        'updated': '2025-12-22T07:00:00Z',
        'firstPeriod': '2015M04',
        ... +8
    },
    ... +5139
]
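
One way to make a long table listing easier to digest is to group it locally. The sketch below assumes each table is a dict with the keys shown above. `sample_tables` is a trimmed copy of the records in the output and `group_ids_by` is a hypothetical helper, not a pxwebpy function.

```python
from collections import defaultdict

# Trimmed copies of the records shown above (only two keys kept).
sample_tables = [
    {"id": "TAB4707", "firstPeriod": "2015M04"},
    {"id": "TAB4714", "firstPeriod": "2015M04"},
    {"id": "TAB4718", "firstPeriod": "2020M01"},
    {"id": "TAB4723", "firstPeriod": "2020M01"},
    {"id": "TAB4344", "firstPeriod": "2015M04"},
]


def group_ids_by(tables: list[dict], key: str) -> dict:
    """Group table ids by the value of one metadata key (hypothetical helper)."""
    groups = defaultdict(list)
    for table in tables:
        groups[table.get(key)].append(table["id"])
    return dict(groups)


by_first_period = group_ids_by(sample_tables, "firstPeriod")
print(by_first_period)
```

The same helper works for any other key present in the records, for example 'updated'.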

Tables are organised into subjects and are categorised into different paths. To see all paths available, use .get_paths().

api.get_paths()

[
    {
        'id': 'AA',
        'label': 'Ämnesövergripande statistik'
    },
    {
        'id': 'AA0003',
        'label': 'Registerdata för integration'
    },
    {
        'id': 'AA0003B',
        'label': 'Statistik med inriktning mot arbetsmarknaden'
    },
    {
        'id': 'AA0003C',
        'label': 'Statistik med inriktning mot flyttmönster'
    },
    {
        'id': 'AA0003D',
        'label': 'Statistik med inriktning mot boende'
    },
    ... +833
]

It’s also possible to filter the paths, for example to get all paths related to a specific subject like “Befolkning”.

api.get_paths(path_id="BE")

[
    {
        'id': 'BE',
        'label': 'Befolkning'
    },
    {
        'id': 'BE0001',
        'label': 'Namnstatistik'
    },
    {
        'id': 'BE0001D',
        'label': 'Nyfödda – Äldre tabeller som inte längre uppdatera'+1
    },
    {
        'id': 'BE0001G',
        'label': 'Hela befolkningen – Äldre tabeller som inte längre'+11
    },
    {
        'id': 'BE0101',
        'label': 'Befolkningsstatistik'
    },
    ... +29
]
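
In the output above the ids look nested: 'BE0001' starts with 'BE', and 'BE0001D' starts with 'BE0001'. Assuming this prefix pattern holds (it is inferred from the sample output only, not from the API specification), sub-paths can also be picked out locally. `paths_below` is a hypothetical helper.

```python
# Copied from the output above; two of the labels are truncated there
# and are kept in their truncated form.
sample_paths = [
    {"id": "BE", "label": "Befolkning"},
    {"id": "BE0001", "label": "Namnstatistik"},
    {"id": "BE0001D", "label": "Nyfödda – Äldre tabeller som inte längre uppdatera"},
    {"id": "BE0001G", "label": "Hela befolkningen – Äldre tabeller som inte längre"},
    {"id": "BE0101", "label": "Befolkningsstatistik"},
]


def paths_below(paths: list[dict], parent_id: str) -> list[dict]:
    """Return paths whose id starts with parent_id, excluding the parent itself."""
    return [
        path
        for path in paths
        if path["id"].startswith(parent_id) and path["id"] != parent_id
    ]


below_names = [path["id"] for path in paths_below(sample_paths, "BE0001")]
print(below_names)
```

For authoritative results, filtering on the server with the path_id argument shown above is the safer choice.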

To get all tables that are in a specific path you can use .tables_on_path(). Here we take a closer look at “Folkmängd”.

api.tables_on_path(path_id="BE0101A")

[
    {
        'id': 'TAB6471',
        'label': 'Folkmängden per månad efter region, ålder och kön.'+20,
        'paths': [
            [...]
        ]
    },
    {
        'id': 'TAB5444',
        'label': 'Folkmängden per månad efter region, ålder och kön.'+19,
        'paths': [
            [...]
        ]
    },
    {
        'id': 'TAB5890',
        'label': 'Folkmängden efter ålder och kön. År 1860-2024',
        'paths': [
            [...]
        ]
    },
    {
        'id': 'TAB638',
        'label': 'Folkmängden efter region, civilstånd, ålder och kö'+16,
        'paths': [
            [...]
        ]
    },
    {
        'id': 'TAB4537',
        'label': 'Folkmängden per distrikt, landskap, landsdel eller'+30,
        'paths': [
            [...]
        ]
    },
    ... +3
]

Searching for tables

It’s also possible to search for tables using the .search() method.

# Keep it simple: look for tables updated in the past 180 days that match the query string
results = api.search(query="energi", past_days=180)

# Checking how many tables there are in the results
len(results.get("tables"))

We can also get the labels and IDs, or any other metadata, to find out more.

[
    {k: v for k, v in table.items() if k in ("id", "label")}
    for table in results.get("tables")
]

[
    {
        'id': 'TAB3859',
        'label': 'Utvinning av energi- och odlingstorv i 1000-tal ku'+22
    },
    {
        'id': 'TAB5179',
        'label': 'Investeringsutgifter för kommuner efter region och'+33
    },
    {
        'id': 'TAB5918',
        'label': 'Konsumentprisindex med fast ränta exklusive energi'+44
    },
    {
        'id': 'TAB6593',
        'label': 'Konsumentprisindex med fast ränta exklusive energi'+43
    },
    {
        'id': 'TAB3886',
        'label': 'Kostnader i tkr för inköpt energi inom mineral- oc'+52
    },
    ... +15
]
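
Search results can be narrowed further on the client side. The sketch below assumes the result is a dict with a "tables" list, as in the calls above. `sample_results` reuses the five records shown (with labels in their truncated form) and `ids_with_label` is a hypothetical helper.

```python
# The five records shown above; labels are truncated in that output
# and are kept as displayed.
sample_results = {
    "tables": [
        {"id": "TAB3859", "label": "Utvinning av energi- och odlingstorv i 1000-tal ku"},
        {"id": "TAB5179", "label": "Investeringsutgifter för kommuner efter region och"},
        {"id": "TAB5918", "label": "Konsumentprisindex med fast ränta exklusive energi"},
        {"id": "TAB6593", "label": "Konsumentprisindex med fast ränta exklusive energi"},
        {"id": "TAB3886", "label": "Kostnader i tkr för inköpt energi inom mineral- oc"},
    ]
}


def ids_with_label(results: dict, needle: str) -> list[str]:
    """Return ids of tables whose label contains needle, ignoring case."""
    needle = needle.casefold()
    return [
        table["id"]
        for table in results.get("tables", [])
        if needle in table.get("label", "").casefold()
    ]


cpi_ids = ids_with_label(sample_results, "konsumentprisindex")
print(cpi_ids)
```

This only inspects what has already been returned; a more specific query string sent to .search() avoids fetching the extra results in the first place.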

Getting table metadata

There are two methods to get table metadata. You can get the full metadata information by simply calling .get_table_metadata().

If you’re interested in the details about the variables of a table you can also use .get_table_variables(). This method returns the information in a more condensed form, which may be easier to scan.

# Use the table ID
tab_vars = api.get_table_variables("TAB2706")

# Let's check out Region
tab_vars.get("Region")

{
    'label': 'region',
    'category': {
        'label': {
            '0114': 'Upplands Väsby',
            '0115': 'Vallentuna',
            '0117': 'Österåker',
            '0120': 'Värmdö',
            '0123': 'Järfälla',
            ... +318
        }
    },
    'elimination': True,
    'codelists': [
        {
            'id': 'vs_RegionKommun07+BaraEjAggr',
            'label': 'Kommuner och Bara kommun (1229) som upphörde 1976'
        },
        {
            'id': 'vs_RegionValkrets99',
            'label': 'Valkretsar'
        },
        {
            'id': 'vs_RegionValkretsTot99',
            'label': 'Totalt, alla redovisade valkretsar'
        }
    ]
}

As can be seen above, elimination is True for "Region", so the variable can be left out of a query. There are also a few code lists associated with the variable.
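
The condensed structure lends itself to small lookups. The following sketch assumes a variable is a dict shaped like the "Region" output above. `sample_region` is a trimmed copy of it, and `code_for_label` and `code_list_ids` are hypothetical helpers.

```python
# Trimmed copy of the "Region" variable shown above.
sample_region = {
    "label": "region",
    "category": {
        "label": {
            "0114": "Upplands Väsby",
            "0115": "Vallentuna",
            "0117": "Österåker",
            "0120": "Värmdö",
            "0123": "Järfälla",
        }
    },
    "elimination": True,
    "codelists": [
        {"id": "vs_RegionKommun07+BaraEjAggr"},
        {"id": "vs_RegionValkrets99"},
        {"id": "vs_RegionValkretsTot99"},
    ],
}


def code_for_label(variable: dict, label: str):
    """Return the value code whose label equals label, or None if absent."""
    for code, text in variable["category"]["label"].items():
        if text == label:
            return code
    return None


def code_list_ids(variable: dict) -> list[str]:
    """Return the ids of the code lists attached to a variable."""
    return [code_list["id"] for code_list in variable.get("codelists", [])]


print(code_for_label(sample_region, "Värmdö"))
print(code_list_ids(sample_region))
```

Looking up codes by label this way avoids typing value codes by hand when building a selection.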

Code lists

Getting information about code lists can be done with .get_code_list().

# Fetching and unpacking 'values' of 'vs_RegionValkrets99'
api.get_code_list("vs_RegionValkrets99").get("values")

[
    {
        'code': 'VR1',
        'label': 'Stockholms kommuns valkrets',
        'valueMap': [
            'VR1'
        ]
    },
    {
        'code': 'VR2',
        'label': 'Stockholms läns valkrets',
        'valueMap': [
            'VR2'
        ]
    },
    {
        'code': 'VR3',
        'label': 'Uppsala läns valkrets',
        'valueMap': [
            'VR3'
        ]
    },
    {
        'code': 'VR4',
        'label': 'Södermanlands läns valkrets',
        'valueMap': [
            'VR4'
        ]
    },
    {
        'code': 'VR5',
        'label': 'Östergötlands läns valkrets',
        'valueMap': [
            'VR5'
        ]
    },
    ... +26
]

The codes can then be used in a query for a selection based on the code list. More on that in Getting table data.
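
As a small illustration, the codes can be collected from the 'values' list with plain Python. The sketch assumes each entry has 'code' and 'label' keys as in the output above; `sample_values` holds the first three entries shown.

```python
# First three entries of the 'values' list shown above.
sample_values = [
    {"code": "VR1", "label": "Stockholms kommuns valkrets", "valueMap": ["VR1"]},
    {"code": "VR2", "label": "Stockholms läns valkrets", "valueMap": ["VR2"]},
    {"code": "VR3", "label": "Uppsala läns valkrets", "valueMap": ["VR3"]},
]

# Map each code to its label for quick lookups.
labels_by_code = {value["code"]: value["label"] for value in sample_values}

# Keep only the county-level constituencies ("läns valkrets").
wanted_codes = [
    code for code, label in labels_by_code.items() if "läns valkrets" in label
]
print(wanted_codes)
```

A list like `wanted_codes` can then serve as the selection for the variable that the code list belongs to.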

Getting table data

To get data with .get_table_data() we need a few things:

A table ID
A selection of value codes from variables
A code list (optional)

# Getting some election results for specific regions, using a code list to match value codes
dataset = api.get_table_data(
    "TAB2706",
    value_codes={
        "ContentsCode": "ME0104B6",
        "Tid": "2022",
        "Region": ["VR2", "VR3"],
        "Partimm": [
            "M",
            "C",
            "FP",
            "KD",
            "MP",
            "S",
            "V",
            "SD",
            "ÖVRIGA",
            "OGILTIGA",
            "VALSKOLKARE",
        ],
    },
    code_list={"Region": "vs_RegionValkrets99"},
)

# A finished dataset looks like this
dataset

[
    {
        'region': 'VR2: Stockholms läns valkrets',
        'parti mm': 'Moderaterna',
        'tabellinnehåll': 'Antal röster',
        'valår': '2022',
        'value': 197466
    },
    {
        'region': 'VR2: Stockholms läns valkrets',
        'parti mm': 'Centerpartiet',
        'tabellinnehåll': 'Antal röster',
        'valår': '2022',
        'value': 60776
    },
    {
        'region': 'VR2: Stockholms läns valkrets',
        'parti mm': 'Liberalerna',
        'tabellinnehåll': 'Antal röster',
        'valår': '2022',
        'value': 48949
    },
    {
        'region': 'VR2: Stockholms läns valkrets',
        'parti mm': 'Kristdemokraterna',
        'tabellinnehåll': 'Antal röster',
        'valår': '2022',
        'value': 40207
    },
    {
        'region': 'VR2: Stockholms läns valkrets',
        'parti mm': 'Miljöpartiet',
        'tabellinnehåll': 'Antal röster',
        'valår': '2022',
        'value': 42284
    },
    ... +17
]
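
Since the dataset is a list of dictionaries, simple summaries need no extra libraries. The sketch below sums 'value' per group. `sample_dataset` holds only the five rows shown above (trimmed to three keys), and `total_by` is a hypothetical helper.

```python
from collections import defaultdict

# The five rows shown above, trimmed to the keys used here.
sample_dataset = [
    {"region": "VR2: Stockholms läns valkrets", "parti mm": "Moderaterna", "value": 197466},
    {"region": "VR2: Stockholms läns valkrets", "parti mm": "Centerpartiet", "value": 60776},
    {"region": "VR2: Stockholms läns valkrets", "parti mm": "Liberalerna", "value": 48949},
    {"region": "VR2: Stockholms läns valkrets", "parti mm": "Kristdemokraterna", "value": 40207},
    {"region": "VR2: Stockholms läns valkrets", "parti mm": "Miljöpartiet", "value": 42284},
]


def total_by(rows: list[dict], key: str) -> dict:
    """Sum the 'value' field of rows, grouped by the given key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["value"]
    return dict(totals)


totals = total_by(sample_dataset, "region")
print(totals)
```

Note that the total here covers only the five sample rows, not the full result of the query.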

Loading into dataframes

The returned dataset is a plain list of dictionaries, so it can easily be loaded into a dataframe.

For instance polars:

import polars as pl

pl.DataFrame(dataset)

shape: (22, 5)

region	parti mm	tabellinnehåll	valår	value
str	str	str	str	i64
"VR2: Stockholms läns valkrets"	"Moderaterna"	"Antal röster"	"2022"	197466
"VR2: Stockholms läns valkrets"	"Centerpartiet"	"Antal röster"	"2022"	60776
"VR2: Stockholms läns valkrets"	"Liberalerna"	"Antal röster"	"2022"	48949
"VR2: Stockholms läns valkrets"	"Kristdemokraterna"	"Antal röster"	"2022"	40207
"VR2: Stockholms läns valkrets"	"Miljöpartiet"	"Antal röster"	"2022"	42284
…	…	…	…	…
"VR3: Uppsala läns valkrets"	"Vänsterpartiet"	"Antal röster"	"2022"	19543
"VR3: Uppsala läns valkrets"	"Sverigedemokraterna"	"Antal röster"	"2022"	45237
"VR3: Uppsala läns valkrets"	"övriga partier"	"Antal röster"	"2022"	4134
"VR3: Uppsala läns valkrets"	"ogiltiga valsedlar"	"Antal röster"	"2022"	2410
"VR3: Uppsala läns valkrets"	"ej röstande"	"Antal röster"	"2022"	40954

The same works for pandas and pyarrow:

import pandas as pd

pd.DataFrame(dataset)

	region	parti mm	tabellinnehåll	valår	value
0	VR2: Stockholms läns valkrets	Moderaterna	Antal röster	2022	197466
1	VR2: Stockholms läns valkrets	Centerpartiet	Antal röster	2022	60776
2	VR2: Stockholms läns valkrets	Liberalerna	Antal röster	2022	48949
3	VR2: Stockholms läns valkrets	Kristdemokraterna	Antal röster	2022	40207
4	VR2: Stockholms läns valkrets	Miljöpartiet	Antal röster	2022	42284
5	VR2: Stockholms läns valkrets	Socialdemokraterna	Antal röster	2022	223056
6	VR2: Stockholms läns valkrets	Vänsterpartiet	Antal röster	2022	51623
7	VR2: Stockholms läns valkrets	Sverigedemokraterna	Antal röster	2022	144315
8	VR2: Stockholms läns valkrets	övriga partier	Antal röster	2022	13836
9	VR2: Stockholms läns valkrets	ogiltiga valsedlar	Antal röster	2022	7695
10	VR2: Stockholms läns valkrets	ej röstande	Antal röster	2022	176249
11	VR3: Uppsala läns valkrets	Moderaterna	Antal röster	2022	45457
12	VR3: Uppsala läns valkrets	Centerpartiet	Antal röster	2022	18040
13	VR3: Uppsala läns valkrets	Liberalerna	Antal röster	2022	12465
14	VR3: Uppsala läns valkrets	Kristdemokraterna	Antal röster	2022	14766
15	VR3: Uppsala läns valkrets	Miljöpartiet	Antal röster	2022	16750
16	VR3: Uppsala läns valkrets	Socialdemokraterna	Antal röster	2022	72499
17	VR3: Uppsala läns valkrets	Vänsterpartiet	Antal röster	2022	19543
18	VR3: Uppsala läns valkrets	Sverigedemokraterna	Antal röster	2022	45237
19	VR3: Uppsala läns valkrets	övriga partier	Antal röster	2022	4134
20	VR3: Uppsala läns valkrets	ogiltiga valsedlar	Antal röster	2022	2410
21	VR3: Uppsala läns valkrets	ej röstande	Antal röster	2022	40954

import pyarrow as pa

pa.Table.from_pylist(dataset)

pyarrow.Table
region: string
parti mm: string
tabellinnehåll: string
valår: string
value: int64
----
region: [["VR2: Stockholms läns valkrets","VR2: Stockholms läns valkrets","VR2: Stockholms läns valkrets","VR2: Stockholms läns valkrets","VR2: Stockholms läns valkrets",...,"VR3: Uppsala läns valkrets","VR3: Uppsala läns valkrets","VR3: Uppsala läns valkrets","VR3: Uppsala läns valkrets","VR3: Uppsala läns valkrets"]]
parti mm: [["Moderaterna","Centerpartiet","Liberalerna","Kristdemokraterna","Miljöpartiet",...,"Vänsterpartiet","Sverigedemokraterna","övriga partier","ogiltiga valsedlar","ej röstande"]]
tabellinnehåll: [["Antal röster","Antal röster","Antal röster","Antal röster","Antal röster",...,"Antal röster","Antal röster","Antal röster","Antal röster","Antal röster"]]
valår: [["2022","2022","2022","2022","2022",...,"2022","2022","2022","2022","2022"]]
value: [[197466,60776,48949,40207,42284,...,19543,45237,4134,2410,40954]]
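
The long format above can also be reshaped before it is handed to a dataframe library. The following is a minimal standard-library sketch that pivots rows into one record per region. `sample_rows` uses four values taken from the tables above and `pivot_wide` is a hypothetical helper; dataframe libraries offer their own pivot functions that would normally be preferred.

```python
# Four rows with values taken from the tables above.
sample_rows = [
    {"region": "VR2: Stockholms läns valkrets", "parti mm": "Moderaterna", "value": 197466},
    {"region": "VR2: Stockholms läns valkrets", "parti mm": "Centerpartiet", "value": 60776},
    {"region": "VR3: Uppsala läns valkrets", "parti mm": "Moderaterna", "value": 45457},
    {"region": "VR3: Uppsala läns valkrets", "parti mm": "Centerpartiet", "value": 18040},
]


def pivot_wide(rows: list[dict], index: str, columns: str, values: str = "value") -> list[dict]:
    """Turn long rows into one dict per index value, with one key per column value."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[index], {index: row[index]})
        record[row[columns]] = row[values]
    return list(wide.values())


wide_rows = pivot_wide(sample_rows, index="region", columns="parti mm")
for record in wide_rows:
    print(record)
```

The result is again a list of dictionaries, so it loads into a dataframe the same way as the original dataset.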

Using wildcards

Wildcards make it possible to select many value codes without listing each one. Here is an example using wildcards for a larger query.

# Use wildcards to get all the municipalities in Stockholm (codes starting with "01"),
# all months of 2024, all genders and all 5-year age groups.
# The somewhat cryptic ContentsCode represents the count.
population_data = api.get_table_data(
    "TAB5444",
    value_codes={
        "Alder": "*",
        "Region": "01*",
        "Tid": "2024*",
        "Kon": "*",
        "ContentsCode": "000003O5",
    },
    code_list={"Alder": "agg_Ålder5år", "Region": "vs_RegionKommun07"},
)

# This returns over ten thousand rows of data
len(population_data)
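
To get a feel for what a pattern such as "01*" selects, the matching can be imitated locally. This is only an illustration: the real matching is done by the API, and the sketch assumes that `*` stands for zero or more characters. It uses `fnmatchcase` from the standard library, and the sample codes are municipality codes that appear elsewhere on this page.

```python
from fnmatch import fnmatchcase

# Municipality codes that appear in outputs on this page.
sample_codes = ["0114", "0115", "0117", "0120", "0123", "2584"]


def matching_codes(codes: list[str], pattern: str) -> list[str]:
    """Return the codes that match a glob-style pattern (local imitation only)."""
    return [code for code in codes if fnmatchcase(code, pattern)]


stockholm_codes = matching_codes(sample_codes, "01*")
all_codes = matching_codes(sample_codes, "*")
print(stockholm_codes)
print(all_codes)
```

Here "01*" keeps every code beginning with "01" and drops "2584", while "*" keeps everything.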

Large queries and batching

pxwebpy allows for very large queries by using automatic batching to stay within the data cell and rate limits of the API.

Consider the following query for population per year ("TAB1267"):

codes = {"ContentsCode": "BE0101A9", "Region": "*", "Alder": "*", "Kon": "*", "Tid": "*"}
lists = {"Region": "vs_RegionKommun07", "Alder": "vs_Ålder1årA"}

This query would produce over 1 million data cells, overshooting the data cell limit of the API (150 000 in this case).
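
The size of a query is the product of the number of values selected for each variable. The sketch below does that arithmetic. The dimension sizes are assumptions, chosen to be consistent with the result shown for this query further down (1 347 340 rows); they are not read from the API here.

```python
import math

# Assumed number of selected values per variable (not fetched from the API):
# 290 regions, ages 0 to 100+, two sexes, years 2002 through 2024.
assumed_sizes = {"ContentsCode": 1, "Region": 290, "Alder": 101, "Kon": 2, "Tid": 23}
max_data_cells = 150_000  # the limit reported by the API configuration

total_cells = math.prod(assumed_sizes.values())
min_subqueries = math.ceil(total_cells / max_data_cells)

print(total_cells)      # 1347340
print(min_subqueries)   # 9
```

The second number is only a lower bound on how many requests are needed; how a client actually splits the query may lead to more.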

To handle this, pxwebpy breaks up the query into several subqueries to stay within the data cell limit, while also respecting the rate limit on the number of queries allowed within a given time window. Calls are multithreaded to fetch results as fast as possible.
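
The idea of splitting can be sketched by chunking the values of a single variable. This is an illustration of the general technique, not pxwebpy's actual algorithm. `chunk_values` is a hypothetical helper, the region codes are made-up placeholders, and the sizes of the other variables are assumptions.

```python
def chunk_values(values: list[str], other_cells: int, max_cells: int) -> list[list[str]]:
    """Split values so that len(chunk) * other_cells never exceeds max_cells."""
    per_chunk = max_cells // other_cells
    if per_chunk < 1:
        raise ValueError("a single value exceeds the limit; split another variable too")
    return [values[i : i + per_chunk] for i in range(0, len(values), per_chunk)]


# Hypothetical placeholder codes standing in for 290 region codes.
region_codes = [f"R{i:03d}" for i in range(290)]

# Assumed sizes of the remaining variables: 101 ages, 2 sexes, 23 years.
other_cells = 101 * 2 * 23

chunks = chunk_values(region_codes, other_cells, max_cells=150_000)
print(len(chunks))
print(max(len(chunk) for chunk in chunks))
```

Under these assumptions each subquery covers at most 32 regions, giving 10 subqueries. Each chunk would then be sent as its own request and the results concatenated.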

# Executing the large query
data = api.get_table_data("TAB1267", value_codes=codes, code_list=lists)

# And then loading the result into a dataframe
pl.DataFrame(data)

shape: (1_347_340, 6)

region	ålder	kön	tabellinnehåll	år	value
str	str	str	str	str	i64
"0114: Upplands Väsby"	"0 år"	"män"	"Antal"	"2002"	207
"0114: Upplands Väsby"	"0 år"	"män"	"Antal"	"2003"	218
"0114: Upplands Väsby"	"0 år"	"män"	"Antal"	"2004"	188
"0114: Upplands Väsby"	"0 år"	"män"	"Antal"	"2005"	201
"0114: Upplands Väsby"	"0 år"	"män"	"Antal"	"2006"	218
…	…	…	…	…	…
"2584: Kiruna"	"100+ år"	"kvinnor"	"Antal"	"2020"	2
"2584: Kiruna"	"100+ år"	"kvinnor"	"Antal"	"2021"	1
"2584: Kiruna"	"100+ år"	"kvinnor"	"Antal"	"2022"	1
"2584: Kiruna"	"100+ år"	"kvinnor"	"Antal"	"2023"	1
"2584: Kiruna"	"100+ år"	"kvinnor"	"Antal"	"2024"	2

In-memory caching

By default pxwebpy uses in-memory caching for API responses, which can be useful for exploration and iterative use. Caching both reduces the load on the API and speeds up execution. If needed, it can be turned off by setting the attribute disable_cache to True.
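
The effect of in-memory caching can be illustrated with a small stand-alone sketch. It is not pxwebpy's implementation: `CachedFetcher` and `fake_fetch` are hypothetical names, and the URL is a placeholder. The point is only that a repeated request with the same parameters does not reach the API a second time.

```python
class CachedFetcher:
    """Wrap a fetch function with a simple in-memory cache (illustration only)."""

    def __init__(self, fetch, disable_cache: bool = False):
        self._fetch = fetch
        self.disable_cache = disable_cache
        self._cache = {}

    def get(self, url: str, params: dict = None):
        # Build a hashable key from the URL and the sorted parameters.
        key = (url, tuple(sorted((params or {}).items())))
        if not self.disable_cache and key in self._cache:
            return self._cache[key]
        result = self._fetch(url, params)
        if not self.disable_cache:
            self._cache[key] = result
        return result


calls = []


def fake_fetch(url, params):
    """Stand-in for a network call; records each invocation."""
    calls.append(url)
    return {"url": url, "params": params}


client = CachedFetcher(fake_fetch)
client.get("https://example.invalid/config", {"lang": "en"})
client.get("https://example.invalid/config", {"lang": "en"})
print(len(calls))
```

With caching enabled the fake fetch runs once for the two identical requests; with disable_cache=True it would run for every call.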

Debugging

Sometimes things don’t work as expected. To troubleshoot you can set the log level to DEBUG to get verbose output of the queries sent to the API.

import logging

# Set the log level to DEBUG on the root logger; this affects all loaded modules and can be quite noisy
logging.basicConfig(level=logging.DEBUG)

# Or set up a handler just for pxwebpy
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(levelname)s:%(thread)s:%(name)s:%(message)s")
)
logging.getLogger("pxweb").addHandler(handler)
logging.getLogger("pxweb").setLevel(logging.DEBUG)

# Set up an API object with the logging enabled
ssb = PxApi("ssb")

DEBUG:140523950615424:pxweb.api:Setting up the client
DEBUG:140523950615424:pxweb._internal.client:Getting the API configuration
DEBUG:140523950615424:pxweb._internal.client:GET request prepared for https://data.ssb.no/api/pxwebapi/v2/config
DEBUG:140523950615424:pxweb._internal.client:Request with parameters: {'lang': None}
DEBUG:140523950615424:pxweb._internal.client:Sending request, attempt 1
DEBUG:140523950615424:pxweb._internal.client:Query size is limited to 800000 number of data cells
DEBUG:140523950615424:pxweb._internal.client:No rate limit is configured
DEBUG:140523950615424:pxweb.api:Getting the number of tables available
DEBUG:140523950615424:pxweb._internal.client:GET request prepared for https://data.ssb.no/api/pxwebapi/v2/tables?lang=no
DEBUG:140523950615424:pxweb._internal.client:Request with parameters: {'lang': 'no'}
DEBUG:140523950615424:pxweb._internal.client:Sending request, attempt 1
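
The logger names in the output above (pxweb.api, pxweb._internal.client) are children of the "pxweb" logger, which is why a handler on "pxweb" picks them up. The sketch below demonstrates this with the standard logging module only. It writes to an in-memory stream instead of the console, and the messages are made up for the demonstration.

```python
import io
import logging

# Capture output in memory so it can be inspected afterwards.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

parent = logging.getLogger("pxweb")
parent.addHandler(handler)
parent.setLevel(logging.DEBUG)

# A child logger inherits the DEBUG level and propagates to the parent's handler.
logging.getLogger("pxweb._internal.client").debug("from a child logger")

# A logger outside the "pxweb" hierarchy never reaches this handler.
logging.getLogger("some_other_library").debug("from an unrelated logger")

captured = stream.getvalue()
print(captured)
```

Only the message from the child logger ends up in the captured output, which keeps debug output focused on one library.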