pyspark.pandas.json_normalize

pyspark.pandas.json_normalize(data, sep='.')

Normalize semi-structured JSON data into a flat table.

New in version 4.0.0.

Parameters
data : dict or list of dicts

Unserialized JSON objects.

sep : str, default '.'

Nested keys are flattened into column names joined by sep; for example, {"address": {"city": ...}} yields the column address.city.

Returns
DataFrame

See also

DataFrame.to_json

Convert the pandas-on-Spark DataFrame to a JSON string.

Examples

>>> import pyspark.pandas as ps
>>> data = [
...     {"id": 1, "name": "Alice", "address": {"city": "NYC", "zipcode": "10001"}},
...     {"id": 2, "name": "Bob", "address": {"city": "SF", "zipcode": "94105"}},
... ]
>>> ps.json_normalize(data)
   id   name address.city address.zipcode
0   1  Alice          NYC           10001
1   2    Bob           SF           94105
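
The column-naming rule described for sep can be illustrated without Spark. The following is a minimal pure-Python sketch, not the pandas-on-Spark implementation: flatten_record is a hypothetical helper introduced only for illustration, and it handles nested dicts only (lists and other edge cases are left as-is). It shows how nested keys would be joined when sep="_" is used instead of the default.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], sep: str = ".", prefix: str = "") -> Dict[str, Any]:
    """Hypothetical helper: flatten nested dicts into one level of sep-joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        # Top-level keys keep their name; nested keys get "<parent><sep><key>".
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the parent name as the prefix.
            flat.update(flatten_record(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


data = [
    {"id": 1, "name": "Alice", "address": {"city": "NYC", "zipcode": "10001"}},
    {"id": 2, "name": "Bob", "address": {"city": "SF", "zipcode": "94105"}},
]

rows = [flatten_record(r, sep="_") for r in data]
print(list(rows[0]))  # ['id', 'name', 'address_city', 'address_zipcode']
```

With the default sep='.', the same helper produces the names address.city and address.zipcode, matching the column labels in the example output above.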