pyspark.pandas.json_normalize
- pyspark.pandas.json_normalize(data, sep='.')
Normalize semi-structured JSON data into a flat table.
New in version 4.0.0.
- Parameters
- data : dict or list of dicts
Unserialized JSON objects.
- sep : str, default '.'
Nested records will generate names separated by sep.
- Returns
- DataFrame
See also
DataFrame.to_json
Convert the pandas-on-Spark DataFrame to a JSON string.
Examples
>>> data = [
...     {"id": 1, "name": "Alice", "address": {"city": "NYC", "zipcode": "10001"}},
...     {"id": 2, "name": "Bob", "address": {"city": "SF", "zipcode": "94105"}},
... ]
>>> ps.json_normalize(data)
   id   name address.city address.zipcode
0   1  Alice          NYC           10001
1   2    Bob           SF           94105
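
To make the role of sep concrete, here is a minimal pure-Python sketch of how nested keys might be joined into flat column names. The helper flatten_record is hypothetical and written only for illustration; it is not part of pyspark.pandas and is not claimed to reflect how json_normalize is implemented.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], sep: str = ".", prefix: str = "") -> Dict[str, Any]:
    """Hypothetical helper: flatten nested dicts, joining key paths with `sep`."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        # Build the flattened name from the parent path and the current key.
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested records, carrying the accumulated name.
            flat.update(flatten_record(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


record = {"id": 1, "name": "Alice", "address": {"city": "NYC", "zipcode": "10001"}}

print(flatten_record(record))
# {'id': 1, 'name': 'Alice', 'address.city': 'NYC', 'address.zipcode': '10001'}

print(flatten_record(record, sep="_"))
# {'id': 1, 'name': 'Alice', 'address_city': 'NYC', 'address_zipcode': '10001'}
```

Under this sketch, choosing a separator such as "_" yields names like address_city, which may be easier to reference than names containing a dot.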