Convert pandas.DataFrame, Series and list to each other

This article explains how to convert between pandas DataFrame/Series and Python built-in lists.

Contents
  • Convert lists to DataFrame and Series
    • Convert lists to DataFrame and Series using pd.DataFrame() and pd.Series()
    • For lists containing labels
  • Convert DataFrame and Series to lists
    • Convert Series to a list using tolist() or to_list()
    • Convert DataFrame to a list using values and tolist()
    • Convert Series and DataFrame to lists including index and columns
    • Convert index and columns to lists

Although the term "convert" is used for simplicity, the process actually involves creating a new object of a different type, while the original object remains unchanged.

For conversions between DataFrame/Series and NumPy arrays (ndarray), as well as between DataFrame and Series, refer to the following articles.

  • Convert between pandas DataFrame/Series and NumPy array
  • pandas: Convert between DataFrame and Series

The pandas version used in this article is as follows. Note that functionality may vary between versions.

    import pandas as pd

    print(pd.__version__)
    # 2.1.4

Convert lists to DataFrame and Series

Convert lists to DataFrame and Series using pd.DataFrame() and pd.Series()

Passing a list as the first argument to pd.Series() or pd.DataFrame() creates a Series or DataFrame from that list.

  • pandas.Series — pandas 2.1.4 documentation
  • pandas.DataFrame — pandas 2.1.4 documentation
    l_1d = [0, 10, 20]

    print(pd.Series(l_1d))
    # 0     0
    # 1    10
    # 2    20
    # dtype: int64

    l_2d = [[0, 10, 20], [30, 40, 50]]

    print(pd.DataFrame(l_2d))
    #     0   1   2
    # 0   0  10  20
    # 1  30  40  50

Passing a one-dimensional list directly to pd.DataFrame() creates a single-column DataFrame. Wrapping it in another list, as in [one_dimensional_list], creates a single-row DataFrame.

    print(pd.DataFrame(l_1d))
    #     0
    # 0   0
    # 1  10
    # 2  20

    print(pd.DataFrame([l_1d]))
    #    0   1   2
    # 0  0  10  20

To swap rows and columns, transpose the two-dimensional list (list of lists) with zip() before passing it to pd.DataFrame().

  • Transpose 2D list in Python (swap rows and columns)
    print(pd.DataFrame(zip(*l_2d)))
    #     0   1
    # 0   0  30
    # 1  10  40
    # 2  20  50

Specify row and column names: index, columns

Row names can be specified with the index argument, and column names with the columns argument.

    print(pd.Series(l_1d, index=['X', 'Y', 'Z']))
    # X     0
    # Y    10
    # Z    20
    # dtype: int64

    print(pd.DataFrame(l_2d, index=['X', 'Y'], columns=['A', 'B', 'C']))
    #     A   B   C
    # X   0  10  20
    # Y  30  40  50

It is also possible to set or change the index and columns after creating a Series or a DataFrame.

  • pandas: Rename column/index names of DataFrame
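As a minimal sketch of setting labels after creation, the example below assigns new labels to the index and columns attributes and then changes individual labels with rename(). The variable names are illustrative only and do not appear elsewhere in this article.

```python
import pandas as pd

s = pd.Series([0, 10, 20])
df = pd.DataFrame([[0, 10, 20], [30, 40, 50]])

# Assign new labels in place; the number of labels must match the axis length.
s.index = ['X', 'Y', 'Z']
df.index = ['X', 'Y']
df.columns = ['A', 'B', 'C']

# rename() returns a new object and changes only the labels named in the mapping.
df_renamed = df.rename(index={'X': 'x'}, columns={'A': 'a'})

print(df_renamed.index.tolist())
# ['x', 'Y']

print(df_renamed.columns.tolist())
# ['a', 'B', 'C']
```

Assigning to index or columns modifies the existing object, while rename() leaves the original unchanged unless its result is assigned back.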

Specify data type: dtype

The data type (dtype) of each column in a DataFrame, as well as that of a Series, is automatically determined based on the values in the list.

For example, if a column contains a mix of integers (int) and floating-point numbers (float), the data type of the column becomes float, and if it contains a mix of numbers and strings, the data type becomes object.

    l_2d_multi = [[0, 0.0, 'abc', 123, 'abc'], [10, 0.1, 'xyz', 1.23, 100]]

    print(pd.DataFrame(l_2d_multi))
    #     0    1    2       3    4
    # 0   0  0.0  abc  123.00  abc
    # 1  10  0.1  xyz    1.23  100

    print(pd.DataFrame(l_2d_multi).dtypes)
    # 0      int64
    # 1    float64
    # 2     object
    # 3    float64
    # 4     object
    # dtype: object

It is also possible to specify the data type using the dtype argument of pd.DataFrame() or pd.Series().

    print(pd.DataFrame(l_2d, dtype=float))
    #       0     1     2
    # 0   0.0  10.0  20.0
    # 1  30.0  40.0  50.0
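The dtype argument works the same way for pd.Series(). The following short sketch repeats the one-dimensional list from earlier so that it runs on its own; the name s_float is illustrative.

```python
import pandas as pd

l_1d = [0, 10, 20]

# Request float64 explicitly instead of the int64 that would be inferred.
s_float = pd.Series(l_1d, dtype=float)

print(s_float.dtype)
# float64

print(s_float.tolist())
# [0.0, 10.0, 20.0]
```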

For more details on data types (dtype) in pandas, refer to the following article.

  • pandas: How to use astype() to cast dtype of DataFrame

For lists containing labels

To create a Series from a list of label-value pairs, first decompose the list into labels and values, and then pass these to pd.Series().

  • Transpose 2D list in Python (swap rows and columns)
    l_1d_index = [['X', 0], ['Y', 1], ['Z', 2]]

    index, values = zip(*l_1d_index)

    print(index)
    # ('X', 'Y', 'Z')

    print(values)
    # (0, 1, 2)

    print(pd.Series(values, index=index))
    # X    0
    # Y    1
    # Z    2
    # dtype: int64
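A possible alternative, shown here only as a sketch, is to turn the label-value pairs into a dictionary first, since pd.Series() uses dictionary keys as the index. This assumes the labels are unique and hashable; with duplicate labels, dict() keeps only the last value for each label, so the zip(*...) approach above is the safer choice in that case.

```python
import pandas as pd

l_1d_index = [['X', 0], ['Y', 1], ['Z', 2]]

# dict() accepts an iterable of two-item sequences; the first item becomes the key.
s_from_dict = pd.Series(dict(l_1d_index))

print(s_from_dict.index.tolist())
# ['X', 'Y', 'Z']

print(s_from_dict.tolist())
# [0, 1, 2]
```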

To create a DataFrame from a list that includes labels and multiple values, first load the entire list into the DataFrame, and then set the index using the set_index() method.

  • pandas: Assign existing column to the DataFrame index with set_index()
    l_2d_index = [['X', 0, 0.0], ['Y', 1, 0.1], ['Z', 2, 0.2]]

    df_index = pd.DataFrame(l_2d_index, columns=['idx', 'A', 'B'])
    print(df_index)
    #   idx  A    B
    # 0   X  0  0.0
    # 1   Y  1  0.1
    # 2   Z  2  0.2

    print(df_index.set_index('idx'))
    #      A    B
    # idx
    # X    0  0.0
    # Y    1  0.1
    # Z    2  0.2

If the original list also includes column names, use the first row for the columns argument and the rest of the rows (obtained by slicing) as the first argument.

  • How to slice a list, string, tuple in Python
    l_2d_index_columns = [['idx', 'A', 'B'], ['X', 0, 0.0], ['Y', 1, 0.1], ['Z', 2, 0.2]]

    df_index_columns = pd.DataFrame(l_2d_index_columns[1:], columns=l_2d_index_columns[0])
    print(df_index_columns)
    #   idx  A    B
    # 0   X  0  0.0
    # 1   Y  1  0.1
    # 2   Z  2  0.2

    print(df_index_columns.set_index('idx'))
    #      A    B
    # idx
    # X    0  0.0
    # Y    1  0.1
    # Z    2  0.2

Convert DataFrame and Series to lists

Convert Series to a list using tolist() or to_list()

Series can be converted to a list using the tolist() or to_list() methods.

  • pandas.Series.tolist — pandas 2.1.4 documentation
  • pandas.Series.to_list — pandas 2.1.4 documentation
    s = pd.Series([0, 10, 20])
    print(s)
    # 0     0
    # 1    10
    # 2    20
    # dtype: int64

    print(s.tolist())
    # [0, 10, 20]

    print(s.to_list())
    # [0, 10, 20]
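As with the other conversions in this article, tolist() builds a new list and leaves the Series untouched. The sketch below checks this and also shows that the elements of the resulting list are plain Python int objects for an integer Series. The variable name l is illustrative.

```python
import pandas as pd

s = pd.Series([0, 10, 20])
l = s.tolist()

print(type(l))
# <class 'list'>

print(type(l[0]))
# <class 'int'>

# Modifying the list does not affect the original Series.
l[0] = 100
print(s.iloc[0])
# 0
```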

Convert DataFrame to a list using values and tolist()

As of pandas version 2.1.4, DataFrame does not have the tolist() or to_list() methods. To convert a DataFrame to a list, first convert it into a NumPy array (ndarray) using the values attribute, and then use the tolist() method of ndarray.

  • Convert between pandas DataFrame/Series and NumPy array
  • Convert between NumPy array and Python list
    df = pd.DataFrame([[0, 10, 20], [30, 40, 50]])
    print(df)
    #     0   1   2
    # 0   0  10  20
    # 1  30  40  50

    print(df.values.tolist())
    # [[0, 10, 20], [30, 40, 50]]
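The pandas documentation for the values attribute recommends the to_numpy() method instead, and it can be used in the same way here. The sketch below also illustrates one thing to watch for: the whole DataFrame becomes a single array with one data type, so integers in a DataFrame that also has a float column come back as floats. The df_mixed example is an assumption added for illustration.

```python
import pandas as pd

df = pd.DataFrame([[0, 10, 20], [30, 40, 50]])

# to_numpy() returns the same ndarray that values would, so tolist() works unchanged.
print(df.to_numpy().tolist())
# [[0, 10, 20], [30, 40, 50]]

# One integer column and one float column share a float64 array after conversion.
df_mixed = pd.DataFrame({'A': [0, 1], 'B': [0.5, 1.5]})

print(df_mixed.to_numpy().tolist())
# [[0.0, 0.5], [1.0, 1.5]]
```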

Convert Series and DataFrame to lists including index and columns

To keep the index as part of the list, use the reset_index() method to reset the index and turn it into a data column.

  • pandas: Reset index of DataFrame, Series with reset_index()
    s_index = pd.Series([0, 1, 2], index=['X', 'Y', 'Z'])
    print(s_index)
    # X    0
    # Y    1
    # Z    2
    # dtype: int64

    print(s_index.reset_index())
    #   index  0
    # 0     X  0
    # 1     Y  1
    # 2     Z  2

    print(s_index.reset_index().values.tolist())
    # [['X', 0], ['Y', 1], ['Z', 2]]

    df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])
    print(df_index)
    #    X  Y  Z
    # A  0  1  2
    # B  3  4  5

    print(df_index.reset_index())
    #   index  X  Y  Z
    # 0     A  0  1  2
    # 1     B  3  4  5

    print(df_index.reset_index().values.tolist())
    # [['A', 0, 1, 2], ['B', 3, 4, 5]]

As of version 2.1.4, DataFrame has no method to reset columns. To include both index and columns in the list, first apply reset_index(), then transpose using .T, apply reset_index() again, and finally revert the transposition with .T. A more efficient method may exist.

  • pandas: Transpose DataFrame (swap rows and columns)
    print(df_index.reset_index().T.reset_index().T.values.tolist())
    # [['index', 'X', 'Y', 'Z'], ['A', 0, 1, 2], ['B', 3, 4, 5]]
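One possible alternative that avoids the double transposition is to build the header row from the columns attribute and prepend it to the data rows. This is only a sketch; the names df_reset and l_with_labels are illustrative, and it is not necessarily faster than the approach above.

```python
import pandas as pd

df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])

# Move the index into a regular column, then take the column labels as the first row.
df_reset = df_index.reset_index()
l_with_labels = [df_reset.columns.tolist()] + df_reset.values.tolist()

print(l_with_labels)
# [['index', 'X', 'Y', 'Z'], ['A', 0, 1, 2], ['B', 3, 4, 5]]
```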

Convert index and columns to lists

The index attribute of Series, as well as the index and columns attributes of DataFrame, are all of type Index. They can be converted to lists using the tolist() or to_list() methods.

  • pandas.Index.tolist — pandas 2.1.4 documentation
  • pandas.Index.to_list — pandas 2.1.4 documentation
    s_index = pd.Series([0, 1, 2], index=['X', 'Y', 'Z'])
    print(s_index)
    # X    0
    # Y    1
    # Z    2
    # dtype: int64

    print(s_index.index)
    # Index(['X', 'Y', 'Z'], dtype='object')

    print(s_index.index.tolist())
    # ['X', 'Y', 'Z']

    df_index = pd.DataFrame([[0, 1, 2], [3, 4, 5]], index=['A', 'B'], columns=['X', 'Y', 'Z'])
    print(df_index)
    #    X  Y  Z
    # A  0  1  2
    # B  3  4  5

    print(df_index.index)
    # Index(['A', 'B'], dtype='object')

    print(df_index.index.tolist())
    # ['A', 'B']

    print(df_index.columns)
    # Index(['X', 'Y', 'Z'], dtype='object')

    print(df_index.columns.tolist())
    # ['X', 'Y', 'Z']

Note that an Index can be iterated over directly in a for loop, and individual elements can be retrieved by position with []. Slicing is also supported, but elements of an Index cannot be modified in place. Thus, conversion to a list is unnecessary if you only need to read elements.

    for i in df_index.columns:
        print(i, type(i))
    # X <class 'str'>
    # Y <class 'str'>
    # Z <class 'str'>

    print(df_index.columns[0])
    # X

    print(df_index.columns[:2])
    # Index(['X', 'Y'], dtype='object')

    # df_index.columns[0] = 'x'
    # TypeError: Index does not support mutable operations
