When joining two pandas DataFrames whose MultiIndex levels are not in the same order, my pandas installation crashes Python with error code 0xC00000FD. It took me a while to find my bug, and when I found it I got even more confused. Why did this happen, and how should I get better at spotting it?
Consider the following code:
    import pandas as pd

    df = pd.DataFrame.from_records([
        ("foo", 1, .1),
        ("foo", 2, .2),
        ("bar", 1, .3),
        ("bar", 2, .4)
    ], columns=["Level1", "Level2", "Value"])

    df2 = df.set_index(["Level1", "Level2"])
    df3 = df.set_index(["Level2", "Level1"])

    combination = pd.merge(left=df2, right=df2, left_index=True, right_index=True)
    print("ok")
    combination2 = pd.merge(left=df2, right=df3, left_index=True, right_index=True)
    print("fail")
It gives the following output:
      Level1  Level2  Value
    0    foo       1    0.1
    1    foo       2    0.2
                   Value
    Level1 Level2
    foo    1         0.1
           2         0.2
                   Value
    Level2 Level1
    1      foo       0.1
    2      foo       0.2
    ok

    Process finished with exit code -1073741571 (0xC00000FD)
The problem arose when I had a table, made some pivots/melts, and wanted to go back to the original MultiIndex and join my results. At first I did not understand the error message. From this page I learned that the error code indicates a stack overflow. From this PR to pandas I learned that the recursion is somehow related to calculating group indices. With this information I was able to hunt down my bug and create a minimal reproducing example.
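For spotting this kind of mismatch early, a guard along the following lines might help. This is only a sketch under assumptions: `align_index_levels` is a hypothetical helper (not part of pandas), it assumes every index level is named, and it assumes the two frames differ only in the order of their levels. It relies on `DataFrame.reorder_levels`, which rearranges index levels without changing the data.

```python
import pandas as pd


def align_index_levels(left, right):
    """Return `right` with its index levels reordered to match `left`.

    Hypothetical helper for illustration; assumes all levels are named.
    """
    left_names = list(left.index.names)
    right_names = list(right.index.names)
    if left_names == right_names:
        # Already in the same order, nothing to do.
        return right
    if set(left_names) != set(right_names):
        # Different level names: reordering cannot fix this, so fail loudly.
        raise ValueError(
            "Index levels differ: {} vs {}".format(left_names, right_names)
        )
    return right.reorder_levels(left_names)


df = pd.DataFrame.from_records(
    [("foo", 1, .1), ("foo", 2, .2), ("bar", 1, .3), ("bar", 2, .4)],
    columns=["Level1", "Level2", "Value"],
)
df2 = df.set_index(["Level1", "Level2"])
df3 = df.set_index(["Level2", "Level1"])

# Bring df3's levels into df2's order before merging on the index.
df3_aligned = align_index_levels(df2, df3)
combination2 = pd.merge(
    left=df2, right=df3_aligned, left_index=True, right_index=True
)
```

Comparing `df2.index.names` with `df3.index.names` before the merge is the actual check; the reordering just turns the second merge into the same situation as the first one, which ran fine.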
Now to my question:
EDIT: The error as stated occurs with the latest versions to date, pandas 0.24.2 and Python 3.7.3.