Why does a pandas join with a mis-sorted MultiIndex cause a stack overflow?


When joining two pandas DataFrames whose MultiIndex levels are not in the same order, my pandas installation makes Python crash with error code 0xC00000FD. It took me a while to find my bug, and when I found it I got even more confused. Why did this happen, and how can I get better at spotting it?

Consider the following code:

import pandas as pd

df = pd.DataFrame.from_records([
    ("foo", 1, .1),
    ("foo", 2, .2),
    ("bar", 1, .3),
    ("bar", 2, .4)
], columns=["Level1", "Level2", "Value"])

df2 = df.set_index(["Level1","Level2"])
df3 = df.set_index(["Level2","Level1"])

combination = pd.merge(left=df2, right=df2, left_index=True, right_index=True)
combination2 = pd.merge(left=df2, right=df3, left_index=True, right_index=True)

It gives the following output:

  Level1  Level2  Value
0    foo       1    0.1
1    foo       2    0.2
Level1 Level2       
foo    1         0.1
       2         0.2
Level2 Level1       
1      foo       0.1
2      foo       0.2
Process finished with exit code -1073741571 (0xC00000FD)

The problem arose when I had a table, made some pivots/melts, and wanted to go back to the original MultiIndex and join my results. At first I did not understand the error message. From this page I learned that the error code means a stack overflow. From this PR to pandas I learned that the recursion is somehow related to calculating group indices. This information allowed me to hunt down my bug and to create a minimal reproducing example.
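For the situation described above, where only the order of the index levels differs after pivoting and melting, one possible way back is to reorder the levels of one frame to match the other before merging. This is a minimal sketch under that assumption; the names `left`, `right` and `right_aligned` and the suffixes are illustrative and not part of the original code, and it has not been checked against pandas 0.24.2 specifically.

```python
import pandas as pd

df = pd.DataFrame.from_records(
    [("foo", 1, 0.1), ("foo", 2, 0.2), ("bar", 1, 0.3), ("bar", 2, 0.4)],
    columns=["Level1", "Level2", "Value"],
)

left = df.set_index(["Level1", "Level2"])
right = df.set_index(["Level2", "Level1"])  # same levels, swapped order

# Rearrange the right-hand index levels so they follow the left-hand order.
right_aligned = right.reorder_levels(left.index.names)

# With matching level order, the index-on-index merge is an ordinary join.
combined = pd.merge(
    left,
    right_aligned,
    left_index=True,
    right_index=True,
    suffixes=("_left", "_right"),
)
print(combined)
```

`reorder_levels` only changes the order of existing levels, so it will not help if the two frames have different level names altogether.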

Now to my questions:

  • Why does this happen? Shouldn't pandas give a more informative error? My intuition says that the mismatch should be easy to detect.
  • Is there any way to programmatically identify these errors, instead of ending up with a stack overflow, uninformative errors and a broken debugger?
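As a sketch of what a programmatic check could look like, the index level names of both frames can be compared before calling `pd.merge`. The helper `assert_same_index_levels` below is hypothetical (it is not a pandas function), and it assumes that a mismatch in level names or their order is the only thing to guard against.

```python
import pandas as pd


def assert_same_index_levels(left, right):
    """Raise ValueError unless both frames have the same index level names in the same order."""
    left_names = list(left.index.names)
    right_names = list(right.index.names)
    if left_names != right_names:
        raise ValueError(
            f"Index levels differ: left={left_names}, right={right_names}"
        )


df = pd.DataFrame.from_records(
    [("foo", 1, 0.1), ("foo", 2, 0.2), ("bar", 1, 0.3), ("bar", 2, 0.4)],
    columns=["Level1", "Level2", "Value"],
)
df2 = df.set_index(["Level1", "Level2"])
df3 = df.set_index(["Level2", "Level1"])

assert_same_index_levels(df2, df2)  # same level order: passes silently

try:
    assert_same_index_levels(df2, df3)  # swapped level order: raises
except ValueError as err:
    print(err)
```

Calling such a check right before each index-on-index merge turns the mismatch into an ordinary Python exception with a traceback, instead of leaving it to the merge itself.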

EDIT: The error as stated occurs with the latest Python and pandas versions to date, pandas 0.24.2 and Python 3.7.3.

asked on Stack Overflow May 28, 2019 by LudvigH • edited May 28, 2019 by LudvigH

